The previous articles explain how to build applications using the OCaml-LLVM
bindings, and how to use the API to manipulate LLVM objects. This was the
“read-only” part of the tutorial, which can be used to analyze LLVM IR.
This part explains how to create LLVM IR: we will write a simple application from
scratch, then see how to build and run it.
Modules
As in the previous tutorial, we need
to create a context and a module:
There are two actions that can be done on functions:
declare_function to give only a declaration of the prototype,
define_function to give both the declaration and the implementation.
In both cases, we need to give the signature (return type, number and type of
arguments) of the function.
This is pretty similar to C. We’ll use this to declare the function
int main(void).
The int type is a bit problematic in LLVM (and in C, but for other reasons):
integer types must have a known size in LLVM. While this does not change the
architecture-independent property …
In the previous tutorial, we’ve seen how to use ocamlbuild and make to build
a simple application. In this part, we’ll start exploring the API, and see how
to access values and attributes of LLVM objects.
The base of the code is the same as in part 1: it reads an existing LLVM bitcode
file, for example one generated by clang.
As in the previous part of the tutorial, knowing the LLVM C++ API is not required (but can help).
LLVM objects
The top-level container is a module (llmodule). The module contains global
variables, types and functions; functions in turn contain basic blocks, and basic
blocks contain instructions.
Values
In the OCaml bindings, all objects (variables, functions, instructions) are
instances of the opaque type llvalue.
A value has a type, a name, a definition, a list of users, and other things like
attributes (for example visibility or linkage options) or aliases.
Each value has a type (lltype), a composite object that describes the type
of the value and of its arguments. To pattern-match on the actual kind of type,
it first needs to be converted to a TypeKind.t:
This is the first part of a tutorial series on how to use the OCaml bindings
for LLVM.
Why use the OCaml bindings? Because you can avoid using the C++ API, spending huge
amounts of time compiling Clang sources, then your plugin, then debugging the
segfaults again and again. The bindings are stable, cover most of the API, and
are quite simple to use, thanks to the Debian packages.
This tutorial was written on Debian Sid; things may differ slightly on other
distributions, but should stay similar.
The objectives of this first part are:
install the required packages
set up a build environment for ocamlbuild
build a simple application that reads an LLVM bitcode file and prints it
Installation
The required packages are:
llvm-3.5-dev
libllvm-3.5-ocaml-dev
the LLVM and OCaml compilers (llvm-3.5, ocaml)
optionally, clang
The current LLVM version is 3.6; however, the OCaml bindings are currently
disabled (see Debian bug #783919) because of changes in the required
dependencies.
Here are the materials for the talk PICON : Control Flow Integrity on LLVM IR,
given during SSTIC 2015. While SSTIC is a
French-speaking conference, I publish here in English because my other posts
are also in English.
Here is the summary, from the website:
Control flow integrity has been a well explored field of software security for
more than a decade.
However, most of the proposed approaches are stalled in a
proof of concept state - when the implementation is publicly available - or have
been designed with a minimal performance overhead as their primary objective,
sacrificing security.
Currently, none of the proposed approaches can be used to
fully protect real-world programs compiled with most common compilers (e.g. GCC,
Clang/LLVM).
In this paper we describe a control flow integrity enforcement
mechanism whose main objective is security. Our approach is based on
compile-time code instrumentation, making the program communicate with its
external execution monitor. The program is terminated by the monitor as soon as
a control flow integrity violation is detected.
Our approach is implemented as
an LLVM plugin and is working on LLVM’s Intermediate Representation.
I have started a new project (yet another), pretty different from my
usual programming languages: a framework for visualizing data in a
browser. This framework is an Extract-Transform-Visualize tool, where
data come from a database and are rendered by the browser.
Features
While other projects exist, I wanted to create a project with the
following features:
simplicity: it provides objects (widgets) that you just place in
your page as you want. It also provides dashboards to manage
widgets, and in its simplest form you just give the name of a div
element where a graph will be rendered.
modularity: every part of the project can be replaced easily by
another component, either on the server side (you only need an Ajax
server, not necessarily Django) or on the client side (you can use
JavaScript, SVG, Flash, etc.)
interactive: interactions are important, to make the interface
pretty, and also to navigate in data, or to enhance visualization.
Most recent web toolkits allow a good number of interactions and
animations (and most of them, without using flash)
working with big data sets: existing toolkits generally fail
when dealing with big databases. Here, all requests are asynchronous
and are designed to work on big tables …
Since version 7.0, gdb has been able to execute Python scripts.
This makes it very easy to write gdb extensions and commands, or to
manipulate data. It also makes it possible to manipulate graphic data (by
spawning commands in threads), change the program, or even write a firewall (ahem
..). I’ll assume you’re familiar with both gdb commands and basic Python scripts.
The first and very basic test is to check a simple command:
(gdb) python print "Hello, world !"
Hello, world !
So far so good. Yet, printing hello world won’t help us to debug our
programs :)
The reference documentation can be found
here,
but it is of little help when it comes to actually manipulating data. I’ll try
to give a few examples here.
The Python script
The first thing to do is to write a script (we’ll call it
gdb-wzdftpd.py) containing the Python commands.
We will define a command to print GLib’s
GList type,
with nodes and content (which is stored using a void*).
To define a new command, we have to create a new class inheriting from
gdb.Command. This class has two mandatory methods, __init__ and
invoke.
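As a rough illustration of the shape of such a class, here is a minimal sketch. It assumes a gdb built against Python 3; the command name print-glist, the class name PrintGList and the fallback stub used when the gdb module is missing are all made up for this example and are not taken from the original script. It also assumes the argument evaluates to a GList * with the usual data and next fields.

```python
# Sketch of gdb-wzdftpd.py. Inside gdb, load it with: source gdb-wzdftpd.py
try:
    import gdb  # only importable from within gdb
except ImportError:
    # Assumption for illustration: a tiny stand-in so that the file can be
    # imported and inspected outside gdb. It is NOT part of the gdb API.
    class gdb:
        COMMAND_DATA = 0

        class Command:
            def __init__(self, name, command_class):
                self.name = name


class PrintGList(gdb.Command):
    """Print the nodes of a GList. Usage: print-glist EXPR"""

    def __init__(self):
        # "print-glist" is a hypothetical command name.
        super().__init__("print-glist", gdb.COMMAND_DATA)

    def invoke(self, arg, from_tty):
        # gdb calls invoke() with the text typed after the command name.
        node = gdb.parse_and_eval(arg)  # expected to be a GList *
        index = 0
        while int(node) != 0:
            entry = node.dereference()
            # data is a void *, so only the stored pointer can be shown here.
            print("node %d: data=%s" % (index, entry["data"]))
            node = entry["next"]
            index += 1


# Instantiating the class registers the command with gdb.
PrintGList()
```

Once the script is sourced, the command would be called from the gdb prompt with an expression that evaluates to the head of a list. Since the content is stored as a void*, a real command would still need to cast it to the proper type before printing anything useful.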
I’m currently trying to generate interactive (and animated) charts in
Python + Qt. The library I am looking for would be:
portable: this is one of the reasons of the choice of PyQt
simple: same reason
interactive: I want to be able to select, for example, the slices of
a pie chart. A signal and event system like Qt’s would be perfect
animated: this is useless, but looking at things like
AnyChart
or FusionCharts,
the result is really nice!
light on dependencies: relying on tons of libs makes the project
hard to maintain and not portable, especially on Windows, where
there is no packaging and dependency system.
free software
A quick search gave me the following products:
matplotlib: mostly for
scientific plots, but there is a nice number of options, a
well-documented API.
pyQwt: Python bindings for Qwt.
Again, it’s more for scientific plots than for charts
cairoplot: the project looks
dead (or in the "yeah, the project’s not finished, but we’re
recoding it in $LANG to be faster" syndrome, which is more or
less the same). It generates images, though item maps can be
extracted. As the name says, it uses Cairo.
pyCha: some nice
charts, uses Cairo. Very simple API (not …
I just released nfqueue-bindings 0.2 and nflog-bindings 0.1. Despite the
difference in version numbers, the functions are almost the same :)
Here is a short diff since previous version:
Add af_family argument to bind operations (allow IPv6 binds)
Add notes on set_queue_maxlen requiring a kernel >= 2.6.20
bugfix: use queue number when creating queue
bugfix: really link Perl binding to Perl library
Fix cmake warning
The code for nfqueue-bindings is now almost ready, I have made some
progress since last week:
you can now modify packets live, and send the new packet with the verdict
new functions are wrapped, and the creation of the queue can be done in one function
more examples
I have presented a special script for SSTIC,
using the weather to decide if a packet should be accepted or dropped
:) While the usefulness of the module still has to be proven, it is a good
example of how easy it is to use the new bindings.
The slides can be found online
here,
and contain some code examples (with some funny things ;). They are in
French, but they should be quite easy to understand.
Random ideas:
The Netfilter workshop will be held in Paris from 30 September to 3 October 2008.
Eric has presented nf3d, a nice tool to view netfilter logs (from ulogd2) in 3D.
Gamers will recognize a nice attempt at converting network logs into Guitar
Hero tracks ;)