OCaml LLVM bindings tutorial, part 1
This is the first part of a tutorial series on how to use the OCaml bindings for LLVM. Why use the OCaml bindings? Because they let you avoid the C++ API, and with it the huge amounts of time spent compiling the Clang sources, then your plugin, then debugging segfaults again and again. The bindings are stable, cover most of the API, and are quite simple to install and use thanks to the Debian packages.
This tutorial was written on Debian Sid. Details may differ on other distributions, but the overall steps should be similar.
The objectives of this first part are:
- install the required packages
- set up a build environment for ocamlbuild
- build a simple application that reads an LLVM bitcode file and prints it
Installation
The required packages are:
- llvm-3.5-dev
- libllvm-3.5-ocaml-dev
- the LLVM and OCaml compilers (llvm-3.5, ocaml)
- optionally, clang
The current LLVM version is 3.6; however, the OCaml bindings for it are currently disabled (see Debian bug #783919) because of changes in the required dependencies, which is why this tutorial uses 3.5.
Project Layout
The sources are organized as follows:
part1/
├── build
├── Makefile
└── src
    └── tutorial01.ml
First application
First, create the file src/tutorial01.ml:
let _ =
  let llctx = Llvm.global_context () in
  let llmem = Llvm.MemoryBuffer.of_file Sys.argv.(1) in
  let llm = Llvm_bitreader.parse_bitcode llctx llmem in
  Llvm.dump_module llm ;
  ()
Let’s go through the file line by line.
let llctx = Llvm.global_context () in
LLVM requires a context (LLVMContext in the C++ API) to transparently own and manage all data. Here, there is no need to create a dedicated context, so we use the global one.
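For cases where the global context is not appropriate, the bindings also expose Llvm.create_context and Llvm.dispose_context. The following is only a sketch of how a dedicated context could be scoped. The helper with_context and the module name "scratch" are made up for this illustration and are not part of the bindings.

```ocaml
(* with_context is a hypothetical helper, not part of the LLVM bindings.
   It runs [f] with a fresh context and disposes of the context afterwards,
   even if [f] raises an exception. *)
let with_context f =
  let ctx = Llvm.create_context () in
  let result =
    try f ctx
    with e -> Llvm.dispose_context ctx; raise e
  in
  Llvm.dispose_context ctx;
  result

let () =
  with_context (fun ctx ->
    (* An empty module owned by [ctx]; "scratch" is an arbitrary name. *)
    let m = Llvm.create_module ctx "scratch" in
    Llvm.dump_module m;
    (* Dispose of the module before its owning context goes away. *)
    Llvm.dispose_module m)
```

For a small one-shot tool like the one in this tutorial, the global context is simpler and perfectly adequate.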
let llmem = Llvm.MemoryBuffer.of_file Sys.argv.(1) in
This line takes the first command-line argument of the application, and uses the LLVM-OCaml bindings API to read that file into memory (as an opaque llmemorybuf object).
The input should be LLVM bitcode, usually a file with the .bc extension.
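The tutorial program does not check its arguments, so running it without a file name fails with an array index exception. A slightly more defensive variant of this step could look like the sketch below. The function name read_buffer is made up for the example, and it relies on Llvm.MemoryBuffer.of_file raising Llvm.IoError when the file cannot be read, which is my understanding of the bindings.

```ocaml
(* read_buffer is a hypothetical helper, not part of the LLVM bindings.
   It returns [Some buffer] when the file named by the first argument could
   be read, and [None] (after printing a message on stderr) otherwise. *)
let read_buffer argv =
  if Array.length argv < 2 then begin
    prerr_endline ("usage: " ^ Sys.executable_name ^ " FILE.bc");
    None
  end else
    (* Assumption: of_file reports I/O failures through Llvm.IoError. *)
    try Some (Llvm.MemoryBuffer.of_file argv.(1))
    with Llvm.IoError msg ->
      prerr_endline ("cannot read " ^ argv.(1) ^ ": " ^ msg);
      None
```

A program would typically call read_buffer Sys.argv first and exit with a non-zero status when the result is None.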
let llm = Llvm_bitreader.parse_bitcode llctx llmem in
After reading the LLVM bitcode file, the llmemorybuf can now be parsed to create an LLVM module, which in OCaml is an llmodule. In LLVM, a module is a single unit of code to process. It contains things like functions, structure definitions and global variables, and usually matches the content of a single file to be compiled.
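To make the idea of a module as a container a bit more concrete, here is a small sketch that parses a bitcode file and lists the names of the functions and global variables it contains. The function print_summary is a name invented for this example. The sketch assumes that parse_bitcode raises Llvm_bitreader.Error on invalid input, and it leaves I/O errors uncaught to stay short.

```ocaml
(* print_summary is a hypothetical helper: it prints the name of every
   function and global variable found in the module. *)
let print_summary llm =
  Llvm.iter_functions
    (fun f -> print_endline ("function: " ^ Llvm.value_name f)) llm;
  Llvm.iter_globals
    (fun g -> print_endline ("global:   " ^ Llvm.value_name g)) llm

let () =
  if Array.length Sys.argv = 2 then begin
    let llctx = Llvm.global_context () in
    let llmem = Llvm.MemoryBuffer.of_file Sys.argv.(1) in
    (* Assumption: a file that is not valid bitcode raises Llvm_bitreader.Error. *)
    try print_summary (Llvm_bitreader.parse_bitcode llctx llmem)
    with Llvm_bitreader.Error msg ->
      prerr_endline ("not valid bitcode: " ^ msg)
  end
```

Like the tutorial program, this needs both the llvm and llvm.bitreader packages at build time.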
Llvm.dump_module llm ;
The dump_module function prints the contents of the module to stderr, in the textual LLVM IR form. Its main purpose is debugging, which fits the goal of this first tutorial well.
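If the output is needed somewhere other than stderr, the bindings also provide Llvm.string_of_llmodule and Llvm.print_module, at least in the versions I have looked at. The sketch below uses an empty module so that it runs without an input file. The module name "example" and the file name example.ll are arbitrary choices for the illustration.

```ocaml
let () =
  let llctx = Llvm.global_context () in
  (* An empty module, only used to have something to print. *)
  let llm = Llvm.create_module llctx "example" in
  (* The textual IR as an OCaml string, written to stdout instead of stderr. *)
  print_string (Llvm.string_of_llmodule llm);
  (* The same textual IR written to a file; the file name is arbitrary. *)
  Llvm.print_module "example.ll" llm
```

In the tutorial program, the same two calls could be applied to llm in place of Llvm.dump_module.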
Makefile
Build systems are certainly not one of OCaml’s strengths. To make things a little bit easier, I’ve decided to use ocamlbuild, but with a wrapper (a Makefile) to simplify the arguments. As I don’t like _tags files, everything will be passed on the command line.
The Makefile only wraps (more or less) the following commands:
export OCAMLPATH=/usr/lib/ocaml/llvm-$(LLVM_VERSION)
ocamlbuild -classic-display -j 0 -cflags -w,@a-4 -use-ocamlfind -pkgs llvm,llvm.bitreader -I src -build-dir build/tutorial01 tutorial01.byte
The options should be rather easy to understand:
- The first group of options, -classic-display -j 0 -cflags -w,@a-4, sets some generic ocamlbuild flags: classic build display, parallel build if possible, and compiler warnings (every warning except number 4 is enabled and treated as an error).
- -use-ocamlfind -pkgs llvm,llvm.bitreader are the most important options: they ask ocamlbuild to find the llvm and llvm.bitreader packages, required by our example. This is why we have to set OCAMLPATH to the directory containing the bindings.
- The remaining options specify where the sources are, and where to put the compiled files.
Running the application
We use clang to compile a simple Hello World file into an LLVM bitcode file.
$ clang -c -emit-llvm hello.c
$ file hello.bc
hello.bc: LLVM IR bitcode
We can now use our first application to dump the LLVM bitcode as textual IR:
$ LD_LIBRARY_PATH=/usr/lib/ocaml/llvm-3.5/ ./build/tutorial01/src/tutorial01.byte ./hello.bc
; ModuleID = './hello.bc'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-linux-gnu"
@.str = private unnamed_addr constant [14 x i8] c"hello, world\0A\00", align 1
; Function Attrs: nounwind uwtable
define i32 @main() #0 {
  %1 = alloca i32, align 4
  store i32 0, i32* %1
  %2 = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([14 x i8]* @.str, i32 0, i32 0))
  ret i32 0
}
declare i32 @printf(i8*, ...) #1
attributes #0 = { nounwind uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.ident = !{!0}
!0 = metadata !{metadata !"Debian clang version 3.5.2-1 (tags/RELEASE_352/final) (based on LLVM 3.5.2)"}
Our example application works as expected. That’s it for part 1 of the tutorial: you should now be able to build an application using the OCaml LLVM bindings.
The example code has been published on GitHub, in the ocaml-llvm-tutorial project.
To get it, run:
$ git clone https://github.com/chifflier/ocaml-llvm-tutorial.git
Next time
In part 2, we’ll see how to iterate over functions, and access simple values and attributes.
Links
- The LLVM Compiler Infrastructure
- LLVM Language Reference Manual
- http://llvm.moe/: OCaml bindings documentation