Tutorial - Writing LLVMC Configuration files
=============================================

LLVMC is a generic compiler driver, designed to be customizable and
extensible. It plays the same role for LLVM as the ``gcc`` program
does for GCC: LLVMC's job is essentially to transform a set of input
files into a set of targets depending on configuration rules and user
options. What makes LLVMC different is that these transformation rules
are completely customizable. In fact, LLVMC knows nothing about the
specifics of transformation (even the command-line options are mostly
not hard-coded) and regards the transformation structure as an
abstract graph. This makes it possible to adapt LLVMC for other
purposes, for example as a build tool for game resources. This
tutorial describes the basic usage and configuration of LLVMC.

Because LLVMC employs TableGen [1]_ as its configuration language, you
need to be familiar with it to customize LLVMC.

Compiling with LLVMC
--------------------

In general, LLVMC tries to be command-line compatible with ``gcc`` as
much as possible, so most of the familiar options work::

    $ llvmc2 -O3 -Wall hello.cpp
    $ ./a.out
    hello

One nice feature of LLVMC is that you don't have to distinguish
between different compilers for different languages (think ``g++`` and
``gcc``): the right toolchain is chosen automatically based on input
language names (which are, in turn, determined from file extensions). If
you want to override this choice, for example to force a file ending
with ".cpp" to compile as C, use the ``-x`` option, just as you would
with ``gcc``::

    $ llvmc2 -x c hello.cpp
    $ # hello.cpp is really a C file
    $ ./a.out
    hello

On the other hand, when using LLVMC as a linker to combine several C++
object files, you should provide the ``--linker`` option, since it's
impossible for LLVMC to choose the right linker in that case::

    $ llvmc2 -c hello.cpp
    $ llvmc2 hello.o
    [A lot of link-time errors skipped]
    $ llvmc2 --linker=c++ hello.o
    $ ./a.out
    hello

For further help on command-line LLVMC usage, refer to the
``llvmc2 --help`` output.

Customizing LLVMC: the compilation graph
----------------------------------------

At the time of writing, LLVMC does not support on-the-fly reloading of
configuration, so to customize LLVMC you'll have to edit and recompile
the source code (which lives under ``$LLVM_DIR/tools/llvmc2``). The
relevant files are ``Common.td``, ``Tools.td`` and ``Example.td``.

Internally, LLVMC stores information about possible transformations in
the form of a graph. Nodes in this graph represent tools, and edges
between two nodes represent a transformation path. A special "root"
node represents entry points for the transformations. LLVMC also
assigns a weight to each edge (more on that below) to choose between
several alternative edges.

The definition of the compilation graph (see file ``Example.td``) is
just a list of edges::

    def CompilationGraph : CompilationGraph<[
        Edge<root, llvm_gcc_c>,
        Edge<root, llvm_gcc_assembler>,
        ...

        Edge<llvm_gcc_c, llc>,
        Edge<llvm_gcc_cpp, llc>,
        ...

        OptionalEdge<llvm_gcc_c, opt, [(switch_on "opt")]>,
        OptionalEdge<llvm_gcc_cpp, opt, [(switch_on "opt")]>,
        ...

        OptionalEdge<llvm_gcc_assembler, llvm_gcc_cpp_linker,
            [(if_input_languages_contain "c++"),
             (or (parameter_equals "linker", "g++"),
                 (parameter_equals "linker", "c++"))]>,
        ...

        ]>;

As you can see, the edges can be either default or optional. Optional
edges carry a list of patterns (or edge properties), which are used to
calculate the edge's weight. The default edges are assigned a weight
of 1, and optional edges get a weight of 0 + 2*N, where N is the number
of successful edge property matches. When passing an input file through
the graph, LLVMC picks the edge with the maximum weight. To avoid
ambiguity, there should be only one default edge between two nodes
(with the exception of the root node, which gets special treatment:
there you are allowed to specify one default edge *per language*).

* Possible edge properties are:

  - ``switch_on`` - Returns true if a given command-line option is
    provided by the user. Example: ``(switch_on "opt")``. Note that
    you have to define all possible command-line options separately in
    the tool descriptions. See the next section for a discussion of
    the different kinds of command-line options.

  - ``parameter_equals`` - Returns true if a command-line parameter equals
    a given value. Example: ``(parameter_equals "W", "all")``.

  - ``element_in_list`` - Returns true if a command-line parameter list
    includes a given value. Example: ``(element_in_list "l", "pthread")``.

  - ``if_input_languages_contain`` - Returns true if a given input
    language belongs to the current input language set.

  - ``and`` - Edge property combinator. Returns true if all of its
    arguments return true. Used like this: ``(and (prop1), (prop2),
    ... (propN))``. Nesting is allowed, but not encouraged.

  - ``or`` - Edge property combinator that returns true if any one of its
    arguments returns true. Example: ``(or (prop1), (prop2), ... (propN))``.

  - ``weight`` - Makes it possible to explicitly specify the quantity
    added to the edge weight if this edge property matches. Used like
    this: ``(weight N, (prop))``. The inner property can include
    ``and`` and ``or`` combinators. When N is equal to 2, this is
    equivalent to ``(prop)``.

    Example: ``(weight 8, (and (switch_on "a"), (switch_on "b")))``.

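The weighting rules described above can be pictured with a small model. The following Python sketch is only an illustration of the arithmetic, not LLVMC's actual implementation: the function names, the ``(matched, bonus)`` representation of edge properties and the edge labels are all assumptions made for this example.

```python
DEFAULT_EDGE_WEIGHT = 1   # weight of a default (unconditional) edge
DEFAULT_MATCH_BONUS = 2   # added per matching property on an optional edge


def optional_edge_weight(properties):
    """Sum the bonuses of the properties that matched.

    `properties` is a list of (matched, bonus) pairs. A plain property
    uses DEFAULT_MATCH_BONUS; a (weight N, (prop)) property uses N.
    """
    return sum(bonus for matched, bonus in properties if matched)


def pick_edge(edges):
    """Return the name of the edge with the maximum weight.

    `edges` maps a (hypothetical) edge label to its weight.
    """
    return max(edges, key=edges.get)


# Hypothetical situation: the user passed the "opt" switch, so the single
# (switch_on "opt") property on the optional edge matches.
edges = {
    "llvm_gcc_c -> llc": DEFAULT_EDGE_WEIGHT,
    "llvm_gcc_c -> opt": optional_edge_weight([(True, DEFAULT_MATCH_BONUS)]),
}
chosen = pick_edge(edges)
print(chosen)
```

With one matching property the optional edge weighs 2 and wins over the default edge (weight 1); if the property does not match, the optional edge weighs 0 and the default edge is taken instead.
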

To get a visual representation of the compilation graph (useful for
debugging), run ``llvmc2 --view-graph``. You will need ``dot`` and
``gsview`` installed for this to work properly.

Writing a tool description
--------------------------

As mentioned earlier, nodes in the compilation graph represent tools. A
tool definition looks like this (taken from the ``Tools.td`` file)::

    def llvm_gcc_cpp : Tool<[
        (in_language "c++"),
        (out_language "llvm-assembler"),
        (output_suffix "bc"),
        (cmd_line "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm"),
        (sink)
        ]>;

This defines a new tool called ``llvm_gcc_cpp``, which is an alias for
``llvm-g++``. As you can see, a tool definition is just a list of
properties; most of them should be self-evident. The ``sink`` property
means that this tool should be passed all command-line options that
aren't handled by the other tools.

The complete list of the currently implemented tool properties follows:

* Possible tool properties:

  - ``in_language`` - input language name.

  - ``out_language`` - output language name.

  - ``output_suffix`` - output file suffix.

  - ``cmd_line`` - the actual command used to run the tool. You can use
    ``$INFILE`` and ``$OUTFILE`` variables, as well as output
    redirection with ``>``.

  - ``join`` - this tool is a "join node" in the graph, i.e. it gets a
    list of input files and joins them together. Used for linkers.

  - ``sink`` - all command-line options that are not handled by other
    tools are passed to this tool.

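To get a feel for what the ``cmd_line`` property describes, here is a rough Python sketch that expands the ``$INFILE`` and ``$OUTFILE`` variables in a command template. It is not how LLVMC itself performs the substitution; the function name ``expand_cmd_line`` and the file names are made up for the example, and output redirection with ``>`` is not handled.

```python
from string import Template


def expand_cmd_line(cmd_line, infile, outfile):
    """Fill in $INFILE and $OUTFILE, then split the result into arguments."""
    expanded = Template(cmd_line).substitute(INFILE=infile, OUTFILE=outfile)
    return expanded.split()


# The template is the cmd_line of the llvm_gcc_cpp tool; the file names
# are hypothetical.
argv = expand_cmd_line(
    "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm", "hello.cpp", "hello.bc"
)
print(argv)
```

The resulting list is the kind of argument vector a driver would hand to the operating system when it runs the tool.
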

The next tool definition is slightly more complex::

    def llvm_gcc_linker : Tool<[
        (in_language "object-code"),
        (out_language "executable"),
        (output_suffix "out"),
        (cmd_line "llvm-gcc $INFILE -o $OUTFILE"),
        (join),
        (prefix_list_option "L", (forward), (help "add a directory to link path")),
        (prefix_list_option "l", (forward), (help "search a library when linking")),
        (prefix_list_option "Wl", (unpack_values), (help "pass options to linker"))
        ]>;

This tool has a "join" property, which means that it behaves like a
linker (because of that, this tool should be the last in the
toolchain). This tool also defines several command-line options: ``-l``,
``-L`` and ``-Wl``, which have their usual meaning. An option has two
attributes: a name and a (possibly empty) list of properties. All
currently implemented option types and properties are described below:

* Possible option types:

  - ``switch_option`` - a simple boolean switch, for example ``-time``.

  - ``parameter_option`` - option that takes an argument, for example
    ``-std=c99``.

  - ``parameter_list_option`` - same as the above, but more than one
    occurrence of the option is allowed.

  - ``prefix_option`` - same as ``parameter_option``, but the option name
    and parameter value are not separated.

  - ``prefix_list_option`` - same as the above, but more than one
    occurrence of the option is allowed; example: ``-lm -lpthread``.

* Possible option properties:

  - ``append_cmd`` - append a string to the tool invocation command.

  - ``forward`` - forward this option unchanged.

  - ``stop_compilation`` - stop compilation after this phase.

  - ``unpack_values`` - used for splitting and forwarding
    comma-separated lists of options, e.g. ``-Wa,-foo=bar,-baz`` is
    converted to ``-foo=bar -baz`` and appended to the tool invocation
    command.

  - ``help`` - help string associated with this option.

  - ``required`` - this option is obligatory.

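The effect of ``unpack_values`` can be mimicked in a few lines of Python. This is only a sketch of the splitting described above, assuming the option arrives as a single string; the function name and its signature are invented for the example and are not part of LLVMC.

```python
def unpack_values(option, name):
    """Split the comma-separated values of an option such as -Wa,x,y."""
    prefix = "-" + name + ","
    if not option.startswith(prefix):
        raise ValueError(f"{option!r} is not a -{name},... option")
    return option[len(prefix):].split(",")


# The example from the list above: -Wa,-foo=bar,-baz
args = unpack_values("-Wa,-foo=bar,-baz", "Wa")
print(" ".join(args))  # -foo=bar -baz
```

The unpacked values would then be appended to the tool invocation command in the same order.
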

Language map
------------

One last thing that you need to modify when adding support for a new
language to LLVMC is the language map, which defines mappings from
file extensions to language names. It is used to choose the proper
toolchain based on the input. The language map definition is located in
the file ``Tools.td`` and looks like this::

    def LanguageMap : LanguageMap<
        [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
         LangToSuffixes<"c", ["c"]>,
         ...
        ]>;

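Conceptually, the language map is a lookup table from file suffix to language name. The Python sketch below models that lookup using only the two entries shown above; the names ``LANG_TO_SUFFIXES``, ``SUFFIX_TO_LANG`` and ``language_of`` are assumptions made for illustration, and the real lookup inside LLVMC may differ.

```python
import os

# The two entries shown in the language map; other languages are omitted.
LANG_TO_SUFFIXES = {
    "c++": ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"],
    "c": ["c"],
}

# Invert the table once so that lookups go from suffix to language.
SUFFIX_TO_LANG = {
    suffix: lang
    for lang, suffixes in LANG_TO_SUFFIXES.items()
    for suffix in suffixes
}


def language_of(filename):
    """Return the language name for `filename`, or None if unknown."""
    suffix = os.path.splitext(filename)[1].lstrip(".")
    return SUFFIX_TO_LANG.get(suffix)


print(language_of("hello.cpp"))  # c++
print(language_of("hello.c"))    # c
print(language_of("hello.C"))    # c++
```

Note that the lookup is case-sensitive: ``hello.c`` maps to C, while ``hello.C`` maps to C++, matching the suffix lists in the map.
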

References
==========

.. [1] TableGen Fundamentals
       http://llvm.cs.uiuc.edu/docs/TableGenFundamentals.html