mirror of
https://github.com/c64scene-ar/llvm-6502.git
synced 2025-08-05 13:26:55 +00:00
Since the old llvmc was removed, rename llvmc2 to llvmc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60048 91177308-0d34-0410-b5e6-96231b3b80d8
tools/llvmc/doc/LLVMC-Reference.rst
@@ -0,0 +1,517 @@
===================================
Customizing LLVMC: Reference Manual
===================================
:Author: Mikhail Glushenkov <foldr@codedegers.com>

LLVMC is a generic compiler driver, designed to be customizable and
extensible. It plays the same role for LLVM as the ``gcc`` program
does for GCC - LLVMC's job is essentially to transform a set of input
files into a set of targets depending on configuration rules and user
options. What makes LLVMC different is that these transformation rules
are completely customizable - in fact, LLVMC knows nothing about the
specifics of transformation (even the command-line options are mostly
not hard-coded) and regards the transformation structure as an
abstract graph. The structure of this graph is completely determined
by plugins, which can be either statically or dynamically linked. This
makes it possible to easily adapt LLVMC for other purposes - for
example, as a build tool for game resources.

Because LLVMC employs TableGen [1]_ as its configuration language, you
need to be familiar with it to customize LLVMC.


.. contents::


Compiling with LLVMC
====================

LLVMC tries hard to be as compatible with ``gcc`` as possible,
although there are some small differences. Most of the time, however,
you shouldn't be able to notice them::

     $ # This works as expected:
     $ llvmc -O3 -Wall hello.cpp
     $ ./a.out
     hello

One nice feature of LLVMC is that one doesn't have to distinguish
between different compilers for different languages (think ``g++`` and
``gcc``) - the right toolchain is chosen automatically based on input
language names (which are, in turn, determined from file
extensions). If you want to force files ending with ".c" to compile as
C++, use the ``-x`` option, just as you would with ``gcc``::

     $ # hello.c is really a C++ file
     $ llvmc -x c++ hello.c
     $ ./a.out
     hello

On the other hand, when using LLVMC as a linker to combine several C++
object files, you should provide the ``--linker`` option, since it's
impossible for LLVMC to choose the right linker in that case::

     $ llvmc -c hello.cpp
     $ llvmc hello.o
     [A lot of link-time errors skipped]
     $ llvmc --linker=c++ hello.o
     $ ./a.out
     hello


Predefined options
==================

LLVMC has some built-in options that can't be overridden in the
configuration files:

* ``-o FILE`` - Output file name.

* ``-x LANGUAGE`` - Specify the language of the following input files
  until the next ``-x`` option.

* ``-load PLUGIN_NAME`` - Load the specified plugin DLL. Example:
  ``-load $LLVM_DIR/Release/lib/LLVMCSimple.so``.

* ``-v`` - Enable verbose mode, i.e. print out all executed commands.

* ``--view-graph`` - Show a graphical representation of the compilation
  graph. Requires the ``dot`` and ``gv`` programs to be
  installed. Hidden option, useful for debugging.

* ``--write-graph`` - Write a ``compilation-graph.dot`` file in the
  current directory with the compilation graph description in the
  Graphviz format. Hidden option, useful for debugging.

* ``--save-temps`` - Write temporary files to the current directory
  and do not delete them on exit. Hidden option, useful for debugging.

* ``--help``, ``--help-hidden``, ``--version`` - These options have
  their standard meaning.


Compiling LLVMC plugins
=======================

It's easiest to start working on your own LLVMC plugin by copying the
skeleton project which lives under ``$LLVMC_DIR/plugins/Simple``::

     $ cd $LLVMC_DIR/plugins
     $ cp -r Simple MyPlugin
     $ cd MyPlugin
     $ ls
     Makefile PluginMain.cpp Simple.td

As you can see, our basic plugin consists of only two files (not
counting the build script). ``Simple.td`` contains the TableGen
description of the compilation graph; its format is documented in the
following sections. ``PluginMain.cpp`` is just a helper file used to
compile the auto-generated C++ code produced from the TableGen source. It
can also contain hook definitions (see `below`__).

__ hooks_

The first thing that you should do is to change the ``LLVMC_PLUGIN``
variable in the ``Makefile`` to avoid conflicts (since this variable
is used to name the resulting library)::

     LLVMC_PLUGIN=MyPlugin

It is also a good idea to rename ``Simple.td`` to something less
generic::

     $ mv Simple.td MyPlugin.td

Note that the plugin source directory must be placed under
``$LLVMC_DIR/plugins`` to make use of the existing build
infrastructure. To build a version of the LLVMC executable called
``mydriver`` with your plugin compiled in, use the following command::

     $ cd $LLVMC_DIR
     $ make BUILTIN_PLUGINS=MyPlugin DRIVER_NAME=mydriver

To build your plugin as a dynamic library, just ``cd`` to its source
directory and run ``make``. The resulting file will be called
``LLVMC$(LLVMC_PLUGIN).$(DLL_EXTENSION)`` (in our case,
``LLVMCMyPlugin.so``). This library can then be loaded with the
``-load`` option. Example::

     $ cd $LLVMC_DIR/plugins/Simple
     $ make
     $ llvmc -load $LLVM_DIR/Release/lib/LLVMCSimple.so

Sometimes, you will want a 'bare-bones' version of LLVMC that has no
built-in plugins. It can be compiled with the following command::

     $ cd $LLVMC_DIR
     $ make BUILTIN_PLUGINS=""

How plugins are loaded
======================

It is possible for LLVMC plugins to depend on each other. For example,
one can create edges between nodes defined in some other plugin. To
make this work, however, that plugin should be loaded first. To
achieve this, the concept of plugin priority was introduced. By
default, every plugin has priority zero; to specify the priority
explicitly, put the following line in your ``.td`` file::

     def Priority : PluginPriority<$PRIORITY_VALUE>;
     // where PRIORITY_VALUE is some integer > 0

Plugins are loaded in order of their (increasing) priority, starting
with 0. Therefore, the plugin with the highest priority value will be
loaded last.
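
As a minimal sketch of how this might look in practice, assume a
hypothetical plugin ``MyPlugin`` that adds an edge starting at a node
(``llvm_gcc_c``) defined in another plugin. The plugin name, the tool
name ``my_tool`` and the priority value ``1`` are illustrative
assumptions; the other plugin is assumed to keep the default priority
of zero, so it is loaded first::

     // MyPlugin.td (hypothetical plugin)
     include "llvm/CompilerDriver/Common.td"

     // Any value greater than the priority of the plugin we depend
     // on will do; 1 is an arbitrary choice.
     def Priority : PluginPriority<1>;

     // "my_tool" is assumed to be defined elsewhere in this file.
     def CompilationGraph : CompilationGraph<[
         Edge<"llvm_gcc_c", "my_tool">
         ]>;
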


Customizing LLVMC: the compilation graph
========================================

Each TableGen configuration file should include the common
definitions::

     include "llvm/CompilerDriver/Common.td"
     // And optionally:
     // include "llvm/CompilerDriver/Tools.td"
     // which contains some useful tool definitions.

Internally, LLVMC stores information about possible source
transformations in the form of a graph. Nodes in this graph represent
tools, and edges between two nodes represent a transformation path. A
special "root" node is used to mark entry points for the
transformations. LLVMC also assigns a weight to each edge (more on
this later) to choose between several alternative edges.

The definition of the compilation graph (see file
``plugins/Base/Base.td`` for an example) is just a list of edges::

     def CompilationGraph : CompilationGraph<[
         Edge<"root", "llvm_gcc_c">,
         Edge<"root", "llvm_gcc_assembler">,
         ...

         Edge<"llvm_gcc_c", "llc">,
         Edge<"llvm_gcc_cpp", "llc">,
         ...

         OptionalEdge<"llvm_gcc_c", "opt", (case (switch_on "opt"),
                                           (inc_weight))>,
         OptionalEdge<"llvm_gcc_cpp", "opt", (case (switch_on "opt"),
                                             (inc_weight))>,
         ...

         OptionalEdge<"llvm_gcc_assembler", "llvm_gcc_cpp_linker",
             (case (input_languages_contain "c++"), (inc_weight),
                   (or (parameter_equals "linker", "g++"),
                       (parameter_equals "linker", "c++")), (inc_weight))>,
         ...

         ]>;

As you can see, the edges can be either default or optional, where
optional edges are differentiated by an additional ``case`` expression
used to calculate the weight of this edge. Notice also that we refer
to tools via their names (as strings). This makes it possible to add
edges to an existing compilation graph in plugins without having to
know about all tool definitions used in the graph.

The default edges are assigned a weight of 1, and optional edges get a
weight of 0 + 2*N where N is the number of tests that evaluated to
true in the ``case`` expression. It is also possible to provide an
integer parameter to ``inc_weight`` and ``dec_weight`` - in this case,
the weight is increased (or decreased) by the provided value instead
of the default 2.

When passing an input file through the graph, LLVMC picks the edge
with the maximum weight. To avoid ambiguity, there should be only one
default edge between two nodes (with the exception of the root node,
which gets a special treatment - there you are allowed to specify one
default edge *per language*).
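
To make the weight arithmetic concrete, here is a small worked example
built from the ``llvm_gcc_c`` edges used in this section. The weights
in the comments simply apply the rules stated above; the commented-out
variant with an explicit ``inc_weight`` argument is a hypothetical
illustration, not a line taken from ``Base.td``::

     // Default edge: the weight is always 1.
     Edge<"llvm_gcc_c", "llc">,

     // Optional edge: the weight is 0 without "-opt" and
     // 0 + 2*1 = 2 with it, so "opt" is preferred over "llc"
     // (2 > 1) only when "-opt" is given.
     OptionalEdge<"llvm_gcc_c", "opt", (case (switch_on "opt"),
                                       (inc_weight))>,

     // Hypothetical variant with an explicit increment: the weight
     // becomes 0 + 5 = 5 when "-opt" is given.
     // OptionalEdge<"llvm_gcc_c", "opt", (case (switch_on "opt"),
     //                                   (inc_weight 5))>,
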

To get a visual representation of the compilation graph (useful for
debugging), run ``llvmc --view-graph``. You will need ``dot`` and
``gv`` installed for this to work properly.


Writing a tool description
==========================

As mentioned earlier, nodes in the compilation graph represent tools,
which are described separately. A tool definition looks like this
(taken from the ``include/llvm/CompilerDriver/Tools.td`` file)::

     def llvm_gcc_cpp : Tool<[
         (in_language "c++"),
         (out_language "llvm-assembler"),
         (output_suffix "bc"),
         (cmd_line "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm"),
         (sink)
         ]>;

This defines a new tool called ``llvm_gcc_cpp``, which is an alias for
``llvm-g++``. As you can see, a tool definition is just a list of
properties; most of them should be self-explanatory. The ``sink``
property means that this tool should be passed all command-line
options that lack explicit descriptions.

The complete list of the currently implemented tool properties follows:

* Possible tool properties:

  - ``in_language`` - input language name. Can be either a string or a
    list, in case the tool supports multiple input languages.

  - ``out_language`` - output language name.

  - ``output_suffix`` - output file suffix.

  - ``cmd_line`` - the actual command used to run the tool. You can
    use ``$INFILE`` and ``$OUTFILE`` variables, output redirection
    with ``>``, hook invocations (``$CALL``), environment variables
    (via ``$ENV``) and the ``case`` construct (more on this below).

  - ``join`` - this tool is a "join node" in the graph, i.e. it gets a
    list of input files and joins them together. Used for linkers.

  - ``sink`` - all command-line options that are not handled by other
    tools are passed to this tool.
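
As an illustration of these properties, here is a sketch of a
hypothetical tool that is not part of ``Tools.td``. The tool name
``my_preprocessor``, the language name ``c-cpp-output`` and the ``cpp``
command line are assumptions made for the example; only the property
names come from the list above::

     def my_preprocessor : Tool<[
         (in_language "c"),
         (out_language "c-cpp-output"),
         (output_suffix "i"),
         // Output redirection with ">" instead of a "-o" flag.
         (cmd_line "cpp $INFILE > $OUTFILE")
         ]>;
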

The next tool definition is slightly more complex::

     def llvm_gcc_linker : Tool<[
         (in_language "object-code"),
         (out_language "executable"),
         (output_suffix "out"),
         (cmd_line "llvm-gcc $INFILE -o $OUTFILE"),
         (join),
         (prefix_list_option "L", (forward),
                             (help "add a directory to link path")),
         (prefix_list_option "l", (forward),
                             (help "search a library when linking")),
         (prefix_list_option "Wl", (unpack_values),
                             (help "pass options to linker"))
         ]>;

This tool has a "join" property, which means that it behaves like a
linker. This tool also defines several command-line options: ``-l``,
``-L`` and ``-Wl``, which have their usual meaning. An option has two
attributes: a name and a (possibly empty) list of properties. All
currently implemented option types and properties are described below:

* Possible option types:

  - ``switch_option`` - a simple boolean switch, for example ``-time``.

  - ``parameter_option`` - option that takes an argument, for example
    ``-std=c99``.

  - ``parameter_list_option`` - same as the above, but more than one
    occurrence of the option is allowed.

  - ``prefix_option`` - same as ``parameter_option``, but the option name
    and parameter value are not separated.

  - ``prefix_list_option`` - same as the above, but more than one
    occurrence of the option is allowed; example: ``-lm -lpthread``.

  - ``alias_option`` - a special option type for creating
    aliases. Unlike other option types, aliases are not allowed to
    have any properties besides the aliased option name. Usage
    example: ``(alias_option "preprocess", "E")``


* Possible option properties:

  - ``append_cmd`` - append a string to the tool invocation command.

  - ``forward`` - forward this option unchanged.

  - ``forward_as`` - Change the name of this option, but forward the
    argument unchanged. Example: ``(forward_as "--disable-optimize")``.

  - ``output_suffix`` - modify the output suffix of this
    tool. Example: ``(switch_option "E", (output_suffix "i"))``.

  - ``stop_compilation`` - stop compilation after this phase.

  - ``unpack_values`` - used for splitting and forwarding
    comma-separated lists of options, e.g. ``-Wa,-foo=bar,-baz`` is
    converted to ``-foo=bar -baz`` and appended to the tool invocation
    command.

  - ``help`` - help string associated with this option. Used for
    ``--help`` output.

  - ``required`` - this option is obligatory.
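
The option types above can be combined as in the following sketch. The
tool ``my_c_compiler`` is hypothetical, as are the help strings and the
choice of options; ``llvm-gcc`` is used only because it already appears
in the examples in this section::

     def my_c_compiler : Tool<[
         (in_language "c"),
         (out_language "llvm-assembler"),
         (output_suffix "bc"),
         (cmd_line "llvm-gcc -c $INFILE -o $OUTFILE -emit-llvm"),
         // A boolean switch, forwarded to the tool unchanged.
         (switch_option "fsyntax-only", (forward),
                        (help "check the input for syntax errors only")),
         // Takes a single argument, e.g. -std=c99.
         (parameter_option "std", (forward),
                           (help "language standard to use")),
         // May occur several times, e.g. -Ifoo -Ibar.
         (prefix_list_option "I", (forward),
                             (help "add a directory to include path"))
         ]>;
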


Option list - specifying all options in a single place
======================================================

It can be handy to have all information about options gathered in a
single place to provide an overview. This can be achieved by using a
so-called ``OptionList``::

     def Options : OptionList<[
     (switch_option "E", (help "Help string")),
     (alias_option "quiet", "q")
     ...
     ]>;

``OptionList`` is also a good place to specify option aliases.

Tool-specific option properties like ``append_cmd`` have (obviously)
no meaning in the context of ``OptionList``, so the only properties
allowed there are ``help`` and ``required``.

Option lists are used at file scope. See the file
``plugins/Clang/Clang.td`` for an example of ``OptionList`` usage.

.. _hooks:

Using hooks and environment variables in the ``cmd_line`` property
==================================================================

Normally, LLVMC executes programs from the system ``PATH``. Sometimes,
this is not sufficient: for example, we may want to specify tool names
in the configuration file. This can be achieved via the mechanism of
hooks - to write your own hooks, just add their definitions to
``PluginMain.cpp`` or drop a ``.cpp`` file into the
``$LLVMC_DIR/driver`` directory. Hooks should live in the ``hooks``
namespace and have the signature
``std::string hooks::MyHookName(void)``. They can be used from the
``cmd_line`` tool property::

     (cmd_line "$CALL(MyHook)/path/to/file -o $CALL(AnotherHook)")

It is also possible to use environment variables in the same manner::

     (cmd_line "$ENV(VAR1)/path/to/file -o $ENV(VAR2)")

To change the command line string based on user-provided options, use
the ``case`` expression (documented below)::

     (cmd_line
       (case
         (switch_on "E"),
            "llvm-g++ -E -x c $INFILE -o $OUTFILE",
         (default),
            "llvm-g++ -c -x c $INFILE -o $OUTFILE -emit-llvm"))

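
Putting these pieces together, a tool whose executable location comes
from an environment variable might be sketched as follows. The tool
name ``my_cpp_compiler`` and the variable ``MY_TOOLS_DIR`` are
hypothetical; the properties and the ``$ENV`` and ``case`` syntax are
the ones described in this document::

     def my_cpp_compiler : Tool<[
         (in_language "c++"),
         (out_language "llvm-assembler"),
         (output_suffix "bc"),
         (cmd_line
           (case
             (switch_on "E"),
                "$ENV(MY_TOOLS_DIR)/llvm-g++ -E $INFILE -o $OUTFILE",
             (default),
                "$ENV(MY_TOOLS_DIR)/llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm")),
         // "-E" must be declared so that (switch_on "E") can test it.
         (switch_option "E", (stop_compilation), (output_suffix "i"),
                        (help "stop after the preprocessing stage"))
         ]>;
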
Conditional evaluation: the ``case`` expression
===============================================

The ``case`` construct can be used to calculate weights of the optional
edges and to choose between several alternative command line strings
in the ``cmd_line`` tool property. It is modeled on the
similarly-named construct in functional languages and takes the form
``(case (test_1), statement_1, (test_2), statement_2, ... (test_N),
statement_N)``. The statements are evaluated only if the corresponding
tests evaluate to true.

Examples::

     // Increases edge weight by 5 if "-A" is provided on the
     // command-line, and by 5 more if "-B" is also provided.
     (case
         (switch_on "A"), (inc_weight 5),
         (switch_on "B"), (inc_weight 5))

     // Evaluates to "cmdline1" if option "-A" is provided on the
     // command line, otherwise to "cmdline2" if "-B" is provided,
     // and to "cmdline3" if neither is.
     (case
         (switch_on "A"), "cmdline1",
         (switch_on "B"), "cmdline2",
         (default), "cmdline3")

Note the slight difference in ``case`` expression handling in contexts
of edge weights and command line specification - in the second example
the value of the ``"B"`` switch is never checked when switch ``"A"`` is
enabled, and the whole expression always evaluates to ``"cmdline1"`` in
that case.

Case expressions can also be nested, i.e. the following is legal::

     (case (switch_on "E"), (case (switch_on "o"), ..., (default), ...),
           (default), ...)

You should, however, try to avoid doing that because it hurts
readability. It is usually better to split tool descriptions and/or
use TableGen inheritance instead.

* Possible tests are:

  - ``switch_on`` - Returns true if a given command-line switch is
    provided by the user. Example: ``(switch_on "opt")``. Note that
    you have to define all possible command-line options separately in
    the tool descriptions. See `Writing a tool description`_ for a
    discussion of the different kinds of command-line options.

  - ``parameter_equals`` - Returns true if a command-line parameter equals
    a given value. Example: ``(parameter_equals "W", "all")``.

  - ``element_in_list`` - Returns true if a command-line parameter list
    includes a given value. Example: ``(element_in_list "l", "pthread")``.

  - ``input_languages_contain`` - Returns true if a given language
    belongs to the current input language set. Example:
    ``(input_languages_contain "c++")``.

  - ``in_language`` - Evaluates to true if the language of the input
    file equals the argument. At the moment, it works only with the
    ``cmd_line`` property on non-join nodes. Example: ``(in_language
    "c++")``.

  - ``not_empty`` - Returns true if a given option (which should be
    either a parameter or a parameter list) is set by the
    user. Example: ``(not_empty "o")``.

  - ``default`` - Always evaluates to true. Should always be the last
    test in the ``case`` expression.

  - ``and`` - A standard logical combinator that returns true if and
    only if all of its arguments return true. Used like this: ``(and
    (test1), (test2), ... (testN))``. Nesting of ``and`` and ``or`` is
    allowed, but not encouraged.

  - ``or`` - Another logical combinator that returns true if at least
    one of its arguments returns true. Example: ``(or (test1),
    (test2), ... (testN))``.
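
The tests above compose as in the following sketch of an optional
edge. The edge reuses the ``llvm_gcc_c`` and ``opt`` tool names from
the compilation graph example; the ``std`` and ``l`` options are
assumed to be defined in some tool description or ``OptionList``, and
the particular combination of tests is purely illustrative::

     // The weight is 2 if exactly one of the two top-level tests
     // holds and 4 if both hold.
     OptionalEdge<"llvm_gcc_c", "opt",
         (case (and (switch_on "opt"),
                    (or (parameter_equals "std", "c99"),
                        (parameter_equals "std", "gnu99"))), (inc_weight),
               (element_in_list "l", "pthread"), (inc_weight))>,
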


Language map
============

One last thing that you will need to modify when adding support for a
new language to LLVMC is the language map, which defines mappings from
file extensions to language names. It is used to choose the proper
toolchain(s) for a given input file set. A language map definition looks
like this::

     def LanguageMap : LanguageMap<
         [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
          LangToSuffixes<"c", ["c"]>,
          ...
         ]>;
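
For example, adding support for a new language could be sketched as
follows. The language name ``mylang``, the suffixes ``ml1`` and
``ml2``, the tool ``mylang_compiler`` and its ``mylang-frontend``
command are all invented for illustration, and the ``llc`` tool is
assumed to be defined elsewhere. The point is that the same language
name links the map entry, the tool's ``in_language`` property and the
edge from the root node::

     def LanguageMap : LanguageMap<
         [LangToSuffixes<"mylang", ["ml1", "ml2"]>
         ]>;

     def mylang_compiler : Tool<[
         (in_language "mylang"),
         (out_language "llvm-assembler"),
         (output_suffix "bc"),
         (cmd_line "mylang-frontend $INFILE -o $OUTFILE")
         ]>;

     def CompilationGraph : CompilationGraph<[
         Edge<"root", "mylang_compiler">,
         Edge<"mylang_compiler", "llc">
         ]>;
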

Debugging
=========

When writing LLVMC plugins, it can be useful to get a visual view of
the resulting compilation graph. This can be achieved via the command
line option ``--view-graph``. This command assumes that Graphviz [2]_ and
Ghostview [3]_ are installed. There is also a ``--write-graph`` option that
creates a Graphviz source file (``compilation-graph.dot``) in the
current directory.


References
==========

.. [1] TableGen Fundamentals
       http://llvm.cs.uiuc.edu/docs/TableGenFundamentals.html

.. [2] Graphviz
       http://www.graphviz.org/

.. [3] Ghostview
       http://pages.cs.wisc.edu/~ghost/