diff --git a/docs/CompilerDriverTutorial.html b/docs/CompilerDriverTutorial.html index 2eb452af0fe..cc4707506e3 100644 --- a/docs/CompilerDriverTutorial.html +++ b/docs/CompilerDriverTutorial.html @@ -4,13 +4,13 @@
-Mikhail Glushenkov <foldr@codedegers.com> |
LLVMC is a generic compiler driver, designed to be customizable and -extensible. It plays the same role for LLVM as the gcc program -does for GCC - LLVMC's job is essentially to transform a set of input -files into a set of targets depending on configuration rules and user -options. What makes LLVMC different is that these transformation rules -are completely customizable - in fact, LLVMC knows nothing about the -specifics of transformation (even the command-line options are mostly -not hard-coded) and regards the transformation structure as an -abstract graph. The structure of this graph is completely determined -by plugins, which can be either statically or dynamically linked. This -makes it possible to easily adapt LLVMC for other purposes - for -example, as a build tool for game resources.
-Because LLVMC employs TableGen [1] as its configuration language, you -need to be familiar with it to customize LLVMC.
+LLVMC is a generic compiler driver, which plays the same role for LLVM +as the gcc program does for GCC - the difference being that LLVMC +is designed to be more adaptable and easier to customize. Most of +LLVMC's functionality is implemented via plugins, which can be loaded +dynamically or compiled in. This tutorial describes the basic usage +and configuration of LLVMC.
LLVMC tries hard to be as compatible with gcc as possible, -although there are some small differences. Most of the time, however, -you shouldn't be able to notice them:
+In general, LLVMC tries to be command-line compatible with gcc as +much as possible, so most of the familiar options work:
-$ # This works as expected:
 $ llvmc -O3 -Wall hello.cpp
 $ ./a.out
 hello
-
One nice feature of LLVMC is that one doesn't have to distinguish -between different compilers for different languages (think g++ and -gcc) - the right toolchain is chosen automatically based on input -language names (which are, in turn, determined from file -extensions). If you want to force files ending with ".c" to compile as -C++, use the -x option, just like you would do it with gcc:
-
-$ # hello.c is really a C++ file
-$ llvmc -x c++ hello.c
-$ ./a.out
-hello
-
-
On the other hand, when using LLVMC as a linker to combine several C++ -object files you should provide the --linker option since it's -impossible for LLVMC to choose the right linker in that case:
-
-$ llvmc -c hello.cpp
-$ llvmc hello.o
-[A lot of link-time errors skipped]
-$ llvmc --linker=c++ hello.o
-$ ./a.out
-hello
-
-
By default, LLVMC uses llvm-gcc to compile the source code. It is -also possible to choose the work-in-progress clang compiler with -the -clang option.
+This will invoke llvm-g++ under the hood (you can see which +commands are executed by using the -v option). For further help on +command-line LLVMC usage, refer to the llvmc --help output.
LLVMC has some built-in options that can't be overridden in the -configuration libraries:
-It's easiest to start working on your own LLVMC plugin by copying the -skeleton project which lives under $LLVMC_DIR/plugins/Simple:
+LLVMC plugins are written mostly using TableGen [1], so you need to +be familiar with it to get anything done.
+Start by compiling plugins/Simple/Simple.td, which is a primitive +wrapper for gcc:
-$ cd $LLVMC_DIR/plugins
-$ cp -r Simple MyPlugin
-$ cd MyPlugin
-$ ls
-Makefile PluginMain.cpp Simple.td
+$ cd $LLVM_DIR/tools/llvmc
+$ make DRIVER_NAME=mygcc BUILTIN_PLUGINS=Simple
+$ cat > hello.c
+[...]
+$ mygcc hello.c
+$ ./hello.out
+Hello
-
As you can see, our basic plugin consists of only two files (not -counting the build script). Simple.td contains TableGen -description of the compilation graph; its format is documented in the -following sections. PluginMain.cpp is just a helper file used to -compile the auto-generated C++ code produced from TableGen source. It -can also contain hook definitions (see below).
-The first thing that you should do is to change the LLVMC_PLUGIN -variable in the Makefile to avoid conflicts (since this variable -is used to name the resulting library):
--LLVMC_PLUGIN=MyPlugin --
It is also a good idea to rename Simple.td to something less -generic:
--$ mv Simple.td MyPlugin.td --
Note that the plugin source directory must be placed under -$LLVMC_DIR/plugins to make use of the existing build -infrastructure. To build a version of the LLVMC executable called -mydriver with your plugin compiled in, use the following command:
-
-$ cd $LLVMC_DIR
-$ make BUILTIN_PLUGINS=MyPlugin DRIVER_NAME=mydriver
-
-
To build your plugin as a dynamic library, just cd to its source -directory and run make. The resulting file will be called -LLVMC$(LLVMC_PLUGIN).$(DLL_EXTENSION) (in our case, -LLVMCMyPlugin.so). This library can be then loaded in with the --load option. Example:
-
-$ cd $LLVMC_DIR/plugins/Simple
-$ make
-$ llvmc -load $LLVM_DIR/Release/lib/LLVMCSimple.so
-
-
Sometimes, you will want a 'bare-bones' version of LLVMC that has no -built-in plugins. It can be compiled with the following command:
--$ cd $LLVMC_DIR -$ make BUILTIN_PLUGINS="" --
Each TableGen configuration file should include the common -definitions:
+Here we link our plugin with the LLVMC core statically to form an +executable file called mygcc. It is also possible to build our +plugin as a standalone dynamic library; this is described in the +reference manual.
+Contents of the file Simple.td look like this:
+// Include common definitions
 include "llvm/CompilerDriver/Common.td"
-
-
Internally, LLVMC stores information about possible source -transformations in form of a graph. Nodes in this graph represent -tools, and edges between two nodes represent a transformation path. A -special "root" node is used to mark entry points for the -transformations. LLVMC also assigns a weight to each edge (more on -this later) to choose between several alternative edges.
-The definition of the compilation graph (see file -plugins/Base/Base.td for an example) is just a list of edges:
-
-def CompilationGraph : CompilationGraph<[
-    Edge<"root", "llvm_gcc_c">,
-    Edge<"root", "llvm_gcc_assembler">,
-    ...
-    Edge<"llvm_gcc_c", "llc">,
-    Edge<"llvm_gcc_cpp", "llc">,
-    ...
-
-    OptionalEdge<"llvm_gcc_c", "opt", (case (switch_on "opt"),
-                                      (inc_weight))>,
-    OptionalEdge<"llvm_gcc_cpp", "opt", (case (switch_on "opt"),
-                                        (inc_weight))>,
-    ...
-
-    OptionalEdge<"llvm_gcc_assembler", "llvm_gcc_cpp_linker",
-        (case (input_languages_contain "c++"), (inc_weight),
-              (or (parameter_equals "linker", "g++"),
-                  (parameter_equals "linker", "c++")), (inc_weight))>,
-    ...
-
-    ]>;
-
-
As you can see, the edges can be either default or optional, where -optional edges are differentiated by an additional case expression -used to calculate the weight of this edge. Notice also that we refer -to tools via their names (as strings). This makes it possible to add -edges to an existing compilation graph in plugins without having to -know about all tool definitions used in the graph.
-The default edges are assigned a weight of 1, and optional edges get a -weight of 0 + 2*N where N is the number of tests that evaluated to -true in the case expression. It is also possible to provide an -integer parameter to inc_weight and dec_weight - in this case, -the weight is increased (or decreased) by the provided value instead -of the default 2. It is also possible to change the default weight of -an optional edge by using the default clause of the case -construct.
-When passing an input file through the graph, LLVMC picks the edge -with the maximum weight. To avoid ambiguity, there should be only one -default edge between two nodes (with the exception of the root node, -which gets a special treatment - there you are allowed to specify one -default edge per language).
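The weight rules described here can be sketched in Python. This is only an illustration under stated assumptions, not LLVMC's implementation: the names `Edge`, `edge_weight`, and `pick_edge` are hypothetical, and each clause of a case expression is modeled as a pair of (test result, increment), with 2 as the increment when `inc_weight` is given no argument.

```python
from dataclasses import dataclass, field

DEFAULT_EDGE_WEIGHT = 1   # weight of a default (non-optional) edge
DEFAULT_INCREMENT = 2     # inc_weight without an argument adds 2


@dataclass
class Edge:
    """Hypothetical model of one edge leaving a node."""
    target: str
    optional: bool = False
    # One (test_result, increment) pair per clause of the case expression.
    clauses: list = field(default_factory=list)


def edge_weight(edge):
    """Default edges weigh 1; optional edges start at 0 and gain the
    increment of every clause whose test evaluated to true."""
    if not edge.optional:
        return DEFAULT_EDGE_WEIGHT
    return sum(inc for passed, inc in edge.clauses if passed)


def pick_edge(edges):
    """Return the outgoing edge with the maximum weight."""
    return max(edges, key=edge_weight)


# Assumed scenario: the "opt" switch is on, so the optional edge to "opt"
# (weight 2) outweighs the default edge to "llc" (weight 1).
edges = [
    Edge("llc"),
    Edge("opt", optional=True, clauses=[(True, DEFAULT_INCREMENT)]),
]
chosen = pick_edge(edges)
```

With the switch off, the optional edge keeps its weight of 0 and the default edge wins.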
-When multiple plugins are loaded, their compilation graphs are merged -together. Since multiple edges that have the same end nodes are not -allowed (i.e. the graph is not a multigraph), an edge defined in -several plugins will be replaced by the definition from the plugin -that was loaded last. Plugin load order can be controlled by using the -plugin priority feature described above.
-To get a visual representation of the compilation graph (useful for -debugging), run llvmc --view-graph. You will need dot and -gsview installed for this to work properly.
-Command-line options that the plugin supports are defined by using an -OptionList:
-
-def Options : OptionList<[
-(switch_option "E", (help "Help string")),
-(alias_option "quiet", "q")
-...
+// Tool descriptions
+def gcc : Tool<
+[(in_language "c"),
+ (out_language "executable"),
+ (output_suffix "out"),
+ (cmd_line "gcc $INFILE -o $OUTFILE"),
+ (sink)
 ]>;
-
-
As you can see, the option list is just a list of DAGs, where each DAG -is an option description consisting of the option name and some -properties. A plugin can define more than one option list (they are -all merged together in the end), which can be handy if one wants to -separate option groups syntactically.
-Possible option types:
----
-- switch_option - a simple boolean switch, for example -time.
-- parameter_option - option that takes an argument, for example --std=c99;
-- parameter_list_option - same as the above, but more than one -occurrence of the option is allowed.
-- prefix_option - same as the parameter_option, but the option name -and parameter value are not separated.
-- prefix_list_option - same as the above, but more than one -occurrence of the option is allowed; example: -lm -lpthread.
-- alias_option - a special option type for creating -aliases. Unlike other option types, aliases are not allowed to -have any properties besides the aliased option name. Usage -example: (alias_option "preprocess", "E")
-
Possible option properties:
----
-- help - help string associated with this option. Used for ---help output.
-- required - this option is obligatory.
-- hidden - this option should not appear in the --help -output (but should appear in the --help-hidden output).
-- really_hidden - the option should not appear in any help -output.
-- extern - this option is defined in some other plugin, see below.
-
Sometimes, when linking several plugins together, one plugin needs to -access options defined in some other plugin. Because of the way -options are implemented, such options should be marked as -extern. This is what the extern option property is -for. Example:
--... -(switch_option "E", (extern)) -... --
See also the section on plugin priorities.
-The 'case' construct is the main means by which programmability is -achieved in LLVMC. It can be used to calculate edge weights, program -actions and modify the shell commands to be executed. The 'case' -expression is designed after the similarly-named construct in -functional languages and takes the form (case (test_1), statement_1, -(test_2), statement_2, ... (test_N), statement_N). The statements -are evaluated only if the corresponding tests evaluate to true.
-Examples:
-// Edge weight calculation
-// Increases edge weight by 5 if "-A" is provided on the
-// command-line, and by 5 more if "-B" is also provided.
-(case
-    (switch_on "A"), (inc_weight 5),
-    (switch_on "B"), (inc_weight 5))
+// Language map
+def LanguageMap : LanguageMap<[LangToSuffixes<"c", ["c"]>]>;
-
-// Tool command line specification
-
-// Evaluates to "cmdline1" if the option "-A" is provided on the
-// command line; to "cmdline2" if "-B" is provided;
-// otherwise to "cmdline3".
-
-(case
-    (switch_on "A"), "cmdline1",
-    (switch_on "B"), "cmdline2",
-    (default), "cmdline3")
+// Compilation graph
+def CompilationGraph : CompilationGraph<[Edge<"root", "gcc">]>;
-
Note the slight difference in 'case' expression handling in contexts -of edge weights and command line specification - in the second example -the value of the "B" switch is never checked when switch "A" is -enabled, and the whole expression always evaluates to "cmdline1" in -that case.
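The two evaluation modes of 'case' can be contrasted with a short Python sketch. The function names are hypothetical and switches are modeled as a plain set of enabled names; this illustrates the described semantics only and is not how LLVMC evaluates TableGen.

```python
def eval_weight_case(clauses, switches):
    """Edge-weight context: every clause whose switch is on contributes."""
    return sum(inc for switch, inc in clauses if switch in switches)


def eval_cmdline_case(clauses, default, switches):
    """Command-line context: the first clause whose switch is on wins,
    and later clauses are not examined."""
    for switch, value in clauses:
        if switch in switches:
            return value
    return default


weight_clauses = [("A", 5), ("B", 5)]
cmd_clauses = [("A", "cmdline1"), ("B", "cmdline2")]

both = {"A", "B"}
weight = eval_weight_case(weight_clauses, both)             # both clauses fire
cmdline = eval_cmdline_case(cmd_clauses, "cmdline3", both)  # "B" is never checked
```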
-Case expressions can also be nested, i.e. the following is legal:
-
-(case (switch_on "E"), (case (switch_on "o"), ..., (default), ...)
-      (default), ...)
-
-
You should, however, try to avoid doing that because it hurts -readability. It is usually better to split tool descriptions and/or -use TableGen inheritance instead.
-As you can see, this file consists of three parts: tool descriptions, +language map, and the compilation graph definition.
+At the heart of LLVMC is the idea of a compilation graph: vertices in +this graph are tools, and edges represent a transformation path +between two tools (for example, assembly source produced by the +compiler can be transformed into executable code by an assembler). The +compilation graph is basically a list of edges; a special node named +root is used to mark graph entry points.
+Tool descriptions are represented as property lists: most properties +in the example above should be self-explanatory; the sink property +means that all options lacking an explicit description should be +forwarded to this tool.
+The LanguageMap associates a language name with a list of suffixes +and is used for deciding which toolchain corresponds to a given input +file.
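As a rough illustration of that lookup, the Python sketch below maps a file name to a language through its suffix. The dictionary mirrors the single `LangToSuffixes<"c", ["c"]>` entry of Simple.td; the function name `language_for` and the use of an exception for unknown suffixes are assumptions made for the example.

```python
import os

# Mirrors the LanguageMap of Simple.td: language name -> list of suffixes.
LANGUAGE_MAP = {"c": ["c"]}


def language_for(path, language_map=LANGUAGE_MAP):
    """Return the language whose suffix list contains the file's suffix."""
    suffix = os.path.splitext(path)[1].lstrip(".")
    for language, suffixes in language_map.items():
        if suffix in suffixes:
            return language
    # Hypothetical error handling; the real driver reports unknown suffixes itself.
    raise ValueError(f"Unknown suffix: {suffix}")
```

With only the "c" entry present, `language_for("hello.c")` yields "c", while a file such as hello.cpp is rejected until a matching entry is added to the map.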
+To learn more about LLVMC customization, refer to the reference +manual and plugin source code in the plugins directory.
As was said earlier, nodes in the compilation graph represent tools, -which are described separately. A tool definition looks like this -(taken from the include/llvm/CompilerDriver/Tools.td file):
-
-def llvm_gcc_cpp : Tool<[
-    (in_language "c++"),
-    (out_language "llvm-assembler"),
-    (output_suffix "bc"),
-    (cmd_line "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm"),
-    (sink)
-    ]>;
-
-
This defines a new tool called llvm_gcc_cpp, which is an alias for -llvm-g++. As you can see, a tool definition is just a list of -properties; most of them should be self-explanatory. The sink -property means that this tool should be passed all command-line -options that aren't mentioned in the option list.
-The complete list of all currently implemented tool properties follows.
-A tool often needs to react to command-line options, and this is -precisely what the actions property is for. The next example -illustrates this feature:
-
-def llvm_gcc_linker : Tool<[
-    (in_language "object-code"),
-    (out_language "executable"),
-    (output_suffix "out"),
-    (cmd_line "llvm-gcc $INFILE -o $OUTFILE"),
-    (join),
-    (actions (case (not_empty "L"), (forward "L"),
-                   (not_empty "l"), (forward "l"),
-                   (not_empty "dummy"),
-                             [(append_cmd "-dummy1"), (append_cmd "-dummy2")])
-    ]>;
-
-
The actions tool property is implemented on top of the omnipresent -case expression. It associates one or more different actions -with given conditions - in the example, the actions are forward, -which forwards a given option unchanged, and append_cmd, which -appends a given string to the tool execution command. Multiple actions -can be associated with a single condition by using a list of actions -(used in the example to append some dummy options). The same case -construct can also be used in the cmd_line property to modify the -tool command line.
-The "join" property used in the example means that this tool behaves -like a linker.
-The list of all possible actions follows.
-Possible actions:
----
-- append_cmd - append a string to the tool invocation -command. -Example: (case (switch_on "pthread"), (append_cmd "-lpthread"))
-- forward - forward an option unchanged. -Example: (forward "Wall").
-- forward_as - Change the name of an option, but forward the -argument unchanged. -Example: (forward_as "O0" "--disable-optimization").
-- output_suffix - modify the output suffix of this -tool. -Example: (output_suffix "i").
-- stop_compilation - stop compilation after this tool processes -its input. Used without arguments.
-- unpack_values - used for splitting and forwarding -comma-separated lists of options, e.g. -Wa,-foo=bar,-baz is -converted to -foo=bar -baz and appended to the tool invocation -command. -Example: (unpack_values "Wa,").
-
If you are adding support for a new language to LLVMC, you'll need to -modify the language map, which defines mappings from file extensions -to language names. It is used to choose the proper toolchain(s) for a -given input file set. Language map definition looks like this:
-
-def LanguageMap : LanguageMap<
-    [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
-     LangToSuffixes<"c", ["c"]>,
-     ...
-    ]>;
-
-
For example, without those definitions the following command wouldn't work:
-
-$ llvmc hello.cpp
-llvmc: Unknown suffix: cpp
-
-
The language map entries should be added only for tools that are -linked with the root node. Since tools are not allowed to have -multiple output languages, for nodes "inside" the graph the input and -output languages should match. This is enforced at compile-time.
-Normally, LLVMC executes programs from the system PATH. Sometimes, -this is not sufficient: for example, we may want to specify tool names -in the configuration file. This can be achieved via the mechanism of -hooks - to write your own hooks, just add their definitions to the -PluginMain.cpp or drop a .cpp file into the -$LLVMC_DIR/driver directory. Hooks should live in the hooks -namespace and have the signature std::string hooks::MyHookName -(void). They can be used from the cmd_line tool property:
--(cmd_line "$CALL(MyHook)/path/to/file -o $CALL(AnotherHook)") --
It is also possible to use environment variables in the same manner:
--(cmd_line "$ENV(VAR1)/path/to/file -o $ENV(VAR2)") --
To change the command line string based on user-provided options use -the case expression (documented above):
-
-(cmd_line
-  (case
-    (switch_on "E"),
-       "llvm-g++ -E -x c $INFILE -o $OUTFILE",
-    (default),
-       "llvm-g++ -c -x c $INFILE -o $OUTFILE -emit-llvm"))
-
-
It is possible for LLVMC plugins to depend on each other. For example, -one can create edges between nodes defined in some other plugin. To -make this work, however, that plugin should be loaded first. To -achieve this, the concept of plugin priority was introduced. By -default, every plugin has priority zero; to specify the priority -explicitly, put the following line in your plugin's TableGen file:
-
-def Priority : PluginPriority<$PRIORITY_VALUE>;
-# Where PRIORITY_VALUE is some integer > 0
-
-
Plugins are loaded in order of their (increasing) priority, starting -with 0. Therefore, the plugin with the highest priority value will be -loaded last.
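The interaction between load order and graph merging can be sketched in Python. This is a simplified model, not LLVMC code: `merge_graphs` is a hypothetical name, each plugin is a (name, priority, edges) tuple, and edges are keyed by their (source, target) pair so that a later definition replaces an earlier one. How ties between equal priorities are broken is not stated in the text; the sketch simply keeps the given order.

```python
def merge_graphs(plugins):
    """Merge plugin edge sets in order of increasing priority.

    plugins: list of (name, priority, edges), where edges maps
    (source, target) -> edge definition. Returns a dict that maps each
    edge to (defining plugin, definition); the last-loaded plugin wins.
    """
    merged = {}
    # sorted() is stable, so plugins with equal priority keep their given
    # order (an assumption; the text does not specify tie-breaking).
    for name, _priority, edges in sorted(plugins, key=lambda p: p[1]):
        for key, definition in edges.items():
            merged[key] = (name, definition)
    return merged


# Assumed scenario: MyPlugin (priority 1) redefines an edge from Base (priority 0).
plugins = [
    ("MyPlugin", 1, {("llvm_gcc_c", "llc"): "optional"}),
    ("Base", 0, {("llvm_gcc_c", "llc"): "default", ("root", "llvm_gcc_c"): "default"}),
]
graph = merge_graphs(plugins)
```

Here the edge from `llvm_gcc_c` to `llc` ends up with MyPlugin's definition because MyPlugin is loaded after Base, while the edge from `root` is left as Base defined it.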
-When writing LLVMC plugins, it can be useful to get a visual view of -the resulting compilation graph. This can be achieved via the command -line option --view-graph. This command assumes that Graphviz [2] and -Ghostview [3] are installed. There is also a --dump-graph option that -creates a Graphviz source file (compilation-graph.dot) in the -current directory.
-[1] | TableGen Fundamentals + |
[1] | TableGen Fundamentals http://llvm.cs.uiuc.edu/docs/TableGenFundamentals.html |
[2] | Graphviz -http://www.graphviz.org/ |
[3] | Ghostview -http://pages.cs.wisc.edu/~ghost/ |