From 99a3a2c44c7735cafab8d60fef16c63b7187e76a Mon Sep 17 00:00:00 2001
From: Mikhail Glushenkov
Date: Thu, 11 Dec 2008 23:33:33 +0000
Subject: [PATCH] Use correct file for the llvmc tutorial.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60910 91177308-0d34-0410-b5e6-96231b3b80d8
---
 docs/CompilerDriverTutorial.html | 609 ++++---------------------------
 tools/llvmc/doc/Makefile         |   2 +-
 2 files changed, 68 insertions(+), 543 deletions(-)

diff --git a/docs/CompilerDriverTutorial.html b/docs/CompilerDriverTutorial.html
index 2eb452af0fe..cc4707506e3 100644
--- a/docs/CompilerDriverTutorial.html
+++ b/docs/CompilerDriverTutorial.html
@@ -4,13 +4,13 @@
-Customizing LLVMC: Reference Manual
+Tutorial - Using LLVMC
-
-

Customizing LLVMC: Reference Manual

+
+

Tutorial - Using LLVMC

@@ -19,574 +19,99 @@
Mikhail Glushenkov <foldr@codedgers.com>
-

LLVMC is a generic compiler driver, designed to be customizable and
-extensible. It plays the same role for LLVM as the gcc program
-does for GCC - LLVMC's job is essentially to transform a set of input
-files into a set of targets depending on configuration rules and user
-options. What makes LLVMC different is that these transformation rules
-are completely customizable - in fact, LLVMC knows nothing about the
-specifics of transformation (even the command-line options are mostly
-not hard-coded) and regards the transformation structure as an
-abstract graph. The structure of this graph is completely determined
-by plugins, which can be either statically or dynamically linked. This
-makes it possible to easily adapt LLVMC for other purposes - for
-example, as a build tool for game resources.

-

Because LLVMC employs TableGen [1] as its configuration language, you
-need to be familiar with it to customize LLVMC.

+

LLVMC is a generic compiler driver, which plays the same role for LLVM
+as the gcc program does for GCC - the difference being that LLVMC
+is designed to be more adaptable and easier to customize. Most of
+LLVMC functionality is implemented via plugins, which can be loaded
+dynamically or compiled in. This tutorial describes the basic usage
+and configuration of LLVMC.

-

Compiling with LLVMC

-

LLVMC tries hard to be as compatible with gcc as possible,
-although there are some small differences. Most of the time, however,
-you shouldn't be able to notice them:

+

Compiling with LLVMC

+

In general, LLVMC tries to be command-line compatible with gcc as
+much as possible, so most of the familiar options work:

-$ # This works as expected:
 $ llvmc -O3 -Wall hello.cpp
 $ ./a.out
 hello
 
-

One nice feature of LLVMC is that one doesn't have to distinguish
-between different compilers for different languages (think g++ and
-gcc) - the right toolchain is chosen automatically based on input
-language names (which are, in turn, determined from file
-extensions). If you want to force files ending with ".c" to compile as
-C++, use the -x option, just like you would do it with gcc:

-
-$ # hello.c is really a C++ file
-$ llvmc -x c++ hello.c
-$ ./a.out
-hello
-
-

On the other hand, when using LLVMC as a linker to combine several C++
-object files you should provide the --linker option since it's
-impossible for LLVMC to choose the right linker in that case:

-
-$ llvmc -c hello.cpp
-$ llvmc hello.o
-[A lot of link-time errors skipped]
-$ llvmc --linker=c++ hello.o
-$ ./a.out
-hello
-
-

By default, LLVMC uses llvm-gcc to compile the source code. It is
-also possible to choose the work-in-progress clang compiler with
-the -clang option.

+

This will invoke llvm-g++ under the hood (you can see which
+commands are executed by using the -v option). For further help on
+command-line LLVMC usage, refer to the llvmc --help output.

-

Predefined options

-

LLVMC has some built-in options that can't be overridden in the
-configuration libraries:

-
    -
  • -o FILE - Output file name.
  • -
  • -x LANGUAGE - Specify the language of the following input files
-until the next -x option.
  • -
  • -load PLUGIN_NAME - Load the specified plugin DLL. Example:
--load $LLVM_DIR/Release/lib/LLVMCSimple.so.
  • -
  • -v - Enable verbose mode, i.e. print out all executed commands.
  • -
  • --view-graph - Show a graphical representation of the compilation
-graph. Requires that you have dot and gv programs
-installed. Hidden option, useful for debugging.
  • -
  • --write-graph - Write a compilation-graph.dot file in the
-current directory with the compilation graph description in the
-Graphviz format. Hidden option, useful for debugging.
  • -
  • --save-temps - Write temporary files to the current directory
-and do not delete them on exit. Hidden option, useful for debugging.
  • -
  • --help, --help-hidden, --version - These options have
-their standard meaning.
  • -
-
-
-

Compiling LLVMC plugins

-

It's easiest to start working on your own LLVMC plugin by copying the
-skeleton project which lives under $LLVMC_DIR/plugins/Simple:

+

Using LLVMC to generate toolchain drivers

+

LLVMC plugins are written mostly using TableGen [1], so you need to
+be familiar with it to get anything done.

+

Start by compiling plugins/Simple/Simple.td, which is a primitive
+wrapper for gcc:

-$ cd $LLVMC_DIR/plugins
-$ cp -r Simple MyPlugin
-$ cd MyPlugin
-$ ls
-Makefile PluginMain.cpp Simple.td
+$ cd $LLVM_DIR/tools/llvmc
+$ make DRIVER_NAME=mygcc BUILTIN_PLUGINS=Simple
+$ cat > hello.c
+[...]
+$ mygcc hello.c
+$ ./hello.out
+Hello
 
-

As you can see, our basic plugin consists of only two files (not
-counting the build script). Simple.td contains TableGen
-description of the compilation graph; its format is documented in the
-following sections. PluginMain.cpp is just a helper file used to
-compile the auto-generated C++ code produced from TableGen source. It
-can also contain hook definitions (see below).

-

The first thing that you should do is to change the LLVMC_PLUGIN
-variable in the Makefile to avoid conflicts (since this variable
-is used to name the resulting library):

-
-LLVMC_PLUGIN=MyPlugin
-
-

It is also a good idea to rename Simple.td to something less
-generic:

-
-$ mv Simple.td MyPlugin.td
-
-

Note that the plugin source directory must be placed under
-$LLVMC_DIR/plugins to make use of the existing build
-infrastructure. To build a version of the LLVMC executable called
-mydriver with your plugin compiled in, use the following command:

-
-$ cd $LLVMC_DIR
-$ make BUILTIN_PLUGINS=MyPlugin DRIVER_NAME=mydriver
-
-

To build your plugin as a dynamic library, just cd to its source
-directory and run make. The resulting file will be called
-LLVMC$(LLVMC_PLUGIN).$(DLL_EXTENSION) (in our case,
-LLVMCMyPlugin.so). This library can be then loaded in with the
--load option. Example:

-
-$ cd $LLVMC_DIR/plugins/Simple
-$ make
-$ llvmc -load $LLVM_DIR/Release/lib/LLVMCSimple.so
-
-

Sometimes, you will want a 'bare-bones' version of LLVMC that has no
-built-in plugins. It can be compiled with the following command:

-
-$ cd $LLVMC_DIR
-$ make BUILTIN_PLUGINS=""
-
-
-
-

Customizing LLVMC: the compilation graph

-

Each TableGen configuration file should include the common
-definitions:

+

Here we link our plugin with the LLVMC core statically to form an
+executable file called mygcc. It is also possible to build our
+plugin as a standalone dynamic library; this is described in the
+reference manual.

+

Contents of the file Simple.td look like this:

+// Include common definitions
 include "llvm/CompilerDriver/Common.td"
-
-

Internally, LLVMC stores information about possible source
-transformations in form of a graph. Nodes in this graph represent
-tools, and edges between two nodes represent a transformation path. A
-special "root" node is used to mark entry points for the
-transformations. LLVMC also assigns a weight to each edge (more on
-this later) to choose between several alternative edges.

-

The definition of the compilation graph (see file
-plugins/Base/Base.td for an example) is just a list of edges:

-
-def CompilationGraph : CompilationGraph<[
-    Edge<"root", "llvm_gcc_c">,
-    Edge<"root", "llvm_gcc_assembler">,
-    ...
 
-    Edge<"llvm_gcc_c", "llc">,
-    Edge<"llvm_gcc_cpp", "llc">,
-    ...
-
-    OptionalEdge<"llvm_gcc_c", "opt", (case (switch_on "opt"),
-                                      (inc_weight))>,
-    OptionalEdge<"llvm_gcc_cpp", "opt", (case (switch_on "opt"),
-                                              (inc_weight))>,
-    ...
-
-    OptionalEdge<"llvm_gcc_assembler", "llvm_gcc_cpp_linker",
-        (case (input_languages_contain "c++"), (inc_weight),
-              (or (parameter_equals "linker", "g++"),
-                  (parameter_equals "linker", "c++")), (inc_weight))>,
-    ...
-
-    ]>;
-
-

As you can see, the edges can be either default or optional, where
-optional edges are differentiated by an additional case expression
-used to calculate the weight of this edge. Notice also that we refer
-to tools via their names (as strings). This makes it possible to add
-edges to an existing compilation graph in plugins without having to
-know about all tool definitions used in the graph.

-

The default edges are assigned a weight of 1, and optional edges get a
-weight of 0 + 2*N where N is the number of tests that evaluated to
-true in the case expression. It is also possible to provide an
-integer parameter to inc_weight and dec_weight - in this case,
-the weight is increased (or decreased) by the provided value instead
-of the default 2. It is also possible to change the default weight of
-an optional edge by using the default clause of the case
-construct.

-

When passing an input file through the graph, LLVMC picks the edge
-with the maximum weight. To avoid ambiguity, there should be only one
-default edge between two nodes (with the exception of the root node,
-which gets a special treatment - there you are allowed to specify one
-default edge per language).

-

When multiple plugins are loaded, their compilation graphs are merged
-together. Since multiple edges that have the same end nodes are not
-allowed (i.e. the graph is not a multigraph), an edge defined in
-several plugins will be replaced by the definition from the plugin
-that was loaded last. Plugin load order can be controlled by using the
-plugin priority feature described above.

-

To get a visual representation of the compilation graph (useful for
-debugging), run llvmc --view-graph. You will need dot and
-gsview installed for this to work properly.

-
-
-

Describing options

-

Command-line options that the plugin supports are defined by using an
-OptionList:

-
-def Options : OptionList<[
-(switch_option "E", (help "Help string")),
-(alias_option "quiet", "q")
-...
+// Tool descriptions
+def gcc : Tool<
+[(in_language "c"),
+ (out_language "executable"),
+ (output_suffix "out"),
+ (cmd_line "gcc $INFILE -o $OUTFILE"),
+ (sink)
 ]>;
-
-

As you can see, the option list is just a list of DAGs, where each DAG
-is an option description consisting of the option name and some
-properties. A plugin can define more than one option list (they are
-all merged together in the end), which can be handy if one wants to
-separate option groups syntactically.

-
    -
  • Possible option types:

    -
    -
      -
    • switch_option - a simple boolean switch, for example -time.
    • -
    • parameter_option - option that takes an argument, for example
--std=c99;
    • -
    • parameter_list_option - same as the above, but more than one
-occurence of the option is allowed.
    • -
    • prefix_option - same as the parameter_option, but the option name
-and parameter value are not separated.
    • -
    • prefix_list_option - same as the above, but more than one
-occurence of the option is allowed; example: -lm -lpthread.
    • -
    • alias_option - a special option type for creating
-aliases. Unlike other option types, aliases are not allowed to
-have any properties besides the aliased option name. Usage
-example: (alias_option "preprocess", "E")
    • -
    -
    -
  • -
  • Possible option properties:

    -
    -
      -
    • help - help string associated with this option. Used for
---help output.
    • -
    • required - this option is obligatory.
    • -
    • hidden - this option should not appear in the --help
-output (but should appear in the --help-hidden output).
    • -
    • really_hidden - the option should not appear in any help
-output.
    • -
    • extern - this option is defined in some other plugin, see below.
    • -
    -
    -
  • -
-
-

External options

-

Sometimes, when linking several plugins together, one plugin needs to
-access options defined in some other plugin. Because of the way
-options are implemented, such options should be marked as
-extern. This is what the extern option property is
-for. Example:

-
-...
-(switch_option "E", (extern))
-...
-
-

See also the section on plugin priorities.

-
-
-
-

Conditional evaluation

-

The 'case' construct is the main means by which programmability is
-achieved in LLVMC. It can be used to calculate edge weights, program
-actions and modify the shell commands to be executed. The 'case'
-expression is designed after the similarly-named construct in
-functional languages and takes the form (case (test_1), statement_1,
-(test_2), statement_2, ... (test_N), statement_N). The statements
-are evaluated only if the corresponding tests evaluate to true.

-

Examples:

-
-// Edge weight calculation
 
-// Increases edge weight by 5 if "-A" is provided on the
-// command-line, and by 5 more if "-B" is also provided.
-(case
-    (switch_on "A"), (inc_weight 5),
-    (switch_on "B"), (inc_weight 5))
+// Language map
+def LanguageMap : LanguageMap<[LangToSuffixes<"c", ["c"]>]>;
 
-
-// Tool command line specification
-
-// Evaluates to "cmdline1" if the option "-A" is provided on the
-// command line; to "cmdline2" if "-B" is provided;
-// otherwise to "cmdline3".
-
-(case
-    (switch_on "A"), "cmdline1",
-    (switch_on "B"), "cmdline2",
-    (default), "cmdline3")
+// Compilation graph
+def CompilationGraph : CompilationGraph<[Edge<"root", "gcc">]>;
 
-

Note the slight difference in 'case' expression handling in contexts
-of edge weights and command line specification - in the second example
-the value of the "B" switch is never checked when switch "A" is
-enabled, and the whole expression always evaluates to "cmdline1" in
-that case.

-

Case expressions can also be nested, i.e. the following is legal:

-
-(case (switch_on "E"), (case (switch_on "o"), ..., (default), ...)
-      (default), ...)
-
-

You should, however, try to avoid doing that because it hurts
-readability. It is usually better to split tool descriptions and/or
-use TableGen inheritance instead.

-
    -
  • Possible tests are:
      -
    • switch_on - Returns true if a given command-line switch is
-provided by the user. Example: (switch_on "opt").
    • -
    • parameter_equals - Returns true if a command-line parameter equals
-a given value.
-Example: (parameter_equals "W", "all").
    • -
    • element_in_list - Returns true if a command-line parameter
-list contains a given value.
-Example: (parameter_in_list "l", "pthread").
    • -
    • input_languages_contain - Returns true if a given language
-belongs to the current input language set.
-Example: (input_languages_contain "c++").
    • -
    • in_language - Evaluates to true if the input file language
-equals to the argument. At the moment works only with cmd_line
-and actions (on non-join nodes).
-Example: (in_language "c++").
    • -
    • not_empty - Returns true if a given option (which should be
-either a parameter or a parameter list) is set by the
-user.
-Example: (not_empty "o").
    • -
    • default - Always evaluates to true. Should always be the last
-test in the case expression.
    • -
    • and - A standard logical combinator that returns true iff all
-of its arguments return true. Used like this: (and (test1),
-(test2), ... (testN)). Nesting of and and or is allowed,
-but not encouraged.
    • -
    • or - Another logical combinator that returns true only if any
-one of its arguments returns true. Example: (or (test1),
-(test2), ... (testN)).
    • -
    -
  • -
+

As you can see, this file consists of three parts: tool descriptions,
+language map, and the compilation graph definition.

+

At the heart of LLVMC is the idea of a compilation graph: vertices in
+this graph are tools, and edges represent a transformation path
+between two tools (for example, assembly source produced by the
+compiler can be transformed into executable code by an assembler). The
+compilation graph is basically a list of edges; a special node named
+root is used to mark graph entry points.

+

Tool descriptions are represented as property lists: most properties
+in the example above should be self-explanatory; the sink property
+means that all options lacking an explicit description should be
+forwarded to this tool.

+

The LanguageMap associates a language name with a list of suffixes
+and is used for deciding which toolchain corresponds to a given input
+file.

+

To learn more about LLVMC customization, refer to the reference
+manual and plugin source code in the plugins directory.

-

Writing a tool description

-

As was said earlier, nodes in the compilation graph represent tools,
-which are described separately. A tool definition looks like this
-(taken from the include/llvm/CompilerDriver/Tools.td file):

-
-def llvm_gcc_cpp : Tool<[
-    (in_language "c++"),
-    (out_language "llvm-assembler"),
-    (output_suffix "bc"),
-    (cmd_line "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm"),
-    (sink)
-    ]>;
-
-

This defines a new tool called llvm_gcc_cpp, which is an alias for
-llvm-g++. As you can see, a tool definition is just a list of
-properties; most of them should be self-explanatory. The sink
-property means that this tool should be passed all command-line
-options that aren't mentioned in the option list.

-

The complete list of all currently implemented tool properties follows.

-
    -
  • Possible tool properties:
      -
    • in_language - input language name. Can be either a string or a
-list, in case the tool supports multiple input languages.
    • -
    • out_language - output language name. Tools are not allowed to
-have multiple output languages.
    • -
    • output_suffix - output file suffix. Can also be changed
-dynamically, see documentation on actions.
    • -
    • cmd_line - the actual command used to run the tool. You can
-use $INFILE and $OUTFILE variables, output redirection
-with >, hook invocations ($CALL), environment variables
-(via $ENV) and the case construct.
    • -
    • join - this tool is a "join node" in the graph, i.e. it gets a
-list of input files and joins them together. Used for linkers.
    • -
    • sink - all command-line options that are not handled by other
-tools are passed to this tool.
    • -
    • actions - A single big case expression that specifies how
-this tool reacts on command-line options (described in more detail
-below).
    • -
    -
  • -
-
-

Actions

-

A tool often needs to react to command-line options, and this is
-precisely what the actions property is for. The next example
-illustrates this feature:

-
-def llvm_gcc_linker : Tool<[
-    (in_language "object-code"),
-    (out_language "executable"),
-    (output_suffix "out"),
-    (cmd_line "llvm-gcc $INFILE -o $OUTFILE"),
-    (join),
-    (actions (case (not_empty "L"), (forward "L"),
-                   (not_empty "l"), (forward "l"),
-                   (not_empty "dummy"),
-                             [(append_cmd "-dummy1"), (append_cmd "-dummy2")])
-    ]>;
-
-

The actions tool property is implemented on top of the omnipresent
-case expression. It associates one or more different actions
-with given conditions - in the example, the actions are forward,
-which forwards a given option unchanged, and append_cmd, which
-appends a given string to the tool execution command. Multiple actions
-can be associated with a single condition by using a list of actions
-(used in the example to append some dummy options). The same case
-construct can also be used in the cmd_line property to modify the
-tool command line.

-

The "join" property used in the example means that this tool behaves
-like a linker.

-

The list of all possible actions follows.

-
    -
  • Possible actions:

    -
    -
      -
    • append_cmd - append a string to the tool invocation
-command.
-Example: (case (switch_on "pthread"), (append_cmd "-lpthread"))
    • -
    • forward - forward an option unchanged.
-Example: (forward "Wall").
    • -
    • forward_as - Change the name of an option, but forward the
-argument unchanged.
-Example: (forward_as "O0" "--disable-optimization").
    • -
    • output_suffix - modify the output suffix of this
-tool.
-Example: (output_suffix "i").
    • -
    • stop_compilation - stop compilation after this tool processes
-its input. Used without arguments.
    • -
    • unpack_values - used for for splitting and forwarding
-comma-separated lists of options, e.g. -Wa,-foo=bar,-baz is
-converted to -foo=bar -baz and appended to the tool invocation
-command.
-Example: (unpack_values "Wa,").
    • -
    -
    -
  • -
-
-
-
-

Language map

-

If you are adding support for a new language to LLVMC, you'll need to
-modify the language map, which defines mappings from file extensions
-to language names. It is used to choose the proper toolchain(s) for a
-given input file set. Language map definition looks like this:

-
-def LanguageMap : LanguageMap<
-    [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
-     LangToSuffixes<"c", ["c"]>,
-     ...
-    ]>;
-
-

For example, without those definitions the following command wouldn't work:

-
-$ llvmc hello.cpp
-llvmc: Unknown suffix: cpp
-
-

The language map entries should be added only for tools that are
-linked with the root node. Since tools are not allowed to have
-multiple output languages, for nodes "inside" the graph the input and
-output languages should match. This is enforced at compile-time.

-
-
-

More advanced topics

-
-

Hooks and environment variables

-

Normally, LLVMC executes programs from the system PATH. Sometimes,
-this is not sufficient: for example, we may want to specify tool names
-in the configuration file. This can be achieved via the mechanism of
-hooks - to write your own hooks, just add their definitions to the
-PluginMain.cpp or drop a .cpp file into the
-$LLVMC_DIR/driver directory. Hooks should live in the hooks
-namespace and have the signature std::string hooks::MyHookName
-(void). They can be used from the cmd_line tool property:

-
-(cmd_line "$CALL(MyHook)/path/to/file -o $CALL(AnotherHook)")
-
-

It is also possible to use environment variables in the same manner:

-
-(cmd_line "$ENV(VAR1)/path/to/file -o $ENV(VAR2)")
-
-

To change the command line string based on user-provided options use
-the case expression (documented above):

-
-(cmd_line
-  (case
-    (switch_on "E"),
-       "llvm-g++ -E -x c $INFILE -o $OUTFILE",
-    (default),
-       "llvm-g++ -c -x c $INFILE -o $OUTFILE -emit-llvm"))
-
-
-
-

How plugins are loaded

-

It is possible for LLVMC plugins to depend on each other. For example,
-one can create edges between nodes defined in some other plugin. To
-make this work, however, that plugin should be loaded first. To
-achieve this, the concept of plugin priority was introduced. By
-default, every plugin has priority zero; to specify the priority
-explicitly, put the following line in your plugin's TableGen file:

-
-def Priority : PluginPriority<$PRIORITY_VALUE>;
-# Where PRIORITY_VALUE is some integer > 0
-
-

Plugins are loaded in order of their (increasing) priority, starting
-with 0. Therefore, the plugin with the highest priority value will be
-loaded last.

-
-
-

Debugging

-

When writing LLVMC plugins, it can be useful to get a visual view of
-the resulting compilation graph. This can be achieved via the command
-line option --view-graph. This command assumes that Graphviz [2] and
-Ghostview [3] are installed. There is also a --dump-graph option that
-creates a Graphviz source file(compilation-graph.dot) in the
-current directory.

-
-
-
-

References

- +

References

+
-
[1]TableGen Fundamentals +
[1]TableGen Fundamentals http://llvm.cs.uiuc.edu/docs/TableGenFundamentals.html
- - - - - -
[2]Graphviz
-http://www.graphviz.org/
- - - - - -
[3]Ghostview
-http://pages.cs.wisc.edu/~ghost/

Valid CSS
diff --git a/tools/llvmc/doc/Makefile b/tools/llvmc/doc/Makefile
index 60ba2ed0b07..d126e51c6ea 100644
--- a/tools/llvmc/doc/Makefile
+++ b/tools/llvmc/doc/Makefile
@@ -17,7 +17,7 @@ RST2HTML=rst2html --stylesheet=$(RST_CSS) --link-stylesheet
 all : LLVMC-Reference.html LLVMC-Tutorial.html $(RST_CSS)
 	$(CP) $(RST_CSS) $(DOC_DIR)/$(RST_CSS)
 	$(CP) LLVMC-Reference.html $(DOC_DIR)/CompilerDriver.html
-	$(CP) LLVMC-Reference.html $(DOC_DIR)/CompilerDriverTutorial.html
+	$(CP) LLVMC-Tutorial.html $(DOC_DIR)/CompilerDriverTutorial.html
 
 LLVMC-Tutorial.html : LLVMC-Tutorial.rst $(RST_CSS)
 	$(RST2HTML) $< $@