Update documentation, add examples.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51736 91177308-0d34-0410-b5e6-96231b3b80d8
2025-03-28 21:38:44 +00:00 · 2008-05-30 06:14:42 +00:00 · 2008-05-30 06:14:42 +00:00 · cd0858e170
commit cd0858e170
parent 1e4bab2c02
5 changed files with 238 additions and 74 deletions
--- a/tools/llvmc2/Makefile
+++ b/tools/llvmc2/Makefile
@ -15,8 +15,9 @@ REQUIRES_EH := 1
 include $(LEVEL)/Makefile.common

 GRAPH=Graph.td
-TOOLS=Tools.td
-TOOLS_SOURCE=$(GRAPH) $(TOOLS) Common.td
+$(GRAPH) : Common.td
+Graph.td : Tools.td
+TOOLS_SOURCE=$(GRAPH)

 # TOFIX: integrate this part into Makefile.rules?
 # The degree of horrorshowness in that file is too much for me atm.
--- a/tools/llvmc2/doc/LLVMC-Reference.rst
+++ b/tools/llvmc2/doc/LLVMC-Reference.rst
@ -1,5 +1,5 @@
-Tutorial - Writing LLVMC Configuration files
-=============================================
+Customizing LLVMC: Reference Manual
+===================================

 LLVMC is a generic compiler driver, designed to be customizable and
 extensible. It plays the same role for LLVM as the ``gcc`` program
@ -10,8 +10,7 @@ are completely customizable - in fact, LLVMC knows nothing about the
 specifics of transformation (even the command-line options are mostly
 not hard-coded) and regards the transformation structure as an
 abstract graph. This makes it possible to adapt LLVMC for other
-purposes - for example, as a build tool for game resources. This
-tutorial describes the basic usage and configuration of LLVMC.
+purposes - for example, as a build tool for game resources.

 Because LLVMC employs TableGen [1]_ as its configuration language, you
 need to be familiar with it to customize LLVMC.
@ -19,19 +18,21 @@ need to be familiar with it to customize LLVMC.
 Compiling with LLVMC
 --------------------

-In general, LLVMC tries to be command-line compatible with ``gcc`` as
-much as possible, so most of the familiar options work::
+LLVMC tries hard to be as compatible with ``gcc`` as possible,
+although there are some small differences. Most of the time, however,
+you shouldn't be able to notice them::

+     $ # This works as expected:
     $ llvmc2 -O3 -Wall hello.cpp
     $ ./a.out
     hello

-One nice feature of LLVMC is that you don't have to distinguish
+One nice feature of LLVMC is that one doesn't have to distinguish
 between different compilers for different languages (think ``g++`` and
 ``gcc``) - the right toolchain is chosen automatically based on input
-language names (which are, in turn, determined from file extension). If
-you want to force files ending with ".c" compile as C++, use the
-``-x`` option, just like you would do it with ``gcc``::
+language names (which are, in turn, determined from file
+extensions). If you want to force files ending with ".c" to compile as
+C++, use the ``-x`` option, just like you would do it with ``gcc``::

      $ llvmc2 -x c hello.cpp
      $ # hello.cpp is really a C file
@ -49,25 +50,36 @@ impossible for LLVMC to choose the right linker in that case::
    $ ./a.out
    hello

-For further help on command-line LLVMC usage, refer to the ``llvmc
--help`` output.

 Customizing LLVMC: the compilation graph
 ----------------------------------------

 At the time of writing LLVMC does not support on-the-fly reloading of
-configuration, so to customize LLVMC you'll have to edit and recompile
-the source code (which lives under ``$LLVM_DIR/tools/llvmc2``). The
-relevant files are ``Common.td``, ``Tools.td`` and ``Example.td``.
+configuration, so to customize LLVMC you'll have to recompile the
+source code (which lives under ``$LLVM_DIR/tools/llvmc2``). The
+default configuration files are ``Common.td`` (contains common
+definitions, don't forget to ``include`` it in your configuration
+files), ``Tools.td`` (tool descriptions) and ``Graph.td`` (compilation
+graph definition).

-Internally, LLVMC stores information about possible transformations in
-form of a graph. Nodes in this graph represent tools, and edges
-between two nodes represent a transformation path. A special "root"
-node represents entry points for the transformations. LLVMC also
-assigns a weight to each edge (more on that below) to choose between
-several alternative edges.
+To compile LLVMC with your own configuration file (say, ``MyGraph.td``),
+run ``make`` like this::

-The definition of the compilation graph (see file ``Example.td``) is
+    $ cd $LLVM_DIR/tools/llvmc2
+    $ make GRAPH=MyGraph.td TOOLNAME=my_llvmc
+
+This will build an executable named ``my_llvmc``. There are also
+several sample configuration files in the ``llvmc2/examples``
+subdirectory that should help to get you started.
+
+Internally, LLVMC stores information about possible source
+transformations in the form of a graph. Nodes in this graph represent
+tools, and edges between two nodes represent a transformation path. A
+special "root" node is used to mark entry points for the
+transformations. LLVMC also assigns a weight to each edge (more on
+this later) to choose between several alternative edges.
+
+The definition of the compilation graph (see file ``Graph.td``) is
 just a list of edges::

    def CompilationGraph : CompilationGraph<[
@ -84,25 +96,46 @@ just a list of edges::
        ...

        OptionalEdge<llvm_gcc_assembler, llvm_gcc_cpp_linker,
-            [(if_input_languages_contain "c++"),
-             (or (parameter_equals "linker", "g++"),
-             (parameter_equals "linker", "c++"))]>,
+            (case (input_languages_contain "c++"), (inc_weight),
+                  (or (parameter_equals "linker", "g++"),
+                      (parameter_equals "linker", "c++")), (inc_weight))>,
        ...

        ]>;

 As you can see, the edges can be either default or optional, where
-optional edges are differentiated by sporting a list of patterns (or
-edge properties) which are used to calculate the edge's weight. The
-default edges are assigned a weight of 1, and optional edges get a
-weight of 0 + 2*N where N is the number of succesful edge property
-matches. When passing an input file through the graph, LLVMC picks the
-edge with the maximum weight. To avoid ambiguity, there should be only
-one default edge between two nodes (with the exception of the root
-node, which gets a special treatment - there you are allowed to
-specify one default edge *per language*).
+optional edges are differentiated by sporting a ``case`` expression
+used to calculate the edge's weight.

-* Possible edge properties are:
+The default edges are assigned a weight of 1, and optional edges get a
+weight of 0 + 2*N where N is the number of tests that evaluated to
+true in the ``case`` expression. It is also possible to provide an
+integer parameter to ``inc_weight`` and ``dec_weight`` - in this case,
+the weight is increased (or decreased) by the provided value instead
+of the default 2.
+
+When passing an input file through the graph, LLVMC picks the edge
+with the maximum weight. To avoid ambiguity, there should be only one
+default edge between two nodes (with the exception of the root node,
+which gets a special treatment - there you are allowed to specify one
+default edge *per language*).
+
+To get a visual representation of the compilation graph (useful for
+debugging), run ``llvmc2 --view-graph``. You will need ``dot`` and
+``gsview`` installed for this to work properly.
+
+
+The 'case' construct
+--------------------
+
+The 'case' construct can be used to calculate weights for optional
+edges and to choose between several alternative command line strings
+in the ``cmd_line`` tool property. It is modeled after the
+similarly-named construct in functional languages and takes the
+form ``(case (test_1), statement_1, (test_2), statement_2,
+... (test_N), statement_N)``.
+
+* Possible tests are:

  - ``switch_on`` - Returns true if a given command-line option is
    provided by the user. Example: ``(switch_on "opt")``. Note that
@ -116,35 +149,28 @@ specify one default edge *per language*).
  - ``element_in_list`` - Returns true if a command-line parameter list
    includes a given value. Example: ``(element_in_list "l", "pthread")``.

-  - ``if_input_languages_contain`` - Returns true if a given input
-    language belongs to the current input language set.
+  - ``input_languages_contain`` - Returns true if a given language
+    belongs to the current input language set. Example:
+    ``(input_languages_contain "c++")``.

-  - ``and`` - Edge property combinator. Returns true if all of its
-    arguments return true. Used like this: ``(and (prop1), (prop2),
-    ... (propN))``. Nesting is allowed, but not encouraged.
+  - ``default`` - Always evaluates to true. Should be used

-  - ``or`` - Edge property combinator that returns true if any one of its
-    arguments returns true. Example: ``(or (prop1), (prop2), ... (propN))``.
+  - ``and`` - A standard logical combinator that returns true iff all
+    of its arguments return true. Used like this: ``(and (test1),
+    (test2), ... (testN))``. Nesting of ``and`` and ``or`` is allowed,
+    but not encouraged.

-  - ``weight`` - Makes it possible to explicitly specify the quantity
-    added to the edge weight if this edge property matches. Used like
-    this: ``(weight N, (prop))``. The inner property can include
-    ``and`` and ``or`` combinators. When N is equal to 2, equivalent
-    to ``(prop)``.
-
-    Example: ``(weight 8, (and (switch_on "a"), (switch_on "b")))``.
-
-
-To get a visual representation of the compilation graph (useful for
-debugging), run ``llvmc2 --view-graph``. You will need ``dot`` and
-``gsview`` installed for this to work properly.
+  - ``or`` - Another logical combinator that returns true iff at least
+    one of its arguments returns true. Example: ``(or (test1),
+    (test2), ... (testN))``.


 Writing a tool description
 --------------------------

-As was said earlier, nodes in the compilation graph represent tools. A
-tool definition looks like this (taken from the ``Tools.td`` file)::
+As was said earlier, nodes in the compilation graph represent tools,
+which are described separately. A tool definition looks like this
+(taken from the ``Tools.td`` file)::

  def llvm_gcc_cpp : Tool<[
      (in_language "c++"),
@ -156,9 +182,9 @@ tool definition looks like this (taken from the ``Tools.td`` file)::

 This defines a new tool called ``llvm_gcc_cpp``, which is an alias for
 ``llvm-g++``. As you can see, a tool definition is just a list of
-properties; most of them should be self-evident. The ``sink`` property
-means that this tool should be passed all command-line options that
-aren't handled by the other tools.
+properties; most of them should be self-explanatory. The ``sink``
+property means that this tool should be passed all command-line
+options that lack explicit descriptions.

 The complete list of the currently implemented tool properties follows:

@ -170,9 +196,10 @@ The complete list of the currently implemented tool properties follows:

  - ``output_suffix`` - output file suffix.

-  - ``cmd_line`` - the actual command used to run the tool. You can use
-    ``$INFILE`` and ``$OUTFILE`` variables, as well as output
-    redirection with ``>``.
+  - ``cmd_line`` - the actual command used to run the tool. You can
+    use ``$INFILE`` and ``$OUTFILE`` variables, output redirection
+    with ``>``, hook invocations (``$CALL``), environment variables
+    (via ``$ENV``) and the ``case`` construct (more on this below).

  - ``join`` - this tool is a "join node" in the graph, i.e. it gets a
    list of input files and joins them together. Used for linkers.
@ -188,14 +215,16 @@ The next tool definition is slightly more complex::
      (output_suffix "out"),
      (cmd_line "llvm-gcc $INFILE -o $OUTFILE"),
      (join),
-      (prefix_list_option "L", (forward), (help "add a directory to link path")),
-      (prefix_list_option "l", (forward), (help "search a library when linking")),
-      (prefix_list_option "Wl", (unpack_values), (help "pass options to linker"))
+      (prefix_list_option "L", (forward),
+                          (help "add a directory to link path")),
+      (prefix_list_option "l", (forward),
+                          (help "search a library when linking")),
+      (prefix_list_option "Wl", (unpack_values),
+                          (help "pass options to linker"))
      ]>;

 This tool has a "join" property, which means that it behaves like a
-linker (because of that this tool should be the last in the
-toolchain). This tool also defines several command-line options: ``-l``,
+linker. This tool also defines several command-line options: ``-l``,
 ``-L`` and ``-Wl`` which have their usual meaning. An option has two
 attributes: a name and a (possibly empty) list of properties. All
 currently implemented option types and properties are described below:
@ -223,6 +252,9 @@ currently implemented option types and properties are described below:

   - ``forward`` - forward this option unchanged.

+   - ``output_suffix`` - modify the output suffix of this
+     tool. Example: ``(switch "E", (output_suffix "i"))``.
+
   - ``stop_compilation`` - stop compilation after this phase.

   - ``unpack_values`` - used for splitting and forwarding
@ -230,19 +262,48 @@ currently implemented option types and properties are described below:
     converted to ``-foo=bar -baz`` and appended to the tool invocation
     command.

-   - ``help`` - help string associated with this option.
+   - ``help`` - help string associated with this option. Used for
+     ``--help`` output.

   - ``required`` - this option is obligatory.


+Hooks and environment variables
+-------------------------------
+
+Normally, LLVMC executes programs from the system ``PATH``. Sometimes,
+this is not sufficient: for example, we may want to specify tool names
+in the configuration file. This can be achieved via the mechanism of
+hooks - to compile LLVMC with your hooks, just drop a .cpp file into
+the ``tools/llvmc2`` directory. Hooks should live in the ``hooks``
+namespace and have the signature ``std::string hooks::MyHookName
+(void)``. They can be used from the ``cmd_line`` tool property::
+
+    (cmd_line "$CALL(MyHook)/path/to/file -o $CALL(AnotherHook)")
+
+It is also possible to use environment variables in the same manner::
+
+   (cmd_line "$ENV(VAR1)/path/to/file -o $ENV(VAR2)")
+
+To change the command line string based on user-provided options use
+the ``case`` expression (which we have already seen before)::
+
+    (cmd_line
+      (case
+        (switch_on "E"),
+           "llvm-g++ -E -x c $INFILE -o $OUTFILE",
+        (default),
+           "llvm-g++ -c -x c $INFILE -o $OUTFILE -emit-llvm"))
+
+
 Language map
 ------------

-One last thing that you need to modify when adding support for a new
-language to LLVMC is the language map, which defines mappings from
+One last thing that you will need to modify when adding support for a
+new language to LLVMC is the language map, which defines mappings from
 file extensions to language names. It is used to choose the proper
-toolchain based on the input. Language map definition is located in
-the file ``Tools.td`` and looks like this::
+toolchain(s) for a given input file set. The language map definition is
+located in the file ``Tools.td`` and looks like this::

    def LanguageMap : LanguageMap<
        [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
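The edge-weight rule documented in LLVMC-Reference.rst above (default edges weigh 1, optional edges start at 0 and gain 2, or an explicit ``inc_weight`` value, for every test that evaluates to true, and the edge with the maximum weight wins) can be illustrated with a small C++ sketch. This is not LLVMC's implementation: ``Edge``, ``edge_weight`` and ``pick_edge`` are hypothetical names, and breaking ties in favour of the first edge is an assumption made only for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of an outgoing edge; not LLVMC's actual data structure.
struct Edge {
    std::string target;          // name of the tool this edge leads to
    bool is_default;             // default edges always weigh 1
    std::vector<bool> tests;     // outcome of each test in the edge's `case`
    std::vector<int> increments; // weight added when the matching test is true
};

// Default edges weigh 1; optional edges start at 0 and gain the
// increment of every test that evaluated to true.
int edge_weight(const Edge& e) {
    if (e.is_default) return 1;
    int weight = 0;
    for (std::size_t i = 0; i < e.tests.size() && i < e.increments.size(); ++i)
        if (e.tests[i]) weight += e.increments[i];
    return weight;
}

// Return the edge with the maximum weight. On ties the first edge wins
// (an assumption of this sketch). `edges` must not be empty.
const Edge& pick_edge(const std::vector<Edge>& edges) {
    const Edge* best = &edges.front();
    for (const Edge& e : edges)
        if (edge_weight(e) > edge_weight(*best)) best = &e;
    return *best;
}
```

With one default edge and one optional edge whose single matching test uses the default increment of 2, ``pick_edge`` returns the optional edge (weight 2 beats 1); if no test matches, the optional edge weighs 0 and the default edge is chosen.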
--- a/tools/llvmc2/doc/LLVMC-Tutorial.rst
+++ b/tools/llvmc2/doc/LLVMC-Tutorial.rst
@ -0,0 +1,87 @@
+Tutorial - Using LLVMC
+======================
+
+LLVMC is a generic compiler driver, which plays the same role for LLVM
+as the ``gcc`` program does for GCC - the difference being that LLVMC
+is designed to be more adaptable and easier to customize. This
+tutorial describes the basic usage and configuration of LLVMC.
+
+Compiling with LLVMC
+--------------------
+
+In general, LLVMC tries to be command-line compatible with ``gcc`` as
+much as possible, so most of the familiar options work::
+
+     $ llvmc2 -O3 -Wall hello.cpp
+     $ ./a.out
+     hello
+
+For further help on command-line LLVMC usage, refer to the
+``llvmc2 --help`` output.
+
+Using LLVMC to generate toolchain drivers
+-----------------------------------------
+
+At the time of writing LLVMC does not support on-the-fly reloading of
+configuration, so it will be necessary to recompile its source
+code. LLVMC uses TableGen [1]_ as its configuration language, so
+you'll need to be familiar with it.
+
+Start by compiling ``examples/Simple.td``, which is a simple wrapper
+for ``gcc``::
+
+    $ cd $LLVM_DIR/tools/llvmc2
+    $ make TOOLNAME=mygcc GRAPH=examples/Simple.td
+    $ edit hello.c
+    $ mygcc hello.c
+    $ ./hello.out
+    Hello
+
+The contents of the file ``Simple.td`` look like this::
+
+    // Include common definitions
+    include "Common.td"
+
+    // Tool descriptions
+    def gcc : Tool<
+    [(in_language "c"),
+     (out_language "executable"),
+     (output_suffix "out"),
+     (cmd_line "gcc $INFILE -o $OUTFILE"),
+     (sink)
+    ]>;
+
+    // Language map
+    def LanguageMap : LanguageMap<[LangToSuffixes<"c", ["c"]>]>;
+
+    // Compilation graph
+    def CompilationGraph : CompilationGraph<[Edge<root, gcc>]>;
+
+As you can see, this file consists of three parts: tool descriptions,
+language map, and the compilation graph definition.
+
+At the heart of LLVMC is the idea of a transformation graph: vertices
+in this graph are tools, and edges signify that there is a
+transformation path between two tools (for example, assembly source
+produced by the compiler can be transformed into executable code by an
+assembler). A special node named ``root`` is used to mark graph entry
+points.
+
+Tool descriptions are basically lists of properties: most properties
+in the example above should be self-explanatory; the ``sink`` property
+means that all options lacking an explicit description should be
+forwarded to this tool.
+
+``LanguageMap`` associates a language name with a list of suffixes and
+is used for deciding which toolchain corresponds to a given input
+file.
+
+To learn more about LLVMC customization, refer to the reference
+manual and sample configuration files in the ``examples`` directory.
+
+References
+==========
+
+.. [1] TableGen Fundamentals
+       http://llvm.cs.uiuc.edu/docs/TableGenFundamentals.html
+
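The tutorial says that ``LanguageMap`` associates a language name with a list of suffixes and is used to decide which toolchain handles a given input file. The lookup might be sketched in C++ as follows. The ``LanguageMap`` alias and ``language_for`` are hypothetical names chosen for illustration, not LLVMC's API, and the sketch treats everything after the last dot as the suffix.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory form of a language map: language name -> suffixes.
using LanguageMap = std::map<std::string, std::vector<std::string>>;

// Return the language whose suffix list contains the file's extension,
// or an empty string when the file has no extension or no entry matches.
std::string language_for(const LanguageMap& map, const std::string& file) {
    const std::string::size_type dot = file.rfind('.');
    if (dot == std::string::npos) return "";
    const std::string suffix = file.substr(dot + 1);
    for (const auto& entry : map)
        for (const std::string& s : entry.second)
            if (s == suffix) return entry.first;
    return "";
}
```

Given a map with ``"c" -> ["c"]`` and ``"c++" -> ["cc", "cpp"]``, the call ``language_for(map, "hello.cpp")`` yields ``"c++"``, and a file without a known suffix yields an empty string.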
--- a/tools/llvmc2/examples/Clang.td
+++ b/tools/llvmc2/examples/Clang.td
--- a/tools/llvmc2/examples/Simple.td
+++ b/tools/llvmc2/examples/Simple.td
@ -0,0 +1,15 @@
+// A simple wrapper for gcc.
+
+include "Common.td"
+
+def gcc : Tool<
+[(in_language "c"),
+ (out_language "executable"),
+ (output_suffix "out"),
+ (cmd_line "gcc $INFILE -o $OUTFILE"),
+ (sink)
+]>;
+
+def LanguageMap : LanguageMap<[LangToSuffixes<"c", ["c"]>]>;
+
+def CompilationGraph : CompilationGraph<[Edge<root, gcc>]>;
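The ``cmd_line`` property in ``Simple.td`` uses the ``$INFILE`` and ``$OUTFILE`` variables. A minimal C++ sketch of how such a template could be expanded is shown below. It assumes plain textual substitution; ``replace_all`` and ``expand_cmd_line`` are hypothetical helpers, and LLVMC's real expansion (which also handles ``$CALL`` and ``$ENV``) may work differently.

```cpp
#include <string>

// Replace every occurrence of `name` in `text` with `value`.
std::string replace_all(std::string text, const std::string& name,
                        const std::string& value) {
    if (name.empty()) return text;
    std::string::size_type pos = 0;
    while ((pos = text.find(name, pos)) != std::string::npos) {
        text.replace(pos, name.size(), value);
        pos += value.size(); // skip past the inserted value
    }
    return text;
}

// Expand the two file placeholders of a cmd_line template.
std::string expand_cmd_line(const std::string& tmpl, const std::string& in,
                            const std::string& out) {
    return replace_all(replace_all(tmpl, "$INFILE", in), "$OUTFILE", out);
}
```

For the template above, ``expand_cmd_line("gcc $INFILE -o $OUTFILE", "hello.c", "hello.out")`` produces ``gcc hello.c -o hello.out``.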