Update documentation to reflect the current state of affairs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50763 91177308-0d34-0410-b5e6-96231b3b80d8
2024-12-14 11:32:34 +00:00 · 2008-05-06 18:17:19 +00:00 · 2008-05-06 18:17:19 +00:00 · 77ddce97ad
commit 77ddce97ad
parent be86712de8
1 changed files with 157 additions and 61 deletions
--- a/tools/llvmc2/doc/LLVMC-Tutorial.rst
+++ b/tools/llvmc2/doc/LLVMC-Tutorial.rst
@ -1,44 +1,141 @@
-Tutorial - Writing LLVMCC Configuration files
+Tutorial - Writing LLVMC Configuration files
 =============================================

-LLVMCC is a generic compiler driver (just like ``gcc``), designed to be
-customizable and extensible. Its job is essentially to transform a set
-of input files into a set of targets, depending on configuration rules
-and user options. This tutorial describes how one can write
-configuration files for ``llvmcc``.
+LLVMC is a generic compiler driver, designed to be customizable and
+extensible. It plays the same role for LLVM as the ``gcc`` program
+does for GCC - LLVMC's job is essentially to transform a set of input
+files into a set of targets depending on configuration rules and user
+options. What makes LLVMC different is that these transformation rules
+are completely customizable - in fact, LLVMC knows nothing about the
+specifics of transformation (even the command-line options are mostly
+not hard-coded) and regards the transformation structure as an
+abstract graph. This makes it possible to adapt LLVMC for other
+purposes - for example, as a build tool for game resources. This
+tutorial describes the basic usage and configuration of LLVMC.

-Because LLVMCC uses TableGen [1]_ as the language of its configuration
-files, you need to be familiar with it.
+Because LLVMC employs TableGen [1]_ as its configuration language, you
+need to be familiar with it to customize LLVMC.

-Describing a toolchain
----------------------
+Compiling with LLVMC
+--------------------

-The main concept that ``llvmcc`` operates with is a *toolchain*, which
-is just a list of tools that process input files in a pipeline-like
-fashion. Toolchain definitions look like this::
+In general, LLVMC tries to be command-line compatible with ``gcc`` as
+much as possible, so most of the familiar options work::

-   def ToolChains : ToolChains<[
-       ToolChain<[llvm_gcc_c, llc, llvm_gcc_assembler, llvm_gcc_linker]>,
-       ToolChain<[llvm_gcc_cpp, llc, llvm_gcc_assembler, llvm_gcc_linker]>,
-       ...
-       ]>;
+     $ llvmc2 -O3 -Wall hello.cpp
+     $ ./a.out
+     hello

-Every configuration file should have a single toolchains list called
-``ToolChains``.
+One nice feature of LLVMC is that you don't have to distinguish
+between different compilers for different languages (think ``g++`` and
+``gcc``) - the right toolchain is chosen automatically based on input
+language names (which are, in turn, determined from file extension). If
+you want to force files ending with ".c" compile as C++, use the
+``-x`` option, just like you would do it with ``gcc``::

-At the time of writing, ``llvmcc`` does not support mixing various
-toolchains together - in other words, all input files should be in the
-same language.
+      $ llvmc2 -x c hello.cpp
+      $ # hello.cpp is really a C file
+      $ ./a.out
+      hello

-Another temporary limitation is that every toolchain should end with a
-"join" node - a linker-like program that combines its inputs into a
-single output file.
+On the other hand, when using LLVMC as a linker to combine several C++
+object files you should provide the ``--linker`` option since it's
+impossible for LLVMC to choose the right linker in that case::

-Describing a tool
-----------------
+    $ llvmc2 -c hello.cpp
+    $ llvmc2 hello.o
+    [A lot of link-time errors skipped]
+    $ llvmc2 --linker=c++ hello.o
+    $ ./a.out
+    hello

-A single element of a toolchain is a tool. A tool definition looks
-like this (taken from the Tools.td file)::
+For further help on command-line LLVMC usage, refer to the ``llvmc
+--help`` output.
+
+Customizing LLVMC: the compilation graph
+----------------------------------------
+
+At the time of writing LLVMC does not support on-the-fly reloading of
+configuration, so to customize LLVMC you'll have to edit and recompile
+the source code (which lives under ``$LLVM_DIR/tools/llvmc2``). The
+relevant files are ``Common.td``, ``Tools.td`` and ``Example.td``.
+
+Internally, LLVMC stores information about possible transformations in
+form of a graph. Nodes in this graph represent tools, and edges
+between two nodes represent a transformation path. A special "root"
+node represents entry points for the transformations. LLVMC also
+assigns a weight to each edge (more on that below) to choose between
+several alternative edges.
+
+The definition of the compilation graph (see file ``Example.td``) is
+just a list of edges::
+
+    def CompilationGraph : CompilationGraph<[
+        Edge<root, llvm_gcc_c>,
+        Edge<root, llvm_gcc_assembler>,
+        ...
+
+        Edge<llvm_gcc_c, llc>,
+        Edge<llvm_gcc_cpp, llc>,
+        ...
+
+        OptionalEdge<llvm_gcc_c, opt, [(switch_on "opt")]>,
+        OptionalEdge<llvm_gcc_cpp, opt, [(switch_on "opt")]>,
+        ...
+
+        OptionalEdge<llvm_gcc_assembler, llvm_gcc_cpp_linker,
+            [(if_input_languages_contain "c++"),
+             (or (parameter_equals "linker", "g++"),
+             (parameter_equals "linker", "c++"))]>,
+        ...
+
+        ]>;
+
+As you can see, the edges can be either default or optional, where
+optional edges are differentiated by sporting a list of patterns (or
+edge properties) which are used to calculate the edge's weight. The
+default edges are assigned a weight of 1, and optional edges get a
+weight of 0 + 2*N where N is the number of succesful edge property
+matches. When passing an input file through the graph, LLVMC picks the
+edge with the maximum weight. To avoid ambiguity, there should be only
+one default edge between two nodes (with the exception of the root
+node, which gets a special treatment - there you are allowed to
+specify one default edge *per language*).
+
+* Possible edge properties are:
+
+  - ``switch_on`` - Returns true if a given command-line option is
+    provided by the user. Example: ``(switch_on "opt")``. Note that
+    you have to define all possible command-line options separately in
+    the tool descriptions. See the next section for the discussion of
+    different kinds of command-line options.
+
+  - ``parameter_equals`` - Returns true if a command-line parameter equals
+    a given value. Example: ``(parameter_equals "W", "all")``.
+
+  - ``element_in_list`` - Returns true if a command-line parameter list
+    includes a given value. Example: ``(parameter_in_list "l", "pthread")``.
+
+  - ``if_input_languages_contain`` - Returns true if a given input
+    language belongs to the current input language set.
+
+  - ``and`` - Edge property combinator. Returns true if all of its
+    arguments return true. Used like this: (and
+    (prop1), (prop2), ... (propN)). Nesting not allowed.
+
+  - ``or`` - Edge property combinator that returns true if any one of its
+    arguments returns true. Example: (or (prop1), (prop2), ... (propN))
+
+To get a visual representation of the compilation graph (useful for
+debugging), run ``llvmc2 --view-graph``. You will need ``dot`` and
+``gsview`` installed for this to work properly.
+
+
+Writing a tool description
+--------------------------
+
+As was said earlier, nodes in the compilation graph represent tools. A
+tool definition looks like this (taken from the ``Tools.td`` file)::

  def llvm_gcc_cpp : Tool<[
      (in_language "c++"),
@ -57,19 +154,21 @@ aren't handled by the other tools.
 The complete list of the currently implemented tool properties follows:

 * Possible tool properties:
-  - in_language - input language name.

-  - out_language - output language name.
+  - ``in_language`` - input language name.

-  - output_suffix - output file suffix.
+  - ``out_language`` - output language name.

-  - cmd_line - the actual command used to run the tool. You can use
-    ``$INFILE`` and ``$OUTFILE`` variables.
+  - ``output_suffix`` - output file suffix.

-  - join - this tool is a "join node" in the graph, i.e. it gets a
+  - ``cmd_line`` - the actual command used to run the tool. You can use
+    ``$INFILE`` and ``$OUTFILE`` variables, as well as output
+    redirection with ``>``.
+
+  - ``join`` - this tool is a "join node" in the graph, i.e. it gets a
    list of input files and joins them together. Used for linkers.

-  - sink - all command-line options that are not handled by other
+  - ``sink`` - all command-line options that are not handled by other
    tools are passed to this tool.

 The next tool definition is slightly more complex::
@ -93,43 +192,48 @@ attributes: a name and a (possibly empty) list of properties. All
 currently implemented option types and properties are described below:

 * Possible option types:
-   - switch_option - a simple boolean switch, for example ``-time``.

-   - parameter_option - option that takes an argument, for example ``-std=c99``;
+   - ``switch_option`` - a simple boolean switch, for example ``-time``.

-   - parameter_list_option - same as the above, but more than one
+   - ``parameter_option`` - option that takes an argument, for example
+     ``-std=c99``;
+
+   - ``parameter_list_option`` - same as the above, but more than one
     occurence of the option is allowed.

-   - prefix_option - same as the parameter_option, but the option name
+   - ``prefix_option`` - same as the parameter_option, but the option name
     and parameter value are not separated.

-   - prefix_list_option - same as the above, but more than one
+   - ``prefix_list_option`` - same as the above, but more than one
     occurence of the option is allowed; example: ``-lm -lpthread``.

+
 * Possible option properties:
-   - append_cmd - append a string to the tool invocation command.

-   - forward - forward this option unchanged.
+   - ``append_cmd`` - append a string to the tool invocation command.

-   - stop_compilation - stop compilation after this phase.
+   - ``forward`` - forward this option unchanged.

-   - unpack_values - used for for splitting and forwarding
+   - ``stop_compilation`` - stop compilation after this phase.
+
+   - ``unpack_values`` - used for for splitting and forwarding
     comma-separated lists of options, e.g. ``-Wa,-foo=bar,-baz`` is
     converted to ``-foo=bar -baz`` and appended to the tool invocation
     command.

-   - help - help string associated with this option.
+   - ``help`` - help string associated with this option.
+
+   - ``required`` - this option is obligatory.

-   - required - this option is obligatory.

 Language map
 ------------

-One last bit that you probably should change is the language map,
-which defines mappings between language names and file extensions. It
-is used internally to choose the proper toolchain based on the names
-of the input files. Language map definition is located in the file
-``Tools.td`` and looks like this::
+One last thing that you need to modify when adding support for a new
+language to LLVMC is the language map, which defines mappings from
+file extensions to language names. It is used to choose the proper
+toolchain based on the input. Language map definition is located in
+the file ``Tools.td`` and looks like this::

    def LanguageMap : LanguageMap<
        [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
@ -138,14 +242,6 @@ of the input files. Language map definition is located in the file
        ]>;


-Putting it all together
-----------------------
-
-Since at the time of writing LLVMCC does not support on-the-fly
-reloading of the configuration, the only way to test your changes is
-to recompile the program. To do this, ``cd`` to the source code
-directory and run ``make``.
-
 References
 ==========