diff --git a/docs/CompilerDriver.html b/docs/CompilerDriver.html new file mode 100644 index 00000000000..a5ba1a68542 --- /dev/null +++ b/docs/CompilerDriver.html @@ -0,0 +1,572 @@ + + +
+ +NOTE: This document is a work in progress!
+This document describes the requirements, design, and configuration of the + LLVM compiler driver, llvmc. The compiler driver knows about LLVM's + tool set and can be configured to know about a variety of compilers for + source languages. It uses this knowledge to execute the tools necessary + to accomplish general compilation, optimization, and linking tasks. The main + purpose of llvmc is to provide a simple and consistent interface to + all compilation tasks. This reduces the burden on the end user who can just + learn to use llvmc instead of the entire LLVM tool set and all the + source language compilers compatible with LLVM.
+The llvmc tool is a configurable compiler + driver. As such, it isn't the compiler, optimizer, + or linker itself, but it drives (invokes) other software that performs those + tasks. If you are familiar with the GNU Compiler Collection's gcc + tool, llvmc is very similar.
+The following introductory sections will help you understand why this tool + is necessary and what it does.
+llvmc was invented to make compilation with LLVM based compilers + easier. To accomplish this, llvmc strives to:
+Additionally, llvmc makes it easier to write a compiler for use + with LLVM, because it:
+At a high level, llvmc operation is very simple. The basic action
+ taken by llvmc is to simply invoke some tool or set of tools to fill
+ the user's request for compilation. Every execution of llvmc takes the
+ following sequence of steps:
+
llvmc's operation must be simple, regular and predictable. + Developers need to be able to rely on it to take a consistent approach to + compilation. For example, the invocation:
+
+  llvmc -O2 x.c y.c z.c -o xyz
+
must produce exactly the same results as:
+
+  llvmc -O2 x.c
+  llvmc -O2 y.c
+  llvmc -O2 z.c
+  llvmc -O2 x.o y.o z.o -o xyz
+
To accomplish this, llvmc uses a very simple goal-oriented + procedure to do its work. The overall goal is to produce a functioning + executable. To that end, llvmc always attempts to execute a + series of compilation phases in the same sequence. + However, the user's options to llvmc can cause the sequence of phases + to start in the middle or finish early.
+llvmc breaks every compilation task into the following five + distinct phases:
+The following table shows the inputs, outputs, and command line options + applicable to each phase.
Phase | Inputs | Outputs | Options |
---|---|---|---|
Preprocessing | | | |
Translation | | | |
Optimization | | | |
Linking | | | |
+
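The idea that the phase sequence is fixed, but may start in the middle or finish early, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the phase names are the four that appear in the table above (the text speaks of five phases, but only these are named there), and `first` and `last` are hypothetical parameters standing in for whatever command line options select the starting and final phase.

```python
# Phase names as listed in the table above, in their fixed order.
# Assumption: only the four phases named in the table are modeled here.
PHASES = ["Preprocessing", "Translation", "Optimization", "Linking"]


def phases_to_run(first="Preprocessing", last="Linking"):
    """Return the contiguous run of phases from `first` through `last`.

    `first` and `last` are hypothetical parameters; they stand in for the
    effect of user options that start in the middle or finish early.
    """
    start = PHASES.index(first)
    stop = PHASES.index(last)
    if start > stop:
        raise ValueError(f"{first} comes after {last}")
    return PHASES[start:stop + 1]


if __name__ == "__main__":
    print(phases_to_run())                      # the full sequence
    print(phases_to_run(last="Translation"))    # finish early
    print(phases_to_run(first="Optimization"))  # start in the middle
```

Because the order never changes, a request is fully described by its two end points, which is what keeps the behavior regular and predictable.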
An action, with regard to llvmc, is a basic operation that it takes + in order to fulfill the user's request. Each phase of compilation will invoke + zero or more actions in order to accomplish that phase.
+Actions come in two forms:
This section of the document describes the configuration files used by
+ llvmc. Configuration information is relatively static for a
+ given release of LLVM and a front end compiler. However, the details may
+ change from release to release of either. Users are encouraged to simply use
+ the various options of the B
llvmc is highly configurable both on the command line and in +configuration files. The options it understands are generic, consistent and +simple by design. Furthermore, the llvmc options apply to the +compilation of any LLVM enabled programming language. For a compiler to be enabled as a +supported source language compiler, its writer must provide a +configuration file that tells llvmc how to invoke the compiler +and what its capabilities are. The purpose of the configuration files, then, +is to allow compiler writers to specify to llvmc how the compiler +should be invoked. Users may alter the compiler's +llvmc configuration, but are advised not to.
+ +Because llvmc just invokes other programs, it must deal with the +available command line options for those programs regardless of whether they +were written for LLVM or not. Furthermore, not all compilation front ends will +have the same capabilities. Some front ends will simply generate LLVM assembly +code; others will be able to generate fully optimized bytecode. In general, +llvmc doesn't make any assumptions about the capabilities or command +line options of a sub-tool. It simply uses the details found in the configuration +files and leaves it to the compiler writer to specify the configuration +correctly.
+ +This approach means that new compiler front ends can be up and working very +quickly. As a first cut, a front end can simply compile its source to raw +(unoptimized) bytecode or LLVM assembly and llvmc can be configured +to pick up the slack (translate LLVM assembly to bytecode, optimize the +bytecode, generate native assembly, link, etc.). In fact, the front end need +not use any LLVM libraries, and it could be written in any language (instead of +C++). The configuration data will allow the full range of optimization, +assembly, and linking capabilities that LLVM provides to be added to these kinds +of tools. Enabling the rapid development of front-ends is one of the primary +goals of llvmc.
+ +As a compiler front end matures, it may utilize the LLVM libraries and tools +to more efficiently produce optimized bytecode directly in a single compilation +and optimization program. In these cases, multiple tools would not be needed +and the configuration data for the compiler would change.
+ +Configuring llvmc to the needs and capabilities of a source language +compiler is relatively straightforward. A compiler writer must provide a +definition of what to do for each of the five compilation phases for each of +the optimization levels. The specification consists simply of prototypical +command lines into which llvmc can substitute command line +arguments and file names. Note that any given phase can be completely blank if +the source language's compiler combines multiple phases into a single program. +For example, quite often pre-processing, translation, and optimization are +combined into a single program. The specification for such a compiler would have +blank entries for pre-processing and translation but a full command line for +optimization.
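The notion of prototypical command lines with substitution can be illustrated with a small Python sketch. Everything concrete in it is an assumption made for illustration: the tool names `mycompiler` and `mylinker`, and the `%in%`, `%out%`, and `%args%` placeholders, are invented, since the document does not define the real syntax. The sketch models a compiler that combines pre-processing, translation, and optimization in one program, so the first two entries are blank.

```python
import shlex

# Hypothetical per-phase command templates for one optimization level.
# Tool names and placeholder spellings are invented for this sketch.
TEMPLATES = {
    "Preprocessing": "",  # blank: handled by the optimization command
    "Translation": "",    # blank: handled by the optimization command
    "Optimization": "mycompiler %args% %in% -o %out%",
    "Linking": "mylinker %in% -o %out%",
}


def build_command(phase, infile, outfile, args=()):
    """Fill in a phase's template, or return None if the phase is blank."""
    template = TEMPLATES[phase]
    if not template.strip():
        return None  # nothing to run for this phase
    command = []
    for token in shlex.split(template):
        if token == "%in%":
            command.append(infile)
        elif token == "%out%":
            command.append(outfile)
        elif token == "%args%":
            command.extend(args)  # may expand to zero or more arguments
        else:
            command.append(token)
    return command


if __name__ == "__main__":
    print(build_command("Translation", "x.c", "x.bc"))
    print(build_command("Optimization", "x.c", "x.bc", ["-O2"]))
```

Returning an argument list rather than a single string avoids quoting problems when file names contain spaces.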
+There are two types of configuration files: the master configuration file + and the language specific configuration file. The master configuration file + contains the general configuration of llvmc itself and is supplied + with the tool. It contains information that is source language agnostic. + Language specific configuration files tell llvmc how to invoke the + language's compiler for a variety of different tasks and what other tools + are needed to backfill the compiler's missing features (e.g. + optimization).
+ +llvmc always looks for files of a specific name. It uses the
+ first file with the name it's looking for by searching directories in the
+ following order:
+
In the directories searched, a file named master will be + recognized as the master configuration file for llvmc. Note that + users may override the master file with a copy in their home directory + but they are advised not to. This capability is only useful for compiler + implementers needing to alter the master configuration while developing + their compiler front end. When reading the configuration files, the master + files are always read first.
+Language specific configuration files are given specific names to foster + faster lookup. The name of a given language specific configuration file is + the same as the suffix used to identify files containing source in that + language. For example, a configuration file for C++ source might be named + cpp, C, or cxx.
+ +The master configuration file is always read. Which language specific + configuration files are read depends on the command line options and the + suffixes of the file names provided on llvmc's command line. Note + that the --x LANGUAGE option alters the language that llvmc + uses for the subsequent files on the command line. Only the language + specific configuration files actually needed to complete llvmc's + task are read. Other language specific files will be ignored.
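The lookup rules described above (first match in an ordered list of directories, a master file that is always read, and language files named after the source suffix) can be sketched as follows. The function names are hypothetical, and the list of directories is left as a parameter because the actual search order is not reproduced here.

```python
from pathlib import Path


def config_name_for(source_file):
    """Name of the language config file: the source suffix without the dot."""
    suffix = Path(source_file).suffix
    return suffix[1:] if suffix else None


def find_config(name, search_dirs):
    """Return the first existing file called `name` in `search_dirs`, or None."""
    for directory in search_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None


def configs_to_read(source_files, search_dirs):
    """Master file first, then only the language files actually needed."""
    names = ["master"]
    for source in source_files:
        name = config_name_for(source)
        if name and name not in names:
            names.append(name)
    found = [find_config(name, search_dirs) for name in names]
    return [path for path in found if path is not None]
```

A call such as `configs_to_read(["a.cpp", "b.cpp"], dirs)` would look up `master` and `cpp` once each, which mirrors the statement that other language specific files are ignored.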
+The syntax of the configuration files is yet to be determined. There are
+ two viable options remaining:
+
+
+=head3 Section: [lang=I<LANGUAGE>]
+
+This section provides the master configuration data for a given language. The
+language specific data will be found in a file named I<LANGUAGE>.
+
+=over
+
+=item C<suffix=>I<suffix>
+
+This adds the I<suffix> specified to the list of recognized suffixes for
+the I<LANGUAGE> identified in the section. As many suffixes as are commonly used
+for source files for the I<LANGUAGE> should be specified.
+
+=back
+
+=begin html
+
+ For example, the following might appear for C++:
+
+[lang=C++]
+suffix=.cpp
+suffix=.cxx
+suffix=.C
+
+=end html
+
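Since the configuration syntax is described as not yet determined, the following Python sketch should be read only as one possible way to read the `[lang=C++]` example above, not as the format llvmc uses. It assumes bracketed section headers and `key=value` lines, and it collects repeated keys such as `suffix` into a list; `parse_sections` is a hypothetical helper.

```python
def parse_sections(text):
    """Parse '[section]' headers and 'key=value' lines.

    Repeated keys are kept in order, giving {section: {key: [values...]}}.
    """
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1], {})
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            current.setdefault(key.strip(), []).append(value.strip())
        else:
            raise ValueError(f"unrecognized line: {raw!r}")
    return sections


EXAMPLE = """\
[lang=C++]
suffix=.cpp
suffix=.cxx
suffix=.C
"""

if __name__ == "__main__":
    print(parse_sections(EXAMPLE))
```

A hand-written parser is used here because Python's standard `configparser` rejects duplicate keys within a section by default, whereas the example relies on `suffix` appearing several times.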
+=head3 Section: [general] + +=over + +=item C++ +This item specifies whether the language has a pre-processing phase or not. This +controls whether the B<-E> option works for the language or not. + +=item C
This document uses precise terms in reference to the various artifacts and + concepts related to compilation. The terms used throughout this document are + defined below.
+