From aaa3da966583bd64ea241369385ebeff8a801c1e Mon Sep 17 00:00:00 2001
From: Reid Spencer
At a high level, llvmc operation is very simple. The basic action
taken by llvmc is to simply invoke some tool or set of tools to fill
the user's request for compilation. Every execution of llvmc takes the
- following sequence of steps:
+ following sequence of steps:
llvmc's operation must be simple, regular and predictable. Developers need to be able to rely on it to take a consistent approach to compilation. For example, the invocation:
- llvmc -O2 x.c y.c z.c -o xyz
+ llvmc -O2 x.c y.c z.c -o xyz
must produce exactly the same results as:
- llvmc -O2 x.c
- llvmc -O2 y.c
- llvmc -O2 z.c
- llvmc -O2 x.o y.o z.o -o xyz
+ llvmc -O2 x.c
+ llvmc -O2 y.c
+ llvmc -O2 z.c
+ llvmc -O2 x.o y.o z.o -o xyz
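The equivalence between the single invocation and the per-file sequence can be sketched in a few lines of Python. This is only an illustration of the rule stated above, not llvmc's implementation: the helper name `split_invocation` is hypothetical, and the `.o` suffix for intermediate files is taken from the example.

```python
from pathlib import PurePath


def split_invocation(options, sources, output):
    """Expand one multi-file llvmc command into per-file commands plus a final link.

    Hypothetical helper: it only builds the command lines shown in the text,
    it does not run anything.
    """
    # One compile command per source file, each carrying the same options.
    commands = [["llvmc", *options, src] for src in sources]
    # The final step consumes the per-file outputs (assumed to end in .o).
    objects = [str(PurePath(src).with_suffix(".o")) for src in sources]
    commands.append(["llvmc", *options, *objects, "-o", output])
    return commands


for cmd in split_invocation(["-O2"], ["x.c", "y.c", "z.c"], "xyz"):
    print(" ".join(cmd))
```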
To accomplish this, llvmc uses a very simple goal oriented procedure to do its work. The overall goal is to produce a functioning executable. To accomplish this, llvmc always attempts to execute a
@@ -254,10 +254,11 @@
An action, with regard to llvmc, is a basic operation that it takes in order to fulfill the user's request. Each phase of compilation will invoke zero or more actions in order to accomplish that phase.
-Actions come in two forms:
Actions come in two forms:
+This approach means that new compiler front ends can be up and working very quickly. As a first cut, a front end can simply compile its source to raw
@@ -336,15 +337,12 @@ optimization.
-There are two types of configuration files: the master configuration file
- and the language specific configuration file. The master configuration file
- contains the general configuration of llvmc itself and is supplied
- with the tool. It contains information that is source language agnostic.
- Language specific configuration files tell llvmc how to invoke the
- language's compiler for a variety of different tasks and what other tools
- are needed to backfill the compiler's missing features (e.g.
- optimization).
+Each configuration file provides the details for a single source language
+ that is to be compiled. This configuration information tells llvmc
+ how to invoke the language's pre-processor, translator, optimizer, assembler
+ and linker. Note that a given source language needn't provide all these tools
+ as many of them exist in llvm currently.
llvmc always looks for files of a specific name. It uses the
@@ -365,77 +363,192 @@ optimization.
The first file found in this search will be used. Other files with the
+ same name will be ignored even if they exist in one of the subsequent search locations.
-In the directories searched, a file named master will be
- recognized as the master configuration file for llvmc. Note that
- users may override the master file with a copy in their home directory
- but they are advised not to. This capability is only useful for compiler
- implementers needing to alter the master configuration while developing
- their compiler front end. When reading the configuration files, the master
- files are always read first.
-Language specific configuration files are given specific names to foster
- faster lookup. The name of a given language specific configuration file is
- the same as the suffix used to identify files containing source in that
- language. For example, a configuration file for C++ source might be named
- cpp, C, or cxx.
+In the directories searched, each configuration file is given a specific
+ name to foster faster lookup (so llvmc doesn't have to do directory searches).
+ The name of a given language specific configuration file is simply the same
+ as the suffix used to identify files containing source in that language.
+ For example, a configuration file for C++ source might be named
+ cpp, C, or cxx. For languages that support multiple
+ file suffixes, multiple (probably identical) files (or symbolic links) will
+ need to be provided.
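The lookup rule described here (the configuration file is named after the source suffix, and the first match in the search order wins) might look roughly like the sketch below. The function names and the idea of passing the search directories as a list are assumptions made for illustration; the text does not say how llvmc stores its search path.

```python
import os


def language_suffix(source_name):
    """Return the source file's suffix without the dot, e.g. 'cpp' for 'hello.cpp'."""
    return os.path.splitext(source_name)[1].lstrip(".")


def find_config(suffix, search_dirs):
    """Return the first file named `suffix` found in `search_dirs`, or None.

    Hypothetical helper: later directories are never consulted once a match
    is found, so same-named files further down the list are ignored.
    """
    for directory in search_dirs:
        candidate = os.path.join(directory, suffix)
        if os.path.isfile(candidate):
            return candidate
    return None


print(language_suffix("hello.cpp"))
```

Because the name is derived directly from the suffix, the lookup is one existence check per directory rather than a directory scan, which is the "faster lookup" the text refers to. It also shows why a language with several suffixes needs one file (or symbolic link) per suffix.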
-The master configuration file is always read. Which language specific
- configuration files are read depends on the command line options and the
- suffixes of the file names provided on llvmc's command line. Note
+Which configuration files are read depends on the command line options and
+ the suffixes of the file names provided on llvmc's command line. Note
that the --x LANGUAGE option alters the language that llvmc
- uses for the subsequent files on the command line. Only the language
- specific configuration files actually needed to complete llvmc's
- task are read. Other language specific files will be ignored.
+ uses for the subsequent files on the command line. Only the configuration
+ files actually needed to complete llvmc's task are read. Other
+ language specific files will be ignored.
-The syntax of the configuration files is yet to be determined. There are
- two viable options remaining:
+
+The syntax of the configuration files is very simple and somewhat
+ compatible with Java's property files. Here are the syntax rules:
-The following description of configuration items is syntax-less and simply
- uses a naming hierarchy to describe the configuration items. Whatever
- syntax is chosen will need to map the hierarchy to the given syntax.
+The table below provides definitions of the allowed configuration items
+ that may appear in a configuration file. Every item has a default value and
+ does not need to appear in the configuration file. Missing items will have the
+ default value. Each identifier may appear as all lower case, first letter
+ capitalized or all upper case.
Name | Value Type | Description | Default
---|---|---|---
LANG ITEMS | | |
+lang.name | string | Provides the common name for a language definition. For example "C++", "Pascal", "FORTRAN", etc. | blank
+lang.opt1 | string | Specifies the parameters to give the optimizer when -O1 is specified on the llvmc command line. | -simplifycfg -instcombine -mem2reg
+lang.opt2 | string | Specifies the parameters to give the optimizer when -O2 is specified on the llvmc command line. | TBD
+lang.opt3 | string | Specifies the parameters to give the optimizer when -O3 is specified on the llvmc command line. | TBD
+lang.opt4 | string | Specifies the parameters to give the optimizer when -O4 is specified on the llvmc command line. | TBD
+lang.opt5 | string | Specifies the parameters to give the optimizer when -O5 is specified on the llvmc command line. | TBD
PREPROCESSOR ITEMS | | |
+preprocessor.command | command | This provides the command prototype that will be used to run the preprocessor. Valid substitutions are @in@ for the input file and @out@ for the output file. This is generally only used with the -E option. | <blank>
-Capabilities.hasPreProcessor | boolean | This item specifies whether the language has a pre-processing phase or not. This controls whether the B<-E> option works for the language or not. |
+preprocessor.required | boolean | This item specifies whether the pre-processing phase is required by the language. If the value is true, then the preprocessor.command value must not be blank. With this option, llvmc will always run the preprocessor as it assumes that the translation and optimization phases don't know how to pre-process their input. | false
TRANSLATOR ITEMS | | |
+translator.command | command | This provides the command prototype that will be used to run the translator. Valid substitutions are @in@ for the input file and @out@ for the output file. | <blank>
-Capabilities.outputFormat | "bc" or "ll" | This item specifies the kind of output the language's compiler generates. The choices are either bytecode (bc) or LLVM assembly (ll). |
+translator.output | native, bytecode or assembly | This item specifies the kind of output the language's translator generates. | bytecode
-Capabilities.understandsOptimization | boolean | Indicates whether the compiler for this language understands the -O options or not |
+translator.preprocesses | boolean | Indicates that the translator also preprocesses. If this is true, then llvmc will skip the pre-processing phase whenever the final phase is not pre-processing. | false
+translator.optimizers | boolean | Indicates that the translator also optimizes. If this is true, then llvmc will skip the optimization phase whenever the final phase is optimization or later. | false
+translator.groks_dash_o | boolean | Indicates that the translator understands the intent of the various -On options to llvmc. This will cause the -On option to be passed to the translator instead of the equivalent options provided by lang.optn. | false
OPTIMIZER ITEMS | | |
ASSEMBLER ITEMS | | |
LINKER ITEMS | | |
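To make the table more concrete, the following Python sketch reads a property-style configuration, fills in defaults for missing items, expands the @in@ and @out@ substitutions, and picks the optimization options for a given -On level. It rests on several assumptions that the patch does not confirm: that items are written as `name=value` lines with `#` comments (as in Java property files), that lower-casing the identifier is an acceptable way to honour the three permitted capitalizations, and that a front end called `mycc` exists. The function names are hypothetical, and only defaults listed in the table are included.

```python
# Defaults copied from the table; items whose default is TBD are left out.
DEFAULTS = {
    "lang.name": "",
    "lang.opt1": "-simplifycfg -instcombine -mem2reg",
    "preprocessor.command": "",
    "preprocessor.required": "false",
    "translator.command": "",
    "translator.output": "bytecode",
    "translator.preprocesses": "false",
    "translator.groks_dash_o": "false",
}


def parse_config(text):
    """Parse name=value lines (assumed syntax); missing items keep their defaults."""
    items = dict(DEFAULTS)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, value = line.partition("=")
        # Identifiers may be lower case, capitalized or upper case.
        items[name.strip().lower()] = value.strip()
    return items


def expand_command(prototype, infile, outfile):
    """Replace the @in@ and @out@ placeholders in a command prototype."""
    return prototype.replace("@in@", infile).replace("@out@", outfile)


def optimization_options(items, level):
    """Return -On itself if the translator groks it, else the lang.optN options."""
    if items["translator.groks_dash_o"] == "true":
        return f"-O{level}"
    return items.get(f"lang.opt{level}", "")


# 'mycc' and its flags are made up for this example.
sample = """
# configuration for a hypothetical front end
Lang.Name=Example
TRANSLATOR.COMMAND=mycc @in@ -o @out@
"""
config = parse_config(sample)
print(expand_command(config["translator.command"], "x.c", "x.bc"))
print(optimization_options(config, 1))
```

Keeping the defaults in one dictionary mirrors the rule that every item has a default and may be omitted from the file.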