update to document new lto API

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47764 91177308-0d34-0410-b5e6-96231b3b80d8
Nick Kledzik
2008-02-29 19:34:52 +00:00
parent 02e4e173bf
commit 84e5f77880
+88 -125
@@ -21,23 +21,18 @@
<ul>
<li><a href="#phase1">Phase 1 : Read LLVM Bytecode Files</a></li>
<li><a href="#phase2">Phase 2 : Symbol Resolution</a></li>
<li><a href="#phase3">Phase 3 : Optimize Bytecode Files</a></li>
<li><a href="#phase3">Phase 3 : Optimize Bitcode Files</a></li>
<li><a href="#phase4">Phase 4 : Symbol Resolution after optimization</a></li>
</ul></li>
<li><a href="#lto">LLVMlto</a>
<li><a href="#lto">libLTO</a>
<ul>
<li><a href="#llvmsymbol">LLVMSymbol</a></li>
<li><a href="#readllvmobjectfile">readLLVMObjectFile()</a></li>
<li><a href="#optimizemodules">optimizeModules()</a></li>
<li><a href="#gettargettriple">getTargetTriple()</a></li>
<li><a href="#removemodule">removeModule()</a></li>
<li><a href="#getalignment">getAlignment()</a></li>
</ul></li>
<li><a href="#debug">Debugging Information</a></li>
<li><a href="#lto_module_t">lto_module_t</a></li>
<li><a href="#lto_code_gen_t">lto_code_gen_t</a></li>
</ul>
</ul>
<div class="doc_author">
<p>Written by Devang Patel</p>
<p>Written by Devang Patel and Nick Kledzik</p>
</div>
<!-- *********************************************************************** -->
@@ -49,9 +44,9 @@
<div class="doc_text">
<p>
LLVM features powerful intermodular optimizations which can be used at link
time. Link Time Optimization is another name for intermodular optimization
time. Link Time Optimization (LTO) is another name for intermodular optimization
when performed during the link stage. This document describes the interface
and design between the LLVM intermodular optimizer and the linker.</p>
and design between the LTO optimizer and the linker.</p>
</div>
<!-- *********************************************************************** -->
@@ -68,8 +63,8 @@ the developer take advantage of intermodular optimizations without making any
significant changes to the developer's makefiles or build system. This is
achieved through tight integration with the linker. In this model, the linker
treats LLVM bitcode files like native object files and allows mixing and
matching among them. The linker uses <a href="#lto">LLVMlto</a>, a dynamically
loaded library, to handle LLVM bitcode files. This tight integration between
matching among them. The linker uses <a href="#lto">libLTO</a>, a shared
object, to handle LLVM bitcode files. This tight integration between
the linker and LLVM optimizer helps to do optimizations that are not possible
in other models. The linker input allows the optimizer to avoid relying on
conservative escape analysis.
@@ -136,9 +131,8 @@ $ llvm-gcc -c main.c -o main.o # &lt;-- main.o is native object file
$ llvm-gcc a.o main.o -o main # &lt;-- standard link command without any modifications
</pre></div>
<p>In this example, the linker recognizes that <tt>foo2()</tt> is an
externally visible symbol defined in LLVM bitcode file. This information
is collected using <a href="#readllvmobjectfile"> readLLVMObjectFile()</a>.
Based on this information, the linker completes its usual symbol resolution
externally visible symbol defined in an LLVM bitcode file. The linker completes
its usual symbol resolution
pass and finds that <tt>foo2()</tt> is not used anywhere. This information
is used by the LLVM optimizer and it removes <tt>foo2()</tt>. As soon as
<tt>foo2()</tt> is removed, the optimizer recognizes that condition
@@ -183,7 +177,7 @@ $ llvm-gcc a.o main.o -o main # &lt;-- standard link command without any modific
<!-- *********************************************************************** -->
<div class="doc_section">
<a name="multiphase">Multi-phase communication between LLVM and linker</a>
<a name="multiphase">Multi-phase communication between libLTO and linker</a>
</div>
<div class="doc_text">
@@ -208,14 +202,19 @@ $ llvm-gcc a.o main.o -o main # &lt;-- standard link command without any modific
<div class="doc_text">
<p>The linker first reads all object files in natural order and collects
symbol information. This includes native object files as well as LLVM bitcode
files. In this phase, the linker uses
<a href="#readllvmobjectfile"> readLLVMObjectFile() </a> to collect symbol
information from each LLVM bitcode files and updates its internal global
symbol table accordingly. The intent of this interface is to avoid overhead
in the non LLVM case, where all input object files are native object files,
by putting this code in the error path of the linker. When the linker sees
the first llvm .o file, it <tt>dlopen()</tt>s the dynamic library. This is
to allow changes to the LLVM LTO code without relinking the linker.
files. To minimize the cost to the linker in the case that all .o files
are native object files, the linker only calls <tt>lto_module_create()</tt>
when a supplied object file is found to not be a native object file. If
<tt>lto_module_create()</tt> returns that the file is an LLVM bitcode file,
the linker
then iterates over the module using <tt>lto_module_get_symbol_name()</tt> and
<tt>lto_module_get_symbol_attribute()</tt> to get all symbols defined and
referenced.
This information is added to the linker's global symbol table.
</p>
<p>The lto* functions are all implemented in a shared object libLTO. This
allows the LLVM LTO code to be updated independently of the linker tool.
On platforms that support it, the shared object is lazily loaded.
</p>
</div>
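<div class="doc_text">
<p>The Phase 1 walk over a module's symbols can be sketched as follows. This is
a minimal illustration, not the libLTO implementation: <tt>toy_symbol</tt>,
<tt>toy_module</tt>, <tt>toy_symbol_table</tt> and the <tt>toy_*</tt> functions
are hypothetical stand-ins invented for this sketch. A real linker would obtain
the count, names and attributes from the <tt>lto_module_*</tt> functions
instead of reading an array.</p>
</div>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only; NOT the libLTO types. */
typedef struct {
    const char *name;
    int is_defined;              /* 1 = defined in the module, 0 = only referenced */
} toy_symbol;

typedef struct {
    const toy_symbol *symbols;
    unsigned int num_symbols;
} toy_module;

#define TOY_MAX_SYMBOLS 64

/* A very small "global symbol table" as a linker might keep it. */
typedef struct {
    const char *names[TOY_MAX_SYMBOLS];
    int defined[TOY_MAX_SYMBOLS];
    unsigned int count;
} toy_symbol_table;

/* Find a symbol by name, or append it as undefined.
   Returns its index, or -1 if the table is full. */
static int toy_table_intern(toy_symbol_table *t, const char *name) {
    for (unsigned int i = 0; i < t->count; ++i)
        if (strcmp(t->names[i], name) == 0)
            return (int)i;
    if (t->count == TOY_MAX_SYMBOLS)
        return -1;
    t->names[t->count] = name;
    t->defined[t->count] = 0;
    return (int)t->count++;
}

/* Walk the module by index and record every symbol it defines or references.
   Returns 0 on success, -1 if the table overflowed. */
static int toy_collect_symbols(toy_symbol_table *t, const toy_module *m) {
    for (unsigned int i = 0; i < m->num_symbols; ++i) {
        int slot = toy_table_intern(t, m->symbols[i].name);
        if (slot < 0)
            return -1;
        if (m->symbols[i].is_defined)
            t->defined[slot] = 1;
    }
    return 0;
}
```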
@@ -225,12 +224,10 @@ $ llvm-gcc a.o main.o -o main # &lt;-- standard link command without any modific
</div>
<div class="doc_text">
<p>In this stage, the linker resolves symbols using global symbol table
information to report undefined symbol errors, read archive members, resolve
weak symbols, etc. The linker is able to do this seamlessly even though it
does not know the exact content of input LLVM bitcode files because it uses
symbol information provided by
<a href="#readllvmobjectfile">readLLVMObjectFile()</a>. If dead code
<p>In this stage, the linker resolves symbols using the global symbol table.
It may report undefined symbol errors, read archive members, replace
weak symbols, etc. The linker is able to do this seamlessly even though it
does not know the exact content of input LLVM bitcode files. If dead code
stripping is enabled then the linker collects the list of live symbols.
</p>
</div>
@@ -240,14 +237,13 @@ $ llvm-gcc a.o main.o -o main # &lt;-- standard link command without any modific
<a name="phase3">Phase 3 : Optimize Bitcode Files</a>
</div>
<div class="doc_text">
<p>After symbol resolution, the linker updates symbol information supplied
by LLVM bitcode files appropriately. For example, whether certain LLVM
bitcode supplied symbols are used or not. In the example above, the linker
reports that <tt>foo2()</tt> is not used anywhere in the program, including
native <tt>.o</tt> files. This information is used by the LLVM interprocedural
optimizer. The linker uses <a href="#optimizemodules">optimizeModules()</a>
and requests an optimized native object file of the LLVM portion of the
program.
<p>After symbol resolution, the linker tells the LTO shared object which
symbols are needed by native object files. In the example above, the linker
reports that only <tt>foo1()</tt> is used by native object files using
<tt>lto_codegen_add_must_preserve_symbol()</tt>. Next the linker invokes
the LLVM optimizer and code generators using <tt>lto_codegen_compile()</tt>
which returns a native object file created by merging the LLVM bitcode files
and applying various optimization passes.
</p>
</div>
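<div class="doc_text">
<p>How a linker might pick the symbols to report before code generation can be
sketched as follows. The record type <tt>toy_resolved_symbol</tt> and the
function <tt>toy_must_preserve()</tt> are hypothetical names made up for this
sketch, and the selection rule (defined in bitcode and referenced by a native
object file) is a simplification. In a real linker each selected name would be
handed to <tt>lto_codegen_add_must_preserve_symbol()</tt>.</p>
</div>

```c
#include <assert.h>
#include <string.h>

/* Hypothetical linker-side record for one resolved symbol (illustration only). */
typedef struct {
    const char *name;
    int defined_in_bitcode;      /* definition comes from an LLVM bitcode file */
    int referenced_by_native;    /* at least one native object file uses it */
} toy_resolved_symbol;

/* Copy into `out` the names of bitcode-defined symbols that native object
   files still need. At most `out_cap` names are written; returns how many. */
static unsigned int toy_must_preserve(const toy_resolved_symbol *syms,
                                      unsigned int n,
                                      const char **out,
                                      unsigned int out_cap) {
    unsigned int kept = 0;
    for (unsigned int i = 0; i < n; ++i) {
        if (syms[i].defined_in_bitcode &&
            syms[i].referenced_by_native &&
            kept < out_cap) {
            out[kept++] = syms[i].name;
        }
    }
    return kept;
}
```

<div class="doc_text">
<p>With <tt>foo1()</tt> referenced by a native object file and <tt>foo2()</tt>
referenced by none, only <tt>foo1</tt> ends up in the list, which leaves the
optimizer free to remove <tt>foo2()</tt>.</p>
</div>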
@@ -270,108 +266,75 @@ $ llvm-gcc a.o main.o -o main # &lt;-- standard link command without any modific
<!-- *********************************************************************** -->
<div class="doc_section">
<a name="lto">LLVMlto</a>
<a name="lto">libLTO</a>
</div>
<div class="doc_text">
<p><tt>LLVMlto</tt> is a dynamic library that is part of the LLVM tools, and
is intended for use by a linker. <tt>LLVMlto</tt> provides an abstract C++
<p><tt>libLTO</tt> is a shared object that is part of the LLVM tools, and
is intended for use by a linker. <tt>libLTO</tt> provides an abstract C
interface to use the LLVM interprocedural optimizer without exposing details
of LLVM's internals. The intention is to keep the interface as stable as
possible even when the LLVM optimizer continues to evolve.</p>
possible even when the LLVM optimizer continues to evolve. It should even
be possible for a completely different compilation technology to provide
a different libLTO that works with their object files and the standard
linker tool.</p>
</div>
<!-- ======================================================================= -->
<div class="doc_subsection">
<a name="llvmsymbol">LLVMSymbol</a>
<a name="lto_module_t">lto_module_t</a>
</div>
<div class="doc_text">
<p>The <tt>LLVMSymbol</tt> class is used to describe the externally visible
functions and global variables, defined in LLVM bitcode files, to the linker.
This includes symbol visibility information. This information is used by
the linker to do symbol resolution. For example: function <tt>foo2()</tt> is
defined inside an LLVM bitcode module and it is an externally visible symbol.
This helps the linker connect the use of <tt>foo2()</tt> in native object
files with a future definition of the symbol <tt>foo2()</tt>. The linker
will see the actual definition of <tt>foo2()</tt> when it receives the
optimized native object file in
<a href="#phase4">Symbol Resolution after optimization</a> phase. If the
linker does not find any uses of <tt>foo2()</tt>, it updates LLVMSymbol
visibility information to notify LLVM intermodular optimizer that it is dead.
The LLVM intermodular optimizer takes advantage of such information to
generate better code.</p>
<p>A non-native object file is handled via an <tt>lto_module_t</tt>.
The following functions allow the linker to check if a file (on disk
or in a memory buffer) is a file which libLTO can process: <pre>
lto_module_is_object_file(const char*)
lto_module_is_object_file_for_target(const char*, const char*)
lto_module_is_object_file_in_memory(const void*, size_t)
lto_module_is_object_file_in_memory_for_target(const void*, size_t, const char*)</pre>
If the object file can be processed by libLTO, the linker creates an
<tt>lto_module_t</tt> by using one of <pre>
lto_module_create(const char*)
lto_module_create_from_memory(const void*, size_t)</pre>
and when done, the handle is released via<pre>
lto_module_dispose(lto_module_t)</pre>
The linker can introspect the non-native object file by getting the number
of symbols and getting the name and attributes of each symbol via: <pre>
lto_module_get_num_symbols(lto_module_t)
lto_module_get_symbol_name(lto_module_t, unsigned int)
lto_module_get_symbol_attribute(lto_module_t, unsigned int)</pre>
The attributes of a symbol include the alignment, visibility, and kind.
</p>
</div>
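<div class="doc_text">
<p>One way to pack alignment, kind and visibility into a single attribute
value, and to decode it again, is sketched below. The bit layout and every
<tt>TOY_*</tt> / <tt>toy_*</tt> name are assumptions made for this sketch and
are not taken from the libLTO header; the real values should be read from the
header that ships with libLTO.</p>
</div>

```c
#include <assert.h>

/* Hypothetical bit layout chosen for this sketch only. */
enum {
    TOY_ALIGNMENT_MASK   = 0x1F,      /* low 5 bits: log2 of the alignment */
    TOY_KIND_SHIFT       = 5,
    TOY_KIND_MASK        = 0x3 << 5,  /* 2 bits: e.g. 0 = undefined, 1 = regular */
    TOY_VISIBILITY_SHIFT = 7,
    TOY_VISIBILITY_MASK  = 0x3 << 7   /* 2 bits: e.g. 0 = internal, 2 = default */
};

/* Alignment in bytes, stored as a power-of-two exponent. */
static unsigned int toy_alignment_bytes(unsigned int attr) {
    return 1u << (attr & TOY_ALIGNMENT_MASK);
}

/* Extract the kind field. */
static unsigned int toy_kind(unsigned int attr) {
    return (attr & TOY_KIND_MASK) >> TOY_KIND_SHIFT;
}

/* Extract the visibility field. */
static unsigned int toy_visibility(unsigned int attr) {
    return (attr & TOY_VISIBILITY_MASK) >> TOY_VISIBILITY_SHIFT;
}
```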
<!-- ======================================================================= -->
<div class="doc_subsection">
<a name="readllvmobjectfile">readLLVMObjectFile()</a>
<a name="lto_code_gen_t">lto_code_gen_t</a>
</div>
<div class="doc_text">
<p>The <tt>readLLVMObjectFile()</tt> function is used by the linker to read
LLVM bitcode files and collect LLVMSymbol information. This routine also
supplies a list of externally defined symbols that are used by LLVM bitcode
files. The linker uses this symbol information to do symbol resolution.
Internally, <a href="#lto">LLVMlto</a> maintains LLVM bitcode modules in
memory. This function also provides a list of external references used by
bitcode files.</p>
</div>
<!-- ======================================================================= -->
<div class="doc_subsection">
<a name="optimizemodules">optimizeModules()</a>
</div>
<div class="doc_text">
<p>The linker invokes <tt>optimizeModules</tt> to optimize already read
LLVM bitcode files by applying LLVM intermodular optimization techniques.
This function runs the LLVM intermodular optimizer and generates native
object code as <tt>.o</tt> files at the name and location provided by the
linker.</p>
</div>
<!-- ======================================================================= -->
<div class="doc_subsection">
<a name="gettargettriple">getTargetTriple()</a>
</div>
<div class="doc_text">
<p>The linker may use <tt>getTargetTriple()</tt> to query target architecture
while validating LLVM bitcode file.</p>
</div>
<!-- ======================================================================= -->
<div class="doc_subsection">
<a name="removemodule">removeModule()</a>
</div>
<div class="doc_text">
<p>Internally, <a href="#lto">LLVMlto</a> maintains LLVM bitcode modules in
memory. The linker may use <tt>removeModule()</tt> method to remove desired
modules from memory. </p>
</div>
<!-- ======================================================================= -->
<div class="doc_subsection">
<a name="getalignment">getAlignment()</a>
</div>
<div class="doc_text">
<p>The linker may use <a href="#llvmsymbol">LLVMSymbol</a> method
<tt>getAlignment()</tt> to query symbol alignment information.</p>
</div>
<!-- *********************************************************************** -->
<div class="doc_section">
<a name="debug">Debugging Information</a>
</div>
<!-- *********************************************************************** -->
<div class="doc_text">
<p><tt> ... To be completed ... </tt></p>
<p>Once the linker has loaded each non-native object file into an
<tt>lto_module_t</tt>, it can request libLTO to process them all and
generate a native object file. This is done in a couple of steps.
First a code generator is created with:<pre>
lto_codegen_create() </pre>
then each non-native object file is added to the code generator with:<pre>
lto_codegen_add_module(lto_code_gen_t, lto_module_t)</pre>
The linker then has the option of setting some codegen options. Whether
or not to generate DWARF debug info is set with: <pre>
lto_codegen_set_debug_model(lto_code_gen_t, lto_debug_model) </pre>
Which kind of position independence is set with: <pre>
lto_codegen_set_pic_model(lto_code_gen_t, lto_codegen_model) </pre>
And each symbol that is referenced by a native object file or otherwise
must not be optimized away is set with: <pre>
lto_codegen_add_must_preserve_symbol(lto_code_gen_t, const char*)</pre>
After all these settings are done, the linker requests that a native
object file be created from the modules with the settings using:<pre>
lto_codegen_compile(lto_code_gen_t, size_t*)</pre>
which returns a pointer to a buffer containing the generated native
object file. The linker then parses that and links it with the rest
of the native object files.
</p>
</div>
<!-- *********************************************************************** -->
@@ -383,7 +346,7 @@ $ llvm-gcc a.o main.o -o main # &lt;-- standard link command without any modific
<a href="http://validator.w3.org/check/referer"><img
src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
Devang Patel<br>
Devang Patel and Nick Kledzik<br>
<a href="http://llvm.org">LLVM Compiler Infrastructure</a><br>
Last modified: $Date$
</address>