mirror of https://github.com/c64scene-ar/llvm-6502.git, synced 2025-02-04 23:32:00 +00:00
Move the "High Level Structure" to before "Type System"
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18695 91177308-0d34-0410-b5e6-96231b3b80d8
parent d4f0f9849a
commit fa73021cf1
@@ -17,6 +17,13 @@
<li><a href="#abstract">Abstract</a></li>
<li><a href="#introduction">Introduction</a></li>
<li><a href="#identifiers">Identifiers</a></li>
<li><a href="#highlevel">High Level Structure</a>
<ol>
<li><a href="#modulestructure">Module Structure</a></li>
<li><a href="#globalvars">Global Variables</a></li>
<li><a href="#functionstructure">Function Structure</a></li>
</ol>
</li>
<li><a href="#typesystem">Type System</a>
<ol>
<li><a href="#t_primitive">Primitive Types</a>
@@ -35,12 +42,7 @@
</li>
</ol>
</li>
<li><a href="#highlevel">High Level Structure</a>
<ol>
<li><a href="#modulestructure">Module Structure</a></li>
<li><a href="#globalvars">Global Variables</a></li>
<li><a href="#functionstructure">Function Structure</a></li>
</ol>
<li><a href="#constants">Constants</a>
</li>
<li><a href="#instref">Instruction Reference</a>
<ol>
@@ -279,10 +281,172 @@ exactly. For example, NaN's, infinities, and other special cases are
represented in their IEEE hexadecimal format so that assembly and
disassembly do not cause any bits to change in the constants.</p>
</div>

<!-- *********************************************************************** -->
<div class="doc_section"> <a name="highlevel">High Level Structure</a> </div>
<!-- *********************************************************************** -->

<!-- ======================================================================= -->
<div class="doc_subsection"> <a name="modulestructure">Module Structure</a>
</div>

<div class="doc_text">

<p>LLVM programs are composed of "Module"s, each of which is a
translation unit of the input programs. Each module consists of
functions, global variables, and symbol table entries. Modules may be
combined together with the LLVM linker, which merges function (and
global variable) definitions, resolves forward declarations, and merges
symbol table entries. Here is an example of the "hello world" module:</p>

<pre><i>; Declare the string constant as a global constant...</i>
<a href="#identifiers">%.LC0</a> = <a href="#linkage_internal">internal</a> <a
href="#globalvars">constant</a> <a href="#t_array">[13 x sbyte]</a> c"hello world\0A\00" <i>; [13 x sbyte]*</i>

<i>; External declaration of the puts function</i>
<a href="#functionstructure">declare</a> int %puts(sbyte*) <i>; int(sbyte*)* </i>

<i>; Definition of main function</i>
int %main() { <i>; int()* </i>
<i>; Convert [13x sbyte]* to sbyte *...</i>
%cast210 = <a
href="#i_getelementptr">getelementptr</a> [13 x sbyte]* %.LC0, long 0, long 0 <i>; sbyte*</i>

<i>; Call puts function to write out the string to stdout...</i>
<a
href="#i_call">call</a> int %puts(sbyte* %cast210) <i>; int</i>
<a
href="#i_ret">ret</a> int 0<br>}<br></pre>

<p>This example is made up of a <a href="#globalvars">global variable</a>
named "<tt>.LC0</tt>", an external declaration of the "<tt>puts</tt>"
function, and a <a href="#functionstructure">function definition</a>
for "<tt>main</tt>".</p>

<a name="linkage"> In general, a module is made up of a list of global
values, where both functions and global variables are global values.
Global values are represented by a pointer to a memory location (in
this case, a pointer to an array of char, and a pointer to a function),
and have one of the following linkage types:</a>

<p> </p>

<dl>
<dt><tt><b><a name="linkage_internal">internal</a></b></tt> </dt>
<dd>Global values with internal linkage are only directly accessible
by objects in the current module. In particular, linking code into a
module with an internal global value may cause the internal to be
renamed as necessary to avoid collisions. Because the symbol is
internal to the module, all references can be updated. This
corresponds to the notion of the '<tt>static</tt>' keyword in C, or the
idea of "anonymous namespaces" in C++.
<p> </p>
</dd>
<dt><tt><b><a name="linkage_linkonce">linkonce</a></b></tt>: </dt>
<dd>"<tt>linkonce</tt>" linkage is similar to <tt>internal</tt>
linkage, with the twist that linking together two modules defining the
same <tt>linkonce</tt> globals will cause one of the globals to be
discarded. This is typically used to implement inline functions.
Unreferenced <tt>linkonce</tt> globals are allowed to be discarded.
<p> </p>
</dd>
<dt><tt><b><a name="linkage_weak">weak</a></b></tt>: </dt>
<dd>"<tt>weak</tt>" linkage is exactly the same as <tt>linkonce</tt>
linkage, except that unreferenced <tt>weak</tt> globals may not be
discarded. This is used to implement constructs in C such as "<tt>int
X;</tt>" at global scope.
<p> </p>
</dd>
<dt><tt><b><a name="linkage_appending">appending</a></b></tt>: </dt>
<dd>"<tt>appending</tt>" linkage may only be applied to global
variables of pointer to array type. When two global variables with
appending linkage are linked together, the two global arrays are
appended together. This is the LLVM, typesafe, equivalent of having
the system linker append together "sections" with identical names when
.o files are linked.
<p> </p>
</dd>
<dt><tt><b><a name="linkage_external">externally visible</a></b></tt>:</dt>
<dd>If none of the above identifiers are used, the global is
externally visible, meaning that it participates in linkage and can be
used to resolve external symbol references.
<p> </p>
</dd>
</dl>

<p> </p>

<p><a name="linkage_external">For example, since the "<tt>.LC0</tt>"
variable is defined to be internal, if another module defined a "<tt>.LC0</tt>"
variable and was linked with this one, one of the two would be renamed,
preventing a collision. Since "<tt>main</tt>" and "<tt>puts</tt>" are
external (i.e., lacking any linkage declarations), they are accessible
outside of the current module. It is illegal for a function <i>declaration</i>
to have any linkage type other than "externally visible".</a></p>
</div>

<!-- ======================================================================= -->
<div class="doc_subsection">
<a name="globalvars">Global Variables</a>
</div>

<div class="doc_text">

<p>Global variables define regions of memory allocated at compilation
time instead of run-time. Global variables may optionally be
initialized. A variable may be defined as a global "constant", which
indicates that the contents of the variable will never be modified
(enabling better optimization, allowing the global data to be placed in the
read-only section of an executable, etc).</p>

<p>As SSA values, global variables define pointer values that are in
scope (i.e. they dominate) all basic blocks in the program. Global
variables always define a pointer to their "content" type because they
describe a region of memory, and all memory objects in LLVM are
accessed through pointers.</p>

</div>


<!-- ======================================================================= -->
<div class="doc_subsection">
<a name="functionstructure">Functions</a>
</div>

<div class="doc_text">

<p>LLVM function definitions are composed of a (possibly empty) argument list,
an opening curly brace, a list of basic blocks, and a closing curly brace. LLVM
function declarations are defined with the "<tt>declare</tt>" keyword, a
function name, and a function signature.</p>

<p>A function definition contains a list of basic blocks, forming the CFG for
the function. Each basic block may optionally start with a label (giving the
basic block a symbol table entry), contains a list of instructions, and ends
with a <a href="#terminators">terminator</a> instruction (such as a branch or
function return).</p>

<p>The first basic block in program is special in two ways: it is immediately
executed on entrance to the function, and it is not allowed to have predecessor
basic blocks (i.e. there can not be any branches to the entry block of a
function). Because the block can have no predecessors, it also cannot have any
<a href="#i_phi">PHI nodes</a>.</p>

<p>LLVM functions are identified by their name and type signature. Hence, two
functions with the same name but different parameter lists or return values are
considered different functions, and LLVM will resolves references to each
appropriately.</p>

</div>



<!-- *********************************************************************** -->
<div class="doc_section"> <a name="typesystem">Type System</a> </div>
<!-- *********************************************************************** -->

<div class="doc_text">

<p>The LLVM type system is one of the most important features of the
intermediate representation. Being typed enables a number of
optimizations to be performed on the IR directly, without having to do
@@ -290,9 +454,9 @@ extra analyses on the side before the transformation. A strong type
system makes it easier to read the generated code and enables novel
analyses and transformations that are not feasible to perform on normal
three address code representations.</p>
<!-- The written form for the type system was heavily influenced by the
syntactic problems with types in the C language<sup><a
href="#rw_stroustrup">1</a></sup>.<p> --> </div>

</div>

<!-- ======================================================================= -->
<div class="doc_subsection"> <a name="t_primitive">Primitive Types</a> </div>
<div class="doc_text">
@@ -557,152 +721,6 @@ be any integral or floating point type.</p>
</table>
</div>

<!-- *********************************************************************** -->
<div class="doc_section"> <a name="highlevel">High Level Structure</a> </div>
<!-- *********************************************************************** -->
<!-- ======================================================================= -->
<div class="doc_subsection"> <a name="modulestructure">Module Structure</a>
</div>
<div class="doc_text">
<p>LLVM programs are composed of "Module"s, each of which is a
translation unit of the input programs. Each module consists of
functions, global variables, and symbol table entries. Modules may be
combined together with the LLVM linker, which merges function (and
global variable) definitions, resolves forward declarations, and merges
symbol table entries. Here is an example of the "hello world" module:</p>
<pre><i>; Declare the string constant as a global constant...</i>
<a href="#identifiers">%.LC0</a> = <a href="#linkage_internal">internal</a> <a
href="#globalvars">constant</a> <a href="#t_array">[13 x sbyte]</a> c"hello world\0A\00" <i>; [13 x sbyte]*</i>

<i>; External declaration of the puts function</i>
<a href="#functionstructure">declare</a> int %puts(sbyte*) <i>; int(sbyte*)* </i>

<i>; Definition of main function</i>
int %main() { <i>; int()* </i>
<i>; Convert [13x sbyte]* to sbyte *...</i>
%cast210 = <a
href="#i_getelementptr">getelementptr</a> [13 x sbyte]* %.LC0, long 0, long 0 <i>; sbyte*</i>

<i>; Call puts function to write out the string to stdout...</i>
<a
href="#i_call">call</a> int %puts(sbyte* %cast210) <i>; int</i>
<a
href="#i_ret">ret</a> int 0<br>}<br></pre>
<p>This example is made up of a <a href="#globalvars">global variable</a>
named "<tt>.LC0</tt>", an external declaration of the "<tt>puts</tt>"
function, and a <a href="#functionstructure">function definition</a>
for "<tt>main</tt>".</p>
<a name="linkage"> In general, a module is made up of a list of global
values, where both functions and global variables are global values.
Global values are represented by a pointer to a memory location (in
this case, a pointer to an array of char, and a pointer to a function),
and have one of the following linkage types:</a>
<p> </p>
<dl>
<dt><tt><b><a name="linkage_internal">internal</a></b></tt> </dt>
<dd>Global values with internal linkage are only directly accessible
by objects in the current module. In particular, linking code into a
module with an internal global value may cause the internal to be
renamed as necessary to avoid collisions. Because the symbol is
internal to the module, all references can be updated. This
corresponds to the notion of the '<tt>static</tt>' keyword in C, or the
idea of "anonymous namespaces" in C++.
<p> </p>
</dd>
<dt><tt><b><a name="linkage_linkonce">linkonce</a></b></tt>: </dt>
<dd>"<tt>linkonce</tt>" linkage is similar to <tt>internal</tt>
linkage, with the twist that linking together two modules defining the
same <tt>linkonce</tt> globals will cause one of the globals to be
discarded. This is typically used to implement inline functions.
Unreferenced <tt>linkonce</tt> globals are allowed to be discarded.
<p> </p>
</dd>
<dt><tt><b><a name="linkage_weak">weak</a></b></tt>: </dt>
<dd>"<tt>weak</tt>" linkage is exactly the same as <tt>linkonce</tt>
linkage, except that unreferenced <tt>weak</tt> globals may not be
discarded. This is used to implement constructs in C such as "<tt>int
X;</tt>" at global scope.
<p> </p>
</dd>
<dt><tt><b><a name="linkage_appending">appending</a></b></tt>: </dt>
<dd>"<tt>appending</tt>" linkage may only be applied to global
variables of pointer to array type. When two global variables with
appending linkage are linked together, the two global arrays are
appended together. This is the LLVM, typesafe, equivalent of having
the system linker append together "sections" with identical names when
.o files are linked.
<p> </p>
</dd>
<dt><tt><b><a name="linkage_external">externally visible</a></b></tt>:</dt>
<dd>If none of the above identifiers are used, the global is
externally visible, meaning that it participates in linkage and can be
used to resolve external symbol references.
<p> </p>
</dd>
</dl>
<p> </p>
<p><a name="linkage_external">For example, since the "<tt>.LC0</tt>"
variable is defined to be internal, if another module defined a "<tt>.LC0</tt>"
variable and was linked with this one, one of the two would be renamed,
preventing a collision. Since "<tt>main</tt>" and "<tt>puts</tt>" are
external (i.e., lacking any linkage declarations), they are accessible
outside of the current module. It is illegal for a function <i>declaration</i>
to have any linkage type other than "externally visible".</a></p>
</div>

<!-- ======================================================================= -->
<div class="doc_subsection">
<a name="globalvars">Global Variables</a>
</div>

<div class="doc_text">

<p>Global variables define regions of memory allocated at compilation
time instead of run-time. Global variables may optionally be
initialized. A variable may be defined as a global "constant", which
indicates that the contents of the variable will never be modified
(opening options for optimization).</p>

<p>As SSA values, global variables define pointer values that are in
scope (i.e. they dominate) for all basic blocks in the program. Global
variables always define a pointer to their "content" type because they
describe a region of memory, and all memory objects in LLVM are
accessed through pointers.</p>

</div>


<!-- ======================================================================= -->
<div class="doc_subsection">
<a name="functionstructure">Functions</a>
</div>

<div class="doc_text">

<p>LLVM function definitions are composed of a (possibly empty) argument list,
an opening curly brace, a list of basic blocks, and a closing curly brace. LLVM
function declarations are defined with the "<tt>declare</tt>" keyword, a
function name, and a function signature.</p>

<p>A function definition contains a list of basic blocks, forming the CFG for
the function. Each basic block may optionally start with a label (giving the
basic block a symbol table entry), contains a list of instructions, and ends
with a <a href="#terminators">terminator</a> instruction (such as a branch or
function return).</p>

<p>The first basic block in program is special in two ways: it is immediately
executed on entrance to the function, and it is not allowed to have predecessor
basic blocks (i.e. there can not be any branches to the entry block of a
function). Because the block can have no predecessors, it also cannot have any
<a href="#i_phi">PHI nodes</a>.</p>

<p>LLVM functions are identified by their name and type signature. Hence, two
functions with the same name but different parameter lists or return values are
considered different functions, and LLVM will resolves references to each
appropriately.</p>

</div>


<!-- *********************************************************************** -->
<div class="doc_section"> <a name="instref">Instruction Reference</a> </div>