diff --git a/docs/LangRef.html b/docs/LangRef.html
index 41379db37f7..25b3b1edc63 100644
--- a/docs/LangRef.html
+++ b/docs/LangRef.html
@@ -17,6 +17,13 @@
  • Abstract
  • Introduction
  • Identifiers
  • +
  • High Level Structure +
      +
    1. Module Structure
    2. +
    3. Global Variables
    4. +
    5. Function Structure
    6. +
    +
  • Type System
    1. Primitive Types @@ -35,12 +42,7 @@
  • -
  • High Level Structure -
      -
    1. Module Structure
    2. -
    3. Global Variables
    4. -
    5. Function Structure
    6. -
    +
  • Constants
  • Instruction Reference
      @@ -279,10 +281,172 @@ exactly. For example, NaN's, infinities, and other special cases are represented in their IEEE hexadecimal format so that assembly and disassembly do not cause any bits to change in the constants.

      + + + + + + + + +
      + +

      LLVM programs are composed of "Module"s, each of which is a +translation unit of the input programs. Each module consists of +functions, global variables, and symbol table entries. Modules may be +combined together with the LLVM linker, which merges function (and +global variable) definitions, resolves forward declarations, and merges +symbol table entries. Here is an example of the "hello world" module:

      + +
      ; Declare the string constant as a global constant...
      +%.LC0 = internal constant [13 x sbyte] c"hello world\0A\00"          ; [13 x sbyte]*
      +
      +; External declaration of the puts function
      +declare int %puts(sbyte*)                                            ; int(sbyte*)* 
      +
      +; Definition of main function
      +int %main() {                                                        ; int()* 
      +        ; Convert [13x sbyte]* to sbyte *...
      +        %cast210 = getelementptr [13 x sbyte]* %.LC0, long 0, long 0 ; sbyte*
      +
      +        ; Call puts function to write out the string to stdout...
      +        call int %puts(sbyte* %cast210)                              ; int
      +        ret int 0
      }
      + +

      This example is made up of a global variable +named ".LC0", an external declaration of the "puts" +function, and a function definition +for "main".

      + + In general, a module is made up of a list of global +values, where both functions and global variables are global values. +Global values are represented by a pointer to a memory location (in +this case, a pointer to an array of char, and a pointer to a function), +and have one of the following linkage types: + +

      + +
      +
      internal:
      +
      Global values with internal linkage are only directly accessible +by objects in the current module. In particular, linking code into a +module with an internal global value may cause the internal to be +renamed as necessary to avoid collisions. Because the symbol is +internal to the module, all references can be updated. This +corresponds to the notion of the 'static' keyword in C, or the +idea of "anonymous namespaces" in C++. +

      +
      +
      linkonce:
      +
      "linkonce" linkage is similar to internal +linkage, with the twist that linking together two modules defining the +same linkonce globals will cause one of the globals to be +discarded. This is typically used to implement inline functions. +Unreferenced linkonce globals are allowed to be discarded. +

      +
      +
      weak:
      +
      "weak" linkage is exactly the same as linkonce +linkage, except that unreferenced weak globals may not be +discarded. This is used to implement constructs in C such as "int +X;" at global scope. +

      +
      +
      appending:
      +
      "appending" linkage may only be applied to global +variables of pointer to array type. When two global variables with +appending linkage are linked together, the two global arrays are +appended together. This is the LLVM, typesafe, equivalent of having +the system linker append together "sections" with identical names when +.o files are linked. +

      +
      +
      externally visible:
      +
      If none of the above identifiers are used, the global is +externally visible, meaning that it participates in linkage and can be +used to resolve external symbol references. +

      +
      +
      + +

      + +

      For example, since the ".LC0" +variable is defined to be internal, if another module defined a ".LC0" +variable and was linked with this one, one of the two would be renamed, +preventing a collision. Since "main" and "puts" are +external (i.e., lacking any linkage declarations), they are accessible +outside of the current module. It is illegal for a function declaration +to have any linkage type other than "externally visible".
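
As a rough illustration of the linkage types listed above, the following sketch declares one global value of each kind, in the same assembly syntax as the "hello world" module. Every name in it (%counter, %tentative, %ctors, %visible, %helper) is hypothetical and chosen only for this example.

```llvm
; Hypothetical module: the names below are illustrative only.
%counter   = internal global int 0                 ; only accessible inside this module
%tentative = weak global int 0                     ; e.g. C's "int X;" at global scope
%ctors     = appending global [1 x int] [ int 1 ]  ; same-named arrays are appended when linked
%visible   = global int 7                          ; no linkage keyword: externally visible

; linkonce is typically used for inline functions; when two modules
; define the same linkonce global, one of the copies is discarded.
linkonce int %helper() {
        ret int 0
}
```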

      +
      + + + + +
      + +

      Global variables define regions of memory allocated at compilation +time instead of run-time. Global variables may optionally be +initialized. A variable may be defined as a global "constant", which +indicates that the contents of the variable will never be modified +(enabling better optimization, allowing the global data to be placed in the +read-only section of an executable, etc).

      + +

      As SSA values, global variables define pointer values that are in +scope (i.e. they dominate) all basic blocks in the program. Global +variables always define a pointer to their "content" type because they +describe a region of memory, and all memory objects in LLVM are +accessed through pointers.
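
A minimal sketch of what "global variables define a pointer to their content type" means in practice, assuming the same assembly syntax as the "hello world" module. The names %G, %K, and %sum are hypothetical.

```llvm
; Hypothetical globals: the names are illustrative only.
%G = global int 42              ; the value %G has type int*, not int
%K = constant int 7             ; contents are never modified

int %sum() {
        %g = load int* %G       ; read the contents of %G through its pointer
        %k = load int* %K       ; a constant is also accessed through a pointer
        %s = add int %g, %k
        ret int %s
}
```

Because %G and %K are in scope in every basic block, the function can refer to them without receiving them as arguments.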

      + +
      + + + + + +
      + +

      LLVM function definitions are composed of a (possibly empty) argument list, +an opening curly brace, a list of basic blocks, and a closing curly brace. LLVM +function declarations are defined with the "declare" keyword, a +function name, and a function signature.

      + +

      A function definition contains a list of basic blocks, forming the CFG for +the function. Each basic block may optionally start with a label (giving the +basic block a symbol table entry), contains a list of instructions, and ends +with a terminator instruction (such as a branch or +function return).

      + +

      The first basic block in a function is special in two ways: it is immediately +executed on entrance to the function, and it is not allowed to have predecessor +basic blocks (i.e. there cannot be any branches to the entry block of a +function). Because the block can have no predecessors, it also cannot have any +PHI nodes.
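
The block structure described above can be sketched with a small function, assuming the same assembly syntax as the "hello world" module. The function %max and its labels are hypothetical names used only for illustration.

```llvm
; Hypothetical function: returns the larger of its two arguments.
int %max(int %a, int %b) {
entry:                                  ; first block: nothing branches here, so no PHI nodes
        %cond = setgt int %a, %b        ; bool
        br bool %cond, label %then, label %else   ; terminator
then:
        br label %done                  ; terminator
else:
        br label %done                  ; terminator
done:                                   ; two predecessors, so a PHI node is allowed
        %r = phi int [ %a, %then ], [ %b, %else ]
        ret int %r                      ; terminator
}
```

Each labeled block ends with exactly one terminator instruction, and only the "done" block, which has predecessors, contains a PHI node.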

      + +

      LLVM functions are identified by their name and type signature. Hence, two +functions with the same name but different parameter lists or return values are +considered different functions, and LLVM will resolve references to each +appropriately.

      + +
      + + + +
      +

      The LLVM type system is one of the most important features of the intermediate representation. Being typed enables a number of optimizations to be performed on the IR directly, without having to do @@ -290,9 +454,9 @@ extra analyses on the side before the transformation. A strong type system makes it easier to read the generated code and enables novel analyses and transformations that are not feasible to perform on normal three address code representations.

      -
      + + +
      @@ -557,152 +721,6 @@ be any integral or floating point type.

      - - - - - -
      -

      LLVM programs are composed of "Module"s, each of which is a -translation unit of the input programs. Each module consists of -functions, global variables, and symbol table entries. Modules may be -combined together with the LLVM linker, which merges function (and -global variable) definitions, resolves forward declarations, and merges -symbol table entries. Here is an example of the "hello world" module:

      -
      ; Declare the string constant as a global constant...
      -%.LC0 = internal constant [13 x sbyte] c"hello world\0A\00"          ; [13 x sbyte]*
      -
      -; External declaration of the puts function
      -declare int %puts(sbyte*)                                            ; int(sbyte*)* 
      -
      -; Definition of main function
      -int %main() {                                                        ; int()* 
      -        ; Convert [13x sbyte]* to sbyte *...
      -        %cast210 = getelementptr [13 x sbyte]* %.LC0, long 0, long 0 ; sbyte*
      -
      -        ; Call puts function to write out the string to stdout...
      -        call int %puts(sbyte* %cast210)                              ; int
      -        ret int 0
      }
      -

      This example is made up of a global variable -named ".LC0", an external declaration of the "puts" -function, and a function definition -for "main".

      - In general, a module is made up of a list of global -values, where both functions and global variables are global values. -Global values are represented by a pointer to a memory location (in -this case, a pointer to an array of char, and a pointer to a function), -and have one of the following linkage types: -

      -
      -
      internal
      -
      Global values with internal linkage are only directly accessible -by objects in the current module. In particular, linking code into a -module with an internal global value may cause the internal to be -renamed as necessary to avoid collisions. Because the symbol is -internal to the module, all references can be updated. This -corresponds to the notion of the 'static' keyword in C, or the -idea of "anonymous namespaces" in C++. -

      -
      -
      linkonce:
      -
      "linkonce" linkage is similar to internal -linkage, with the twist that linking together two modules defining the -same linkonce globals will cause one of the globals to be -discarded. This is typically used to implement inline functions. -Unreferenced linkonce globals are allowed to be discarded. -

      -
      -
      weak:
      -
      "weak" linkage is exactly the same as linkonce -linkage, except that unreferenced weak globals may not be -discarded. This is used to implement constructs in C such as "int -X;" at global scope. -

      -
      -
      appending:
      -
      "appending" linkage may only be applied to global -variables of pointer to array type. When two global variables with -appending linkage are linked together, the two global arrays are -appended together. This is the LLVM, typesafe, equivalent of having -the system linker append together "sections" with identical names when -.o files are linked. -

      -
      -
      externally visible:
      -
      If none of the above identifiers are used, the global is -externally visible, meaning that it participates in linkage and can be -used to resolve external symbol references. -

      -
      -
      -

      -

      For example, since the ".LC0" -variable is defined to be internal, if another module defined a ".LC0" -variable and was linked with this one, one of the two would be renamed, -preventing a collision. Since "main" and "puts" are -external (i.e., lacking any linkage declarations), they are accessible -outside of the current module. It is illegal for a function declaration -to have any linkage type other than "externally visible".

      -
      - - - - -
      - -

      Global variables define regions of memory allocated at compilation -time instead of run-time. Global variables may optionally be -initialized. A variable may be defined as a global "constant", which -indicates that the contents of the variable will never be modified -(opening options for optimization).

      - -

      As SSA values, global variables define pointer values that are in -scope (i.e. they dominate) for all basic blocks in the program. Global -variables always define a pointer to their "content" type because they -describe a region of memory, and all memory objects in LLVM are -accessed through pointers.

      - -
      - - - - - -
      - -

      LLVM function definitions are composed of a (possibly empty) argument list, -an opening curly brace, a list of basic blocks, and a closing curly brace. LLVM -function declarations are defined with the "declare" keyword, a -function name, and a function signature.

      - -

      A function definition contains a list of basic blocks, forming the CFG for -the function. Each basic block may optionally start with a label (giving the -basic block a symbol table entry), contains a list of instructions, and ends -with a terminator instruction (such as a branch or -function return).

      - -

      The first basic block in program is special in two ways: it is immediately -executed on entrance to the function, and it is not allowed to have predecessor -basic blocks (i.e. there can not be any branches to the entry block of a -function). Because the block can have no predecessors, it also cannot have any -PHI nodes.

      - -

      LLVM functions are identified by their name and type signature. Hence, two -functions with the same name but different parameter lists or return values are -considered different functions, and LLVM will resolves references to each -appropriately.

      - -
      -