diff --git a/docs/LangRef.html b/docs/LangRef.html
index 25b3b1edc63..42767e63d50 100644
--- a/docs/LangRef.html
+++ b/docs/LangRef.html
@@ -20,6 +20,7 @@
  • High Level Structure
    1. Module Structure
    2. +
    3. Linkage Types
    4. Global Variables
    5. Function Structure
    @@ -220,66 +221,88 @@ the parser.

    purposes:

      -
    1. Numeric constants are represented as you would expect: 12, -3 -123.421, etc. Floating point constants have an optional hexadecimal -notation.
    2. -
    3. Named values are represented as a string of characters with a '%' -prefix. For example, %foo, %DivisionByZero, -%a.really.long.identifier. The actual regular expression used is '%[a-zA-Z$._][a-zA-Z$._0-9]*'. -Identifiers which require other characters in their names can be -surrounded with quotes. In this way, anything except a " -character can be used in a name.
    4. -
    5. Unnamed values are represented as an unsigned numeric value with -a '%' prefix. For example, %12, %2, %44.
    6. +
    7. Numeric constants are represented as you would expect: 12, -3, 123.421, + etc. Floating point constants have an optional hexadecimal notation.
    8. + +
    9. Named values are represented as a string of characters with a '%' prefix. + For example, %foo, %DivisionByZero, %a.really.long.identifier. The actual + regular expression used is '%[a-zA-Z$._][a-zA-Z$._0-9]*'. + Identifiers which require other characters in their names can be surrounded + with quotes. In this way, anything except a " character can be used + in a name.
    10. + +
    11. Unnamed values are represented as an unsigned numeric value with a '%' + prefix. For example, %12, %2, %44.
    12. +
    -

    LLVM requires that values start with a '%' sign for two reasons: -Compilers don't need to worry about name clashes with reserved words, -and the set of reserved words may be expanded in the future without -penalty. Additionally, unnamed identifiers allow a compiler to quickly -come up with a temporary variable without having to avoid symbol table -conflicts.

    + +

    LLVM requires that values start with a '%' sign for two reasons: Compilers +don't need to worry about name clashes with reserved words, and the set of +reserved words may be expanded in the future without penalty. Additionally, +unnamed identifiers allow a compiler to quickly come up with a temporary +variable without having to avoid symbol table conflicts.
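
As a rough illustration of the identifier rules above, the following Python sketch applies the quoted regular expression to a few tokens. The `classify` helper and the pattern for unnamed values are assumptions made for this example (the text only says that unnamed values are an unsigned number with a '%' prefix); quoted names are not handled, and this is not the parser's actual code.

```python
import re

# Pattern quoted in the text for named values that need no quoting.
NAMED_VALUE = re.compile(r"%[a-zA-Z$._][a-zA-Z$._0-9]*")

# Assumed pattern for unnamed values: '%' followed by an unsigned number.
UNNAMED_VALUE = re.compile(r"%[0-9]+")


def classify(token: str) -> str:
    """Return 'named', 'unnamed', or 'other' for a single token (hypothetical helper)."""
    if NAMED_VALUE.fullmatch(token):
        return "named"
    if UNNAMED_VALUE.fullmatch(token):
        return "unnamed"
    # Reserved words such as 'add' land here because they lack the '%' prefix.
    return "other"


if __name__ == "__main__":
    for tok in ["%foo", "%DivisionByZero", "%a.really.long.identifier", "%12", "add"]:
        print(tok, "->", classify(tok))
```

Because reserved words never begin with '%', they can never match either pattern, which is the property the paragraph above relies on.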

    +

    Reserved words in LLVM are very similar to reserved words in other languages. There are keywords for different opcodes ('add', 'cast', 'ret', etc...), for primitive type names ('void', 'uint', -etc...), and others. These reserved words cannot conflict with -variable names, because none of them start with a '%' character.

    -

    Here is an example of LLVM code to multiply the integer variable '%X' -by 8:

    +href="#i_add">add', 'cast', 'ret', etc...), for primitive type names ('void', 'uint', etc...), +and others. These reserved words cannot conflict with variable names, because +none of them start with a '%' character.

    + +

    Here is an example of LLVM code to multiply the integer variable +'%X' by 8:

    +

    The easy way:

    -
      %result = mul uint %X, 8
    + +
    +  %result = mul uint %X, 8
    +
    +

    After strength reduction:

    -
      %result = shl uint %X, ubyte 3
    + +
    +  %result = shl uint %X, ubyte 3
    +
    +

    And the hard way:

    -
      add uint %X, %X           ; yields {uint}:%0
    -  add uint %0, %0           ; yields {uint}:%1
    -  %result = add uint %1, %1
    + +
    +  add uint %X, %X           ; yields {uint}:%0
    +  add uint %0, %0           ; yields {uint}:%1
    +  %result = add uint %1, %1
    +
    +

    This last way of multiplying %X by 8 illustrates several important lexical features of LLVM:

    +
      -
    1. Comments are delimited with a ';' and go until the end -of line.
    2. -
    3. Unnamed temporaries are created when the result of a computation -is not assigned to a named value.
    4. + +
    5. Comments are delimited with a ';' and go until the end of + line.
    6. + +
    7. Unnamed temporaries are created when the result of a computation is not + assigned to a named value.
    8. +
    9. Unnamed temporaries are numbered sequentially.
    10. +
    -

    ...and it also show a convention that we follow in this document. -When demonstrating instructions, we will follow an instruction with a -comment that defines the type and name of value produced. Comments are -shown in italic text.

    -

    The one non-intuitive notation for constants is the optional -hexidecimal form of floating point constants. For example, the form 'double + +

    ...and it also shows a convention that we follow in this document. When +demonstrating instructions, we will follow an instruction with a comment that +defines the type and name of the value produced. Comments are shown in italic +text.

    + +
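
The three ways of multiplying %X by 8 shown earlier should all produce the same value. The Python sketch below models them under the assumption that 'uint' is a 32-bit unsigned integer with wrap-around arithmetic; the function names and the `MASK` constant are hypothetical and exist only for this illustration.

```python
MASK = 0xFFFFFFFF  # assumption: 'uint' is 32 bits wide and wraps on overflow


def mul_easy(x: int) -> int:
    """Models: %result = mul uint %X, 8"""
    return (x * 8) & MASK


def mul_shift(x: int) -> int:
    """Models: %result = shl uint %X, ubyte 3"""
    return (x << 3) & MASK


def mul_adds(x: int) -> int:
    """Models the three chained adds, with %0 and %1 as unnamed temporaries."""
    t0 = (x + x) & MASK      # add uint %X, %X   ; yields %0
    t1 = (t0 + t0) & MASK    # add uint %0, %0   ; yields %1
    return (t1 + t1) & MASK  # %result = add uint %1, %1


if __name__ == "__main__":
    for x in (0, 1, 5, 0x20000000, 0xFFFFFFFF):
        print(x, mul_easy(x), mul_shift(x), mul_adds(x))
```

The local names `t0` and `t1` play the role of the sequentially numbered unnamed temporaries in the "hard way" listing.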

    The one non-intuitive notation for constants is the optional hexadecimal form +of floating point constants. For example, the form 'double 0x432ff973cafa8000' is equivalent to (but harder to read than) 'double

    +4.5e+15
    ' which is also supported by the parser. The only time hexadecimal +floating point constants are useful (and the only time that they are generated +by the disassembler) is when an FP constant has to be emitted that cannot be +represented exactly as a decimal floating point number. For example, NaNs, +infinities, and other special cases are represented in their IEEE hexadecimal +format so that assembly and disassembly do not cause any bits to change in the +constants.
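
To make the hexadecimal form concrete, here is a small Python sketch that converts between the bit pattern and the decimal value, assuming the hexadecimal digits are the big-endian IEEE 754 double-precision encoding. The helper names `decode_hex_double` and `encode_hex_double` are hypothetical and are not part of any LLVM tool.

```python
import struct


def decode_hex_double(text: str) -> float:
    """Interpret a string such as '0x432ff973cafa8000' as IEEE 754 double bits."""
    bits = int(text, 16)  # base 16 accepts the leading '0x'
    return struct.unpack(">d", bits.to_bytes(8, "big"))[0]


def encode_hex_double(value: float) -> str:
    """Return the 64-bit IEEE 754 pattern of a float as a hexadecimal string."""
    return "0x" + struct.pack(">d", value).hex()


if __name__ == "__main__":
    print(decode_hex_double("0x432ff973cafa8000"))
    print(encode_hex_double(4.5e15))
    print(encode_hex_double(float("inf")))
```

Round-tripping a value through its bit pattern this way never changes any bits, which is why the hexadecimal form suits NaNs, infinities, and other values without an exact decimal spelling.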

    @@ -323,59 +346,70 @@ named ".LC0", an external declaration of the "puts" function, and a function definition for "main".

    - In general, a module is made up of a list of global -values, where both functions and global variables are global values. -Global values are represented by a pointer to a memory location (in -this case, a pointer to an array of char, and a pointer to a function), -and have one of the following linkage types: +

    In general, a module is made up of a list of global values, +where both functions and global variables are global values. Global values are +represented by a pointer to a memory location (in this case, a pointer to an +array of char, and a pointer to a function), and have one of the following linkage types.

    -

    + + + +
    + Linkage Types +
    + +
    + +

    +All Global Variables and Functions have one of the following types of linkage: +

    +
    internal
    -
    Global values with internal linkage are only directly accessible -by objects in the current module. In particular, linking code into a -module with an internal global value may cause the internal to be -renamed as necessary to avoid collisions. Because the symbol is -internal to the module, all references can be updated. This -corresponds to the notion of the 'static' keyword in C, or the -idea of "anonymous namespaces" in C++. -

    + +
    Global values with internal linkage are only directly accessible by + objects in the current module. In particular, linking code into a module with + an internal global value may cause the internal to be renamed as necessary to + avoid collisions. Because the symbol is internal to the module, all + references can be updated. This corresponds to the notion of the + 'static' keyword in C, or the idea of "anonymous namespaces" in C++.
    +
    linkonce:
    -
    "linkonce" linkage is similar to internal -linkage, with the twist that linking together two modules defining the -same linkonce globals will cause one of the globals to be -discarded. This is typically used to implement inline functions. -Unreferenced linkonce globals are allowed to be discarded. -

    + +
    "linkonce" linkage is similar to internal linkage, with + the twist that linking together two modules defining the same + linkonce globals will cause one of the globals to be discarded. This + is typically used to implement inline functions. Unreferenced + linkonce globals are allowed to be discarded.
    +
    weak:
    -
    "weak" linkage is exactly the same as linkonce -linkage, except that unreferenced weak globals may not be -discarded. This is used to implement constructs in C such as "int -X;" at global scope. -

    + +
    "weak" linkage is exactly the same as linkonce linkage, + except that unreferenced weak globals may not be discarded. This is + used to implement constructs in C such as "int X;" at global scope.
    +
    appending:
    -
    "appending" linkage may only be applied to global -variables of pointer to array type. When two global variables with -appending linkage are linked together, the two global arrays are -appended together. This is the LLVM, typesafe, equivalent of having -the system linker append together "sections" with identical names when -.o files are linked. -

    + +
    "appending" linkage may only be applied to global variables of + pointer to array type. When two global variables with appending linkage are + linked together, the two global arrays are appended together. This is the + LLVM, typesafe, equivalent of having the system linker append together + "sections" with identical names when .o files are linked.
    +
    externally visible:
    -
    If none of the above identifiers are used, the global is -externally visible, meaning that it participates in linkage and can be -used to resolve external symbol references. -

    + +
    If none of the above identifiers are used, the global is externally + visible, meaning that it participates in linkage and can be used to resolve + external symbol references.
    -

    -

    For example, since the ".LC0" variable is defined to be internal, if another module defined a ".LC0" variable and was linked with this one, one of the two would be renamed, @@ -383,6 +417,7 @@ preventing a collision. Since "main" and "puts" are external (i.e., lacking any linkage declarations), they are accessible outside of the current module. It is illegal for a function declaration to have any linkage type other than "externally visible".

    +