diff --git a/docs/ProgrammersManual.html b/docs/ProgrammersManual.html index 56751e2bd4e..dfd8a20128f 100644 --- a/docs/ProgrammersManual.html +++ b/docs/ProgrammersManual.html @@ -1,20 +1,18 @@ - + LLVM Programmer's Manual + - - - - - - - -
  LLVM Programmer's Manual
+ + +
+ LLVM Programmer's Manual +
+
    -
  1. Introduction
  2. +
  3. Introduction
  4. General Information
+ + + -Here we highlight some LLVM APIs that are generally useful and good to -know about when writing transformations. -

- - - - - - - - -
   The isa<>, -cast<> and dyn_cast<> templates
- - - - - - - - -
   The DEBUG() macro -& -debug option
- -

-
Fine grained debug info with DEBUG_TYPE() and the -debug-only -option

- - - - - - - - -
   The Statistic -template & -stats option
- - - - - - - -
Helpful Hints for Common -Operations
- - - - - - - - -
   Basic Inspection and -Traversal Routines
- -

-
Iterating over the BasicBlocks in a Function

- -

-
Iterating over the Instructions in a BasicBlock

- -

-
Iterating over the Instructions in a Function

- -

-
Turning an iterator into a class -pointer (and vice-versa)

- -

-
Finding call sites: a slightly -more complex example

- -

-
Treating calls and invokes the -same way

- -

-
Iterating over def-use & -use-def chains

- - - - - - - - -
   Making simple -changes
- -

-
Creating and inserting new Instructions

- -

-
Deleting Instructions

+
+ Creating and inserting new + Instructions +
+ +
+ +

Instantiating Instructions

+ +

Creation of Instructions is straightforward: simply call the +constructor for the kind of instruction to instantiate and provide the necessary +parameters. For example, an AllocaInst only requires a +(const-ptr-to) Type. Thus:

+ +
AllocaInst* ai = new AllocaInst(Type::IntTy);
+ +

will create an AllocaInst instance that represents the allocation of +one integer in the current stack frame, at runtime. Each Instruction +subclass is likely to have varying default parameters which change the semantics +of the instruction, so refer to the doxygen documentation for the subclass of +Instruction that you're interested in instantiating.

+ +

Naming values

+ +

It is very useful to name the values of instructions when you're able to, as +this facilitates the debugging of your transformations. If you end up looking +at generated LLVM machine code, you definitely want to have logical names +associated with the results of instructions! By supplying a value for the +Name (default) parameter of the Instruction constructor, you +associate a logical name with the result of the instruction's execution at +runtime. For example, say that I'm writing a transformation that dynamically +allocates space for an integer on the stack, and that integer is going to be +used as some kind of index by some other code. To accomplish this, I place an +AllocaInst at the first point in the first BasicBlock of some +Function, and I'm intending to use it within the same +Function. I might do:

+ +
AllocaInst* pa = new AllocaInst(Type::IntTy, 0, "indexLoc");
+ +

where indexLoc is now the logical name of the instruction's +execution value, which is a pointer to an integer on the runtime stack.

+ +

Inserting instructions

+ +

There are essentially two ways to insert an Instruction +into an existing sequence of instructions that form a BasicBlock:

+ + +
+ + +
+ Deleting Instructions +
+ +
+ +

Deleting an instruction from an existing sequence of instructions that form a +BasicBlock is very straightforward. First, +you must have a pointer to the instruction that you wish to delete. Second, you +need to obtain the pointer to that instruction's basic block. You use the +pointer to the basic block to get its list of instructions and then use the +erase function to remove your instruction. For example:

+
  Instruction *I = .. ;
BasicBlock *BB = I->getParent();
BB->getInstList().erase(I);
-

- -

-
Replacing an Instruction -with another Value

-
+ + +
+ Replacing an Instruction with another + Value +
+ +
+ +

Replacing individual instructions

+ +

Including "llvm/Transforms/Utils/BasicBlockUtils.h" permits use of two very useful replace functions: ReplaceInstWithValue -and ReplaceInstWithInst.

- +and ReplaceInstWithInst.

+

Deleting Instructions

+ - - - - - - -
The Core LLVM Class -Hierarchy Reference
-
+ -The Core LLVM classes are the primary means of representing the program +
+ The Core LLVM Class Hierarchy Reference +
+ + +
+ +

The Core LLVM classes are the primary means of representing the program being inspected or transformed. The core LLVM classes are defined in header files in the include/llvm/ directory, and implemented in -the lib/VMCore directory. -

- - - - - - - - -
   The Value class
-
+ + +
+ The Value class +
+ +
+ +

#include "llvm/Value.h" +
+doxygen info: Value Class

+ +

The Value class is the most important class in the LLVM Source +base. It represents a typed value that may be used (among other things) as an +operand to an instruction. There are many different types of Values, +such as Constants and Arguments. Even Instructions and Functions are Values.

+ +

A particular Value may be used many times in the LLVM representation +for a program. For example, an incoming argument to a function (represented +with an instance of the Argument class) is "used" by +every instruction in the function that references the argument. To keep track +of this relationship, the Value class keeps a list of all of the Users that are using it (the User class is a base class for all nodes in the LLVM +graph that can refer to Values). This use list is how LLVM represents +def-use information in the program, and is accessible through the use_* +methods, shown below.
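The bookkeeping described above can be sketched with a toy model. This is only an illustration under stated assumptions: ToyValue, ToyUser, addOperand and countUsesOfSharedArgument are hypothetical names invented here, not the LLVM classes or the use_* methods. It shows one way a value's use list can be kept in sync with the operand lists of the nodes that refer to it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins for illustration only; these are NOT the LLVM classes.
struct ToyUser;

struct ToyValue {
  std::vector<ToyUser *> Users;      // def-use: every node that refers to this value
  virtual ~ToyValue() = default;
  std::size_t numUses() const { return Users.size(); }
};

struct ToyUser : ToyValue {
  std::vector<ToyValue *> Operands;  // use-def: every value this node refers to
  void addOperand(ToyValue *V) {
    Operands.push_back(V);
    V->Users.push_back(this);        // keep the use list in sync with the operand list
  }
};

// Builds one argument referenced by two "instructions" and returns the
// number of entries on the argument's use list.
inline std::size_t countUsesOfSharedArgument() {
  ToyValue Arg;
  ToyUser Add, Mul;
  Add.addOperand(&Arg);
  Mul.addOperand(&Arg);
  Mul.addOperand(&Add);              // Mul also uses the result of Add
  assert(Add.numUses() == 1);        // Add is used once, by Mul
  return Arg.numUses();
}
```

In this sketch the argument ends up with two users, one per referencing node, which is the relationship the use list is meant to record.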

+ +

Because LLVM is a typed representation, every LLVM Value is typed, +and this Type is available through the getType() +method. In addition, all LLVM values can be named. The "name" of the +Value is a symbolic string printed in the LLVM code:

+
   %foo = add int 1, 2
- The name of this instruction is "foo". NOTE -that the name of any value may be missing (an empty string), so names -should ONLY be used for debugging (making the source code easier -to read, debugging printouts), they should not be used to keep track of -values or map between them. For this purpose, use a std::map -of pointers to the Value itself instead. -

One important aspect of LLVM is that there is no distinction -between an SSA variable and the operation that produces it. Because of -this, any reference to the value produced by an instruction (or the -value available as an incoming argument, for example) is represented as -a direct pointer to the class that represents this value. Although -this may take some getting used to, it simplifies the representation -and makes it easier to manipulate.

-

- -

-
Important Public Members of the Value -class

+ +

The name of this instruction is "foo". NOTE +that the name of any value may be missing (an empty string), so names should +ONLY be used for debugging (making the source code easier to read, +debugging printouts); they should not be used to keep track of values or map +between them. For this purpose, use a std::map of pointers to the +Value itself instead.
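As a minimal sketch of the std::map advice, the side table below is keyed by the address of the value rather than by its name. ToyValue, ValueNumbering and numberOf are hypothetical names used only for this illustration; they are not part of LLVM.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a value; NOT the LLVM Value class.
struct ToyValue {
  std::string Name;  // may be empty, and two values may share a name
};

// Side table keyed by the value's address, which is unique per object.
using ValueNumbering = std::map<const ToyValue *, unsigned>;

// Returns the number assigned to V, assigning the next free one on first use.
inline unsigned numberOf(ValueNumbering &Table, const ToyValue *V) {
  // insert() leaves an existing entry untouched and returns an iterator to it.
  auto Result = Table.insert({V, static_cast<unsigned>(Table.size())});
  return Result.first->second;
}

// Two unnamed values would collide in a name-keyed map; here they do not.
inline bool unnamedValuesStayDistinct() {
  ToyValue A{""}, B{""};
  ValueNumbering Table;
  unsigned NA = numberOf(Table, &A);
  unsigned NB = numberOf(Table, &B);
  return NA != NB && numberOf(Table, &A) == NA;
}
```

Keying on the pointer keeps the mapping correct even when names are missing or duplicated.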

+ +

One important aspect of LLVM is that there is no distinction between an SSA +variable and the operation that produces it. Because of this, any reference to +the value produced by an instruction (or the value available as an incoming +argument, for example) is represented as a direct pointer to the class that +represents this value. Although this may take some getting used to, it +simplifies the representation and makes it easier to manipulate.

+ +
+ + +
+ Important Public Members of the Value class +
+ +
+ - - - - - - - -
   The User class
-
+ + +
+ The User class +
+ +
+ +

+#include "llvm/User.h"
doxygen info: User Class
-Superclass: Value -

The User class is the common base class of all LLVM nodes -that may refer to Values. It exposes a -list of "Operands" that are all of the Values -that the User is referring to. The User class itself is a -subclass of Value.

-

The operands of a User point directly to the LLVM Value that it refers to. Because LLVM uses -Static Single Assignment (SSA) form, there can only be one definition -referred to, allowing this direct connection. This connection provides -the use-def information in LLVM.

-

- -

-
Important Public Members of the User -class

+Superclass: Value

+ +

The User class is the common base class of all LLVM nodes that may +refer to Values. It exposes a list of "Operands" +that are all of the Values that the User is +referring to. The User class itself is a subclass of +Value.

+ +

The operands of a User point directly to the LLVM Value that it refers to. Because LLVM uses Static +Single Assignment (SSA) form, there can only be one definition referred to, +allowing this direct connection. This connection provides the use-def +information in LLVM.

+ +
+ + +
+ Important Public Members of the User class +
+ +
+ +

The User class exposes the operand list in two ways: through +an index access interface and through an iterator-based interface.
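The two access styles can be compared side by side in a toy model. The method names below mirror the getOperand()/getNumOperands() and op_begin()/op_end() names mentioned in this manual, but ToyValue, ToyUser and the two sum helpers are hypothetical and their signatures are assumptions, not the actual LLVM declarations.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; these are NOT the LLVM classes.
struct ToyValue {
  int Id;
};

class ToyUser {
  std::vector<ToyValue *> Operands;

public:
  typedef std::vector<ToyValue *>::const_iterator op_iterator;

  explicit ToyUser(std::vector<ToyValue *> Ops) : Operands(std::move(Ops)) {}

  // Index access interface.
  std::size_t getNumOperands() const { return Operands.size(); }
  ToyValue *getOperand(std::size_t I) const { return Operands[I]; }

  // Iterator-based interface.
  op_iterator op_begin() const { return Operands.begin(); }
  op_iterator op_end() const { return Operands.end(); }
};

// Visits every operand by index.
inline int sumIdsByIndex(const ToyUser &U) {
  int Sum = 0;
  for (std::size_t I = 0, E = U.getNumOperands(); I != E; ++I)
    Sum += U.getOperand(I)->Id;
  return Sum;
}

// Visits every operand through the iterator pair.
inline int sumIdsByIterator(const ToyUser &U) {
  int Sum = 0;
  for (ToyUser::op_iterator I = U.op_begin(), E = U.op_end(); I != E; ++I)
    Sum += (*I)->Id;
  return Sum;
}
```

Both loops visit the same operands in the same order; the index form is convenient when a specific operand position matters, the iterator form when every operand is treated alike.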

+ - - - - - - - -
   The Instruction -class
- -

-
Important Public Members of the Instruction -class

+ +
+ + +
+ The Instruction class +
+ +
+ +

#include "llvm/Instruction.h"
+doxygen info: Instruction Class
+Superclasses: User, Value

+ +

The Instruction class is the common base class for all LLVM +instructions. It provides only a few methods, but is a very commonly used +class. The primary data tracked by the Instruction class itself is the +opcode (instruction type) and the parent BasicBlock the Instruction is embedded +into. To represent a specific type of instruction, one of many subclasses of +Instruction is used.

+ +

Because the Instruction class subclasses the User class, its operands can be accessed in the same +way as for other Users (with the +getOperand()/getNumOperands() and +op_begin()/op_end() methods).

An important file for +the Instruction class is the llvm/Instruction.def file. This +file contains some meta-data about the various different types of instructions +in LLVM. It describes the enum values that are used as opcodes (for example +Instruction::Add and Instruction::SetLE), as well as the +concrete sub-classes of Instruction that implement the instruction (for +example BinaryOperator and SetCondInst). Unfortunately, the use of macros in +this file confuses doxygen, so these enum values don't show up correctly in the +doxygen output.

+ +
+ + +
+ Important Public Members of the Instruction + class +
+ +
+ - - - - - - - -
   The BasicBlock -class
-
+ + +
+ The BasicBlock class +
+ +
+ +

#include "llvm/BasicBlock.h"
doxygen info: BasicBlock Class
-Superclass: Value -

This class represents a single entry multiple exit section of the -code, commonly known as a basic block by the compiler community. The BasicBlock -class maintains a list of Instructions, -which form the body of the block. Matching the language definition, -the last element of this list of instructions is always a terminator -instruction (a subclass of the TerminatorInst -class).

-

In addition to tracking the list of instructions that make up the -block, the BasicBlock class also keeps track of the Function that it is embedded into.

-

Note that BasicBlocks themselves are Values, -because they are referenced by instructions like branches and can go in -the switch tables. BasicBlocks have type label.

-

- -

-
Important Public Members of the BasicBlock -class

+Superclass: Value

+ +

This class represents a single entry multiple exit section of the code, +commonly known as a basic block by the compiler community. The +BasicBlock class maintains a list of Instructions, which form the body of the block. +Matching the language definition, the last element of this list of instructions +is always a terminator instruction (a subclass of the TerminatorInst class).

+ +

In addition to tracking the list of instructions that make up the block, the +BasicBlock class also keeps track of the Function that it is embedded into.

+ +

Note that BasicBlocks themselves are Values, because they are referenced by instructions +like branches and can go in the switch tables. BasicBlocks have type +label.

+ +
+ + +
+ Important Public Members of the BasicBlock + class +
+ +
+ - - - - - - - -
   The GlobalValue -class
- -

-
Important Public Members of the GlobalValue -class

+ +
+ + +
+ The GlobalValue class +
+ +
+ +

#include "llvm/GlobalValue.h"
+doxygen info: GlobalValue Class
+Superclasses: User, Value

+ +

Global values (GlobalVariables or Functions) are the only LLVM values that are +visible in the bodies of all Functions. +Because they are visible at global scope, they are also subject to linking with +other globals defined in different translation units. To control the linking +process, GlobalValues know their linkage rules. Specifically, +GlobalValues know whether they have internal or external linkage, as +defined by the LinkageTypes enumeration.

+ +

If a GlobalValue has internal linkage (equivalent to being +static in C), it is not visible to code outside the current translation +unit, and does not participate in linking. If it has external linkage, it is +visible to external code, and does participate in linking. In addition to +linkage information, GlobalValues keep track of which Module they are currently part of.

+ +

Because GlobalValues are memory objects, they are always referred to +by their address. As such, the Type of a +global is always a pointer to its contents. It is important to remember this +when using the GetElementPtrInst instruction because this pointer must +be dereferenced first. For example, if you have a GlobalVariable (a +subclass of GlobalValue) that is an array of 24 ints, type [24 x +int], then the GlobalVariable is a pointer to that array. Although +the address of the first element of this array and the value of the +GlobalVariable are the same, they have different types. The +GlobalVariable's type is a pointer to [24 x int]. The address of the first element +is a pointer to int. Because of this, accessing a global value requires you to +dereference the pointer with GetElementPtrInst first, then its elements +can be accessed. This is explained in the LLVM +Language Reference Manual.
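The distinction between a pointer to the array and a pointer to its first element has a close analogue in plain C++, sketched below. This is an analogy only and uses no LLVM API; Table, sameAddressDifferentType and loadElement are hypothetical names chosen for the illustration.

```cpp
#include <cassert>
#include <type_traits>

int Table[24];  // plays the role of the [24 x int] global

// The address of the whole array and the address of its first element
// have different types, even though they compare equal as addresses.
static_assert(std::is_same<decltype(&Table), int (*)[24]>::value,
              "&Table is a pointer to an array of 24 ints");
static_assert(std::is_same<decltype(&Table[0]), int *>::value,
              "&Table[0] is a pointer to int");

inline bool sameAddressDifferentType() {
  const void *Whole = &Table;
  const void *First = &Table[0];
  return Whole == First;
}

// Accessing an element through the "global" takes two steps: step through
// the pointer to reach the array, then index into the array.
inline int loadElement(int (*Global)[24], int Index) {
  return (*Global)[Index];
}
```

The two steps in loadElement correspond loosely to first stepping through the global's pointer and then selecting an element, which is why the pointer has to be dealt with before any element can be reached.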

+ +
+ + +
+ Important Public Members of the GlobalValue + class +
+ +
+ - - - - - - - -
   The Function -class
- -

-
Important Public Members of the Function -class

+ +
+ + +
+ The Function class +
+ +
+ +

#include "llvm/Function.h"
doxygen +info: Function Class
Superclasses: +GlobalValue, User, Value

+ +

The Function class represents a single procedure in LLVM. It is +actually one of the more complex classes in the LLVM hierarchy because it must +keep track of a large amount of data. The Function class keeps track +of a list of BasicBlocks, a list of formal Arguments, and a SymbolTable.

+ +

The list of BasicBlocks is the most +commonly used part of Function objects. The list imposes an implicit +ordering of the blocks in the function, which indicates how the code will be +laid out by the backend. Additionally, the first BasicBlock is the implicit entry node for the +Function. It is not legal in LLVM to explicitly branch to this initial +block. There are no implicit exit nodes, and in fact there may be multiple exit +nodes from a single Function. If the BasicBlock list is empty, this indicates that +the Function is actually a function declaration: the actual body of the +function hasn't been linked in yet.

+ +

In addition to a list of BasicBlocks, the +Function class also keeps track of the list of formal Arguments that the function receives. This +container manages the lifetime of the Argument +nodes, just like the BasicBlock list does for +the BasicBlocks.

+ +

The SymbolTable is a very rarely used +LLVM feature that is only used when you have to look up a value by name. Aside +from that, the SymbolTable is used +internally to make sure that there are no conflicts between the names of Instructions, BasicBlocks, or Arguments in the function body.

+ +
+ + +
+ Important Public Members of the Function + class +
+ +
+ - - - - - - - -
   The GlobalVariable -class
-
+ + +
+ The GlobalVariable class +
+ +
+ +

#include "llvm/GlobalVariable.h" +
doxygen info: GlobalVariable -Class
-Superclasses: GlobalValue, User, Value -

Global variables are represented with the (suprise suprise) GlobalVariable -class. Like functions, GlobalVariables are also subclasses of GlobalValue, and as such are always -referenced by their address (global values must live in memory, so their -"name" refers to their address). See GlobalValue for more on -this. Global variables may have an initial value (which must be a Constant), and if they have an -initializer, they may be marked as "constant" themselves (indicating -that their contents never change at runtime).  

-

- -

-
Important Public Members of the GlobalVariable -class

+Class
Superclasses: GlobalValue, User, Value

+ +

Global variables are represented with the (surprise surprise) +GlobalVariable class. Like functions, GlobalVariables are also +subclasses of GlobalValue, and as such are +always referenced by their address (global values must live in memory, so their +"name" refers to their address). See GlobalValue for more on this. Global variables +may have an initial value (which must be a Constant), and if they have an initializer, they +may be marked as "constant" themselves (indicating that their contents never +change at runtime).

+ +
+ + +
+ Important Public Members of the + GlobalVariable class +
+ +
+ - - - - - - - -
   The Module class
+ +
+ + +
+ The Module class +
+ +
+ +

#include "llvm/Module.h"
doxygen info: +Module Class

+ +

The Module class represents the top level structure present in LLVM +programs. An LLVM module is effectively either a translation unit of the +original program or a combination of several translation units merged by the +linker. The Module class keeps track of a list of Functions, a list of GlobalVariables, and a SymbolTable. Additionally, it contains a few +helpful member functions that try to make common operations easy.

+ +
+ + +
+ Important Public Members of the Module class +
+ +
+ -

-
Important Public Members of the Module -class

- -

Constructing a Module -is easy. You can optionally provide a name for it (probably based on the -name of the translation unit).

+ +

Constructing a Module is easy. You can optionally +provide a name for it (probably based on the name of the translation unit).

+ - - - - - - - -
   The Constant -class and subclasses
+ +
+ -

-
Important Public Methods

+ +
+ + + +
+ + + +
+ + +
+ The Constant class and subclasses +
+ +
+ +

Constant represents a base class for different types of constants. It +is subclassed by ConstantBool, ConstantInt, ConstantSInt, ConstantUInt, +ConstantArray, etc., for representing the various types of Constants.

+ +
+ + +
+ Important Public Methods +
+ +
+ - + - - - - - - - -
   The Type class and -Derived Types
+ +
+ + +
+ The Type class and Derived Types +
+ +
+ +

Type, as noted earlier, is also a subclass of the Value class. Any primitive +type (like int, short, etc.) in LLVM is an instance of the Type class. All other +types are instances of subclasses of Type, like FunctionType, ArrayType, +etc. DerivedType is the interface for all such derived types, including +FunctionType, ArrayType, PointerType, and StructType. Types can have names. They can +be recursive (StructType). There exists exactly one instance of any type +structure at a time. This allows using pointer equality of Type *s for comparing +types.
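The "exactly one instance of any type structure" rule can be sketched with a small interning table. This is a toy model under stated assumptions: ToyType, ToyIntType, ToyArrayType, ToyTypeContext and its getInt/getArray members are hypothetical names, and the linear search is chosen for brevity rather than to reflect how LLVM stores its types.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Toy stand-ins for illustration only; these are NOT the LLVM classes.
struct ToyType {
  virtual ~ToyType() = default;
};

struct ToyIntType : ToyType {};

struct ToyArrayType : ToyType {
  const ToyType *Element;
  unsigned Count;
  ToyArrayType(const ToyType *E, unsigned N) : Element(E), Count(N) {}
};

// Owns every type object and hands out one shared instance per structure.
class ToyTypeContext {
  ToyIntType Int;
  std::vector<std::unique_ptr<ToyArrayType>> Arrays;

public:
  const ToyType *getInt() const { return &Int; }

  const ToyArrayType *getArray(const ToyType *Elem, unsigned N) {
    // Reuse an existing instance with the same structure if there is one.
    for (const auto &A : Arrays)
      if (A->Element == Elem && A->Count == N)
        return A.get();
    Arrays.push_back(std::unique_ptr<ToyArrayType>(new ToyArrayType(Elem, N)));
    return Arrays.back().get();
  }
};
```

Because getArray never creates a second object for the same element type and count, comparing two type pointers with == is enough to decide whether the types are the same.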

+ +
+ + +
+ Important Public Methods +
+ +
+ -

-
Important Public Methods

-
+ + +
+ The Argument class +
+ +
+ +

This subclass of Value defines the interface for incoming formal arguments to a function. A Function maintains a list of its formal -arguments. An argument has a pointer to the parent Function. - +arguments. An argument has a pointer to the parent Function.

+ +
+ -
-
By: Dinakar Dhurjati -and Chris Lattner
-
The LLVM -Compiler Infrastructure
- Last -modified: Fri Nov 7 13:24:22 CST 2003
+
+
+ + Dinakar Dhurjati and + Chris Lattner
+ The LLVM Compiler Infrastructure
+ Last modified: $Date$ +
+