diff --git a/docs/ProgrammersManual.html b/docs/ProgrammersManual.html
new file mode 100644
index 00000000000..b539e34d73e
--- /dev/null
+++ b/docs/ProgrammersManual.html
@@ -0,0 +1,847 @@
+
+
LLVM Programmer's Manual | +
Written by Dinakar Dhurjati + and Chris Lattner
+
+Introduction + |
+ +This document should get you oriented so that you can find your way in the +continuously growing source code that makes up the LLVM infrastructure. Note +that this manual is not intended to serve as a replacement for reading the +source code, so if you think there should be a method in one of these classes to +do something, but it's not listed, check the source. Links to the doxygen sources are provided to make this as easy as +possible.
+ +The first section of this document describes general information that is useful +to know when working in the LLVM infrastructure, and the second describes the +Core LLVM classes. In the future this manual will be extended with information +describing how to use extension libraries, such as dominator information, CFG +traversal routines, and useful utilities like the InstVisitor template.
+ + + +
+General Information + |
+ + + +
+ +The C++ Standard Template Library + |
+ +Here are some useful links:
+
+ +You are also encouraged to take a look at the LLVM Coding Standards guide which focuses on how +to write maintainable code more than where to put your curly braces.
+ + + + +
+The Core LLVM Class Hierarchy + |
+ + + +
+ +The Value class + |
+ + +The Value class is the most important class in the LLVM source base. It +represents a typed value that may be used (among other things) as an operand to +an instruction. There are many different types of Values, such as Constants and Arguments; even Instructions and Functions are Values.
+ +A particular Value may be used many times in the LLVM representation +for a program. For example, an incoming argument to a function (represented +with an instance of the Argument class) is "used" by +every instruction in the function that references the argument. To keep track +of this relationship, the Value class keeps a list of all of the Users that are using it (the User class is a base class for all nodes in the LLVM +graph that can refer to Values). This use list is how LLVM represents +def-use information in the program, and is accessible through the use_* +methods, shown below.
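+As a rough illustration of the use-list idea described above, here is a small self-contained toy model. ToyValue and ToyUser are hypothetical stand-ins, not the LLVM classes; the sketch only shows how a value can record every user that refers to it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct ToyUser;  // hypothetical stand-in for a node that refers to values

// Hypothetical stand-in for a value that records who uses it.
struct ToyValue {
  std::vector<ToyUser*> Uses;  // the "use list": one entry per referencing operand
};

struct ToyUser {
  std::vector<ToyValue*> Operands;

  // Adding an operand also registers this user on the value's use list,
  // so def-use information stays in sync with the operand list.
  void addOperand(ToyValue* V) {
    assert(V && "operand must not be null");
    Operands.push_back(V);
    V->Uses.push_back(this);
  }
};

// Builds one argument-like value referenced by two users and
// returns the number of entries on its use list.
inline std::size_t countUsesOfSharedValue() {
  ToyValue Arg;
  ToyUser A, B;
  A.addOperand(&Arg);
  B.addOperand(&Arg);
  return Arg.Uses.size();
}
```

+In this sketch the use list is a plain vector kept in step with the operand list; the real classes may organize this bookkeeping differently.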
+
+Because LLVM is a typed representation, every LLVM Value is typed, and
+this Type is available through the getType()
+method. In addition, all LLVM values can be named. The
+"name" of the Value is a symbolic string printed in the LLVM code:
+
+ %foo = add int 1, 2
+
+The name of this instruction is "foo". NOTE that the name of any value
+may be missing (an empty string), so names should ONLY be used for
+debugging (making the source code easier to read, debugging printouts). They
+should not be used to keep track of values or map between them. For this
+purpose, use a std::map of pointers to the Value itself
+instead.
+
+One important aspect of LLVM is that there is no distinction between an SSA
+variable and the operation that produces it. Because of this, any reference to
+the value produced by an instruction (or the value available as an incoming
+argument, for example) is represented as a direct pointer to the class that
+represents this value. Although this may take some getting used to, it
+simplifies the representation and makes it easier to manipulate.
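+The advice above to track values by pointer rather than by name can be sketched as follows. ToyValue, ValueNumbering, and numberFor are hypothetical names used only for this illustration; the point is that two unnamed values remain distinguishable when the map key is the object's address.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a value whose name may be empty.
struct ToyValue {
  std::string Name;
};

// Side table keyed on the address of the value, not on its name.
using ValueNumbering = std::map<const ToyValue*, int>;

// Assigns a number to V on first sight and returns the same number afterwards.
inline int numberFor(ValueNumbering& Table, const ToyValue* V) {
  assert(V && "cannot number a null value");
  auto It = Table.find(V);
  if (It != Table.end()) return It->second;
  int Next = static_cast<int>(Table.size());
  Table[V] = Next;
  return Next;
}

// Two unnamed values would collide in a name-keyed map, but they
// get distinct entries when keyed by pointer.
inline bool unnamedValuesStayDistinct() {
  ToyValue A{""}, B{""};
  ValueNumbering Table;
  int NA = numberFor(Table, &A);
  int NB = numberFor(Table, &B);
  return NA != NB && numberFor(Table, &A) == NA && Table.size() == 2;
}
```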
+ +These methods are the interface to access the def-use information in LLVM. As with all other iterators in LLVM, the naming conventions follow the conventions defined by the STL.
+ +
+This method returns the Type of the Value. + +
+ +This family of methods is used to access and assign a name to a Value. +Be aware of the precaution above.
+ + +
+ +This method traverses the use list of a Value changing all Users of the current value to refer to "V" +instead. For example, if you detect that an instruction always produces a +constant value (for example through constant folding), you can replace all uses +of the instruction with the constant like this:
+ +
+ Inst->replaceAllUsesWith(ConstVal); +
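+To make the effect of replaceAllUsesWith more concrete, here is a toy model of the operation. ToyValue and ToyUser are hypothetical stand-ins, and the loop below is only one plausible way to implement the behavior described above, not LLVM's actual implementation.

```cpp
#include <cassert>
#include <vector>

struct ToyUser;

// Hypothetical stand-in for a value with a use list.
struct ToyValue {
  std::vector<ToyUser*> Uses;  // one entry per operand slot that refers to us
  void replaceAllUsesWith(ToyValue* V);
};

// Hypothetical stand-in for a node that refers to values.
struct ToyUser {
  std::vector<ToyValue*> Operands;
  void addOperand(ToyValue* V) {
    Operands.push_back(V);
    V->Uses.push_back(this);
  }
};

// Rewrites every operand slot that points at this value to point at V,
// and moves the corresponding use-list entries over to V.
inline void ToyValue::replaceAllUsesWith(ToyValue* V) {
  assert(V && V != this && "replacement must be a different, non-null value");
  for (ToyUser* U : Uses) {
    for (ToyValue*& Op : U->Operands)
      if (Op == this) Op = V;
    V->Uses.push_back(U);
  }
  Uses.clear();
}

// A user that refers to Inst twice ends up referring to ConstVal twice.
inline bool demoReplace() {
  ToyValue Inst, ConstVal;
  ToyUser U;
  U.addOperand(&Inst);
  U.addOperand(&Inst);
  Inst.replaceAllUsesWith(&ConstVal);
  return Inst.Uses.empty() && ConstVal.Uses.size() == 2 &&
         U.Operands[0] == &ConstVal && U.Operands[1] == &ConstVal;
}
```

+After the call, the old value has no remaining uses, which is what makes it safe to delete in a real transformation.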
+ + + + +
+ +The User class + |
+ + +The User class is the common base class of all LLVM nodes that may +refer to Values. It exposes a list of "Operands" +that are all of the Values that the User is +referring to. The User class itself is a subclass of +Value.
+ +The operands of a User point directly to the LLVM Value that it refers to. Because LLVM uses Static +Single Assignment (SSA) form, there can only be one definition referred to, +allowing this direct connection. This connection provides the use-def +information in LLVM.
+ + +
+ +
+ +These two methods expose the operands of the User in a convenient form +for direct access.
+ +
+ +Together, these methods make up the iterator based interface to the operands of +a User.
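+The two access styles described above (direct indexing and iterators) might look like this on a toy class. ToyUser and ToyValue are hypothetical stand-ins; the method names getOperand, getNumOperands, op_begin, and op_end mirror the ones this manual mentions, but the signatures here are assumptions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a value; Id exists only so the demo can sum it.
struct ToyValue { int Id; };

class ToyUser {
  std::vector<ToyValue*> Operands;
public:
  using op_iterator = std::vector<ToyValue*>::iterator;

  void addOperand(ToyValue* V) { Operands.push_back(V); }

  // Index-based access.
  unsigned getNumOperands() const {
    return static_cast<unsigned>(Operands.size());
  }
  ToyValue* getOperand(unsigned i) const {
    assert(i < Operands.size() && "operand index out of range");
    return Operands[i];
  }

  // Iterator-based access, following STL naming conventions.
  op_iterator op_begin() { return Operands.begin(); }
  op_iterator op_end() { return Operands.end(); }
};

// Sums operand ids twice, once per access style; both must agree.
inline bool bothAccessStylesAgree() {
  ToyValue A{1}, B{2};
  ToyUser U;
  U.addOperand(&A);
  U.addOperand(&B);

  int ByIndex = 0;
  for (unsigned i = 0, e = U.getNumOperands(); i != e; ++i)
    ByIndex += U.getOperand(i)->Id;

  int ByIterator = 0;
  for (ToyUser::op_iterator I = U.op_begin(), E = U.op_end(); I != E; ++I)
    ByIterator += (*I)->Id;

  return ByIndex == 3 && ByIterator == 3;
}
```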
+ + + + +
+ +The Instruction class + |
+ +The Instruction class is the common base class for all LLVM +instructions. It provides only a few methods, but is a very commonly used +class. The primary data tracked by the Instruction class itself is the +opcode (instruction type) and the parent BasicBlock the Instruction is embedded +into. To represent a specific type of instruction, one of many subclasses of +Instruction is used.
+ +Because the Instruction class subclasses the User class, its operands can be accessed in the same +way as for other Users (with the +getOperand()/getNumOperands() and +op_begin()/op_end() methods).
+ + + +
+ +Returns the BasicBlock that this +Instruction is embedded into.
+ +
+ +Returns true if the instruction has side effects, i.e. it is a call, +free, invoke, or store.
+ +
+ +Returns the opcode for the Instruction.
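+The side-effect rule stated above (call, free, invoke, or store) amounts to a simple classification on the opcode. The sketch below uses a hypothetical ToyOpcode enum and free functions; it is not the LLVM opcode set or its API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical opcode set, just large enough for the illustration.
enum class ToyOpcode { Add, Load, Store, Call, Invoke, Free };

// True exactly for the four kinds of instruction listed in the text.
inline bool hasSideEffects(ToyOpcode Op) {
  switch (Op) {
  case ToyOpcode::Call:
  case ToyOpcode::Free:
  case ToyOpcode::Invoke:
  case ToyOpcode::Store:
    return true;
  default:
    return false;
  }
}

// Counts how many instructions in a body have side effects.
inline std::size_t countSideEffecting(const std::vector<ToyOpcode>& Body) {
  std::size_t N = 0;
  for (ToyOpcode Op : Body)
    if (hasSideEffects(Op)) ++N;
  assert(N <= Body.size());
  return N;
}
```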
+ + + + + +
+ +The BasicBlock class + |
+ + +This class represents a single-entry, single-exit section of the code, commonly +known as a basic block by the compiler community. The BasicBlock class +maintains a list of Instructions, which form +the body of the block. Matching the language definition, the last element of +this list of instructions is always a terminator instruction (a subclass of the +TerminatorInst class).
+ +In addition to tracking the list of instructions that make up the block, the +BasicBlock class also keeps track of the Function that it is embedded into.
+ +Note that BasicBlocks themselves are Values, because they are referenced by instructions +like branches and can go in the switch tables. BasicBlocks have type +label.
+ + + +
+ +The BasicBlock constructor is used to create new basic blocks for +insertion into a function. The constructor simply takes a name for the new +block, and optionally a Function to insert it +into. If the Parent parameter is specified, the new +BasicBlock is automatically inserted at the end of the specified Function; if not specified, the BasicBlock must be +manually inserted into the Function.
+ +
+ +These methods and typedefs are forwarding functions that have the same semantics +as the standard library methods of the same names. These methods expose the +underlying instruction list of a basic block in a way that is easy to +manipulate. To get the full complement of container operations (including +operations to update the list), you must use the getInstList() +method.
+ +
+ +This method is used to get access to the underlying container that actually +holds the Instructions. This method must be used when there isn't a forwarding +function in the BasicBlock class for the operation that you would like +to perform. Because there are no forwarding functions for "updating" +operations, you need to use this if you want to update the contents of a +BasicBlock.
+ +
+ +Returns a pointer to the Function the block is +embedded into, or a null pointer if it is homeless.
+ +
+ +Returns a pointer to the terminator instruction that appears at the end of the +BasicBlock. If there is no terminator instruction, or if the last +instruction in the block is not a terminator, then a null pointer is +returned.
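+The terminator lookup described above, together with updating a block through its underlying instruction list, can be sketched on a toy block. ToyBlock and ToyInst are hypothetical stand-ins, and the name getTerminator is an assumption for this illustration; getInstList follows the name used in this manual.

```cpp
#include <cassert>
#include <list>

// Hypothetical stand-in for an instruction.
struct ToyInst {
  bool IsTerminator;  // true for branch- or return-like instructions
};

class ToyBlock {
  std::list<ToyInst> Insts;
public:
  // Direct access to the container, needed for updating operations.
  std::list<ToyInst>& getInstList() { return Insts; }

  // Returns the last instruction if it is a terminator, otherwise null.
  ToyInst* getTerminator() {
    if (Insts.empty() || !Insts.back().IsTerminator) return nullptr;
    return &Insts.back();
  }
};

// Checks the three cases: empty block, no terminator yet, terminator present.
inline bool terminatorLookupBehaves() {
  ToyBlock BB;
  bool EmptyIsNull = BB.getTerminator() == nullptr;
  BB.getInstList().push_back(ToyInst{false});
  bool NonTermIsNull = BB.getTerminator() == nullptr;
  BB.getInstList().push_back(ToyInst{true});
  ToyInst* T = BB.getTerminator();
  assert(T == nullptr || T->IsTerminator);
  return EmptyIsNull && NonTermIsNull && T != nullptr;
}
```

+Callers in this sketch must check for a null result, since a block under construction may not yet end in a terminator.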
+ + + +
+ +The GlobalValue class + |
+ +Global values (GlobalVariables or Functions) are the only LLVM values that are +visible in the bodies of all Functions. +Because they are visible at global scope, they are also subject to linking with +other globals defined in different translation units. To control the linking +process, GlobalValues know their linkage rules. Specifically, +GlobalValues know whether they have internal or external linkage.
+ +If a GlobalValue has internal linkage (equivalent to being +static in C), it is not visible to code outside the current translation +unit, and does not participate in linking. If it has external linkage, it is +visible to external code, and does participate in linking. In addition to +linkage information, GlobalValues keep track of which Module they are currently part of.
+ +Because GlobalValues are memory objects, they are always referred to by +their address. As such, the Type of a global is +always a pointer to its contents. This is explained in the LLVM Language +Reference Manual.
+ + + +
+ +These methods manipulate the linkage characteristics of the +GlobalValue.
+ +
+ +This returns the Module that the GlobalValue is +currently embedded into.
+ + + + +
+ +The Function class + |
+ +The Function class represents a single procedure in LLVM. It is +actually one of the more complex classes in the LLVM hierarchy because it must +keep track of a large amount of data. The Function class keeps track +of a list of BasicBlocks, a list of formal Arguments, and a SymbolTable.
+ +The list of BasicBlocks is the most commonly +used part of Function objects. The list imposes an implicit ordering +of the blocks in the function, which indicates how the code will be laid out by +the backend. Additionally, the first BasicBlock is the implicit entry node for the +Function. It is not legal in LLVM to explicitly branch to this initial +block. There are no implicit exit nodes, and in fact there may be multiple exit +nodes from a single Function. If the BasicBlock list is empty, this indicates that +the Function is actually a function declaration: the actual body of the +function hasn't been linked in yet.
+ +In addition to a list of BasicBlocks, the +Function class also keeps track of the list of formal Arguments that the function receives. This +container manages the lifetime of the Argument +nodes, just like the BasicBlock list does for +the BasicBlocks.
+ +The SymbolTable is a very rarely used LLVM +feature that is only used when you have to look up a value by name. Aside from +that, the SymbolTable is used internally to +make sure that there are no conflicts between the names of Instructions, BasicBlocks, or Arguments in the function body.
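+The relationship described above between the block list, the entry block, and function declarations can be sketched with a toy class. ToyFunction, ToyBlock, ToyLinkage, and every method name below are hypothetical and chosen for this illustration only.

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical stand-in for a basic block.
struct ToyBlock { std::string Name; };

enum class ToyLinkage { Internal, External };

class ToyFunction {
  std::list<ToyBlock> Blocks;  // order of this list is the layout order
  ToyLinkage Linkage;
public:
  explicit ToyFunction(ToyLinkage L) : Linkage(L) {}

  ToyLinkage getLinkage() const { return Linkage; }
  std::list<ToyBlock>& getBasicBlockList() { return Blocks; }

  // An empty block list means there is no body: this is only a declaration.
  bool isDeclaration() const { return Blocks.empty(); }

  // The entry block is always the first block in the list.
  ToyBlock& getEntryBlock() {
    assert(!Blocks.empty() && "a declaration has no entry block");
    return Blocks.front();
  }
};

// A function starts as a declaration and becomes a definition
// once blocks are appended; the first block appended is the entry.
inline bool declarationBecomesDefinition() {
  ToyFunction F(ToyLinkage::External);
  bool StartsAsDeclaration = F.isDeclaration();
  F.getBasicBlockList().push_back(ToyBlock{"entry"});
  F.getBasicBlockList().push_back(ToyBlock{"exit"});
  return StartsAsDeclaration && !F.isDeclaration() &&
         F.getEntryBlock().Name == "entry";
}
```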
+ + + +
+ +Constructor used when you need to create new Functions to add to the +program. The constructor must specify the type of the function to create and +whether it should start out with internal or external linkage.
+ +
+ +Return whether or not the Function has a body defined. If the function +is "external", it does not have a body, and thus must be resolved by linking +with a function defined in a different translation unit.
+ + +
+ +These are forwarding methods that make it easy to access the contents of a +Function object's BasicBlock +list.
+ +
+ +Returns the list of BasicBlocks. This is +necessary to use when you need to update the list or perform a complex action +that doesn't have a forwarding method.
+ + +
+ +These are forwarding methods that make it easy to access the contents of a +Function object's Argument list.
+ +
+ +Returns the list of Arguments. This is +necessary to use when you need to update the list or perform a complex action +that doesn't have a forwarding method.
+ + + +
+ +Returns the entry BasicBlock for the +function. Because the entry block for the function is always the first block, +this returns the first block of the Function.
+ +
+ +This traverses the Type of the Function +and returns the return type of the function, or the FunctionType of the actual function.
+ + +
+ +Return true if the Function has a symbol table allocated to it and if +there is at least one entry in it.
+ +
+ +Return a pointer to the SymbolTable for this +Function or a null pointer if one has not been allocated (because there +are no named values in the function).
+ +
+ +Return a pointer to the SymbolTable for this +Function or allocate a new SymbolTable if one is not already around. This +should only be used when adding elements to the SymbolTable, so that empty symbol tables are +not left lying around.
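+The three symbol table accessors described above follow a lazy-allocation pattern, which the following toy class sketches. ToyFunction, ToySymbolTable, and the method names (in particular getOrCreateSymbolTable) are hypothetical; only the behavior is taken from the descriptions above.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in: maps a name to some value handle.
using ToySymbolTable = std::map<std::string, int>;

class ToyFunction {
  std::unique_ptr<ToySymbolTable> SymTab;  // null until first needed
public:
  // True only if a table exists and holds at least one entry.
  bool hasSymbolTable() const { return SymTab && !SymTab->empty(); }

  // Read-only style lookup: never allocates, may return null.
  ToySymbolTable* getSymbolTable() { return SymTab.get(); }

  // Allocates on demand; intended for use when adding entries.
  ToySymbolTable& getOrCreateSymbolTable() {
    if (!SymTab) SymTab.reset(new ToySymbolTable());
    return *SymTab;
  }
};

// The table does not exist until something is inserted through
// the allocating accessor.
inline bool lazyAllocationBehaves() {
  ToyFunction F;
  bool StartsNull = F.getSymbolTable() == nullptr && !F.hasSymbolTable();
  F.getOrCreateSymbolTable()["foo"] = 1;
  assert(F.getSymbolTable() != nullptr);
  return StartsNull && F.hasSymbolTable();
}
```

+Keeping the allocating accessor separate means that code which only looks names up never creates an empty table as a side effect.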
+ + + + +
+ +The GlobalVariable class + |
+ + +A GlobalVariable is a subclass of GlobalValue and defines the interface to +global variables in the SSA program. It can have a name and an +initializer (an initial constant Value), and it can be marked constant. + + + +
+ +
+ +Returns true if this global variable is known not to be modified at +runtime.
+ +
+ +Returns true if this GlobalVariable has an initializer.
+ +
+ +Returns the initializer.
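+The three queries described above can be modeled on a toy class as follows. ToyGlobalVariable, ToyConstant, and the method names are hypothetical stand-ins for illustration; the sketch assumes the initializer should only be requested after checking that one exists.

```cpp
#include <cassert>

// Hypothetical stand-in for a constant initializer value.
struct ToyConstant { int Bits; };

class ToyGlobalVariable {
  bool IsConst;
  const ToyConstant* Init;  // null when there is no initializer
public:
  explicit ToyGlobalVariable(bool IsConstant,
                             const ToyConstant* Initializer = nullptr)
      : IsConst(IsConstant), Init(Initializer) {}

  // True if the global is known not to be modified at runtime.
  bool isConstant() const { return IsConst; }

  bool hasInitializer() const { return Init != nullptr; }

  // Only meaningful when hasInitializer() is true.
  const ToyConstant* getInitializer() const {
    assert(hasInitializer() && "global has no initializer");
    return Init;
  }
};

// One constant global with an initializer, one mutable global without.
inline bool globalQueriesBehave() {
  ToyConstant FortyTwo{42};
  ToyGlobalVariable Table(true, &FortyTwo);
  ToyGlobalVariable Extern(false);
  return Table.isConstant() && Table.hasInitializer() &&
         Table.getInitializer()->Bits == 42 &&
         !Extern.isConstant() && !Extern.hasInitializer();
}
```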
+ + +
+ +The Constant class and subclasses + |
+ + + +
+ +The Type class and Derived Types + |
+ +The Argument class + |