From 51f31e07f636003388cdac5299ac82ce63c6be15 Mon Sep 17 00:00:00 2001
From: Reid Spencer Written by Reid Spencer
Warning: This is a work in progress.
-
-
-
This section provides the general layout of the LLVM bytecode file format. - The detailed layout can be found in the next section. -
-The bytecode file format requires blocks to be in a certain order and -nested in a particular way so that an LLVM module can be constructed -efficiently from the contents of the file. This ordering defines a general -structure for bytecode files as shown below. The table below shows the order -in which all block types may appear. Please note that some of the blocks are -optional and some may be repeated. The structure is fairly loose because -optional blocks, if empty, are completely omitted from the file. -
+This section provides the general structure of the LLVM bytecode file + format. The bytecode file format requires blocks to be in a certain order and + nested in a particular way so that an LLVM module can be constructed + efficiently from the contents of the file. This ordering defines a general + structure for bytecode files as shown below. The table below shows the order + in which all block types may appear. Please note that some of the blocks are + optional and some may be repeated. The structure is fairly loose because + optional blocks, if empty, are completely omitted from the file.
ID | @@ -309,48 +292,68 @@ optional blocks, if empty, are completely omitted from the file.Repeated? | Level | Block Type | +Description | |||
---|---|---|---|---|---|---|---|
N/A | File | No | No | 0 | Signature | +This contains the file signature (magic number) + that identifies the file as LLVM bytecode. | |
0x01 | File | No | No | 0 | Module | +This is the top level block in a bytecode file. It + contains all the other blocks. | |
0x15 | Module | No | No | 1 | -- Global Type Pool | +Global Type Pool | +This block contains all the global (module) level + types. |
0x14 | Module | No | No | 1 | -- Module Globals Info | +Module Globals Info | +This block contains the type, constness, and linkage + for each of the global variables in the module. It also contains the + type of the functions and the constant initializers. |
0x12 | Module | Yes | No | 1 | -- Module Constant Pool | +Module Constant Pool | +This block contains all the global constants + except function arguments, global values and constant strings. |
0x11 | Module | Yes | Yes | 1 | -- Function Definitions | +Function Definitions | +One function block is written for each function in + the module. The function block contains the instructions, compaction + table, type constant pool, and symbol table for the function. |
0x12 | Function | Yes | No | 2 | -- Function Constant Pool | +Function Constant Pool | +Any constants (including types) used solely + within the function are emitted here in the function constant pool. + |
0x33 | Function | Yes | No | 2 | -- Compaction Table | +Compaction Table | +This table reduces bytecode size by providing a + function-local mapping of type and value slot numbers to their + global slot numbers. |
0x32 | Function | No | No | 2 | -- Instruction List | +Instruction List | +This block contains all the instructions of the + function. The basic blocks are inferred by terminating instructions. + |
0x13 | Function | Yes | No | 2 | -- Function Symbol Table | +Function Symbol Table | +This symbol table provides the names for the + function specific values used (basic block labels mostly). |
0x13 | Module | Yes | No | 1 | -- Module Symbol Table | +Module Symbol Table | +This symbol table provides the names for the various + entries in the file that are not function specific (global vars, and + functions mostly). |
Use the links in the table or see Block Types for @@ -358,59 +361,13 @@ details about the contents of each of the block types.
This section provides the detailed layout of the LLVM bytecode file format. -
-The descriptions of the bytecode format that follow describe the order, type -and bit fields in detail. These descriptions are provided in tabular form. -Each table has four columns that specify:
-The bytecode format encodes the intermediate representation into groups - of bytes known as blocks. The blocks are written sequentially to the file in - the following order:
-This section provides the detailed layout of the individual block types + in the LLVM bytecode file format.
To be determined.
+Type | +Field Description | +
---|---|
uint32_vbr | +The linkage type of the function: 0=External, 1=Weak, + 2=Appending, 3=Internal, 4=LinkOnce1 | +
constant pool | +The constant pool block for this function. + 2 + | +
compaction table | +The compaction table block for the function. + 2 + | +
instruction list | +The list of instructions in the function. | +
symbol table | +The function's slot table containing only those + symbols pertinent to the function (mostly block labels). + | +
To be determined.
+The instructions in a function are written as a simple list. Basic blocks + are inferred by the terminating instruction types. The format of the block + is given in the following table.
+Type | +Field Description | +
---|---|
unsigned | +Instruction list identifier (0x32). | +
unsigned | +Size in bytes of the instruction list. | +
instruction | +An instruction.1 | +
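The statement that basic blocks are inferred from terminating instructions can be illustrated with a minimal sketch. It assumes the instructions have already been decoded to opcode numbers. The function name `split_basic_blocks` and the opcode values in `TERMINATOR_OPCODES` are hypothetical placeholders, since this section does not list the real terminator opcodes.

```python
# Hypothetical terminator opcode numbers, for illustration only.
# The real values are not given in this section of the format description.
TERMINATOR_OPCODES = frozenset({1, 2, 3})


def split_basic_blocks(opcodes, terminators=TERMINATOR_OPCODES):
    """Group a flat list of opcodes into basic blocks.

    A block ends at, and includes, each terminating instruction.
    """
    blocks = []
    current = []
    for opcode in opcodes:
        current.append(opcode)
        if opcode in terminators:
            blocks.append(current)
            current = []
    if current:
        # Trailing instructions with no terminator; a well-formed
        # function should not produce this, but keep them visible.
        blocks.append(current)
    return blocks
```

A reader built this way needs no explicit basic block markers in the file, which is presumably why the instruction list can be written as a simple flat list.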
For brevity, instructions are written in one of four formats, depending on + the number of operands to the instruction. Each instruction begins with a + uint32_vbr whose two low-order bits identify the format of the instruction; + the remaining bits carry the opcode and, in formats 1 through 3, the type and operand slot numbers. The tables that follow describe the format of this + first word of each instruction.
+Instruction Format 0
+This format is used for a few instructions that can't easily be optimized + because they have large numbers of operands (e.g. PHI Node or getelementptr). + The opcode, type, and operand fields are each written as successive + uint32_vbr fields.
+Type | +Field Description | +
---|---|
uint32_vbr | +Specifies the opcode of the instruction. Note that for + compatibility with the other instruction formats, the opcode is shifted + left by 2 bits. Bits 0 and 1 must have value zero for this format. | +
uint32_vbr | +Provides the slot number of the result type of the + instruction | +
uint32_vbr | +The number of operands that follow. | +
uint32_vbr | +The slot number of the value for the operand(s). + 1,2 | +
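As a rough illustration of the Format 0 table, the sketch below decodes one instruction from a list of integers. It assumes the uint32_vbr fields have already been decoded to plain integers by some earlier step, and that bit 0 is the least significant bit. The function name `decode_format0` and the returned tuple layout are illustrative choices, not part of the format.

```python
def decode_format0(fields):
    """Decode one Format 0 instruction from already-decoded vbr values.

    Returns ((opcode, type_slot, operands), fields_consumed).
    """
    if len(fields) < 3:
        raise ValueError("format 0 needs at least three fields")
    first, type_slot, count = fields[0], fields[1], fields[2]
    if first & 0x3 != 0:
        # Bits 0 and 1 must be zero for this format.
        raise ValueError("not a format 0 instruction")
    opcode = first >> 2  # the opcode is stored shifted left by 2 bits
    operands = list(fields[3:3 + count])
    if len(operands) != count:
        raise ValueError("operand list is truncated")
    return (opcode, type_slot, operands), 3 + count
```

Returning the number of fields consumed lets a caller continue decoding the next instruction from the same list.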
Instruction Format 1
+This format encodes the opcode, type and a single operand into a single + uint32_vbr as follows:
+Bits | +Type | +Field Description | +
---|---|---|
0-1 | constant "1" | +These two bits must be the value 1 which identifies + this as an instruction of format 1. | + +
2-7 | opcode | +Specifies the opcode of the instruction. Note that + the maximum opcode value is 63. | +
8-19 | unsigned | +Specifies the slot number of the type for this + instruction. Maximum slot number is 2^12-1=4095. | +
20-31 | unsigned | +Specifies the slot number of the value for the + first operand. Maximum slot number is 2^12-1=4095. Note + that the value 2^12-1 denotes zero operands. | +
Instruction Format 2
+This format encodes the opcode, type and two operands into a single + uint32_vbr as follows:
+Bits | +Type | +Field Description | +
---|---|---|
0-1 | constant "2" | +These two bits must be the value 2 which identifies + this as an instruction of format 2. | + +
2-7 | opcode | +Specifies the opcode of the instruction. Note that + the maximum opcode value is 63. | +
8-15 | unsigned | +Specifies the slot number of the type for this + instruction. Maximum slot number is 2^8-1=255. | +
16-23 | unsigned | +Specifies the slot number of the value for the + first operand. Maximum slot number is 2^8-1=255. | +
24-31 | unsigned | +Specifies the slot number of the value for the + second operand. Maximum slot number is 2^8-1=255. | +
Instruction Format 3
+This format encodes the opcode, type and three operands into a single + uint32_vbr as follows:
+Bits | +Type | +Field Description | +
---|---|---|
0-1 | constant "3" | +These two bits must be the value 3 which identifies + this as an instruction of format 3. | + +
2-7 | opcode | +Specifies the opcode of the instruction. Note that + the maximum opcode value is 63. | +
8-13 | unsigned | +Specifies the slot number of the type for this + instruction. Maximum slot number is 2^6-1=63. | +
14-19 | unsigned | +Specifies the slot number of the value for the + first operand. Maximum slot number is 2^6-1=63. | +
20-25 | unsigned | +Specifies the slot number of the value for the + second operand. Maximum slot number is 2^6-1=63. | +
26-31 | unsigned | +Specifies the slot number of the value for the + third operand. Maximum slot number is 2^6-1=63. | +
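The bit layouts of formats 1, 2 and 3 differ only in field widths, so a single table-driven routine can unpack all three. The sketch below is one possible reading of the tables, assuming the uint32_vbr has already been decoded to a 32-bit integer and that bit 0 is the least significant bit. The names `FIELD_WIDTHS` and `decode_packed` are hypothetical.

```python
# Per format: (width of the type field, widths of the operand fields),
# taken from the bit ranges in the Format 1, 2 and 3 tables.
FIELD_WIDTHS = {
    1: (12, [12]),
    2: (8, [8, 8]),
    3: (6, [6, 6, 6]),
}


def decode_packed(word):
    """Unpack a format 1, 2 or 3 instruction word.

    Returns (format, opcode, type_slot, operands).
    """
    fmt = word & 0x3  # bits 0-1 select the format
    if fmt == 0:
        raise ValueError("format 0 is not a packed format")
    opcode = (word >> 2) & 0x3F  # bits 2-7
    type_bits, operand_bits = FIELD_WIDTHS[fmt]
    shift = 8  # the type field always starts at bit 8
    type_slot = (word >> shift) & ((1 << type_bits) - 1)
    shift += type_bits
    operands = []
    for bits in operand_bits:
        operands.append((word >> shift) & ((1 << bits) - 1))
        shift += bits
    if fmt == 1 and operands == [4095]:
        # In format 1 the operand value 2^12-1 denotes zero operands.
        operands = []
    return fmt, opcode, type_slot, operands
```

For example, `decode_packed(2 | (10 << 2) | (3 << 8) | (4 << 16) | (5 << 24))` yields format 2, opcode 10, type slot 3 and operand slots 4 and 5.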
Byte(s) | -Bit(s) | -Align? | Type | Field Description | ||
---|---|---|---|---|---|---|
00-03 | - | No | unsigned | +unsigned | Symbol Table Identifier (0x13) | |
04-07 | - | No | unsigned | +unsigned | Size in bytes of the symbol table block. | |
08-111 | - | No | uint32_vbr | +uint32_vbr | Number of entries in type plane | |
12-151 | - | No | uint32_vbr | -Type plane index for following entries | +symtab_entry | +Provides the slot number of the type and its name. + 1 |
16-191,2 | - | No | uint32_vbr | -Slot number of a value. | -||
variable1,2 | - | No | string | -Name of the value in the symbol table. | -symtab_plane | +A type plane containing value slot number and name + for all values of the same type.1 |
A symbol table plane provides the symbol table entries for all values of + a common type. The encoding is given in the following table:
+Type | +Field Description | +
---|---|
uint32_vbr | +Number of entries in this plane. | +
uint32_vbr | +Slot number of type for this plane. | +
symtab_entry | +The symbol table entries for this plane (repeated). | +
A symbol table entry provides the association between a type or value's + slot number and the name given to that type or value. The format is given + in the following table:
+Type | +Field Description | +
---|---|
uint32_vbr | +Slot number of the type or value being given a name. + | +
uint32_vbr | +Length of the character array that follows. | +
char | +The characters of the name (repeated). | +
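To make the symbol table entry layout concrete, here is a minimal reader sketch. This section does not define how uint32_vbr is encoded, so the helper `read_vbr` is an assumption: it treats each byte as carrying 7 bits of data, low-order group first, with the high bit set when another byte follows. The name is returned as raw bytes because the character encoding is not stated here. Both function names are hypothetical.

```python
def read_vbr(buf, pos):
    """Read one variable bit rate integer (ASSUMED encoding, see lead-in).

    Returns (value, new_position).
    """
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:  # high bit clear: this was the last byte
            return value, pos
        shift += 7


def read_symtab_entry(buf, pos):
    """Read one symbol table entry: slot number, name length, name bytes.

    Returns ((slot, name), new_position).
    """
    slot, pos = read_vbr(buf, pos)
    length, pos = read_vbr(buf, pos)
    name = bytes(buf[pos:pos + length])
    if len(name) != length:
        raise ValueError("symbol name is truncated")
    return (slot, name), pos + length
```

A symbol table plane could then be read by decoding the entry count and type slot with `read_vbr` and calling `read_symtab_entry` that many times.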
None. Version 1.0 and 1.1 bytecode formats are identical.