Various updates from Sam Bishop:

"I have been working my way through the JIT and Kaleidoscope tutorials in my
(minuscule) spare time.  Thanks again for writing them!  I have attached a
patch containing some minor changes, ranging from spelling and grammar fixes
to adding a "Next: <next tutorial section>" hyperlink to the bottom of each
page.

Every page has been given the "next link" treatment, but otherwise I'm only
half way through the Kaleidoscope tutorial.  I will send a follow-on patch
if time permits."



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46933 91177308-0d34-0410-b5e6-96231b3b80d8
Chris Lattner 2008-02-10 19:11:04 +00:00
parent 916c954bf2
commit 729eb14ae8
9 changed files with 87 additions and 75 deletions


@ -25,7 +25,7 @@
<div class="doc_text">
<p>For starters, lets consider a relatively straightforward function that takes three integer parameters and returns an arithmetic combination of them. This is nice and simple, especially since it involves no control flow:</p>
<p>For starters, let's consider a relatively straightforward function that takes three integer parameters and returns an arithmetic combination of them. This is nice and simple, especially since it involves no control flow:</p>
<div class="doc_code">
<pre>
@ -86,7 +86,7 @@ int main(int argc, char**argv) {
</pre>
</div>
<p>The first segment is pretty simple: it creates an LLVM “module.” In LLVM, a module represents a single unit of code that is to be processed together. A module contains things like global variables and function declarations and implementations. Here, we've declared a <code>makeLLVMModule()</code> function to do the real work of creating the module. Don't worry, we'll be looking at that one next!</p>
<p>The first segment is pretty simple: it creates an LLVM “module.” In LLVM, a module represents a single unit of code that is to be processed together. A module contains things like global variables, function declarations, and implementations. Here we've declared a <code>makeLLVMModule()</code> function to do the real work of creating the module. Don't worry, we'll be looking at that one next!</p>
<p>The second segment runs the LLVM module verifier on our newly created module. While this probably isn't really necessary for a simple module like this one, it's always a good idea, especially if you're generating LLVM IR based on some input. The verifier will print an error message if your LLVM module is malformed in any way.</p>
@ -106,7 +106,7 @@ Module* makeLLVMModule() {
<div class="doc_code">
<pre>
Constant* c = mod->getOrInsertFunction("mul_add",
Constant* c = mod-&gt;getOrInsertFunction("mul_add",
/*ret type*/ IntegerType::get(32),
/*args*/ IntegerType::get(32),
IntegerType::get(32),
@ -114,31 +114,31 @@ Module* makeLLVMModule() {
/*varargs terminated with null*/ NULL);
Function* mul_add = cast&lt;Function&gt;(c);
mul_add->setCallingConv(CallingConv::C);
mul_add-&gt;setCallingConv(CallingConv::C);
</pre>
</div>
<p>We construct our <code>Function</code> by calling <code>getOrInsertFunction()</code> on our module, passing in the name, return type, and argument types of the function. In the case of our <code>mul_add</code> function, that means one 32-bit integer for the return value, and three 32-bit integers for the arguments.</p>
<p>We construct our <code>Function</code> by calling <code>getOrInsertFunction()</code> on our module, passing in the name, return type, and argument types of the function. In the case of our <code>mul_add</code> function, that means one 32-bit integer for the return value and three 32-bit integers for the arguments.</p>
<p>You'll notice that <code>getOrInsertFunction</code> doesn't actually return a <code>Function*</code>. This is because, if the function already existed, but with a different prototype, <code>getOrInsertFunction</code> will return a cast of the existing function to the desired prototype. Since we know that there's not already a <code>mul_add</code> function, we can safely just cast <code>c</code> to a <code>Function*</code>.
<p>You'll notice that <code>getOrInsertFunction()</code> doesn't actually return a <code>Function*</code>. This is because <code>getOrInsertFunction()</code> will return a cast of the existing function if the function already existed with a different prototype. Since we know that there's not already a <code>mul_add</code> function, we can safely just cast <code>c</code> to a <code>Function*</code>.
<p>In addition, we set the calling convention for our new function to be the C calling convention. This isn't strictly necessary, but it ensures that our new function will interoperate properly with C code, which is a good thing.</p>
<div class="doc_code">
<pre>
Function::arg_iterator args = mul_add->arg_begin();
Function::arg_iterator args = mul_add-&gt;arg_begin();
Value* x = args++;
x->setName("x");
x-&gt;setName("x");
Value* y = args++;
y->setName("y");
y-&gt;setName("y");
Value* z = args++;
z->setName("z");
z-&gt;setName("z");
</pre>
</div>
<p>While we're setting up our function, let's also give names to the parameters. This also isn't strictly necessary (LLVM will generate names for them if you don't specify them), but it'll make looking at our output somewhat more pleasant. To name the parameters, we iterator over the arguments of our function, and call <code>setName()</code> on them. We'll also keep the pointer to <code>x</code>, <code>y</code>, and <code>z</code> around, since we'll need them when we get around to creating instructions.</p>
<p>While we're setting up our function, let's also give names to the parameters. This also isn't strictly necessary (LLVM will generate names for them if you don't specify them), but it'll make looking at our output somewhat more pleasant. To name the parameters, we iterate over the arguments of our function and call <code>setName()</code> on them. We'll also keep the pointer to <code>x</code>, <code>y</code>, and <code>z</code> around, since we'll need them when we get around to creating instructions.</p>
<p>Great! We have a function now. But what good is a function if it has no body? Before we start working on a body for our new function, we need to recall some details of the LLVM IR. The IR, being an abstract assembly language, represents control flow using jumps (we call them branches), both conditional and unconditional. The straight-line sequences of code between branches are called basic blocks, or just blocks. To create a body for our function, we fill it with blocks!</p>
<p>Great! We have a function now. But what good is a function if it has no body? Before we start working on a body for our new function, we need to recall some details of the LLVM IR. The IR, being an abstract assembly language, represents control flow using jumps (we call them branches), both conditional and unconditional. The straight-line sequences of code between branches are called basic blocks, or just blocks. To create a body for our function, we fill it with blocks:</p>
<div class="doc_code">
<pre>
@ -165,17 +165,18 @@ Module* makeLLVMModule() {
<p>The final step in creating our function is to create the instructions that make it up. Our <code>mul_add</code> function is composed of just three instructions: a multiply, an add, and a return. <code>LLVMBuilder</code> gives us a simple interface for constructing these instructions and appending them to the “entry” block. Each of the calls to <code>LLVMBuilder</code> returns a <code>Value*</code> that represents the value yielded by the instruction. You'll also notice that, above, <code>x</code>, <code>y</code>, and <code>z</code> are also <code>Value*</code>s, so it's clear that instructions operate on <code>Value*</code>s.</p>
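<p>As a point of comparison, here is a rough plain C++ counterpart of the function being built. It is only a sketch: the exact expression <code>x * y + z</code> is an assumption inferred from the name <code>mul_add</code> and the multiply/add/return sequence described above, and the temporaries <code>tmp</code> and <code>tmp2</code> and the helper <code>checkMulAdd</code> are hypothetical names. Each statement stands in for one of the three instructions, and each intermediate result plays the role of the value yielded by that instruction.</p>

```cpp
#include <cassert>

// Plain C++ counterpart of the three-instruction function.
// Assumption: the "arithmetic combination" is x * y + z.
int mul_add(int x, int y, int z) {
  int tmp = x * y;     // corresponds to the multiply instruction
  int tmp2 = tmp + z;  // corresponds to the add instruction
  return tmp2;         // corresponds to the return instruction
}

// Hypothetical usage check: call this from main() to exercise mul_add.
void checkMulAdd() {
  assert(mul_add(2, 3, 4) == 10);
}
```

<p>Because there is no control flow, all three statements would live in a single block.</p>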
<p>And that's it! Now you can compile and run your code, and get a wonderful textual print out of the LLVM IR we saw at the beginning. To compile, use the following commandline as a guide:</p>
<p>And that's it! Now you can compile and run your code, and get a wonderful textual print out of the LLVM IR we saw at the beginning. To compile, use the following command line as a guide:</p>
<div class="doc_code">
<pre>
# c++ -g tut2.cpp `llvm-config --cppflags --ldflags --libs core` -o tut2
# ./tut2
# c++ -g tut1.cpp `llvm-config --cppflags --ldflags --libs core` -o tut1
# ./tut1
</pre>
</div>
<p>The <code>llvm-config</code> utility is used to obtain the necessary GCC-compatible compiler flags for linking with LLVM. For this example, we only need the 'core' library. We'll use others once we start adding optimizers and the JIT engine.</p>
<a href="JITTutorial2.html">Next: A More Complicated Function</a>
</div>
<!-- *********************************************************************** -->


@ -32,7 +32,7 @@
unsigned gcd(unsigned x, unsigned y) {
if(x == y) {
return x;
} else if(x < y) {
} else if(x &lt; y) {
return gcd(x, y - x);
} else {
return gcd(x - y, y);
@ -45,7 +45,7 @@ unsigned gcd(unsigned x, unsigned y) {
<div style="text-align: center;"><img src="JITTutorial2-1.png" alt="GCD CFG" width="60%"></div>
<p>The above is a graphical representation of a program in LLVM IR. It places each basic block on a node of a graph, and uses directed edges to indicate flow control. These blocks will be serialized when written to a text or bitcode file, but it is often useful conceptually to think of them as a graph. Again, if you are unsure about the code in the diagram, you should skim through the <a href="../LangRef.html">LLVM Language Reference Manual</a> and convince yourself that it is, in fact, the GCD algorithm.</p>
<p>This is a graphical representation of a program in LLVM IR. It places each basic block on a node of a graph and uses directed edges to indicate flow control. These blocks will be serialized when written to a text or bitcode file, but it is often useful conceptually to think of them as a graph. Again, if you are unsure about the code in the diagram, you should skim through the <a href="../LangRef.html">LLVM Language Reference Manual</a> and convince yourself that it is, in fact, the GCD algorithm.</p>
<p>The first part of our code is practically the same as from the first tutorial. The same basic setup is required: creating a module, verifying it, and running the <code>PrintModulePass</code> on it. Even the first segment of <code>makeLLVMModule()</code> looks essentially the same, except that <code>gcd</code> takes one fewer parameter than <code>mul_add</code>.</p>
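<p>For readers who want to convince themselves that the function really computes a GCD, the C source at the top of this page can be compiled as is. The sketch below repeats it as C++ with comments; the helper <code>checkGcd</code> is a hypothetical name added for illustration, and the note about positive inputs is an observation about this subtraction-based version, not something the tutorial states.</p>

```cpp
#include <cassert>

// Subtraction-based GCD, as in the C function this tutorial compiles by hand.
// Note: this version only terminates when both arguments are positive.
unsigned gcd(unsigned x, unsigned y) {
  if (x == y) {
    return x;              // equal values: that value is the GCD
  } else if (x < y) {      // unsigned less-than comparison
    return gcd(x, y - x);  // shrink the larger argument
  } else {
    return gcd(x - y, y);
  }
}

// Hypothetical usage check: call this from main() to exercise gcd.
void checkGcd() {
  assert(gcd(12, 18) == 6);
}
```

<p>Each comparison in this function becomes a conditional branch in the IR, which is why the hand-built version needs several blocks.</p>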
@ -94,7 +94,7 @@ Module* makeLLVMModule() {
<p>Here, however, is where our code begins to diverge from the first tutorial. Because <code>gcd</code> has control flow, it is composed of multiple blocks interconnected by branching (<code>br</code>) instructions. For those familiar with assembly language, a block is similar to a labeled set of instructions. For those not familiar with assembly language, a block is basically a set of instructions that can be branched to and is executed linearly until the block is terminated by one of a small number of control flow instructions, such as <code>br</code> or <code>ret</code>.</p>
<p>Blocks corresponds to the nodes in the diagram we looked at in the beginning of this tutorial. From the diagram, we can see that this function contains five blocks, so we'll go ahead and create them. Note that, in this code sample, we're making use of LLVM's automatic name uniquing, since we're giving two blocks the same name.</p>
<p>Blocks correspond to the nodes in the diagram we looked at in the beginning of this tutorial. From the diagram, we can see that this function contains five blocks, so we'll go ahead and create them. Note that we're making use of LLVM's automatic name uniquing in this code sample, since we're giving two blocks the same name.</p>
<div class="doc_code">
<pre>
@ -106,7 +106,7 @@ Module* makeLLVMModule() {
</pre>
</div>
<p>Now, we're ready to begin generate code! We'll start with the <code>entry</code> block. This block corresponds to the top-level if-statement in the original C code, so we need to compare <code>x == y</code> To achieve this, we perform an explicity comparison using <code>ICmpEQ</code>. <code>ICmpEQ</code> stands for an <em>integer comparison for equality</em> and returns a 1-bit integer result. This 1-bit result is then used as the input to a conditional branch, with <code>ret</code> as the <code>true</code> and <code>cond_false</code> as the <code>false</code> case.</p>
<p>Now we're ready to begin generating code! We'll start with the <code>entry</code> block. This block corresponds to the top-level if-statement in the original C code, so we need to compare <code>x</code> and <code>y</code>. To achieve this, we perform an explicit comparison using <code>ICmpEQ</code>. <code>ICmpEQ</code> stands for an <em>integer comparison for equality</em> and returns a 1-bit integer result. This 1-bit result is then used as the input to a conditional branch, with <code>ret</code> as the <code>true</code> and <code>cond_false</code> as the <code>false</code> case.</p>
<div class="doc_code">
<pre>
@ -116,7 +116,7 @@ Module* makeLLVMModule() {
</pre>
</div>
<p>Our next block, <code>ret</code>, is pretty simple: it just returns the value of <code>x</code>. Recall that this block is only reached if <code>x == y</code>, so this is the correct behavior. Notice that, instead of creating a new <code>LLVMBuilder</code> for each block, we can use <code>SetInsertPoint</code> to retarget our existing one. This saves on construction and memory allocation costs.</p>
<p>Our next block, <code>ret</code>, is pretty simple: it just returns the value of <code>x</code>. Recall that this block is only reached if <code>x == y</code>, so this is the correct behavior. Notice that instead of creating a new <code>LLVMBuilder</code> for each block, we can use <code>SetInsertPoint</code> to retarget our existing one. This saves on construction and memory allocation costs.</p>
<div class="doc_code">
<pre>
@ -127,7 +127,7 @@ Module* makeLLVMModule() {
<p><code>cond_false</code> is a more interesting block: we now know that <code>x != y</code>, so we must branch again to determine which of <code>x</code> and <code>y</code> is larger. This is achieved using the <code>ICmpULT</code> instruction, which stands for <em>integer comparison for unsigned less-than</em>. In LLVM, integer types do not carry sign; a 32-bit integer pseudo-register can be interpreted as signed or unsigned without casting. Whether a signed or unsigned interpretation is desired is specified in the instruction. This is why several instructions in the LLVM IR, such as integer less-than, include a specifier for signed or unsigned.</p>
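<p>The idea that the same 32 bits can be compared under either interpretation can be illustrated in plain C++. This is only an analogy, not LLVM API code: the function names <code>lessThanUnsigned</code>, <code>lessThanSigned</code>, and <code>checkSignedness</code> are hypothetical, and C++ (unlike the IR) attaches signedness to the type, so the sketch copies the bits into a signed variable before comparing.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Unsigned interpretation of the bits, analogous to an unsigned less-than.
bool lessThanUnsigned(std::uint32_t a, std::uint32_t b) {
  return a < b;
}

// Signed interpretation of the very same bits, analogous to a signed less-than.
bool lessThanSigned(std::uint32_t a, std::uint32_t b) {
  std::int32_t sa, sb;
  std::memcpy(&sa, &a, sizeof sa);  // reinterpret the bit pattern, no value change
  std::memcpy(&sb, &b, sizeof sb);
  return sa < sb;
}

// Hypothetical usage check: the all-ones pattern is the largest unsigned
// value but is -1 when read as a signed value.
void checkSignedness() {
  assert(!lessThanUnsigned(0xFFFFFFFFu, 1u));
  assert(lessThanSigned(0xFFFFFFFFu, 1u));
}
```

<p>The operands are identical in both calls; only the comparison chosen decides how they are read.</p>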
<p>Also, note that we're again making use of LLVM's automatic name uniquing, this time at a register level. We've deliberately chosen to name every instruction "tmp", to illustrate that LLVM will give them all unique names without getting confused.</p>
<p>Also note that we're again making use of LLVM's automatic name uniquing, this time at a register level. We've deliberately chosen to name every instruction "tmp" to illustrate that LLVM will give them all unique names without getting confused.</p>
<div class="doc_code">
<pre>


@ -54,9 +54,10 @@ teaching compiler techniques and LLVM specifically, <em>not</em> about teaching
modern and sane software engineering principles. In practice, this means that
we'll take a number of shortcuts to simplify the exposition. For example, the
code leaks memory, uses global variables all over the place, doesn't use nice
design patterns like visitors, etc... but it is very simple. If you dig in and
use the code as a basis for future projects, fixing these deficiencies shouldn't
be hard.</p>
design patterns like <a
href="http://en.wikipedia.org/wiki/Visitor_pattern">visitors</a>, etc... but it
is very simple. If you dig in and use the code as a basis for future projects,
fixing these deficiencies shouldn't be hard.</p>
<p>I've tried to put this tutorial together in a way that makes chapters easy to
skip over if you are already familiar with or are uninterested in the various
@ -328,6 +329,7 @@ build an Abstract Syntax Tree</a>. When we have that, we'll include a driver
so that you can use the lexer and parser together.
</p>
<a href="LangImpl2.html">Next: Implementing a Parser and AST</a>
</div>
<!-- *********************************************************************** -->


@ -98,7 +98,7 @@ know what the stored numeric value is.</p>
<p>Right now we only create the AST, so there are no useful accessor methods on
them. It would be very easy to add a virtual method to pretty print the code,
for example. Here are the other expression AST node definitions that we'll use
in the basic form of the Kaleidoscope language.
in the basic form of the Kaleidoscope language:
</p>
<div class="doc_code">
@ -130,7 +130,7 @@ public:
</pre>
</div>
<p>This is all (intentially) rather straight-forward: variables capture the
<p>This is all (intentionally) rather straight-forward: variables capture the
variable name, binary operators capture their opcode (e.g. '+'), and calls
capture a function name as well as a list of any argument expressions. One thing
that is nice about our AST is that it captures the language features without
@ -201,7 +201,7 @@ calls like this:</p>
<div class="doc_code">
<pre>
/// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current
/// token the parser it looking at. getNextToken reads another token from the
/// token the parser is looking at. getNextToken reads another token from the
/// lexer and updates CurTok with its results.
static int CurTok;
static int getNextToken() {
@ -263,11 +263,11 @@ static ExprAST *ParseNumberExpr() {
<p>This routine is very simple: it expects to be called when the current token
is a <tt>tok_number</tt> token. It takes the current number value, creates
a <tt>NumberExprAST</tt> node, advances the lexer to the next token and finally
a <tt>NumberExprAST</tt> node, advances the lexer to the next token, and finally
returns.</p>
<p>There are some interesting aspects to this. The most important one is that
this routine eats all of the tokens that correspond to the production, and
this routine eats all of the tokens that correspond to the production and
returns the lexer buffer with the next token (which is not part of the grammar
production) ready to go. This is a fairly standard way to go for recursive
descent parsers. For a better example, the parenthesis operator is defined like
@ -293,7 +293,7 @@ static ExprAST *ParseParenExpr() {
parser:</p>
<p>
1) it shows how we use the Error routines. When called, this function expects
1) It shows how we use the Error routines. When called, this function expects
that the current token is a '(' token, but after parsing the subexpression, it
is possible that there is no ')' waiting. For example, if the user types in
"(4 x" instead of "(4)", the parser should emit an error. Because errors can
@ -305,8 +305,8 @@ calling <tt>ParseExpression</tt> (we will soon see that <tt>ParseExpression</tt>
<tt>ParseParenExpr</tt>). This is powerful because it allows us to handle
recursive grammars, and keeps each production very simple. Note that
parentheses do not cause construction of AST nodes themselves. While we could
do it this way, the most important role of parens is to guide the parser and
provide grouping. Once the parser constructs the AST, parens are not
do it this way, the most important role of parentheses is to guide the parser
and provide grouping. Once the parser constructs the AST, parentheses are not
needed.</p>
<p>The next simple production is for handling variable references and function
@ -350,21 +350,21 @@ static ExprAST *ParseIdentifierExpr() {
</pre>
</div>
<p>This routine follows the same style as the other routines (it expects to be
<p>This routine follows the same style as the other routines. (It expects to be
called if the current token is a <tt>tok_identifier</tt> token.) It also has
recursion and error handling. One interesting aspect of this is that it uses
<em>look-ahead</em> to determine if the current identifier is a stand alone
variable reference or if it is a function call expression. It handles this by
checking to see if the token after the identifier is a '(' token, and constructs
checking to see if the token after the identifier is a '(' token, constructing
either a <tt>VariableExprAST</tt> or <tt>CallExprAST</tt> node as appropriate.
</p>
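<p>The look-ahead decision described above can be sketched in isolation. The code below is not the tutorial's parser: it works directly on a string instead of a token stream, and the names <code>IdentKind</code>, <code>classifyIdentifierExpr</code>, and <code>checkClassify</code> are hypothetical. It only shows the core idea of consuming the identifier and then peeking at what follows to choose between a variable reference and a call.</p>

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

enum class IdentKind { Variable, Call, NotIdentifier };

// Classify the expression at the start of Src by looking one token past
// the identifier: a following '(' means a call, anything else a variable.
IdentKind classifyIdentifierExpr(const std::string &Src) {
  std::size_t I = 0;
  auto IsSpace = [&](std::size_t P) {
    return std::isspace(static_cast<unsigned char>(Src[P])) != 0;
  };
  while (I < Src.size() && IsSpace(I)) ++I;  // skip leading whitespace
  if (I >= Src.size() || !std::isalpha(static_cast<unsigned char>(Src[I])))
    return IdentKind::NotIdentifier;
  while (I < Src.size() && std::isalnum(static_cast<unsigned char>(Src[I])))
    ++I;                                     // eat the identifier
  while (I < Src.size() && IsSpace(I)) ++I;  // move to the next token
  if (I < Src.size() && Src[I] == '(')
    return IdentKind::Call;                  // look-ahead saw '('
  return IdentKind::Variable;
}

// Hypothetical usage check.
void checkClassify() {
  assert(classifyIdentifierExpr("foo") == IdentKind::Variable);
  assert(classifyIdentifierExpr("foo(1, 2)") == IdentKind::Call);
}
```

<p>A real parser would go on to build the matching AST node at each of the two return points.</p>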
<p>Now that we have all of our simple expression parsing logic in place, we can
define a helper function to wrap it together into one entry-point. We call this
<p>Now that we have all of our simple expression-parsing logic in place, we can
define a helper function to wrap it together into one entry point. We call this
class of expressions "primary" expressions, for reasons that will become more
clear <a href="LangImpl6.html#unary">later in the tutorial</a>. In order to
parse an arbitrary primary expression, we need to determine what sort of
specific expression it is:</p>
expression it is:</p>
<div class="doc_code">
<pre>
@ -383,13 +383,13 @@ static ExprAST *ParsePrimary() {
</pre>
</div>
<p>Now that you see the definition of this function, it makes it more obvious
why we can assume the state of CurTok in the various functions. This uses
look-ahead to determine which sort of expression is being inspected, and parses
it with a function call.</p>
<p>Now that you see the definition of this function, it is more obvious why we
can assume the state of CurTok in the various functions. This uses look-ahead
to determine which sort of expression is being inspected, and then parses it
with a function call.</p>
<p>Now that basic expressions are handled, we need to handle binary expressions,
which are a bit more complex.</p>
<p>Now that basic expressions are handled, we need to handle binary expressions.
They are a bit more complex.</p>
</div>
@ -447,12 +447,12 @@ int main() {
or -1 if the token is not a binary operator. Having a map makes it easy to add
new operators and makes it clear that the algorithm doesn't depend on the
specific operators involved, but it would be easy enough to eliminate the map
and do the comparisons in the <tt>GetTokPrecedence</tt> function (or just use
and do the comparisons in the <tt>GetTokPrecedence</tt> function. (Or just use
a fixed-size array.)</p>
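<p>A map-based precedence lookup of the kind described above might look roughly like the following. This is a self-contained sketch rather than the tutorial's exact code: it takes the token as a parameter instead of reading a global, and the operators and precedence numbers in the table are illustrative assumptions, as is the helper <code>checkPrecedence</code>.</p>

```cpp
#include <cassert>
#include <map>

// Illustrative precedence table; higher numbers bind tighter.
// The specific operators and values here are assumptions.
static std::map<char, int> BinopPrecedence = {
    {'<', 10}, {'+', 20}, {'-', 20}, {'*', 40}};

// Returns the precedence of a binary-operator token,
// or -1 if the token is not a binary operator.
static int GetTokPrecedence(int Tok) {
  if (Tok < 0 || Tok > 127)
    return -1;  // not a single ASCII character token
  auto It = BinopPrecedence.find(static_cast<char>(Tok));
  if (It == BinopPrecedence.end())
    return -1;  // character is not a known binary operator
  return It->second;
}

// Hypothetical usage check.
static void checkPrecedence() {
  assert(GetTokPrecedence('*') > GetTokPrecedence('+'));
  assert(GetTokPrecedence('(') == -1);
}
```

<p>Adding an operator is then a one-line change to the table, which is the flexibility the map buys over hard-coded comparisons.</p>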
<p>With the helper above defined, we can now start parsing binary expressions.
The basic idea of operator precedence parsing is to break down an expression
with potentially ambiguous binary operators into pieces. Consider for example
with potentially ambiguous binary operators into pieces. Consider, for example,
the expression "a+b+(c+d)*e*f+g". Operator precedence parsing considers this
as a stream of primary expressions separated by binary operators. As such,
it will first parse the leading primary expression "a", then it will see the
@ -708,7 +708,7 @@ static FunctionAST *ParseTopLevelExpr() {
</pre>
</div>
<p>Now that we have all the pieces, lets build a little driver that will let us
<p>Now that we have all the pieces, let's build a little driver that will let us
actually <em>execute</em> this code we've built!</p>
</div>
@ -732,7 +732,7 @@ static void MainLoop() {
fprintf(stderr, "ready&gt; ");
switch (CurTok) {
case tok_eof: return;
case ';': getNextToken(); break; // ignore top level semicolons.
case ';': getNextToken(); break; // ignore top-level semicolons.
case tok_def: HandleDefinition(); break;
case tok_extern: HandleExtern(); break;
default: HandleTopLevelExpression(); break;
@ -742,13 +742,13 @@ static void MainLoop() {
</pre>
</div>
<p>The most interesting part of this is that we ignore top-level semi colons.
<p>The most interesting part of this is that we ignore top-level semicolons.
Why is this, you ask? The basic reason is that if you type "4 + 5" at the
command line, the parser doesn't know whether that is the end of what you will type
or not. For example, on the next line you could type "def foo..." in which case
4+5 is the end of a top-level expression. Alternatively you could type "* 6",
which would continue the expression. Having top-level semicolons allows you to
type "4+5;" and the parser will know you are done.</p>
type "4+5;", and the parser will know you are done.</p>
</div>
@ -760,8 +760,8 @@ type "4+5;" and the parser will know you are done.</p>
<p>With just under 400 lines of commented code (240 lines of non-comment,
non-blank code), we fully defined our minimal language, including a lexer,
parser and AST builder. With this done, the executable will validate
Kaleidoscope code and tell us if it is gramatically invalid. For
parser, and AST builder. With this done, the executable will validate
Kaleidoscope code and tell us if it is grammatically invalid. For
example, here is a sample interaction:</p>
<div class="doc_code">
@ -798,8 +798,8 @@ Representation (IR) from the AST.</p>
<p>
Here is the complete code listing for this and the previous chapter.
Note that it is fully self-contained: you don't need LLVM or any external
libraries at all for this (other than the C and C++ standard libraries of
course). To build this, just compile with:</p>
libraries at all for this. (Besides the C and C++ standard libraries, of
course.) To build this, just compile with:</p>
<div class="doc_code">
<pre>
@ -955,7 +955,7 @@ public:
//===----------------------------------------------------------------------===//
/// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current
/// token the parser it looking at. getNextToken reads another token from the
/// token the parser is looking at. getNextToken reads another token from the
/// lexer and updates CurTok with its results.
static int CurTok;
static int getNextToken() {
@ -1167,7 +1167,7 @@ static void HandleExtern() {
}
static void HandleTopLevelExpression() {
// Evaluate a top level expression into an anonymous function.
// Evaluate a top-level expression into an anonymous function.
if (FunctionAST *F = ParseTopLevelExpr()) {
fprintf(stderr, "Parsed a top-level expr\n");
} else {
@ -1182,7 +1182,7 @@ static void MainLoop() {
fprintf(stderr, "ready&gt; ");
switch (CurTok) {
case tok_eof: return;
case ';': getNextToken(); break; // ignore top level semicolons.
case ';': getNextToken(); break; // ignore top-level semicolons.
case tok_def: HandleDefinition(); break;
case tok_extern: HandleExtern(); break;
default: HandleTopLevelExpression(); break;
@ -1211,6 +1211,7 @@ int main() {
}
</pre>
</div>
<a href="LangImpl3.html">Next: Implementing Code Generation to LLVM IR</a>
</div>
<!-- *********************************************************************** -->


@ -59,8 +59,8 @@ LLVM SVN to work. LLVM 2.1 and before will not work with it.</p>
<div class="doc_text">
<p>
In order to generate LLVM IR, we want some simple setup to get started. First,
we define virtual codegen methods in each AST class:</p>
In order to generate LLVM IR, we want some simple setup to get started. First
we define virtual code generation (codegen) methods in each AST class:</p>
<div class="doc_code">
<pre>
@ -95,9 +95,11 @@ href="http://en.wikipedia.org/wiki/Static_single_assignment_form">Static Single
Assignment</a> - the concepts are really quite natural once you grok them.</p>
<p>Note that instead of adding virtual methods to the ExprAST class hierarchy,
it could also make sense to use a visitor pattern or some other way to model
this. Again, this tutorial won't dwell on good software engineering practices:
for our purposes, adding a virtual method is simplest.</p>
it could also make sense to use a <a
href="http://en.wikipedia.org/wiki/Visitor_pattern">visitor pattern</a> or some
other way to model this. Again, this tutorial won't dwell on good software
engineering practices: for our purposes, adding a virtual method is
simplest.</p>
<p>The
second thing we want is an "Error" method like we used for the parser, which will
@ -121,16 +123,15 @@ uses to contain code.</p>
<p>The <tt>Builder</tt> object is a helper object that makes it easy to generate
LLVM instructions. Instances of the <a
href="http://llvm.org/doxygen/LLVMBuilder_8h-source.html"><tt>LLVMBuilder</tt>
class</a> keep track of the current place to
insert instructions and have methods to create new instructions.</p>
href="http://llvm.org/doxygen/LLVMBuilder_8h-source.html"><tt>LLVMBuilder</tt></a>
class keep track of the current place to insert instructions and have methods to
create new instructions.</p>
<p>The <tt>NamedValues</tt> map keeps track of which values are defined in the
current scope and what their LLVM representation is (in other words, it is a
symbol table for the code). In this form of
Kaleidoscope, the only things that can be referenced are function parameters.
As such, function parameters will be in this map when generating code for their
function body.</p>
current scope and what their LLVM representation is. (In other words, it is a
symbol table for the code.) In this form of Kaleidoscope, the only things that
can be referenced are function parameters. As such, function parameters will
be in this map when generating code for their function body.</p>
<p>
With these basics in place, we can start talking about how to generate code for
@ -148,7 +149,7 @@ has already been done, and we'll just use it to emit code.
<div class="doc_text">
<p>Generating LLVM code for expression nodes is very straightforward: less
than 45 lines of commented code for all four of our expression nodes. First,
than 45 lines of commented code for all four of our expression nodes. First
we'll do numeric literals:</p>
<div class="doc_code">
@ -218,11 +219,13 @@ code, we do a simple switch on the opcode to create the right LLVM instruction.
LLVMBuilder knows where to insert the newly created instruction, all you have to
do is specify what instruction to create (e.g. with <tt>CreateAdd</tt>), which
operands to use (<tt>L</tt> and <tt>R</tt> here) and optionally provide a name
for the generated instruction. One nice thing about LLVM is that the name is
just a hint: if there are multiple additions in a single function, the first
will be named "addtmp" and the second will be "autorenamed" by adding a suffix,
giving it a name like "addtmp42". Local value names for instructions are purely
optional, but it makes it much easier to read the IR dumps.</p>
for the generated instruction.</p>
<p>One nice thing about LLVM is that the name is just a hint. For instance, if
the code above emits multiple "addtmp" variables, LLVM will automatically
provide each one with an increasing, unique numeric suffix. Local value names
for instructions are purely optional, but it makes it much easier to read the
IR dumps.</p>
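<p>To make the "name is just a hint" behavior concrete, here is a toy model of name uniquing. It is explicitly not LLVM's implementation, and the exact suffixes LLVM picks may differ; the class <code>NameUniquer</code>, its members, and the helper <code>checkUniquer</code> are hypothetical names used only for illustration.</p>

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy model of automatic name uniquing. NOT LLVM's actual algorithm:
// it only shows how a requested name can be treated as a hint.
class NameUniquer {
  std::set<std::string> Taken;                 // names already handed out
  std::map<std::string, unsigned> NextSuffix;  // last suffix tried per base name

public:
  // Returns Base if it is free, otherwise Base plus an increasing number.
  std::string unique(const std::string &Base) {
    std::string Name = Base;
    while (!Taken.insert(Name).second)  // insert fails if Name is in use
      Name = Base + std::to_string(++NextSuffix[Base]);
    return Name;
  }
};

// Hypothetical usage check: three requests for the same hint.
void checkUniquer() {
  NameUniquer U;
  assert(U.unique("addtmp") == "addtmp");
  assert(U.unique("addtmp") == "addtmp1");
  assert(U.unique("addtmp") == "addtmp2");
}
```

<p>The caller never has to track which names are in use, which is what lets the code generator reuse the same hint everywhere.</p>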
<p><a href="../LangRef.html#instref">LLVM instructions</a> are constrained by
strict rules: for example, the Left and Right operands of
@ -1228,6 +1231,7 @@ int main() {
}
</pre>
</div>
<a href="LangImpl4.html">Next: Adding JIT and Optimizer Support</a>
</div>
<!-- *********************************************************************** -->


@ -1119,6 +1119,7 @@ int main() {
</pre>
</div>
<a href="LangImpl5.html">Next: Extending the language: control flow</a>
</div>
<!-- *********************************************************************** -->


@ -1745,6 +1745,7 @@ int main() {
</pre>
</div>
<a href="LangImpl6.html">Next: Extending the language: user-defined operators</a>
</div>
<!-- *********************************************************************** -->


@ -1784,6 +1784,7 @@ int main() {
</pre>
</div>
<a href="LangImpl7.html">Next: Extending the language: mutable variables / SSA construction</a>
</div>
<!-- *********************************************************************** -->


@ -2140,6 +2140,7 @@ int main() {
</pre>
</div>
<a href="LangImpl8.html">Next: Conclusion and other useful LLVM tidbits</a>
</div>
<!-- *********************************************************************** -->