diff --git a/docs/tutorial/JITTutorial2-1.png b/docs/tutorial/JITTutorial2-1.png
new file mode 100644
index 00000000000..eb21695f684
Binary files /dev/null and b/docs/tutorial/JITTutorial2-1.png differ
diff --git a/docs/tutorial/JITTutorial2.html b/docs/tutorial/JITTutorial2.html
new file mode 100644
index 00000000000..d88668ad771
--- /dev/null
+++ b/docs/tutorial/JITTutorial2.html
@@ -0,0 +1,186 @@
LLVM Tutorial 2: A More Complicated Function
+ +
+

Written by Owen Anderson

+
+ + +
Code Samples
+ + +
+All the code in this example can be downloaded as Tutorial2.tar.bz2 or Tutorial2.zip.
+ + +
A More Complicated Function
+ + +
+ +

Now that we understand the basics of creating functions in LLVM, let's move on to a more complicated example: something with control flow. As an example, let's consider Euclid's Greatest Common Divisor (GCD) algorithm:

+ +
+
+unsigned gcd(unsigned x, unsigned y) {
+  if(x == y) {
+    return x;
+  } else if(x < y) {
+    return gcd(x, y - x);
+  } else {
+    return gcd(x - y, y);
+  }
+}
+
+
+ +

With this example, we'll learn how to create functions with multiple blocks and control flow, and how to make function calls within your LLVM code. For starters, consider the diagram below.

+ +
GCD CFG
+ +

The above is a graphical representation of a program in LLVM IR. It places each basic block on a node of a graph and uses directed edges to indicate control flow. These blocks will be serialized when written to a text or bitcode file, but it is often useful conceptually to think of them as a graph. Again, if you are unsure about the code in the diagram, you should skim through the LLVM Language Reference Manual and convince yourself that it is, in fact, the GCD algorithm.
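For readers who cannot see the diagram, here is a rough sketch of the same control-flow graph in plain C++, with one label per basic block and a goto for each edge. The function name gcd_blocks is made up for this illustration, and the label names are borrowed from the block names used later in this tutorial. It is only meant to show the shape of the graph, not the IR that LLVM actually produces.

```cpp
#include <cassert>

// Hypothetical illustration: each label plays the role of one basic block,
// and every block ends in a "terminator" (a goto or a return).
unsigned gcd_blocks(unsigned x, unsigned y) {
  // entry block: compare x and y, then branch.
  if (x == y) goto ret; else goto cond_false;

ret:
  // Only reached when x == y.
  return x;

cond_false:
  // x != y here, so decide which one is larger.
  if (x < y) goto cond_true; else goto cond_false_2;

cond_true:
  // x < y: recurse on (x, y - x).
  return gcd_blocks(x, y - x);

cond_false_2:
  // x > y: recurse on (x - y, y).
  return gcd_blocks(x - y, y);
}
```

Like the original C function, this sketch assumes both inputs are non-zero; with a zero argument the subtraction never makes progress.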

+ +

The first part of our code is the same as in the first tutorial. The same basic setup is required: creating a module, verifying it, and running the PrintModulePass on it. Even the first segment of makeLLVMModule() looks the same, because gcd happens to have the same prototype as our mul_add function.

+ +
+
+#include <llvm/Module.h>
+#include <llvm/Function.h>
+#include <llvm/PassManager.h>
+#include <llvm/Analysis/Verifier.h>
+#include <llvm/Assembly/PrintModulePass.h>
+#include <llvm/Support/LLVMBuilder.h>
+
+using namespace llvm;
+
+Module* makeLLVMModule();
+
+int main(int argc, char**argv) {
+  Module* Mod = makeLLVMModule();
+  
+  verifyModule(*Mod, PrintMessageAction);
+  
+  PassManager PM;
+  PM.add(new PrintModulePass(&llvm::cout));
+  PM.run(*Mod);
+  
+  return 0;
+}
+
+Module* makeLLVMModule() {
+  Module* mod = new Module("tut2");
+  
+  Constant* c = mod->getOrInsertFunction("gcd",
+                                         IntegerType::get(32),
+                                         IntegerType::get(32),
+                                         IntegerType::get(32),
+                                         NULL);
+  Function* gcd = cast<Function>(c);
+  
+  Function::arg_iterator args = gcd->arg_begin();
+  Value* x = args++;
+  x->setName("x");
+  Value* y = args++;
+  y->setName("y");
+
+
+ +

Here, however, is where our code begins to diverge from the first tutorial. Because gcd has control flow, it is composed of multiple blocks interconnected by branching (br) instructions. For those familiar with assembly language, a block is similar to a labeled set of instructions. For those not familiar with assembly language, a block is basically a set of instructions that can be branched to and is executed linearly until the block is terminated by one of a small number of control flow instructions, such as br or ret.

+ +

Blocks correspond to the nodes in the diagram we looked at at the beginning of this tutorial. From the diagram, we can see that this function contains five blocks, so we'll go ahead and create them. Note that, in this code sample, we're making use of LLVM's automatic name uniquing, since we're giving two blocks the same name.

+ +
+
+  BasicBlock* entry = new BasicBlock("entry", gcd);
+  BasicBlock* ret = new BasicBlock("return", gcd);
+  BasicBlock* cond_false = new BasicBlock("cond_false", gcd);
+  BasicBlock* cond_true = new BasicBlock("cond_true", gcd);
+  BasicBlock* cond_false_2 = new BasicBlock("cond_false", gcd);
+
+
+ +

Now we're ready to begin generating code! We'll start with the entry block. This block corresponds to the top-level if-statement in the original C code, so we need to compare x == y. To achieve this, we perform an explicit comparison using ICmpEQ. ICmpEQ stands for integer comparison for equality and returns a 1-bit integer result. This 1-bit result is then used as the input to a conditional branch, with ret as the true case and cond_false as the false case.

+ +
+
+  LLVMBuilder builder(entry);
+  Value* xEqualsY = builder.CreateICmpEQ(x, y, "tmp");
+  builder.CreateCondBr(xEqualsY, ret, cond_false);
+
+
+ +

Our next block, ret, is pretty simple: it just returns the value of x. Recall that this block is only reached if x == y, so this is the correct behavior. Notice that, instead of creating a new LLVMBuilder for each block, we can use SetInsertPoint to retarget our existing one. This saves on construction and memory allocation costs.

+ +
+
+  builder.SetInsertPoint(ret);
+  builder.CreateRet(x);
+
+
+ +

cond_false is a more interesting block: we now know that x != y, so we must branch again to determine which of x and y is larger. This is achieved using the ICmpULT instruction, which stands for integer comparison for unsigned less-than. In LLVM, integer types do not carry sign; a 32-bit integer pseudo-register can be interpreted as signed or unsigned without casting. Whether a signed or unsigned interpretation is desired is specified in the instruction. This is why several instructions in the LLVM IR, such as integer less-than, include a specifier for signed or unsigned.
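To see why the choice of signed or unsigned belongs to the comparison rather than to the value, consider this small sketch in plain C++. The helper names less_unsigned and less_signed are invented for this illustration and are not part of LLVM; they simply compare the same 32 bits under the two interpretations.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: compare the two bit patterns as unsigned integers.
bool less_unsigned(std::uint32_t a, std::uint32_t b) {
  return a < b;
}

// Hypothetical helper: reinterpret the very same bits as signed integers
// (copying the bytes, so no value conversion takes place) and compare those.
bool less_signed(std::uint32_t a, std::uint32_t b) {
  std::int32_t sa, sb;
  std::memcpy(&sa, &a, sizeof sa);
  std::memcpy(&sb, &b, sizeof sb);
  return sa < sb;
}
```

With a = 0xFFFFFFFF and b = 1, the unsigned comparison says a is not less than b, while the signed comparison reads a as -1 and says it is. The bits never change; only the instruction's interpretation does.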

+ +

Also, note that we're again making use of LLVM's automatic name uniquing, this time at a register level. We've deliberately chosen to name every instruction "tmp", to illustrate that LLVM will give them all unique names without getting confused.

+ +
+
+  builder.SetInsertPoint(cond_false);
+  Value* xLessThanY = builder.CreateICmpULT(x, y, "tmp");
+  builder.CreateCondBr(xLessThanY, cond_true, cond_false_2);
+
+
+ +
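As a rough mental model of name uniquing, the toy class below hands out a fresh name every time the same base name is requested. NameUniquer is a made-up class for this illustration, not LLVM's implementation, and the exact suffixes LLVM chooses may differ from the ones produced here.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy model only: remembers every name handed out so far and appends a
// numeric suffix until the candidate no longer collides.
class NameUniquer {
 public:
  std::string unique(const std::string& base) {
    std::string candidate = base;
    unsigned suffix = 0;
    // insert() reports whether the name was new; retry with the next suffix
    // for as long as the candidate is already taken.
    while (!used_.insert(candidate).second) {
      candidate = base + std::to_string(++suffix);
    }
    return candidate;
  }

 private:
  std::set<std::string> used_;
};
```

Asking this toy for "tmp" three times yields three distinct names, which is the property the tutorial code relies on when it names every instruction "tmp".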

Our last two blocks are quite similar; they're both recursive calls to gcd with different parameters. To create a call instruction, we have to create a vector (or any other container with InputIterators) to hold the arguments. We then pass in the beginning and ending iterators for this vector.

+ +
+
+  builder.SetInsertPoint(cond_true);
+  Value* yMinusX = builder.CreateSub(y, x, "tmp");
+  std::vector<Value*> args1;
+  args1.push_back(x);
+  args1.push_back(yMinusX);
+  Value* recur_1 = builder.CreateCall(gcd, args1.begin(), args1.end(), "tmp");
+  builder.CreateRet(recur_1);
+  
+  builder.SetInsertPoint(cond_false_2);
+  Value* xMinusY = builder.CreateSub(x, y, "tmp");
+  std::vector<Value*> args2;
+  args2.push_back(xMinusY);
+  args2.push_back(y);
+  Value* recur_2 = builder.CreateCall(gcd, args2.begin(), args2.end(), "tmp");
+  builder.CreateRet(recur_2);
+  
+  return mod;
+}
+
+
+ +
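The begin/end convention used for the call arguments is the ordinary C++ iterator-range idiom. As a minimal sketch, the function template below accepts any pair of input iterators, which is why a vector, a list, or a plain array would all work as the argument container. The name collect_args is invented for this illustration and is not an LLVM function.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Hypothetical stand-in for an API that takes its arguments as an iterator
// range: it only needs to read each element once, from first up to last.
template <typename InputIterator>
std::vector<unsigned> collect_args(InputIterator first, InputIterator last) {
  return std::vector<unsigned>(first, last);
}
```

Because the function only sees two iterators, the caller is free to pick whichever container is most convenient for building the argument list.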

And that's it! You can compile and run your code in the same way as before:

+ +
+
+# c++ -g tut2.cpp `llvm-config --cppflags` `llvm-config --ldflags` \
+                  `llvm-config --libs core` -o tut2
+# ./tut2
+
+
+ +
\ No newline at end of file
diff --git a/docs/tutorial/index.html b/docs/tutorial/index.html
index 0101c6dccc5..79597d8e9d7 100644
--- a/docs/tutorial/index.html
+++ b/docs/tutorial/index.html
@@ -19,7 +19,7 @@
   • Simple JIT Tutorials
     1. A First Function
-    2. A More Complicated Function
+    2. A More Complicated Function
     3. Running Optimizations
     4. Reading and Writing Bitcode
     5. Invoking the JIT