From 00c992dde0cfdaf2914b8fdc6cd9d8931bb5d168 Mon Sep 17 00:00:00 2001 From: Chris Lattner Date: Sat, 3 Nov 2007 08:55:29 +0000 Subject: [PATCH] hack and slash the first 20% of chapter seven. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@43663 91177308-0d34-0410-b5e6-96231b3b80d8 --- docs/tutorial/LangImpl7.html | 298 +++++++++++++++++++++++++++++++++++ docs/tutorial/index.html | 2 +- 2 files changed, 299 insertions(+), 1 deletion(-) create mode 100644 docs/tutorial/LangImpl7.html diff --git a/docs/tutorial/LangImpl7.html b/docs/tutorial/LangImpl7.html new file mode 100644 index 00000000000..f80c0673fdd --- /dev/null +++ b/docs/tutorial/LangImpl7.html @@ -0,0 +1,298 @@ + + + + + Kaleidoscope: Extending the Language: Mutable Variables / SSA + construction + + + + + + + +
Kaleidoscope: Extending the Language: Mutable Variables
+ +
+

Written by Chris Lattner

+
+ + +
Part 7 Introduction
+ + +
+ +

Welcome to Part 7 of the "Implementing a language with +LLVM" tutorial. In parts 1 through 6, we've built a very respectable, +albeit simple, functional +programming language. In our journey, we learned some parsing techniques, +how to build and represent an AST, how to build LLVM IR, and how to optimize +the resultant code and JIT compile it.

+ +

While Kaleidoscope is interesting as a functional language, the fact that it
+is functional makes it "too easy" to generate LLVM IR for it. In particular, a
+functional language makes it very easy to build LLVM IR directly in SSA form.
+Since LLVM requires that the input code be in SSA form, this is a very nice
+property, but it is often unclear to newcomers how to generate code for an
+imperative language with mutable variables.

+ +

The short (and happy) summary of this chapter is that there is no need for +your front-end to build SSA form: LLVM provides highly tuned and well tested +support for this, though the way it works is a bit unexpected for some.

+ +
+ + +
Why is this a hard problem?
+ + +
+ +

+To understand why mutable variables cause complexities in SSA construction, +consider this extremely simple C example: +

+ +
+
+int G, H;
+int test(_Bool Condition) {
+  int X;
+  if (Condition)
+    X = G;
+  else
+    X = H;
+  return X;
+}
+
+
+ +

In this case, we have the variable "X", whose value depends on the path +executed in the program. Because there are two different possible values for X +before the return instruction, a PHI node is inserted to merge the two values. +The LLVM IR that we want for this example looks like this:

+ +
+
+@G = weak global i32 0   ; type of @G is i32*
+@H = weak global i32 0   ; type of @H is i32*
+
+define i32 @test(i1 %Condition) {
+entry:
+	br i1 %Condition, label %cond_true, label %cond_false
+
+cond_true:
+	%X.0 = load i32* @G
+	br label %cond_next
+
+cond_false:
+	%X.1 = load i32* @H
+	br label %cond_next
+
+cond_next:
+	%X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
+	ret i32 %X.2
+}
+
+
+ +

In this example, the loads from the G and H global variables are explicit in
+the LLVM IR, and they live in the then/else branches of the if statement
+(cond_true/cond_false). In order to merge the incoming values, the X.2 phi node
+in the cond_next block selects the right value to use based on where control
+flow is coming from: if control flow comes from the cond_false block, X.2 gets
+the value of X.1. Alternatively, if control flow comes from cond_true, it gets
+the value of X.0. The intent of this chapter is not to explain the details of
+SSA form. For more information, see one of the many online
+references.

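To make the phi node's behavior concrete, here is a plain-C++ model of what the cond_next phi computes (a hedged illustration only: the function names are invented and this is not the LLVM API). A phi yields one of its incoming values based on which predecessor block control flow arrived from:

```cpp
#include <string>

// Invented helper: models the phi in cond_next, which yields X.0 when
// control arrives from cond_true and X.1 when it arrives from cond_false.
int phiSelect(const std::string &cameFrom, int XfromTrue, int XfromFalse) {
  return cameFrom == "cond_true" ? XfromTrue : XfromFalse;
}

// Mirrors "int test(_Bool Condition)" above, with the values of the
// globals G and H passed in as parameters for simplicity.
int test(bool Condition, int G, int H) {
  std::string pred = Condition ? "cond_true" : "cond_false";
  return phiSelect(pred, /*X.0=*/G, /*X.1=*/H);
}
```

Calling test(true, ...) takes the cond_true path and returns the value loaded from G, just as the IR above does.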
+ +

The question for this article is "who places phi nodes when lowering
+assignments to mutable variables?". The issue here is that LLVM
+requires that its IR be in SSA form: there is no "non-SSA" mode for it.
+However, SSA construction requires non-trivial algorithms and data structures,
+so it is inconvenient and wasteful for every front-end to have to reproduce this
+logic.

+ +
+ + +
Memory in LLVM
+ + +
+ +

The 'trick' here is that while LLVM does require all register values to be
+in SSA form, it does not require (or permit) memory objects to be in SSA form.
+In the example above, note that the loads from G and H are direct accesses to
+G and H: they are not renamed or versioned. This differs from some other
+compiler systems, which do try to version memory objects. In LLVM, instead of
+encoding dataflow analysis of memory into the LLVM IR, it is handled with Analysis Passes which are computed on
+demand.

+ +

+With this in mind, the high-level idea is that we want to make a stack variable +(which lives in memory, because it is on the stack) for each mutable object in +a function. To take advantage of this trick, we need to talk about how LLVM +represents stack variables. +

+ +

In LLVM, all memory accesses are explicit with load/store instructions, and +it is carefully designed to not have (or need) an "address-of" operator. Notice +how the type of the @G/@H global variables is actually "i32*" even though the +variable is defined as "i32". What this means is that @G defines space +for an i32 in the global data area, but its name actually refers to the +address for that space. Stack variables work the same way, but instead of being +declared with global variable definitions, they are declared with the +LLVM alloca instruction:

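As a C++ analogy for this point (a hedged illustration, not part of the tutorial's code): a global int names storage, and its address is what LLVM's @G of type i32* corresponds to, so the IR's loads and stores look like explicit pointer dereferences:

```cpp
// Like "@G = weak global i32 0": the name G denotes i32-sized storage.
int G = 0;

int readThroughPointer() {
  int *addr = &G; // like @G itself: a value of type i32* (the address)
  *addr = 42;     // like "store i32 42, i32* @G"
  return *addr;   // like "%tmp = load i32* @G"
}
```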
+ +
+
+define i32 @test(i1 %Condition) {
+entry:
+	%X = alloca i32           ; type of %X is i32*.
+	...
+	%tmp = load i32* %X       ; load the stack value %X from the stack.
+	%tmp2 = add i32 %tmp, 1   ; increment it
+	store i32 %tmp2, i32* %X  ; store it back
+	...
+
+
+ +

This code shows an example of how you can declare and manipulate a stack +variable in the LLVM IR. Stack memory allocated with the alloca instruction is +fully general: you can pass the address of the stack slot to functions, you can +store it in other variables, etc. In our example above, we could rewrite the +example to use the alloca technique to avoid using a PHI node:

+ +
+
+@G = weak global i32 0   ; type of @G is i32*
+@H = weak global i32 0   ; type of @H is i32*
+
+define i32 @test(i1 %Condition) {
+entry:
+	%X = alloca i32           ; type of %X is i32*.
+	br i1 %Condition, label %cond_true, label %cond_false
+
+cond_true:
+	%X.0 = load i32* @G
+	store i32 %X.0, i32* %X   ; Update X
+	br label %cond_next
+
+cond_false:
+	%X.1 = load i32* @H
+	store i32 %X.1, i32* %X   ; Update X
+	br label %cond_next
+
+cond_next:
+	%X.2 = load i32* %X       ; Read X
+	ret i32 %X.2
+}
+
+
+ +

With this, we have discovered a way to handle arbitrary mutable variables +without the need to create Phi nodes at all:

+ +
    +
  1. Each mutable variable becomes a stack allocation.
  2. Each read of the variable becomes a load from the stack.
  3. Each update of the variable becomes a store to the stack.
  4. Taking the address of a variable just uses the stack address directly.
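As a sketch of how a front-end might apply these rules (hypothetical helper names; real code would use LLVM's IRBuilder rather than assembling strings), lowering a read-modify-write of one mutable variable produces exactly one alloca plus loads and stores:

```cpp
#include <sstream>
#include <string>

// Hypothetical sketch: lower "X = X + 1" for mutable variable X into
// LLVM-like IR text, following the four rules above.
std::string lowerIncrement() {
  std::ostringstream IR;
  IR << "%X = alloca i32\n";          // rule 1: variable -> stack slot
  IR << "%tmp = load i32* %X\n";      // rule 2: read -> load
  IR << "%tmp2 = add i32 %tmp, 1\n";
  IR << "store i32 %tmp2, i32* %X\n"; // rule 3: update -> store
  return IR.str();
}
```

Note that no phi nodes appear anywhere in the output: every cross-block dataflow question has been turned into a load or store.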
+ +

While this solution solves our immediate problem, it introduces another
+one: we have now apparently introduced a lot of stack traffic for very simple
+and common operations, which is a major performance problem. Fortunately, the
+LLVM optimizer has a highly tuned optimization pass named "mem2reg" that
+handles this case, promoting allocas like this into SSA registers and
+inserting phi nodes as appropriate. For example, if you run this code through
+the pass, you'll get:

+ +
+
+$ llvm-as < example.ll | opt -mem2reg | llvm-dis
+@G = weak global i32 0
+@H = weak global i32 0
+
+define i32 @test(i1 %Condition) {
+entry:
+	br i1 %Condition, label %cond_true, label %cond_false
+
+cond_true:
+	%X.0 = load i32* @G
+	br label %cond_next
+
+cond_false:
+	%X.1 = load i32* @H
+	br label %cond_next
+
+cond_next:
+	%X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
+	ret i32 %X.01
+}
+
+ +
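What mem2reg did here can be sketched in miniature (a hedged toy, not the real algorithm: the actual pass uses iterated dominance frontiers, while this only handles a merge block whose predecessors each store to the alloca). The load of %X in cond_next is replaced by a phi over the value stored along each incoming edge:

```cpp
#include <string>
#include <vector>

// Toy model of one promotion step: a predecessor block name plus the SSA
// value that was stored to the alloca along that path.
struct Pred {
  std::string block;
  std::string storedValue;
};

// Build the phi that replaces "load i32* %X" in the merge block.
std::string phiForMergeLoad(const std::vector<Pred> &preds) {
  std::string phi = "phi i32 ";
  for (size_t i = 0; i < preds.size(); ++i) {
    if (i) phi += ", ";
    phi += "[ " + preds[i].storedValue + ", %" + preds[i].block + " ]";
  }
  return phi;
}
```

Fed the two predecessors from the example, this reproduces the %X.01 phi in the mem2reg output above.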

The mem2reg pass is guaranteed to work on promotable allocas, but note that it
+only promotes allocas that are used in certain restricted ways: the alloca must
+be in the entry block of the function, and its only uses may be direct load and
+store instructions. If the address of the alloca escapes, for example by being
+passed to another function or stored into memory, mem2reg cannot promote it.

+ +

The final question you may be asking is: should I bother with this nonsense
+for my front-end? Wouldn't it be better if I just did SSA construction
+directly, avoiding use of the mem2reg optimization pass? We strongly recommend
+that you use this technique anyway: mem2reg is proven and well tested, it is
+extremely fast in the common cases, and keeping mutable variables in stack
+slots until optimization also makes it much easier to emit debug information
+that refers to them.

+
+ + + + + + +
+ +

+Here is the complete code listing for our running example, enhanced with
+support for mutable variables. To build this example, use:

+ +
+
+   # Compile
+   g++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
+   # Run
+   ./toy
+
+
+ +

Here is the code:

+ +
+
+
+
+ +
+ + +
+
+ Valid CSS! + Valid HTML 4.01! + + Chris Lattner
+ The LLVM Compiler Infrastructure
+ Last modified: $Date: 2007-10-17 11:05:13 -0700 (Wed, 17 Oct 2007) $ +
+ + diff --git a/docs/tutorial/index.html b/docs/tutorial/index.html index f11fbdce8ba..12644dd7a4d 100644 --- a/docs/tutorial/index.html +++ b/docs/tutorial/index.html @@ -33,7 +33,7 @@
  • Adding JIT and Optimizer Support
  • Extending the language: control flow
  • Extending the language: user-defined operators
  • -
  • Extending the language: mutable variables
  • +
  • Extending the language: mutable variables / SSA construction
  • Thoughts and ideas for extensions
  • Advanced Topics