diff --git a/docs/tutorial/LangImpl6.html b/docs/tutorial/LangImpl6.html new file mode 100644 index 00000000000..a8ef28f24b0 --- /dev/null +++ b/docs/tutorial/LangImpl6.html @@ -0,0 +1,227 @@ + + + + + Kaleidoscope: Extending the Language: Operator Overloading + + + + + + + +
Kaleidoscope: Extending the Language: Operator Overloading
+ +
+

Written by Chris Lattner

+
+ + +
Part 6 Introduction
+ + +
+ +

Welcome to Part 6 of the "Implementing a language with +LLVM" tutorial. At this point in our tutorial, we now have a fully +functional language that is fairly minimal, but also useful. One big problem +with it though is that it doesn't have many useful operators (like division, +logical negation, or even any comparisons other than less-than.

+ +

This chapter of the tutorial takes a wild digression into adding operator +overloading to the simple and beautiful Kaleidoscope language, giving us a +simple and ugly language in some ways, but also a powerful one at the same time. +One of the great things about creating your own language is that you get to +decide what is good or bad. In this tutorial we'll assume that it is okay and +use this as a way to show some interesting parsing techniques.

+ +
+ + +
Operator Overloading: the Idea
+ + +
+ +

+The operator overloading that we will add to Kaleidoscope is more general than +languages like C++. In C++, you are only allowed to redefine existing +operators: you can't programatically change the grammar, introduce new +operators, change precedence levels, etc. In this chapter, we will add this +capability to Kaleidoscope, which will allow us to round out the set of +operators that are supported, culminating in a more interesting example app.

+ +

The point of going into operator overloading in a tutorial like this is to +show the power and flexibility of using a hand-written parser. The parser we +are using so far is using recursive descent for most parts of the grammar, and +operator precedence parsing for the expressions. See Chapter 2 for details. Without using operator +precedence parsing, it would be very difficult to allow the programmer to +introduce new operators into the grammar: the grammar is dynamically extensible +as the JIT runs.

+ +

The two specific features we'll add are programmable unary operators (right +now, Kaleidoscope has no unary operators at all) as well as binary operators. +An example of this is:

+ +
+
+# Logical unary not.
+def unary!(v)
+  if v then
+    0
+  else
+    1;
+
+# Define > with the same precedence as <.
+def binary> 10 (LHS RHS)
+  !(LHS < RHS);     # alternatively, could just use "RHS < LHS"
+
+# Binary "logical or", (note that it does not "short circuit")
+def binary| 5 (LHS RHS)
+  if LHS then
+    1
+  else if RHS then
+    1
+  else
+    0;
+
+# Define = with slightly lower precedence than relationals.
+def binary= 9 (LHS RHS)
+  !(LHS < RHS | LHS > RHS);
+
+
+ +

Many languages aspire to being able to implement their standard runtime +library in the language itself. In Kaleidoscope, we can implement significant +parts of the language in the library!

+ +

We will break down implementation of these features into two parts: +implementing support for overloading of binary operators and adding unary +operators.

+ +
+ + +
Overloading Binary Operators
+ + +
+ +

Adding support for overloaded binary operators is pretty simple with our +current framework. We'll first add support for the unary/binary keywords:

+ +
+
+enum Token {
+  ...
+  // operators
+  tok_binary = -11, tok_unary = -12
+};
+...
+static int gettok() {
+...
+    if (IdentifierStr == "for") return tok_for;
+    if (IdentifierStr == "in") return tok_in;
+    if (IdentifierStr == "binary") return tok_binary;
+    if (IdentifierStr == "unary") return tok_unary;
+    return tok_identifier;
+
+
+ +

This just adds lexer support for the unary and binary keywords, like we +did in previous chapters. One nice thing +about our current AST is that we represent binary operators fully generally +with their ASCII code as the opcode. For our extended operators, we'll use the +same representation, so we don't need any new AST or parser support.

+ +

On the other hand, we have to be able to represent the definitions of these +new operators, in the "def binary| 5" part of the function definition. In the +grammar so far, the "name" for the function definition is parsed as the +"prototype" production and into the PrototypeAST AST node. To +represent our new user-defined operators as prototypes, we have to extend +the PrototypeAST AST node like this:

+ +
+
+/// PrototypeAST - This class represents the "prototype" for a function,
+/// which captures its argument names as well as if it is an operator.
+class PrototypeAST {
+  std::string Name;
+  std::vector<std::string> Args;
+  bool isOperator;
+  unsigned Precedence;  // Precedence if a binary op.
+public:
+  PrototypeAST(const std::string &name, const std::vector<std::string> &args,
+               bool isoperator = false, unsigned prec = 0)
+  : Name(name), Args(args), isOperator(isoperator), Precedence(prec) {}
+  
+  bool isUnaryOp() const { return isOperator && Args.size() == 1; }
+  bool isBinaryOp() const { return isOperator && Args.size() == 2; }
+  
+  char getOperatorName() const {
+    assert(isUnaryOp() || isBinaryOp());
+    return Name[Name.size()-1];
+  }
+  
+  unsigned getBinaryPrecedence() const { return Precedence; }
+  
+  Function *Codegen();
+};
+
+
+ +

Basically, in addition to knowing a name for the prototype, we now keep track +of whether it was an operator, and if it was, what precedence level the operator +is at. The precedence is only used for binary operators.

+ + +

...

+ +
+ + + +
Full Code Listing
+ + +
+ +

+Here is the complete code listing for our running example, enhanced with the +if/then/else and for expressions.. To build this example, use: +

+ +
+
+   # Compile
+   g++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
+   # Run
+   ./toy
+
+
+ +

Here is the code:

+ +
+
+
+
+ +
+ + +
+
+ Valid CSS! + Valid HTML 4.01! + + Chris Lattner
+ The LLVM Compiler Infrastructure
+ Last modified: $Date: 2007-10-17 11:05:13 -0700 (Wed, 17 Oct 2007) $ +
+ + diff --git a/docs/tutorial/index.html b/docs/tutorial/index.html index 11e93d3ccd5..56886c38f8f 100644 --- a/docs/tutorial/index.html +++ b/docs/tutorial/index.html @@ -32,7 +32,7 @@
  • Implementing Code Generation to LLVM IR
  • Adding JIT and Optimizer Support
  • Extending the language: control flow
  • -
  • Extending the language: operator overloading
  • +
  • Extending the language: operator overloading
  • Extending the language: mutable variables
  • Thoughts and ideas for extensions