	Fix some documentation for the tutorial.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48966 91177308-0d34-0410-b5e6-96231b3b80d8
		| @@ -219,15 +219,15 @@ type token = | ||||
| </div> | ||||
|  | ||||
| <p>Each token returned by our lexer will be one of the token variant values. | ||||
| An unknown character like '+' will be returned as <tt>Kwd '+'</tt>.  If the | ||||
| current token is an identifier, the value will be <tt>Ident s</tt>.  If the | ||||
| current token is a numeric literal (like 1.0), the value will be | ||||
| <tt>Number 1.0</tt>. | ||||
| An unknown character like '+' will be returned as <tt>Token.Kwd '+'</tt>.  If | ||||
| the current token is an identifier, the value will be <tt>Token.Ident s</tt>.  If | ||||
| the current token is a numeric literal (like 1.0), the value will be | ||||
| <tt>Token.Number 1.0</tt>. | ||||
| </p> | ||||
|  | ||||
| <p>The actual implementation of the lexer is a collection of functions driven | ||||
| by a function named <tt>lex</tt>.  The <tt>lex</tt> function is called to | ||||
| return the next token from standard input.  We will use | ||||
| by a function named <tt>Lexer.lex</tt>.  The <tt>Lexer.lex</tt> function is | ||||
| called to return the next token from standard input.  We will use | ||||
| <a href="http://caml.inria.fr/pub/docs/manual-camlp4/index.html">Camlp4</a> | ||||
| to simplify the tokenization of the standard input.  Its definition starts | ||||
| as:</p> | ||||
| @@ -245,13 +245,13 @@ let rec lex = parser | ||||
| </div> | ||||
|  | ||||
| <p> | ||||
| <tt>lex</tt> works by recursing over a <tt>char Stream.t</tt> to read | ||||
| <tt>Lexer.lex</tt> works by recursing over a <tt>char Stream.t</tt> to read | ||||
| characters one at a time from the standard input.  It eats them as it recognizes | ||||
| them and stores them in a <tt>token</tt> variant.  The first thing that it | ||||
| has to do is ignore whitespace between tokens.  This is accomplished with the | ||||
| them and stores them in a <tt>Token.token</tt> variant.  The first thing that | ||||
| it has to do is ignore whitespace between tokens.  This is accomplished with the | ||||
| recursive call above.</p> | ||||
|  | ||||
| <p>The next thing <tt>lex</tt> needs to do is recognize identifiers and | ||||
| <p>The next thing <tt>Lexer.lex</tt> needs to do is recognize identifiers and | ||||
| specific keywords like "def".  Kaleidoscope does this with a pattern match | ||||
| and a helper function.</p> | ||||
|  | ||||
| @@ -300,8 +300,8 @@ and lex_number buffer = parser | ||||
|  | ||||
| <p>This is all pretty straightforward code for processing input.  When reading | ||||
| a numeric value from input, we use the OCaml <tt>float_of_string</tt> function | ||||
| to convert it to a numeric value that we store in <tt>NumVal</tt>.  Note that | ||||
| this isn't doing sufficient error checking: it will raise <tt>Failure</tt> | ||||
| to convert it to a numeric value that we store in <tt>Token.Number</tt>.  Note | ||||
| that this isn't doing sufficient error checking: it will raise <tt>Failure</tt> | ||||
| if the input is a string like "1.23.45.67".  Feel free to extend it :).  Next we handle | ||||
| comments: | ||||
| </p> | ||||
|   | ||||
| @@ -240,13 +240,13 @@ error"</tt>, where if the token before the <tt>??</tt> does not match, then | ||||
| <tt>Stream.Error "parse error"</tt> will be raised.</p> | ||||
|  | ||||
| <p>2) Another interesting aspect of this function is that it uses recursion by | ||||
| calling <tt>parse_primary</tt> (we will soon see that <tt>parse_primary</tt> can | ||||
| call <tt>parse_primary</tt>).  This is powerful because it allows us to handle | ||||
| recursive grammars, and keeps each production very simple.  Note that | ||||
| parentheses do not cause construction of AST nodes themselves.  While we could | ||||
| do it this way, the most important role of parentheses is to guide the parser | ||||
| and provide grouping.  Once the parser constructs the AST, parentheses are not | ||||
| needed.</p> | ||||
| calling <tt>Parser.parse_primary</tt> (we will soon see that | ||||
| <tt>Parser.parse_primary</tt> can call <tt>Parser.parse_primary</tt>).  This is | ||||
| powerful because it allows us to handle recursive grammars, and keeps each | ||||
| production very simple.  Note that parentheses do not cause construction of AST | ||||
| nodes themselves.  While we could do it this way, the most important role of | ||||
| parentheses is to guide the parser and provide grouping.  Once the parser | ||||
| constructs the AST, parentheses are not needed.</p> | ||||
|  | ||||
| <p>The next simple production is for handling variable references and function | ||||
| calls:</p> | ||||
| @@ -345,12 +345,12 @@ let main () = | ||||
|  | ||||
| <p>For the basic form of Kaleidoscope, we will only support 4 binary operators | ||||
| (this can obviously be extended by you, our brave and intrepid reader).  The | ||||
| <tt>precedence</tt> function returns the precedence for the current token, | ||||
| or -1 if the token is not a binary operator.  Having a <tt>Hashtbl.t</tt> makes | ||||
| it easy to add new operators and makes it clear that the algorithm doesn't | ||||
| <tt>Parser.precedence</tt> function returns the precedence for the current | ||||
| token, or -1 if the token is not a binary operator.  Having a <tt>Hashtbl.t</tt> | ||||
| makes it easy to add new operators and makes it clear that the algorithm doesn't | ||||
| depend on the specific operators involved, but it would be easy enough to | ||||
| eliminate the <tt>Hashtbl.t</tt> and do the comparisons in the | ||||
| <tt>precedence</tt> function.  (Or just use a fixed-size array).</p> | ||||
| <tt>Parser.precedence</tt> function.  (Or just use a fixed-size array).</p> | ||||
|  | ||||
| <p>With the helper above defined, we can now start parsing binary expressions. | ||||
| The basic idea of operator precedence parsing is to break down an expression | ||||
| @@ -376,19 +376,19 @@ and parse_expr = parser | ||||
| </pre> | ||||
| </div> | ||||
|  | ||||
| <p><tt>parse_bin_rhs</tt> is the function that parses the sequence of pairs for | ||||
| us.  It takes a precedence and a pointer to an expression for the part that has been | ||||
| parsed so far.   Note that "x" is a perfectly valid expression: As such, "binoprhs" is | ||||
| allowed to be empty, in which case it returns the expression that is passed into | ||||
| it. In our example above, the code passes the expression for "a" into | ||||
| <tt>ParseBinOpRHS</tt> and the current token is "+".</p> | ||||
| <p><tt>Parser.parse_bin_rhs</tt> is the function that parses the sequence of | ||||
| pairs for us.  It takes a precedence and a pointer to an expression for the part | ||||
| that has been parsed so far.   Note that "x" is a perfectly valid expression: As | ||||
| such, "binoprhs" is allowed to be empty, in which case it returns the expression | ||||
| that is passed into it. In our example above, the code passes the expression for | ||||
| "a" into <tt>Parser.parse_bin_rhs</tt> and the current token is "+".</p> | ||||
|  | ||||
| <p>The precedence value passed into <tt>parse_bin_rhs</tt> indicates the <em> | ||||
| minimal operator precedence</em> that the function is allowed to eat.  For | ||||
| example, if the current pair stream is [+, x] and <tt>parse_bin_rhs</tt> is | ||||
| passed in a precedence of 40, it will not consume any tokens (because the | ||||
| precedence of '+' is only 20).  With this in mind, <tt>parse_bin_rhs</tt> starts | ||||
| with:</p> | ||||
| <p>The precedence value passed into <tt>Parser.parse_bin_rhs</tt> indicates the | ||||
| <em>minimal operator precedence</em> that the function is allowed to eat.  For | ||||
| example, if the current pair stream is [+, x] and <tt>Parser.parse_bin_rhs</tt> | ||||
| is passed in a precedence of 40, it will not consume any tokens (because the | ||||
| precedence of '+' is only 20).  With this in mind, <tt>Parser.parse_bin_rhs</tt> | ||||
| starts with:</p> | ||||
|  | ||||
| <div class="doc_code"> | ||||
| <pre> | ||||
| @@ -497,10 +497,10 @@ context):</p> | ||||
| has higher precedence than the binop we are currently parsing.  As such, we know | ||||
| that any sequence of pairs whose operators are all higher precedence than "+" | ||||
| should be parsed together and returned as "RHS".  To do this, we recursively | ||||
| invoke the <tt>parse_bin_rhs</tt> function specifying "token_prec+1" as the | ||||
| minimum precedence required for it to continue.  In our example above, this will | ||||
| cause it to return the AST node for "(c+d)*e*f" as RHS, which is then set as the | ||||
| RHS of the '+' expression.</p> | ||||
| invoke the <tt>Parser.parse_bin_rhs</tt> function specifying "token_prec+1" as | ||||
| the minimum precedence required for it to continue.  In our example above, this | ||||
| will cause it to return the AST node for "(c+d)*e*f" as RHS, which is then set | ||||
| as the RHS of the '+' expression.</p> | ||||
|  | ||||
| <p>Finally, on the next iteration of the while loop, the "+g" piece is parsed | ||||
| and added to the AST.  With this little bit of code (14 non-trivial lines), we | ||||
| @@ -705,7 +705,7 @@ course.)  To build this, just compile with:</p> | ||||
| # Compile | ||||
| ocamlbuild toy.byte | ||||
| # Run | ||||
| ./toy | ||||
| ./toy.byte | ||||
| </pre> | ||||
| </div> | ||||
|  | ||||
|   | ||||