mirror of
https://github.com/c64scene-ar/llvm-6502.git
Fix some documentation for the tutorial.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48966 91177308-0d34-0410-b5e6-96231b3b80d8
@@ -219,15 +219,15 @@ type token =
|
|||||||
</div>
|
</div>
|
||||||
|
|
||||||
<p>Each token returned by our lexer will be one of the token variant values.
|
<p>Each token returned by our lexer will be one of the token variant values.
|
||||||
An unknown character like '+' will be returned as <tt>Kwd '+'</tt>. If the
|
An unknown character like '+' will be returned as <tt>Token.Kwd '+'</tt>. If
|
||||||
curr token is an identifier, the value will be <tt>Ident s</tt>. If the
|
the curr token is an identifier, the value will be <tt>Token.Ident s</tt>. If
|
||||||
current token is a numeric literal (like 1.0), the value will be
|
the current token is a numeric literal (like 1.0), the value will be
|
||||||
<tt>Number 1.0</tt>.
|
<tt>Token.Number 1.0</tt>.
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
<p>The actual implementation of the lexer is a collection of functions driven
|
<p>The actual implementation of the lexer is a collection of functions driven
|
||||||
by a function named <tt>lex</tt>. The <tt>lex</tt> function is called to
|
by a function named <tt>Lexer.lex</tt>. The <tt>Lexer.lex</tt> function is
|
||||||
return the next token from standard input. We will use
|
called to return the next token from standard input. We will use
|
||||||
<a href="http://caml.inria.fr/pub/docs/manual-camlp4/index.html">Camlp4</a>
|
<a href="http://caml.inria.fr/pub/docs/manual-camlp4/index.html">Camlp4</a>
|
||||||
to simplify the tokenization of the standard input. Its definition starts
|
to simplify the tokenization of the standard input. Its definition starts
|
||||||
as:</p>
|
as:</p>
|
||||||
@@ -245,13 +245,13 @@ let rec lex = parser
|
|||||||
</div>
|
</div>
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
<tt>lex</tt> works by recursing over a <tt>char Stream.t</tt> to read
|
<tt>Lexer.lex</tt> works by recursing over a <tt>char Stream.t</tt> to read
|
||||||
characters one at a time from the standard input. It eats them as it recognizes
|
characters one at a time from the standard input. It eats them as it recognizes
|
||||||
them and stores them in in a <tt>token</tt> variant. The first thing that it
|
them and stores them in in a <tt>Token.token</tt> variant. The first thing that
|
||||||
has to do is ignore whitespace between tokens. This is accomplished with the
|
it has to do is ignore whitespace between tokens. This is accomplished with the
|
||||||
recursive call above.</p>
|
recursive call above.</p>
|
||||||
|
|
||||||
<p>The next thing <tt>lex</tt> needs to do is recognize identifiers and
|
<p>The next thing <tt>Lexer.lex</tt> needs to do is recognize identifiers and
|
||||||
specific keywords like "def". Kaleidoscope does this with this a pattern match
|
specific keywords like "def". Kaleidoscope does this with this a pattern match
|
||||||
and a helper function.<p>
|
and a helper function.<p>
|
||||||
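A minimal sketch of the identifier and keyword step described above, written over a plain string rather than the Camlp4 stream parser the tutorial uses. The constructors Def and Extern, the helpers is_alpha, is_digit and scan_while, and the index-based lex signature are assumptions made for illustration only. Numbers and comments are deliberately left out.

```ocaml
(* Simplified token type. Ident and Kwd follow the text; Def and Extern
   are assumed names for the keyword tokens. *)
type token = Def | Extern | Ident of string | Kwd of char

let is_alpha c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
let is_digit c = c >= '0' && c <= '9'

(* Index of the first character at or after i that fails pred. *)
let rec scan_while pred s i =
  if i < String.length s && pred s.[i] then scan_while pred s (i + 1) else i

(* Hypothetical string-based lexer: returns all tokens from index i on. *)
let rec lex s i =
  if i >= String.length s then []
  else
    match s.[i] with
    (* Skip whitespace between tokens with a recursive call. *)
    | ' ' | '\n' | '\r' | '\t' -> lex s (i + 1)
    (* identifier: [a-zA-Z][a-zA-Z0-9]*, then check for keywords. *)
    | c when is_alpha c ->
        let j = scan_while (fun c -> is_alpha c || is_digit c) s i in
        let tok =
          match String.sub s i (j - i) with
          | "def" -> Def
          | "extern" -> Extern
          | id -> Ident id
        in
        tok :: lex s j
    (* Anything else is returned as a single-character Kwd token. *)
    | c -> Kwd c :: lex s (i + 1)
```

For example, lex "def foo + x" 0 evaluates to [Def; Ident "foo"; Kwd '+'; Ident "x"].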
|
|
||||||
@@ -300,8 +300,8 @@ and lex_number buffer = parser
|
|||||||
|
|
||||||
<p>This is all pretty straight-forward code for processing input. When reading
|
<p>This is all pretty straight-forward code for processing input. When reading
|
||||||
a numeric value from input, we use the ocaml <tt>float_of_string</tt> function
|
a numeric value from input, we use the ocaml <tt>float_of_string</tt> function
|
||||||
to convert it to a numeric value that we store in <tt>NumVal</tt>. Note that
|
to convert it to a numeric value that we store in <tt>Token.Number</tt>. Note
|
||||||
this isn't doing sufficient error checking: it will raise <tt>Failure</tt>
|
that this isn't doing sufficient error checking: it will raise <tt>Failure</tt>
|
||||||
if the string "1.23.45.67". Feel free to extend it :). Next we handle
|
if the string "1.23.45.67". Feel free to extend it :). Next we handle
|
||||||
comments:
|
comments:
|
||||||
</p>
|
</p>
|
||||||
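On the error checking noted above for numeric literals: one possible extension, sketched here under assumptions, is to use the standard library's float_of_string_opt, which returns None instead of raising Failure. The function name number_token and the reduced token type are hypothetical and not part of the tutorial's code.

```ocaml
(* Reduced token type for illustration; the tutorial's type has more cases. *)
type token = Number of float

(* Hypothetical helper: convert the accumulated digits to a token, reporting
   malformed input as an Error value instead of raising Failure. *)
let number_token (text : string) : (token, string) result =
  match float_of_string_opt text with
  | Some n -> Ok (Number n)
  | None -> Error (Printf.sprintf "malformed number: %S" text)
```

With this sketch, number_token "1.0" yields Ok (Number 1.0), while number_token "1.23.45.67" yields an Error carrying a message rather than an exception.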
|
@@ -240,13 +240,13 @@ error"</tt>, where if the token before the <tt>??</tt> does not match, then
|
|||||||
<tt>Stream.Error "parse error"</tt> will be raised.</p>
|
<tt>Stream.Error "parse error"</tt> will be raised.</p>
|
||||||
|
|
||||||
<p>2) Another interesting aspect of this function is that it uses recursion by
|
<p>2) Another interesting aspect of this function is that it uses recursion by
|
||||||
calling <tt>parse_primary</tt> (we will soon see that <tt>parse_primary</tt> can
|
calling <tt>Parser.parse_primary</tt> (we will soon see that
|
||||||
call <tt>parse_primary</tt>). This is powerful because it allows us to handle
|
<tt>Parser.parse_primary</tt> can call <tt>Parser.parse_primary</tt>). This is
|
||||||
recursive grammars, and keeps each production very simple. Note that
|
powerful because it allows us to handle recursive grammars, and keeps each
|
||||||
parentheses do not cause construction of AST nodes themselves. While we could
|
production very simple. Note that parentheses do not cause construction of AST
|
||||||
do it this way, the most important role of parentheses are to guide the parser
|
nodes themselves. While we could do it this way, the most important role of
|
||||||
and provide grouping. Once the parser constructs the AST, parentheses are not
|
parentheses are to guide the parser and provide grouping. Once the parser
|
||||||
needed.</p>
|
constructs the AST, parentheses are not needed.</p>
|
||||||
|
|
||||||
<p>The next simple production is for handling variable references and function
|
<p>The next simple production is for handling variable references and function
|
||||||
calls:</p>
|
calls:</p>
|
||||||
@@ -345,12 +345,12 @@ let main () =
|
|||||||
|
|
||||||
<p>For the basic form of Kaleidoscope, we will only support 4 binary operators
|
<p>For the basic form of Kaleidoscope, we will only support 4 binary operators
|
||||||
(this can obviously be extended by you, our brave and intrepid reader). The
|
(this can obviously be extended by you, our brave and intrepid reader). The
|
||||||
<tt>precedence</tt> function returns the precedence for the current token,
|
<tt>Parser.precedence</tt> function returns the precedence for the current
|
||||||
or -1 if the token is not a binary operator. Having a <tt>Hashtbl.t</tt> makes
|
token, or -1 if the token is not a binary operator. Having a <tt>Hashtbl.t</tt>
|
||||||
it easy to add new operators and makes it clear that the algorithm doesn't
|
makes it easy to add new operators and makes it clear that the algorithm doesn't
|
||||||
depend on the specific operators involved, but it would be easy enough to
|
depend on the specific operators involved, but it would be easy enough to
|
||||||
eliminate the <tt>Hashtbl.t</tt> and do the comparisons in the
|
eliminate the <tt>Hashtbl.t</tt> and do the comparisons in the
|
||||||
<tt>precedence</tt> function. (Or just use a fixed-size array).</p>
|
<tt>Parser.precedence</tt> function. (Or just use a fixed-size array).</p>
|
||||||
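The precedence lookup described above might look roughly like the following sketch. Only the value 20 for '+' is stated in the text of this diff; the other three operators and their values, the table name binop_precedence, and the choice to key on a char rather than a token are assumptions.

```ocaml
(* Table of binary operator precedences. Higher numbers bind tighter.
   Apart from '+' = 20, the entries are assumed for illustration. *)
let binop_precedence : (char, int) Hashtbl.t = Hashtbl.create 10

let () =
  Hashtbl.add binop_precedence '<' 10;
  Hashtbl.add binop_precedence '+' 20;
  Hashtbl.add binop_precedence '-' 20;
  Hashtbl.add binop_precedence '*' 40

(* Precedence of an operator character, or -1 if it is not a known binop. *)
let precedence (c : char) : int =
  try Hashtbl.find binop_precedence c with Not_found -> -1
```

Adding an operator is then a single call such as Hashtbl.replace binop_precedence '/' 40, with no change to the lookup function.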
|
|
||||||
<p>With the helper above defined, we can now start parsing binary expressions.
|
<p>With the helper above defined, we can now start parsing binary expressions.
|
||||||
The basic idea of operator precedence parsing is to break down an expression
|
The basic idea of operator precedence parsing is to break down an expression
|
||||||
@@ -376,19 +376,19 @@ and parse_expr = parser
|
|||||||
</pre>
|
</pre>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
<p><tt>parse_bin_rhs</tt> is the function that parses the sequence of pairs for
|
<p><tt>Parser.parse_bin_rhs</tt> is the function that parses the sequence of
|
||||||
us. It takes a precedence and a pointer to an expression for the part that has been
|
pairs for us. It takes a precedence and a pointer to an expression for the part
|
||||||
parsed so far. Note that "x" is a perfectly valid expression: As such, "binoprhs" is
|
that has been parsed so far. Note that "x" is a perfectly valid expression: As
|
||||||
allowed to be empty, in which case it returns the expression that is passed into
|
such, "binoprhs" is allowed to be empty, in which case it returns the expression
|
||||||
it. In our example above, the code passes the expression for "a" into
|
that is passed into it. In our example above, the code passes the expression for
|
||||||
<tt>ParseBinOpRHS</tt> and the current token is "+".</p>
|
"a" into <tt>Parser.parse_bin_rhs</tt> and the current token is "+".</p>
|
||||||
|
|
||||||
<p>The precedence value passed into <tt>parse_bin_rhs</tt> indicates the <em>
|
<p>The precedence value passed into <tt>Parser.parse_bin_rhs</tt> indicates the
|
||||||
minimal operator precedence</em> that the function is allowed to eat. For
|
<em>minimal operator precedence</em> that the function is allowed to eat. For
|
||||||
example, if the current pair stream is [+, x] and <tt>parse_bin_rhs</tt> is
|
example, if the current pair stream is [+, x] and <tt>Parser.parse_bin_rhs</tt>
|
||||||
passed in a precedence of 40, it will not consume any tokens (because the
|
is passed in a precedence of 40, it will not consume any tokens (because the
|
||||||
precedence of '+' is only 20). With this in mind, <tt>parse_bin_rhs</tt> starts
|
precedence of '+' is only 20). With this in mind, <tt>Parser.parse_bin_rhs</tt>
|
||||||
with:</p>
|
starts with:</p>
|
||||||
|
|
||||||
<div class="doc_code">
|
<div class="doc_code">
|
||||||
<pre>
|
<pre>
|
||||||
@@ -497,10 +497,10 @@ context):</p>
|
|||||||
has higher precedence than the binop we are currently parsing. As such, we know
|
has higher precedence than the binop we are currently parsing. As such, we know
|
||||||
that any sequence of pairs whose operators are all higher precedence than "+"
|
that any sequence of pairs whose operators are all higher precedence than "+"
|
||||||
should be parsed together and returned as "RHS". To do this, we recursively
|
should be parsed together and returned as "RHS". To do this, we recursively
|
||||||
invoke the <tt>parse_bin_rhs</tt> function specifying "token_prec+1" as the
|
invoke the <tt>Parser.parse_bin_rhs</tt> function specifying "token_prec+1" as
|
||||||
minimum precedence required for it to continue. In our example above, this will
|
the minimum precedence required for it to continue. In our example above, this
|
||||||
cause it to return the AST node for "(c+d)*e*f" as RHS, which is then set as the
|
will cause it to return the AST node for "(c+d)*e*f" as RHS, which is then set
|
||||||
RHS of the '+' expression.</p>
|
as the RHS of the '+' expression.</p>
|
||||||
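The recursion with "token_prec+1" can be sketched in a self-contained form as follows. This version works on a token list instead of a stream, handles only identifiers as primaries, and uses a simplified AST. The types, the precedence values other than '+' = 20, and the function signatures are assumptions and do not reproduce the tutorial's Parser module.

```ocaml
(* Simplified AST and token types, assumed for illustration. *)
type expr = Variable of string | Binary of char * expr * expr
type token = Ident of string | Kwd of char

(* Precedence of a token, or -1 if it is not a binary operator. *)
let precedence = function
  | Kwd '<' -> 10
  | Kwd '+' | Kwd '-' -> 20
  | Kwd '*' -> 40
  | _ -> -1

(* Primary expressions: only identifiers in this sketch. *)
let parse_primary = function
  | Ident s :: rest -> (Variable s, rest)
  | _ -> failwith "expected identifier"

(* Consume [binop, primary] pairs whose operator precedence is at least
   expr_prec. Returns the expression built so far and the unread tokens. *)
let rec parse_bin_rhs expr_prec lhs tokens =
  match tokens with
  | (Kwd op as tok) :: rest when precedence tok >= expr_prec ->
      let token_prec = precedence tok in
      let rhs, rest = parse_primary rest in
      (* If the next operator binds tighter, let it take rhs as its lhs. *)
      let rhs, rest =
        match rest with
        | next :: _ when precedence next > token_prec ->
            parse_bin_rhs (token_prec + 1) rhs rest
        | _ -> (rhs, rest)
      in
      parse_bin_rhs expr_prec (Binary (op, lhs, rhs)) rest
  | _ -> (lhs, tokens)

let parse_expr tokens =
  let lhs, rest = parse_primary tokens in
  parse_bin_rhs 0 lhs rest
```

In this sketch, parsing the tokens for a+b*c groups b*c as the right-hand side of '+', and calling parse_bin_rhs with a minimum precedence of 40 on the pair stream [+, x] returns the left-hand side unchanged without consuming any tokens.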
|
|
||||||
<p>Finally, on the next iteration of the while loop, the "+g" piece is parsed
|
<p>Finally, on the next iteration of the while loop, the "+g" piece is parsed
|
||||||
and added to the AST. With this little bit of code (14 non-trivial lines), we
|
and added to the AST. With this little bit of code (14 non-trivial lines), we
|
||||||
@@ -705,7 +705,7 @@ course.) To build this, just compile with:</p>
|
|||||||
# Compile
|
# Compile
|
||||||
ocamlbuild toy.byte
|
ocamlbuild toy.byte
|
||||||
# Run
|
# Run
|
||||||
./toy
|
./toy.byte
|
||||||
</pre>
|
</pre>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
|