mirror of
				https://github.com/c64scene-ar/llvm-6502.git
				synced 2025-10-30 16:17:05 +00:00 
			
		
		
		
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217034 91177308-0d34-0410-b5e6-96231b3b80d8
		
			
				
	
	
		
			389 lines
		
	
	
		
			11 KiB
		
	
	
	
		
			ReStructuredText
		
	
	
	
	
	
			
		
		
	
	
			389 lines
		
	
	
		
			11 KiB
		
	
	
	
		
			ReStructuredText
		
	
	
	
	
	
| ===========================
 | |
| TableGen Language Reference
 | |
| ===========================
 | |
| 
 | |
| .. contents::
 | |
|    :local:
 | |
| 
 | |
| .. warning::
 | |
|    This document is extremely rough. If you find something lacking, please
 | |
|    fix it, file a documentation bug, or ask about it on llvmdev.
 | |
| 
 | |
| Introduction
 | |
| ============
 | |
| 
 | |
| This document is meant to be a normative spec about the TableGen language
 | |
| in and of itself (i.e. how to understand a given construct in terms of how
 | |
| it affects the final set of records represented by the TableGen file). If
 | |
| you are unsure if this document is really what you are looking for, please
 | |
| read the :doc:`introduction to TableGen <index>` first.
 | |
| 
 | |
| Notation
 | |
| ========
 | |
| 
 | |
| The lexical and syntax notation used here is intended to imitate
 | |
| `Python's`_. In particular, for lexical definitions, the productions
 | |
| operate at the character level and there is no implied whitespace between
 | |
| elements. The syntax definitions operate at the token level, so there is
 | |
| implied whitespace between tokens.
 | |
| 
 | |
| .. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation
 | |
| 
 | |
| Lexical Analysis
 | |
| ================
 | |
| 
 | |
| TableGen supports BCPL (``// ...``) and nestable C-style (``/* ... */``)
 | |
| comments.
 | |
| 
 | |
| The following is a listing of the basic punctuation tokens::
 | |
| 
 | |
|    - + [ ] { } ( ) < > : ; .  = ? #
 | |
| 
 | |
| Numeric literals take one of the following forms:
 | |
| 
 | |
| .. TableGen actually will lex some pretty strange sequences an interpret
 | |
|    them as numbers. What is shown here is an attempt to approximate what it
 | |
|    "should" accept.
 | |
| 
 | |
| .. productionlist::
 | |
|    TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger`
 | |
|    DecimalInteger: ["+" | "-"] ("0"..."9")+
 | |
|    HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+
 | |
|    BinInteger: "0b" ("0" | "1")+
 | |
| 
 | |
| One aspect to note is that the :token:`DecimalInteger` token *includes* the
 | |
| ``+`` or ``-``, as opposed to having ``+`` and ``-`` be unary operators as
 | |
| most languages do.
 | |
| 
 | |
| Also note that :token:`BinInteger` creates a value of type ``bits<n>``
 | |
| (where ``n`` is the number of bits).  This will implicitly convert to
 | |
| integers when needed.
 | |
| 
 | |
| TableGen has identifier-like tokens:
 | |
| 
 | |
| .. productionlist::
 | |
|    ualpha: "a"..."z" | "A"..."Z" | "_"
 | |
|    TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")*
 | |
|    TokVarName: "$" `ualpha` (`ualpha` |  "0"..."9")*
 | |
| 
 | |
| Note that unlike most languages, TableGen allows :token:`TokIdentifier` to
 | |
| begin with a number. In case of ambiguity, a token will be interpreted as a
 | |
| numeric literal rather than an identifier.
 | |
| 
 | |
| TableGen also has two string-like literals:
 | |
| 
 | |
| .. productionlist::
 | |
|    TokString: '"' <non-'"' characters and C-like escapes> '"'
 | |
|    TokCodeFragment: "[{" <shortest text not containing "}]"> "}]"
 | |
| 
 | |
| :token:`TokCodeFragment` is essentially a multiline string literal
 | |
| delimited by ``[{`` and ``}]``.
 | |
| 
 | |
| .. note::
 | |
|    The current implementation accepts the following C-like escapes::
 | |
| 
 | |
|       \\ \' \" \t \n
 | |
| 
 | |
| TableGen also has the following keywords::
 | |
| 
 | |
|    bit   bits      class   code         dag
 | |
|    def   foreach   defm    field        in
 | |
|    int   let       list    multiclass   string
 | |
| 
 | |
| TableGen also has "bang operators" which have a
 | |
| wide variety of meanings:
 | |
| 
 | |
| .. productionlist::
 | |
|    BangOperator: one of
 | |
|                :!eq     !if      !head    !tail      !con
 | |
|                :!add    !shl     !sra     !srl       !and
 | |
|                :!cast   !empty   !subst   !foreach   !listconcat   !strconcat
 | |
| 
 | |
| Syntax
 | |
| ======
 | |
| 
 | |
| TableGen has an ``include`` mechanism. It does not play a role in the
 | |
| syntax per se, since it is lexically replaced with the contents of the
 | |
| included file.
 | |
| 
 | |
| .. productionlist::
 | |
|    IncludeDirective: "include" `TokString`
 | |
| 
 | |
| TableGen's top-level production consists of "objects".
 | |
| 
 | |
| .. productionlist::
 | |
|    TableGenFile: `Object`*
 | |
|    Object: `Class` | `Def` | `Defm` | `Let` | `MultiClass` | `Foreach`
 | |
| 
 | |
| ``class``\es
 | |
| ------------
 | |
| 
 | |
| .. productionlist::
 | |
|    Class: "class" `TokIdentifier` [`TemplateArgList`] `ObjectBody`
 | |
| 
 | |
| A ``class`` declaration creates a record which other records can inherit
 | |
| from. A class can be parametrized by a list of "template arguments", whose
 | |
| values can be used in the class body.
 | |
| 
 | |
| A given class can only be defined once. A ``class`` declaration is
 | |
| considered to define the class if any of the following is true:
 | |
| 
 | |
| .. break ObjectBody into its consituents so that they are present here?
 | |
| 
 | |
| #. The :token:`TemplateArgList` is present.
 | |
| #. The :token:`Body` in the :token:`ObjectBody` is present and is not empty.
 | |
| #. The :token:`BaseClassList` in the :token:`ObjectBody` is present.
 | |
| 
 | |
| You can declare an empty class by giving and empty :token:`TemplateArgList`
 | |
| and an empty :token:`ObjectBody`. This can serve as a restricted form of
 | |
| forward declaration: note that records deriving from the forward-declared
 | |
| class will inherit no fields from it since the record expansion is done
 | |
| when the record is parsed.
 | |
| 
 | |
| .. productionlist::
 | |
|    TemplateArgList: "<" `Declaration` ("," `Declaration`)* ">"
 | |
| 
 | |
| Declarations
 | |
| ------------
 | |
| 
 | |
| .. Omitting mention of arcane "field" prefix to discourage its use.
 | |
| 
 | |
| The declaration syntax is pretty much what you would expect as a C++
 | |
| programmer.
 | |
| 
 | |
| .. productionlist::
 | |
|    Declaration: `Type` `TokIdentifier` ["=" `Value`]
 | |
| 
 | |
| It assigns the value to the identifer.
 | |
| 
 | |
| Types
 | |
| -----
 | |
| 
 | |
| .. productionlist::
 | |
|    Type: "string" | "code" | "bit" | "int" | "dag"
 | |
|        :| "bits" "<" `TokInteger` ">"
 | |
|        :| "list" "<" `Type` ">"
 | |
|        :| `ClassID`
 | |
|    ClassID: `TokIdentifier`
 | |
| 
 | |
| Both ``string`` and ``code`` correspond to the string type; the difference
 | |
| is purely to indicate programmer intention.
 | |
| 
 | |
| The :token:`ClassID` must identify a class that has been previously
 | |
| declared or defined.
 | |
| 
 | |
| Values
 | |
| ------
 | |
| 
 | |
| .. productionlist::
 | |
|    Value: `SimpleValue` `ValueSuffix`*
 | |
|    ValueSuffix: "{" `RangeList` "}"
 | |
|               :| "[" `RangeList` "]"
 | |
|               :| "." `TokIdentifier`
 | |
|    RangeList: `RangePiece` ("," `RangePiece`)*
 | |
|    RangePiece: `TokInteger`
 | |
|              :| `TokInteger` "-" `TokInteger`
 | |
|              :| `TokInteger` `TokInteger`
 | |
| 
 | |
| The peculiar last form of :token:`RangePiece` is due to the fact that the
 | |
| "``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as
 | |
| two consecutive :token:`TokInteger`'s, with values ``1`` and ``-5``,
 | |
| instead of "1", "-", and "5".
 | |
| The :token:`RangeList` can be thought of as specifying "list slice" in some
 | |
| contexts.
 | |
| 
 | |
| 
 | |
| :token:`SimpleValue` has a number of forms:
 | |
| 
 | |
| 
 | |
| .. productionlist::
 | |
|    SimpleValue: `TokIdentifier`
 | |
| 
 | |
| The value will be the variable referenced by the identifier. It can be one
 | |
| of:
 | |
| 
 | |
| .. The code for this is exceptionally abstruse. These examples are a
 | |
|    best-effort attempt.
 | |
| 
 | |
| * name of a ``def``, such as the use of ``Bar`` in::
 | |
| 
 | |
|      def Bar : SomeClass {
 | |
|        int X = 5;
 | |
|      }
 | |
| 
 | |
|      def Foo {
 | |
|        SomeClass Baz = Bar;
 | |
|      }
 | |
| 
 | |
| * value local to a ``def``, such as the use of ``Bar`` in::
 | |
| 
 | |
|      def Foo {
 | |
|        int Bar = 5;
 | |
|        int Baz = Bar;
 | |
|      }
 | |
| 
 | |
| * a template arg of a ``class``, such as the use of ``Bar`` in::
 | |
| 
 | |
|      class Foo<int Bar> {
 | |
|        int Baz = Bar;
 | |
|      }
 | |
| 
 | |
| * value local to a ``multiclass``, such as the use of ``Bar`` in::
 | |
| 
 | |
|      multiclass Foo {
 | |
|        int Bar = 5;
 | |
|        int Baz = Bar;
 | |
|      }
 | |
| 
 | |
| * a template arg to a ``multiclass``, such as the use of ``Bar`` in::
 | |
| 
 | |
|      multiclass Foo<int Bar> {
 | |
|        int Baz = Bar;
 | |
|      }
 | |
| 
 | |
| .. productionlist::
 | |
|    SimpleValue: `TokInteger`
 | |
| 
 | |
| This represents the numeric value of the integer.
 | |
| 
 | |
| .. productionlist::
 | |
|    SimpleValue: `TokString`+
 | |
| 
 | |
| Multiple adjacent string literals are concatenated like in C/C++. The value
 | |
| is the concatenation of the strings.
 | |
| 
 | |
| .. productionlist::
 | |
|    SimpleValue: `TokCodeFragment`
 | |
| 
 | |
| The value is the string value of the code fragment.
 | |
| 
 | |
| .. productionlist::
 | |
|    SimpleValue: "?"
 | |
| 
 | |
| ``?`` represents an "unset" initializer.
 | |
| 
 | |
| .. productionlist::
 | |
|    SimpleValue: "{" `ValueList` "}"
 | |
|    ValueList: [`ValueListNE`]
 | |
|    ValueListNE: `Value` ("," `Value`)*
 | |
| 
 | |
| This represents a sequence of bits, as would be used to initialize a
 | |
| ``bits<n>`` field (where ``n`` is the number of bits).
 | |
| 
 | |
| .. productionlist::
 | |
|    SimpleValue: `ClassID` "<" `ValueListNE` ">"
 | |
| 
 | |
| This generates a new anonymous record definition (as would be created by an
 | |
| unnamed ``def`` inheriting from the given class with the given template
 | |
| arguments) and the value is the value of that record definition.
 | |
| 
 | |
| .. productionlist::
 | |
|    SimpleValue: "[" `ValueList` "]" ["<" `Type` ">"]
 | |
| 
 | |
| A list initializer. The optional :token:`Type` can be used to indicate a
 | |
| specific element type, otherwise the element type will be deduced from the
 | |
| given values.
 | |
| 
 | |
| .. The initial `DagArg` of the dag must start with an identifier or
 | |
|    !cast, but this is more of an implementation detail and so for now just
 | |
|    leave it out.
 | |
| 
 | |
| .. productionlist::
 | |
|    SimpleValue: "(" `DagArg` `DagArgList` ")"
 | |
|    DagArgList: `DagArg` ("," `DagArg`)*
 | |
|    DagArg: `Value` [":" `TokVarName`] | `TokVarName`
 | |
| 
 | |
| The initial :token:`DagArg` is called the "operator" of the dag.
 | |
| 
 | |
| .. productionlist::
 | |
|    SimpleValue: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")"
 | |
| 
 | |
| Bodies
 | |
| ------
 | |
| 
 | |
| .. productionlist::
 | |
|    ObjectBody: `BaseClassList` `Body`
 | |
|    BaseClassList: [":" `BaseClassListNE`]
 | |
|    BaseClassListNE: `SubClassRef` ("," `SubClassRef`)*
 | |
|    SubClassRef: (`ClassID` | `MultiClassID`) ["<" `ValueList` ">"]
 | |
|    DefmID: `TokIdentifier`
 | |
| 
 | |
| The version with the :token:`MultiClassID` is only valid in the
 | |
| :token:`BaseClassList` of a ``defm``.
 | |
| The :token:`MultiClassID` should be the name of a ``multiclass``.
 | |
| 
 | |
| .. put this somewhere else
 | |
| 
 | |
| It is after parsing the base class list that the "let stack" is applied.
 | |
| 
 | |
| .. productionlist::
 | |
|    Body: ";" | "{" BodyList "}"
 | |
|    BodyList: BodyItem*
 | |
|    BodyItem: `Declaration` ";"
 | |
|            :| "let" `TokIdentifier` [`RangeList`] "=" `Value` ";"
 | |
| 
 | |
| The ``let`` form allows overriding the value of an inherited field.
 | |
| 
 | |
| ``def``
 | |
| -------
 | |
| 
 | |
| .. TODO::
 | |
|    There can be pastes in the names here, like ``#NAME#``. Look into that
 | |
|    and document it (it boils down to ParseIDValue with IDParseMode ==
 | |
|    ParseNameMode). ParseObjectName calls into the general ParseValue, with
 | |
|    the only different from "arbitrary expression parsing" being IDParseMode
 | |
|    == Mode.
 | |
| 
 | |
| .. productionlist::
 | |
|    Def: "def" `TokIdentifier` `ObjectBody`
 | |
| 
 | |
| Defines a record whose name is given by the :token:`TokIdentifier`. The
 | |
| fields of the record are inherited from the base classes and defined in the
 | |
| body.
 | |
| 
 | |
| Special handling occurs if this ``def`` appears inside a ``multiclass`` or
 | |
| a ``foreach``.
 | |
| 
 | |
| ``defm``
 | |
| --------
 | |
| 
 | |
| .. productionlist::
 | |
|    Defm: "defm" `TokIdentifier` ":" `BaseClassListNE` ";"
 | |
| 
 | |
| Note that in the :token:`BaseClassList`, all of the ``multiclass``'s must
 | |
| precede any ``class``'s that appear.
 | |
| 
 | |
| ``foreach``
 | |
| -----------
 | |
| 
 | |
| .. productionlist::
 | |
|    Foreach: "foreach" `Declaration` "in" "{" `Object`* "}"
 | |
|           :| "foreach" `Declaration` "in" `Object`
 | |
| 
 | |
| The value assigned to the variable in the declaration is iterated over and
 | |
| the object or object list is reevaluated with the variable set at each
 | |
| iterated value.
 | |
| 
 | |
| Top-Level ``let``
 | |
| -----------------
 | |
| 
 | |
| .. productionlist::
 | |
|    Let:  "let" `LetList` "in" "{" `Object`* "}"
 | |
|       :| "let" `LetList` "in" `Object`
 | |
|    LetList: `LetItem` ("," `LetItem`)*
 | |
|    LetItem: `TokIdentifier` [`RangeList`] "=" `Value`
 | |
| 
 | |
| This is effectively equivalent to ``let`` inside the body of a record
 | |
| except that it applies to multiple records at a time. The bindings are
 | |
| applied at the end of parsing the base classes of a record.
 | |
| 
 | |
| ``multiclass``
 | |
| --------------
 | |
| 
 | |
| .. productionlist::
 | |
|    MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`]
 | |
|              : [":" `BaseMultiClassList`] "{" `MultiClassObject`+ "}"
 | |
|    BaseMultiClassList: `MultiClassID` ("," `MultiClassID`)*
 | |
|    MultiClassID: `TokIdentifier`
 | |
|    MultiClassObject: `Def` | `Defm` | `Let` | `Foreach`
 |