diff --git a/docs/ExtendingLLVM.html b/docs/ExtendingLLVM.html
deleted file mode 100644
index 99e209b8940..00000000000
--- a/docs/ExtendingLLVM.html
+++ /dev/null
@@ -1,379 +0,0 @@
- - - - - Extending LLVM: Adding instructions, intrinsics, types, etc. - - - - - -

- Extending LLVM: Adding instructions, intrinsics, types, etc. -

- -
  1. Introduction and Warning
  2. Adding a new intrinsic function
  3. Adding a new instruction
  4. Adding a new SelectionDAG node
  5. Adding a new type
    1. Adding a new fundamental type
    2. Adding a new derived type
- -
-

Written by Misha Brukman, - Brad Jones, Nate Begeman, - and Chris Lattner

-
- - -

- Introduction and Warning -

- - -
- -

During the course of using LLVM, you may wish to customize it for your -research project or for experimentation. At this point, you may realize that -you need to add something to LLVM, whether it be a new fundamental type, a new -intrinsic function, or a whole new instruction.

- -

When you come to this realization, stop and think. Do you really need to -extend LLVM? Is it a new fundamental capability that LLVM does not support at -its current incarnation or can it be synthesized from already pre-existing LLVM -elements? If you are not sure, ask on the LLVM-dev list. The -reason is that extending LLVM will get involved as you need to update all the -different passes that you intend to use with your extension, and there are -many LLVM analyses and transformations, so it may be quite a bit of -work.

- -

Adding an intrinsic function is far easier than -adding an instruction, and is transparent to optimization passes. If your added -functionality can be expressed as a -function call, an intrinsic function is the method of choice for LLVM -extension.

- -

Before you invest a significant amount of effort into a non-trivial -extension, ask on the list if what you are -looking to do can be done with already-existing infrastructure, or if maybe -someone else is already working on it. You will save yourself a lot of time and -effort by doing so.

- -
- - -

- Adding a new intrinsic function -

- - -
- -

Adding a new intrinsic function to LLVM is much easier than adding a new -instruction. Almost all extensions to LLVM should start as an intrinsic -function and then be turned into an instruction if warranted.

- -
  1. llvm/docs/LangRef.html: - Document the intrinsic. Decide whether it is code generator specific and - what the restrictions are. Talk to other people about it so that you are - sure it's a good idea.
  2. llvm/include/llvm/Intrinsics*.td: - Add an entry for your intrinsic. Describe its memory access characteristics - for optimization (this controls whether it will be DCE'd, CSE'd, etc). Note - that any intrinsic using the llvm_int_ty type for an argument will - be deemed by tblgen as overloaded and the corresponding suffix - will be required on the intrinsic's name.
  3. llvm/lib/Analysis/ConstantFolding.cpp: If it is possible to - constant fold your intrinsic, add support to it in the - canConstantFoldCallTo and ConstantFoldCall functions.
  4. llvm/test/Regression/*: Add test cases for your test cases to the - test suite
- -

Once the intrinsic has been added to the system, you must add code generator -support for it. Generally you must do the following steps:

- -
- -
Add support to the .td file for the target(s) of your choice in - lib/Target/*/*.td.
- -
This is usually a matter of adding a pattern to the .td file that matches - the intrinsic, though it may obviously require adding the instructions you - want to generate as well. There are lots of examples in the PowerPC and X86 - backend to follow.
-
- -
- - -

- Adding a new SelectionDAG node -

- - -
- -

As with intrinsics, adding a new SelectionDAG node to LLVM is much easier -than adding a new instruction. New nodes are often added to help represent -instructions common to many targets. These nodes often map to an LLVM -instruction (add, sub) or intrinsic (byteswap, population count). In other -cases, new nodes have been added to allow many targets to perform a common task -(converting between floating point and integer representation) or capture more -complicated behavior in a single node (rotate).

- -
  1. include/llvm/CodeGen/ISDOpcodes.h: - Add an enum value for the new SelectionDAG node.
  2. lib/CodeGen/SelectionDAG/SelectionDAG.cpp: - Add code to print the node to getOperationName. If your new node - can be evaluated at compile time when given constant arguments (such as an - add of a constant with another constant), find the getNode method - that takes the appropriate number of arguments, and add a case for your node - to the switch statement that performs constant folding for nodes that take - the same number of arguments as your new node.
  3. lib/CodeGen/SelectionDAG/LegalizeDAG.cpp: - Add code to legalize, - promote, and expand the node as necessary. At a minimum, you will need - to add a case statement for your node in LegalizeOp which calls - LegalizeOp on the node's operands, and returns a new node if any of the - operands changed as a result of being legalized. It is likely that not all - targets supported by the SelectionDAG framework will natively support the - new node. In this case, you must also add code in your node's case - statement in LegalizeOp to Expand your node into simpler, legal - operations. The case for ISD::UREM for expanding a remainder into - a divide, multiply, and a subtract is a good example.
  4. lib/CodeGen/SelectionDAG/LegalizeDAG.cpp: - If targets may support the new node being added only at certain sizes, you - will also need to add code to your node's case statement in - LegalizeOp to Promote your node's operands to a larger size, and - perform the correct operation. You will also need to add code to - PromoteOp to do this as well. For a good example, see - ISD::BSWAP, - which promotes its operand to a wider size, performs the byteswap, and then - shifts the correct bytes right to emulate the narrower byteswap in the - wider type.
  5. lib/CodeGen/SelectionDAG/LegalizeDAG.cpp: - Add a case for your node in ExpandOp to teach the legalizer how to - perform the action represented by the new node on a value that has been - split into high and low halves. This case will be used to support your - node with a 64 bit operand on a 32 bit target.
  6. lib/CodeGen/SelectionDAG/DAGCombiner.cpp: - If your node can be combined with itself, or other existing nodes in a - peephole-like fashion, add a visit function for it, and call that function - from . There are several good examples for simple combines you - can do; visitFABS and visitSRL are good starting places. -
  7. lib/Target/PowerPC/PPCISelLowering.cpp: - Each target has an implementation of the TargetLowering class, - usually in its own file (although some targets include it in the same - file as the DAGToDAGISel). The default behavior for a target is to - assume that your new node is legal for all types that are legal for - that target. If this target does not natively support your node, then - tell the target to either Promote it (if it is supported at a larger - type) or Expand it. This will cause the code you wrote in - LegalizeOp above to decompose your new node into other legal - nodes for this target.
  8. lib/Target/TargetSelectionDAG.td: - Most current targets supported by LLVM generate code using the DAGToDAG - method, where SelectionDAG nodes are pattern matched to target-specific - nodes, which represent individual instructions. In order for the targets - to match an instruction to your new node, you must add a def for that node - to the list in this file, with the appropriate type constraints. Look at - add, bswap, and fadd for examples.
  9. lib/Target/PowerPC/PPCInstrInfo.td: - Each target has a tablegen file that describes the target's instruction - set. For targets that use the DAGToDAG instruction selection framework, - add a pattern for your new node that uses one or more target nodes. - Documentation for this is a bit sparse right now, but there are several - decent examples. See the patterns for rotl in - PPCInstrInfo.td.
  10. TODO: document complex patterns.
  11. llvm/test/Regression/CodeGen/*: Add test cases for your new node - to the test suite. llvm/test/Regression/CodeGen/X86/bswap.ll is - a good example.
- -
- - -

- Adding a new instruction -

- - -
- -

WARNING: adding instructions changes the bitcode -format, and it will take some effort to maintain compatibility with -the previous version. Only add an instruction if it is absolutely -necessary.

- -
  1. llvm/include/llvm/Instruction.def: - add a number for your instruction and an enum name
  2. llvm/include/llvm/Instructions.h: - add a definition for the class that will represent your instruction
  3. llvm/include/llvm/Support/InstVisitor.h: - add a prototype for a visitor to your new instruction type
  4. llvm/lib/AsmParser/Lexer.l: - add a new token to parse your instruction from assembly text file
  5. llvm/lib/AsmParser/llvmAsmParser.y: - add the grammar on how your instruction can be read and what it will - construct as a result
  6. llvm/lib/Bitcode/Reader/Reader.cpp: - add a case for your instruction and how it will be parsed from bitcode
  7. llvm/lib/VMCore/Instruction.cpp: - add a case for how your instruction will be printed out to assembly
  8. llvm/lib/VMCore/Instructions.cpp: - implement the class you defined in - llvm/include/llvm/Instructions.h
  9. Test your instruction
  10. llvm/lib/Target/*: - Add support for your instruction to code generators, or add a lowering - pass.
  11. llvm/test/Regression/*: add your test cases to the test suite.
- -

Also, you need to implement (or modify) any analyses or passes that you want -to understand this new instruction.

- -
- - - -

- Adding a new type -

- - -
- -

WARNING: adding new types changes the bitcode -format, and will break compatibility with currently-existing LLVM -installations. Only add new types if it is absolutely necessary.

- - -

- Adding a fundamental type -

- -
- -
  1. llvm/include/llvm/Type.h: - add enum for the new type; add static Type* for this type
  2. llvm/lib/VMCore/Type.cpp: - add mapping from TypeID => Type*; - initialize the static Type*
  3. llvm/lib/AsmReader/Lexer.l: - add ability to parse in the type from text assembly
  4. llvm/lib/AsmReader/llvmAsmParser.y: - add a token for that type
- -
- - -

- Adding a derived type -

- -
- -
  1. llvm/include/llvm/Type.h: - add enum for the new type; add a forward declaration of the type - also
  2. llvm/include/llvm/DerivedTypes.h: - add new class to represent new class in the hierarchy; add forward - declaration to the TypeMap value type
  3. llvm/lib/VMCore/Type.cpp: - add support for derived type to: -
    -
    -std::string getTypeDescription(const Type &Ty,
    -  std::vector<const Type*> &TypeStack)
    -bool TypesEqual(const Type *Ty, const Type *Ty2,
    -  std::map<const Type*, const Type*> & EqTypes)
    -
    -
    - add necessary member functions for type, and factory methods
  4. llvm/lib/AsmReader/Lexer.l: - add ability to parse in the type from text assembly
  5. llvm/lib/BitCode/Writer/Writer.cpp: - modify void BitcodeWriter::outputType(const Type *T) to serialize - your type
  6. llvm/lib/BitCode/Reader/Reader.cpp: - modify const Type *BitcodeReader::ParseType() to read your data - type
  7. llvm/lib/VMCore/AsmWriter.cpp: - modify -
    -
    -void calcTypeName(const Type *Ty,
    -                  std::vector<const Type*> &TypeStack,
    -                  std::map<const Type*,std::string> &TypeNames,
    -                  std::string & Result)
    -
    -
    - to output the new derived type -
- -
- -
- - - -
-
- Valid CSS - Valid HTML 4.01 - - The LLVM Compiler Infrastructure -
- Last modified: $Date$ -
- - -
diff --git a/docs/ExtendingLLVM.rst b/docs/ExtendingLLVM.rst
new file mode 100644
index 00000000000..e41cfd996e5
--- /dev/null
+++ b/docs/ExtendingLLVM.rst
@@ -0,0 +1,306 @@
+.. _extending_llvm:
+
+============================================================
+Extending LLVM: Adding instructions, intrinsics, types, etc.
+============================================================
+
+Introduction and Warning
+========================
+
+
+During the course of using LLVM, you may wish to customize it for your research
+project or for experimentation. At this point, you may realize that you need to
+add something to LLVM, whether it be a new fundamental type, a new intrinsic
+function, or a whole new instruction.
+
+When you come to this realization, stop and think. Do you really need to extend
+LLVM? Is it a new fundamental capability that LLVM does not support at its
+current incarnation or can it be synthesized from already pre-existing LLVM
+elements? If you are not sure, ask on the `LLVM-dev
+`_ list. The reason is that
+extending LLVM will get involved as you need to update all the different passes
+that you intend to use with your extension, and there are ``many`` LLVM analyses
+and transformations, so it may be quite a bit of work.
+
+Adding an `intrinsic function`_ is far easier than adding an
+instruction, and is transparent to optimization passes. If your added
+functionality can be expressed as a function call, an intrinsic function is the
+method of choice for LLVM extension.
+
+Before you invest a significant amount of effort into a non-trivial extension,
+**ask on the list** if what you are looking to do can be done with
+already-existing infrastructure, or if maybe someone else is already working on
+it. You will save yourself a lot of time and effort by doing so.
+
+.. _intrinsic function:
+
+Adding a new intrinsic function
+===============================
+
+Adding a new intrinsic function to LLVM is much easier than adding a new
+instruction. Almost all extensions to LLVM should start as an intrinsic
+function and then be turned into an instruction if warranted.
+
+#. ``llvm/docs/LangRef.html``:
+
+   Document the intrinsic. Decide whether it is code generator specific and
+   what the restrictions are. Talk to other people about it so that you are
+   sure it's a good idea.
+
+#. ``llvm/include/llvm/Intrinsics*.td``:
+
+   Add an entry for your intrinsic. Describe its memory access characteristics
+   for optimization (this controls whether it will be DCE'd, CSE'd, etc). Note
+   that any intrinsic using the ``llvm_int_ty`` type for an argument will
+   be deemed by ``tblgen`` as overloaded and the corresponding suffix will
+   be required on the intrinsic's name.
+
+#. ``llvm/lib/Analysis/ConstantFolding.cpp``:
+
+   If it is possible to constant fold your intrinsic, add support to it in the
+   ``canConstantFoldCallTo`` and ``ConstantFoldCall`` functions.
+
+#. ``llvm/test/Regression/*``:
+
+   Add test cases for your test cases to the test suite
+
+Once the intrinsic has been added to the system, you must add code generator
+support for it. Generally you must do the following steps:
+
+Add support to the .td file for the target(s) of your choice in
+``lib/Target/*/*.td``.
+
+  This is usually a matter of adding a pattern to the .td file that matches the
+  intrinsic, though it may obviously require adding the instructions you want to
+  generate as well. There are lots of examples in the PowerPC and X86 backend
+  to follow.
+
+Adding a new SelectionDAG node
+==============================
+
+As with intrinsics, adding a new SelectionDAG node to LLVM is much easier than
+adding a new instruction. New nodes are often added to help represent
+instructions common to many targets. These nodes often map to an LLVM
+instruction (add, sub) or intrinsic (byteswap, population count). In other
+cases, new nodes have been added to allow many targets to perform a common task
+(converting between floating point and integer representation) or capture more
+complicated behavior in a single node (rotate).
+
+#. ``include/llvm/CodeGen/ISDOpcodes.h``:
+
+   Add an enum value for the new SelectionDAG node.
+
+#. ``lib/CodeGen/SelectionDAG/SelectionDAG.cpp``:
+
+   Add code to print the node to ``getOperationName``. If your new node can be
+   evaluated at compile time when given constant arguments (such as an add of a
+   constant with another constant), find the ``getNode`` method that takes the
+   appropriate number of arguments, and add a case for your node to the switch
+   statement that performs constant folding for nodes that take the same number
+   of arguments as your new node.
+
+#. ``lib/CodeGen/SelectionDAG/LegalizeDAG.cpp``:
+
+   Add code to `legalize, promote, and expand
+   `_ the node as necessary. At a
+   minimum, you will need to add a case statement for your node in
+   ``LegalizeOp`` which calls LegalizeOp on the node's operands, and returns a
+   new node if any of the operands changed as a result of being legalized. It
+   is likely that not all targets supported by the SelectionDAG framework will
+   natively support the new node. In this case, you must also add code in your
+   node's case statement in ``LegalizeOp`` to Expand your node into simpler,
+   legal operations. The case for ``ISD::UREM`` for expanding a remainder into
+   a divide, multiply, and a subtract is a good example.
+
+#. ``lib/CodeGen/SelectionDAG/LegalizeDAG.cpp``:
+
+   If targets may support the new node being added only at certain sizes, you
+   will also need to add code to your node's case statement in ``LegalizeOp``
+   to Promote your node's operands to a larger size, and perform the correct
+   operation. You will also need to add code to ``PromoteOp`` to do this as
+   well. For a good example, see ``ISD::BSWAP``, which promotes its operand to
+   a wider size, performs the byteswap, and then shifts the correct bytes right
+   to emulate the narrower byteswap in the wider type.
+
+#. ``lib/CodeGen/SelectionDAG/LegalizeDAG.cpp``:
+
+   Add a case for your node in ``ExpandOp`` to teach the legalizer how to
+   perform the action represented by the new node on a value that has been split
+   into high and low halves. This case will be used to support your node with a
+   64 bit operand on a 32 bit target.
+
+#. ``lib/CodeGen/SelectionDAG/DAGCombiner.cpp``:
+
+   If your node can be combined with itself, or other existing nodes in a
+   peephole-like fashion, add a visit function for it, and call that function
+   from. There are several good examples for simple combines you can do;
+   ``visitFABS`` and ``visitSRL`` are good starting places.
+
+#. ``lib/Target/PowerPC/PPCISelLowering.cpp``:
+
+   Each target has an implementation of the ``TargetLowering`` class, usually in
+   its own file (although some targets include it in the same file as the
+   DAGToDAGISel). The default behavior for a target is to assume that your new
+   node is legal for all types that are legal for that target. If this target
+   does not natively support your node, then tell the target to either Promote
+   it (if it is supported at a larger type) or Expand it. This will cause the
+   code you wrote in ``LegalizeOp`` above to decompose your new node into other
+   legal nodes for this target.
+
+#. ``lib/Target/TargetSelectionDAG.td``:
+
+   Most current targets supported by LLVM generate code using the DAGToDAG
+   method, where SelectionDAG nodes are pattern matched to target-specific
+   nodes, which represent individual instructions. In order for the targets to
+   match an instruction to your new node, you must add a def for that node to
+   the list in this file, with the appropriate type constraints. Look at
+   ``add``, ``bswap``, and ``fadd`` for examples.
+
+#. ``lib/Target/PowerPC/PPCInstrInfo.td``:
+
+   Each target has a tablegen file that describes the target's instruction set.
+   For targets that use the DAGToDAG instruction selection framework, add a
+   pattern for your new node that uses one or more target nodes. Documentation
+   for this is a bit sparse right now, but there are several decent examples.
+   See the patterns for ``rotl`` in ``PPCInstrInfo.td``.
+
+#. TODO: document complex patterns.
+
+#. ``llvm/test/Regression/CodeGen/*``:
+
+   Add test cases for your new node to the test suite.
+   ``llvm/test/Regression/CodeGen/X86/bswap.ll`` is a good example.
+
+Adding a new instruction
+========================
+
+.. warning::
+
+  Adding instructions changes the bitcode format, and it will take some effort
+  to maintain compatibility with the previous version. Only add an instruction
+  if it is absolutely necessary.
+
+#. ``llvm/include/llvm/Instruction.def``:
+
+   add a number for your instruction and an enum name
+
+#. ``llvm/include/llvm/Instructions.h``:
+
+   add a definition for the class that will represent your instruction
+
+#. ``llvm/include/llvm/Support/InstVisitor.h``:
+
+   add a prototype for a visitor to your new instruction type
+
+#. ``llvm/lib/AsmParser/Lexer.l``:
+
+   add a new token to parse your instruction from assembly text file
+
+#. ``llvm/lib/AsmParser/llvmAsmParser.y``:
+
+   add the grammar on how your instruction can be read and what it will
+   construct as a result
+
+#. ``llvm/lib/Bitcode/Reader/Reader.cpp``:
+
+   add a case for your instruction and how it will be parsed from bitcode
+
+#. ``llvm/lib/VMCore/Instruction.cpp``:
+
+   add a case for how your instruction will be printed out to assembly
+
+#. ``llvm/lib/VMCore/Instructions.cpp``:
+
+   implement the class you defined in ``llvm/include/llvm/Instructions.h``
+
+#. Test your instruction
+
+#. ``llvm/lib/Target/*``:
+
+   add support for your instruction to code generators, or add a lowering pass.
+
+#. ``llvm/test/Regression/*``:
+
+   add your test cases to the test suite.
+
+Also, you need to implement (or modify) any analyses or passes that you want to
+understand this new instruction.
+
+Adding a new type
+=================
+
+.. warning::
+
+  Adding new types changes the bitcode format, and will break compatibility with
+  currently-existing LLVM installations. Only add new types if it is absolutely
+  necessary.
+
+Adding a fundamental type
+-------------------------
+
+#. ``llvm/include/llvm/Type.h``:
+
+   add enum for the new type; add static ``Type*`` for this type
+
+#. ``llvm/lib/VMCore/Type.cpp``:
+
+   add mapping from ``TypeID`` => ``Type*``; initialize the static ``Type*``
+
+#. ``llvm/lib/AsmReader/Lexer.l``:
+
+   add ability to parse in the type from text assembly
+
+#. ``llvm/lib/AsmReader/llvmAsmParser.y``:
+
+   add a token for that type
+
+Adding a derived type
+---------------------
+
+#. ``llvm/include/llvm/Type.h``:
+
+   add enum for the new type; add a forward declaration of the type also
+
+#. ``llvm/include/llvm/DerivedTypes.h``:
+
+   add new class to represent new class in the hierarchy; add forward
+   declaration to the TypeMap value type
+
+#. ``llvm/lib/VMCore/Type.cpp``:
+
+   add support for derived type to:
+
+   .. code:: c++
+
+     std::string getTypeDescription(const Type &Ty,
+                                    std::vector<const Type*> &TypeStack)
+     bool TypesEqual(const Type *Ty, const Type *Ty2,
+                     std::map<const Type*, const Type*> &EqTypes)
+
+   add necessary member functions for type, and factory methods
+
+#. ``llvm/lib/AsmReader/Lexer.l``:
+
+   add ability to parse in the type from text assembly
+
+#. ``llvm/lib/BitCode/Writer/Writer.cpp``:
+
+   modify ``void BitcodeWriter::outputType(const Type *T)`` to serialize your
+   type
+
+#. ``llvm/lib/BitCode/Reader/Reader.cpp``:
+
+   modify ``const Type *BitcodeReader::ParseType()`` to read your data type
+
+#. ``llvm/lib/VMCore/AsmWriter.cpp``:
+
+   modify
+
+   .. code:: c++
+
+     void calcTypeName(const Type *Ty,
+                       std::vector<const Type*> &TypeStack,
+                       std::map<const Type*,std::string> &TypeNames,
+                       std::string &Result)
+
+   to output the new derived type
diff --git a/docs/programming.rst b/docs/programming.rst
index e8acc1d2e0c..c4eec59417e 100644
--- a/docs/programming.rst
+++ b/docs/programming.rst
@@ -6,10 +6,11 @@ Programming Documentation
 .. toctree::
    :hidden:
 
+   Atomics
    CodingStandards
    CommandLine
    CompilerWriterInfo
-   Atomics
+   ExtendingLLVM
    HowToSetUpLLVMStyleRTTI
 
 * `LLVM Language Reference Manual `_
@@ -40,7 +41,7 @@ Programming Documentation
 
   How to make ``isa<>``, ``dyn_cast<>``, etc. available for clients of your
   class hierarchy.
 
-* `Extending LLVM `_
+* :ref:`extending_llvm`
 
   Look here to see how to add instructions and intrinsics to LLVM.
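
Reviewer note (not part of the patch): the SelectionDAG steps in the converted document describe three legalization strategies in prose only. The sketch below illustrates the arithmetic behind each one with plain fixed-width integers. It is not LLVM code and uses no LLVM API; every name in it (`expand_urem`, `promoted_bswap16`, `expanded_add64`, `Halves`) is hypothetical, and the 64-bit add is an assumed example of an operation on split halves, since the document does not name one.

```cpp
#include <cassert>
#include <cstdint>

// Expand (hypothetical helper): an unsigned remainder rewritten as a
// divide, a multiply, and a subtract, as the text describes for ISD::UREM.
// The caller must ensure b != 0.
std::uint32_t expand_urem(std::uint32_t a, std::uint32_t b) {
    std::uint32_t q = a / b;  // divide
    return a - q * b;         // multiply, then subtract
}

// A plain 32-bit byte swap, standing in for the operation that is
// available at the wider size.
std::uint32_t bswap32(std::uint32_t x) {
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

// Promote (hypothetical helper): a 16-bit byteswap emulated in a 32-bit
// type, following the steps the text gives for ISD::BSWAP.
std::uint16_t promoted_bswap16(std::uint16_t x) {
    std::uint32_t wide = x;                 // promote by zero-extension
    std::uint32_t swapped = bswap32(wide);  // byteswap in the wider type
    // The wanted bytes now sit in the top half, so shift them right.
    return static_cast<std::uint16_t>(swapped >> 16);
}

// A 64-bit value split into 32-bit low and high halves.
struct Halves {
    std::uint32_t lo;
    std::uint32_t hi;
};

// Operating on split halves (assumed example): a 64-bit add built from
// 32-bit adds plus a carry out of the low half.
Halves expanded_add64(Halves a, Halves b) {
    Halves r;
    r.lo = a.lo + b.lo;                             // wraps modulo 2^32
    std::uint32_t carry = (r.lo < a.lo) ? 1u : 0u;  // wrapped, so carry
    r.hi = a.hi + b.hi + carry;
    return r;
}

// Small self-check comparing each rewrite with the direct computation.
void self_check() {
    assert(expand_urem(17u, 5u) == 17u % 5u);
    assert(promoted_bswap16(0x1234) == 0x3412);
    Halves s = expanded_add64({0xFFFFFFFFu, 0u}, {1u, 0u});
    assert(s.lo == 0u && s.hi == 1u);
}
```

In the real legalizer these rewrites are built from DAG nodes rather than computed directly; the sketch only shows why each decomposition gives the same result as the original operation.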