During the course of using LLVM, you may wish to customize it for your
-research project or for experimentation. At this point, you may realize that
-you need to add something to LLVM, whether it be a new fundamental type, a new
-intrinsic function, or a whole new instruction.
-
-
When you come to this realization, stop and think. Do you really need to
-extend LLVM? Is it a new fundamental capability that LLVM does not support at
-its current incarnation or can it be synthesized from already pre-existing LLVM
-elements? If you are not sure, ask on the LLVM-dev list. The
-reason is that extending LLVM will get involved as you need to update all the
-different passes that you intend to use with your extension, and there are
-many LLVM analyses and transformations, so it may be quite a bit of
-work.
-
-
Adding an intrinsic function is far easier than
-adding an instruction, and is transparent to optimization passes. If your added
-functionality can be expressed as a
-function call, an intrinsic function is the method of choice for LLVM
-extension.
-
-
Before you invest a significant amount of effort into a non-trivial
-extension, ask on the list if what you are
-looking to do can be done with already-existing infrastructure, or if maybe
-someone else is already working on it. You will save yourself a lot of time and
-effort by doing so.
Adding a new intrinsic function to LLVM is much easier than adding a new
-instruction. Almost all extensions to LLVM should start as an intrinsic
-function and then be turned into an instruction if warranted.
-
-
-
llvm/docs/LangRef.html:
- Document the intrinsic. Decide whether it is code generator specific and
- what the restrictions are. Talk to other people about it so that you are
- sure it's a good idea.
-
-
llvm/include/llvm/Intrinsics*.td:
- Add an entry for your intrinsic. Describe its memory access characteristics
- for optimization (this controls whether it will be DCE'd, CSE'd, etc). Note
- that any intrinsic using the llvm_int_ty type for an argument will
- be deemed by tblgen as overloaded and the corresponding suffix
- will be required on the intrinsic's name.
-
-
llvm/lib/Analysis/ConstantFolding.cpp: If it is possible to
- constant fold your intrinsic, add support to it in the
- canConstantFoldCallTo and ConstantFoldCall functions.
-
-
llvm/test/Regression/*: Add test cases for your test cases to the
- test suite
-
-
-
Once the intrinsic has been added to the system, you must add code generator
-support for it. Generally you must do the following steps:
-
-
-
-
Add support to the .td file for the target(s) of your choice in
- lib/Target/*/*.td.
-
-
This is usually a matter of adding a pattern to the .td file that matches
- the intrinsic, though it may obviously require adding the instructions you
- want to generate as well. There are lots of examples in the PowerPC and X86
- backend to follow.
As with intrinsics, adding a new SelectionDAG node to LLVM is much easier
-than adding a new instruction. New nodes are often added to help represent
-instructions common to many targets. These nodes often map to an LLVM
-instruction (add, sub) or intrinsic (byteswap, population count). In other
-cases, new nodes have been added to allow many targets to perform a common task
-(converting between floating point and integer representation) or capture more
-complicated behavior in a single node (rotate).
-
-
-
include/llvm/CodeGen/ISDOpcodes.h:
- Add an enum value for the new SelectionDAG node.
-
lib/CodeGen/SelectionDAG/SelectionDAG.cpp:
- Add code to print the node to getOperationName. If your new node
- can be evaluated at compile time when given constant arguments (such as an
- add of a constant with another constant), find the getNode method
- that takes the appropriate number of arguments, and add a case for your node
- to the switch statement that performs constant folding for nodes that take
- the same number of arguments as your new node.
-
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:
- Add code to legalize,
- promote, and expand the node as necessary. At a minimum, you will need
- to add a case statement for your node in LegalizeOp which calls
- LegalizeOp on the node's operands, and returns a new node if any of the
- operands changed as a result of being legalized. It is likely that not all
- targets supported by the SelectionDAG framework will natively support the
- new node. In this case, you must also add code in your node's case
- statement in LegalizeOp to Expand your node into simpler, legal
- operations. The case for ISD::UREM for expanding a remainder into
- a divide, multiply, and a subtract is a good example.
-
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:
- If targets may support the new node being added only at certain sizes, you
- will also need to add code to your node's case statement in
- LegalizeOp to Promote your node's operands to a larger size, and
- perform the correct operation. You will also need to add code to
- PromoteOp to do this as well. For a good example, see
- ISD::BSWAP,
- which promotes its operand to a wider size, performs the byteswap, and then
- shifts the correct bytes right to emulate the narrower byteswap in the
- wider type.
-
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:
- Add a case for your node in ExpandOp to teach the legalizer how to
- perform the action represented by the new node on a value that has been
- split into high and low halves. This case will be used to support your
- node with a 64 bit operand on a 32 bit target.
-
lib/CodeGen/SelectionDAG/DAGCombiner.cpp:
- If your node can be combined with itself, or other existing nodes in a
- peephole-like fashion, add a visit function for it, and call that function
- from . There are several good examples for simple combines you
- can do; visitFABS and visitSRL are good starting places.
-
-
lib/Target/PowerPC/PPCISelLowering.cpp:
- Each target has an implementation of the TargetLowering class,
- usually in its own file (although some targets include it in the same
- file as the DAGToDAGISel). The default behavior for a target is to
- assume that your new node is legal for all types that are legal for
- that target. If this target does not natively support your node, then
- tell the target to either Promote it (if it is supported at a larger
- type) or Expand it. This will cause the code you wrote in
- LegalizeOp above to decompose your new node into other legal
- nodes for this target.
-
lib/Target/TargetSelectionDAG.td:
- Most current targets supported by LLVM generate code using the DAGToDAG
- method, where SelectionDAG nodes are pattern matched to target-specific
- nodes, which represent individual instructions. In order for the targets
- to match an instruction to your new node, you must add a def for that node
- to the list in this file, with the appropriate type constraints. Look at
- add, bswap, and fadd for examples.
-
lib/Target/PowerPC/PPCInstrInfo.td:
- Each target has a tablegen file that describes the target's instruction
- set. For targets that use the DAGToDAG instruction selection framework,
- add a pattern for your new node that uses one or more target nodes.
- Documentation for this is a bit sparse right now, but there are several
- decent examples. See the patterns for rotl in
- PPCInstrInfo.td.
-
TODO: document complex patterns.
-
llvm/test/Regression/CodeGen/*: Add test cases for your new node
- to the test suite. llvm/test/Regression/CodeGen/X86/bswap.ll is
- a good example.
WARNING: adding instructions changes the bitcode
-format, and it will take some effort to maintain compatibility with
-the previous version. Only add an instruction if it is absolutely
-necessary.
-
-
-
-
llvm/include/llvm/Instruction.def:
- add a number for your instruction and an enum name
-
-
llvm/include/llvm/Instructions.h:
- add a definition for the class that will represent your instruction
-
-
llvm/include/llvm/Support/InstVisitor.h:
- add a prototype for a visitor to your new instruction type
-
-
llvm/lib/AsmParser/Lexer.l:
- add a new token to parse your instruction from assembly text file
-
-
llvm/lib/AsmParser/llvmAsmParser.y:
- add the grammar on how your instruction can be read and what it will
- construct as a result
-
-
llvm/lib/Bitcode/Reader/Reader.cpp:
- add a case for your instruction and how it will be parsed from bitcode
-
-
llvm/lib/VMCore/Instruction.cpp:
- add a case for how your instruction will be printed out to assembly
-
-
llvm/lib/VMCore/Instructions.cpp:
- implement the class you defined in
- llvm/include/llvm/Instructions.h
-
-
Test your instruction
-
-
llvm/lib/Target/*:
- Add support for your instruction to code generators, or add a lowering
- pass.
-
-
llvm/test/Regression/*: add your test cases to the test suite.
-
-
-
-
Also, you need to implement (or modify) any analyses or passes that you want
-to understand this new instruction.
WARNING: adding new types changes the bitcode
-format, and will break compatibility with currently-existing LLVM
-installations. Only add new types if it is absolutely necessary.
-
-
-
-
-
-
-
-
- The LLVM Compiler Infrastructure
-
- Last modified: $Date$
-
-
-
-
diff --git a/docs/ExtendingLLVM.rst b/docs/ExtendingLLVM.rst
new file mode 100644
index 00000000000..e41cfd996e5
--- /dev/null
+++ b/docs/ExtendingLLVM.rst
@@ -0,0 +1,306 @@
+.. _extending_llvm:
+
+============================================================
+Extending LLVM: Adding instructions, intrinsics, types, etc.
+============================================================
+
+Introduction and Warning
+========================
+
+
+During the course of using LLVM, you may wish to customize it for your research
+project or for experimentation. At this point, you may realize that you need to
+add something to LLVM, whether it be a new fundamental type, a new intrinsic
+function, or a whole new instruction.
+
+When you come to this realization, stop and think. Do you really need to extend
+LLVM? Is it a new fundamental capability that LLVM does not support at its
+current incarnation or can it be synthesized from already pre-existing LLVM
+elements? If you are not sure, ask on the `LLVM-dev
+`_ list. The reason is that
+extending LLVM will get involved as you need to update all the different passes
+that you intend to use with your extension, and there are ``many`` LLVM analyses
+and transformations, so it may be quite a bit of work.
+
+Adding an `intrinsic function`_ is far easier than adding an
+instruction, and is transparent to optimization passes. If your added
+functionality can be expressed as a function call, an intrinsic function is the
+method of choice for LLVM extension.
+
+Before you invest a significant amount of effort into a non-trivial extension,
+**ask on the list** if what you are looking to do can be done with
+already-existing infrastructure, or if maybe someone else is already working on
+it. You will save yourself a lot of time and effort by doing so.
+
+.. _intrinsic function:
+
+Adding a new intrinsic function
+===============================
+
+Adding a new intrinsic function to LLVM is much easier than adding a new
+instruction. Almost all extensions to LLVM should start as an intrinsic
+function and then be turned into an instruction if warranted.
+
+#. ``llvm/docs/LangRef.html``:
+
+ Document the intrinsic. Decide whether it is code generator specific and
+ what the restrictions are. Talk to other people about it so that you are
+ sure it's a good idea.
+
+#. ``llvm/include/llvm/Intrinsics*.td``:
+
+ Add an entry for your intrinsic. Describe its memory access characteristics
+ for optimization (this controls whether it will be DCE'd, CSE'd, etc). Note
+ that any intrinsic using the ``llvm_int_ty`` type for an argument will
+ be deemed by ``tblgen`` as overloaded and the corresponding suffix will
+ be required on the intrinsic's name.
+
+#. ``llvm/lib/Analysis/ConstantFolding.cpp``:
+
+ If it is possible to constant fold your intrinsic, add support to it in the
+ ``canConstantFoldCallTo`` and ``ConstantFoldCall`` functions.
+
+#. ``llvm/test/Regression/*``:
+
+ Add test cases for your test cases to the test suite
+
+Once the intrinsic has been added to the system, you must add code generator
+support for it. Generally you must do the following steps:
+
+Add support to the .td file for the target(s) of your choice in
+``lib/Target/*/*.td``.
+
+ This is usually a matter of adding a pattern to the .td file that matches the
+ intrinsic, though it may obviously require adding the instructions you want to
+ generate as well. There are lots of examples in the PowerPC and X86 backend
+ to follow.
+
+Adding a new SelectionDAG node
+==============================
+
+As with intrinsics, adding a new SelectionDAG node to LLVM is much easier than
+adding a new instruction. New nodes are often added to help represent
+instructions common to many targets. These nodes often map to an LLVM
+instruction (add, sub) or intrinsic (byteswap, population count). In other
+cases, new nodes have been added to allow many targets to perform a common task
+(converting between floating point and integer representation) or capture more
+complicated behavior in a single node (rotate).
+
+#. ``include/llvm/CodeGen/ISDOpcodes.h``:
+
+ Add an enum value for the new SelectionDAG node.
+
+#. ``lib/CodeGen/SelectionDAG/SelectionDAG.cpp``:
+
+ Add code to print the node to ``getOperationName``. If your new node can be
+ evaluated at compile time when given constant arguments (such as an add of a
+ constant with another constant), find the ``getNode`` method that takes the
+ appropriate number of arguments, and add a case for your node to the switch
+ statement that performs constant folding for nodes that take the same number
+ of arguments as your new node.
+
+#. ``lib/CodeGen/SelectionDAG/LegalizeDAG.cpp``:
+
+ Add code to `legalize, promote, and expand
+ `_ the node as necessary. At a
+ minimum, you will need to add a case statement for your node in
+ ``LegalizeOp`` which calls LegalizeOp on the node's operands, and returns a
+ new node if any of the operands changed as a result of being legalized. It
+ is likely that not all targets supported by the SelectionDAG framework will
+ natively support the new node. In this case, you must also add code in your
+ node's case statement in ``LegalizeOp`` to Expand your node into simpler,
+ legal operations. The case for ``ISD::UREM`` for expanding a remainder into
+ a divide, multiply, and a subtract is a good example.
+
+#. ``lib/CodeGen/SelectionDAG/LegalizeDAG.cpp``:
+
+ If targets may support the new node being added only at certain sizes, you
+ will also need to add code to your node's case statement in ``LegalizeOp``
+ to Promote your node's operands to a larger size, and perform the correct
+ operation. You will also need to add code to ``PromoteOp`` to do this as
+ well. For a good example, see ``ISD::BSWAP``, which promotes its operand to
+ a wider size, performs the byteswap, and then shifts the correct bytes right
+ to emulate the narrower byteswap in the wider type.
+
+#. ``lib/CodeGen/SelectionDAG/LegalizeDAG.cpp``:
+
+ Add a case for your node in ``ExpandOp`` to teach the legalizer how to
+ perform the action represented by the new node on a value that has been split
+ into high and low halves. This case will be used to support your node with a
+ 64 bit operand on a 32 bit target.
+
+#. ``lib/CodeGen/SelectionDAG/DAGCombiner.cpp``:
+
+ If your node can be combined with itself, or other existing nodes in a
+ peephole-like fashion, add a visit function for it, and call that function
+ from. There are several good examples for simple combines you can do;
+ ``visitFABS`` and ``visitSRL`` are good starting places.
+
+#. ``lib/Target/PowerPC/PPCISelLowering.cpp``:
+
+ Each target has an implementation of the ``TargetLowering`` class, usually in
+ its own file (although some targets include it in the same file as the
+ DAGToDAGISel). The default behavior for a target is to assume that your new
+ node is legal for all types that are legal for that target. If this target
+ does not natively support your node, then tell the target to either Promote
+ it (if it is supported at a larger type) or Expand it. This will cause the
+ code you wrote in ``LegalizeOp`` above to decompose your new node into other
+ legal nodes for this target.
+
+#. ``lib/Target/TargetSelectionDAG.td``:
+
+ Most current targets supported by LLVM generate code using the DAGToDAG
+ method, where SelectionDAG nodes are pattern matched to target-specific
+ nodes, which represent individual instructions. In order for the targets to
+ match an instruction to your new node, you must add a def for that node to
+ the list in this file, with the appropriate type constraints. Look at
+ ``add``, ``bswap``, and ``fadd`` for examples.
+
+#. ``lib/Target/PowerPC/PPCInstrInfo.td``:
+
+ Each target has a tablegen file that describes the target's instruction set.
+ For targets that use the DAGToDAG instruction selection framework, add a
+ pattern for your new node that uses one or more target nodes. Documentation
+ for this is a bit sparse right now, but there are several decent examples.
+ See the patterns for ``rotl`` in ``PPCInstrInfo.td``.
+
+#. TODO: document complex patterns.
+
+#. ``llvm/test/Regression/CodeGen/*``:
+
+ Add test cases for your new node to the test suite.
+ ``llvm/test/Regression/CodeGen/X86/bswap.ll`` is a good example.
+
+Adding a new instruction
+========================
+
+.. warning::
+
+ Adding instructions changes the bitcode format, and it will take some effort
+ to maintain compatibility with the previous version. Only add an instruction
+ if it is absolutely necessary.
+
+#. ``llvm/include/llvm/Instruction.def``:
+
+ add a number for your instruction and an enum name
+
+#. ``llvm/include/llvm/Instructions.h``:
+
+ add a definition for the class that will represent your instruction
+
+#. ``llvm/include/llvm/Support/InstVisitor.h``:
+
+ add a prototype for a visitor to your new instruction type
+
+#. ``llvm/lib/AsmParser/Lexer.l``:
+
+ add a new token to parse your instruction from assembly text file
+
+#. ``llvm/lib/AsmParser/llvmAsmParser.y``:
+
+ add the grammar on how your instruction can be read and what it will
+ construct as a result
+
+#. ``llvm/lib/Bitcode/Reader/Reader.cpp``:
+
+ add a case for your instruction and how it will be parsed from bitcode
+
+#. ``llvm/lib/VMCore/Instruction.cpp``:
+
+ add a case for how your instruction will be printed out to assembly
+
+#. ``llvm/lib/VMCore/Instructions.cpp``:
+
+ implement the class you defined in ``llvm/include/llvm/Instructions.h``
+
+#. Test your instruction
+
+#. ``llvm/lib/Target/*``:
+
+ add support for your instruction to code generators, or add a lowering pass.
+
+#. ``llvm/test/Regression/*``:
+
+ add your test cases to the test suite.
+
+Also, you need to implement (or modify) any analyses or passes that you want to
+understand this new instruction.
+
+Adding a new type
+=================
+
+.. warning::
+
+ Adding new types changes the bitcode format, and will break compatibility with
+ currently-existing LLVM installations. Only add new types if it is absolutely
+ necessary.
+
+Adding a fundamental type
+-------------------------
+
+#. ``llvm/include/llvm/Type.h``:
+
+ add enum for the new type; add static ``Type*`` for this type
+
+#. ``llvm/lib/VMCore/Type.cpp``:
+
+ add mapping from ``TypeID`` => ``Type*``; initialize the static ``Type*``
+
+#. ``llvm/lib/AsmReader/Lexer.l``:
+
+ add ability to parse in the type from text assembly
+
+#. ``llvm/lib/AsmReader/llvmAsmParser.y``:
+
+ add a token for that type
+
+Adding a derived type
+---------------------
+
+#. ``llvm/include/llvm/Type.h``:
+
+ add enum for the new type; add a forward declaration of the type also
+
+#. ``llvm/include/llvm/DerivedTypes.h``:
+
+ add new class to represent new class in the hierarchy; add forward
+ declaration to the TypeMap value type
+
+#. ``llvm/lib/VMCore/Type.cpp``:
+
+ add support for derived type to:
+
+ .. code:: c++
+
+ std::string getTypeDescription(const Type &Ty,
+ std::vector &TypeStack)
+ bool TypesEqual(const Type *Ty, const Type *Ty2,
+ std::map &EqTypes)
+
+ add necessary member functions for type, and factory methods
+
+#. ``llvm/lib/AsmReader/Lexer.l``:
+
+ add ability to parse in the type from text assembly
+
+#. ``llvm/lib/BitCode/Writer/Writer.cpp``:
+
+ modify ``void BitcodeWriter::outputType(const Type *T)`` to serialize your
+ type
+
+#. ``llvm/lib/BitCode/Reader/Reader.cpp``:
+
+ modify ``const Type *BitcodeReader::ParseType()`` to read your data type
+
+#. ``llvm/lib/VMCore/AsmWriter.cpp``:
+
+ modify
+
+ .. code:: c++
+
+ void calcTypeName(const Type *Ty,
+ std::vector &TypeStack,
+ std::map &TypeNames,
+ std::string &Result)
+
+ to output the new derived type
diff --git a/docs/programming.rst b/docs/programming.rst
index e8acc1d2e0c..c4eec59417e 100644
--- a/docs/programming.rst
+++ b/docs/programming.rst
@@ -6,10 +6,11 @@ Programming Documentation
.. toctree::
:hidden:
+ Atomics
CodingStandards
CommandLine
CompilerWriterInfo
- Atomics
+ ExtendingLLVM
HowToSetUpLLVMStyleRTTI
* `LLVM Language Reference Manual `_
@@ -40,7 +41,7 @@ Programming Documentation
How to make ``isa<>``, ``dyn_cast<>``, etc. available for clients of your
class hierarchy.
-* `Extending LLVM `_
+* :ref:`extending_llvm`
Look here to see how to add instructions and intrinsics to LLVM.