This document describes techniques for writing LLVM backends that convert the LLVM representation to machine assembly code or to other languages.
In general, you want to follow the format of SPARC, X86 or PowerPC (in lib/Target). SPARC is the simplest backend, and is RISC, so if you're working on a RISC target, it is a good one to start with.
To create a static compiler (one that emits text assembly), you need to implement the following:
    RegisterTarget<MyTargetMachine> M("short_name", "Target name");
TableGen register info description - describe a class which will store the register's number in the binary encoding of the instruction (e.g., for JIT purposes).
You also need to define register classes to contain these registers, such as the integer register class and floating-point register class, so that you can allocate virtual registers to instructions from these sets, and let the target-independent register allocator automatically choose the actual architected registers.
    // class Register is defined in Target.td
    class TargetReg<string name> : Register<name> {
      let Namespace = "Target";
    }

    class IntReg<bits<5> num, string name> : TargetReg<name> {
      field bits<5> Num = num;
    }

    def R0 : IntReg<0, "%R0">;
    ...

    // class RegisterClass is defined in Target.td
    def IReg : RegisterClass<i64, 64, [R0, ... ]>;
TableGen instruction info description - break up instructions into classes. Usually the manufacturer has already done this (see the instruction manual). Define a class for each instruction category. Define each opcode as a subclass of its category, with appropriate parameters such as the fixed binary encoding of opcodes and extended opcodes, and map the register bits to the instruction bits in which they are encoded (for the JIT). Also specify how the instruction should be printed so that it can use the automatic assembly printer, e.g.:
    // class Instruction is defined in Target.td
    class Form<bits<6> opcode, dag OL, string asmstr> : Instruction {
      field bits<42> Inst;

      let Namespace = "Target";
      // opcode is 6 bits wide, so it occupies a 6-bit range of Inst
      let Inst{0-5} = opcode;
      let OperandList = OL;
      let AsmString = asmstr;
    }

    def ADD : Form<42, (ops IReg:$rD, IReg:$rA, IReg:$rB), "add $rD, $rA, $rB">;
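To make the idea of mapping an opcode and register numbers onto instruction bits more concrete, here is a minimal C++ sketch of the packing that such a description implies. It is an illustration only, not code that TableGen generates. The field layout is an assumption made for this example: the six-bit opcode in the low six bits, followed by three five-bit register fields for rD, rA and rB, with bit 0 taken to be the least significant bit. The names encodeForm and extractField are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical field layout for this sketch (not taken from any real target):
//   bits 0-5    opcode (6 bits, matching "bits<6> opcode")
//   bits 6-10   rD     (5 bits, matching "bits<5> num")
//   bits 11-15  rA
//   bits 16-20  rB
constexpr unsigned OpcodeBits = 6;
constexpr unsigned RegBits = 5;

// Pack an opcode and three register numbers into one instruction word.
std::uint64_t encodeForm(unsigned opcode, unsigned rD, unsigned rA,
                         unsigned rB) {
  assert(opcode < (1u << OpcodeBits) && "opcode must fit in 6 bits");
  assert(rD < (1u << RegBits) && rA < (1u << RegBits) &&
         rB < (1u << RegBits) && "register numbers must fit in 5 bits");
  std::uint64_t inst = 0;
  inst |= static_cast<std::uint64_t>(opcode);
  inst |= static_cast<std::uint64_t>(rD) << OpcodeBits;
  inst |= static_cast<std::uint64_t>(rA) << (OpcodeBits + RegBits);
  inst |= static_cast<std::uint64_t>(rB) << (OpcodeBits + 2 * RegBits);
  return inst;
}

// Read back a field of the given width starting at bit position lo.
unsigned extractField(std::uint64_t inst, unsigned lo, unsigned width) {
  return static_cast<unsigned>((inst >> lo) & ((1ull << width) - 1));
}
```

With this layout, encodeForm(42, 1, 2, 3) would correspond to "add %R1, %R2, %R3" for the ADD definition above, and extractField(inst, 0, 6) recovers the opcode 42. A real target would take the field positions from the manufacturer's instruction manual rather than from this made-up layout.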
For now, just take a look at lib/Target/CBackend for an example of how the C backend is written.
To actually create your backend, you need to create and modify a few files. Only the absolute minimum is discussed here. To use LLVM's target-independent code generator, you must implement additional pieces beyond this minimum.
First of all, you should create a subdirectory under lib/Target to hold all the files related to your target. If the target is called "Dummy", for example, you would create the directory lib/Target/Dummy.
In this new directory, you should put a Makefile. You can probably copy one from another target and modify it. It should at least contain the LEVEL, LIBRARYNAME and TARGET variables, and then include $(LEVEL)/Makefile.common. Be careful to give the library the correct name: it must be named LLVMDummy (see the MIPS target, for example). Alternatively, you can split the library into LLVMDummyCodeGen and LLVMDummyAsmPrinter, the latter of which should be implemented in a subdirectory below lib/Target/Dummy (see the PowerPC target, for example).
Note that these two naming schemes are hardcoded into llvm-config. Using any other naming scheme will confuse llvm-config and produce lots of (seemingly unrelated) linker errors when linking llc.
To make your target actually do something, you need to implement a subclass of TargetMachine. This implementation should typically be in the file lib/Target/Dummy/DummyTargetMachine.cpp, but any file in the lib/Target/Dummy directory will be built and should work. To use LLVM's target-independent code generator, you should create a subclass of LLVMTargetMachine. This is what all current machine backends do. To create a target from scratch, create a subclass of TargetMachine. This is what the current language backends do.
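The following self-contained C++ sketch illustrates the general shape of this step: a target class derived from a base class, plus a file-scope registration object in the style of the RegisterTarget line shown earlier. Everything in it is a simplified stand-in written for this example. The TargetMachine struct here is not LLVM's class (the real one has a much larger interface and lives in LLVM's headers), describe() and createTarget() are hypothetical names, and the registry is only a guess at the idea behind static registration, not LLVM's actual implementation.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Stand-in for LLVM's TargetMachine; the real class is far richer.
struct TargetMachine {
  virtual ~TargetMachine() = default;
  virtual std::string describe() const = 0; // hypothetical method
};

// The target from the running example.
struct DummyTargetMachine : TargetMachine {
  std::string describe() const override { return "Dummy target"; }
};

// Simplified registry: short name -> description and factory function.
using Factory = std::unique_ptr<TargetMachine> (*)();
struct RegistryEntry {
  std::string description;
  Factory create;
};

std::map<std::string, RegistryEntry> &targetRegistry() {
  // Function-local static: constructed on first use, so it is ready
  // before any registration object below touches it.
  static std::map<std::string, RegistryEntry> registry;
  return registry;
}

// Mimics the shape of RegisterTarget<T> M("short_name", "Target name").
template <typename T> struct RegisterTarget {
  RegisterTarget(const char *shortName, const char *description) {
    assert(shortName && description && "names must not be null");
    targetRegistry()[shortName] = RegistryEntry{description, &create};
  }
  static std::unique_ptr<TargetMachine> create() {
    return std::make_unique<T>();
  }
};

// File-scope object: its constructor runs before main() and records
// the target under its short name.
static RegisterTarget<DummyTargetMachine> M("dummy", "Dummy target");

// Look a target up by short name; returns nullptr if it is unknown.
std::unique_ptr<TargetMachine> createTarget(const std::string &shortName) {
  auto it = targetRegistry().find(shortName);
  if (it == targetRegistry().end())
    return nullptr;
  return it->second.create();
}
```

In this sketch, createTarget("dummy") returns a DummyTargetMachine, while an unregistered name yields a null pointer. The point of the pattern is that a tool can find the target by name without referring to its class directly, which is also why the target's library must actually be linked in for the registration object to exist.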
To get LLVM to actually build and link your target, you also need to add it to the TARGETS_TO_BUILD variable. To do this, modify the configure script so that it knows about your target when parsing the --enable-targets option. Search the configure script for TARGETS_TO_BUILD, add your target to the lists there (some creativity required), and then reconfigure. Alternatively, you can change autoconf/configure.ac and regenerate configure by running ./autoconf/AutoRegen.sh.