llvm-6502/lib/Target/ARM/MCTargetDesc/ARMMCTargetDesc.cpp

486 lines
18 KiB
C++
Raw Normal View History

//===-- ARMMCTargetDesc.cpp - ARM Target Descriptions ---------------------===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file provides ARM specific target descriptions.
//
//===----------------------------------------------------------------------===//
#include "ARMBaseInfo.h"
#include "ARMMCAsmInfo.h"
#include "ARMMCTargetDesc.h"
#include "InstPrinter/ARMInstPrinter.h"
#include "llvm/ADT/Triple.h"
#include "llvm/MC/MCCodeGenInfo.h"
Remove some really nasty uses of hasRawTextSupport. When MC was first added, targets could use hasRawTextSupport to keep features working before they were added to the MC interface. The design goal of MC is to provide an uniform api for printing assembly and object files. Short of relaxations and other corner cases, a object file is just another representation of the assembly. It was never the intention that targets would keep doing things like if (hasRawTextSupport()) Set flags in one way. else Set flags in another way. When they do that they create two code paths and the object file is no longer just another representation of the assembly. This also then requires testing with llc -filetype=obj, which is extremelly brittle. This patch removes some of these hacks by replacing them with smaller ones. The ARM flag setting is trivial, so I just moved it to the constructor. For Mips, the patch adds two temporary hack directives that allow the assembly to represent the same things as the object file was already able to. The hope is that the mips developers will replace the hack directives with the same ones that gas uses and drop the -print-hack-directives flag. I will also try to implement a target streamer interface, so that we can move this out of the common code. In summary, for any new work, two rules of the thumb are * Don't use "llc -filetype=obj" in tests. * Don't add calls to hasRawTextSupport. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192035 91177308-0d34-0410-b5e6-96231b3b80d8
2013-10-05 16:42:21 +00:00
#include "llvm/MC/MCELFStreamer.h"
#include "llvm/MC/MCInstrAnalysis.h"
#include "llvm/MC/MCInstrInfo.h"
#include "llvm/MC/MCRegisterInfo.h"
#include "llvm/MC/MCStreamer.h"
#include "llvm/MC/MCSubtargetInfo.h"
#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/TargetRegistry.h"
using namespace llvm;
#define GET_REGINFO_MC_DESC
#include "ARMGenRegisterInfo.inc"
static bool getMCRDeprecationInfo(MCInst &MI, MCSubtargetInfo &STI,
std::string &Info) {
if (STI.getFeatureBits() & llvm::ARM::HasV7Ops &&
(MI.getOperand(0).isImm() && MI.getOperand(0).getImm() == 15) &&
(MI.getOperand(1).isImm() && MI.getOperand(1).getImm() == 0) &&
// Checks for the deprecated CP15ISB encoding:
// mcr p15, #0, rX, c7, c5, #4
(MI.getOperand(3).isImm() && MI.getOperand(3).getImm() == 7)) {
if ((MI.getOperand(5).isImm() && MI.getOperand(5).getImm() == 4)) {
if (MI.getOperand(4).isImm() && MI.getOperand(4).getImm() == 5) {
Info = "deprecated since v7, use 'isb'";
return true;
}
// Checks for the deprecated CP15DSB encoding:
// mcr p15, #0, rX, c7, c10, #4
if (MI.getOperand(4).isImm() && MI.getOperand(4).getImm() == 10) {
Info = "deprecated since v7, use 'dsb'";
return true;
}
}
// Checks for the deprecated CP15DMB encoding:
// mcr p15, #0, rX, c7, c10, #5
if (MI.getOperand(4).isImm() && MI.getOperand(4).getImm() == 10 &&
(MI.getOperand(5).isImm() && MI.getOperand(5).getImm() == 5)) {
Info = "deprecated since v7, use 'dmb'";
return true;
}
}
return false;
}
static bool getITDeprecationInfo(MCInst &MI, MCSubtargetInfo &STI,
std::string &Info) {
if (STI.getFeatureBits() & llvm::ARM::HasV8Ops && MI.getOperand(1).isImm() &&
MI.getOperand(1).getImm() != 8) {
Info = "applying IT instruction to more than one subsequent instruction is "
"deprecated";
return true;
}
return false;
}
static bool getARMStoreDeprecationInfo(MCInst &MI, MCSubtargetInfo &STI,
std::string &Info) {
assert((~STI.getFeatureBits() & llvm::ARM::ModeThumb) &&
"cannot predicate thumb instructions");
assert(MI.getNumOperands() >= 4 && "expected >= 4 arguments");
for (unsigned OI = 4, OE = MI.getNumOperands(); OI < OE; ++OI) {
assert(MI.getOperand(OI).isReg() && "expected register");
if (MI.getOperand(OI).getReg() == ARM::SP ||
MI.getOperand(OI).getReg() == ARM::PC) {
Info = "use of SP or PC in the list is deprecated";
return true;
}
}
return false;
}
static bool getARMLoadDeprecationInfo(MCInst &MI, MCSubtargetInfo &STI,
std::string &Info) {
assert((~STI.getFeatureBits() & llvm::ARM::ModeThumb) &&
"cannot predicate thumb instructions");
assert(MI.getNumOperands() >= 4 && "expected >= 4 arguments");
bool ListContainsPC = false, ListContainsLR = false;
for (unsigned OI = 4, OE = MI.getNumOperands(); OI < OE; ++OI) {
assert(MI.getOperand(OI).isReg() && "expected register");
switch (MI.getOperand(OI).getReg()) {
default:
break;
case ARM::LR:
ListContainsLR = true;
break;
case ARM::PC:
ListContainsPC = true;
break;
case ARM::SP:
Info = "use of SP in the list is deprecated";
return true;
}
}
if (ListContainsPC && ListContainsLR) {
Info = "use of LR and PC simultaneously in the list is deprecated";
return true;
}
return false;
}
#define GET_INSTRINFO_MC_DESC
#include "ARMGenInstrInfo.inc"
#define GET_SUBTARGETINFO_MC_DESC
#include "ARMGenSubtargetInfo.inc"
std::string ARM_MC::ParseARMTriple(StringRef TT, StringRef CPU) {
Triple triple(TT);
bool isThumb = triple.getArch() == Triple::thumb ||
triple.getArch() == Triple::thumbeb;
bool NoCPU = CPU == "generic" || CPU.empty();
std::string ARMArchFeature;
switch (triple.getSubArch()) {
default:
llvm_unreachable("invalid sub-architecture for ARM");
case Triple::ARMSubArch_v8:
if (NoCPU)
// v8a: FeatureDB, FeatureFPARMv8, FeatureNEON, FeatureDSPThumb2,
// FeatureMP, FeatureHWDiv, FeatureHWDivARM, FeatureTrustZone,
// FeatureT2XtPk, FeatureCrypto, FeatureCRC
ARMArchFeature = "+v8,+db,+fp-armv8,+neon,+t2dsp,+mp,+hwdiv,+hwdiv-arm,"
"+trustzone,+t2xtpk,+crypto,+crc";
else
// Use CPU to figure out the exact features
ARMArchFeature = "+v8";
break;
case Triple::ARMSubArch_v7m:
isThumb = true;
if (NoCPU)
// v7m: FeatureNoARM, FeatureDB, FeatureHWDiv, FeatureMClass
ARMArchFeature = "+v7,+noarm,+db,+hwdiv,+mclass";
else
// Use CPU to figure out the exact features.
ARMArchFeature = "+v7";
break;
case Triple::ARMSubArch_v7em:
if (NoCPU)
// v7em: FeatureNoARM, FeatureDB, FeatureHWDiv, FeatureDSPThumb2,
// FeatureT2XtPk, FeatureMClass
ARMArchFeature = "+v7,+noarm,+db,+hwdiv,+t2dsp,t2xtpk,+mclass";
else
// Use CPU to figure out the exact features.
ARMArchFeature = "+v7";
break;
case Triple::ARMSubArch_v7s:
if (NoCPU)
// v7s: FeatureNEON, FeatureDB, FeatureDSPThumb2, FeatureHasRAS
// Swift
ARMArchFeature = "+v7,+swift,+neon,+db,+t2dsp,+ras";
else
// Use CPU to figure out the exact features.
ARMArchFeature = "+v7";
break;
case Triple::ARMSubArch_v7:
// v7 CPUs have lots of different feature sets. If no CPU is specified,
// then assume v7a (e.g. cortex-a8) feature set. Otherwise, return
// the "minimum" feature set and use CPU string to figure out the exact
// features.
if (NoCPU)
// v7a: FeatureNEON, FeatureDB, FeatureDSPThumb2, FeatureT2XtPk
ARMArchFeature = "+v7,+neon,+db,+t2dsp,+t2xtpk";
else
// Use CPU to figure out the exact features.
ARMArchFeature = "+v7";
break;
case Triple::ARMSubArch_v6t2:
ARMArchFeature = "+v6t2";
break;
case Triple::ARMSubArch_v6m:
isThumb = true;
if (NoCPU)
// v6m: FeatureNoARM, FeatureMClass
ARMArchFeature = "+v6m,+noarm,+mclass";
else
ARMArchFeature = "+v6";
break;
case Triple::ARMSubArch_v6:
ARMArchFeature = "+v6";
break;
case Triple::ARMSubArch_v5te:
ARMArchFeature = "+v5te";
break;
case Triple::ARMSubArch_v5:
ARMArchFeature = "+v5t";
break;
case Triple::ARMSubArch_v4t:
ARMArchFeature = "+v4t";
break;
case Triple::NoSubArch:
break;
}
if (isThumb) {
if (ARMArchFeature.empty())
ARMArchFeature = "+thumb-mode";
else
ARMArchFeature += ",+thumb-mode";
}
if (triple.isOSNaCl()) {
if (ARMArchFeature.empty())
ARMArchFeature = "+nacl-trap";
else
ARMArchFeature += ",+nacl-trap";
}
return ARMArchFeature;
}
MCSubtargetInfo *ARM_MC::createARMMCSubtargetInfo(StringRef TT, StringRef CPU,
StringRef FS) {
std::string ArchFS = ARM_MC::ParseARMTriple(TT, CPU);
if (!FS.empty()) {
if (!ArchFS.empty())
ArchFS = ArchFS + "," + FS.str();
else
ArchFS = FS;
}
MCSubtargetInfo *X = new MCSubtargetInfo();
InitARMMCSubtargetInfo(X, TT, CPU, ArchFS);
return X;
}
static MCInstrInfo *createARMMCInstrInfo() {
MCInstrInfo *X = new MCInstrInfo();
InitARMMCInstrInfo(X);
return X;
}
static MCRegisterInfo *createARMMCRegisterInfo(StringRef Triple) {
MCRegisterInfo *X = new MCRegisterInfo();
InitARMMCRegisterInfo(X, ARM::LR, 0, 0, ARM::PC);
return X;
}
static MCAsmInfo *createARMMCAsmInfo(const MCRegisterInfo &MRI, StringRef TT) {
Triple TheTriple(TT);
MCAsmInfo *MAI;
if (TheTriple.isOSDarwin() || TheTriple.isOSBinFormatMachO())
MAI = new ARMMCAsmInfoDarwin(TT);
else if (TheTriple.isWindowsItaniumEnvironment())
MAI = new ARMCOFFMCAsmInfoGNU();
else if (TheTriple.isWindowsMSVCEnvironment())
MAI = new ARMCOFFMCAsmInfoMicrosoft();
else
MAI = new ARMELFMCAsmInfo(TT);
unsigned Reg = MRI.getDwarfRegNum(ARM::SP, true);
MAI->addInitialFrameState(MCCFIInstruction::createDefCfa(nullptr, Reg, 0));
return MAI;
}
static MCCodeGenInfo *createARMMCCodeGenInfo(StringRef TT, Reloc::Model RM,
CodeModel::Model CM,
CodeGenOpt::Level OL) {
MCCodeGenInfo *X = new MCCodeGenInfo();
if (RM == Reloc::Default) {
Triple TheTriple(TT);
// Default relocation model on Darwin is PIC, not DynamicNoPIC.
RM = TheTriple.isOSDarwin() ? Reloc::PIC_ : Reloc::DynamicNoPIC;
}
X->InitMCCodeGenInfo(RM, CM, OL);
return X;
}
// This is duplicated code. Refactor this.
static MCStreamer *createMCStreamer(const Target &T, StringRef TT,
MCContext &Ctx, MCAsmBackend &MAB,
raw_ostream &OS, MCCodeEmitter *Emitter,
const MCSubtargetInfo &STI, bool RelaxAll) {
Triple TheTriple(TT);
switch (TheTriple.getObjectFormat()) {
default: llvm_unreachable("unsupported object format");
case Triple::MachO: {
MCStreamer *S = createMachOStreamer(Ctx, MAB, OS, Emitter, false);
new ARMTargetStreamer(*S);
return S;
}
case Triple::COFF:
assert(TheTriple.isOSWindows() && "non-Windows ARM COFF is not supported");
return createARMWinCOFFStreamer(Ctx, MAB, *Emitter, OS);
case Triple::ELF:
return createARMELFStreamer(Ctx, MAB, OS, Emitter, false,
TheTriple.getArch() == Triple::thumb);
}
}
static MCInstPrinter *createARMMCInstPrinter(const Target &T,
unsigned SyntaxVariant,
const MCAsmInfo &MAI,
const MCInstrInfo &MII,
const MCRegisterInfo &MRI,
const MCSubtargetInfo &STI) {
if (SyntaxVariant == 0)
return new ARMInstPrinter(MAI, MII, MRI, STI);
return nullptr;
}
static MCRelocationInfo *createARMMCRelocationInfo(StringRef TT,
MCContext &Ctx) {
Add MCSymbolizer for symbolic/annotated disassembly. This is a basic first step towards symbolization of disassembled instructions. This used to be done using externally provided (C API) callbacks. This patch introduces: - the MCSymbolizer class, that mimics the same functions that were used in the X86 and ARM disassemblers to symbolize immediate operands and to annotate loads based off PC (for things like c string literals). - the MCExternalSymbolizer class, which implements the old C API. - the MCRelocationInfo class, which provides a way for targets to translate relocations (either object::RelocationRef, or disassembler C API VariantKinds) to MCExprs. - the MCObjectSymbolizer class, which does symbolization using what it finds in an object::ObjectFile. This makes simple symbolization (with no fancy relocation stuff) work for all object formats! - x86-64 Mach-O and ELF MCRelocationInfos. - A basic ARM Mach-O MCRelocationInfo, that provides just enough to support the C API VariantKinds. Most of what works in otool (the only user of the old symbolization API that I know of) for x86-64 symbolic disassembly (-tvV) works, namely: - symbol references: call _foo; jmp 15 <_foo+50> - relocations: call _foo-_bar; call _foo-4 - __cf?string: leaq 193(%rip), %rax ## literal pool for "hello" Stub support is the main missing part (because libObject doesn't know, among other things, about mach-o indirect symbols). As for the MCSymbolizer API, instead of relying on the disassemblers to call the tryAdding* methods, maybe this could be done automagically using InstrInfo? For instance, even though PC-relative LEAs are used to get the address of string literals in a typical Mach-O file, a MOV would be used in an ELF file. And right now, the explicit symbolization only recognizes PC-relative LEAs. InstrInfo should have already have most of what is needed to know what to symbolize, so this can definitely be improved. I'd also like to remove object::RelocationRef::getValueString (it seems only used by relocation printing in objdump), as simply printing the created MCExpr is definitely enough (and cleaner than string concats). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182625 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-24 00:39:57 +00:00
Triple TheTriple(TT);
if (TheTriple.isOSBinFormatMachO())
Add MCSymbolizer for symbolic/annotated disassembly. This is a basic first step towards symbolization of disassembled instructions. This used to be done using externally provided (C API) callbacks. This patch introduces: - the MCSymbolizer class, that mimics the same functions that were used in the X86 and ARM disassemblers to symbolize immediate operands and to annotate loads based off PC (for things like c string literals). - the MCExternalSymbolizer class, which implements the old C API. - the MCRelocationInfo class, which provides a way for targets to translate relocations (either object::RelocationRef, or disassembler C API VariantKinds) to MCExprs. - the MCObjectSymbolizer class, which does symbolization using what it finds in an object::ObjectFile. This makes simple symbolization (with no fancy relocation stuff) work for all object formats! - x86-64 Mach-O and ELF MCRelocationInfos. - A basic ARM Mach-O MCRelocationInfo, that provides just enough to support the C API VariantKinds. Most of what works in otool (the only user of the old symbolization API that I know of) for x86-64 symbolic disassembly (-tvV) works, namely: - symbol references: call _foo; jmp 15 <_foo+50> - relocations: call _foo-_bar; call _foo-4 - __cf?string: leaq 193(%rip), %rax ## literal pool for "hello" Stub support is the main missing part (because libObject doesn't know, among other things, about mach-o indirect symbols). As for the MCSymbolizer API, instead of relying on the disassemblers to call the tryAdding* methods, maybe this could be done automagically using InstrInfo? For instance, even though PC-relative LEAs are used to get the address of string literals in a typical Mach-O file, a MOV would be used in an ELF file. And right now, the explicit symbolization only recognizes PC-relative LEAs. InstrInfo should have already have most of what is needed to know what to symbolize, so this can definitely be improved. I'd also like to remove object::RelocationRef::getValueString (it seems only used by relocation printing in objdump), as simply printing the created MCExpr is definitely enough (and cleaner than string concats). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182625 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-24 00:39:57 +00:00
return createARMMachORelocationInfo(Ctx);
// Default to the stock relocation info.
return llvm::createMCRelocationInfo(TT, Ctx);
Add MCSymbolizer for symbolic/annotated disassembly. This is a basic first step towards symbolization of disassembled instructions. This used to be done using externally provided (C API) callbacks. This patch introduces: - the MCSymbolizer class, that mimics the same functions that were used in the X86 and ARM disassemblers to symbolize immediate operands and to annotate loads based off PC (for things like c string literals). - the MCExternalSymbolizer class, which implements the old C API. - the MCRelocationInfo class, which provides a way for targets to translate relocations (either object::RelocationRef, or disassembler C API VariantKinds) to MCExprs. - the MCObjectSymbolizer class, which does symbolization using what it finds in an object::ObjectFile. This makes simple symbolization (with no fancy relocation stuff) work for all object formats! - x86-64 Mach-O and ELF MCRelocationInfos. - A basic ARM Mach-O MCRelocationInfo, that provides just enough to support the C API VariantKinds. Most of what works in otool (the only user of the old symbolization API that I know of) for x86-64 symbolic disassembly (-tvV) works, namely: - symbol references: call _foo; jmp 15 <_foo+50> - relocations: call _foo-_bar; call _foo-4 - __cf?string: leaq 193(%rip), %rax ## literal pool for "hello" Stub support is the main missing part (because libObject doesn't know, among other things, about mach-o indirect symbols). As for the MCSymbolizer API, instead of relying on the disassemblers to call the tryAdding* methods, maybe this could be done automagically using InstrInfo? For instance, even though PC-relative LEAs are used to get the address of string literals in a typical Mach-O file, a MOV would be used in an ELF file. And right now, the explicit symbolization only recognizes PC-relative LEAs. InstrInfo should have already have most of what is needed to know what to symbolize, so this can definitely be improved. I'd also like to remove object::RelocationRef::getValueString (it seems only used by relocation printing in objdump), as simply printing the created MCExpr is definitely enough (and cleaner than string concats). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182625 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-24 00:39:57 +00:00
}
namespace {
class ARMMCInstrAnalysis : public MCInstrAnalysis {
public:
ARMMCInstrAnalysis(const MCInstrInfo *Info) : MCInstrAnalysis(Info) {}
bool isUnconditionalBranch(const MCInst &Inst) const override {
// BCCs with the "always" predicate are unconditional branches.
if (Inst.getOpcode() == ARM::Bcc && Inst.getOperand(1).getImm()==ARMCC::AL)
return true;
return MCInstrAnalysis::isUnconditionalBranch(Inst);
}
bool isConditionalBranch(const MCInst &Inst) const override {
// BCCs with the "always" predicate are unconditional branches.
if (Inst.getOpcode() == ARM::Bcc && Inst.getOperand(1).getImm()==ARMCC::AL)
return false;
return MCInstrAnalysis::isConditionalBranch(Inst);
}
MC: Disassembled CFG reconstruction. This patch builds on some existing code to do CFG reconstruction from a disassembled binary: - MCModule represents the binary, and has a list of MCAtoms. - MCAtom represents either disassembled instructions (MCTextAtom), or contiguous data (MCDataAtom), and covers a specific range of addresses. - MCBasicBlock and MCFunction form the reconstructed CFG. An MCBB is backed by an MCTextAtom, and has the usual successors/predecessors. - MCObjectDisassembler creates a module from an ObjectFile using a disassembler. It first builds an atom for each section. It can also construct the CFG, and this splits the text atoms into basic blocks. MCModule and MCAtom were only sketched out; MCFunction and MCBB were implemented under the experimental "-cfg" llvm-objdump -macho option. This cleans them up for further use; llvm-objdump -d -cfg now generates graphviz files for each function found in the binary. In the future, MCObjectDisassembler may be the right place to do "intelligent" disassembly: for example, handling constant islands is just a matter of splitting the atom, using information that may be available in the ObjectFile. Also, better initial atom formation than just using sections is possible using symbols (and things like Mach-O's function_starts load command). This brings two minor regressions in llvm-objdump -macho -cfg: - The printing of a relocation's referenced symbol. - An annotation on loop BBs, i.e., which are their own successor. Relocation printing is replaced by the MCSymbolizer; the basic CFG annotation will be superseded by more related functionality. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182628 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-24 01:07:04 +00:00
bool evaluateBranch(const MCInst &Inst, uint64_t Addr,
uint64_t Size, uint64_t &Target) const override {
// We only handle PCRel branches for now.
if (Info->get(Inst.getOpcode()).OpInfo[0].OperandType!=MCOI::OPERAND_PCREL)
MC: Disassembled CFG reconstruction. This patch builds on some existing code to do CFG reconstruction from a disassembled binary: - MCModule represents the binary, and has a list of MCAtoms. - MCAtom represents either disassembled instructions (MCTextAtom), or contiguous data (MCDataAtom), and covers a specific range of addresses. - MCBasicBlock and MCFunction form the reconstructed CFG. An MCBB is backed by an MCTextAtom, and has the usual successors/predecessors. - MCObjectDisassembler creates a module from an ObjectFile using a disassembler. It first builds an atom for each section. It can also construct the CFG, and this splits the text atoms into basic blocks. MCModule and MCAtom were only sketched out; MCFunction and MCBB were implemented under the experimental "-cfg" llvm-objdump -macho option. This cleans them up for further use; llvm-objdump -d -cfg now generates graphviz files for each function found in the binary. In the future, MCObjectDisassembler may be the right place to do "intelligent" disassembly: for example, handling constant islands is just a matter of splitting the atom, using information that may be available in the ObjectFile. Also, better initial atom formation than just using sections is possible using symbols (and things like Mach-O's function_starts load command). This brings two minor regressions in llvm-objdump -macho -cfg: - The printing of a relocation's referenced symbol. - An annotation on loop BBs, i.e., which are their own successor. Relocation printing is replaced by the MCSymbolizer; the basic CFG annotation will be superseded by more related functionality. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182628 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-24 01:07:04 +00:00
return false;
int64_t Imm = Inst.getOperand(0).getImm();
// FIXME: This is not right for thumb.
MC: Disassembled CFG reconstruction. This patch builds on some existing code to do CFG reconstruction from a disassembled binary: - MCModule represents the binary, and has a list of MCAtoms. - MCAtom represents either disassembled instructions (MCTextAtom), or contiguous data (MCDataAtom), and covers a specific range of addresses. - MCBasicBlock and MCFunction form the reconstructed CFG. An MCBB is backed by an MCTextAtom, and has the usual successors/predecessors. - MCObjectDisassembler creates a module from an ObjectFile using a disassembler. It first builds an atom for each section. It can also construct the CFG, and this splits the text atoms into basic blocks. MCModule and MCAtom were only sketched out; MCFunction and MCBB were implemented under the experimental "-cfg" llvm-objdump -macho option. This cleans them up for further use; llvm-objdump -d -cfg now generates graphviz files for each function found in the binary. In the future, MCObjectDisassembler may be the right place to do "intelligent" disassembly: for example, handling constant islands is just a matter of splitting the atom, using information that may be available in the ObjectFile. Also, better initial atom formation than just using sections is possible using symbols (and things like Mach-O's function_starts load command). This brings two minor regressions in llvm-objdump -macho -cfg: - The printing of a relocation's referenced symbol. - An annotation on loop BBs, i.e., which are their own successor. Relocation printing is replaced by the MCSymbolizer; the basic CFG annotation will be superseded by more related functionality. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182628 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-24 01:07:04 +00:00
Target = Addr+Imm+8; // In ARM mode the PC is always off by 8 bytes.
return true;
}
};
}
static MCInstrAnalysis *createARMMCInstrAnalysis(const MCInstrInfo *Info) {
return new ARMMCInstrAnalysis(Info);
}
// Force static initialization.
extern "C" void LLVMInitializeARMTargetMC() {
// Register the MC asm info.
RegisterMCAsmInfoFn X(TheARMLETarget, createARMMCAsmInfo);
RegisterMCAsmInfoFn Y(TheARMBETarget, createARMMCAsmInfo);
RegisterMCAsmInfoFn A(TheThumbLETarget, createARMMCAsmInfo);
RegisterMCAsmInfoFn B(TheThumbBETarget, createARMMCAsmInfo);
// Register the MC codegen info.
TargetRegistry::RegisterMCCodeGenInfo(TheARMLETarget, createARMMCCodeGenInfo);
TargetRegistry::RegisterMCCodeGenInfo(TheARMBETarget, createARMMCCodeGenInfo);
TargetRegistry::RegisterMCCodeGenInfo(TheThumbLETarget,
createARMMCCodeGenInfo);
TargetRegistry::RegisterMCCodeGenInfo(TheThumbBETarget,
createARMMCCodeGenInfo);
// Register the MC instruction info.
TargetRegistry::RegisterMCInstrInfo(TheARMLETarget, createARMMCInstrInfo);
TargetRegistry::RegisterMCInstrInfo(TheARMBETarget, createARMMCInstrInfo);
TargetRegistry::RegisterMCInstrInfo(TheThumbLETarget, createARMMCInstrInfo);
TargetRegistry::RegisterMCInstrInfo(TheThumbBETarget, createARMMCInstrInfo);
// Register the MC register info.
TargetRegistry::RegisterMCRegInfo(TheARMLETarget, createARMMCRegisterInfo);
TargetRegistry::RegisterMCRegInfo(TheARMBETarget, createARMMCRegisterInfo);
TargetRegistry::RegisterMCRegInfo(TheThumbLETarget, createARMMCRegisterInfo);
TargetRegistry::RegisterMCRegInfo(TheThumbBETarget, createARMMCRegisterInfo);
// Register the MC subtarget info.
TargetRegistry::RegisterMCSubtargetInfo(TheARMLETarget,
ARM_MC::createARMMCSubtargetInfo);
TargetRegistry::RegisterMCSubtargetInfo(TheARMBETarget,
ARM_MC::createARMMCSubtargetInfo);
TargetRegistry::RegisterMCSubtargetInfo(TheThumbLETarget,
ARM_MC::createARMMCSubtargetInfo);
TargetRegistry::RegisterMCSubtargetInfo(TheThumbBETarget,
ARM_MC::createARMMCSubtargetInfo);
// Register the MC instruction analyzer.
TargetRegistry::RegisterMCInstrAnalysis(TheARMLETarget,
createARMMCInstrAnalysis);
TargetRegistry::RegisterMCInstrAnalysis(TheARMBETarget,
createARMMCInstrAnalysis);
TargetRegistry::RegisterMCInstrAnalysis(TheThumbLETarget,
createARMMCInstrAnalysis);
TargetRegistry::RegisterMCInstrAnalysis(TheThumbBETarget,
createARMMCInstrAnalysis);
// Register the MC Code Emitter
TargetRegistry::RegisterMCCodeEmitter(TheARMLETarget,
createARMLEMCCodeEmitter);
TargetRegistry::RegisterMCCodeEmitter(TheARMBETarget,
createARMBEMCCodeEmitter);
TargetRegistry::RegisterMCCodeEmitter(TheThumbLETarget,
createARMLEMCCodeEmitter);
TargetRegistry::RegisterMCCodeEmitter(TheThumbBETarget,
createARMBEMCCodeEmitter);
// Register the asm backend.
TargetRegistry::RegisterMCAsmBackend(TheARMLETarget, createARMLEAsmBackend);
TargetRegistry::RegisterMCAsmBackend(TheARMBETarget, createARMBEAsmBackend);
TargetRegistry::RegisterMCAsmBackend(TheThumbLETarget,
createThumbLEAsmBackend);
TargetRegistry::RegisterMCAsmBackend(TheThumbBETarget,
createThumbBEAsmBackend);
// Register the object streamer.
TargetRegistry::RegisterMCObjectStreamer(TheARMLETarget, createMCStreamer);
TargetRegistry::RegisterMCObjectStreamer(TheARMBETarget, createMCStreamer);
TargetRegistry::RegisterMCObjectStreamer(TheThumbLETarget, createMCStreamer);
TargetRegistry::RegisterMCObjectStreamer(TheThumbBETarget, createMCStreamer);
// Register the asm streamer.
TargetRegistry::RegisterAsmStreamer(TheARMLETarget, createMCAsmStreamer);
TargetRegistry::RegisterAsmStreamer(TheARMBETarget, createMCAsmStreamer);
TargetRegistry::RegisterAsmStreamer(TheThumbLETarget, createMCAsmStreamer);
TargetRegistry::RegisterAsmStreamer(TheThumbBETarget, createMCAsmStreamer);
// Register the null TargetStreamer.
TargetRegistry::RegisterNullTargetStreamer(TheARMLETarget,
createARMNullTargetStreamer);
TargetRegistry::RegisterNullTargetStreamer(TheARMBETarget,
createARMNullTargetStreamer);
TargetRegistry::RegisterNullTargetStreamer(TheThumbLETarget,
createARMNullTargetStreamer);
TargetRegistry::RegisterNullTargetStreamer(TheThumbBETarget,
createARMNullTargetStreamer);
// Register the MCInstPrinter.
TargetRegistry::RegisterMCInstPrinter(TheARMLETarget, createARMMCInstPrinter);
TargetRegistry::RegisterMCInstPrinter(TheARMBETarget, createARMMCInstPrinter);
TargetRegistry::RegisterMCInstPrinter(TheThumbLETarget,
createARMMCInstPrinter);
TargetRegistry::RegisterMCInstPrinter(TheThumbBETarget,
createARMMCInstPrinter);
Add MCSymbolizer for symbolic/annotated disassembly. This is a basic first step towards symbolization of disassembled instructions. This used to be done using externally provided (C API) callbacks. This patch introduces: - the MCSymbolizer class, that mimics the same functions that were used in the X86 and ARM disassemblers to symbolize immediate operands and to annotate loads based off PC (for things like c string literals). - the MCExternalSymbolizer class, which implements the old C API. - the MCRelocationInfo class, which provides a way for targets to translate relocations (either object::RelocationRef, or disassembler C API VariantKinds) to MCExprs. - the MCObjectSymbolizer class, which does symbolization using what it finds in an object::ObjectFile. This makes simple symbolization (with no fancy relocation stuff) work for all object formats! - x86-64 Mach-O and ELF MCRelocationInfos. - A basic ARM Mach-O MCRelocationInfo, that provides just enough to support the C API VariantKinds. Most of what works in otool (the only user of the old symbolization API that I know of) for x86-64 symbolic disassembly (-tvV) works, namely: - symbol references: call _foo; jmp 15 <_foo+50> - relocations: call _foo-_bar; call _foo-4 - __cf?string: leaq 193(%rip), %rax ## literal pool for "hello" Stub support is the main missing part (because libObject doesn't know, among other things, about mach-o indirect symbols). As for the MCSymbolizer API, instead of relying on the disassemblers to call the tryAdding* methods, maybe this could be done automagically using InstrInfo? For instance, even though PC-relative LEAs are used to get the address of string literals in a typical Mach-O file, a MOV would be used in an ELF file. And right now, the explicit symbolization only recognizes PC-relative LEAs. InstrInfo should have already have most of what is needed to know what to symbolize, so this can definitely be improved. I'd also like to remove object::RelocationRef::getValueString (it seems only used by relocation printing in objdump), as simply printing the created MCExpr is definitely enough (and cleaner than string concats). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182625 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-24 00:39:57 +00:00
// Register the MC relocation info.
TargetRegistry::RegisterMCRelocationInfo(TheARMLETarget,
createARMMCRelocationInfo);
TargetRegistry::RegisterMCRelocationInfo(TheARMBETarget,
createARMMCRelocationInfo);
TargetRegistry::RegisterMCRelocationInfo(TheThumbLETarget,
createARMMCRelocationInfo);
TargetRegistry::RegisterMCRelocationInfo(TheThumbBETarget,
createARMMCRelocationInfo);
}