This patch builds on some existing code to do CFG reconstruction from
a disassembled binary:
- MCModule represents the binary, and has a list of MCAtoms.
- MCAtom represents either disassembled instructions (MCTextAtom), or
contiguous data (MCDataAtom), and covers a specific range of addresses.
- MCBasicBlock and MCFunction form the reconstructed CFG. An MCBB is
backed by an MCTextAtom, and has the usual successors/predecessors.
- MCObjectDisassembler creates a module from an ObjectFile using a
disassembler. It first builds an atom for each section. It can also
construct the CFG, and this splits the text atoms into basic blocks.
MCModule and MCAtom were only sketched out; MCFunction and MCBB were
implemented under the experimental "-cfg" llvm-objdump -macho option.
This cleans them up for further use; llvm-objdump -d -cfg now generates
graphviz files for each function found in the binary.
In the future, MCObjectDisassembler may be the right place to do
"intelligent" disassembly: for example, handling constant islands is just
a matter of splitting the atom, using information that may be available
in the ObjectFile. Also, better initial atom formation than just using
sections is possible using symbols (and things like Mach-O's
function_starts load command).
This brings two minor regressions in llvm-objdump -macho -cfg:
- The printing of a relocation's referenced symbol.
- An annotation on loop BBs, i.e., which are their own successor.
Relocation printing is replaced by the MCSymbolizer; the basic CFG
annotation will be superseded by more related functionality.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182628 91177308-0d34-0410-b5e6-96231b3b80d8
This is a basic first step towards symbolization of disassembled
instructions. This used to be done using externally provided (C API)
callbacks. This patch introduces:
- the MCSymbolizer class, that mimics the same functions that were used
in the X86 and ARM disassemblers to symbolize immediate operands and
to annotate loads based off PC (for things like c string literals).
- the MCExternalSymbolizer class, which implements the old C API.
- the MCRelocationInfo class, which provides a way for targets to
translate relocations (either object::RelocationRef, or disassembler
C API VariantKinds) to MCExprs.
- the MCObjectSymbolizer class, which does symbolization using what it
finds in an object::ObjectFile. This makes simple symbolization (with
no fancy relocation stuff) work for all object formats!
- x86-64 Mach-O and ELF MCRelocationInfos.
- A basic ARM Mach-O MCRelocationInfo, that provides just enough to
support the C API VariantKinds.
Most of what works in otool (the only user of the old symbolization API
that I know of) for x86-64 symbolic disassembly (-tvV) works, namely:
- symbol references: call _foo; jmp 15 <_foo+50>
- relocations: call _foo-_bar; call _foo-4
- __cf?string: leaq 193(%rip), %rax ## literal pool for "hello"
Stub support is the main missing part (because libObject doesn't know,
among other things, about mach-o indirect symbols).
As for the MCSymbolizer API, instead of relying on the disassemblers
to call the tryAdding* methods, maybe this could be done automagically
using InstrInfo? For instance, even though PC-relative LEAs are used
to get the address of string literals in a typical Mach-O file, a MOV
would be used in an ELF file. And right now, the explicit symbolization
only recognizes PC-relative LEAs. InstrInfo should have already have
most of what is needed to know what to symbolize, so this can
definitely be improved.
I'd also like to remove object::RelocationRef::getValueString (it seems
only used by relocation printing in objdump), as simply printing the
created MCExpr is definitely enough (and cleaner than string concats).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182625 91177308-0d34-0410-b5e6-96231b3b80d8
Solaris doesn't have an endian.h header, but SPARC is the only
big-endian architecture that runs Solaris, so just use that to detect
endianness at compile time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182419 91177308-0d34-0410-b5e6-96231b3b80d8
BitVector/SmallBitVector::reference::operator bool remain implicit since
they model more exactly a bool, rather than something else that can be
boolean tested.
The most common (non-buggy) case are where such objects are used as
return expressions in bool-returning functions or as boolean function
arguments. In those cases I've used (& added if necessary) a named
function to provide the equivalent (or sometimes negative, depending on
convenient wording) test.
One behavior change (YAMLParser) was made, though no test case is
included as I'm not sure how to reach that code path. Essentially any
comparison of llvm::yaml::document_iterators would be invalid if neither
iterator was at the end.
This helped uncover a couple of bugs in Clang - test cases provided for
those in a separate commit along with similar changes to `operator bool`
instances in Clang.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181868 91177308-0d34-0410-b5e6-96231b3b80d8
It was just a less powerful and more confusing version of
MCCFIInstruction. A side effect is that, since MCCFIInstruction uses
dwarf register numbers, calls to getDwarfRegNum are pushed out, which
should allow further simplifications.
I left the MachineModuleInfo::addFrameMove interface unchanged since
this patch was already fairly big.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181680 91177308-0d34-0410-b5e6-96231b3b80d8
- previously formatted_raw_ostream tracked columns, now it tracks lines too
- used by (upcoming) DebugIR pass to know the line number to connect to each IR
instruction
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181463 91177308-0d34-0410-b5e6-96231b3b80d8
All R_PPC_... relocs should also be present (using the same number)
under the corresponding R_PPC64_... name. The latter were missing
for a couple of cases, which this patch adds.
This is not a big problem when emitting the reloc, because we can
just use the R_PPC_... define instead. But it is a problem when
*dumping* relocations e.g. using llvm-readobj, because this will
expect only R_PPC64_... values when inspecting a ppc64 ELF file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181451 91177308-0d34-0410-b5e6-96231b3b80d8
A * (1 - (uitofp i1 C)) -> select C, 0, A
B * (uitofp i1 C) -> select C, B, 0
select C, 0, A + select C, B, 0 -> select C, B, A
These come up in code that has been hand-optimized from a select to a linear blend,
on platforms where that may have mattered. We want to undo such changes
with the following transform:
A*(1 - uitofp i1 C) + B*(uitofp i1 C) -> select C, A, B
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181216 91177308-0d34-0410-b5e6-96231b3b80d8
Add support for matching 'ordered' and 'unordered' floating point min/max
constructs.
In LLVM we can express min/max functions as a combination of compare and select.
We have support for matching such constructs for integers but not for floating
point. In floating point math there is no total order because of the presence of
'NaN'. Therefore, we have to be careful to preserve the original fcmp semantics
when interpreting floating point compare select combinations as a minimum or
maximum function. The resulting 'ordered/unordered' floating point maximum
function has to select the same value as the select/fcmp combination it is based
on.
ordered_max(x,y) = max(x,y) iff x and y are not NaN, y otherwise
unordered_max(x,y) = max(x,y) iff x and y are not NaN, x otherwise
ordered_min(x,y) = min(x,y) iff x and y are not NaN, y otherwise
unordered_min(x,y) = min(x,y) iff x and y are not NaN, x otherwise
This matches the behavior of the underlying select(fcmp(olt/ult/.., L, R), L, R)
construct.
Any code using this predicate has to preserve this semantics.
A follow-up patch will use this to implement floating point min/max reductions
in the vectorizer.
radar://13723044
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181143 91177308-0d34-0410-b5e6-96231b3b80d8
Another step towards reinstating the SystemZ backend. Tests will be
included in the main backend patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181008 91177308-0d34-0410-b5e6-96231b3b80d8
CodeModel: It's now possible to create an MCJIT instance with any CodeModel you like. Previously it was only possible to
create an MCJIT that used CodeModel::JITDefault.
EnableFastISel: It's now possible to turn on the fast instruction selector.
The CodeModel option required some trickery. The problem is that previously, we were ensuring future binary compatibility in
the MCJITCompilerOptions by mandating that the user bzero's the options struct and passes the sizeof() that he saw; the
bindings then bzero the remaining bits. This works great but assumes that the bitwise zero equivalent of any field is a
sensible default value.
But this is not the case for LLVMCodeModel, or its internal equivalent, llvm::CodeModel::Model. In both of those, the default
for a JIT is CodeModel::JITDefault (or LLVMCodeModelJITDefault), which is not bitwise zero.
Hence this change introduces LLVMInitializeMCJITCompilerOptions(), which will initialize the user's options struct with
defaults. The user will use this in the same way that they would have previously used memset() or bzero(). MCJITCAPITest.cpp
illustrates the change, as does the comment in ExecutionEngine.h.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180893 91177308-0d34-0410-b5e6-96231b3b80d8
the things, and renames it to CBindingWrapping.h. I also moved
CBindingWrapping.h into Support/.
This new file just contains the macros for defining different wrap/unwrap
methods.
The calls to those macros, as well as any custom wrap/unwrap definitions
(like for array of Values for example), are put into corresponding C++
headers.
Doing this required some #include surgery, since some .cpp files relied
on the fact that including Wrap.h implicitly caused the inclusion of a
bunch of other things.
This also now means that the C++ headers will include their corresponding
C API headers; for example Value.h must include llvm-c/Core.h. I think
this is harmless, since the C API headers contain just external function
declarations and some C types, so I don't believe there should be any
nasty dependency issues here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180881 91177308-0d34-0410-b5e6-96231b3b80d8
I will remove the isBigEndianHost function once I update clang.
The ifdef logic is designed to
* not use configure/cmake to avoid breaking -arch i686 -arch ppc.
* default to little endian
* be as small as possible
It looks like sys/endian.h is the preferred header on most modern BSD systems,
but it is better to change this in a followup patch as machine/endian.h is
available on FreeBSD, OpenBSD, NetBSD and OS X.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179527 91177308-0d34-0410-b5e6-96231b3b80d8
This will be used in clang to decide if it should create an @file or not. It
will be tested on the clang side.
Patch by Nathan Froyd.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179285 91177308-0d34-0410-b5e6-96231b3b80d8
temporarily while we work on plumbing through some changes to continue
supporting gdb on darwin.
This reverts commit r179122.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179222 91177308-0d34-0410-b5e6-96231b3b80d8
ELF with support for:
- File headers
- Section headers + data
- Relocations
- Symbols
- Unwind data (only COFF/Win64)
The output format follows a few rules:
- Values are almost always output one per line (as elf-dump/coff-dump already do). - Many values are translated to something readable (like enum names), with the raw value in parentheses.
- Hex numbers are output in uppercase, prefixed with "0x".
- Flags are sorted alphabetically.
- Lists and groups are always delimited.
Example output:
---------- snip ----------
Sections [
Section {
Index: 1
Name: .text (5)
Type: SHT_PROGBITS (0x1)
Flags [ (0x6)
SHF_ALLOC (0x2)
SHF_EXECINSTR (0x4)
]
Address: 0x0
Offset: 0x40
Size: 33
Link: 0
Info: 0
AddressAlignment: 16
EntrySize: 0
Relocations [
0x6 R_386_32 .rodata.str1.1 0x0
0xB R_386_PC32 puts 0x0
0x12 R_386_32 .rodata.str1.1 0x0
0x17 R_386_PC32 puts 0x0
]
SectionData (
0000: 83EC04C7 04240000 0000E8FC FFFFFFC7 |.....$..........|
0010: 04240600 0000E8FC FFFFFF31 C083C404 |.$.........1....|
0020: C3 |.|
)
}
]
---------- snip ----------
Relocations and symbols can be output standalone or together with the section header as displayed in the example.
This feature set supports all tests in test/MC/COFF and test/MC/ELF (and I suspect all additional tests using elf-dump), making elf-dump and coff-dump deprecated.
Patch by Nico Rieck!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178679 91177308-0d34-0410-b5e6-96231b3b80d8
requires that the return type of *r for all iterators r be reference,
where reference is defined in [iterator.requirements.general]/p11 as
iterator_traits<X>::reference, and X is the type of r.
But in CFG.h, the dereference operator of PredIterator and SuccIterator
return pointer, not reference.
Furthermore the nested type reference is value_type&, which is not the
type returned from operator*().
This patch simply makes the iterator::reference type value_type*, which
is what the operator*() returns, and then re-lables the return type as
reference.
From a functionality point of view, the only difference is that the
nested reference type is now value_type* instead of value_type&.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178240 91177308-0d34-0410-b5e6-96231b3b80d8
As far as simplify_type is concerned, there are 3 kinds of smart pointers:
* const correct: A 'const MyPtr<int> &' produces a 'const int*'. A
'MyPtr<int> &' produces a 'int *'.
* always const: Even a 'MyPtr<int> &' produces a 'const int*'.
* no const: Even a 'const MyPtr<int> &' produces a 'int*'.
This patch then does the following:
* Removes the unused specializations. Since they are unused, it is hard
to know which kind should be implemented.
* Make sure we don't drop const.
* Fix the default forwarding so that const correct pointer only need
one specialization.
* Simplifies the existing specializations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178147 91177308-0d34-0410-b5e6-96231b3b80d8
if execution failed. ExecuteAndWait returns -1 upon an execution failure, but
checking the return value isn't sufficient because the wait command may
return -1 as well. This new parameter is to be used by the clang driver in a
subsequent commit.
Part of rdar://13362359
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178087 91177308-0d34-0410-b5e6-96231b3b80d8
MCTargetDesc/PPCMCCodeEmitter.cpp current has code like:
if (isSVR4ABI() && is64BitMode())
Fixups.push_back(MCFixup::Create(0, MO.getExpr(),
(MCFixupKind)PPC::fixup_ppc_toc16));
else
Fixups.push_back(MCFixup::Create(0, MO.getExpr(),
(MCFixupKind)PPC::fixup_ppc_lo16));
This is a problem for the asm parser, since it requires knowledge of
the ABI / 64-bit mode to be set up. However, more fundamentally,
at this point we shouldn't make such distinctions anyway; in an assembler
file, it always ought to be possible to e.g. generate TOC relocations even
when the main ABI is one that doesn't use TOC.
Fortunately, this is actually completely unnecessary; that code was added
to decide whether to generate TOC relocations, but that information is in
fact already encoded in the VariantKind of the underlying symbol.
This commit therefore merges those fixup types into one, and then decides
which relocation to use based on the VariantKind.
No changes in generated code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178007 91177308-0d34-0410-b5e6-96231b3b80d8
its own library. These functions are bridging between the bitcode reader
and the ll parser which are in different libraries. Previously we didn't
have any good library to do this, and instead played fast and loose with
a "header only" set of interfaces in the Support library. This really
doesn't work well as evidenced by the recent attempt to add timing logic
to the these routines.
As part of this, make them normal functions rather than weird inline
functions, and sink the implementation into the library. Also clean up
the header to be nice and minimal.
This requires updating lots of build system dependencies to specify that
the IRReader library is needed, and several source files to not
implicitly rely upon the header file to transitively include all manner
of other headers.
If you are using IRReader.h, this commit will break you (the header
moved) and you'll need to also update your library usage to include
'irreader'. I will commit the corresponding change to Clang momentarily.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177971 91177308-0d34-0410-b5e6-96231b3b80d8
it's only really useful if you're going to crash anyways. Use it in the pretty stack trace
printer to kill the compiler if we hang while printing the stack trace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177962 91177308-0d34-0410-b5e6-96231b3b80d8
it. NetBSD/ARM and TILE-Gx are examples for platforms that have an
unusable fenv.h and this avoids the need for a blacklist.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177865 91177308-0d34-0410-b5e6-96231b3b80d8
-time-ir-parsing flag
This breaks the layering of the Support library. We can't add an
implementation side to IRReader because it refers directly to entities
only accessible as part of the IR, AsmParser, and BitcodeReader
libraries. It can only be used in a context where all of those libraries
will be available.
We'll need to find some other way to get this functionality, and
hopefully solve the long-standing layering problem of IRReader.h...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177695 91177308-0d34-0410-b5e6-96231b3b80d8
Clients of MemoryBuffer::getOpenFile expect it not to take ownership of the file
descriptor passed in. So don't.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176995 91177308-0d34-0410-b5e6-96231b3b80d8