151 Commits

Author SHA1 Message Date
Charles Davis
f69a29b23a Revert "Fix the build broken by r189315." and "Move everything depending on Object/MachOFormat.h over to Support/MachO.h."
This reverts commits r189319 and r189315. r189315 broke some tests on what I
believe are big-endian platforms.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189321 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-27 05:38:30 +00:00
Charles Davis
9c3dd1b0d1 Move everything depending on Object/MachOFormat.h over to Support/MachO.h.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189315 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-27 05:00:43 +00:00
Ahmed Bougacha
7413b54c89 Add basic YAML MC CFG testcase.
Drive-by llvm-objdump cleanup (don't hardcode ToolName).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188904 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-21 16:13:25 +00:00
Ahmed Bougacha
171ac8ca17 MC CFG: Add YAML MCModule representation to enable MC CFG testing.
Like yaml ObjectFiles, this will be very useful for testing the MC CFG
implementation (mostly MCObjectDisassembler), by matching the output
with YAML, and for potential users of the MC CFG, by using it as an input.

There isn't much to the actual format, it is just a serialization of the
MCModule class. Of note:
  - Basic block references (pred/succ, ..) are represented by the BB's
    start address.
  - Just as in the MC CFG, instructions are MCInsts with a size.
  - Operands have a prefix representing the type (only register and
    immediate supported here).
  - Instruction opcodes are represented by their names; enum values aren't
    stable, enum names mostly are: usually, a change to a name would need
    lots of changes in the backend anyway.
    Same with registers.

All in all, an example is better than 1000 words, here goes:

A simple binary:

  Disassembly of section __TEXT,__text:
  _main:
  100000f9c:      48 8b 46 08             movq    8(%rsi), %rax
  100000fa0:      0f be 00                movsbl  (%rax), %eax
  100000fa3:      3b 04 25 48 00 00 00    cmpl    72, %eax
  100000faa:      0f 8c 07 00 00 00       jl      7 <.Lend>
  100000fb0:      2b 04 25 48 00 00 00    subl    72, %eax
  .Lend:
  100000fb7:      c3                      ret

And the (pretty verbose) generated YAML:

  ---
  Atoms:
    - StartAddress:    0x0000000100000F9C
      Size:            20
      Type:            Text
      Content:
        - Inst:            MOV64rm
          Size:            4
          Ops:             [ RRAX, RRSI, I1, R, I8, R ]
        - Inst:            MOVSX32rm8
          Size:            3
          Ops:             [ REAX, RRAX, I1, R, I0, R ]
        - Inst:            CMP32rm
          Size:            7
          Ops:             [ REAX, R, I1, R, I72, R ]
        - Inst:            JL_4
          Size:            6
          Ops:             [ I7 ]
    - StartAddress:    0x0000000100000FB0
      Size:            7
      Type:            Text
      Content:
        - Inst:            SUB32rm
          Size:            7
          Ops:             [ REAX, REAX, R, I1, R, I72, R ]
    - StartAddress:    0x0000000100000FB7
      Size:            1
      Type:            Text
      Content:
        - Inst:            RET
          Size:            1
          Ops:             [  ]
  Functions:
    - Name:            __text
      BasicBlocks:
        - Address:         0x0000000100000F9C
          Preds:           [  ]
          Succs:           [ 0x0000000100000FB7, 0x0000000100000FB0 ]
     <snip>
  ...
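
The operand prefixes described above can be decoded with a few lines of Python. This is only a sketch: the name `parse_operand` is hypothetical, and reading a bare `R` as "no register" is an assumption drawn from the example output, not something the patch states.

```python
def parse_operand(text):
    """Split a YAML operand such as 'RRAX' or 'I72' into (kind, value)."""
    prefix, rest = text[:1], text[1:]
    if prefix == "R":
        # Assumption: a bare 'R' means "no register", e.g. an unused index
        # or segment register in an x86 memory operand.
        return ("reg", rest or None)
    if prefix == "I":
        return ("imm", int(rest))
    raise ValueError(f"unsupported operand prefix in {text!r}")


# Operands of the MOV64rm instruction from the example above.
ops = [parse_operand(op) for op in ["RRAX", "RRSI", "I1", "R", "I8", "R"]]
```

Only the two prefixes named in the message are handled; anything else is rejected rather than guessed at.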

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188890 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-21 07:29:02 +00:00
Michael J. Spencer
081a1941b5 [Object] Split the ELF interface into 3 parts.
* ELFTypes.h contains template magic for defining types based on endianness, size, and alignment.
* ELFFile.h defines the ELFFile class which provides low level ELF specific access.
* ELFObjectFile.h contains ELFObjectFile which uses ELFFile to implement the ObjectFile interface.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188022 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-08 22:27:13 +00:00
Rafael Espindola
dd5af27a74 keep only the StringRef version of getFileOrSTDIN.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184826 91177308-0d34-0410-b5e6-96231b3b80d8
2013-06-25 05:28:34 +00:00
Bill Wendling
99cb622041 Use pointers to the MCAsmInfo and MCRegInfo.
Someone may want to do something crazy, like replace these objects if they
change or something.

No functionality change intended.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184175 91177308-0d34-0410-b5e6-96231b3b80d8
2013-06-18 07:20:20 +00:00
Rui Ueyama
4bf771b4e6 readobj: Dump PE/COFF optional records.
These records are mandatory for executables and are used by the loader.

Reviewers: rafael

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D939

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183852 91177308-0d34-0410-b5e6-96231b3b80d8
2013-06-12 19:10:33 +00:00
Kevin Enderby
54154f3bf1 Teach llvm-objdump with the -macho parser how to use the data in code table
from the LC_DATA_IN_CODE load command. When disassembling, print
the data in code formatted for the kind of data it is, and do not disassemble
those bytes.

I added the format specific functionality to the derived class MachOObjectFile
since these tables only appear in Mach-O object files. This is my first
attempt to modify the libObject stuff so if folks have better suggestions
how to fit this in or suggestions on the implementation please let me know.

rdar://11791371


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183424 91177308-0d34-0410-b5e6-96231b3b80d8
2013-06-06 17:20:50 +00:00
Rafael Espindola
6c1202c459 Handle relocations that don't point to symbols.
In ELF (as in MachO), not all relocations point to symbols. Represent this
properly by using a symbol_iterator instead of a SymbolRef. Update llvm-readobj
ELF's dumper to handle relocations without symbols.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183284 91177308-0d34-0410-b5e6-96231b3b80d8
2013-06-05 01:33:53 +00:00
NAKAMURA Takumi
d1c99b2aae llvm-objdump.cpp: Appease MSC16 x64. utostr(n++) causes internal compiler error.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182722 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-27 00:02:48 +00:00
Michael J. Spencer
c6af2432c8 Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182680 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-24 22:23:49 +00:00
Ahmed Bougacha
ef99356dfe MC: Disassembled CFG reconstruction.
This patch builds on some existing code to do CFG reconstruction from
a disassembled binary:
- MCModule represents the binary, and has a list of MCAtoms.
- MCAtom represents either disassembled instructions (MCTextAtom), or
  contiguous data (MCDataAtom), and covers a specific range of addresses.
- MCBasicBlock and MCFunction form the reconstructed CFG. An MCBB is
  backed by an MCTextAtom, and has the usual successors/predecessors.
- MCObjectDisassembler creates a module from an ObjectFile using a
  disassembler. It first builds an atom for each section. It can also
  construct the CFG, and this splits the text atoms into basic blocks.
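
The last step, splitting a text atom into basic blocks, can be illustrated with a simplified Python sketch. It is not the MCObjectDisassembler implementation: the function name and the `(address, size, branch_target)` tuples are hypothetical, and returns and indirect branches are ignored for brevity.

```python
def split_into_basic_blocks(insts):
    """Group instructions into basic blocks.

    insts: list of (address, size, branch_target_or_None), in address order.
    Returns a list of blocks, each a list of instruction addresses.
    """
    addresses = {addr for addr, _, _ in insts}
    # The first instruction always starts a block.
    leaders = {insts[0][0]} if insts else set()
    for addr, size, target in insts:
        if target is None:
            continue
        if target in addresses:
            leaders.add(target)          # a branch target starts a block
        fallthrough = addr + size
        if fallthrough in addresses:
            leaders.add(fallthrough)     # so does the instruction after a branch
    blocks = []
    for addr, _, _ in insts:
        if addr in leaders:
            blocks.append([])
        blocks[-1].append(addr)
    return blocks


# A six-instruction function with one conditional branch to its last instruction.
insts = [
    (0xF9C, 4, None),
    (0xFA0, 3, None),
    (0xFA3, 7, None),
    (0xFAA, 6, 0xFB7),
    (0xFB0, 7, None),
    (0xFB7, 1, None),
]
blocks = split_into_basic_blocks(insts)
```

Here the branch at 0xFAA yields three blocks: the entry block, the fallthrough at 0xFB0, and the target at 0xFB7.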

MCModule and MCAtom were only sketched out; MCFunction and MCBB were
implemented under the experimental "-cfg" llvm-objdump -macho option.
This cleans them up for further use; llvm-objdump -d -cfg now generates
graphviz files for each function found in the binary.

In the future, MCObjectDisassembler may be the right place to do
"intelligent" disassembly: for example, handling constant islands is just
a matter of splitting the atom, using information that may be available
in the ObjectFile. Also, better initial atom formation than just using
sections is possible using symbols (and things like Mach-O's
function_starts load command).

This brings two minor regressions in llvm-objdump -macho -cfg:
- The printing of a relocation's referenced symbol.
- An annotation on loop BBs, i.e., BBs that are their own successor.

Relocation printing is replaced by the MCSymbolizer; the basic CFG
annotation will be superseded by more related functionality.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182628 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-24 01:07:04 +00:00
Ahmed Bougacha
2c94d0faa0 Add MCSymbolizer for symbolic/annotated disassembly.
This is a basic first step towards symbolization of disassembled
instructions. This used to be done using externally provided (C API)
callbacks. This patch introduces:
- the MCSymbolizer class, that mimics the same functions that were used
  in the X86 and ARM disassemblers to symbolize immediate operands and
  to annotate loads based off PC (for things like c string literals).
- the MCExternalSymbolizer class, which implements the old C API.
- the MCRelocationInfo class, which provides a way for targets to
  translate relocations (either object::RelocationRef, or disassembler
  C API VariantKinds) to MCExprs.
- the MCObjectSymbolizer class, which does symbolization using what it
  finds in an object::ObjectFile. This makes simple symbolization (with
  no fancy relocation stuff) work for all object formats!
- x86-64 Mach-O and ELF MCRelocationInfos.
- A basic ARM Mach-O MCRelocationInfo, that provides just enough to
  support the C API VariantKinds.

Most of what works in otool (the only user of the old symbolization API
that I know of) for x86-64 symbolic disassembly (-tvV) works, namely:
- symbol references: call _foo; jmp 15 <_foo+50>
- relocations:       call _foo-_bar; call _foo-4
- __cf?string:       leaq 193(%rip), %rax ## literal pool for "hello"
Stub support is the main missing part (because libObject doesn't know,
among other things, about mach-o indirect symbols).

As for the MCSymbolizer API, instead of relying on the disassemblers
to call the tryAdding* methods, maybe this could be done automagically
using InstrInfo? For instance, even though PC-relative LEAs are used
to get the address of string literals in a typical Mach-O file, a MOV
would be used in an ELF file. And right now, the explicit symbolization
only recognizes PC-relative LEAs. InstrInfo should already have
most of what is needed to know what to symbolize, so this can
definitely be improved.

I'd also like to remove object::RelocationRef::getValueString (it seems
only used by relocation printing in objdump), as simply printing the
created MCExpr is definitely enough (and cleaner than string concats).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182625 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-24 00:39:57 +00:00
Ahmed Bougacha
27a33ad5ce llvm-objdump: Initialize MCDisassembler once instead of for each section.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182054 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-16 21:28:23 +00:00
Rafael Espindola
4a971705bc Remove the MachineMove class.
It was just a less powerful and more confusing version of
MCCFIInstruction. A side effect is that, since MCCFIInstruction uses
dwarf register numbers, calls to getDwarfRegNum are pushed out, which
should allow further simplifications.

I left the MachineModuleInfo::addFrameMove interface unchanged since
this patch was already fairly big.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181680 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-13 01:16:13 +00:00
Rafael Espindola
bed93b0de1 Introduce convenience typedefs for the 4 ELF object types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181509 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-09 13:13:28 +00:00
Rafael Espindola
956ca7265c Clarify getRelocationAddress x getRelocationOffset a bit.
getRelocationAddress is for dynamic libraries and executables,
getRelocationOffset for relocatable objects.

Mark the getRelocationAddress of COFF and MachO as not implemented yet. Add a
test of ELF's. llvm-readobj -r now prints the same values as readelf -r.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180259 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-25 12:28:45 +00:00
Rafael Espindola
db5f927020 Don't read one command past the end.
Thanks to Evgeniy Stepanov for reporting this.

It might be a good idea to add a command iterator abstraction to MachO.h, but
this fixes the bug for now.
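
A command iterator of the kind suggested above might look like the following Python sketch. It is hypothetical and not the MachO.h API; it relies only on each load command starting with two 32-bit fields, `cmd` and `cmdsize`. The header of a command is read only when that command is about to be yielded, so nothing past the last command is touched.

```python
import struct


def iter_load_commands(data, offset, ncmds, little_endian=True):
    """Yield (cmd, cmdsize, offset) for each of the ncmds load commands."""
    fmt = "<II" if little_endian else ">II"
    for _ in range(ncmds):
        if offset + 8 > len(data):
            raise ValueError("load command header runs past end of buffer")
        cmd, cmdsize = struct.unpack_from(fmt, data, offset)
        if cmdsize < 8 or offset + cmdsize > len(data):
            raise ValueError("bad cmdsize")
        yield cmd, cmdsize, offset
        # Advance to the next command; it is not read unless the loop continues.
        offset += cmdsize


# Two made-up commands: an 8-byte one followed by a 12-byte one.
data = struct.pack("<II", 1, 8) + struct.pack("<II", 2, 12) + b"\0" * 4
commands = list(iter_load_commands(data, 0, 2))
```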

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179848 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-19 11:36:47 +00:00
Rafael Espindola
fd7aa38e30 At Jim Grosbach's request detemplate Object/MachO.h.
We are still able to handle mixed endian objects by swapping one struct at a
time.
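
As a rough illustration of handling both byte orders without templates, the Python sketch below picks the byte order from the magic number and then decodes one struct with it. The function name is hypothetical, and all header fields are treated as unsigned 32-bit values for simplicity.

```python
import struct

MH_MAGIC, MH_MAGIC_64 = 0xFEEDFACE, 0xFEEDFACF


def read_mach_header(data):
    """Return (byte_order, is_64, fields) for the Mach-O header in data."""
    for order in ("<", ">"):
        (magic,) = struct.unpack_from(order + "I", data, 0)
        if magic in (MH_MAGIC, MH_MAGIC_64):
            is_64 = magic == MH_MAGIC_64
            # magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags
            fields = struct.unpack_from(order + "7I", data, 0)
            return order, is_64, fields
    raise ValueError("not a Mach-O file")


# A made-up big-endian 32-bit header with cputype 18 and filetype 1.
header = struct.pack(">7I", MH_MAGIC, 18, 0, 1, 0, 0, 0)
order, is_64, fields = read_mach_header(header)
```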

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179778 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-18 18:08:55 +00:00
Alexey Samsonov
0eaa6f675c llvm-objdump: Don't print contents of BSS sections: it makes no sense and crashes llvm-objdump on relocated objects with large bss
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179589 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-16 10:53:11 +00:00
Rafael Espindola
da2a2372c6 Finish templating MachOObjectFile over endianness.
We are now able to handle big endian macho files in llvm-readobj. Thanks to
David Fang for providing the object files.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179440 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-13 01:45:40 +00:00
Rafael Espindola
317d3f48fd Simplify the code. No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179259 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-11 03:34:37 +00:00
Rafael Espindola
a2561a0153 Template the MachO types over endianness.
For now they are still only used as little endian.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179147 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-10 03:48:25 +00:00
Rafael Espindola
f6cfc15705 Convert MachOObjectFile to a template.
For now it is templated only on being 64 or 32 bits. I will add little/big
endian next.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179097 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-09 14:49:08 +00:00
Rafael Espindola
433611bdf3 Implement MachOObjectFile::getHeader directly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178994 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-07 19:26:57 +00:00
Rafael Espindola
6ab85a81d7 Remove LoadCommandInfo now that we always have a pointer to the command.
LoadCommandInfo was needed to keep a command and its offset in the file. Now
that we always have a pointer to the command, we don't need the offset.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178991 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-07 18:42:06 +00:00
Rafael Espindola
77638d9110 Add MachOObjectFile::LoadCommandInfo.
This avoids using MachOObject::getLoadCommandInfo.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178990 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-07 18:08:12 +00:00
Rafael Espindola
3eff318cba Remove MachOObjectFile::getObject.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178986 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-07 16:07:35 +00:00
Rafael Espindola
305b826f92 Make getObject const. Remove a const_cast.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178980 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-07 14:50:40 +00:00
Rafael Espindola
196abbffe9 Remove last use of InMemoryStruct in llvm-objdump.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178979 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-07 14:40:18 +00:00
Rafael Espindola
13d297260f Remove dead code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178977 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-07 14:30:21 +00:00
Rafael Espindola
eb721c0fbd Remove unused argument.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178976 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-07 14:25:39 +00:00
Rafael Espindola
f16c2bb320 Don't fetch pointers from an InMemoryStruct.
InMemoryStruct is extremely dangerous as it returns data from an internal
buffer when the endianness doesn't match. This should fix the tests on big
endian hosts.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178875 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 15:15:22 +00:00
Eric Christopher
99ff2ba240 Don't disassemble symbols with an unknown address or size.
Patch by Nico Rieck!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178678 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-03 18:31:23 +00:00
Shankar Easwaran
512685dacf print TLS segment
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176192 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-27 17:57:17 +00:00
Michael J. Spencer
561823009b [objdump] Add PT_PHDR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175709 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-21 02:21:29 +00:00
Michael J. Spencer
8a3a1deed8 [objdump] Print the PT_INTERP and PT_DYNAMIC correctly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175659 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-20 20:18:10 +00:00
Guy Benyei
87d0b9ed14 Add static cast to unsigned char whenever a character classification function is called with a signed char argument, in order to avoid assertions in Windows Debug configuration.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175006 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-12 21:21:59 +00:00
Michael J. Spencer
dd3aa9eab2 [objdump,readobj] Document the purpose and goals of each tool.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174439 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-05 20:27:22 +00:00
Jakub Staszak
3c8da314f1 Remove unneeded #include.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173088 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-21 21:02:47 +00:00
Chandler Carruth
90230c8466 Sort all of the includes. Several files got checked in with mis-sorted
includes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172891 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-19 08:03:47 +00:00
Michael J. Spencer
ac97f5ce48 [Object][ELF] Simplify ELFObjectFile by using ELFType.
This simplifies the usage and implementation of ELFObjectFile by using ELFType
to replace:

<endianness target_endianness, std::size_t max_alignment, bool is64Bits>

This does complicate the base ELF types as they must now use template template
parameters to partially specialize for the 32 and 64bit cases. However these
are only defined once.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172515 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-15 07:44:25 +00:00
Michael J. Spencer
27b2b1b4e5 [llvm-objdump] Emit addresses with the correct number of leading 0's.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172130 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-10 22:40:50 +00:00
Michael J. Spencer
46418797cd [objdump] Use correct format specifiers and fix C++03 variadic warning.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171651 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-06 05:23:59 +00:00
Michael J. Spencer
b2c064c695 [objdump] Add --private-headers, -p.
This currently prints the ELF program headers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171649 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-06 03:56:49 +00:00
Chandler Carruth
7f00f87767 Sort a few more #include lines in tools/... unittests/... and utils/...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171363 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-02 10:26:28 +00:00
Rafael Espindola
cef81b37c7 Add a function to get the segment name of a section.
On MachO, sections also have segment names. When a tool looking at a .o file
prints a segment name, this is what they mean. In reality, a .o has only one
anonymous segment.

This patch adds a MachO only function to fetch that segment name. I named it
getSectionFinalSegmentName since the main use for the name seems to be to inform
the linker which segment this section should go to.

The patch also changes MachOObjectFile::getSectionName to return just the
section name instead of computing SegmentName,SectionName.
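
For illustration, both names can be read straight from the start of a raw Mach-O section struct, which begins with a 16-byte section name followed by a 16-byte segment name. The Python sketch below is not the MachOObjectFile code, and the helper names are hypothetical.

```python
def section_names(section_bytes):
    """Return (section_name, segment_name) from a raw Mach-O section struct."""

    def fixed_name(raw):
        # Names are NUL-padded to 16 bytes; a full 16-byte name has no NUL.
        return raw.split(b"\0", 1)[0].decode("ascii")

    sectname = fixed_name(section_bytes[0:16])
    segname = fixed_name(section_bytes[16:32])
    return sectname, segname


# The first 32 bytes of a made-up section struct for __TEXT,__text.
raw = b"__text".ljust(16, b"\0") + b"__TEXT".ljust(16, b"\0")
names = section_names(raw)
```

Since only strings are involved, no byte swapping is needed here.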

The main difference from the previous patch is that it doesn't use
InMemoryStruct. It is extremely dangerous: if the endians match it returns
a pointer to the file buffer, if not, it returns a pointer to an internal buffer
that is overwritten in the next API call.

We should change all of this code to use
support::detail::packed_endian_specific_integral like ELF, but since these
functions only handle strings, they work with big and little endian machines
as is.

I have tested this by installing ubuntu 12.10 ppc on qemu, that is why it took
so long :-)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170838 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-21 03:47:03 +00:00
Rafael Espindola
cd7ee1ced0 Revert 170545 while I debug the ppc failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170547 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-19 14:48:05 +00:00
Rafael Espindola
f9a6bd8524 Add r170095 back.
I cannot reproduce the failures locally, so I will keep an eye on the ppc
bots. This patch does add the change to the "Disassembly of section" message,
but that is not what was failing on the bots.

Original message:

Add a function to get the segment name of a section.

On MachO, sections also have segment names. When a tool looking at a .o file
prints a segment name, this is what they mean. In reality, a .o has only one
anonymous segment.

This patch adds a MachO only function to fetch that segment name. I named it
getSectionFinalSegmentName since the main use for the name seems to be to inform
the linker which segment this section should go to.

The patch also changes MachOObjectFile::getSectionName to return just the
section name instead of computing SegmentName,SectionName.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170545 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-19 14:15:04 +00:00