give it a bit more responsibility. Also implement it for MachO.
If hacked to use cfi, 32 bit MachO will produce
.cfi_personality 155, L___gxx_personality_v0$non_lazy_ptr
and 64 bit will produce
.cfi_presonality ___gxx_personality_v0
The general idea is that .cfi_personality gets passed the final symbol. It is
up to codegen to produce it if using indirect representation (like 32 bit
MachO), but it is up to MC to decide which relocations to create.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130341 91177308-0d34-0410-b5e6-96231b3b80d8
only check arguments with pointer types. Update the documentation
of IntrReadArgMem reflect this.
While here, add support for TBAA tags on intrinsic calls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130317 91177308-0d34-0410-b5e6-96231b3b80d8
non private symbol. This will be use for handling
foo:
.cfi_startproc
...
On OS X where we have to create a foo.eh symbol.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130305 91177308-0d34-0410-b5e6-96231b3b80d8
more callee-saved registers and introduce copies. Only allows it if scheduling
a node above calls would end up lessen register pressure.
Call operands also has added ABI restrictions for register allocation, so be
extra careful with hoisting them above calls.
rdar://9329627
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130245 91177308-0d34-0410-b5e6-96231b3b80d8
1. Only run the early (in the module pass pipe) instcombine/simplifycfg
if the "unit at a time" passes they are cleaning up after runs.
2. Move the "clean up after the unroller" pass to the very end of the
function-level pass pipeline. Loop unroll uses instsimplify now,
so it doesn't create a ton of trash. Moving instcombine later allows
it to clean up after opportunities are exposed by GVN, DSE, etc.
3. Introduce some phase ordering tests for things that are specifically
intended to be simplified by the full optimizer as a whole.
This resolves PR2338, and is progress towards PR6627, which will be
generating code that looks similar to test2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130241 91177308-0d34-0410-b5e6-96231b3b80d8
This has two effects: 1. We never inflate to a larger register class than what
the sub-target can handle. 2. Completely unconstrained virtual registers get the
largest possible register class.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130229 91177308-0d34-0410-b5e6-96231b3b80d8
The hook will be used by the register allocator when recomputing register
classes after removing constraints.
Thumb1 code doesn't allow anything larger than tGPR, and x86 needs to ensure
that the spill size doesn't change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130228 91177308-0d34-0410-b5e6-96231b3b80d8
This worked untill now because stars are aligned (i.e. num of complex address elments are always 0 or 2+ and when it is 2+ at least two elements are access together)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130225 91177308-0d34-0410-b5e6-96231b3b80d8
return it as a clobber. This allows GVN to do smart things.
Enhance GVN to be smart about the case when a small load is clobbered
by a larger overlapping load. In this case, forward the value. This
allows us to compile stuff like this:
int test(void *P) {
int tmp = *(unsigned int*)P;
return tmp+*((unsigned char*)P+1);
}
into:
_test: ## @test
movl (%rdi), %ecx
movzbl %ch, %eax
addl %ecx, %eax
ret
which has one load. We already handled the case where the smaller
load was from a must-aliased base pointer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130180 91177308-0d34-0410-b5e6-96231b3b80d8
fix bugs exposed by the gcc dejagnu testsuite:
1. The load may actually be used by a dead instruction, which
would cause an assert.
2. The load may not be used by the current chain of instructions,
and we could move it past a side-effecting instruction. Change
how we process uses to define the problem away.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130018 91177308-0d34-0410-b5e6-96231b3b80d8
These values were not used for anything. Spill size and alignment is a property
of the register class, not the register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129906 91177308-0d34-0410-b5e6-96231b3b80d8
instrument the program to emit .gcda.
TODO: we should emit slightly different .gcda files when .gcno emission is off.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129903 91177308-0d34-0410-b5e6-96231b3b80d8
On the x86-64 and thumb2 targets, some registers are more expensive to encode
than others in the same register class.
Add a CostPerUse field to the TableGen register description, and make it
available from TRI->getCostPerUse. This represents the cost of a REX prefix or a
32-bit instruction encoding required by choosing a high register.
Teach the greedy register allocator to prefer cheap registers for busy live
ranges (as indicated by spill weight).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129864 91177308-0d34-0410-b5e6-96231b3b80d8
used by Clang. To help Clang integration, the PTX target has been split
into two targets: ptx32 and ptx64, depending on the desired pointer size.
- Add GCCBuiltin class to all intrinsics
- Split PTX target into ptx32 and ptx64
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129851 91177308-0d34-0410-b5e6-96231b3b80d8
Add a avoidWriteAfterWrite() target hook to identify register classes that
suffer from write-after-write hazards. For those register classes, try to avoid
writing the same register in two consecutive instructions.
This is currently disabled by default. We should not spill to avoid hazards!
The command line flag -avoid-waw-hazard can be used to enable waw avoidance.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129772 91177308-0d34-0410-b5e6-96231b3b80d8
the generated FastISel. X86 doesn't need to generate code to match ADD16ri8
since ADD16ri will do just fine. This is a small codesize win in the generated
instruction selector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129692 91177308-0d34-0410-b5e6-96231b3b80d8
kind of predicate: one that is specific to imm nodes. The predicate function
specified here just checks an int64_t directly instead of messing around with
SDNode's. The virtue of this is that it means that fastisel and other things
can reason about these predicates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129675 91177308-0d34-0410-b5e6-96231b3b80d8
structure and fix some fixmes. We now have a TreePredicateFn class
that handles all of the decoding of these things. This is an internal
cleanup that has no impact on the code generated by tblgen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129670 91177308-0d34-0410-b5e6-96231b3b80d8
The basic issue here is that bottom-up isel is matching the branch
and compare, and was failing to fold the load into the branch/compare
combo. Fixing this (by allowing folding into any instruction of a
sequence that is selected) allows us to produce things like:
cmpb $0, 52(%rax)
je LBB4_2
instead of:
movb 52(%rax), %cl
cmpb $0, %cl
je LBB4_2
This makes the generated -O0 code run a bit faster, but also speeds up
compile time by putting less pressure on the register allocator and
generating less code.
This was one of the biggest classes of missing load folding. Implementing
this shrinks 176.gcc's c-decl.s (as a random example) by about 4% in (verbose-asm)
line count.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129656 91177308-0d34-0410-b5e6-96231b3b80d8
does. Also mostly implement it. Still a work-in-progress, but generates legal
output on crafted test cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129630 91177308-0d34-0410-b5e6-96231b3b80d8
Change ELF systems to use CFI for producing the EH tables. This reduces the
size of the clang binary in Debug builds from 690MB to 679MB.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129571 91177308-0d34-0410-b5e6-96231b3b80d8
This is done by pushing physical register definitions close to their
use, which happens to handle flag definitions if they're not glued to
the branch. This seems to be generally a good thing though, so I
didn't need to add a target hook yet.
The primary motivation is to generate code closer to what people
expect and rule out missed opportunity from enabling macro-op
fusion. As a side benefit, we get several 2-5% gains on x86
benchmarks. There is one regression:
SingleSource/Benchmarks/Shootout/lists slows down be -10%. But this is
an independent scheduler bug that will be tracked separately.
See rdar://problem/9283108.
Incidentally, pre-RA scheduling is only half the solution. Fixing the
later passes is tracked by:
<rdar://problem/8932804> [pre-RA-sched] on x86, attempt to schedule CMP/TEST adjacent with condition jump
Fixes:
<rdar://problem/9262453> Scheduler unnecessary break of cmp/jump fusion
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129508 91177308-0d34-0410-b5e6-96231b3b80d8
will allow multiple context with different loop unroll parameters to run. This is a minor change and no effect
on existing application.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129449 91177308-0d34-0410-b5e6-96231b3b80d8
Additional fixes:
Do something reasonable for subtargets with generic
itineraries by handle node latency the same as for an empty
itinerary. Now nodes default to unit latency unless an itinerary
explicitly specifies a zero cycle stage or it is a TokenFactor chain.
Original fixes:
UnitsSharePred was a source of randomness in the scheduler: node
priority depended on the queue data structure. I rewrote the recent
VRegCycle heuristics to completely replace the old heuristic without
any randomness. To make the ndoe latency adjustments work, I also
needed to do something a little more reasonable with TokenFactor. I
gave it zero latency to its consumers and always schedule it as low as
possible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129421 91177308-0d34-0410-b5e6-96231b3b80d8
Now that we have a first-class way to represent unaligned loads, the unaligned
load intrinsics are superfluous.
First part of <rdar://problem/8460511>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129401 91177308-0d34-0410-b5e6-96231b3b80d8
Add handling for tracking the relocations on symbols and resolving them.
Keep track of the relocations even after they are resolved so that if
the RuntimeDyld client moves the object, it can update the address and any
relocations to that object will be updated.
For our trival object file load/run test harness (llvm-rtdyld), this enables
relocations between functions located in the same object module. It should
be trivially extendable to load multiple objects with mutual references.
As a simple example, the following now works (running on x86_64 Darwin 10.6):
$ cat t.c
int bar() {
return 65;
}
int main() {
return bar();
}
$ clang t.c -fno-asynchronous-unwind-tables -o t.o -c
$ otool -vt t.o
t.o:
(__TEXT,__text) section
_bar:
0000000000000000 pushq %rbp
0000000000000001 movq %rsp,%rbp
0000000000000004 movl $0x00000041,%eax
0000000000000009 popq %rbp
000000000000000a ret
000000000000000b nopl 0x00(%rax,%rax)
_main:
0000000000000010 pushq %rbp
0000000000000011 movq %rsp,%rbp
0000000000000014 subq $0x10,%rsp
0000000000000018 movl $0x00000000,0xfc(%rbp)
000000000000001f callq 0x00000024
0000000000000024 addq $0x10,%rsp
0000000000000028 popq %rbp
0000000000000029 ret
$ llvm-rtdyld t.o -debug-only=dyld ; echo $?
Function sym: '_bar' @ 0
Function sym: '_main' @ 16
Extracting function: _bar from [0, 15]
allocated to 0x100153000
Extracting function: _main from [16, 41]
allocated to 0x100154000
Relocation at '_main' + 16 from '_bar(Word1: 0x2d000000)
Resolving relocation at '_main' + 16 (0x100154010) from '_bar (0x100153000)(pcrel, type: 2, Size: 4).
loaded '_main' at: 0x100154000
65
$
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129388 91177308-0d34-0410-b5e6-96231b3b80d8
Use debug info in the IR to find the directory/file:line:col. Each time that location changes, bump a counter.
Unlike the existing profiling system, we don't try to look at argv[], and thusly don't require main() to be present in the IR. This matches GCC's technique where you specify the profiling flag when producing each .o file.
The runtime library is minimal, currently just calling printf at program shutdown time. The API is designed to make it possible to emit GCOV data later on.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129340 91177308-0d34-0410-b5e6-96231b3b80d8
has some bugs. If this is interesting functionality, it should be
reimplemented in the argpromotion pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129314 91177308-0d34-0410-b5e6-96231b3b80d8
disassembler API. Hooked this up to the ARM target so such tools as Darwin's
otool(1) can now print things like branch targets for example this:
blx _puts
instead of this:
blx #-36
And even print the expression encoded in the Mach-O relocation entried for
things like this:
movt r0, :upper16:((_foo-_bar)+1234)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129284 91177308-0d34-0410-b5e6-96231b3b80d8