out, where after the first CombineTo() call, the node the second CombineTo
wishes to replace may no longer exist.
Fix a very real bug with the truncated load optimization on little endian
targets, which do not need a byte offset added to the load.
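A minimal sketch of the offset rule, using a hypothetical helper name (the real
change lives in the DAG combiner's truncated-load code):

#include <cassert>

// When a wide load is shrunk to a narrower load of its low bytes, big-endian
// targets must bump the pointer past the discarded high bytes; little-endian
// targets already have the low bytes at offset 0, so no adjustment is needed.
unsigned truncLoadByteOffset(bool isBigEndian, unsigned wideBytes,
                             unsigned narrowBytes) {
  return isBigEndian ? wideBytes - narrowBytes : 0;
}

int main() {
  assert(truncLoadByteOffset(true, 8, 2) == 6);   // big endian: skip 6 high bytes
  assert(truncLoadByteOffset(false, 8, 2) == 0);  // little endian: no offset
}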
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23704 91177308-0d34-0410-b5e6-96231b3b80d8
like turning:
_foo:
fctiwz f0, f1
stfd f0, -8(r1)
lwz r2, -4(r1)
rlwinm r3, r2, 0, 16, 31
blr
into
_foo:
fctiwz f0, f1
stfd f0, -8(r1)
lhz r3, -2(r1)
blr
Also removed an unnecessary constraint from the sra -> srl conversion, which
should take care of the only reason we would ever need to handle sra in
MaskedValueIsZero, AFAIK.
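As a hedged aside (toy code, not the backend itself): an arithmetic and a
logical right shift differ only in the bits filled from the sign position, so
once the high bits are known zero the sra can be rewritten as an srl and
nothing downstream needs a dedicated sra case:

#include <cassert>
#include <cstdint>

int main() {
  int32_t v = 0x1234;                       // sign bit known zero
  uint32_t logical = (uint32_t)v >> 4;      // srl
  int32_t arithmetic = v >> 4;              // sra (v is non-negative)
  assert((uint32_t)arithmetic == logical);  // identical results
}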
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23703 91177308-0d34-0410-b5e6-96231b3b80d8
location, replace them with a new store of the last value. This occurs
in the same neighborhood in 197.parser, speeding it up by about 1.5%.
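A hedged source-level analogue of the transformation (the actual change works
on SelectionDAG store nodes, not C):

// Two back-to-back stores to the same location with nothing reading in
// between: only the last value must reach memory, so the pair collapses
// to a single store.
void before(int *p) {
  *p = 1;   // dead, immediately overwritten
  *p = 2;
}
void after(int *p) {
  *p = 2;   // one store of the last value
}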
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23691 91177308-0d34-0410-b5e6-96231b3b80d8
multiple results.
Use this support to implement trivial store->load forwarding, implementing
CodeGen/PowerPC/store-load-fwd.ll. Though this is the simplest case and
can be extended in the future, it is still useful. For example, it speeds
up 197.parser by 6.2% by avoiding an LSU reject in xalloc:
stw r6, lo16(l5_end_of_array)(r2)
addi r2, r5, -4
stwx r5, r4, r2
- lwzx r5, r4, r2
- rlwinm r5, r5, 0, 0, 30
stwx r5, r4, r2
lwz r2, -4(r4)
ori r2, r2, 1
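A hedged source-level picture of the forwarding (the patch itself operates on
SelectionDAG nodes, not C):

// A load from the address just stored to can take the stored value directly
// instead of going back through memory, which also avoids the LSU reject
// when the load hits the still-in-flight store.
int before(int *p, int v) {
  *p = v;
  return *p;   // re-reads the value just written
}
int after(int *p, int v) {
  *p = v;
  return v;    // store->load forwarded: reuse v
}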
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23690 91177308-0d34-0410-b5e6-96231b3b80d8
creating a new vreg and inserting a copy: just use the input vreg directly.
This speeds up the compile (e.g. about 5% on mesa with a debug build of llc)
by not adding a bunch of copies and vregs to be coalesced away. On mesa,
for example, this reduces the number of intervals from 168601 to 129040
going into the coalescer.
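A hedged toy model of the change (hypothetical names, not the actual ISel
code):

static unsigned NextVReg = 1;

// Old behavior: every operand got a fresh vreg plus a copy from the input,
// leaving an extra live interval for the coalescer to remove.
unsigned selectOperandOld(unsigned InputVReg) {
  unsigned NewVReg = NextVReg++;
  // ...emit COPY NewVReg <- InputVReg...
  return NewVReg;
}

// New behavior: the input vreg is already usable, so hand it back directly.
unsigned selectOperandNew(unsigned InputVReg) {
  return InputVReg;   // no copy, no new interval
}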
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23671 91177308-0d34-0410-b5e6-96231b3b80d8
the 177.mesa failure from last night, and fixes the
CodeGen/PowerPC/2005-10-08-ArithmeticRotate.ll regression test I added.
If this code cannot be fixed, it should be removed for good, but I'll leave
it to Nate to decide its fate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23670 91177308-0d34-0410-b5e6-96231b3b80d8
is faster and uses less stack space. This reduces our stack requirement
enough to compile sixtrack, and though it's a hack, it should be enough until
we switch to iterative isel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23664 91177308-0d34-0410-b5e6-96231b3b80d8
classes on PPC. We were emitting fmr instructions to do fp extensions, which
weren't getting coalesced. This fixes Regression/CodeGen/PowerPC/fpcopy.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23654 91177308-0d34-0410-b5e6-96231b3b80d8
helps but not enough.
Start pulling cases out of PPC32DAGToDAGISel::Select. With GCC 4, this function
required 8512 bytes of stack space for each invocation (GCC 3 required less
than 700 bytes). Pulling this first function out gets us down to 8224. More
to come :(
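A hedged illustration of why splitting helps (generic code, not the PPC
selector itself): when every case of a large switch lives in one function, the
frame may reserve space for the locals of all cases on every call; a case
pulled into its own non-inlined function only consumes that stack while it is
actually running.

// Monolithic: big[]'s 4KB sits in this frame even when the other case runs.
int selectMonolithic(int opcode) {
  switch (opcode) {
  case 0: {
    char big[4096] = {0};    // large per-case scratch space
    return big[0];
  }
  default:
    return opcode + 1;
  }
}

// Split out: the 4KB frame exists only while selectCase0 is live.
__attribute__((noinline)) int selectCase0() {
  char big[4096] = {0};
  return big[0];
}
int selectSplit(int opcode) {
  switch (opcode) {
  case 0:  return selectCase0();
  default: return opcode + 1;
  }
}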
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23647 91177308-0d34-0410-b5e6-96231b3b80d8
previous copy elisions and we discover we need to reload a register, make
sure to use the regclass of the original register for the reload, not the
class of the current register. This avoids using 16-bit loads to reload 32-bit
values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23645 91177308-0d34-0410-b5e6-96231b3b80d8