11062 Commits

Author SHA1 Message Date
Chris Lattner
cea8688ee4 Teach the local spiller to turn stack slot loads into register-register copies
when possible, avoiding the load (and avoiding the copy if the value is already
in the right register).

This patch came about when I noticed code like the following being generated:

  store R17 -> [SS1]
  ...blah...
  R4 = load [SS1]

This was causing an LSU reject on the G5.  This problem was due to the register
allocator folding spill code into a reg-reg copy (producing the load), which
prevented the spiller from being able to rewrite the load into a copy, despite
the fact that the value was already available in a register.  In the case
above, we now rip out the R4 load and replace it with a R4 = R17 copy.

This speeds up several programs on X86 (which spills a lot :) ), e.g.
smg2k from 22.39->20.60s, povray from 12.93->12.66s, 168.wupwise from
68.54->53.83s (!), 197.parser from 7.33->6.62s (!), etc.  This may have a larger
impact in some cases on the G5 (by avoiding LSU rejects), though it probably
won't trigger as often (less spilling in general).

Targets that implement folding of loads/stores into copies should implement
the isLoadFromStackSlot hook to get this.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23388 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-19 06:56:21 +00:00
Chris Lattner
a92aab74dd Implement the isLoadFromStackSlot interface
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23387 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-19 05:23:44 +00:00
Chris Lattner
7203e158da Refactor this code a bit and make it more general. This now compiles:
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus2 (unsigned int x) { b.j += x; }

To:

_plus2:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        slwi r3, r3, 6
        add r3, r4, r3
        rlwimi r3, r4, 0, 26, 14
        stw r3, 0(r2)
        blr


instead of:

_plus2:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        rlwinm r5, r4, 26, 21, 31
        add r3, r5, r3
        rlwimi r4, r3, 6, 15, 25
        stw r4, 0(r2)
        blr

by eliminating an 'and'.

I'm pretty sure this is as small as we can go :)


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23386 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-18 07:22:02 +00:00
Chris Lattner
150f12af7f Compile
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus2 (unsigned int x) {
  b.j += x;
}

to:

plus2:
        mov %EAX, DWORD PTR [b]
        mov %ECX, %EAX
        and %ECX, 131008
        mov %EDX, DWORD PTR [%ESP + 4]
        shl %EDX, 6
        add %EDX, %ECX
        and %EDX, 131008
        and %EAX, -131009
        or %EDX, %EAX
        mov DWORD PTR [b], %EDX
        ret

instead of:

plus2:
        mov %EAX, DWORD PTR [b]
        mov %ECX, %EAX
        shr %ECX, 6
        and %ECX, 2047
        add %ECX, DWORD PTR [%ESP + 4]
        shl %ECX, 6
        and %ECX, 131008
        and %EAX, -131009
        or %ECX, %EAX
        mov DWORD PTR [b], %ECX
        ret


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23385 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-18 06:30:59 +00:00
Chris Lattner
0b7c0bf249 Generalize this transform, using MaskedValueIsZero, allowing us to compile:
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus3 (unsigned int x) { b.k += x; }

To:

plus3:
        mov %EAX, DWORD PTR [%ESP + 4]
        shl %EAX, 17
        add DWORD PTR [b], %EAX
        ret

instead of:

plus3:
        mov %EAX, DWORD PTR [%ESP + 4]
        shl %EAX, 17
        mov %ECX, DWORD PTR [b]
        add %EAX, %ECX
        and %EAX, -131072
        and %ECX, 131071
        or %ECX, %EAX
        mov DWORD PTR [b], %ECX
        ret


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23384 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-18 06:02:59 +00:00
Chris Lattner
5aa7666ebe fix typeo
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23383 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-18 05:25:20 +00:00
Chris Lattner
0d947ea943 Remove unintentionally committed code
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23382 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-18 05:12:51 +00:00
Chris Lattner
11021cb988 implement shift.ll:test25. This compiles:
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus3 (unsigned int x) {
  b.k += x;
}

to:

_plus3:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r3, 0(r2)
        rlwinm r4, r3, 0, 0, 14
        add r4, r4, r3
        rlwimi r4, r3, 0, 15, 31
        stw r4, 0(r2)
        blr

instead of:

_plus3:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        srwi r5, r4, 17
        add r3, r5, r3
        slwi r3, r3, 17
        rlwimi r3, r4, 0, 15, 31
        stw r3, 0(r2)
        blr


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23381 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-18 05:12:10 +00:00
Chris Lattner
c8e7756791 Implement add.ll:test29. Codegening:
struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus1 (unsigned int x) {
  b.i += x;
}

as:
_plus1:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        add r3, r4, r3
        rlwimi r3, r4, 0, 0, 25
        stw r3, 0(r2)
        blr

instead of:

_plus1:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        rlwinm r5, r4, 0, 26, 31
        add r3, r5, r3
        rlwimi r3, r4, 0, 0, 25
        stw r3, 0(r2)
        blr


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23379 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-18 04:24:45 +00:00
Chris Lattner
3255bd101d remove debug output
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23377 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-18 03:50:25 +00:00
Chris Lattner
e9bed7d107 Implement or.ll:test21. This teaches instcombine to be able to turn this:
struct {
   unsigned int bit0:1;
   unsigned int ubyte:31;
} sdata;

void foo() {
  sdata.ubyte++;
}

into this:

foo:
        add DWORD PTR [sdata], 2
        ret

instead of this:

foo:
        mov %EAX, DWORD PTR [sdata]
        mov %ECX, %EAX
        add %ECX, 2
        and %ECX, -2
        and %EAX, 1
        or %EAX, %ECX
        mov DWORD PTR [sdata], %EAX
        ret


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23376 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-18 03:42:07 +00:00
Chris Lattner
6a78c2157a Implement hook for ppc
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23374 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-17 01:03:26 +00:00
Nate Begeman
452d7bebaa More DAG combining. Still need the branch instructions, and select_cc
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23371 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-16 00:54:12 +00:00
Chris Lattner
4ac85b3e94 disable this for now
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23366 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-15 21:44:00 +00:00
Chris Lattner
2e3f5db9ff Give all operands names
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23357 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-14 21:11:13 +00:00
Chris Lattner
43ef1318c6 give all operands names
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23356 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-14 21:10:24 +00:00
Chris Lattner
4345a4a452 Fix some issues exposed by more testing. XORIS had the wrong operands
specified.  The various *imm operands defined by PPC are really all i32,
even though the actual immediate is restricted to a smaller value in it.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23352 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-14 20:53:05 +00:00
Chris Lattner
c36d065dce Fix some bugs noticed by new checking code
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23350 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-14 18:18:39 +00:00
Chris Lattner
6e2f843114 Fix the regression last night compiling povray
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23348 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-14 17:32:56 +00:00
Chris Lattner
3452d23c34 fix a major regression from my patch this afternoon
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23347 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-14 06:06:45 +00:00
Chris Lattner
303b555164 we don't need this proto any longer
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23342 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-13 22:05:21 +00:00
Chris Lattner
af16538511 move the #include for the generated code into the isel class body so we
can use/define class methods


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23339 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-13 22:03:06 +00:00
Chris Lattner
7b738342f0 Change the arg lowering code to use copyfromreg from vregs associated
with incoming arguments instead of the pregs themselves.  This fixes
the scheduler from causing problems by moving a copyfromreg for an argument
to after a select_cc node (now it can, and bad things won't happen).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23334 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-13 19:33:40 +00:00
Chris Lattner
8c4469840e This has been moved to the target-indep code
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23333 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-13 19:32:18 +00:00
Chris Lattner
82da52299c This code is no longer needed, it is moved to the target-indep code
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23332 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-13 19:31:44 +00:00
Chris Lattner
fa57702388 If a function has liveins, and if the target requested that they be plopped
into particular vregs, emit copies into the entry MBB.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23331 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-13 19:30:54 +00:00
Chris Lattner
f2cded73c4 Majik numbers are bad
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23330 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-13 19:03:13 +00:00
Chris Lattner
31262ce53d Remove some dead vectors
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23329 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-13 18:47:49 +00:00
Chris Lattner
7835cdde40 Add a simple xform to simplify array accesses with casts in the way.
This is useful for 178.galgel where resolution of dope vectors (by the
optimizer) causes the scales to become apparent.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23328 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-13 18:36:04 +00:00
Chris Lattner
396b2baf3c Fix an issue where LSR would miss rewriting a use of an IV expression by a PHI node that is not the original PHI.
This fixes up a dot-product loop in galgel, speeding it up from 18.47s to
16.13s.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23327 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-13 02:09:55 +00:00
Chris Lattner
eed48275a1 Add a helper function, allowing us to simplify some code a bit, changing
indentation, no functionality change


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23325 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-13 00:40:14 +00:00
Chris Lattner
408902b3c4 Implement a simple xform to turn code like this:
if () { store A -> P; } else { store B -> P; }

into a PHI node with one store, in the most trival case.  This implements
load.ll:test10.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23324 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-12 23:23:25 +00:00
Chris Lattner
9c1f0fd8de Another load-peephole optimization: do gcse when two loads are next to
each other.  This implements InstCombine/load.ll:test9


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23322 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-12 22:21:03 +00:00
Chris Lattner
62f254df04 Implement a trivial form of store->load forwarding where the store and the
load are exactly consequtive.  This is picked up by other passes, but this
triggers thousands of times in fortran programs that use static locals
(and is thus a compile-time speedup).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23320 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-12 22:00:15 +00:00
Chris Lattner
12b50410cd Fix a regression from last night, which caused this pass to create invalid
code for IV uses outside of loops that are not dominated by the latch block.
We should only convert these uses to use the post-inc value if they ARE
dominated by the latch block.

Also use a new LoopInfo method to simplify some code.

This fixes Transforms/LoopStrengthReduce/2005-09-12-UsesOutOutsideOfLoop.ll


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23318 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-12 17:11:27 +00:00
Chris Lattner
b6a69e70e0 Add a new getLoopLatch() method.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23315 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-12 17:03:55 +00:00
Chris Lattner
c6bae65b49 _test:
li r2, 0
LBB_test_1:     ; no_exit.2
        li r5, 0
        stw r5, 0(r3)
        addi r2, r2, 1
        addi r3, r3, 4
        cmpwi cr0, r2, 701
        blt cr0, LBB_test_1     ; no_exit.2
LBB_test_2:     ; loopexit.2.loopexit
        addi r2, r2, 1
        stw r2, 0(r4)
        blr
[zion ~/llvm]$ cat > ~/xx
Uses of IV's outside of the loop should use hte post-incremented version
of the IV, not the preincremented version.  This helps many loops (e.g. in sixtrack)
which used to generate code like this (this is the code from the
dont-hoist-simple-loop-constants.ll testcase):

_test:
        li r2, 0                 **** IV starts at 0
LBB_test_1:     ; no_exit.2
        or r5, r2, r2            **** Copy for loop exit
        li r2, 0
        stw r2, 0(r3)
        addi r3, r3, 4
        addi r2, r5, 1
        addi r6, r5, 2           **** IV+2
        cmpwi cr0, r6, 701
        blt cr0, LBB_test_1     ; no_exit.2
LBB_test_2:     ; loopexit.2.loopexit
        addi r2, r5, 2       ****  IV+2
        stw r2, 0(r4)
        blr

And now generated code like this:

_test:
        li r2, 1               *** IV starts at 1
LBB_test_1:     ; no_exit.2
        li r5, 0
        stw r5, 0(r3)
        addi r2, r2, 1
        addi r3, r3, 4
        cmpwi cr0, r2, 701     *** IV.postinc + 0
        blt cr0, LBB_test_1
LBB_test_2:     ; loopexit.2.loopexit
        stw r2, 0(r4)          *** IV.postinc + 0
        blr


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23313 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-12 06:04:47 +00:00
Chris Lattner
7259df3ab8 implement Transforms/LoopStrengthReduce/dont-hoist-simple-loop-constants.ll.
We used to emit this code for it:

_test:
        li r2, 1     ;; Value tying up a register for the whole loop
        li r5, 0
LBB_test_1:     ; no_exit.2
        or r6, r5, r5
        li r5, 0
        stw r5, 0(r3)
        addi r5, r6, 1
        addi r3, r3, 4
        add r7, r2, r5  ;; should be addi r7, r5, 1
        cmpwi cr0, r7, 701
        blt cr0, LBB_test_1     ; no_exit.2
LBB_test_2:     ; loopexit.2.loopexit
        addi r2, r6, 2
        stw r2, 0(r4)
        blr

now we emit this:

_test:
        li r2, 0
LBB_test_1:     ; no_exit.2
        or r5, r2, r2
        li r2, 0
        stw r2, 0(r3)
        addi r3, r3, 4
        addi r2, r5, 1
        addi r6, r5, 2   ;; whoa, fold those adds!
        cmpwi cr0, r6, 701
        blt cr0, LBB_test_1     ; no_exit.2
LBB_test_2:     ; loopexit.2.loopexit
        addi r2, r5, 2
        stw r2, 0(r4)
        blr

more improvement coming.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23306 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-10 01:18:45 +00:00
Chris Lattner
e6ec9f20c9 PowerPC cannot truncstore i1 natively
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23304 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-10 00:21:06 +00:00
Chris Lattner
13d58e71b7 Allow targets to say they don't support truncstore i1 (which includes a mask
when storing to an 8-bit memory location), as most don't.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23303 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-10 00:20:18 +00:00
Chris Lattner
a500fc681d Add a missing #include, patch courtesy of Baptiste Lepilleur.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23302 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-09 23:53:39 +00:00
Chris Lattner
3ec5d74fc5 Fix a problem duraid encountered on itanium where this folding:
select (x < y), 1, 0 -> (x < y) incorrectly: the setcc returns i1 but the
select returned i32.  Add the zero extend as needed.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23301 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-09 23:00:07 +00:00
Chris Lattner
08addbd477 Fix a crash viewing dags that have target nodes in them
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23300 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-09 22:35:03 +00:00
Chris Lattner
c9fe7508a5 I forgot that we always spill fp values as 64-bits. Implement spill folding
for FP as well.  This triggers a couple dozen times on 177.mesa (for example).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23299 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-09 21:59:44 +00:00
Chris Lattner
f38df04c3a Fix a problem that Nate noticed, where spill code was not getting coallesced
with copies, leading to code like this:

       lwz r4, 380(r1)
       or r10, r4, r4    ;; Last use of r4

By teaching the PPC backend how to fold spills into copies, we now get this
code:

       lwz r10, 380(r1)

wow. :)

This reduces a testcase nate sent me from 1505 instructions to 1484.

Note that this could handle FP values but doesn't currently, for reasons
mentioned in the patch


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23298 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-09 21:46:49 +00:00
Chris Lattner
1463019e84 code cleanup
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23297 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-09 20:51:08 +00:00
Chris Lattner
50ea01ed5b Use continue in the use-processing loop to make it clear what the early exits
are, simplify logic, and cause things to not be nested as deeply.  This also
uses MRI->areAliases instead of an explicit loop.

No functionality change, just code cleanup.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23296 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-09 20:29:51 +00:00
Nate Begeman
39ee1ac7e5 Last round of 2-node folds from SD.cpp. Will move on to 3 node ops such
as setcc and select next.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23295 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-09 19:49:52 +00:00
Chris Lattner
ceb0a52231 remove debugging code *slaps head*
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23294 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-09 19:19:20 +00:00
Chris Lattner
b11443dc84 When spilling a live range that is used multiple times by one instruction,
only add a reload live range once for the instruction.  This is one step
towards fixing a regalloc pessimization that Nate notice, but is later undone
by the spiller (so no code is changed).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23293 91177308-0d34-0410-b5e6-96231b3b80d8
2005-09-09 19:17:47 +00:00