Commit Graph

209 Commits

Author SHA1 Message Date
Evan Cheng
c548428c5d Combine ISD::EXTLOAD, ISD::SEXTLOAD, ISD::ZEXTLOAD into ISD::LOADX. Add an
extra operand to LOADX to specify the exact value extension type.
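For illustration, a minimal C sketch (editorial, not part of the commit) of two
of the extension kinds the new operand distinguishes: a load of a signed short
is a sign-extending load (SEXTLOAD), a load of an unsigned short is a
zero-extending one (ZEXTLOAD), and EXTLOAD leaves the high bits unspecified.

int load_sext(short *p) {
  return *p;   /* sign-extending i16 load: SEXTLOAD */
}
unsigned load_zext(unsigned short *p) {
  return *p;   /* zero-extending i16 load: ZEXTLOAD */
}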


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@30714 91177308-0d34-0410-b5e6-96231b3b80d8
2006-10-04 00:56:09 +00:00
Chris Lattner
3fe6c1d389 Legalize is no longer limited to cleverness with just constant shift amounts.
Allow it to be clever when possible and fall back to the gross code when needed.

In foo1, OR'ing in 32 guarantees a shift amount of at least 32; in foo2,
clearing that bit guarantees an amount below 32, so each case needs only half
of the generic 64-bit shift sequence. This allows us to compile:

long long foo1(long long X, int C) {
  return X << (C|32);
}
long long foo2(long long X, int C) {
  return X << (C&~32);
}

to:
_foo1:
        rlwinm r2, r5, 0, 27, 31
        slw r3, r4, r2
        li r4, 0
        blr


        .globl  _foo2
        .align  4
_foo2:
        rlwinm r2, r5, 0, 27, 25
        subfic r5, r2, 32
        slw r3, r3, r2
        srw r5, r4, r5
        or r3, r3, r5
        slw r4, r4, r2
        blr

instead of:

_foo1:
        ori r2, r5, 32
        subfic r5, r2, 32
        addi r6, r2, -32
        srw r5, r4, r5
        slw r3, r3, r2
        slw r6, r4, r6
        or r3, r3, r5
        slw r4, r4, r2
        or r3, r3, r6
        blr


        .globl  _foo2
        .align  4
_foo2:
        rlwinm r2, r5, 0, 27, 25
        subfic r5, r2, 32
        addi r6, r2, -32
        srw r5, r4, r5
        slw r3, r3, r2
        slw r6, r4, r6
        or r3, r3, r5
        slw r4, r4, r2
        or r3, r3, r6
        blr


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@30507 91177308-0d34-0410-b5e6-96231b3b80d8
2006-09-20 03:47:40 +00:00
Chris Lattner
cf9d0acfef Fold the PPCISD shifts when presented with 0 inputs. This occurs for code
like:
long long test(long long X, int Y) {
  return 1ULL << Y;
}
long long test2(long long X, int Y) {
  return -1LL << Y;
}

which we used to compile to:

_test:
        li r2, 1
        subfic r3, r5, 32
        li r4, 0
        addi r6, r5, -32
        srw r3, r2, r3
        slw r4, r4, r5
        slw r6, r2, r6
        or r3, r4, r3
        slw r4, r2, r5
        or r3, r3, r6
        blr
_test2:
        li r2, -1
        subfic r3, r5, 32
        addi r6, r5, -32
        srw r3, r2, r3
        slw r4, r2, r5
        slw r2, r2, r6
        or r3, r4, r3
        or r3, r3, r2
        blr

Now we produce:

_test:
        li r2, 1
        addi r3, r5, -32
        subfic r4, r5, 32
        slw r3, r2, r3
        srw r4, r2, r4
        or r3, r4, r3
        slw r4, r2, r5
        blr
_test2:
        li r2, -1
        subfic r3, r5, 32
        addi r6, r5, -32
        srw r3, r2, r3
        slw r4, r2, r5
        slw r2, r2, r6
        or r3, r4, r3
        or r3, r3, r2
        blr


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@30479 91177308-0d34-0410-b5e6-96231b3b80d8
2006-09-19 05:22:59 +00:00
Evan Cheng
c356a572e3 Reflects MachineConstantPoolEntry changes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@30279 91177308-0d34-0410-b5e6-96231b3b80d8
2006-09-12 21:04:05 +00:00
Reid Spencer
3a9ec2463d For PR387:
Close out this long-standing bug by removing the remaining overloaded
virtual functions in LLVM. The -Woverloaded-virtual option is now turned on.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@29934 91177308-0d34-0410-b5e6-96231b3b80d8
2006-08-28 01:02:49 +00:00
Chris Lattner
f6e190fae0 Fix a bug in a recent refactoring that broke a bunch of stuff.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@29649 91177308-0d34-0410-b5e6-96231b3b80d8
2006-08-12 07:20:05 +00:00
Chris Lattner
e219945348 Eliminate use of getNode that takes a vector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@29614 91177308-0d34-0410-b5e6-96231b3b80d8
2006-08-11 17:38:39 +00:00
Chris Lattner
79e490aa23 Convert vectors to fixed-size arrays and SmallVectors. Eliminate use of getNode that takes a vector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@29609 91177308-0d34-0410-b5e6-96231b3b80d8
2006-08-11 17:18:05 +00:00
Chris Lattner
325f0a129e Fix miscompilation of float vector returns. Compile code to this:
_func:
        vsldoi v2, v3, v2, 12
        vsldoi v2, v2, v2, 4
        blr

instead of:

_func:
        vsldoi v2, v3, v2, 12
        vsldoi v2, v2, v2, 4
***     vor f1, v2, v2
        blr


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@29607 91177308-0d34-0410-b5e6-96231b3b80d8
2006-08-11 16:47:32 +00:00
Chris Lattner
0d72a20630 Fix some ppc64 issues with vector code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@29384 91177308-0d34-0410-b5e6-96231b3b80d8
2006-07-28 16:45:47 +00:00
Chris Lattner
35d86fef1f Rename RelocModel::PIC to PIC_, to avoid conflicts with -DPIC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@29307 91177308-0d34-0410-b5e6-96231b3b80d8
2006-07-26 21:12:04 +00:00
Chris Lattner
d998938459 Implement Regression/CodeGen/PowerPC/bswap-load-store.ll by folding bswaps
into i16/i32 load/stores.
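As a rough sketch (hypothetical function; assuming the open-coded swap is
recognized as a bswap), this is the shape of source that now folds into a
single byte-reversed load such as lhbrx/lwbrx:

unsigned swap_load(unsigned *p) {
  unsigned v = *p;
  /* open-coded 32-bit byte swap of a loaded value */
  return (v >> 24) | ((v >> 8) & 0xff00) |
         ((v << 8) & 0xff0000) | (v << 24);
}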


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@29089 91177308-0d34-0410-b5e6-96231b3b80d8
2006-07-10 20:56:58 +00:00
Chris Lattner
f89437d049 Implement 64-bit select, bswap, etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28935 91177308-0d34-0410-b5e6-96231b3b80d8
2006-06-27 20:14:52 +00:00
Chris Lattner
5f9faeaa78 PPC doesn't have bit converts to/from i64
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28932 91177308-0d34-0410-b5e6-96231b3b80d8
2006-06-27 18:40:08 +00:00
Chris Lattner
563ecfbf82 Implement 64-bit undef, sub, shl/shr, srem/urem
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28929 91177308-0d34-0410-b5e6-96231b3b80d8
2006-06-27 18:18:41 +00:00
Chris Lattner
7b0c58cd25 Use i32 for shift amounts instead of i64. This gets bisort working.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28927 91177308-0d34-0410-b5e6-96231b3b80d8
2006-06-27 17:34:57 +00:00
Chris Lattner
c08f902bb7 Implement a bunch of 64-bit cleanliness work. With this, treeadd builds (but
doesn't work right).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28921 91177308-0d34-0410-b5e6-96231b3b80d8
2006-06-27 00:04:13 +00:00
Chris Lattner
c91a4757b6 Improve PPC64 calling convention support
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28919 91177308-0d34-0410-b5e6-96231b3b80d8
2006-06-26 22:48:35 +00:00
Chris Lattner
ef95710583 Correct returns of 64-bit values, though they seemed to work before...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28892 91177308-0d34-0410-b5e6-96231b3b80d8
2006-06-21 00:34:03 +00:00
Chris Lattner
059ca0f5b7 Fix some assumptions that pointers can only be 32 bits. With this, we can
now compile:

static unsigned long X;
void test1() {
  X = 0;
}

into:

_test1:
        lis r2, ha16(_X)
        li r3, 0
        stw r3, lo16(_X)(r2)
        blr

Totally amazing :)


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28839 91177308-0d34-0410-b5e6-96231b3b80d8
2006-06-16 21:01:35 +00:00
Chris Lattner
a7a5854f1c Rename some subtarget features. A CPU now can *have* 64-bit instructions,
and in 32-bit mode we can choose to optionally *use* 64-bit registers.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28824 91177308-0d34-0410-b5e6-96231b3b80d8
2006-06-16 17:34:12 +00:00
Evan Cheng
a7dc4a59cb Type of extract_element index operand should be iPTR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28797 91177308-0d34-0410-b5e6-96231b3b80d8
2006-06-15 08:18:06 +00:00
Chris Lattner
4a45abf66e Fix a problem exposed by the local allocator. CALL instructions are not marked
as using incoming argument registers, so the local allocator would clobber them
between their definition and use.  To fix this, we give the call instructions a
variable number of uses in the CALL MachineInstr itself, so the live variables
analysis understands the live ranges of these register arguments.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28744 91177308-0d34-0410-b5e6-96231b3b80d8
2006-06-10 01:14:28 +00:00
Chris Lattner
7b05350906 Always reserve space for 8 spilled GPRs. GCC apparently assumes that this
space will be available, even if the callee isn't varargs.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28571 91177308-0d34-0410-b5e6-96231b3b80d8
2006-05-30 21:21:04 +00:00
Evan Cheng
6848be1a27 Change RET node to include signedness information of the return values, i.e.
RET chain, value1, sign1, value2, sign2, ...


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28510 91177308-0d34-0410-b5e6-96231b3b80d8
2006-05-26 23:10:12 +00:00
Evan Cheng
4360bdcf1f CALL node change (arg / sign pairs instead of just arguments).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28462 91177308-0d34-0410-b5e6-96231b3b80d8
2006-05-25 00:57:32 +00:00
Chris Lattner
d74ea2bbd8 Patches to make the LLVM sources more -pedantic clean. Patch provided
by Anton Korobeynikov!  This is a step towards closing PR786.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28447 91177308-0d34-0410-b5e6-96231b3b80d8
2006-05-24 17:04:05 +00:00
Chris Lattner
2ef5e89dc9 Fix CodeGen/Generic/vector.ll:test_div with altivec.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28445 91177308-0d34-0410-b5e6-96231b3b80d8
2006-05-24 00:15:25 +00:00
Chris Lattner
5734012375 Handle SETO* like we handle SET*, restoring behavior after Evan's setcc
change.  This fixes PowerPC/fnegsel.ll.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28443 91177308-0d34-0410-b5e6-96231b3b80d8
2006-05-24 00:06:44 +00:00
Chris Lattner
c703a8fbf8 Make PPC call lowering more aggressive, making the isel matching code simple
enough to be autogenerated.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28354 91177308-0d34-0410-b5e6-96231b3b80d8
2006-05-17 19:00:46 +00:00
Chris Lattner
9a2a497284 Switch PPC over to a call-selection model where the lowering code creates
the copyto/fromregs instead of making the PPCISD::CALL selection code create
them.  This vastly simplifies the selection code, and moves the ABI handling
parts into one place.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28346 91177308-0d34-0410-b5e6-96231b3b80d8
2006-05-17 06:01:33 +00:00
Chris Lattner
c8b682ca19 3 changes, 2 of which are cleanup, one of which changes codegen:
1. Rearrange code a bit so that the special case doesn't require indenting lots
   of code.
2. Add comments describing the PPC calling convention.
3. Only round up to 56 bytes of stack space for an outgoing call if the callee
   is varargs.  This saves a bit of stack space.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28342 91177308-0d34-0410-b5e6-96231b3b80d8
2006-05-17 00:15:40 +00:00
Chris Lattner
c04ba7a97d Implement passing/returning vector regs in calls, at least non-varargs calls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28341 91177308-0d34-0410-b5e6-96231b3b80d8
2006-05-16 23:54:25 +00:00
Chris Lattner
abde460d4f Instead of implementing LowerCallTo directly, let the default impl produce an
ISD::CALL node, then custom lower that.  This means that we only have to handle
LEGAL call operands/results, not every possible type.  This allows us to
simplify the call code, shrinking it by about 1/3.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28339 91177308-0d34-0410-b5e6-96231b3b80d8
2006-05-16 22:56:08 +00:00
Chris Lattner
af4ec0c56d Simplify the argument counting logic by only incrementing the index.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28335 91177308-0d34-0410-b5e6-96231b3b80d8
2006-05-16 18:58:15 +00:00
Chris Lattner
b375b5e629 Simplify the dead argument handling code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28334 91177308-0d34-0410-b5e6-96231b3b80d8
2006-05-16 18:54:32 +00:00
Chris Lattner
be4849aabe Vector args passed in registers don't reserve stack space.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28333 91177308-0d34-0410-b5e6-96231b3b80d8
2006-05-16 18:51:52 +00:00
Chris Lattner
8ab5fe574a Switch the PPC backend over to using FORMAL_ARGUMENTS for formal argument
handling.  This makes the argument-lowering code significantly simpler (we
only need to handle legal argument types).

Incidentally, this also implements support for vector argument registers,
so long as they are not on the stack.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28331 91177308-0d34-0410-b5e6-96231b3b80d8
2006-05-16 18:18:50 +00:00
Chris Lattner
00402c7ec3 Fit in 80 cols
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28311 91177308-0d34-0410-b5e6-96231b3b80d8
2006-05-16 04:20:24 +00:00
Chris Lattner
b65e7256ed Remove dead var, fix bad override.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28264 91177308-0d34-0410-b5e6-96231b3b80d8
2006-05-12 21:09:57 +00:00
Chris Lattner
25b8b8cb2c Fix CodeGen/Generic/2006-04-28-Sign-extend-bool.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28017 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-28 21:56:10 +00:00
Nate Begeman
37efe67645 JumpTable support!  What this represents is working asm and JIT support for
x86 and PPC for 100% dense switch statements when relocations are non-PIC.
This support will be extended and enhanced in the coming days to support
PIC and less dense forms of jump tables.
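
A hedged illustration (hypothetical function) of the 100%-dense case: every
value in 0..4 is covered with no holes, so the switch can dispatch through a
table of block addresses instead of a compare chain.

int classify(int x) {
  switch (x) {
  case 0: return 10;
  case 1: return 21;
  case 2: return 33;
  case 3: return 46;
  case 4: return 50;
  default: return -1;
  }
}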


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27947 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-22 18:53:45 +00:00
Chris Lattner
0090120c2b Fix a crash on:
void foo2(vector float *A, vector float *B) {
  vector float C = (vector float)vec_cmpeq(*A, *B);
  if (!vec_any_eq(*A, *B))
    *B = (vector float){0,0,0,0};
  *A = C;
}


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27808 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-18 18:28:22 +00:00
Chris Lattner
f70f8d91a7 Pretty-print node name
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27806 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-18 18:05:58 +00:00
Chris Lattner
90564f26d1 Implement an important entry from README_ALTIVEC:
If an AltiVec predicate compare is used immediately by a branch, don't
use a (serializing) MFCR instruction to read the CR6 register, which requires
a compare to get it back into a CR.  Instead, just branch on CR6 directly. :)

For example, for:
void foo2(vector float *A, vector float *B) {
  if (!vec_any_eq(*A, *B))
    *B = (vector float){0,0,0,0};
}

We now generate:

_foo2:
        mfspr r2, 256
        oris r5, r2, 12288
        mtspr 256, r5
        lvx v2, 0, r4
        lvx v3, 0, r3
        vcmpeqfp. v2, v3, v2
        bne cr6, LBB1_2 ; UnifiedReturnBlock
LBB1_1: ; cond_true
        vxor v2, v2, v2
        stvx v2, 0, r4
        mtspr 256, r2
        blr
LBB1_2: ; UnifiedReturnBlock
        mtspr 256, r2
        blr

instead of:

_foo2:
        mfspr r2, 256
        oris r5, r2, 12288
        mtspr 256, r5
        lvx v2, 0, r4
        lvx v3, 0, r3
        vcmpeqfp. v2, v3, v2
        mfcr r3, 2
        rlwinm r3, r3, 27, 31, 31
        cmpwi cr0, r3, 0
        beq cr0, LBB1_2 ; UnifiedReturnBlock
LBB1_1: ; cond_true
        vxor v2, v2, v2
        stvx v2, 0, r4
        mtspr 256, r2
        blr
LBB1_2: ; UnifiedReturnBlock
        mtspr 256, r2
        blr

This implements CodeGen/PowerPC/vec_br_cmp.ll.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27804 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-18 17:59:36 +00:00
Chris Lattner
cea2aa77eb Use vmladduhm to do v8i16 multiplies which is faster and simpler than doing
even/odd halves.  Thanks to Nate for telling me what's what.
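
A per-lane C model (editorial sketch) of why one instruction suffices:
vmladduhm computes the low 16 bits of a*b+c in each halfword lane, so passing a
zero addend for c yields exactly the v8i16 multiply.

unsigned short vmladduhm_lane(unsigned short a, unsigned short b,
                              unsigned short c) {
  /* low halfword of a*b + c; with c == 0 this is a plain multiply */
  return (unsigned short)(a * b + c);
}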


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27793 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-18 04:28:57 +00:00
Chris Lattner
19a815238e Implement v16i8 multiply with this code:
        vmuloub v5, v3, v2
        vmuleub v2, v3, v2
        vperm v2, v2, v5, v4

This implements CodeGen/PowerPC/vec_mul.ll.  With this, v16i8 multiplies are
6.79x faster than before.

Overall, UnitTests/Vector/multiplies.c is now 2.45x faster with LLVM than with
GCC.

Remove the 'integer multiplies' todo from the README file.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27792 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-18 03:57:35 +00:00
Chris Lattner
72dd9bdcc5 Lower v8i16 multiply into this code:
        li r5, lo16(LCPI1_0)
        lis r6, ha16(LCPI1_0)
        lvx v4, r6, r5
        vmulouh v5, v3, v2
        vmuleuh v2, v3, v2
        vperm v2, v2, v5, v4

where v4 is:
LCPI1_0:                                        ;  <16 x ubyte>
        .byte   2
        .byte   3
        .byte   18
        .byte   19
        .byte   6
        .byte   7
        .byte   22
        .byte   23
        .byte   10
        .byte   11
        .byte   26
        .byte   27
        .byte   14
        .byte   15
        .byte   30
        .byte   31

The vperm mask selects the low halfword of each 32-bit even product and each
32-bit odd product, interleaving them back into a single v8i16 result.

This is 5.07x faster on the G5 (measured) than lowering to scalar code +
loads/stores.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27789 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-18 03:43:48 +00:00
Chris Lattner
e7c768ea24 Custom lower v4i32 multiplies into a cute sequence, instead of having legalize
scalarize the sequence into 4 mullw's and a bunch of load/store traffic.

This speeds up v4i32 multiplies 4.1x (measured) on a G5.  This implements
PowerPC/vec_mul.ll
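
For reference, the AltiVec-C shape of the test (hypothetical function name;
assumes a front end that accepts the GNU vector-extension multiply):

#include <altivec.h>

vector signed int mul_v4i32(vector signed int a, vector signed int b) {
  /* formerly scalarized into 4 mullw's plus load/store traffic;
     now custom-lowered to a short vector sequence */
  return a * b;
}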


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27788 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-18 03:24:30 +00:00
Chris Lattner
dbce85dedf Make sure to check splats of every constant we can, handle splat(31) by
being a bit more clever, and add support for odd splats from -31 to -17.
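
A small C model (editorial; helper name hypothetical) of the constraint behind
this: the vspltis[bhw] immediates are 5-bit sign-extended, so only splats of
-16..15 are directly encodable, and values like 31 or the odd splats in
-31..-17 must be synthesized from an in-range splat.

int splat_is_encodable(int v) {
  /* vspltis[bhw] take a 5-bit sign-extended immediate: -16..15 */
  return v >= -16 && v <= 15;
}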


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27764 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 18:09:22 +00:00