Chris Lattner
cea2aa77eb
Use vmladduhm to do v8i16 multiplies which is faster and simpler than doing
...
even/odd halves. Thanks to Nate telling me what's what.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27793 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-18 04:28:57 +00:00
Chris Lattner
19a815238e
Implement v16i8 multiply with this code:
...
vmuloub v5, v3, v2
vmuleub v2, v3, v2
vperm v2, v2, v5, v4
This implements CodeGen/PowerPC/vec_mul.ll. With this, v16i8 multiplies are
6.79x faster than before.
Overall, UnitTests/Vector/multiplies.c is now 2.45x faster with LLVM than with
GCC.
Remove the 'integer multiplies' todo from the README file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27792 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-18 03:57:35 +00:00
Evan Cheng
4980467476
Correct comments
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27790 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-18 03:45:01 +00:00
Chris Lattner
72dd9bdcc5
Lower v8i16 multiply into this code:
...
li r5, lo16(LCPI1_0)
lis r6, ha16(LCPI1_0)
lvx v4, r6, r5
vmulouh v5, v3, v2
vmuleuh v2, v3, v2
vperm v2, v2, v5, v4
where v4 is:
LCPI1_0: ; <16 x ubyte>
.byte 2
.byte 3
.byte 18
.byte 19
.byte 6
.byte 7
.byte 22
.byte 23
.byte 10
.byte 11
.byte 26
.byte 27
.byte 14
.byte 15
.byte 30
.byte 31
This is 5.07x faster on the G5 (measured) than lowering to scalar code +
loads/stores.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27789 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-18 03:43:48 +00:00
Chris Lattner
e7c768ea24
Custom lower v4i32 multiplies into a cute sequence, instead of having legalize
...
scalarize the sequence into 4 mullw's and a bunch of load/store traffic.
This speeds up v4i32 multiplies 4.1x (measured) on a G5. This implements
PowerPC/vec_mul.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27788 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-18 03:24:30 +00:00
Evan Cheng
74e955d931
Another entry
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27786 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-18 01:22:57 +00:00
Evan Cheng
7fa094a261
Another entry.
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27784 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-18 00:21:01 +00:00
Evan Cheng
cdfc3c82a7
Use movss to insert_vector_elt(v, s, 0).
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27782 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 22:45:49 +00:00
Chris Lattner
fd6bdf0b0f
Turn x86 unaligned load/store intrinsics into aligned load/store instructions
...
if the pointer is known aligned.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27781 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 22:26:56 +00:00
Chris Lattner
80edfb3af5
Fix handling of calls in functions that use vectors. This fixes a crash on
...
the code in GCC PR26546.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27780 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 22:10:08 +00:00
Evan Cheng
5edb8d270c
Use two pinsrw to insert an element into v4i32 / v4f32 vector.
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27779 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 22:04:06 +00:00
Chris Lattner
22fcbb1320
remove done item
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27778 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 21:52:03 +00:00
Chris Lattner
f9568d8700
Don't diddle VRSAVE if no registers need to be added/removed from it. This
...
allows us to codegen functions as:
_test_rol:
vspltisw v2, -12
vrlw v2, v2, v2
blr
instead of:
_test_rol:
mfvrsave r2, 256
mr r3, r2
mtvrsave r3
vspltisw v2, -12
vrlw v2, v2, v2
mtvrsave r2
blr
Testcase here: CodeGen/PowerPC/vec_vrsave.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27777 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 21:48:13 +00:00
Chris Lattner
48d7c069c7
Add a MachineInstr::eraseFromParent convenience method.
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27775 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 21:35:41 +00:00
Evan Cheng
23b72005fa
Encoding bug
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27773 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 21:33:57 +00:00
Chris Lattner
402504b1ba
Vectors that are known live-in and live-out are clearly already marked in
...
the vrsave register for the caller. This allows us to codegen a function as:
_test_rol:
mfspr r2, 256
mr r3, r2
mtspr 256, r3
vspltisw v2, -12
vrlw v2, v2, v2
mtspr 256, r2
blr
instead of:
_test_rol:
mfspr r2, 256
oris r3, r2, 40960
mtspr 256, r3
vspltisw v0, -12
vrlw v2, v0, v0
mtspr 256, r2
blr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27772 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 21:22:06 +00:00
Chris Lattner
939274fcfd
Prefer to allocate V2-V5 before V0,V1. This lets us generate code like this:
...
vspltisw v2, -12
vrlw v2, v2, v2
instead of:
vspltisw v0, -12
vrlw v2, v0, v0
when a function is returning a value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27771 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 21:19:12 +00:00
Chris Lattner
369503f841
Move some knowledge about registers out of the code emitter into the register info.
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27770 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 21:07:20 +00:00
Chris Lattner
f7d2372b74
Use a small table instead of macros to do this conversion.
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27769 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 20:59:25 +00:00
Evan Cheng
c575ca22ea
Implement v8i16, v16i8 splat using unpckl + pshufd.
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27768 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 20:43:08 +00:00
Chris Lattner
b2be4032c5
implement returns of a vector, testcase here: CodeGen/X86/vec_return.ll
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27767 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 20:32:50 +00:00
Chris Lattner
8d5a894501
Codegen insertelement with constant insertion points as scalar_to_vector
...
and a shuffle. For this:
void %test2(<4 x float>* %F, float %f) {
%tmp = load <4 x float>* %F ; <<4 x float>> [#uses=2]
%tmp3 = add <4 x float> %tmp, %tmp ; <<4 x float>> [#uses=1]
%tmp2 = insertelement <4 x float> %tmp3, float %f, uint 2 ; <<4 x float>> [#uses=2]
%tmp6 = add <4 x float> %tmp2, %tmp2 ; <<4 x float>> [#uses=1]
store <4 x float> %tmp6, <4 x float>* %F
ret void
}
we now get this on X86 (which will get better):
_test2:
movl 4(%esp), %eax
movaps (%eax), %xmm0
addps %xmm0, %xmm0
movaps %xmm0, %xmm1
shufps $3, %xmm1, %xmm1
movaps %xmm0, %xmm2
shufps $1, %xmm2, %xmm2
unpcklps %xmm1, %xmm2
movss 8(%esp), %xmm1
unpcklps %xmm1, %xmm0
unpcklps %xmm2, %xmm0
addps %xmm0, %xmm0
movaps %xmm0, (%eax)
ret
instead of:
_test2:
subl $28, %esp
movl 32(%esp), %eax
movaps (%eax), %xmm0
addps %xmm0, %xmm0
movaps %xmm0, (%esp)
movss 36(%esp), %xmm0
movss %xmm0, 8(%esp)
movaps (%esp), %xmm0
addps %xmm0, %xmm0
movaps %xmm0, (%eax)
addl $28, %esp
ret
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27765 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 19:21:01 +00:00
Chris Lattner
dbce85dedf
Make sure to check splats of every constant we can, handle splat(31) by
...
being a bit more clever, add support for odd splats from -31 to -17.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27764 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 18:09:22 +00:00
Evan Cheng
51c9c43656
Incorrect foldMemoryOperand entries
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27763 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 18:06:12 +00:00
Evan Cheng
083248e143
Errors in patterns preventing load folding
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27762 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 18:05:01 +00:00
Jeff Cohen
3c280bf4d1
Add checks for __OpenBSD__.
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27761 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 17:55:41 +00:00
Chris Lattner
bdd558cd94
Teach the ppc backend to use rol and vsldoi to generate splatted constants.
...
This implements vec_constants.ll:test_vsldoi and test_rol
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27760 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 17:55:10 +00:00
Chris Lattner
966083fd1a
add a note
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27758 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 17:29:41 +00:00
Evan Cheng
5001ea1078
FP SETOLT, SETOLT, SETUGE, SETUGT conditions were implemented incorrectly
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27755 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 07:24:10 +00:00
Chris Lattner
6876e66e5d
Make some code more general, adding support for constant formation of several
...
new patterns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27754 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 06:58:41 +00:00
Chris Lattner
c408382eca
Learn how to make odd splatted constants in range [17,29]. This implements
...
PowerPC/vec_constants.ll:test_29.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27752 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 06:07:44 +00:00
Chris Lattner
4a998b9ca8
Pull some code out into a helper function.
...
Effeciently codegen even splats in the range [-32,30].
This allows us to codegen <30,30,30,30> as:
vspltisw v0, 15
vadduwm v2, v0, v0
instead of as a cp load.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27750 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 06:00:21 +00:00
Chris Lattner
5913810b82
Implement a TODO: for any shuffle that can be viewed as a v4[if]32 shuffle,
...
if it can be implemented in 3 or fewer discrete altivec instructions, codegen
it as such. This implements Regression/CodeGen/PowerPC/vec_perf_shuffle.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27748 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 05:28:54 +00:00
Chris Lattner
cffeb86169
Regenerate with adjusted costs
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27746 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 05:26:20 +00:00
Chris Lattner
586d6a808d
Regenerate with correct offset
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27744 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 05:08:46 +00:00
Chris Lattner
c74e710000
Increase the opcodes by one each to disambiguate COPY from VMRGHW.
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27742 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 00:47:48 +00:00
Chris Lattner
6703461f04
Check in a table, generated by llvm-PerfectShuffle, of optimal shuffles
...
of various 4-element vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27739 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-17 00:37:02 +00:00
Evan Cheng
06aef15843
movduprm, movshduprm bugs
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27734 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-16 18:11:28 +00:00
Evan Cheng
d8e8223ea1
Encoding bugs
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27733 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-16 07:02:22 +00:00
Evan Cheng
800f12df1e
Can't fold loads into alias vector SSE ops used for scalar operation. The load
...
address has to be 16-byte aligned but the values aren't spilled to 128-bit
locations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27732 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-16 06:58:19 +00:00
Chris Lattner
f3f69decca
Implement a TODO: have the legalizer canonicalize a bunch of operations to
...
one type (v4i32) so that we don't have to write patterns for each type, and
so that more CSE opportunities are exposed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27731 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-16 01:37:57 +00:00
Chris Lattner
2efce0a589
Add support for promoting stores from one legal type to another, allowing us
...
to write one pattern for vector stores instead of 4.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27730 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-16 01:36:45 +00:00
Chris Lattner
b17f1679e3
Make the BUILD_VECTOR lowering code much more aggressive w.r.t constant vectors.
...
Remove some done items from the todo list.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27729 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-16 01:01:29 +00:00
Chris Lattner
7f6cc0ccb5
Fix a bug in the 'shuffle(undef,x,mask) -> shuffle(x, undef,mask')' xform
...
Make the insert/extract elt -> shuffle code more aggressive.
This fixes CodeGen/PowerPC/vec_shuffle.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27728 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-16 00:51:47 +00:00
Chris Lattner
706126dac1
Canonicalize shuffle(undef,x,mask) -> shuffle(x, undef,mask').
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27727 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-16 00:03:56 +00:00
Chris Lattner
730b45694b
Fix a crash when faced with a shuffle vector that has an undef in its mask.
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27726 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-15 23:48:05 +00:00
Chris Lattner
6e94af75de
Add patterns for matching vnots with bit converted inputs. Most of these will
...
go away when I start using evan's binop type canonicalizer
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27725 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-15 23:45:24 +00:00
Chris Lattner
1fcee4edf5
Add a new vnot_conv predicate for matching vnot's where the allones vector is
...
bitconverted from some other type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27724 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-15 23:39:14 +00:00
Chris Lattner
547a16f273
Make these predicates return true for bit_convert(buildvector)'s as well as
...
buildvectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27723 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-15 23:38:00 +00:00
Evan Cheng
60d3fa24ba
More encoding bugs
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27722 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-15 06:10:09 +00:00