Commit Graph

53 Commits

Author SHA1 Message Date
Craig Topper
54f952afac Synthesize SSSE3/AVX 128-bit horizontal integer add/sub instructions from add/sub of appropriate shuffle vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144989 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-19 09:02:40 +00:00
Craig Topper
3113384a34 Collapse X86 PSIGNB/PSIGNW/PSIGND node types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144988 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-19 07:33:10 +00:00
Craig Topper
1666cb6d63 Extend VPBLENDVB and VPSIGN lowering to work for AVX2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144987 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-19 07:07:26 +00:00
Craig Topper
3f2b2c218f Add a bunch more X86 AVX2 instructions and their corresponding intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143529 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-02 04:42:13 +00:00
Duncan Sands
17470bee5f Synthesize SSE3/AVX 128 bit horizontal add/sub instructions from
floating point add/sub of appropriate shuffle vectors.  Does not
synthesize the 256 bit AVX versions because they work differently.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140332 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-22 20:15:48 +00:00
Bruno Cardoso Lopes
df24e1fb08 Add versions 256-bit versions of alignedstore and alignedload, to be
more strict about the alignment checking. This was found by inspection
and I don't have any testcases so far, although the llvm testsuite runs
without any problem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139625 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-13 19:33:03 +00:00
Nadav Rotem
5ed0983200 Format patterns, remove unused X86blend patterns
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139491 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-12 08:41:50 +00:00
Nadav Rotem
8ffad56f8e Implement vector-select support for avx256. Refactor the vblend implementation to have tablegen match the instruction by the node type
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139400 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-09 20:29:17 +00:00
Bruno Cardoso Lopes
814c6ced85 Add AVX versions of blend vector operations and fix some issues noticed
in Nadav's r139285 and r139287 commits.

1) Rename vsel.ll to a more descriptive name
2) Change the order of BLEND operands to "Op1, Op2, Cond", this is
necessary because PBLENDVB is already used in different places with
this order, and it was being emitted in the wrong way for vselect
3) Add AVX patterns and tests for the same SSE41 instructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139305 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-08 18:05:08 +00:00
Nadav Rotem
ffe3e7da84 Add X86-SSE4 codegen support for vector-select.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139285 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-08 08:11:19 +00:00
Bruno Cardoso Lopes
0e6d230abd Introduce matching patterns for vbroadcast AVX instruction. The idea is to
match splats in the form (splat (scalar_to_vector (load ...))) whenever
the load can be folded. All the logic and instruction emission is
working but because of PR8156, there are no ways to match loads, cause
they can never be folded for splats. Thus, the tests are XFAILed, but
I've tested and exercised all the logic using a relaxed version for
checking the foldable loads, as if the bug was already fixed. This
should work out of the box once PR8156 gets fixed since MayFoldLoad will
work as expected.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137810 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-17 02:29:19 +00:00
Bruno Cardoso Lopes
53cae1362d The VPERM2F128 is a AVX instruction which permutes between two 256-bit
vectors. It operates on 128-bit elements instead of regular scalar
types. Recognize shuffles that are suitable for VPERM2F128 and teach
the x86 legalizer how to handle them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137519 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-12 21:48:26 +00:00
Bruno Cardoso Lopes
9065d4b65f Cleanup PALIGNR handling and remove the old palign pattern fragment.
Also make PALIGNR masks to don't match 256-bits, which isn't supported
It's also a step to solve PR10489

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136448 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-29 01:30:59 +00:00
Bruno Cardoso Lopes
cea34e41fa The vpermilps and vpermilpd have different behaviour regarding the
usage of the shuffle bitmask. Both work in 128-bit lanes without
crossing, but in the former the mask of the high part is the same
used by the low part while in the later both lanes have independent
masks. Handle this properly and and add support for vpermilpd.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136200 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-27 00:56:34 +00:00
Bruno Cardoso Lopes
cd9e5aed53 Remove more dead code!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136199 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-27 00:56:27 +00:00
Bruno Cardoso Lopes
4ea496846a Recognize unpckh* masks and match 256-bit versions. The new versions are
different from the previous 128-bit because they work in lanes.
Update a few comments and add testcases

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136157 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-26 22:03:40 +00:00
Bruno Cardoso Lopes
3e9235c720 Cleanup movsldup/movshdup matching.
27 insertions(+), 62 deletions(-)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136047 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-26 02:39:13 +00:00
Bruno Cardoso Lopes
6a32adc4e5 - Handle special scalar_to_vector case: splats. Using a native 128-bit
shuffle before inserting on a 256-bit vector.
- Add AVX versions of movd/movq instructions
- Introduce a few COPY patterns to match insert_subvector instructions.
This turns a trivial insert_subvector instruction into a register copy,
coalescing the xmm into a ymm and avoid emiting on more instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136002 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-25 23:05:25 +00:00
Bruno Cardoso Lopes
65b74e1d00 Add support for 256-bit versions of VPERMIL instruction. This is a new
instruction introduced in AVX, which can operate on 128 and 256-bit vectors.
It considers a 256-bit vector as two independent 128-bit lanes. It can permute
any 32 or 64 elements inside a lane, and restricts the second lane to
have the same permutation of the first one. With the improved splat support
introduced early today, adding codegen for this instruction enable more
efficient 256-bit code:

Instead of:
  vextractf128  $0, %ymm0, %xmm0
  punpcklbw %xmm0, %xmm0
  punpckhbw %xmm0, %xmm0
  vinsertf128 $0, %xmm0, %ymm0, %ymm1
  vinsertf128 $1, %xmm0, %ymm1, %ymm0
  vextractf128  $1, %ymm0, %xmm1
  shufps  $1, %xmm1, %xmm1
  movss %xmm1, 28(%rsp)
  movss %xmm1, 24(%rsp)
  movss %xmm1, 20(%rsp)
  movss %xmm1, 16(%rsp)
  vextractf128  $0, %ymm0, %xmm0
  shufps  $1, %xmm0, %xmm0
  movss %xmm0, 12(%rsp)
  movss %xmm0, 8(%rsp)
  movss %xmm0, 4(%rsp)
  movss %xmm0, (%rsp)
  vmovaps (%rsp), %ymm0
We get:
  vextractf128  $0, %ymm0, %xmm0
  punpcklbw %xmm0, %xmm0
  punpckhbw %xmm0, %xmm0
  vinsertf128 $0, %xmm0, %ymm0, %ymm1
  vinsertf128 $1, %xmm0, %ymm1, %ymm0
  vpermilps $85, %ymm0, %ymm0

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135662 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-21 01:55:47 +00:00
Benjamin Kramer
3be41b748e Port operand types for ARM and X86 over from EDIS to the .td files.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135198 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-14 21:47:22 +00:00
Bruno Cardoso Lopes
466b022c99 Make X86ISD::ANDNP more general and Codegen 256-bit VANDNP. A more
general version of X86ISD::ANDNP also opened the room for a little bit
of refactoring.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135088 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-13 21:36:51 +00:00
Bruno Cardoso Lopes
c1af4772f1 The target specific node PANDN name is misleading. That happens because
it's later selected to a ANDNPD/ANDNPS instruction instead of the PANDN
instruction. Rename it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135087 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-13 21:36:47 +00:00
Bruno Cardoso Lopes
61905f0139 AVX Codegen support for 256-bit versions of vandps, vandpd, vorps, vorpd, vxorps, vxorpd
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135023 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-13 01:15:33 +00:00
Stuart Hastings
865f09334f Reapply 132424 with fixes. This fixes PR10068.
rdar://problem/5993888


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132606 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-03 23:53:54 +00:00
Rafael Espindola
251b4a0405 Revert 132424 to fix PR10068.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132479 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-02 19:57:47 +00:00
Stuart Hastings
ec880283b3 Recommit 132404 with fixes. rdar://problem/5993888
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132424 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-01 21:33:14 +00:00
Stuart Hastings
4abc5fea9c Revert 132404 to appease a buildbot. rdar://problem/5993888
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132419 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-01 19:52:20 +00:00
Stuart Hastings
10ff0bbdfb Add support for x86 CMPEQSS and friends. These instructions do a
floating-point comparison, generate a mask of 0s or 1s, and generally
DTRT with NaNs.  Only profitable when the user wants a materialized 0
or 1 at runtime.  rdar://problem/5993888


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132404 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-01 17:17:45 +00:00
Stuart Hastings
4fd0dee3bf FGETSIGN support for x86, using movmskps/pd. Will be enabled with a
patch to TargetLowering.cpp.  rdar://problem/5660695


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132388 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-01 04:39:42 +00:00
David Greene
a20244d1ba [AVX] Fix mask predicates for 256-bit UNPCKLPS/D and implement
missing patterns for them.

      Add a SIMD test subdirectory to hold tests for SIMD instruction
      selection correctness and quality.
'


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126845 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-02 17:23:43 +00:00
David Greene
ccacdc1952 [AVX] Support VSINSERTF128 with more patterns and appropriate
infrastructure.  This makes lowering 256-bit vectors to 128-bit
vectors simple when 256-bit vector support is not available.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124868 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-04 16:08:29 +00:00
David Greene
c38a03eeca [AVX] VEXTRACTF128 support. This commit includes patterns for
matching EXTRACT_SUBVECTOR to VEXTRACTF128 along with support routines
to examine and translate index values.  VINSERTF128 comes next.  With
these two in place we can begin supporting more AVX operations as
INSERT/EXTRACT can be used as a fallback when 256-bit support is not
available.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124797 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-03 15:50:00 +00:00
Nate Begeman
672fb6225b Implement feedback from Bruno on making pblendvb an x86-specific ISD node in addition to being an intrinsic, and convert
lowering to use it.  Hopefully the pattern fragment is doing the right thing with XMM0, looks correct in testing.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122277 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-20 22:04:24 +00:00
Nate Begeman
b65c175d32 Add support for matching psign & plendvb to the x86 target
Remove unnecessary pandn patterns, 'vnot' patfrag looks through bitcasts


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122098 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-17 22:55:37 +00:00
Dale Johannesen
0488fb649a Massive rewrite of MMX:
The x86_mmx type is used for MMX intrinsics, parameters and
return values where these use MMX registers, and is also
supported in load, store, and bitcast.

Only the above operations generate MMX instructions, and optimizations
do not operate on or produce MMX intrinsics. 

MMX-sized vectors <2 x i32> etc. are lowered to XMM or split into
smaller pieces.  Optimizations may occur on these forms and the
result casted back to x86_mmx, provided the result feeds into a
previous existing x86_mmx operation.

The point of all this is prevent optimizations from introducing
MMX operations, which is unsafe due to the EMMS problem.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115243 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-30 23:57:10 +00:00
Chris Lattner
8864155a35 give VZEXT_LOAD a memory operand, it now works with segment registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114515 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-22 00:34:38 +00:00
Chris Lattner
52a261b3c1 fix a long standing wart: all the ComplexPattern's were being
passed the root of the match, even though only a few patterns
actually needed this (one in X86, several in ARM [which should
be refactored anyway], and some in CellSPU that I don't feel 
like detangling).   Instead of requiring all ComplexPatterns to
take the dead root, have targets opt into getting the root by
putting SDNPWantRoot on the ComplexPattern.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114471 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-21 20:31:19 +00:00
Dale Johannesen
e5db19ebd5 Fix typos. 128-bit PSHUFB takes 128-bit memory op.
v8i16 is not an MMX type; put it where it belongs.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113785 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-13 21:15:43 +00:00
Bill Wendling
1ec2ee6526 Reapply r113585. The msvc machine is mercurial.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113610 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-10 20:20:28 +00:00
Bill Wendling
b28f579200 r113585 was causing clang-i686-xp-msvc9 to fail in mysterious ways that I can't
understand (the log file was no help).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113605 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-10 19:20:47 +00:00
Bill Wendling
81781ed939 Mark the sse_load_f32 and sse_load_f64 load patterns as having memoperands so
that the memoperands are properly set after DAG building and general mucking
about.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113585 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-10 10:34:22 +00:00
Bruno Cardoso Lopes
70e81f1517 Remove unused target specific node
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113224 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-07 17:38:55 +00:00
Bruno Cardoso Lopes
56098f5d26 Use movlps, movlpd, movss and movsd specific nodes instead of pattern matching with movlp pattern fragment
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112694 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-01 05:08:25 +00:00
Bruno Cardoso Lopes
013bb3dee9 Use x86 specific MOVSLDUP node, add more patterns to match it and remove useless load nodes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112661 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-31 22:35:05 +00:00
Bruno Cardoso Lopes
f2db5b48d0 Use MOVLHPS and MOVHLPS x86 nodes whenever possible. Also remove some useless nodes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112642 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-31 21:15:21 +00:00
Bruno Cardoso Lopes
3157ef1c13 This is the first step towards refactoring the x86 vector shuffle code. The
general idea here is to have a group of x86 target specific nodes which are
going to be selected during lowering and then directly matched in isel.

The commit includes the addition of those specific nodes and a *bunch* of
patterns, and incrementally we're going to switch between them and what we
have right now. Both the patterns and target specific nodes can change as
we move forward with this work.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111691 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-20 22:55:05 +00:00
Bruno Cardoso Lopes
30baa63474 Add comments to some pattern fragments in x86
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111041 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-13 20:39:01 +00:00
Bruno Cardoso Lopes
045573ce21 Add AVX matching patterns to Packed Bit Test intrinsics.
Apply the same approach of SSE4.1 ptest intrinsics but
create a new x86 node "testp" since AVX introduces
vtest{ps}{pd} instructions which set ZF and CF depending
on sign bit AND and ANDN of packed floating-point sources.

This is slightly different from what the "ptest" does.
Tests comming with the other 256 intrinsics tests.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110744 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-10 23:25:42 +00:00
Bruno Cardoso Lopes
bd2d90f5a5 Patterns to match AVX 256-bit permutation intrinsics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110468 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-06 20:03:27 +00:00
Bruno Cardoso Lopes
2b69143083 Add more 256-bit forms for a bunch of regular AVX instructions
Add 64-bit (GR64) versions of some instructions (which are not
described in their SSE forms, but are described in AVX)


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109063 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-21 23:53:50 +00:00