Commit Graph

66 Commits

Author SHA1 Message Date
Craig Topper
b3982da7d2 Merge X86 SHUFPS and SHUFPD node types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147394 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-31 23:50:21 +00:00
Craig Topper
ab44d3cf49 Remove an unused X86ISD node type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146833 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-17 19:16:44 +00:00
Craig Topper
d93e4c3496 Remove some remnants of the old palign pattern fragment that were still hanging around. Also remove a cast from inside getShuffleVPERM2X128Immediate and getShuffleVPERMILPImmediate, since the only caller had already done the cast.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146344 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-11 19:12:35 +00:00
Craig Topper
34671b812a Merge floating point and integer UNPCK X86ISD node types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145926 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-06 08:21:25 +00:00
Craig Topper
ec24e61ab0 Merge VPERM2F128/VPERM2I128 ISD node types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145485 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-30 07:47:51 +00:00
Craig Topper
316cd2a2c5 Merge decoding of VPERMILPD and VPERMILPS shuffle masks. Merge X86ISD node type for VPERMILPD/PS. Add instruction selection support for VINSERTI128/VEXTRACTI128.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145483 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-30 06:25:25 +00:00
Craig Topper
70b883b3a7 Add X86 instruction selection for VPERM2I128 when AVX2 is enabled. Merge VPERMILPS/VPERMILPD detection since they are pretty similar.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145238 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-28 10:14:51 +00:00
Craig Topper
38034c568c Merge 128-bit and 256-bit X86ISD node types for VPERMILPS and VPERMILPD. Simplify some shuffle lowering code, since V1 can never be UNDEF due to the canonicalization that occurs when shuffle nodes are created.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145153 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-26 22:55:48 +00:00
Craig Topper
06cb680779 Collapse X86ISD node types for PUNPCKH*, PUNPCKL*, UNPCKLP*, and UNPCKHP* to not be type specific. Now we just have integer high and low and floating point high and low. Pattern matching will choose the correct instruction based on the vector type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145148 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-26 20:47:44 +00:00
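As an illustration of the merge described in the commit above, here is a minimal C++ model (not LLVM code; the unpack helper and its names are invented for this sketch) showing that the interleave pattern behind PUNPCKL*/PUNPCKH*/UNPCKLP*/UNPCKHP* is the same for every element type, so one low/high node per domain plus the vector type is enough to pick the concrete instruction.

  #include <array>
  #include <cstddef>
  #include <cstdio>

  // The interleave pattern shared by the unpack instructions: the low (or
  // high) halves of the two sources are interleaved.  The pattern does not
  // depend on the element type, only on the element count.
  template <typename T, std::size_t N>
  std::array<T, N> unpack(const std::array<T, N> &a,
                          const std::array<T, N> &b, bool high) {
    std::array<T, N> r{};
    std::size_t base = high ? N / 2 : 0;
    for (std::size_t i = 0; i < N / 2; ++i) {
      r[2 * i] = a[base + i];      // even result slots from the first source
      r[2 * i + 1] = b[base + i];  // odd result slots from the second source
    }
    return r;
  }

  int main() {
    std::array<int, 4> a{0, 1, 2, 3}, b{4, 5, 6, 7};
    auto lo = unpack(a, b, /*high=*/false);  // {0,4,1,5}, like PUNPCKLDQ
    auto hi = unpack(a, b, /*high=*/true);   // {2,6,3,7}, like PUNPCKHDQ
    for (int v : lo) std::printf("%d ", v);
    std::printf("| ");
    for (int v : hi) std::printf("%d ", v);
    std::printf("\n");
  }
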
Craig Topper
705f2431a0 Remove the 256-bit-specific node types for UNPCKHPS/D and instead use the 128-bit versions, letting the operand type distinguish them. Also fix the load form of the v8i32 patterns for these to recognize that the load would be promoted to v4i64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145126 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-24 22:57:10 +00:00
Craig Topper
f475a55bd4 Remove AVX2 specific X86ISD node types for PUNPCKH/L and instead just reuse the 128-bit versions and let the vector type distinguish.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145125 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-24 22:20:08 +00:00
Craig Topper
6fa583d787 Lowering for v32i8 to VPUNPCKLBW/VPUNPCKHBW when AVX2 is enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145028 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-21 08:26:50 +00:00
Craig Topper
6347e8662c Add support for lowering 256-bit shuffles to VPUNPCKL/H for i16, i32, i64 if AVX2 is enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145026 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-21 06:57:39 +00:00
Craig Topper
54f952afac Synthesize SSSE3/AVX 128-bit horizontal integer add/sub instructions from add/sub of appropriate shuffle vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144989 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-19 09:02:40 +00:00
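A rough C++ model of the combine sketched in the commit above (illustrative only; the helper names are invented): an element-wise add of the even-element and odd-element shuffles of two vectors is exactly the horizontal add that PHADDD computes, which is what the synthesis looks for.

  #include <array>
  #include <cstdio>

  using V4 = std::array<int, 4>;

  // shuffle(a, b, mask): elements 0-3 of the mask pick from a, 4-7 from b.
  V4 shuffle(const V4 &a, const V4 &b, std::array<int, 4> mask) {
    V4 r{};
    for (int i = 0; i < 4; ++i)
      r[i] = mask[i] < 4 ? a[mask[i]] : b[mask[i] - 4];
    return r;
  }

  // Semantics of PHADDD: pairwise sums of a, then pairwise sums of b.
  V4 hadd(const V4 &a, const V4 &b) {
    return {a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]};
  }

  int main() {
    V4 a{1, 2, 3, 4}, b{10, 20, 30, 40};
    V4 evens = shuffle(a, b, {0, 2, 4, 6});
    V4 odds  = shuffle(a, b, {1, 3, 5, 7});
    V4 sum{};
    for (int i = 0; i < 4; ++i) sum[i] = evens[i] + odds[i];
    // add(shuffle<0,2,4,6>, shuffle<1,3,5,7>) equals hadd(a, b).
    std::printf("match: %d\n", (int)(sum == hadd(a, b)));  // prints 1
  }
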
Craig Topper
3113384a34 Collapse X86 PSIGNB/PSIGNW/PSIGND node types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144988 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-19 07:33:10 +00:00
Craig Topper
1666cb6d63 Extend VPBLENDVB and VPSIGN lowering to work for AVX2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@144987 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-19 07:07:26 +00:00
Craig Topper
3f2b2c218f Add a bunch more X86 AVX2 instructions and their corresponding intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143529 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-02 04:42:13 +00:00
Duncan Sands
17470bee5f Synthesize SSE3/AVX 128-bit horizontal add/sub instructions from
floating-point add/sub of appropriate shuffle vectors.  Does not
synthesize the 256-bit AVX versions because they work differently.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140332 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-22 20:15:48 +00:00
Bruno Cardoso Lopes
df24e1fb08 Add 256-bit versions of alignedstore and alignedload, to be
more strict about the alignment checking. This was found by inspection,
and I don't have any testcases so far, although the llvm testsuite runs
without any problem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139625 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-13 19:33:03 +00:00
Nadav Rotem
5ed0983200 Format patterns, remove unused X86blend patterns
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139491 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-12 08:41:50 +00:00
Nadav Rotem
8ffad56f8e Implement vector-select support for avx256. Refactor the vblend implementation to have tablegen match the instruction by the node type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139400 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-09 20:29:17 +00:00
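For reference, a small C++ model of the vselect node that the commit above lowers for 256-bit types (a sketch of generic VSELECT semantics, not of the tablegen patterns; the types and names are invented): each result element is chosen from one of the two inputs by the corresponding condition element, which on AVX maps onto the vblendv* instructions.

  #include <array>
  #include <cstdio>

  using V8 = std::array<float, 8>;
  using M8 = std::array<bool, 8>;

  // ISD::VSELECT semantics: element i of the result comes from the first
  // input where the condition element is true, otherwise from the second.
  V8 vselect(const M8 &cond, const V8 &a, const V8 &b) {
    V8 r{};
    for (int i = 0; i < 8; ++i)
      r[i] = cond[i] ? a[i] : b[i];
    return r;
  }

  int main() {
    M8 c{true, false, true, false, true, false, true, false};
    V8 a{0, 1, 2, 3, 4, 5, 6, 7}, b{10, 11, 12, 13, 14, 15, 16, 17};
    V8 r = vselect(c, a, b);
    for (float f : r) std::printf("%g ", f);  // 0 11 2 13 4 15 6 17
    std::printf("\n");
  }
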
Bruno Cardoso Lopes
814c6ced85 Add AVX versions of blend vector operations and fix some issues noticed
in Nadav's r139285 and r139287 commits.

1) Rename vsel.ll to a more descriptive name
2) Change the order of BLEND operands to "Op1, Op2, Cond", this is
necessary because PBLENDVB is already used in different places with
this order, and it was being emitted in the wrong way for vselect
3) Add AVX patterns and tests for the same SSE41 instructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139305 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-08 18:05:08 +00:00
Nadav Rotem
ffe3e7da84 Add X86-SSE4 codegen support for vector-select.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139285 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-08 08:11:19 +00:00
Bruno Cardoso Lopes
0e6d230abd Introduce matching patterns for the vbroadcast AVX instruction. The idea is to
match splats in the form (splat (scalar_to_vector (load ...))) whenever
the load can be folded. All the logic and instruction emission is
working, but because of PR8156 there is no way to match the loads, since
they can never be folded for splats. Thus, the tests are XFAILed, but
I've tested and exercised all the logic using a relaxed version of the
foldable-load check, as if the bug were already fixed. This should
work out of the box once PR8156 gets fixed, since MayFoldLoad will
work as expected.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137810 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-17 02:29:19 +00:00
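A hedged C++ sketch of what the matched pattern ultimately computes (the helper name is illustrative, not LLVM's): vbroadcastss reads one scalar from memory and replicates it into every element, which is why a foldable load underneath a splat can become a single instruction.

  #include <array>
  #include <cstddef>
  #include <cstdio>

  // vbroadcastss loads one 32-bit scalar and replicates it into every
  // element of the destination.  The combine above looks for the DAG shape
  // splat(scalar_to_vector(load p)) and, when the load is foldable, emits
  // this single instruction instead of a load plus shuffles.
  template <std::size_t N>
  std::array<float, N> vbroadcastss(const float *p) {
    std::array<float, N> r{};
    for (std::size_t i = 0; i < N; ++i)
      r[i] = *p;                       // one memory access, N copies
    return r;
  }

  int main() {
    float mem = 3.5f;
    auto v = vbroadcastss<8>(&mem);    // 256-bit result: eight copies
    for (float f : v) std::printf("%.1f ", f);
    std::printf("\n");
  }
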
Bruno Cardoso Lopes
53cae1362d VPERM2F128 is an AVX instruction which permutes between two 256-bit
vectors. It operates on 128-bit elements instead of regular scalar
types. Recognize shuffles that are suitable for VPERM2F128 and teach
the x86 legalizer how to handle them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137519 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-12 21:48:26 +00:00
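A minimal C++ model of VPERM2F128 (assuming the documented immediate encoding: bits 1:0 select the low destination lane, bits 5:4 the high one; the zeroing bits are left out for brevity). Shuffles whose mask moves whole 128-bit halves around fit this shape, which is what the new recognition targets.

  #include <array>
  #include <cstdio>

  using Lane = std::array<float, 4>;   // one 128-bit lane of a 256-bit vector
  using V8   = std::array<Lane, 2>;    // {low lane, high lane}

  // vperm2f128 dst, a, b, imm: each destination lane is one of the four
  // source lanes, chosen by a 2-bit selector per destination lane.
  V8 vperm2f128(const V8 &a, const V8 &b, unsigned imm) {
    auto pick = [&](unsigned sel) -> Lane {
      switch (sel & 3) {
      case 0: return a[0];
      case 1: return a[1];
      case 2: return b[0];
      default: return b[1];
      }
    };
    return {pick(imm & 3), pick((imm >> 4) & 3)};
  }

  int main() {
    V8 a = {Lane{0, 1, 2, 3}, Lane{4, 5, 6, 7}};
    V8 b = {Lane{8, 9, 10, 11}, Lane{12, 13, 14, 15}};
    V8 r = vperm2f128(a, b, 0x31);     // high lane of a, high lane of b
    for (const Lane &l : r)
      for (float f : l) std::printf("%g ", f);
    std::printf("\n");                 // 4 5 6 7 12 13 14 15
  }
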
Bruno Cardoso Lopes
9065d4b65f Cleanup PALIGNR handling and remove the old palign pattern fragment.
Also make the PALIGNR masks not match 256-bit vectors, which aren't supported.
This is also a step toward solving PR10489.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136448 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-29 01:30:59 +00:00
Bruno Cardoso Lopes
cea34e41fa vpermilps and vpermilpd have different behaviour regarding the
usage of the shuffle bitmask. Both work in 128-bit lanes without
crossing, but in the former the mask of the high lane is the same
as that used by the low lane, while in the latter both lanes have
independent masks. Handle this properly and add support for vpermilpd.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136200 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-27 00:56:34 +00:00
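A small C++ model of the difference described above (illustrative only, based on the documented immediate encodings, not on the lowering code): vpermilps applies the same four 2-bit selectors to both 128-bit lanes, while vpermilpd gives each lane its own selector bits.

  #include <array>
  #include <cstdio>

  using V8 = std::array<float, 8>;   // two 128-bit lanes of 4 floats
  using V4 = std::array<double, 4>;  // two 128-bit lanes of 2 doubles

  // vpermilps imm: both lanes use the *same* four 2-bit element selectors.
  V8 vpermilps(const V8 &a, unsigned imm) {
    V8 r{};
    for (int lane = 0; lane < 2; ++lane)
      for (int i = 0; i < 4; ++i)
        r[4 * lane + i] = a[4 * lane + ((imm >> (2 * i)) & 3)];
    return r;
  }

  // vpermilpd imm: bits 0-1 steer the low lane, bits 2-3 the high lane,
  // so the two lanes get independent permutations.
  V4 vpermilpd(const V4 &a, unsigned imm) {
    V4 r{};
    for (int i = 0; i < 4; ++i)
      r[i] = a[2 * (i / 2) + ((imm >> i) & 1)];
    return r;
  }

  int main() {
    V8 s{0, 1, 2, 3, 4, 5, 6, 7};
    V4 d{0, 1, 2, 3};
    V8 p = vpermilps(s, 0x1B);  // reverses each lane: 3 2 1 0 7 6 5 4
    V4 q = vpermilpd(d, 0x1);   // low lane swapped, high lane untouched: 1 0 2 2
    for (float f : p) std::printf("%g ", f);
    std::printf("| ");
    for (double x : q) std::printf("%g ", x);
    std::printf("\n");
  }
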
Bruno Cardoso Lopes
cd9e5aed53 Remove more dead code!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136199 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-27 00:56:27 +00:00
Bruno Cardoso Lopes
4ea496846a Recognize unpckh* masks and match the 256-bit versions. The new versions are
different from the previous 128-bit ones because they work in lanes.
Update a few comments and add testcases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136157 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-26 22:03:40 +00:00
Bruno Cardoso Lopes
3e9235c720 Cleanup movsldup/movshdup matching.
27 insertions(+), 62 deletions(-)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136047 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-26 02:39:13 +00:00
Bruno Cardoso Lopes
6a32adc4e5 - Handle the special scalar_to_vector case of splats by using a native
128-bit shuffle before inserting into a 256-bit vector.
- Add AVX versions of the movd/movq instructions.
- Introduce a few COPY patterns to match insert_subvector instructions.
This turns a trivial insert_subvector instruction into a register copy,
coalescing the xmm into a ymm and avoiding emitting one more instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136002 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-25 23:05:25 +00:00
Bruno Cardoso Lopes
65b74e1d00 Add support for the 256-bit versions of the VPERMIL instruction. This is a new
instruction introduced in AVX, which can operate on 128- and 256-bit vectors.
It treats a 256-bit vector as two independent 128-bit lanes. It can permute
any 32- or 64-bit elements inside a lane, and restricts the second lane to
have the same permutation as the first one. With the improved splat support
introduced earlier today, adding codegen for this instruction enables more
efficient 256-bit code:

Instead of:
  vextractf128  $0, %ymm0, %xmm0
  punpcklbw %xmm0, %xmm0
  punpckhbw %xmm0, %xmm0
  vinsertf128 $0, %xmm0, %ymm0, %ymm1
  vinsertf128 $1, %xmm0, %ymm1, %ymm0
  vextractf128  $1, %ymm0, %xmm1
  shufps  $1, %xmm1, %xmm1
  movss %xmm1, 28(%rsp)
  movss %xmm1, 24(%rsp)
  movss %xmm1, 20(%rsp)
  movss %xmm1, 16(%rsp)
  vextractf128  $0, %ymm0, %xmm0
  shufps  $1, %xmm0, %xmm0
  movss %xmm0, 12(%rsp)
  movss %xmm0, 8(%rsp)
  movss %xmm0, 4(%rsp)
  movss %xmm0, (%rsp)
  vmovaps (%rsp), %ymm0
We get:
  vextractf128  $0, %ymm0, %xmm0
  punpcklbw %xmm0, %xmm0
  punpckhbw %xmm0, %xmm0
  vinsertf128 $0, %xmm0, %ymm0, %ymm1
  vinsertf128 $1, %xmm0, %ymm1, %ymm0
  vpermilps $85, %ymm0, %ymm0

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135662 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-21 01:55:47 +00:00
Benjamin Kramer
3be41b748e Port operand types for ARM and X86 over from EDIS to the .td files.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135198 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-14 21:47:22 +00:00
Bruno Cardoso Lopes
466b022c99 Make X86ISD::ANDNP more general and codegen 256-bit VANDNP. A more
general version of X86ISD::ANDNP also opened room for a little bit
of refactoring.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135088 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-13 21:36:51 +00:00
Bruno Cardoso Lopes
c1af4772f1 The name of the target-specific node PANDN is misleading. That is because
it is later selected to an ANDNPD/ANDNPS instruction instead of the PANDN
instruction. Rename it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135087 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-13 21:36:47 +00:00
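For clarity, the operation behind the renamed node, as a one-function C++ sketch (not LLVM code): NOT of the first operand ANDed with the second, which is what andnps, andnpd, and pandn all compute; only the register domain differs.

  #include <cstdint>
  #include <cstdio>

  // The operation behind X86ISD::ANDNP (and the andnps/andnpd/pandn
  // instructions): bitwise NOT of the first operand ANDed with the second.
  uint32_t andnp(uint32_t a, uint32_t b) { return ~a & b; }

  int main() {
    // Typical use: clear the bits of b that are selected by the mask a.
    std::printf("0x%08x\n", andnp(0x0000ffffu, 0x12345678u));  // 0x12340000
  }
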
Bruno Cardoso Lopes
61905f0139 AVX Codegen support for 256-bit versions of vandps, vandpd, vorps, vorpd, vxorps, vxorpd
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135023 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-13 01:15:33 +00:00
Stuart Hastings
865f09334f Reapply 132424 with fixes. This fixes PR10068.
rdar://problem/5993888


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132606 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-03 23:53:54 +00:00
Rafael Espindola
251b4a0405 Revert 132424 to fix PR10068.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132479 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-02 19:57:47 +00:00
Stuart Hastings
ec880283b3 Recommit 132404 with fixes. rdar://problem/5993888
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132424 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-01 21:33:14 +00:00
Stuart Hastings
4abc5fea9c Revert 132404 to appease a buildbot. rdar://problem/5993888
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132419 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-01 19:52:20 +00:00
Stuart Hastings
10ff0bbdfb Add support for x86 CMPEQSS and friends. These instructions do a
floating-point comparison, generate a mask of 0s or 1s, and generally
DTRT with NaNs.  Only profitable when the user wants a materialized 0
or 1 at runtime.  rdar://problem/5993888


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132404 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-01 17:17:45 +00:00
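A rough C++ model of why this is only profitable when a materialized 0 or 1 is wanted (a sketch of the documented cmpeqss behavior, not of the lowering itself): the compare produces an all-ones or all-zeros mask in the low element, which can then be ANDed with a constant such as 1.0f without any branch.

  #include <cstdint>
  #include <cstdio>
  #include <cstring>

  // cmpeqss compares the low single-precision elements and produces
  // 0xFFFFFFFF when they are equal and 0x00000000 otherwise (NaN operands
  // compare unequal for this predicate) in the low element of the result.
  uint32_t cmpeqss(float a, float b) {
    return a == b ? 0xFFFFFFFFu : 0x00000000u;
  }

  int main() {
    // Materialize 0.0f or 1.0f without a branch: AND the mask with 1.0f.
    uint32_t one_bits;
    float one = 1.0f;
    std::memcpy(&one_bits, &one, sizeof one);
    uint32_t bits = cmpeqss(2.0f, 2.0f) & one_bits;
    float result;
    std::memcpy(&result, &bits, sizeof result);
    std::printf("%g\n", result);  // prints 1
  }
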
Stuart Hastings
4fd0dee3bf FGETSIGN support for x86, using movmskps/pd. Will be enabled with a
patch to TargetLowering.cpp.  rdar://problem/5660695


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132388 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-01 04:39:42 +00:00
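A hedged sketch of the idea in C++ (the helper is invented for illustration): movmskps packs the sign bit of each single-precision element into the low bits of a general-purpose register, so FGETSIGN of a scalar is just bit 0 of that mask.

  #include <cstdint>
  #include <cstdio>
  #include <cstring>

  // movmskps gathers the sign bit of each single-precision element into the
  // low bits of an integer register; for a scalar living in element 0,
  // FGETSIGN is simply bit 0 of that mask.
  int fgetsign(float x) {
    uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return (bits >> 31) & 1;        // what movmskps bit 0 would hold
  }

  int main() {
    std::printf("%d %d %d\n", fgetsign(1.5f), fgetsign(-0.0f), fgetsign(-3.0f));
    // prints: 0 1 1
  }
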
David Greene
a20244d1ba [AVX] Fix mask predicates for 256-bit UNPCKLPS/D and implement
missing patterns for them.

Add a SIMD test subdirectory to hold tests for SIMD instruction
selection correctness and quality.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126845 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-02 17:23:43 +00:00
David Greene
ccacdc1952 [AVX] Support VINSERTF128 with more patterns and appropriate
infrastructure.  This makes lowering 256-bit vectors to 128-bit
vectors simple when 256-bit vector support is not available.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124868 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-04 16:08:29 +00:00
David Greene
c38a03eeca [AVX] VEXTRACTF128 support. This commit includes patterns for
matching EXTRACT_SUBVECTOR to VEXTRACTF128 along with support routines
to examine and translate index values.  VINSERTF128 comes next.  With
these two in place we can begin supporting more AVX operations as
INSERT/EXTRACT can be used as a fallback when 256-bit support is not
available.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124797 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-03 15:50:00 +00:00
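A small C++ model of the index translation mentioned above (illustrative, with invented helper names): vextractf128's immediate selects the low or high 128-bit half, so an EXTRACT_SUBVECTOR element index of 0 or 4 on a v8f32 maps to immediate 0 or 1.

  #include <array>
  #include <cstdio>

  using V8 = std::array<float, 8>;   // a 256-bit vector of 8 floats
  using V4 = std::array<float, 4>;   // a 128-bit subvector

  // vextractf128 dst, src, imm copies the low (imm = 0) or high (imm = 1)
  // 128-bit half of a 256-bit register.  An EXTRACT_SUBVECTOR with element
  // index 0 or 4 on a v8f32 therefore maps to imm = index / 4.
  V4 vextractf128(const V8 &src, unsigned imm) {
    V4 r{};
    for (int i = 0; i < 4; ++i)
      r[i] = src[4 * (imm & 1) + i];
    return r;
  }

  int main() {
    V8 v{0, 1, 2, 3, 4, 5, 6, 7};
    V4 hi = vextractf128(v, 1);      // the high half: 4 5 6 7
    for (float f : hi) std::printf("%g ", f);
    std::printf("\n");
  }
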
Nate Begeman
672fb6225b Implement feedback from Bruno on making pblendvb an x86-specific ISD node in addition to being an intrinsic, and convert
lowering to use it.  Hopefully the pattern fragment is doing the right thing with XMM0; it looks correct in testing.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122277 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-20 22:04:24 +00:00
Nate Begeman
b65c175d32 Add support for matching psign & pblendvb to the x86 target.
Remove unnecessary pandn patterns; the 'vnot' patfrag looks through bitcasts.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122098 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-17 22:55:37 +00:00
Dale Johannesen
0488fb649a Massive rewrite of MMX:
The x86_mmx type is used for MMX intrinsics, parameters and
return values where these use MMX registers, and is also
supported in load, store, and bitcast.

Only the above operations generate MMX instructions, and optimizations
do not operate on or produce MMX intrinsics. 

MMX-sized vectors <2 x i32> etc. are lowered to XMM or split into
smaller pieces.  Optimizations may occur on these forms and the
result cast back to x86_mmx, provided the result feeds into a
pre-existing x86_mmx operation.

The point of all this is to prevent optimizations from introducing
MMX operations, which is unsafe due to the EMMS problem.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115243 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-30 23:57:10 +00:00
Chris Lattner
8864155a35 give VZEXT_LOAD a memory operand, it now works with segment registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114515 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-22 00:34:38 +00:00
Chris Lattner
52a261b3c1 fix a long-standing wart: all the ComplexPatterns were being
passed the root of the match, even though only a few patterns
actually needed this (one in X86, several in ARM [which should
be refactored anyway], and some in CellSPU that I don't feel
like detangling). Instead of requiring all ComplexPatterns to
take the dead root, have targets opt into getting the root by
putting SDNPWantRoot on the ComplexPattern.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114471 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-21 20:31:19 +00:00