134 Commits

Author SHA1 Message Date
Benjamin Kramer
9f2c21871c X86: Lower SMUL_LOHI of v4i32 to pmuldq when SSE4.1 is available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207318 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-26 14:12:19 +00:00
Filipe Cabecinhas
c5b286bc41 Rename X86insrtps to the proper instruction name.
Summary:
The INSERTPS pattern fragment was called insrtps (mising 'e'), which
would make it harder to grep for the patterns related to this instruction.
Renaming it to use the proper instruction name.

Reviewers: nadav

CC: llvm-commits

Differential Revision: http://reviews.llvm.org/D3443

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206779 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-21 20:07:29 +00:00
Elena Demikhovsky
27ef6eec41 AVX-512: Fixed extract_vector_elt for v16i1 and v8i1 vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201066 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-10 07:02:39 +00:00
Tim Northover
c0fc62c2f9 X86: deduplicate V[SZ]EXT_MOVL and V[SZ]EXT nodes
I believe VZEXT_MOVL means "zero all vector elements except the first" (and
should have identical input & output types) whereas VZEXT means "zero extend
each element of a vector (discarding higher elements if necessary)".

For example:
    (v4i32 (vzext (v16i8 ...)))

should zero extend the low 4 bytes of the incoming vector to 32-bits,
discarding higher bytes.

However, somewhere in the past, these two concepts had become confused, even
leading to a nonsensical VSEXT_MOVL.

This re-merges the nodes where appropriate (all VSEXT_MOVL -> VSEXT, VZEXT_MOVL
-> VZEXT when it's an actual extension).

rdar://problem/15981990

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200918 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-06 09:54:51 +00:00
Elena Demikhovsky
002683abc7 AVX-512: Added intrinsic for cvtph2ps.
Added VPTESTNM instruction.
Added a pattern to vselect (lit tests will follow).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200823 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-05 07:05:03 +00:00
Craig Topper
8673b5492a Improve some x86 type constraints.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200120 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-26 04:59:39 +00:00
Elena Demikhovsky
e1a621d84f AVX-512: added VPERM2D VPERM2Q VPERM2PS VPERM2PD instructions,
they give better sequences than VPERMI


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199893 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-23 14:27:26 +00:00
Elena Demikhovsky
a56ae89d22 AVX-512: added intrinsic vcvtpd2ps (with rounding mode and without)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198593 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-06 08:45:54 +00:00
Elena Demikhovsky
3062a311ac AVX-512: Added intrinsics for vcvt, vcvtt, vrndscale, vcmp
Printing rounding control.
Enncoding for EVEX_RC (rounding control).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198277 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-01 15:12:34 +00:00
Elena Demikhovsky
1f0a0e314d AVX-512: Added implementation of CONCAT_VECTORS for v8i1 vectors (by Alexey Bader).
Added implementation of "truncate" from integer type (i64/i32/i16/i8) to i1.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197482 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-17 08:33:15 +00:00
Elena Demikhovsky
376a81d8ce AVX-512: Added legal type MVT::i1 and VK1 register for it.
Added scalar compare VCMPSS, VCMPSD.
Implemented LowerSELECT for scalar FP operations.
I replaced FSETCCss, FSETCCsd with one node type FSETCCs.
Node extract_vector_elt(v16i1/v8i1, idx) returns an element of type i1.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197384 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-16 13:52:35 +00:00
Elena Demikhovsky
ea79feb1a8 AVX-512: aligned / unaligned load and store for 512-bit integer vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193156 91177308-0d34-0410-b5e6-96231b3b80d8
2013-10-22 09:19:28 +00:00
Elena Demikhovsky
f9d2d2dc89 AVX-512: implemented extractelement with variable index.
Added parsing of mask register and "zeroing" semantic, like {%k1} {z}.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190595 91177308-0d34-0410-b5e6-96231b3b80d8
2013-09-12 08:55:00 +00:00
Elena Demikhovsky
4edfa2278a AVX-512: added extend and truncate instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189580 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-29 11:56:53 +00:00
Craig Topper
318f4e679b Make sure x86 instructions using ssmem/sdmem operand types are only able to parse memory operands of the proper size in Intel syntax. Primarily affects some of sse cvt instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189206 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-26 00:39:04 +00:00
Elena Demikhovsky
8ba76daba0 AVX-512: Added SHIFT instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188899 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-21 09:36:02 +00:00
Elena Demikhovsky
f12df0ad50 AVX-512: added arithmetic and logical operations.
ADD, SUB, MUL integer and FP types. OR, AND, XOR.
Added embeded broadcast form for these instructions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188673 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-19 13:26:14 +00:00
Elena Demikhovsky
3491d67d3a AVX-512: Added VMOVD, VMOVQ, VMOVSS, VMOVSD instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188637 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-18 13:08:57 +00:00
Craig Topper
0163356ad1 Don't use v16i32 for load pattern matching. All 512-bit loads are cated to v8i64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188534 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-16 06:07:34 +00:00
Elena Demikhovsky
4d36bd80e6 AVX-512: Added CMP and BLEND instructions.
Lowering for SETCC.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188265 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-13 13:24:07 +00:00
Elena Demikhovsky
fac4a4eb7d AVX-512: Added VPERM* instructons and MOV* zmm-to-zmm instructions.
Added a test for shuffles using VPERM.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188147 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-11 07:55:09 +00:00
Elena Demikhovsky
207600d2cf AVX-512 set: Added BROADCAST instructions
with lowering logic and a test.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187884 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-07 12:34:55 +00:00
Elena Demikhovsky
13e6e9171f AVX-512 set: added mask operations, lowering BUILD_VECTOR for i1 vector types.
Added intrinsics and tests.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187717 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-05 08:52:21 +00:00
Benjamin Kramer
75311b7b4d X86: Turn fp selects into mask operations.
double test(double a, double b, double c, double d) { return a<b ? c : d; }

before:
_test:
	ucomisd	%xmm0, %xmm1
	ja	LBB0_2
	movaps	%xmm3, %xmm2
LBB0_2:
	movaps	%xmm2, %xmm0

after:
_test:
	cmpltsd	%xmm1, %xmm0
	andpd	%xmm0, %xmm2
	andnpd	%xmm3, %xmm0
	orpd	%xmm2, %xmm0

Small speedup on Benchmarks/SmallPT

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187706 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-04 12:05:16 +00:00
Elena Demikhovsky
8395251c0a Added INSERT and EXTRACT intructions from AVX-512 ISA.
All insertf*/extractf* functions replaced with insert/extract since we have insertf and inserti forms.
Added lowering for INSERT_VECTOR_ELT / EXTRACT_VECTOR_ELT for 512-bit vectors.
Added lowering for EXTRACT/INSERT subvector for 512-bit vectors.
Added a test.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187491 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-31 11:35:14 +00:00
Craig Topper
4aee1bb222 Fix inconsistent usage of PALIGN and PALIGNR when referring to the same instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173667 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-28 06:48:25 +00:00
Benjamin Kramer
739c7a83e1 X86: Match the SSE/AVX min/max vector ops using a custom node instead of intrinsics
This is very mechanical, no functionality change. Preparation for PR14667.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170898 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-21 14:04:55 +00:00
Benjamin Kramer
388fc6a988 X86: Add a couple of target-specific dag combines that turn VSELECTS into psubus if possible.
We match the pattern "x >= y ? x-y : 0" into "subus x, y" and two special cases
if y is a constant. DAGCombiner canonicalizes those so we first have to undo the
canonicalization for those cases. The pattern occurs in gzip when the loop
vectorizer is enabled. Part of PR14613.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170273 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-15 16:47:44 +00:00
Elena Demikhovsky
226e0e6264 Simplified BLEND pattern matching for shuffles.
Generate VPBLENDD for AVX2 and VPBLENDW for v16i16 type on AVX2.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169366 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-05 09:24:57 +00:00
Michael Liao
d9d09600ee Enable lowering ZERO_EXTEND/ANY_EXTEND to PMOVZX from SSE4.1
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166486 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-23 17:34:00 +00:00
Michael Liao
44c2d61b67 Add support for FP_ROUND from v2f64 to v2f32
- Due to the current matching vector elements constraints in
  ISD::FP_ROUND, rounding from v2f64 to v4f32 (after legalization from
  v2f32) is scalarized. Add a customized v2f32 widening to convert it
  into a target-specific X86ISD::VFPROUND to work around this
  constraints.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165631 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 16:53:28 +00:00
Michael Liao
b8150d8523 Enhance PR11334 fix to support extload from v2f32/v4f32
- Fix an remaining issue of PR11674 as well



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163528 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-10 18:33:51 +00:00
Craig Topper
fd49821c35 Convert FMA4 patterns to use target specific nodes instead of intrinsics to align with FMA3.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162829 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-29 07:18:25 +00:00
Nadav Rotem
d60cb11afd When unsafe math is used, we can use commutative FMAX and FMIN. In some cases
this allows for better code generation.

Added a new DAGCombine transformation to convert FMAX and FMIN to FMANC and
FMINC, which are commutative.

For example:

  movaps  %xmm0, %xmm1
  movsd LC(%rip), %xmm0
  minsd %xmm1, %xmm0

becomes:

  minsd LC(%rip), %xmm0




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162187 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-19 13:06:16 +00:00
Michael Liao
7091b2451d fix PR11334
- FP_EXTEND only support extending from vectors with matching elements.
  This results in the scalarization of extending to v2f64 from v2f32,
  which will be legalized to v4f32 not matching with v2f64.
- add X86-specific VFPEXT supproting extending from v4f32 to v2f64.
- add BUILD_VECTOR lowering helper to recover back the original
  extending from v4f32 to v2f64.
- test case is enhanced to include different vector width.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161894 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-14 21:24:47 +00:00
Craig Topper
4feb647283 Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161318 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-06 06:22:36 +00:00
Elena Demikhovsky
1503aba4a0 Added FMA functionality to X86 target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161110 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-01 12:06:00 +00:00
Bill Wendling
56cb229866 Remove tabs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160477 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-19 00:11:40 +00:00
Craig Topper
2a5dc43bd9 Use XOP vpcom intrinsics in patterns instead of a target specific SDNode type. Remove the custom lowering code that selected the SDNode type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158279 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09 17:02:24 +00:00
Elena Demikhovsky
1da5867236 ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155309 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-22 09:39:03 +00:00
Craig Topper
9204074598 Change type profile for vpermv back to using operand type for the mask argument to match intrinsic behavior. Add a bitcast to the lowering code to convert mask from v8i32 to v8f32 for vpermps.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154798 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 06:43:40 +00:00
Craig Topper
8325c11d47 Merge vpermps/vpermd and vpermpd/vpermq SD nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154782 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 00:41:45 +00:00
Craig Topper
ca9ee66e36 Fix SDTypeProfile for vpermps. The mask operand should be v8i32.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154781 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-16 00:12:20 +00:00
Elena Demikhovsky
73c504af9d Added VPERM optimization for AVX2 shuffles
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154761 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-15 11:18:59 +00:00
Nadav Rotem
e611378a6e Reapply 154396 after fixing a test.
Original message:
Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
blendV uses a register for the selection while Vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154483 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-11 06:40:27 +00:00
Eric Christopher
a139051654 Temporarily revert this patch to see if it brings the buildbots back.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154425 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 19:33:16 +00:00
Nadav Rotem
50e64cfe6e Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
blendv uses a register for the selection while vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154396 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-10 14:33:13 +00:00
Chad Rosier
abd6674166 Fix a regression from r147481.
Original commit message from r147481:
DAGCombine for transforming 128->256 casts into a vmovaps, rather
then a vxorps + vinsertf128 pair if the original vector came from a load.

Fix:
Unaligned loads need to generate a vmovups.
rdar://10974078



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@152366 91177308-0d34-0410-b5e6-96231b3b80d8
2012-03-09 02:00:48 +00:00
Jia Liu
44de83a7f6 some comment fix for X86 and ARM
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150902 91177308-0d34-0410-b5e6-96231b3b80d8
2012-02-19 02:03:36 +00:00
Jia Liu
31d157ae1a Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150878 91177308-0d34-0410-b5e6-96231b3b80d8
2012-02-18 12:03:15 +00:00