Commit Graph

3286 Commits

Author SHA1 Message Date
Matthias Braun
48362d63cf Revert "X86: Reject register operands with obvious type mismatches."
Revert until http://llvm.org/PR23955 is investigated.

This reverts commit r239309.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240746 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-26 00:26:49 +00:00
Ahmed Bougacha
eb78c2fbdf [X86] Accept hasAVX512() as well as hasFMA() when generating FMA.
We don't always have FMA, for example when using 'clang -mavx512f'
without an explicit CPU.

Also check for an explicit +avx512f instead of CPUs in a couple
related tests.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240616 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-25 00:44:46 +00:00
Elena Demikhovsky
d96e362b3f AVX-512: Added all forms of VPABS instruction
Added all intrinsics, tests for encoding, tests for intrinsics.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240386 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-23 08:19:46 +00:00
Rafael Espindola
09bbd16112 Avoid a Symbol -> Name -> Symbol conversion.
Before this we were producing a TargetExternalSymbol from a MCSymbol.
That meant extracting the symbol name and fetching the symbol again
down the pipeline.

This patch adds a DAG.getMCSymbol that lets the MCSymbol pass unchanged on the
DAG.

Doing so removes the need for MO_NOPREFIX and fixes the root cause of pr23900,
allowing r240130 to be committed again.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240300 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-22 17:46:53 +00:00
Elena Demikhovsky
4f1ddd396b AVX-512: All forms of VCOPMRESS VEXPAND instructions,
encoding tests.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240272 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-22 11:16:30 +00:00
Elena Demikhovsky
42ceb12123 Reverted AVX-512 vector shuffle
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240258 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-22 09:01:15 +00:00
Elena Demikhovsky
c768510422 AVX-512: Added intrinsics for VPERMT2W/D/Q/PS/PD and
VPERMI2W/D/Q/PS/PD instructions.
Added tests.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240256 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-22 06:45:48 +00:00
Simon Pilgrim
da5f3d8f76 [X86] Code tidyup - Use SDValue bool operator. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240249 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-21 21:34:32 +00:00
Simon Pilgrim
1eabc9fb9d [X86][SSE] Fix PerformSExtCombine bug that accessed the wrong return value of an aggregate type.
Fix to rL237885 to ensure that it accesses the correct return value of an aggregate type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240223 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-20 16:19:24 +00:00
Eric Christopher
933d2bd391 Fix "the the" in comments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240112 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-19 01:53:21 +00:00
Sanjay Patel
b9b8054704 use SDValue bool operator; NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240064 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-18 21:44:31 +00:00
Asaf Badouh
27a2741354 quick fix for failure from r.240012
failure:
http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/11847/steps/build_Lld/logs/stdio

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240015 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-18 12:57:24 +00:00
Simon Pilgrim
6ebf741ea2 [X86][SSE] Improved support for vector i16 to float conversions.
Added explicit sign extension for v4i16/v8i16 to v4i32/v8i32 before conversion to floats. Matches existing support for v4i8/v8i8.

Follow up to D10433


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239966 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-17 22:43:34 +00:00
Reid Kleckner
4278cac3c4 Re-land "[X86] Cache variables that only depend on the subtarget"
Re-instates r239949 without accidentally flipping the sense of UseLEA.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239950 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-17 21:50:02 +00:00
Reid Kleckner
cf4978e112 Revert "[X86] Cache variables that only depend on the subtarget"
This reverts commit r239948, tests seem to be failing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239949 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-17 21:35:02 +00:00
Reid Kleckner
bbb75718b2 [X86] Cache variables that only depend on the subtarget
There is a one-to-one relationship between X86Subtarget and
X86FrameLowering, but every frame lowering method would previously pull
the subtarget off the MachineFunction and query some subtarget
properties.

Over time, these locals began to grow in complexity and it became
important to keep their names and meaning in sync across all of the
frame lowering methods, leading to duplication. We can eliminate that
duplication by computing them once in the constructor.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239948 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-17 21:31:17 +00:00
Simon Pilgrim
e2d3e4467e [X86][SSE] Vectorize v2i32 to v2f64 conversions
This patch enables support for the conversion of v2i32 to v2f64 to use the CVTDQ2PD xmm instruction and stay on the SSE unit instead of scalarizing, sign extending to i64 and using CVTSI2SDQ scalar conversions.

Differential Revision: http://reviews.llvm.org/D10433

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239855 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-16 21:40:28 +00:00
Elena Demikhovsky
05e61f7113 X86: optimized i64 vector multiply with constant
When we multiply two 64-bit vectors, we extract lower and upper part and use the PMULUDQ instruction.
When one of the operands is a constant, the upper part may be zero, we know this at compile time.
Example: %a = mul <4 x i64> %b, <4 x i64> < i64 5, i64 5, i64 5, i64 5>.
I'm checking the value of the upper part and prevent redundant "multiply", "shift" and "add" operations.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239802 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-16 06:07:24 +00:00
Igor Breger
17ae2138b0 AVX-512: Implemented DAG lowering for shuff62x2/shufi62x2 instuctions ( Shuffle Packed Values at 128-bit Granularity )
Tests added , vector-shuffle-512-v8.ll test re-generated.

Differential Revision: http://reviews.llvm.org/D10300

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239697 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-14 13:07:47 +00:00
Igor Breger
6ea3ad7e6e AVX-512: Implemented cvtsi2ss/d cvtusi2ss/d instructions with round control for KNL.
Added intrinsics for cvtsi2ss/d instructions.
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D10430

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239694 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-14 12:44:55 +00:00
Simon Pilgrim
9223c2cb1e Stripped trailing whitespace. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239672 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-13 12:51:39 +00:00
Reid Kleckner
3e16bd3aaf [WinEH] Create an llvm.x86.seh.exceptioninfo intrinsic
This intrinsic is like framerecover plus a load. It recovers the EH
registration stack allocation from the parent frame and loads the
exception information field out of it, giving back a pointer to an
EXCEPTION_POINTERS struct. It's designed for clang to use in SEH filter
expressions instead of accessing the EXCEPTION_POINTERS parameter that
is available on x64.

This required a minor change to MC to allow defining a label variable to
another absolute framerecover label variable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239567 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-11 22:32:23 +00:00
Simon Pilgrim
44226ffc19 [X86][SSE] Vectorized i8 and i16 shift operators
This patch ensures that SHL/SRL/SRA shifts for i8 and i16 vectors avoid scalarization. It builds on the existing i8 SHL vectorized implementation of moving the shift bits up to the sign bit position and separating the 4, 2 & 1 bit shifts with several improvements:

1 - SSE41 targets can use (v)pblendvb directly with the sign bit instead of performing a comparison to feed into a VSELECT node.
2 - pre-SSE41 targets were masking + comparing with an 0x80 constant - we avoid this by using the fact that a set sign bit means a negative integer which can be compared against zero to then feed into VSELECT, avoiding the need for a constant mask (zero generation is much cheaper).
3 - SRA i8 needs to be unpacked to the upper byte of a i16 so that the i16 psraw instruction can be correctly used for sign extension - we have to do more work than for SHL/SRL but perf tests indicate that this is still beneficial.

The i16 implementation is similar but simpler than for i8 - we have to do 8, 4, 2 & 1 bit shifts but less shift masking is involved. SSE41 use of (v)pblendvb requires that the i16 shift amount is splatted to both bytes however.

Tested on SSE2, SSE41 and AVX machines.

Differential Revision: http://reviews.llvm.org/D9474

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239509 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-11 07:46:37 +00:00
Elena Demikhovsky
189930760d AVX-512: Fixed a bug in comparison of i1 vectors.
cmp eq should give kxnor instruction
cmp neq should give kxor 

https://llvm.org/bugs/show_bug.cgi?id=23631



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239460 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-10 06:49:28 +00:00
Reid Kleckner
bdcbc426af [WinEH] Add 32-bit SEH state table emission prototype
This gets all the handler info through to the asm printer and we can
look at the .xdata tables now. I've convinced one small catch-all test
case to work, but other than that, it would be a stretch to say this is
functional.

The state numbering algorithm avoids doing any scope reconstruction as
we do for C++ to simplify the implementation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239433 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-09 21:42:19 +00:00
Akira Hatanaka
0e3246a86f Remove DisableTailCalls from TargetOptions and the code in resetTargetOptions
that was resetting it.

Remove the uses of DisableTailCalls in subclasses of TargetLowering and use
the value of function attribute "disable-tail-calls" instead. Also,
unconditionally add pass TailCallElim to the pipeline and check the function
attribute at the start of runOnFunction to disable the pass on a per-function
basis. 
 
This is part of the work to remove TargetMachine::resetTargetOptions, and since
DisableTailCalls was the last non-fast-math option that was being reset in that
function, we should be able to remove the function entirely after the work to
propagate IR-level fast-math flags to DAG nodes is completed.

Out-of-tree users should remove the uses of DisableTailCalls and make changes
to attach attribute "disable-tail-calls"="true" or "false" to the functions in
the IR.

rdar://problem/13752163

Differential Revision: http://reviews.llvm.org/D10099


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239427 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-09 19:07:19 +00:00
Matthias Braun
b0d6c659b7 X86: Reject register operands with obvious type mismatches.
While we have some code to transform specification like {ax} into
{eax}/{rax} if the operand type isn't 16bit, we should reject cases
where there is no sane way to do this, like the i128 type in the
example.

Related to rdar://21042280

Differential Revision: http://reviews.llvm.org/D10260

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239309 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-08 16:56:23 +00:00
Igor Breger
17e24879cb AVX-512: Implemented 256/128bit VALIGND/Q instructions for SKX and KNL
Implemented DAG lowering for all these forms.
Added tests for DAG lowering and encoding.

Differential Revision: http://reviews.llvm.org/D10310

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239300 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-08 14:03:17 +00:00
Elena Demikhovsky
0880fe5997 AVX-512: I brought back vector-shuffle-512-v8.ll test.
I re-generated it after all AVX-512 shuffle optimizations.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239026 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 07:49:56 +00:00
Elena Demikhovsky
693d40eec0 Removed {}, NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239014 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 07:01:29 +00:00
Sanjay Patel
e4e5cf5a66 make reciprocal estimate code generation more flexible by adding command-line options (3rd try)
The first try (r238051) to land this was reverted due to ExecutionEngine build failure;
that was hopefully addressed by r238788.

The second try (r238842) to land this was reverted due to BUILD_SHARED_LIBS failure;
that was hopefully addressed by r238953.

This patch adds a TargetRecip class for processing many recip codegen possibilities.
The class is intended to handle both command-line options to llc as well
as options passed in from a front-end such as clang with the -mrecip option.

The x86 backend is updated to use the new functionality.
Only -mcpu=btver2 with -ffast-math should see a functional change from this patch.
All other x86 CPUs continue to *not* use reciprocal estimates by default with -ffast-math.

Differential Revision: http://reviews.llvm.org/D8982



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239001 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 01:32:35 +00:00
Asaf Badouh
ce375dc63a re-apply 238809
AVX-512: Implemented GETEXP instruction for KNL and SKX
Added rounding mode modifier for SQRTPS/PD
Added tests for encoding and intrinsics.
CR:
http://reviews.llvm.org/D9991


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238923 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-03 13:41:48 +00:00
Elena Demikhovsky
fc28da72f0 AVX-512: More code improvements in shuffles, NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238919 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-03 12:05:03 +00:00
Elena Demikhovsky
10eb2dd9df AVX-512: VSHUFPD instruction selection - code improvements
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238918 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-03 11:21:01 +00:00
Elena Demikhovsky
23dc4bbf1d AVX-512: Implemented SHUFF32x4/SHUFF64x2/SHUFI32x4/SHUFI64x2 instructions for SKX and KNL.
Added tests for encoding.

By Igor Breger (igor.breger@intel.com)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238917 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-03 10:56:40 +00:00
Simon Pilgrim
dd5cde6e60 [X86] Removed (unused) FSRL x86 operation
This patch removes the old X86ISD::FSRL op - which allowed float vectors to use the byte right shift operations (causing a domain switch....).

Since the refactoring of the shuffle lowering code this no longer has any use.

Differential Revision: http://reviews.llvm.org/D10169

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238906 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-03 08:32:36 +00:00
Rafael Espindola
a0bcb4184b Revert "make reciprocal estimate code generation more flexible by adding command-line options (2nd try)"
This reverts commit r238842.

It broke -DBUILD_SHARED_LIBS=ON build.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238900 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-03 05:32:44 +00:00
Sanjay Patel
871beb8dd7 make reciprocal estimate code generation more flexible by adding command-line options (2nd try)
The first try (r238051) to land this was reverted due to bot failures
that were hopefully addressed by r238788.

This patch adds a TargetRecip class for processing many recip codegen possibilities.
The class is intended to handle both command-line options to llc as well
as options passed in from a front-end such as clang with the -mrecip option.

The x86 backend is updated to use the new functionality.
Only -mcpu=btver2 with -ffast-math should see a functional change from this patch.
All other x86 CPUs continue to *not* use reciprocal estimates by default with -ffast-math.

Differential Revision: http://reviews.llvm.org/D8982



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238842 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-02 15:28:15 +00:00
Elena Demikhovsky
ccbc17f896 AVX-512: Shorten implementation of lowerV16X32VectorShuffle()
using lowerVectorShuffleWithSHUFPS() and other shuffle-helpers routines.
Added matching of VALIGN instruction.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238830 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-02 13:43:18 +00:00
Asaf Badouh
aa9e1c528b revert 238809
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238810 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-02 07:45:19 +00:00
Asaf Badouh
82fa06895e AVX-512: Implemented GETEXP instruction for KNL and SKX
Added rounding mode modifier for SQRTPS/PD
Added tests for encoding and intrinsics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238809 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-02 07:18:14 +00:00
Elena Demikhovsky
bbd7cab2b9 AVX-512: Optimized vector shuffle for v16f32 and v16i32 types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238743 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-01 13:26:18 +00:00
Elena Demikhovsky
af0e519127 AVX-512: Implemented VRANGEPD and VRANGEPD instructions for SKX.
Implemented DAG lowering for all these forms.
Added tests for encoding.

By Igor Breger (igor.breger@intel.com)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238738 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-01 11:05:34 +00:00
Elena Demikhovsky
aa62d8a6b2 AVX-512: Implemented vector shuffle lowering for v8i64 and v8f64 types.
I removed the vector-shuffle-512-v8.ll, it is auto-generated test, not valid any more.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238735 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-01 09:49:53 +00:00
Elena Demikhovsky
9029910d9f AVX-512: Implemented VFIXUPIMMPD and VFIXUPIMMPS instructions for KNL and SKX
Implemented DAG lowering for all these forms.
Added tests for encoding.

by Igor Breger (igor.breger@intel.com)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238728 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-01 06:50:49 +00:00
Elena Demikhovsky
9f63519857 AVX-512: Fixed a bug in compress and expand intrinsics.
By Igor Breger (igor.breger@intel.com)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238724 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-01 06:30:13 +00:00
Matt Arsenault
5f3a6430d6 Add address space argument to isLegalAddressingMode
This is important because of different addressing modes
depending on the address space for GPU targets.

This only adds the argument, and does not update
any of the uses to provide the correct address space.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238723 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-01 05:31:59 +00:00
Simon Pilgrim
08786dc314 Stripped trailing whitespace. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238654 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-30 13:01:42 +00:00
Chandler Carruth
fa68750e54 [x86] Unify the horizontal adding used for popcount lowering taking the
best approach of each.

For vNi16, we use SHL + ADD + SRL pattern that seem easily the best.

For vNi32, we use the PUNPCK + PSADBW + PACKUSWB pattern. In some cases
there is a huge improvement with this in IACA's estimated throughput --
over 2x higher throughput!!!! -- but the measurements are too good to be
true. In one narrow case, the SHL + ADD + SHL + ADD + SRL pattern looks
slightly faster, but I'm not sure I believe any of the measurements at
this point. Both are the exact same uops though. Hard to be confident of
anything past that.

If anyone wants to collect very detailed (Agner-level) timings with the
result of this patch, or with the i32 case replaced with SHL + ADD + SHl
+ ADD + SRL, I'd be very interested. Note that you'll need to test it on
both Ivybridge and Haswell, with both SSE3, SSSE3, and AVX selected as
I saw unique behavior in each of these buckets with IACA all of which
should be checked against measured performance.

But this patch is still a useful improvement by dropping duplicate work
and getting the much nicer PSADBW lowering for v2i64.

I'd still like to rephrase this in terms of generic horizontal sum. It's
a bit lame to have a special case of that just for popcount.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238652 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-30 10:35:03 +00:00
Chandler Carruth
da8bb20158 [x86] Split out the horizontal byte sum lowering component of the LUT
lowering into a helper function.

NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238650 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-30 09:46:16 +00:00