Commit Graph

11912 Commits

Author SHA1 Message Date
Ahmed Bougacha
eb78c2fbdf [X86] Accept hasAVX512() as well as hasFMA() when generating FMA.
We don't always have FMA, for example when using 'clang -mavx512f'
without an explicit CPU.

Also check for an explicit +avx512f instead of CPUs in a couple
related tests.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240616 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-25 00:44:46 +00:00
Swaroop Sridhar
983b80cf02 Enable StackMap Serialization for COFF
Summary

This change turns on the emission of 
__LLVM_Stackmaps section when generating COFF binaries.

Test Plan

Added a scenario to the test case: 
test\CodeGen\X86\statepoint-stackmap-format.ll.

Code Review:

http://reviews.llvm.org/D10680



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240613 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-25 00:28:42 +00:00
Douglas Katzman
72260ba601 [X86] Simplify some stuff in X86DisassemblerDecoder. NFC
- Deciding that insn->sibIndex is SIB_INDEX_NONE does not require another
check beyond the fully decoded bits being equal to 0x4.
The expression insn->sibIndex == SIB_INDEX_sib could not have been true unless
index were 0x4, because SIB_INDEX_sib is merely the range base (SIB_INDEX_EAX)
plus 4. Respectively SIB_INDEX_sib64.

- Don't use a switch statement to perform left-shift.

Differential Revision: http://reviews.llvm.org/D9762

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240598 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-24 22:04:55 +00:00
Rafael Espindola
821b06f3a8 Change how symbol sizes are handled in lib/Object.
COFF and MachO only define symbol sizes for common symbols. Reflect that
in the class hierarchy by having a method for common symbols only in the base
and a general one in ELF.

This avoids the need of using a magic value for the size, which had a few
problems
* Most callers didn't check for it.
* The ones that did could not tell the magic value from a file actually having
  that value.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240529 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-24 10:20:30 +00:00
Ahmed Bougacha
0810814bcc [X86] Don't generate vbroadcasti128 for v4i64 splats from memory.
We used to erroneously match:
    (v4i64 shuffle (v2i64 load), <0,0,0,0>)

Whereas vbroadcasti128 is more like:
    (v4i64 shuffle (v2i64 load), <0,1,0,1>)

This problem doesn't exist for vbroadcastf128, which kept matching
the intrinsic after r231182.  We should perhaps re-introduce the
intrinsic here as well, but that's a separate issue still being
discussed.

While there, add some proper vbroadcastf128 tests.  We don't currently
match those, like for loading vbroadcastsd/ss on AVX (the reg-reg
broadcasts where added in AVX2).

Fixes PR23886.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240488 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-24 00:07:16 +00:00
Rafael Espindola
9758b4ae95 Simplify the Mangler interface now that DataLayout is mandatory.
We only need to pass in a DataLayout when mangling a raw string, not when
constructing the mangler.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240405 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-23 13:59:29 +00:00
Rafael Espindola
b9ed9af341 Use MCSymbols for FastISel.
The summary is that it moves the mangling earlier and replaces a few
calls to .addExternalSymbol with addSym.

I originally wanted to replace all the uses of addExternalSymbol with
addSym, but noticed it was a lot of work and doesn't need to be done
all at once.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240395 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-23 12:21:54 +00:00
Alexander Kornienko
cd52a7a381 Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)
Apparently, the style needs to be agreed upon first.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240390 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-23 09:49:53 +00:00
Elena Demikhovsky
d96e362b3f AVX-512: Added all forms of VPABS instruction
Added all intrinsics, tests for encoding, tests for intrinsics.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240386 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-23 08:19:46 +00:00
Sanjay Patel
8bd59f505a [x86] generalize reassociation optimization in machine combiner to 2 instructions
Currently ( D10321, http://reviews.llvm.org/rL239486 ), we can use the machine combiner pass
to reassociate the following sequence to reduce the critical path:

A = ? op ?
B = A op X
C = B op Y
-->
A = ? op ?
B = X op Y
C = A op B

'op' is currently limited to x86 AVX scalar FP adds (with fast-math on), but in theory, it could
be any associative math/logic op (see TODO in code comment).

This patch generalizes the pattern match to ignore the instruction that defines 'A'. So instead of
a sequence of 3 adds, we now only need to find 2 dependent adds and decide if it's worth
reassociating them.

This generalization has a compile-time cost because we can now match more instruction sequences
and we rely more heavily on the machine combiner to discard sequences where reassociation doesn't
improve the critical path.

For example, in the new test case:

A = M div N
B = A add X
C = B add Y

We'll match 2 reassociation patterns, but this transform doesn't reduce the critical path:

A = M div N
B = A add Y
C = B add X

We need the combiner to reject that pattern but select this:

A = M div N
B = X add Y
C = B add A

Differential Revision: http://reviews.llvm.org/D10460



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240361 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-23 00:39:40 +00:00
Simon Pilgrim
7132523d6a [X86][FMA4] FMA4 ops can perform unaligned folded loads.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240342 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-22 21:49:41 +00:00
Ahmed Bougacha
a3afb70a5d [X86] Teach load folding to accept scalar _Int users of MOVSS/MOVSD.
The _Int instructions are special, in that they operate on the full
VR128 instead of FR32.  The load folding then looks at MOVSS, at the
user, and bails out when it sees a size mismatch.

What we really know is that the rm_Int instructions don't load the
higher lanes, so folding is fine.

This happens for the straightforward intrinsic code, e.g.:

    _mm_add_ss(a, _mm_load_ss(p));

Fixes PR23349.

Differential Revision: http://reviews.llvm.org/D10554


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240326 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-22 20:51:51 +00:00
Sanjay Patel
73aa02eb09 [x86] set default reciprocal (division and square root) codegen to match GCC
D8982 ( checked in at http://reviews.llvm.org/rL239001 ) added command-line 
options to allow reciprocal estimate instructions to be used in place of
divisions and square roots.

This patch changes the default settings for x86 targets to allow that recip
codegen (except for scalar division because that breaks too much code) when
using -ffast-math or its equivalent. 

This matches GCC behavior for this kind of codegen.

Differential Revision: http://reviews.llvm.org/D10396



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240310 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-22 18:29:44 +00:00
Rafael Espindola
09bbd16112 Avoid a Symbol -> Name -> Symbol conversion.
Before this we were producing a TargetExternalSymbol from a MCSymbol.
That meant extracting the symbol name and fetching the symbol again
down the pipeline.

This patch adds a DAG.getMCSymbol that lets the MCSymbol pass unchanged on the
DAG.

Doing so removes the need for MO_NOPREFIX and fixes the root cause of pr23900,
allowing r240130 to be committed again.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240300 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-22 17:46:53 +00:00
Elena Demikhovsky
114489ab24 AVX-512: added VPSHUFB instruction - all SKX forms
Added intrinsics and encoding tests.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240277 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-22 13:00:42 +00:00
Elena Demikhovsky
4f1ddd396b AVX-512: All forms of VCOPMRESS VEXPAND instructions,
encoding tests.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240272 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-22 11:16:30 +00:00
Elena Demikhovsky
42ceb12123 Reverted AVX-512 vector shuffle
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240258 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-22 09:01:15 +00:00
Michael Kuperstein
12219f8c85 [X86] Allow more call sequences to use push instructions for argument passing
This allows more call sequences to use pushes instead of movs when optimizing for size.
In particular, calling conventions that pass some parameters in registers (e.g. thiscall) are now supported.

Differential Revision: http://reviews.llvm.org/D10500

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240257 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-22 08:31:22 +00:00
Elena Demikhovsky
c768510422 AVX-512: Added intrinsics for VPERMT2W/D/Q/PS/PD and
VPERMI2W/D/Q/PS/PD instructions.
Added tests.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240256 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-22 06:45:48 +00:00
Simon Pilgrim
da5f3d8f76 [X86] Code tidyup - Use SDValue bool operator. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240249 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-21 21:34:32 +00:00
Simon Pilgrim
1eabc9fb9d [X86][SSE] Fix PerformSExtCombine bug that accessed the wrong return value of an aggregate type.
Fix to rL237885 to ensure that it accesses the correct return value of an aggregate type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240223 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-20 16:19:24 +00:00
Sanjay Patel
30c3b2a4c2 name change: hasPattern() -> getMachineCombinerPatterns() ; NFC
This was suggested as part of D10460, but it's independent of
any functional change.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240192 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-19 23:21:42 +00:00
Rafael Espindola
7edd010739 Improve error handling of getRelocationAddend.
This patch changes getRelocationAddend to use ErrorOr and considers it an error
to try to get the addend of a REL section.

If, for example, a x86_64 file has a REL section, that file is corrupted and
we should reject it.

Using ErrorOr is not ideal since we check the section type once per relocation
instead of once per section.

Checking once per section would involve getRelocationAddend just asserting and
callers checking the section before iterating over the relocations.

In any case, this is an improvement and includes a test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240176 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-19 20:58:43 +00:00
Alexander Kornienko
cf0db29df2 Fixed/added namespace ending comments using clang-tidy. NFC
The patch is generated using this command:

tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
  -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
  llvm/lib/


Thanks to Eugene Kosov for the original patch!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240137 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-19 15:57:42 +00:00
Eric Christopher
933d2bd391 Fix "the the" in comments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240112 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-19 01:53:21 +00:00
Sanjay Patel
b9b8054704 use SDValue bool operator; NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240064 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-18 21:44:31 +00:00
Reid Kleckner
edb6ecd65a [X86] Rename RegInfo to TRI as suggested by Eric
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240047 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-18 20:32:02 +00:00
Reid Kleckner
f4e002cbd0 [X86] Refactor stack adjustments into X86FrameLowering::BuildStackAdjustment
Deduplicates some code and lets us use LEA on atom when adjusting the
stack around callee-cleanup calls. This is the only intended
functionality change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240044 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-18 20:22:12 +00:00
Reid Kleckner
e7e3ecdbf2 [X86] Remove unneeded parameters and deduplicate stack alignment code
NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240033 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-18 18:03:25 +00:00
Asaf Badouh
27a2741354 quick fix for failure from r.240012
failure:
http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/11847/steps/build_Lld/logs/stdio

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240015 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-18 12:57:24 +00:00
Asaf Badouh
bc5667c7ac [AVX512]
add instructions: VPAVGB and VPAVGW


review
http://reviews.llvm.org/D10504

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240012 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-18 12:30:53 +00:00
Elena Demikhovsky
6c24289bef AVX-512: (fixed) Added encoding of all forms of VPERMT2W/D/Q/PS/PD and VPERMI2W/D/Q/PS/PD.
Intrinsics and tests for them are comming in the next patch.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240003 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-18 08:56:19 +00:00
Elena Demikhovsky
f3d6e24ca4 reverted 239999 due to test failures
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240001 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-18 08:06:49 +00:00
Elena Demikhovsky
5686493ccc AVX-512: Added encoding of all forms of VPERMT2W/D/Q/PS/PD
and VPERMI2W/D/Q/PS/PD.
Intrinsics and tests for them are comming in the next patch.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239999 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-18 07:29:40 +00:00
Simon Pilgrim
6ebf741ea2 [X86][SSE] Improved support for vector i16 to float conversions.
Added explicit sign extension for v4i16/v8i16 to v4i32/v8i32 before conversion to floats. Matches existing support for v4i8/v8i8.

Follow up to D10433


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239966 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-17 22:43:34 +00:00
Reid Kleckner
4278cac3c4 Re-land "[X86] Cache variables that only depend on the subtarget"
Re-instates r239949 without accidentally flipping the sense of UseLEA.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239950 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-17 21:50:02 +00:00
Reid Kleckner
cf4978e112 Revert "[X86] Cache variables that only depend on the subtarget"
This reverts commit r239948, tests seem to be failing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239949 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-17 21:35:02 +00:00
Reid Kleckner
bbb75718b2 [X86] Cache variables that only depend on the subtarget
There is a one-to-one relationship between X86Subtarget and
X86FrameLowering, but every frame lowering method would previously pull
the subtarget off the MachineFunction and query some subtarget
properties.

Over time, these locals began to grow in complexity and it became
important to keep their names and meaning in sync across all of the
frame lowering methods, leading to duplication. We can eliminate that
duplication by computing them once in the constructor.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239948 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-17 21:31:17 +00:00
David Majnemer
cc714e2142 Move the personality function from LandingPadInst to Function
The personality routine currently lives in the LandingPadInst.

This isn't desirable because:
- All LandingPadInsts in the same function must have the same
  personality routine.  This means that each LandingPadInst beyond the
  first has an operand which produces no additional information.

- There is ongoing work to introduce EH IR constructs other than
  LandingPadInst.  Moving the personality routine off of any one
  particular Instruction and onto the parent function seems a lot better
  than have N different places a personality function can sneak onto an
  exceptional function.

Differential Revision: http://reviews.llvm.org/D10429

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239940 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-17 20:52:32 +00:00
Rafael Espindola
a1e31b45cc Move IsUsedInReloc from MCSymbolELF to MCSymbol.
There is a free bit is MCSymbol and MachO needs the same information.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239933 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-17 20:08:20 +00:00
Igor Breger
a066970605 AVX-512: cvtusi2ss/d intrinsics.
Change builtin function name and signature ( add third parameter - rounding mode ).
Added tests for intrinsics.

Differential Revision: http://reviews.llvm.org/D10473

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239888 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-17 07:23:57 +00:00
Simon Pilgrim
e2d3e4467e [X86][SSE] Vectorize v2i32 to v2f64 conversions
This patch enables support for the conversion of v2i32 to v2f64 to use the CVTDQ2PD xmm instruction and stay on the SSE unit instead of scalarizing, sign extending to i64 and using CVTSI2SDQ scalar conversions.

Differential Revision: http://reviews.llvm.org/D10433

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239855 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-16 21:40:28 +00:00
Reid Kleckner
8e206c19f4 [X86] Rename some frame lowering variables
Old names, new names, and what they really mean:

- IsWin64 -> IsWin64CC: This is true on non-Windows x86_64 platforms
  when the ms_abi calling convention is used.
- IsWinEH -> IsWin64Prologue: True when the target is Win64, regardless
  of calling convention. Changes the prologue to obey the constraints of
  the Win64 unwinder.
- NeedsWinEH -> NeedsWinCFI: We're using the win64 prologue *and* the we
  want .xdata unwind tables. Analogous to NeedsDwarfCFI.

NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239836 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-16 18:08:57 +00:00
Daniel Sanders
ffb22b8d80 Clean up redundant copies of Triple objects. NFC
Summary:

Reviewers: rengolin

Reviewed By: rengolin

Subscribers: llvm-commits, rengolin, jholewinski

Differential Revision: http://reviews.llvm.org/D10382


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239823 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-16 15:44:21 +00:00
Asaf Badouh
7ae3494732 [AVX512] add integer min/max intrinsics support.
review:
http://reviews.llvm.org/D10439

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239806 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-16 08:39:27 +00:00
Elena Demikhovsky
05e61f7113 X86: optimized i64 vector multiply with constant
When we multiply two 64-bit vectors, we extract lower and upper part and use the PMULUDQ instruction.
When one of the operands is a constant, the upper part may be zero, we know this at compile time.
Example: %a = mul <4 x i64> %b, <4 x i64> < i64 5, i64 5, i64 5, i64 5>.
I'm checking the value of the upper part and prevent redundant "multiply", "shift" and "add" operations.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239802 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-16 06:07:24 +00:00
Reid Kleckner
46446a56b8 [X86] Try to shorten dwarf CFI emission
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239786 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-15 23:45:08 +00:00
Sanjoy Das
a1e554d253 [TargetInstrInfo] Add new hook: AnalyzeBranchPredicate.
Summary:
NFC: no one uses AnalyzeBranchPredicate yet.

Add TargetInstrInfo::AnalyzeBranchPredicate and implement for x86.  A
later change adding support for page-fault based implicit null checks
depends on this.

Reviewers: reames, ab, atrick

Reviewed By: atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10200

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239742 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-15 18:44:21 +00:00
Sanjoy Das
319c91bbb0 [TargetInstrInfo] Rename getLdStBaseRegImmOfs and implement for x86.
Summary:

TargetInstrInfo::getLdStBaseRegImmOfs to
TargetInstrInfo::getMemOpBaseRegImmOfs and implement for x86.  The
implementation only handles a few easy cases now and will be made more
sophisticated in the future.

This is NFCI: the only user of `getLdStBaseRegImmOfs` (now
`getmemOpBaseRegImmOfs`) is `LoadClusterMotion` and `LoadClusterMotion`
is disabled for x86.

Reviewers: reames, ab, MatzeB, atrick

Reviewed By: MatzeB, atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10199

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239741 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-15 18:44:14 +00:00
Sanjoy Das
1991e2a4df [CodeGen] Introduce a FAULTING_LOAD_OP pseudo-op.
Summary:
This instruction encodes a loading operation that may fault, and a label
to branch to if the load page-faults.  The locations of potentially
faulting loads and their "handler" destinations are recorded in a
FaultMap section, meant to be consumed by LLVM's clients.

Nothing generates FAULTING_LOAD_OP instructions yet, but they will be
used in a future change.

The documentation (FaultMaps.rst) needs improvement and I will update
this diff with a more expanded version shortly.

Depends on D10196

Reviewers: rnk, reames, AndyAyers, ab, atrick, pgavlin

Reviewed By: atrick, pgavlin

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10197

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239740 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-15 18:44:08 +00:00
Sanjoy Das
36395e7598 [NFC] Extract X86MCInstLower::LowerMachineOperand.
Summary: Refactoring-only change that will be used later.

Reviewers: reames, atrick

Reviewed By: atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10196

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239739 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-15 18:44:01 +00:00
Igor Breger
17ae2138b0 AVX-512: Implemented DAG lowering for shuff62x2/shufi62x2 instuctions ( Shuffle Packed Values at 128-bit Granularity )
Tests added , vector-shuffle-512-v8.ll test re-generated.

Differential Revision: http://reviews.llvm.org/D10300

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239697 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-14 13:07:47 +00:00
Michael Kuperstein
3dd555171e Add support for parsing the XOR operator in Intel syntax inline assembly.
Differential Revision: http://reviews.llvm.org/D10385
Patch by marina.yatsina@intel.com


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239695 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-14 12:59:45 +00:00
Igor Breger
6ea3ad7e6e AVX-512: Implemented cvtsi2ss/d cvtusi2ss/d instructions with round control for KNL.
Added intrinsics for cvtsi2ss/d instructions.
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D10430

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239694 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-14 12:44:55 +00:00
Simon Pilgrim
9223c2cb1e Stripped trailing whitespace. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239672 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-13 12:51:39 +00:00
Matthias Braun
6fee0b00e2 MachineLICM: Use TargetSchedModel instead of just itineraries
This will use Itinieraries if available, but will also work if just a
MCSchedModel is available.

Differential Revision: http://reviews.llvm.org/D10428

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239658 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-13 03:42:11 +00:00
Reid Kleckner
2bd0221fa4 [WinEH] Put finally pointers in the handler scope table field
We were putting them in the filter field, which is correct for 64-bit
but wrong for 32-bit.

Also switch the order of scope table entry emission so outermost entries
are emitted first, and fix an obvious state assignment bug.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239574 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-11 23:37:18 +00:00
Juergen Ributzka
d48b38e9ec [Stackmaps][X86] Remove EFLAGS and IP registers from the live-out mask.
Remove the EFLAGS from the stackmap live-out mask. The EFLAGS register is not
supposed to be part of that set, because the X86 calling conventions mark the
register as NOT preserved.

Also remove the IP registers, since spilling and restoring those doesn't really
make any sense.

Related to rdar://problem/21019635.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239568 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-11 22:40:04 +00:00
Reid Kleckner
3e16bd3aaf [WinEH] Create an llvm.x86.seh.exceptioninfo intrinsic
This intrinsic is like framerecover plus a load. It recovers the EH
registration stack allocation from the parent frame and loads the
exception information field out of it, giving back a pointer to an
EXCEPTION_POINTERS struct. It's designed for clang to use in SEH filter
expressions instead of accessing the EXCEPTION_POINTERS parameter that
is available on x64.

This required a minor change to MC to allow defining a label variable to
another absolute framerecover label variable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239567 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-11 22:32:23 +00:00
Daniel Sanders
4ddb0ced90 Replace string GNU Triples with llvm::Triple in TargetMachine. NFC.
Summary:
For the moment, TargetMachine::getTargetTriple() still returns a StringRef.

This continues the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.

Reviewers: rengolin

Reviewed By: rengolin

Subscribers: ted, llvm-commits, rengolin, jholewinski

Differential Revision: http://reviews.llvm.org/D10362

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239554 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-11 19:41:26 +00:00
Ahmed Bougacha
fd83cb21ce [CodeGen] ArrayRef'ize cond/pred in various TII APIs. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239553 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-11 19:30:37 +00:00
Simon Pilgrim
44226ffc19 [X86][SSE] Vectorized i8 and i16 shift operators
This patch ensures that SHL/SRL/SRA shifts for i8 and i16 vectors avoid scalarization. It builds on the existing i8 SHL vectorized implementation of moving the shift bits up to the sign bit position and separating the 4, 2 & 1 bit shifts with several improvements:

1 - SSE41 targets can use (v)pblendvb directly with the sign bit instead of performing a comparison to feed into a VSELECT node.
2 - pre-SSE41 targets were masking + comparing with an 0x80 constant - we avoid this by using the fact that a set sign bit means a negative integer which can be compared against zero to then feed into VSELECT, avoiding the need for a constant mask (zero generation is much cheaper).
3 - SRA i8 needs to be unpacked to the upper byte of a i16 so that the i16 psraw instruction can be correctly used for sign extension - we have to do more work than for SHL/SRL but perf tests indicate that this is still beneficial.

The i16 implementation is similar but simpler than for i8 - we have to do 8, 4, 2 & 1 bit shifts but less shift masking is involved. SSE41 use of (v)pblendvb requires that the i16 shift amount is splatted to both bytes however.

Tested on SSE2, SSE41 and AVX machines.

Differential Revision: http://reviews.llvm.org/D9474

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239509 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-11 07:46:37 +00:00
Reid Kleckner
7963762fce Revert "Move dllimport name mangling to IR mangler."
This reverts commit r239437.

This broke clang-cl self-hosts. We'd end up calling the __imp_ symbol
directly instead of using it to do an indirect function call.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239502 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-11 01:31:48 +00:00
Sanjay Patel
e0d6eef952 change assert that will never fire to llvm_unreachable
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239497 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-10 23:27:33 +00:00
Sanjay Patel
c826b54b52 [x86] Add a reassociation optimization to increase ILP via the MachineCombiner pass
This is a reimplementation of D9780 at the machine instruction level rather than the DAG.

Use the MachineCombiner pass to reassociate scalar single-precision AVX additions (just a
starting point; see the TODO comments) to increase ILP when it's safe to do so.

The code is closely based on the existing MachineCombiner optimization that is implemented
for AArch64.

This patch should not cause the kind of spilling tragedy that led to the reversion of r236031.

Differential Revision: http://reviews.llvm.org/D10321



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239486 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-10 20:32:21 +00:00
Daniel Sanders
4d13f315d1 Replace string GNU Triples with llvm::Triple in MCSubtargetInfo and create*MCSubtargetInfo(). NFC.
Summary:
This continues the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.

Reviewers: rafael

Reviewed By: rafael

Subscribers: rafael, ted, jfb, llvm-commits, rengolin, jholewinski

Differential Revision: http://reviews.llvm.org/D10311


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239467 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-10 12:11:26 +00:00
Daniel Sanders
fff114c890 Replace string GNU Triples with llvm::Triple in create*MCRelocationInfo(). NFC.
Summary:
This continues the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.

Reviewers: rafael

Reviewed By: rafael

Subscribers: rafael, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D10307


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239465 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-10 10:54:40 +00:00
Daniel Sanders
03c060b6d4 Replace string GNU Triples with llvm::Triple in MCAsmBackend subclasses and create*AsmBackend(). NFC.
Summary:
This continues the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.

Reviewers: echristo, rafael

Reviewed By: rafael

Subscribers: rafael, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D10243

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239464 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-10 10:35:34 +00:00
Elena Demikhovsky
189930760d AVX-512: Fixed a bug in comparison of i1 vectors.
cmp eq should give kxnor instruction
cmp neq should give kxor 

https://llvm.org/bugs/show_bug.cgi?id=23631



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239460 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-10 06:49:28 +00:00
Reid Kleckner
839f83e1e3 [WinEH] Call llvm.stackrestore in __except blocks
We have to do this manually, the runtime only sets up ebp. Fixes a crash
when returning after catching an exception.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239451 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-10 01:34:54 +00:00
Reid Kleckner
c8e72e9126 [WinEH] Emit .safeseh directives for all 32-bit exception handlers
Use a "safeseh" string attribute to do this. You would think we chould
just accumulate the set of personalities like we do on dwarf, but this
fails to account for the LSDA-loading thunks we use for
__CxxFrameHandler3. Each of those needs to make it into .sxdata as well.
The string attribute seemed like the most straightforward approach.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239448 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-10 01:02:30 +00:00
Peter Collingbourne
12f81b4639 Move dllimport name mangling to IR mangler.
This ensures that LTO clients see the correct external symbol name.

Differential Revision: http://reviews.llvm.org/D10318

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239437 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-09 22:09:53 +00:00
Reid Kleckner
bdcbc426af [WinEH] Add 32-bit SEH state table emission prototype
This gets all the handler info through to the asm printer and we can
look at the .xdata tables now. I've convinced one small catch-all test
case to work, but other than that, it would be a stretch to say this is
functional.

The state numbering algorithm avoids doing any scope reconstruction as
we do for C++ to simplify the implementation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239433 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-09 21:42:19 +00:00
Akira Hatanaka
0e3246a86f Remove DisableTailCalls from TargetOptions and the code in resetTargetOptions
that was resetting it.

Remove the uses of DisableTailCalls in subclasses of TargetLowering and use
the value of function attribute "disable-tail-calls" instead. Also,
unconditionally add pass TailCallElim to the pipeline and check the function
attribute at the start of runOnFunction to disable the pass on a per-function
basis. 
 
This is part of the work to remove TargetMachine::resetTargetOptions, and since
DisableTailCalls was the last non-fast-math option that was being reset in that
function, we should be able to remove the function entirely after the work to
propagate IR-level fast-math flags to DAG nodes is completed.

Out-of-tree users should remove the uses of DisableTailCalls and make changes
to attach attribute "disable-tail-calls"="true" or "false" to the functions in
the IR.

rdar://problem/13752163

Differential Revision: http://reviews.llvm.org/D10099


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239427 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-09 19:07:19 +00:00
Elena Demikhovsky
22debdcab6 X86-MPX: Implemented encoding for MPX instructions.
Added encoding tests.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239403 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-09 13:02:10 +00:00
Matt Arsenault
d99ce2f630 MC: Add target hook to control symbol quoting
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239370 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-09 00:31:39 +00:00
Reid Kleckner
38a2b24c12 [WinEH] Cache declarations of frame intrinsics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239361 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-08 22:43:32 +00:00
Keno Fischer
4332f869bf [InstrInfo] Refactor foldOperandImpl to thread through InsertPt. NFC
Summary:
This was a longstanding FIXME and is a necessary precursor to cases
where foldOperandImpl may have to create more than one instruction
(e.g. to constrain a register class). This is the split out NFC changes from
D6262.

Reviewers: pete, ributzka, uweigand, mcrosier

Reviewed By: mcrosier

Subscribers: mcrosier, ted, llvm-commits

Differential Revision: http://reviews.llvm.org/D10174

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239336 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-08 20:09:58 +00:00
Matthias Braun
b0d6c659b7 X86: Reject register operands with obvious type mismatches.
While we have some code to transform specification like {ax} into
{eax}/{rax} if the operand type isn't 16bit, we should reject cases
where there is no sane way to do this, like the i128 type in the
example.

Related to rdar://21042280

Differential Revision: http://reviews.llvm.org/D10260

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239309 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-08 16:56:23 +00:00
Igor Breger
17e24879cb AVX-512: Implemented 256/128bit VALIGND/Q instructions for SKX and KNL
Implemented DAG lowering for all these forms.
Added tests for DAG lowering and encoding.

Differential Revision: http://reviews.llvm.org/D10310

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239300 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-08 14:03:17 +00:00
Simon Pilgrim
4c4f0921dc [X86] Added BitScanForward/BitScanReverse memory folding + tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239257 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-07 18:34:25 +00:00
Rafael Espindola
dcb11d3206 Handle 16 bit PC relative relocations.
Fixes pr23771.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239214 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-06 02:29:56 +00:00
Jim Grosbach
eafe465f2a MC: Clean up the naming for MCMachObjectWriter. NFC.
s/ExecutePostLayoutBinding/executePostLayoutBinding/
s/ComputeSymbolTable/computeSymbolTable/
s/BindIndirectSymbols/bindIndirectSymbols/
s/RecordTLVPRelocation/recordTLVPRelocation/
s/RecordScatteredRelocation/recordScatteredRelocation/
s/WriteLinkerOptionsLoadCommand/writeLinkerOptionsLoadCommand/
s/WriteLinkeditLoadCommand/writeLinkeditLoadCommand/
s/WriteNlist/writeNlist/
s/WriteDysymtabLoadCommand/writeDysymtabLoadCommand/
s/WriteSymtabLoadCommand/writeSymtabLoadCommand/
s/WriteSection/writeSection/
s/WriteSegmentLoadCommand/writeSegmentLoadCommand/
s/WriteHeader/writeHeader/

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239119 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 23:25:54 +00:00
Charles Davis
3e407efb8b [Target/X86] Don't use callee-saved registers in a Win64 tail call on non-Windows.
Summary:
A small bit that I missed when I updated the X86 backend to account for
the Win64 calling convention on non-Windows. Now we don't use dead
non-volatile registers when emitting a Win64 indirect tail call on
non-Windows.

Should fix PR23710.

Test Plan: Added test for the correct behavior based on the case I posted to PR23710.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10258

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239111 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 22:50:05 +00:00
Jim Grosbach
bc81286cac MC: Clean up naming in MCObjectWriter. NFC.
s/WriteObject/writeObject/
s/RecordRelocation/recordRelocation/
s/IsSymbolRefDifferenceFullyResolved/isSymbolRefDifferenceFullyResolved/
s/Write8/write8/
s/WriteLE16/writeLE16/
s/WriteLE32/writeLE32/
s/WriteLE64/writeLE64/
s/WriteBE16/writeBE16/
s/WriteBE32/writeBE32/
s/WriteBE64/writeBE64/
s/Write16/write16/
s/Write32/write32/
s/Write64/write64/
s/WriteZeroes/writeZeroes/
s/WriteBytes/writeBytes/

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239108 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 22:24:41 +00:00
Jim Grosbach
aa48bf4e1c MC: Remove obsolete MachO UseAggressiveSymbolFolding.
Fix the FIXME and remove this old as(1) compat option. It was useful for
bringup of the integrated assembler to diff object files, but now it's
just causing more relocations than strictly necessary to be generated.

rdar://21201804

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239084 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 20:27:42 +00:00
Daniel Sanders
6ff6fc6055 Replace string GNU Triples with llvm::Triple in MCAsmInfo subclasses and create*AsmInfo(). NFC.
Summary:
This is the first of several patches to eliminate StringRef forms of GNU
triples from the internals of LLVM. After this is complete, GNU triples
will be replaced by a more authoratitive representation in the form of
an LLVM TargetTuple.

Reviewers: rengolin

Reviewed By: rengolin

Subscribers: ted, llvm-commits, rengolin, jholewinski

Differential Revision: http://reviews.llvm.org/D10236


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239036 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 13:12:25 +00:00
Elena Demikhovsky
0880fe5997 AVX-512: I brought back vector-shuffle-512-v8.ll test.
I re-generated it after all AVX-512 shuffle optimizations.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239026 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 07:49:56 +00:00
Elena Demikhovsky
1bbb64b206 AVX-512: added all SKX forms of VPERMW/D/Q instructions.
Added all forms of VPERMPS/PD instrcuctions.
Added encoding tests.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239016 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 07:07:13 +00:00
Elena Demikhovsky
693d40eec0 Removed {}, NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239014 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 07:01:29 +00:00
Sanjay Patel
e4e5cf5a66 make reciprocal estimate code generation more flexible by adding command-line options (3rd try)
The first try (r238051) to land this was reverted due to ExecutionEngine build failure;
that was hopefully addressed by r238788.

The second try (r238842) to land this was reverted due to BUILD_SHARED_LIBS failure;
that was hopefully addressed by r238953.

This patch adds a TargetRecip class for processing many recip codegen possibilities.
The class is intended to handle both command-line options to llc as well
as options passed in from a front-end such as clang with the -mrecip option.

The x86 backend is updated to use the new functionality.
Only -mcpu=btver2 with -ffast-math should see a functional change from this patch.
All other x86 CPUs continue to *not* use reciprocal estimates by default with -ffast-math.

Differential Revision: http://reviews.llvm.org/D8982



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239001 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 01:32:35 +00:00
Asaf Badouh
ce375dc63a re-apply 238809
AVX-512: Implemented GETEXP instruction for KNL and SKX
Added rounding mode modifier for SQRTPS/PD
Added tests for encoding and intrinsics.
CR:
http://reviews.llvm.org/D9991


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238923 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-03 13:41:48 +00:00
Elena Demikhovsky
fc28da72f0 AVX-512: More code improvements in shuffles, NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238919 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-03 12:05:03 +00:00
Elena Demikhovsky
10eb2dd9df AVX-512: VSHUFPD instruction selection - code improvements
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238918 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-03 11:21:01 +00:00
Elena Demikhovsky
23dc4bbf1d AVX-512: Implemented SHUFF32x4/SHUFF64x2/SHUFI32x4/SHUFI64x2 instructions for SKX and KNL.
Added tests for encoding.

By Igor Breger (igor.breger@intel.com)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238917 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-03 10:56:40 +00:00
Elena Demikhovsky
49659f6378 X86: Added MPX feature and bound registers.
Intel® Memory Protection Extensions (Intel® MPX) is a new feature in Skylake.
It is a part of KNL and SKX sets. It is also a part of Skylake client.

I added definition of %bnd0 - %bnd3 registers, each register is a pair of 64-bit integers.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238916 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-03 10:30:57 +00:00
Simon Pilgrim
dd5cde6e60 [X86] Removed (unused) FSRL x86 operation
This patch removes the old X86ISD::FSRL op - which allowed float vectors to use the byte right shift operations (causing a domain switch....).

Since the refactoring of the shuffle lowering code this no longer has any use.

Differential Revision: http://reviews.llvm.org/D10169

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238906 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-03 08:32:36 +00:00
Rafael Espindola
a0bcb4184b Revert "make reciprocal estimate code generation more flexible by adding command-line options (2nd try)"
This reverts commit r238842.

It broke -DBUILD_SHARED_LIBS=ON build.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238900 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-03 05:32:44 +00:00
Rafael Espindola
2bff0d30f8 Avoid a call to getOrCreateSymbol when we already have the symbol.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238890 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-03 00:02:40 +00:00
Sanjay Patel
871beb8dd7 make reciprocal estimate code generation more flexible by adding command-line options (2nd try)
The first try (r238051) to land this was reverted due to bot failures
that were hopefully addressed by r238788.

This patch adds a TargetRecip class for processing many recip codegen possibilities.
The class is intended to handle both command-line options to llc as well
as options passed in from a front-end such as clang with the -mrecip option.

The x86 backend is updated to use the new functionality.
Only -mcpu=btver2 with -ffast-math should see a functional change from this patch.
All other x86 CPUs continue to *not* use reciprocal estimates by default with -ffast-math.

Differential Revision: http://reviews.llvm.org/D8982



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238842 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-02 15:28:15 +00:00
Elena Demikhovsky
6628cb50bd AVX-512: Implemented VRANGESD and VRANGESS instructions for SKX Implemented DAG lowering for all these forms.
Added tests for encoding.

By Igor Breger (igor.breger@intel.com)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238834 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-02 14:12:54 +00:00
Elena Demikhovsky
ccbc17f896 AVX-512: Shorten implementation of lowerV16X32VectorShuffle()
using lowerVectorShuffleWithSHUFPS() and other shuffle-helpers routines.
Added matching of VALIGN instruction.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238830 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-02 13:43:18 +00:00
Elena Demikhovsky
d929045eb5 AVX-512: Implemented VFIXUPIMMSD and VFIXUPIMMSS instructions for KNL
Implemented DAG lowering for all these forms.
Added tests for encoding.

By Igor Breger (igor.breger@intel.com)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238811 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-02 08:28:57 +00:00
Asaf Badouh
aa9e1c528b revert 238809
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238810 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-02 07:45:19 +00:00
Asaf Badouh
82fa06895e AVX-512: Implemented GETEXP instruction for KNL and SKX
Added rounding mode modifier for SQRTPS/PD
Added tests for encoding and intrinsics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238809 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-02 07:18:14 +00:00
Elena Demikhovsky
bbd7cab2b9 AVX-512: Optimized vector shuffle for v16f32 and v16i32 types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238743 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-01 13:26:18 +00:00
Elena Demikhovsky
af0e519127 AVX-512: Implemented VRANGEPD and VRANGEPD instructions for SKX.
Implemented DAG lowering for all these forms.
Added tests for encoding.

By Igor Breger (igor.breger@intel.com)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238738 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-01 11:05:34 +00:00
Elena Demikhovsky
aa62d8a6b2 AVX-512: Implemented vector shuffle lowering for v8i64 and v8f64 types.
I removed the vector-shuffle-512-v8.ll, it is auto-generated test, not valid any more.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238735 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-01 09:49:53 +00:00
Elena Demikhovsky
8e12b59b13 AVX-512: added all forms of VPSHUFD and VPSHUFHW, VPSHUFLW
including encodings.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238729 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-01 07:17:23 +00:00
Elena Demikhovsky
9029910d9f AVX-512: Implemented VFIXUPIMMPD and VFIXUPIMMPS instructions for KNL and SKX
Implemented DAG lowering for all these forms.
Added tests for encoding.

by Igor Breger (igor.breger@intel.com)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238728 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-01 06:50:49 +00:00
Elena Demikhovsky
9f63519857 AVX-512: Fixed a bug in compress and expand intrinsics.
By Igor Breger (igor.breger@intel.com)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238724 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-01 06:30:13 +00:00
Matt Arsenault
5f3a6430d6 Add address space argument to isLegalAddressingMode
This is important because of different addressing modes
depending on the address space for GPU targets.

This only adds the argument, and does not update
any of the uses to provide the correct address space.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238723 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-01 05:31:59 +00:00
Rafael Espindola
481f35f113 Simplify another function that doesn't fail.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238703 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-01 00:27:26 +00:00
Simon Pilgrim
08786dc314 Stripped trailing whitespace. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238654 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-30 13:01:42 +00:00
Chandler Carruth
fa68750e54 [x86] Unify the horizontal adding used for popcount lowering taking the
best approach of each.

For vNi16, we use SHL + ADD + SRL pattern that seem easily the best.

For vNi32, we use the PUNPCK + PSADBW + PACKUSWB pattern. In some cases
there is a huge improvement with this in IACA's estimated throughput --
over 2x higher throughput!!!! -- but the measurements are too good to be
true. In one narrow case, the SHL + ADD + SHL + ADD + SRL pattern looks
slightly faster, but I'm not sure I believe any of the measurements at
this point. Both are the exact same uops though. Hard to be confident of
anything past that.

If anyone wants to collect very detailed (Agner-level) timings with the
result of this patch, or with the i32 case replaced with SHL + ADD + SHl
+ ADD + SRL, I'd be very interested. Note that you'll need to test it on
both Ivybridge and Haswell, with both SSE3, SSSE3, and AVX selected as
I saw unique behavior in each of these buckets with IACA all of which
should be checked against measured performance.

But this patch is still a useful improvement by dropping duplicate work
and getting the much nicer PSADBW lowering for v2i64.

I'd still like to rephrase this in terms of generic horizontal sum. It's
a bit lame to have a special case of that just for popcount.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238652 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-30 10:35:03 +00:00
Chandler Carruth
da8bb20158 [x86] Split out the horizontal byte sum lowering component of the LUT
lowering into a helper function.

NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238650 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-30 09:46:16 +00:00
Chandler Carruth
3279f2381b [x86] Replace the long spelling of getting a bitcast with the *much*
shorter one. NFC.

In addition to being much shorter to type and requiring fewer arguments,
this change saves over 30 lines from this one file, all wasted on total
boilerplate...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238640 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-30 04:23:13 +00:00
Chandler Carruth
b26a073acb [x86] Replace the long spelling of getting a bitcast with the new short
spelling. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238639 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-30 04:19:57 +00:00
Chandler Carruth
89a133960b [sdag] Add the helper I most want to the DAG -- building a bitcast
around a value using its existing SDLoc.

Start using this in just one function to save omg lines of code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238638 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-30 04:14:10 +00:00
Chandler Carruth
d8018eeac9 [x86] Restore the bitcasts I removed when refactoring this to avoid
shifting vectors of bytes as x86 doesn't have direct support for that.

This removes a bunch of redundant masking in the generated code for SSE2
and SSE3.

In order to avoid the really significant code size growth this would
have triggered, I also factored the completely repeatative logic for
shifting and masking into two lambdas which in turn makes all of this
much easier to read IMO.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238637 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-30 04:05:11 +00:00
Chandler Carruth
828f5b807c [x86] Implement a faster vector population count based on the PSHUFB
in-register LUT technique.

Summary:
A description of this technique can be found here:
http://wm.ite.pl/articles/sse-popcount.html

The core of the idea is to use an in-register lookup table and the
PSHUFB instruction to compute the population count for the low and high
nibbles of each byte, and then to use horizontal sums to aggregate these
into vector population counts with wider element types.

On x86 there is an instruction that will directly compute the horizontal
sum for the low 8 and high 8 bytes, giving vNi64 popcount very easily.
Various tricks are used to get vNi32 and vNi16 from the vNi8 that the
LUT computes.

The base implemantion of this, and most of the work, was done by Bruno
in a follow up to D6531. See Bruno's detailed post there for lots of
timing information about these changes.

I have extended Bruno's patch in the following ways:

0) I committed the new tests with baseline sequences so this shows
   a diff, and regenerated the tests using the update scripts.

1) Bruno had noticed and mentioned in IRC a redundant mask that
   I removed.

2) I introduced a particular optimization for the i32 vector cases where
   we use PSHL + PSADBW to compute the the low i32 popcounts, and PSHUFD
   + PSADBW to compute doubled high i32 popcounts. This takes advantage
   of the fact that to line up the high i32 popcounts we have to shift
   them anyways, and we can shift them by one fewer bit to effectively
   divide the count by two. While the PSHUFD based horizontal add is no
   faster, it doesn't require registers or load traffic the way a mask
   would, and provides more ILP as it happens on different ports with
   high throughput.

3) I did some code cleanups throughout to simplify the implementation
   logic.

4) I refactored it to continue to use the parallel bitmath lowering when
   SSSE3 is not available to preserve the performance of that version on
   SSE2 targets where it is still much better than scalarizing as we'll
   still do a bitmath implementation of popcount even in scalar code
   there.

With #1 and #2 above, I analyzed the result in IACA for sandybridge,
ivybridge, and haswell. In every case I measured, the throughput is the
same or better using the LUT lowering, even v2i64 and v4i64, and even
compared with using the native popcnt instruction! The latency of the
LUT lowering is often higher than the latency of the scalarized popcnt
instruction sequence, but I think those latency measurements are deeply
misleading. Keeping the operation fully in the vector unit and having
many chances for increased throughput seems much more likely to win.

With this, we can lower every integer vector popcount implementation
using the LUT strategy if we have SSSE3 or better (and thus have
PSHUFB). I've updated the operation lowering to reflect this. This also
fixes an issue where we were scalarizing horribly some AVX lowerings.

Finally, there are some remaining cleanups. There is duplication between
the two techniques in how they perform the horizontal sum once the byte
population count is computed. I'm going to factor and merge those two in
a separate follow-up commit.

Differential Revision: http://reviews.llvm.org/D10084

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238636 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-30 03:20:59 +00:00
Chandler Carruth
43d1e87d73 [x86] Restructure the parallel bitmath lowering of popcount into
a separate routine, generalize it to work for all the integer vector
sizes, and do general code cleanups.

This dramatically improves lowerings of byte and short element vector
popcount, but more importantly it will make the introduction of the
LUT-approach much cleaner.

The biggest cleanup I've done is to just force the legalizer to do the
bitcasting we need. We run these iteratively now and it makes the code
much simpler IMO. Other changes were minor, and mostly naming and
splitting things up in a way that makes it more clear what is going on.

The other significant change is to use a different final horizontal sum
approach. This is the same number of instructions as the old method, but
shifts left instead of right so that we can clear everything but the
final sum with a single shift right. This seems likely better than
a mask which will usually have to read the mask from memory. It is
certaily fewer u-ops. Also, this will be temporary. This and the LUT
approach share the need of horizontal adds to finish the computation,
and we have more clever approaches than this one that I'll switch over
to.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238635 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-30 03:20:55 +00:00
Jim Grosbach
586c0042da MC: Clean up MCExpr naming. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238634 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-30 01:25:56 +00:00
Reid Kleckner
bfa311df8c [WinEH] Adjust the 32-bit SEH prologue to better match reality
It turns out that _except_handler3 and _except_handler4 really use the
same stack allocation layout, at least today. They just make different
choices about encoding the LSDA.

This is in preparation for lowering the llvm.eh.exceptioninfo().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238627 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-29 22:57:46 +00:00
Reid Kleckner
f0e3e4cd84 Disable FP elimination in funcs using 32-bit MSVC EH personalities
The value in 'ebp' acts as an implicit argument to the outlined
handlers, and is recovered with frameaddress(1).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238619 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-29 21:58:11 +00:00
Rafael Espindola
cfac75ad0e Remove getData.
This completes the mechanical part of merging MCSymbol and MCSymbolData.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238617 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-29 21:45:01 +00:00
Reid Kleckner
38a2e49d1c Only add the EH state insertion pass on 32-bit Windows
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238612 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-29 20:43:10 +00:00
Rafael Espindola
5760c5fe31 Remove the MCSymbolData typedef.
The getData member function is next.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238611 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-29 20:41:47 +00:00
Reid Kleckner
16e4a624c4 [WinEH] Emit EH tables for __CxxFrameHandler3 on 32-bit x86
Small (really small!) C++ exception handling examples work on 32-bit x86
now.

This change disables the use of .seh_* directives in WinException when
CFI is not in use. It also uses absolute symbol references in the tables
instead of imagerel32 relocations.

Also fixes a cache invalidation bug in MMI personality classification.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238575 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-29 17:00:57 +00:00
Matthias Braun
e67bd6c248 CodeGen: Use mop_iterator instead of MIOperands/ConstMIOperands
MIOperands/ConstMIOperands are classes iterating over the MachineOperand
of a MachineInstr, however MachineInstr::mop_iterator does the same
thing.

I assume these two iterators exist to have a uniform interface to
iterate over the operands of a machine instruction bundle and a single
machine instruction. However in practice I find it more confusing to have 2
different iterator classes, so this patch transforms (nearly all) the
code to use mop_iterators.

The only exception being MIOperands::anlayzePhysReg() and
MIOperands::analyzeVirtReg() still needing an equivalent, I leave that
as an exercise for the next patch.

Differential Revision: http://reviews.llvm.org/D9932

This version is slightly modified from the proposed revision in that it
introduces MachineInstr::getOperandNo to avoid the extra counting
variable in the few loops that previously used MIOperands::getOperandNo.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238539 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-29 02:56:46 +00:00
Reid Kleckner
5f50442d79 [WinEH] Start inserting state number stores for C++ EH
This moves all the state numbering code for C++ EH to WinEHPrepare so
that we can call it from the X86 state numbering IR pass that runs
before isel.

Now we just call the same state numbering machinery and insert a bunch
of stores. It also populates MachineModuleInfo with information about
the current function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238514 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 22:00:24 +00:00
Rafael Espindola
9886da621d Remove a trivial forwarding function. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238506 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 21:36:02 +00:00
Reid Kleckner
cb95e9ef45 Remove debug prints from r238487
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238501 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 21:23:53 +00:00
Reid Kleckner
7738ecd62b Disable x86 tail call optimizations that jump through GOT
For x86 targets, do not do sibling call optimization when materializing
the callee's address would require a GOT relocation. We can still do
tail calls to internal functions, hidden functions, and protected
functions, because they do not require this kind of relocation. It is
still possible to get GOT relocations when the user explicitly asks for
it with musttail or -tailcallopt, both of which are supposed to
guarantee TCO.

Based on a patch by Chih-hung Hsieh.

Reviewers: srhines, timmurray, danalbert, enh, void, nadav, rnk

Subscribers: joerg, davidxl, llvm-commits

Differential Revision: http://reviews.llvm.org/D9799

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238487 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 20:44:28 +00:00
Reid Kleckner
9417bdcc55 [WinEH] Remove debugging dump() call
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238472 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 20:02:05 +00:00
Elena Demikhovsky
d56dcc4243 AVX-512: Fixed a bug in extracting subvector from v64i1
By Igor Breger (igor.breger@intel.com)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238322 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-27 14:09:33 +00:00
Elena Demikhovsky
078088b790 AVX-512: Implemented all forms of sign-extend and zero-extend instructions for KNL and SKX
Implemented DAG lowering for all these forms.
Added tests for DAG lowering and encoding.

By Igor Breger (igor.breger@intel.com)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238301 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-27 08:15:19 +00:00
Quentin Colombet
60c91c28e4 [X86] Implement the support for shrink-wrapping.
With this patch the x86 backend is now shrink-wrapping capable
and this functionality can be tested by using the
-enable-shrink-wrap switch.

The next step is to make more test and enable shrink-wrapping by
default for x86.

Related to <rdar://problem/20821487>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238293 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-27 06:28:41 +00:00
Rafael Espindola
890a876e0e Print "lock \t foo" instead of "lock \n foo".
This gets gas and llc -filetype=obj to agree on the order of prefixes.

For llvm-mc we need to fix the asm parser to know that it makes a difference
on which line the "lock" is in.

Part of pr23594.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238232 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-26 18:35:10 +00:00
Elena Demikhovsky
001a2ba63f AVX-512: fixed a bug in arithmetic operations lowering for i1 type
https://llvm.org/bugs/show_bug.cgi?id=23630



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238198 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-26 12:37:17 +00:00
Elena Demikhovsky
55fd78065f AVX-512: fixed a bug in lowering VSELECT for 512-bit vector
https://llvm.org/bugs/show_bug.cgi?id=23634



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238195 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-26 11:32:39 +00:00
Michael Kuperstein
d714fcf5c8 Use std::bitset for SubtargetFeatures.
Previously, subtarget features were a bitfield with the underlying type being uint64_t. 
Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset.
No functional change.

The first several times this was committed (e.g. r229831, r233055), it caused several buildbot failures.
Apparently the reason for most failures was both clang and gcc's inability to deal with large numbers (> 10K) of bitset constructor calls in tablegen-generated initializers of instruction info tables. 
This should now be fixed.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238192 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-26 10:47:10 +00:00
Rafael Espindola
1826cd69f3 Stop using MCSectionData in MCMachObjectWriter.h.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238165 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-26 01:15:30 +00:00
Rafael Espindola
504473e6ae Stop using MCSectionData in MCExpr.h.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238163 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-26 00:52:18 +00:00
Rafael Espindola
f363960679 Return a MCSection from MCFragment::getParent().
Another step in merging MCSectionData and MCSection.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238162 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-26 00:36:57 +00:00
Simon Pilgrim
4da23583b6 [X86][AVX2] Vectorized i16 shift operators
Part of D9474, this patch extends AVX2 v16i16 types to 2 x 8i32 vectors and uses i32 shift variable shifts before packing back to i16.

Adds AVX2 tests for v8i16 and v16i16 

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238149 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-25 17:49:13 +00:00
Rafael Espindola
8823110a85 Stop forwarding getOrdinal and setOrdinal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238139 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-25 14:12:48 +00:00
Michael Kuperstein
8ffbb68a86 [X86] When pattern-matching scalar FMA3 intrinsics, don't re-arrange the first and second operands.
The semantics of the scalar FMA intrinsics are that the high vector elements are copied from the first source.
The existing pattern switches src1 and src2 around, to match the "213" order, which ends up tying the original src2 to the dest. Since the actual scalar fma3 instructions copy the high elements from the dest register, the wrong values are copied.

This modifies the pattern to leave src1 and src2 in their original order.

Differential Revision: http://reviews.llvm.org/D9908

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238131 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-25 12:35:25 +00:00
Elena Demikhovsky
17b7d6bf25 Added promotion to EXTRACT_SUBVECTOR operand.
I encountered with this case in one of KNL tests for i1 vectors.
v16i1 = EXTRACT_SUBVECTOR v32i1, x



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238130 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-25 11:33:13 +00:00
NAKAMURA Takumi
4d3b6d43cc Reformat.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238126 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-25 01:43:34 +00:00