Commit Graph

30940 Commits

Author SHA1 Message Date
Adam Nemet
fbd0e464dd [AVX512] Intrinsics for vextract*x4
This adds the Pat<>'s for the intrinsics.  These are necessary because we
don't lower these intrinsics to SDNodes but match them directly.  See the
rational in the previous commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219362 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 23:25:37 +00:00
Adam Nemet
e868005a27 [AVX512] Add asm-only support for vextract*x4 masking variants
These derive from the new asm-only masking definitions.

Unfortunately I wasn't able to find a ISel pattern that we could legally
generate for the masking variants.  The problem is that since the destination
is v4* we would need VK4 register classes and v4i1 value types to express the
masking.  These are however not legal types/classes in AVX512f but only in VL,
so things get complicated pretty quickly.  We can revisit this question later
if we have a more pressing need to express something like this.

So the ISel patterns are empty for the masking instructions and the next patch
will add Pat<>s instead to match the intrinsics calls with instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219361 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 23:25:33 +00:00
Adam Nemet
9d0ec9212b [AVX512] Move DAG for all-zero node to X86VectorVTInfo
No functional change.

No change in X86.td.expanded except for the appearance of the new attributes.

The new attributes will be used in the subsequent patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219360 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 23:25:31 +00:00
Adam Nemet
6feb834941 [AVX512] Peel off an asm-only class from AVX512_masking_common.
No functional change.

This enables the generation of masking instructions that don't provide a
ISel pattern.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219358 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 23:25:23 +00:00
Robin Morisset
b79d91ca1c [X86] Don't transform atomic-load-add into an inc/dec when inc/dec is slow
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219357 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 23:16:23 +00:00
Robin Morisset
48dfa127d7 [X86] Avoid generating inc/dec when slow for x.atomic_store(1 + x.atomic_load())
Summary:
I had forgotten to check for NotSlowIncDec in the patterns that can generate
inc/dec for the above pattern (added in D4796).
This currently applies to Atom Silvermont, KNL and SKX.

Test Plan: New checks on atomic_mi.ll

Reviewers: jfb, nadav

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5677

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219336 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 19:38:18 +00:00
Robert Khasanov
0e3754615e [AVX512] Added intrinsics for 128-, 256- and 512-bit versions of VPCMP/VPCMPU{BWDQ}
Added CMP_MASK_CC intrinsic type.
Added tests for intrinsics.

Patch by Sergey Lisitsyn <sergey.lisitsyn@intel.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219316 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 15:49:26 +00:00
Robert Khasanov
e659ba92c8 [AVX512] Refactoring of avx512_binop_rm multiclass through AVX512_masking.
Added new argrument for AVX512_masking: InstrItinClass and bit isCommutable.
No functional change.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219310 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 14:37:45 +00:00
Renato Golin
1e059a88f8 Emit unaligned access build attribute for ARM
Patch by Charlie Turner.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219301 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 12:26:22 +00:00
Renato Golin
ef2ed3f465 Refactor isThumb1Only() && isMClass() into a predicate called isV6M()
This must be enforced for all v6M cores, not just the cortex-m0,
irregardless of the user-specified alignment.

Patch by Charlie Turner.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219300 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 12:26:16 +00:00
Renato Golin
3828392790 Simplify switch statement in ARM subtarget align access
This switch can be reduced to a simpler if/else statement.

Patch by Charlie Turner.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219299 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 12:26:13 +00:00
Eric Christopher
b7cd35b171 Cache TargetLowering on SelectionDAGISel and update previous
calls to getTargetLowering() with the cached variable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219284 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 07:32:17 +00:00
Chad Rosier
2929da99a9 [AArch64] Generate vector signed/unsigned mul and mla/mls long.
Phabricator Revision: http://reviews.llvm.org/D5589
Patch by Balaram Makam <bmakam@codeaurora.org>!!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219276 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 02:31:24 +00:00
Robin Morisset
28127ffb51 [X86] Fix a bug with fetch_add(INT32_MIN)
Summary:
Fix pr21099

The pseudocode of what we were doing (spread through two functions) was:
if (operand.doesNotFitIn32Bits())
  Opc.initializeWithFoo();
if (operand < 0)
  operand = -operand;
if (operand.doesFitIn8Bits())
  Opc.initializeWithBar();
else if (operand.doesFitIn32Bits())
  Opc.initializeWithBlah();
doStuff(Opc);

So for operand == INT32_MIN, Opc was never initialized because the operand changes
from fitting in 32 bits to not fitting, causing the various bugs/error messages
noted by pr21099.

This patch adds an extra test at the beginning for this case, and an
llvm_unreachable to have better error message if the operand ends up
not fitting in 32-bits at the end.

Test Plan: new test + make check

Reviewers: jfb

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5655

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219257 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 23:53:57 +00:00
Tom Stellard
806e08c5a0 R600/SI: Refactor VOP3 instruction defs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219256 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 23:51:41 +00:00
Tom Stellard
96c9ebbd04 R600/SI: Refactor VOPC instruction defs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219255 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 23:51:39 +00:00
Tom Stellard
40949737c4 R600/SI: Refactor VOP2 instruction defs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219254 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 23:51:38 +00:00
Tom Stellard
6bf4c41242 R600/SI: Refactor VOP1 instruction defs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219253 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 23:51:34 +00:00
Matt Arsenault
1ba00eab26 R600: Remove dead code
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219242 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 21:29:56 +00:00
Tom Stellard
ae06e97b3b R600: Remove some redundant initializations from AMDGPUMCAsmInfo
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219238 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 21:09:25 +00:00
Tom Stellard
9eb63e54db R600: Use MCAsmInfoELF as AMDGPUMCAsmInfo base class
The main reason for this is that the MCAsmInfo class,
which we were previously using as the base class, sets
PrivateGlobalPrefix to "L", which causes all global
functions that start with L to be treated as local symbols.

MCAsmInfoELF sets PrivateGlobalPrefix to ".L", which is what
we want, and it is probably a good idea to use this as the
base class anyway, since we are emitting ELF binaries.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219237 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 21:09:23 +00:00
Tom Stellard
b8fee4f1d9 R600/SI: Remove assertion in SIInstrInfo::areLoadsFromSameBasePtr()
Added a FIXME coment instead, we need to handle the case where the
two DS instructions being compared have different numbers of operands.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219236 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 21:09:20 +00:00
Yuri Gorshenin
86e0844d1c [asan-asm-instrumentation] CFI directives are generated for .S files.
Summary: CFI directives are generated for .S files.

Reviewers: eugenis

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5520

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219199 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 11:03:09 +00:00
Daniel Sanders
75046b4891 [mips] Return {f128} correctly for N32/N64.
Summary:
According to the ABI documentation, f128 and {f128} should both be returned
in $f0 and $f2. However, this doesn't match GCC's behaviour which is to
return f128 in $f0 and $f2, but {f128} in $f0 and $f1.

Reviewers: vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5578

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219196 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 09:29:59 +00:00
Craig Topper
95717dbb11 [X86] Fix a bug where the disassembler was ignoring the VEX.W bit in 32-bit mode for certain instructions it shouldn't.
Unfortunately, this isn't easy to fix since there's no simple way to figure out from the disassembler tables whether the W-bit is being used to select a 64-bit GPR or if its a required part of the opcode. The fix implemented here just looks for "64" in the instruction name and ignores the W-bit in 32-bit mode if its present.

Fixes PR21169.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219194 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 07:29:50 +00:00
Craig Topper
ada4703cba Formatting fixes. Most putting 'else' on the same line as the preceding curly brace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219193 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 07:29:48 +00:00
Craig Topper
038a3b8d65 Fix filename in header and use C++ version of the C header files.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219192 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 07:29:46 +00:00
Juergen Ributzka
301d3d04f0 [FastISel][AArch64] Teach the address computation code to also fold sign-/zero-extends.
The code already folds sign-/zero-extends, but only if they are arguments to
mul and shift instructions. This extends the code to also fold them when they
are direct inputs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219187 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 03:40:06 +00:00
Juergen Ributzka
3692081566 [FastISel][AArch64] Teach the address computation to also fold sub instructions.
Tiny enhancement to the address computation code to also fold sub instructions
if the rhs is constant and can be folded into the offset.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219186 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 03:40:03 +00:00
Juergen Ributzka
ca07e256f6 [FastISel][AArch64] Fix "Fold sign-/zero-extends into the load instruction."
This commit fixes an issue with sign-/zero-extending loads that was discovered
by Richard Barton.

We use now the correct load instructions for sign-extending loads to 64bit. Also
updated and added more unit tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219185 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 03:39:59 +00:00
NAKAMURA Takumi
844eeb3741 ARMInstPrinter.cpp: Suppress a warning for -Asserts. [-Wunused-variable]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219172 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-06 23:48:04 +00:00
Tim Northover
ca6c95df36 ARM: silence unused variable warning
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219128 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-06 17:26:36 +00:00
Tim Northover
969587d62d ARM: remove dead InstPrinting code
This instruction form is handled by different AsmOperands now, so the code is
completely dead (and wrong anyway).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219127 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-06 17:10:13 +00:00
Benjamin Kramer
043994e266 X86: Drop the isConvertibleTo3Addr bit from shufps/shufpd now that we don't convert them anymore.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219112 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-06 09:56:40 +00:00
Eric Christopher
e6f32be8df Add subtarget caches to aarch64, arm, ppc, and x86.
These will make it easier to test further changes to the
code generation and optimization pipelines as those are
moved to subtargets initialized with target feature and
target cpu.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219106 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-06 06:45:36 +00:00
Chandler Carruth
1d02acb7a0 [x86] Remove the 2-addr-to-3-addr "optimization" from shufps to pshufd.
This trades a (register-renamer-friendly) movaps for a floating point
/ integer domain cross. That is a very bad trade, even on architectures
where domain crossing is relatively fast. On any chip where there is
even a cycle stall, this is a Very Bad Idea. It doesn't even seem likely
to cause a spill to be introduced because the reason for the copy is to
destructively shuffle in place.

Thanks to Ben Kramer for fixing a bug in this code that my new shuffle
lowering exposed and highlighting that perhaps it should just go away.
=]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219090 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-05 22:57:31 +00:00
Benjamin Kramer
88b3a52eec X86: Don't drop half of the mask when converting 2-address shufps into 3-address pshufd.
It's debatable whether this transform is useful at all, but for now make sure
we don't generate invalid asm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219084 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-05 16:14:29 +00:00
Elena Demikhovsky
a0cb2c75b0 AVX-512-SKX: Added instruction VPMOVM2B/W/D/Q.
This instruction allows to broadacst mask vector to data vector.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219083 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-05 14:11:08 +00:00
Chandler Carruth
30ae72f02c [x86] Fix PR21139, one of the last remaining regressions found in the
new vector shuffle lowering.

This is loosely based on a patch by Marius Wachtler to the PR (thanks!).
I refactored it a bi to use std::count_if and a mutable array ref but
the core idea was exactly right. I also added some direct testing of
this case.

I believe PR21137 is now the only remaining regression.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219081 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-05 12:07:34 +00:00
Chandler Carruth
a644b090de [x86] Teach the new vector shuffle lowering how to lower 128-bit
shuffles using AVX and AVX2 instructions. This fixes PR21138, one of the
few remaining regressions impacting benchmarks from the new vector
shuffle lowering.

You may note that it "regresses" many of the vperm2x128 test cases --
these were actually "improved" by the naive lowering that the new
shuffle lowering previously did. This regression gave me fits. I had
this patch ready-to-go about an hour after flipping the switch but
wasn't sure how to have the best of both worlds here and thought the
correct solution might be a completely different approach to lowering
these vector shuffles.

I'm now convinced this is the correct lowering and the missed
optimizations shown in vperm2x128 are actually due to missing
target-independent DAG combines. I've even written most of the needed
DAG combine and will submit it shortly, but this part is ready and
should help some real-world benchmarks out.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219079 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-05 11:41:36 +00:00
NAKAMURA Takumi
529fcbed32 HexagonMCCodeEmitter.cpp: Prune 2nd redundant \brief. [-Wdocumentation]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219073 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-05 04:54:54 +00:00
NAKAMURA Takumi
935e4326a5 HexagonDesc: Update LLVMBuild.txt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219071 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-05 04:54:29 +00:00
Benjamin Kramer
3cdf75c5cf [SystemZ] Make operator bool explicit. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219069 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-04 22:44:35 +00:00
Benjamin Kramer
814a2ffc7c Make AAMDNodes ctor and operator bool (!!!) explicit, mop up bugs and weirdness exposed by it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219068 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-04 22:44:29 +00:00
Benjamin Kramer
dbc6d9b9d7 Remove unnecessary copying or replace it with moves in a bunch of places.
NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219061 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-04 16:55:56 +00:00
Chandler Carruth
03a77831cc [x86] Enable the new vector shuffle lowering by default.
Update the entire regression test suite for the new shuffles. Remove
most of the old testing which was devoted to the old shuffle lowering
path and is no longer relevant really. Also remove a few other random
tests that only really exercised shuffles and only incidently or without
any interesting aspects to them.

Benchmarking that I have done shows a few small regressions with this on
LNT, zero measurable regressions on real, large applications, and for
several benchmarks where the loop vectorizer fires in the hot path it
shows 5% to 40% improvements for SSE2 and SSE3 code running on Sandy
Bridge machines. Running on AMD machines shows even more dramatic
improvements.

When using newer ISA vector extensions the gains are much more modest,
but the code is still better on the whole. There are a few regressions
being tracked (PR21137, PR21138, PR21139) but by and large this is
expected to be a win for x86 generated code performance.

It is also more correct than the code it replaces. I have fuzz tested
this extensively with ISA extensions up through AVX2 and found no
crashes or miscompiles (yet...). The old lowering had a few miscompiles
and crashers after a somewhat smaller amount of fuzz testing.

There is one significant area where the new code path lags behind and
that is in AVX-512 support. However, there was *extremely little*
support for that already and so this isn't a significant step backwards
and the new framework will probably make it easier to implement lowering
that uses the full power of AVX-512's table-based shuffle+blend (IMO).

Many thanks to Quentin, Andrea, Robert, and others for benchmarking
assistance. Thanks to Adam and others for help with AVX-512. Thanks to
Hal, Eric, and *many* others for answering my incessant questions about
how the backend actually works. =]

I will leave the old code path in the tree until the 3 PRs above are at
least resolved to folks' satisfaction. Then I will rip it (and 1000s of
lines of code) out. =] I don't expect this flag to stay around for very
long. It may not survive next week.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219046 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-04 03:52:55 +00:00
Jingyue Wu
ea1cccd0b6 Add fake use to suppress defined-but-unused warnings
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219045 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-04 03:50:10 +00:00
Chandler Carruth
1e663cf69c [x86] Fix a bug in the VZEXT DAG combine that I just made more powerful.
It turns out this combine was always somewhat flawed -- there are cases
where nested VZEXT nodes *can't* be combined: if their types have
a mismatch that can be observed in the result. While none of these show
up in currently, once I switch to the new vector shuffle lowering a few
test cases actually form such nested VZEXT nodes. I've not come up with
any IR pattern that I can sensible write to exercise this, but it will
be covered by tests once I flip the switch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219044 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-04 02:51:03 +00:00
Chandler Carruth
cd2c1d8db1 [x86] Sink a generic combine of VZEXT nodes from the lowering to VZEXT
nodes to the DAG combining of them.

This will allow the combine to fire on both old vector shuffle lowering
and the new vector shuffle lowering and generally seems like a cleaner
design. I've trimmed down the code a bit and tried to make it and the
surrounding combine fairly clean while moving it around.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219042 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-04 01:05:48 +00:00
Matt Arsenault
9747d27b59 R600/SI: Custom lower f64 -> i64 conversions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219038 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 23:54:56 +00:00