Commit Graph

8855 Commits

Author SHA1 Message Date
Benjamin Kramer
2f8a6cdfa3 X86: Turn mul of <4 x i32> into pmuludq when no SSE4.1 is available.
pmuludq is slow, but it turns out that all the unpacking and packing of the
scalarized mul is even slower. 10% speedup on loop-vectorized paq8p.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170985 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-22 16:07:56 +00:00
Benjamin Kramer
17347912b4 X86: Emit vector sext as shuffle + sra if vpmovsx is not available.
Also loosen the SSSE3 dependency a bit, expanded pshufb + psra is still better
than scalarized loads. Fixes PR14590.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170984 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-22 11:34:28 +00:00
Nadav Rotem
d0696ef8c3 In some cases, due to scheduling constraints we copy the EFLAGS.
The only way to read the eflags is using push and pop. If we don't
adjust the stack then we run over the first frame index. This is
not something that we want to do, so we have to make sure that
our machine function does not copy the flags. If it does then
we have to emit the prolog that adjusts the stack.

rdar://12896831



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170961 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-21 23:48:49 +00:00
Benjamin Kramer
2556c6b4b6 X86: Match pmin/pmax as a target specific dag combine. This occurs during vectorization.
Part of PR14667.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170908 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-21 17:46:58 +00:00
Benjamin Kramer
739c7a83e1 X86: Match the SSE/AVX min/max vector ops using a custom node instead of intrinsics
This is very mechanical, no functionality change. Preparation for PR14667.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170898 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-21 14:04:55 +00:00
Nadav Rotem
042a9a2666 Add a missing "virtual" keyword.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170842 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-21 05:02:12 +00:00
Nadav Rotem
f5637c3997 Improve the X86 cost model for loads and stores.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170830 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-21 01:33:59 +00:00
Jakob Stoklund Olesen
be06aacaa9 Add an MF argument to MI::copyImplicitOps().
This function is often used to decorate dangling instructions, so a
context reference is required to allocate memory for the operands.

Also add a corresponding MachineInstrBuilder method.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170797 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-20 22:54:02 +00:00
Roman Divacky
6af228a92a Remove MCTargetAsmLexer and its derived classes now that edis,
its only user, is gone.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170699 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-20 14:43:30 +00:00
Richard Smith
ba836a2e80 Fix use-before-construction of X86TargetLowering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170654 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-20 04:04:17 +00:00
Jim Grosbach
fbf3b4a076 MC: Add MCInstrDesc::mayAffectControlFlow() method.
MC disassembler clients (LLDB) are interested in querying if an
instruction may affect control flow other than by virtue of being
an explicit branch instruction. For example, instructions which
write directly to the PC on some architectures.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170610 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-19 23:38:53 +00:00
Jakob Stoklund Olesen
37a942cd52 Remove the explicit MachineInstrBuilder(MI) constructor.
Use the version that also takes an MF reference instead.

It would technically be possible to extract an MF reference from the MI
as MI->getParent()->getParent(), but that would not work for MIs that
are not inserted into any basic block.

Given the reasonably small number of places this constructor was used at
all, I preferred the compile time check to a run time assertion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170588 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-19 21:31:56 +00:00
Roman Divacky
759e3fa641 Remove edis - the enhanced disassembler. Fixes PR14654.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170578 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-19 19:55:47 +00:00
Paul Redmond
6da2e22dff Transform (x&C)>V into (x&C)!=0 where possible
When the least bit of C is greater than V, (x&C) must be greater than V
if it is not zero, so the comparison can be simplified.

Although this was suggested in Target/X86/README.txt, it benefits any
architecture with a directly testable form of AND.

Patch by Kevin Schoedel


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170576 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-19 19:47:13 +00:00
Patrik Hagglund
e5c65911a6 Change TargetLowering::getTypeForExtArgOrReturn to take and return
MVTs, instead of EVTs.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170537 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-19 12:02:25 +00:00
Patrik Hagglund
dfcf33a287 Change TargetLowering::RegisterTypeForVT to contain MVTs, instead of
EVTs.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170535 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-19 11:48:16 +00:00
Patrik Hagglund
0340557fb8 Change TargetLowering::findRepresentativeClass to take an MVT, instead
of EVT.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170532 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-19 11:30:36 +00:00
NAKAMURA Takumi
16537418f4 X86ISelLowering.cpp: Fix warnings. [-Wlogical-op-parentheses]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170523 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-19 10:12:48 +00:00
Elena Demikhovsky
4b977312c7 Optimized load + SIGN_EXTEND patterns in the X86 backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170506 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-19 07:50:20 +00:00
Bill Wendling
034b94b170 Rename the 'Attributes' class to 'Attribute'. It's going to represent a single attribute in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170502 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-19 07:18:57 +00:00
Jakub Staszak
270bfbd3d1 Reverse order of checking SSE level when calculating compare cost, so we check
AVX2 before AVX.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170464 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-18 22:57:56 +00:00
Craig Topper
a521e68210 Remove EFLAGS from the BLSI/BLSMSK/BLSR patterns. The nodes created by DAG combine don't contain an EFLAGS def.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170308 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-17 06:13:48 +00:00
Craig Topper
b926afcc5b Simplify BMI ANDN matching to use patterns instead of a DAG combine. Also add ANDN to isDefConvertible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170305 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-17 05:12:30 +00:00
Craig Topper
b72ae70036 Add rest of BMI/BMI2 instructions to the folding tables as well as popcnt and lzcnt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170304 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-17 05:02:29 +00:00
Craig Topper
16a1acc3b9 Remove store forms of DEC/INC from isDefConvertible. Since they are stores they don't have a register def.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170303 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-17 04:55:07 +00:00
Benjamin Kramer
388fc6a988 X86: Add a couple of target-specific dag combines that turn VSELECTS into psubus if possible.
We match the pattern "x >= y ? x-y : 0" into "subus x, y" and two special cases
if y is a constant. DAGCombiner canonicalizes those so we first have to undo the
canonicalization for those cases. The pattern occurs in gzip when the loop
vectorizer is enabled. Part of PR14613.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170273 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-15 16:47:44 +00:00
Chandler Carruth
5db4bceb47 Make '-mtune=x86_64' assume fast unaligned memory accesses.
Not all chips targeted by x86_64 have this feature, but a dramatically
increasing number do. Specifying a chip-specific tuning parameter will
continue to turn the feature on or off as appropriate for that
particular chip, but the generic flag should try to achieve the best
performance on the most widely available hardware. Today, the number of
chips with fast UA access dwarfs those without in the x86-64 space.

Note that this also brings LLVM's code generation for this '-march' flag
more in line with that of modern GCCs. Reviewed by Dan Gohman.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170269 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-15 09:01:13 +00:00
Nadav Rotem
0a1e914f8f TypeLegalizer: Do not generate target specific nodes with illegal types, because we cant type-legalize them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170245 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-14 21:20:37 +00:00
Eli Bendersky
e1d31008c9 Fix a bogus comment
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170052 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-13 00:24:56 +00:00
Evan Cheng
946a3a9f22 Sorry about the churn. One more change to getOptimalMemOpType() hook. Did I
mention the inline memcpy / memset expansion code is a mess?

This patch split the ZeroOrLdSrc argument into two: IsMemset and ZeroMemset.
The first indicates whether it is expanding a memset or a memcpy / memmove.
The later is whether the memset is a memset of zero. It's totally possible
(likely even) that targets may want to do different things for memcpy and
memset of zero.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169959 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-12 02:34:41 +00:00
Evan Cheng
7d34267df6 - Rename isLegalMemOpType to isSafeMemOpType. "Legal" is a very overloade term.
Also added more comments to explain why it is generally ok to return true.
- Rename getOptimalMemOpType argument IsZeroVal to ZeroOrLdSrc. It's meant to
be true for loaded source (memcpy) or zero constants (memset). The poor name
choice is probably some kind of legacy issue.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169954 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-12 01:32:07 +00:00
Evan Cheng
61f4dfe369 Avoid using lossy load / stores for memcpy / memset expansion. e.g.
f64 load / store on non-SSE2 x86 targets.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169944 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-12 00:42:09 +00:00
Patrik Hagglund
34525f9ac0 Revert EVT->MVT changes, r169836-169851, due to buildbot failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169854 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-11 11:14:33 +00:00
Patrik Hagglund
47fd10f2fc Change TargetLowering::getTypeForExtArgOrReturn to take and return
MVTs, instead of EVTs.

Accordingly, add bitsLT (and similar) to MVT.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169850 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-11 10:20:51 +00:00
Patrik Hagglund
2d916231ff Change TargetLowering::RegisterTypeForVT to contain MVTs, instead of
EVTs.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169848 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-11 10:09:23 +00:00
Patrik Hagglund
bade0345d1 Change TargetLowering::findRepresentativeClass to take an MVT, instead
of EVT.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169845 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-11 09:57:18 +00:00
Chad Rosier
425e951734 Fall back to the selection dag isel to select tail calls.
This shouldn't affect codegen for -O0 compiles as tail call markers are not
emitted in unoptimized compiles.  Testing with the external/internal nightly
test suite reveals no change in compile time performance.  Testing with -O1,
-O2 and -O3 with fast-isel enabled did not cause any compile-time or
execution-time failures.  All tests were performed on my x86 machine.
I'll monitor our arm testers to ensure no regressions occur there.

In an upcoming clang patch I will be marking the objc_autoreleaseReturnValue
and objc_retainAutoreleaseReturnValue as tail calls unconditionally.  While
it's theoretically true that this is just an optimization, it's an
optimization that we very much want to happen even at -O0, or else ARC
applications become substantially harder to debug.

Part of rdar://12553082

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169796 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-11 00:18:02 +00:00
Evan Cheng
376642ed62 Some enhancements for memcpy / memset inline expansion.
1. Teach it to use overlapping unaligned load / store to copy / set the trailing
   bytes. e.g. On 86, use two pairs of movups / movaps for 17 - 31 byte copies.
2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g.
   x86 and ARM.
3. When memcpy from a constant string, do *not* replace the load with a constant
   if it's not possible to materialize an integer immediate with a single
   instruction (required a new target hook: TLI.isIntImmLegal()).
4. Use unaligned load / stores more aggressively if target hooks indicates they
   are "fast".
5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8.
   Also increase the threshold to something reasonable (8 for memset, 4 pairs
   for memcpy).

This significantly improves Dhrystone, up to 50% on ARM iOS devices.

rdar://12760078


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169791 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-10 23:21:26 +00:00
Chandler Carruth
6226146f41 Revert "Make '-mtune=x86_64' assume fast unaligned memory accesses."
Accidental commit... git svn betrayed me. Sorry for the noise.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169741 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-10 18:23:52 +00:00
Chandler Carruth
b859d528f3 Make '-mtune=x86_64' assume fast unaligned memory accesses.
Summary:
Not all chips targeted by x86_64 have this feature, but a dramatically
increasing number do. Specifying a chip-specific tuning parameter will
continue to turn the feature on or off as appropriate for that
particular chip, but the generic flag should try to achieve the best
performance on the most widely available hardware. Today, the number of
chips with fast UA access dwarfs those without in the x86-64 space.

Note that this also brings LLVM's code generation for this '-march' flag
more in line with that of modern GCCs.

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D195

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169740 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-10 18:22:42 +00:00
Chandler Carruth
2c0575f2f4 Fix a typo in my previous commit -- bloomfield is 0x1A not 0x2A.
Thanks to the PaX folks for noticing in review! We need some tests here,
any sugestions welcome...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169739 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-10 18:22:40 +00:00
Chandler Carruth
9f3f40f6ef Address a FIXME and update the fast unaligned memory feature for newer
Intel chips.

The model number rules were determined by inspecting Intel's
documentation for their newer chip model numbers. My understanding is
that all of the newer Intel chips have fast unaligned memory access, but
if anyone is concerned about a particular chip, just shout.

No tests updated; it's not clear we have dedicated tests for the chips'
various features, but if anyone would like tests (or can point me at
some existing ones), I'm happy to oblige.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169730 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-10 09:18:44 +00:00
Shuxin Yang
5518a1355b - Re-enable population count loop idiom recognization
- fix a bug which cause sigfault.
- add two testing cases which was causing crash


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169687 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-09 03:12:46 +00:00
Chandler Carruth
7065a2bcec Revert the patches adding a popcount loop idiom recognition pass.
There are still bugs in this pass, as well as other issues that are
being worked on, but the bugs are crashers that occur pretty easily in
the wild. Test cases have been sent to the original commit's review
thread.

This reverts the commits:
  r169671: Fix a logic error.
  r169604: Move the popcnt tests to an X86 subdirectory.
  r168931: Initial commit adding the pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169683 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-08 22:18:29 +00:00
Bill Wendling
99faa3b4ec s/AttrListPtr/AttributeSet/g to better label what this class is going to be in the near future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169651 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-07 23:16:57 +00:00
Nadav Rotem
af59e9adbd When we use the BLEND instruction that uses the MSB as a mask, we can remove
the VSRI instruction before it since it does not affect the MSB.

Thanks Craig Topper for suggesting this.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169638 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-07 21:43:11 +00:00
Nadav Rotem
e4ccfef809 X86: Prefer using VPSHUFD over VPERMIL because it has better throughput.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169624 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-07 19:01:13 +00:00
Evan Cheng
2766a47310 Replace r169459 with something safer. Rather than having computeMaskedBits to
understand target implementation of any_extend / extload, just generate
zero_extend in place of any_extend for liveouts when the target knows the
zero_extend will be implicit (e.g. ARM ldrb / ldrh) or folded (e.g. x86 movz).

rdar://12771555


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169536 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06 19:13:27 +00:00
Jakub Staszak
d3a056392b Remove unneeded function, since PR8156 was fixed over a year ago.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169534 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06 19:05:46 +00:00
Jakub Staszak
b2af3a095b Simplify code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169521 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-06 18:22:59 +00:00