12080 Commits

Author SHA1 Message Date
Andrea Di Biagio
53daaff125 [X86] Improved lowering of v4x32 build_vector dag nodes.
This patch improves the lowering of v4f32 and v4i32 build_vector dag nodes
that are known to have at least two non-zero elements.

With this patch, a build_vector that performs a blend with zero is 
converted into a shuffle. This is done to let the shuffle legalizer expand
the dag node in a optimal way. For example, if we know that a build_vector
performs a blend with zero, we can try to lower it as a movq/blend instead of
always selecting an insertps.

This patch also improves the logic that lowers a build_vector into a insertps
with zero masking. See for example the extra test cases added to test sse41.ll.

Differential Revision: http://reviews.llvm.org/D6311


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222375 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-19 19:34:29 +00:00
Tom Stellard
334ebf33ea R600/SI: Make SIInstrInfo::isOperandLegal() more strict
A register operand that has a common sub-class with its instruction's
defined register class is not always legal.  For example,
SReg_32 and M0Reg both have a common sub-class, but we can't
use an SReg_32 in instructions that expect a M0Reg.

This prevents the llvm.SI.sendmsg.ll test from failing when the fold
operand pass is added.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222368 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-19 16:58:49 +00:00
Jozef Kolek
e4e84b22fe [mips][microMIPS] Implement CodeGen support for 16-bit instruction ADDIUR2.
Differential Revision: http://reviews.llvm.org/D5800


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222352 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-19 13:23:58 +00:00
Jozef Kolek
5c6c7e3295 [mips][microMIPS] Implement CodeGen support for ADDIUS5 instruction.
Differential Revision: http://reviews.llvm.org/D5799


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222351 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-19 13:11:09 +00:00
Jozef Kolek
baf97d8987 [mips][microMIPS] Implement SDBBP and RDHWR instructions.
Differential Revision: http://reviews.llvm.org/D5240


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222347 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-19 11:25:50 +00:00
Simon Pilgrim
a6943fff90 [X86][SSE] pslldq/psrldq byte shifts/rotation for SSE2
This patch builds on http://reviews.llvm.org/D5598 to perform byte rotation shuffles (lowerVectorShuffleAsByteRotate) on pre-SSSE3 (palignr) targets - pre-SSSE3 is only enabled on i8 and i16 vector targets where it is a more definite performance gain.

I've also added a separate byte shift shuffle (lowerVectorShuffleAsByteShift) that makes use of the ability of the SLLDQ/SRLDQ instructions to implicitly shift in zero bytes to avoid the need to create a zero register if we had used palignr.

Differential Revision: http://reviews.llvm.org/D5699



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222340 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-19 10:06:49 +00:00
Hao Liu
8db9fbf7cd [AArch64] Enable SeparateConstOffsetFromGEP, EarlyCSE and LICM passes on AArch64 backend.
SeparateConstOffsetFromGEP can gives more optimizaiton opportunities related to GEPs, which benefits EarlyCSE
and LICM. By enabling these passes we can have better address calculations and generate a better addressing
mode. Some SPEC 2006 benchmarks (astar, gobmk, namd) have obvious improvements on Cortex-A57.

Reviewed in http://reviews.llvm.org/D5864.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222331 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-19 06:39:53 +00:00
Weiming Zhao
d8e31c73cd [Aarch64] Customer lowering of CTPOP to SIMD should check for NEON availability
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222292 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-19 00:29:14 +00:00
Matt Arsenault
1bd96c574c R600/SI: Implement areMemAccessesTriviallyDisjoint
This partially makes up for not having address spaces
used for alias analysis in some simple cases.

This is not yet enabled by default so shouldn't change anything yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222286 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-19 00:01:31 +00:00
Simon Pilgrim
e6d1a2625f [X86][AVX] 256-bit vector stack unaligned load/stores identification
Under many circumstances the stack is not 32-byte aligned, resulting in the use of the vmovups/vmovupd/vmovdqu instructions when inserting ymm reloads/spills.

This minor patch adds these instructions to the isFrameLoadOpcode/isFrameStoreOpcode helpers so that they can be correctly identified and not be treated as folded reloads/spills.

This has also been noticed by http://llvm.org/bugs/show_bug.cgi?id=18846 where it was causing redundant spills - I've added a reduced test case at test/CodeGen/X86/pr18846.ll

Differential Revision: http://reviews.llvm.org/D6252



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222281 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-18 23:38:19 +00:00
Chad Rosier
32dc2de667 [FastISel][AArch64] Also allow folding of sign-/zero-extend and arithmetic
shift-right for booleans (i1).

Arithmetic shift-right immediate with sign-/zero-extensions also works for
boolean values.  Update the assert and the test cases to reflect that fact.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222272 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-18 22:41:49 +00:00
Chad Rosier
5e3288f85b [FastISel][AArch64] Also allow folding of sign-/zero-extend and logical
shift-right for booleans (i1).

Logical shift-right immediate with sign-/zero-extensions also works for boolean
values.  Update the assert and the test cases to reflect that fact.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222270 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-18 22:38:42 +00:00
Juergen Ributzka
52e0f75f82 [FastISel][AArch64] Follow-up fix for "Fix shift-immediate emission for "zero" shifts."
Shifts also perform sign-/zero-extends to larger types, which requires us to emit
an integer extend instead of a simple COPY.

Related to PR21594.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222257 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-18 21:20:17 +00:00
Matt Arsenault
a140448780 R600/SI: Move SIFixSGPRCopies to inst selector passes
This should expose more of the actually used VALU
instructions to the machine optimization passes.

This also should help getting i1 handling into a better state.
For not entirly understood reasons, this fixes the split-scalar-i64-add.ll
test where a 64-bit add would only partially be moved to the VALU
resulting in use of undefined VCC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222256 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-18 21:06:58 +00:00
Tom Stellard
891e9e7869 R600/SI: Make sure resource descriptors are always stored in SGPRs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222253 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-18 20:39:39 +00:00
Juergen Ributzka
8b62d78689 [FastISel][AArch64] Fix shift-immediate emission for "zero" shifts.
This change emits a COPY for a shift-immediate with a "zero" shift value.
This fixes PR21594 where we emitted a shift instruction with an incorrect
immediate operand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222247 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-18 19:58:59 +00:00
Alexey Volkov
19e8fe05dc [X86] Use ADD/SUB instead of INC/DEC for Haswell and Broadwell CPUs
Differential Revision: http://reviews.llvm.org/D5934



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222141 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-17 16:17:51 +00:00
Renato Golin
18e5ce0188 Fix ARM triple parsing
The triple parser should only accept existing architecture names
when the triple starts with armv, armebv, thumbv or thumbebv.

Patch by Gabor Ballabas.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222129 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-17 14:08:57 +00:00
Oliver Stannard
8f832fce3b [Thumb1] Re-write emitThumbRegPlusImmediate
This was motivated by a bug which caused code like this to be
miscompiled:
  declare void @take_ptr(i8*)
  define void @test() {
    %addr1.32 = alloca i8
    %addr2.32 = alloca i32, i32 1028
    call void @take_ptr(i8* %addr1)
    ret void
  }

This was emitting the following assembly to get the value of %addr1:
  add r0, sp, #1020
  add r0, r0, #8
However, "add r0, r0, #8" is not a valid Thumb1 instruction, and this
could not be assembled. The generated object file contained this,
resulting in r0 holding SP+8 rather tha SP+1028:
  add r0, sp, #1020
  add r0, sp, #8

This function looked like it could have caused miscompilations for
other combinations of registers and offsets (though I don't think it is
currently called with these), and the heuristic it used did not match
the emitted code in all cases.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222125 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-17 11:18:10 +00:00
Oliver Stannard
d9d2703b71 Fix optimisations of SELECT_CC which assumed result is boolean
Some optimisations in DAGCombiner cause miscompilations for targets that use
TargetLowering::UndefinedBooleanContent, because they assume that the results
of a SELECT_CC node are boolean values, and can be safely ANDed, ORed and
XORed. These optimisations are only valid for targets that use
ZeroOrOneBooleanContent or ZeroOrNegativeOneBooleanContent.

This is a follow-up to D6210/r221693.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222123 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-17 10:49:31 +00:00
Bob Wilson
17e95ead36 Fix CR/LF line endings in test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222120 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-17 08:00:45 +00:00
Andrea Di Biagio
37f645cb34 [DAG] Improved target independent vector shuffle folding logic.
This patch teaches the DAGCombiner how to combine shuffles according to rules:
   shuffle(shuffle(A, Undef, M0), B, M1) -> shuffle(B, A, M2)
   shuffle(shuffle(A, B, M0), B, M1) -> shuffle(B, A, M2)
   shuffle(shuffle(A, B, M0), A, M1) -> shuffle(B, A, M2)


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222090 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-15 22:56:25 +00:00
Simon Pilgrim
01e39346f3 [X86][SSE] Improve legal SHUFP and PSHUFD shuffle matching
Updated X86TargetLowering::isShuffleMaskLegal to match SHUFP masks with commuted inputs and PSHUFD masks that reference the second input.

As part of this I've refactored isPSHUFDMask to work in a more general manner and allow it to match against either the first or second input vector.

Differential Revision: http://reviews.llvm.org/D6287



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222087 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-15 21:13:05 +00:00
Matt Arsenault
c093062447 R600: Permute operands when selecting legacy min/max
This gets the correct NaN behavior based on the compare type
the hardware uses. This now passes the new piglit test I have
for this on SI.

Add stricter tests for the operand order.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222079 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-15 05:02:57 +00:00
Tim Northover
52e76186d3 ARM: refactor .cfi_def_cfa_offset emission.
We use to track quite a few "adjusted" offsets through the FrameLowering code
to account for changes in the prologue instructions as we went and allow the
emission of correct CFA annotations. However, we were missing a couple of cases
and the code was almost impenetrable.

It's easier to just add any stack-adjusting instruction to a list and emit them
together.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222057 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-14 22:45:33 +00:00
Tim Northover
d96893fd3d ARM: correctly calculate the offset of FP in its push.
When we folded the DPR alignment gap into a push, we weren't noting the extra
distance from the beginning of the push to the FP, and so FP ended up pointing
at an incorrect offset.

The .cfi_def_cfa_offset directives are still wrong in this case, but I think
that can be improved by refactoring.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222056 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-14 22:45:31 +00:00
Tim Northover
9aa6fd59b3 ARM: simplify test.
The test's DWARF stubs were there just to trigger the emission of .cfi
directives. Fortunately, the NetBSD ABI already demands proper DWARF unwind
info, so it's easier to just use that triple.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222055 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-14 22:45:23 +00:00
Tom Stellard
6beb81daa5 R600/SI: Fix spilling of m0 register
If we have spilled the value of the m0 register, then we need to restore
it with v_readlane_b32 to a regular sgpr, because v_readlane_b32 can't
write to m0.

v_readlane_b32 can't write to m0, so

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222036 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-14 20:43:26 +00:00
Matt Arsenault
24e874a1dd R600/SI: Combine min3/max3 instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222032 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-14 20:08:52 +00:00
Matt Arsenault
848d9223c5 R600/SI: Fix verifier error from a branch on IMPLICIT_DEF
SIILowerI1Copies wasn't correctly handling this case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222020 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-14 18:43:41 +00:00
Matt Arsenault
01213b1132 R600/SI: Match integer min / max instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222015 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-14 18:30:06 +00:00
Matt Arsenault
8fd3b90c3f R600/SI: Use S_BFE_I64 for 64-bit sext_inreg
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222012 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-14 18:18:16 +00:00
Cameron McInally
b3625eb445 [AVX512] Add 512b masked integer shift by immediate patterns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222002 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-14 15:43:00 +00:00
Bill Schmidt
40b0f5d6ce [PowerPC] Add VSX builtins for vec_div
This patch adds builtin support for xvdivdp and xvdivsp, along with a
test case.  Straightforward stuff.

There's a companion patch for Clang.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221983 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-14 12:10:40 +00:00
Tim Northover
4a7bbf4c29 X86: use getConstant rather than getTargetConstant behind BUILD_VECTOR.
getTargetConstant should only be used when you can guarantee the instruction
selected will be able to cope with the raw value. BUILD_VECTOR is rather too
generic for this so we should use getConstant instead. In that case, an
instruction can still consume the constant, but if it doesn't it'll be
materialised through its own round of ISel.

Should fix PR21352.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221961 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-14 01:30:14 +00:00
Reid Kleckner
98c86d76df Allow the use of functions as typeinfo in landingpad clauses
This is one step towards supporting SEH filter functions in LLVM.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221954 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-14 00:35:50 +00:00
Reed Kotler
198bb22754 First stage of call lowering for Mips fast-isel
Summary:
This has most of what is needed for mips fast-isel call lowering for O32.
What is missing I will add on the next patch because this patch is already too large.
It should not be doing anything wrong but it will punt on some cases that it is basically
capable of doing.

The mechanism is there for parameters to be passed on the stack but I have not enabled it because it serves as a way for now to prevent some of the strange cases of O32 register passing that I have not fully checked yet and have some issues.

The Mips O32 abi rules are very complicated as far how data is passed in floating and integer registers.

However there is a way to think about this all very simply and this implementation reflects that.

Basically, the ABI rules are written as if everything is passed on the stack and aligned as such.
Once that is conceptually done, it is nearly trivial to reassign those locations to registers and
then all the complexity disappears.

So I have told tablegen that all the data is passed on the stack and during the lowering I fix
this by assigning to registers as per the ABI doc.

This has been my approach and you can line up what I did with the ABI document and see 1 to 1 what
is going on.



Test Plan: callabi.ll

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: jholewinski, echristo, ahatanak, llvm-commits, rfuhler

Differential Revision: http://reviews.llvm.org/D5714

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221948 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-13 23:37:45 +00:00
Matt Arsenault
6f485c0bc5 R600/SI: Fix fmin_legacy / fmax_legacy matching for SI
select_cc is expanded on SI, so this was never matched.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221941 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-13 23:03:09 +00:00
Chandler Carruth
a5408b9c7c [x86] Add some tests for specific patterns of lane-flips combined with
in-lane shuffles that aren't always handled well by the current vector
shuffle lowering.

No functionality change yet, that will follow in a subsequent commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221938 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-13 22:49:44 +00:00
Juergen Ributzka
add7c56be5 [FastISel][AArch64] Don't bail during simple GEP instruction selection.
The generic FastISel code would bail, because it can't emit a sign-extend for
AArch64. This copies the code over and uses AArch64 specific emit functions.

This is not ideal and 'computeAddress' should handles this, so it can fold the
address computation into the memory operation.

I plan to clean up 'computeAddress' anyways, so I will add that in a future
commit.

Related to rdar://problem/18962471.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221923 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-13 20:50:44 +00:00
Matt Arsenault
01ab7a869d R600/SI: Use s_movk_i32
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221922 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-13 20:44:23 +00:00
Matt Arsenault
60c3acb36c R600: Fix assert on empty function
If a function is just an unreachable, this would hit a
"this is not a MachO target" assertion because of setting
HasSubsectionViaSymbols.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221920 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-13 20:07:40 +00:00
Matt Arsenault
8082990487 R600: Error on initializer for LDS.
Also give a proper error for other address spaces.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221917 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-13 19:56:13 +00:00
Matt Arsenault
b44e43623d R600/SI: Get rid of FCLAMP_SI pseudo
It's not necessary. Also use complex patterns to allow
src modifier usage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221916 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-13 19:49:04 +00:00
Matt Arsenault
e59f9f46f7 R600/SI: Allow commuting with src2_modifiers
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221911 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-13 19:26:50 +00:00
Matt Arsenault
1aae959de7 R600/SI: Allow commuting some 3 op instructions
e.g. v_mad_f32 a, b, c -> v_mad_f32 b, a, c

This simplifies matching v_madmk_f32.

This looks somewhat surprising, but it appears to be
OK to do this. We can commute src0 and src1 in all
of these instructions, and that's all that appears
to matter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221910 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-13 19:26:47 +00:00
Tim Northover
8bca5de6a9 ARM: allow constpool entry to be moved to the user's block in all cases.
Normally entries can only move to a lower address, but when that wasn't viable,
the user's block was considered anyway. Unfortunately, it went via
createNewWater which wasn't designed to handle the case where there's already
an island after the block.

Unfortunately, the test we have is slow and fragile, and I couldn't reduce it
to anything sane even with the @llvm.arm.space intrinsic. The test change here
is recreating the previous one after the change.

rdar://problem/18545506

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221905 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-13 17:58:53 +00:00
Tim Northover
064da63fcb ARM: avoid duplicating branches during constant islands.
We were using a naive heuristic to determine whether a basic block already had
an unconditional branch at the end. This mostly corresponded to reality
(assuming branches got optimised) because there's not much point in a branch to
the next block, but could go wrong.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221904 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-13 17:58:51 +00:00
Tim Northover
5bd311bf17 ARM: add @llvm.arm.space intrinsic for testing ConstantIslands.
Creating tests for the ConstantIslands pass is very difficult, since it depends
on precise layout details. Having the ability to precisely inject a number of
bytes into the stream helps greatly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221903 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-13 17:58:48 +00:00
Elena Demikhovsky
18e1185ddf AVX-512: SINT_TO_FP cost model and some bugfixes
Checked some corner cases, for example translation
of <8 x i1> to <8 x double>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221883 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-13 11:46:16 +00:00