Commit Graph

720 Commits

Author SHA1 Message Date
Hao Liu
380417ac84 [AArch64] Lower interleaved memory accesses to ldN/stN intrinsics. This patch also adds a function to calculate the cost of interleaved memory accesses.
E.g. Lower an interleaved load:
        %wide.vec = load <8 x i32>, <8 x i32>* %ptr
        %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6>
        %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7>
     into:
        %ld2 = { <4 x i32>, <4 x i32> } call llvm.aarch64.neon.ld2(%ptr)
        %vec0 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 0
        %vec1 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 1

E.g. Lower an interleaved store:
        %i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11>
        store <12 x i32> %i.vec, <12 x i32>* %ptr
     into:
        %sub.v0 = shuffle <8 x i32> %v0, <8 x i32> v1, <0, 1, 2, 3>
        %sub.v1 = shuffle <8 x i32> %v0, <8 x i32> v1, <4, 5, 6, 7>
        %sub.v2 = shuffle <8 x i32> %v0, <8 x i32> v1, <8, 9, 10, 11>
        call void llvm.aarch64.neon.st3(%sub.v0, %sub.v1, %sub.v2, %ptr)

Differential Revision: http://reviews.llvm.org/D10533


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240754 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-26 02:32:07 +00:00
Eric Christopher
933d2bd391 Fix "the the" in comments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240112 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-19 01:53:21 +00:00
Yi Jiang
d30c2356b0 Avoid redundant select node in early if-conversion pass
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240072 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-18 22:34:09 +00:00
David Majnemer
cc714e2142 Move the personality function from LandingPadInst to Function
The personality routine currently lives in the LandingPadInst.

This isn't desirable because:
- All LandingPadInsts in the same function must have the same
  personality routine.  This means that each LandingPadInst beyond the
  first has an operand which produces no additional information.

- There is ongoing work to introduce EH IR constructs other than
  LandingPadInst.  Moving the personality routine off of any one
  particular Instruction and onto the parent function seems a lot better
  than have N different places a personality function can sneak onto an
  exceptional function.

Differential Revision: http://reviews.llvm.org/D10429

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239940 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-17 20:52:32 +00:00
Ahmed Bougacha
4412d4b51f [CodeGenPrepare] Generalize inserted set from truncs to any inst.
It's been used before to avoid infinite loops caused by separate CGP
optimizations undoing one another.  We found one more such issue
caused by r238054.  To avoid it, generalize the "InsertedTruncs"
set to any inst, and use it to avoid touching those again.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239938 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-17 20:44:32 +00:00
Matthias Braun
e460807bcd Revert "AArch64: Use CMP;CCMP sequences for and/or/setcc trees."
The patch triggers a miscompile on SPEC 2006 403.gcc with the (ref)
200.i and scilab.i inputs. I opened PR23866 to track analysis of this.

This reverts commit r238793.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239880 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-17 04:02:32 +00:00
Ahmed Bougacha
819a6b7d4b [AArch64] Generalize extract-high DUP extension to MOVI/MVNI.
These are really immediate DUPs, and suffer from the same problem
with long instructions with a high/2 variant (e.g. smull).

By extending a MOVI (or DUP, before this patch), we can avoid an ext
on the other operand of the long instruction, e.g. turning:
    ext.16b v0, v0, v0, #8
    movi.4h v1, #0x53
    smull.4s  v0, v0, v1
into:
    movi.8h v1, #0x53
    smull2.4s  v0, v0, v1

While there, add a now-necessary combine to fold (VT NVCAST (VT x)).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239799 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-16 01:18:14 +00:00
Ahmed Bougacha
d4521f1dd5 [AArch64] Robustize neon-2velem-high test. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239798 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-16 01:05:39 +00:00
Evgeny Astigeevich
2ecc72cc58 On behalf of Alexandros Lamprineas:
LLVM targeting aarch64 doesn't correctly produce aligned accesses for non-aligned
data at -O0/fast-isel (-mno-unaligned-access).
The root cause seems to be in fast-isel not producing unaligned access correctly
for -mno-unaligned-access.

The patch just aborts fast-isel for loads and stores when -mno-unaligned-access is
present. 
The regression test is updated to check this new test case (-mno-unaligned-access 
together with fast-isel).

Differential Revision: http://reviews.llvm.org/D10360



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239732 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-15 15:48:44 +00:00
Hao Liu
5e1ea386d4 [AArch64] Delete two empty files, which should be removed by r239713.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239715 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-15 02:56:40 +00:00
Hao Liu
5ab48a2f69 [AArch64] Revert r239711 again. We need to discuss how to share code between AArch64 and ARM backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239713 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-15 01:56:40 +00:00
Hao Liu
6024ab3b8f [AArch64] Match interleaved memory accesses into ldN/stN instructions.
Re-commit after adding "-aarch64-neon-syntax=generic" to fix the failure on OS X.
This patch was firstly committed in r239514, then reverted in r239544 because of a syntax incompatible failure on OS X.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239711 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-15 01:35:49 +00:00
Tim Northover
31b680fa24 AArch64: map bare-metal arm64-macho triple to MachO MC layer.
Far better than an assertion about expecting ELF.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239647 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-12 23:37:11 +00:00
Rafael Espindola
688e7b3049 This reverts commit r239529 and r239514.
Revert "[AArch64] Match interleaved memory accesses into ldN/stN instructions."
Revert "Fixing MSVC 2013 build error."

The  test/CodeGen/AArch64/aarch64-interleaved-accesses.ll test was failing on OS X.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239544 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-11 17:30:33 +00:00
Hao Liu
442f620296 [AArch64] Match interleaved memory accesses into ldN/stN instructions.
Add a pass AArch64InterleavedAccess to identify and match interleaved memory accesses. This pass transforms an interleaved load/store into ldN/stN intrinsic. As Loop Vectorizor disables optimization on interleaved accesses by default, this optimization is also disabled by default. To enable it by "-aarch64-interleaved-access-opt=true"

E.g. Transform an interleaved load (Factor = 2):
       %wide.vec = load <8 x i32>, <8 x i32>* %ptr
       %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6>  ; Extract even elements
       %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7>  ; Extract odd elements
     Into:
       %ld2 = { <4 x i32>, <4 x i32> } call aarch64.neon.ld2(%ptr)
       %v0 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 0
       %v1 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 1

E.g. Transform an interleaved store (Factor = 2):
       %i.vec = shuffle %v0, %v1, <0, 4, 1, 5, 2, 6, 3, 7>  ; Interleaved vec
       store <8 x i32> %i.vec, <8 x i32>* %ptr
     Into:
       %v0 = shuffle %i.vec, undef, <0, 1, 2, 3>
       %v1 = shuffle %i.vec, undef, <4, 5, 6, 7>
       call void aarch64.neon.st2(%v0, %v1, %ptr)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239514 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-11 09:05:02 +00:00
Chad Rosier
e2e26b486d [AArch64] Remove an overly conservative check when generating store pairs.
Store instructions do not modify register values and therefore it's safe
to form a store pair even if the source register has been read in between
the two store instructions.

Previously, the read of w1 (see below) prevented the formation of a stp.

        str      w0, [x2]
        ldr     w8, [x2, #8]
        add      w0, w8, w1
        str     w1, [x2, #4]
        ret

We now generate the following code.

        stp      w0, w1, [x2]
        ldr     w8, [x2, #8]
        add      w0, w8, w1
        ret

All correctness tests with -Ofast on A57 with Spec200x and EEMBC pass.
Performance results for SPEC2K were within noise.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239432 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-09 20:59:41 +00:00
Ahmed Bougacha
0d9335eda7 [GlobalMerge] Take into account minsize on Global users' parents.
Now that we can look at users, we can trivially do this: when we would
have otherwise disabled GlobalMerge (currently -O<3), we can just run
it for minsize functions, as it's usually a codesize win.

Differential Revision: http://reviews.llvm.org/D10054


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239087 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 20:39:23 +00:00
James Molloy
cd2647f4fd Don't create a MIN/MAX node if the underlying compare has more than one use.
If the compare in a select pattern has another use then it can't be removed, so we'd just
be creating repeated code if we created a min/max node.

Spotted by Matt Arsenault!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239037 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 13:48:23 +00:00
Matthias Braun
fe1391f07d AArch64: Use CMP;CCMP sequences for and/or/setcc trees.
Previously CCMP/FCCMP instructions were only used by the
AArch64ConditionalCompares pass for control flow. This patch uses them
for SELECT like instructions as well by matching patterns in ISelLowering.

PR20927, rdar://18326194

Differential Revision: http://reviews.llvm.org/D8232

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238793 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-01 22:31:17 +00:00
Luke Cheeseman
68f83e59f7 Removing commited assembly file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238742 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-01 13:18:53 +00:00
Luke Cheeseman
7d97fc4164 Re-commit of r238201 with fix for building with shared libraries.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238739 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-01 12:02:47 +00:00
Diego Novillo
9c24c958f1 Revert "Re-commit changes in r237579 with fix for bug breaking windows builds."
This reverts commit r238201 to fix linking problems in x86 Linux
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150525/278413.html

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238223 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-26 17:45:38 +00:00
Luke Cheeseman
262e24f7af Re-commit changes in r237579 with fix for bug breaking windows builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238201 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-26 13:40:31 +00:00
Ahmed Bougacha
d8319655f2 [AArch64][CGP] Sink zext feeding stxr/stlxr into the same block.
The usual CodeGenPrepare trickery, on a target-specific intrinsic.
Without this, the expansion of atomics will usually have the zext
be hoisted out of the loop, defeating the various patterns we have
to catch this precise case.

Differential Revision: http://reviews.llvm.org/D9930


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238054 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-22 21:37:17 +00:00
Ahmed Bougacha
bde8616229 [AArch64] Robustize atomic cmpxchg test a little more. NFC.
We changed the test to test non-constant values in r238049.
We can also use CHECK-NEXT to be a little stricter.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238052 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-22 21:35:14 +00:00
Ahmed Bougacha
d3244b7749 [AArch64] Robustize atomic cmpxchg test. NFC.
Constants are easy to get right the wrong way.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238049 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-22 21:08:15 +00:00
Chad Rosier
676efa4d56 [AArch64] Enhance the load/store optimizer with target-specific alias analysis.
Phabricator: http://reviews.llvm.org/D9863

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237963 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-21 21:36:46 +00:00
Oliver Stannard
0139af335f Revert r237579, as it broke windows buildbots
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237583 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-18 16:39:16 +00:00
Oliver Stannard
d811b4bacb [LLVM - ARM/AArch64] Add ACLE special register intrinsics
This patch implements LLVM support for the ACLE special register intrinsics in
section 10.1, __arm_{w,r}sr{,p,64}.

This patch is intended to lower the read/write_register instrinsics, used to
implement the special register intrinsics in the clang patch for special
register intrinsics (see http://reviews.llvm.org/D9697), to ARM specific
instructions MRC,MCR,MSR etc. to allow reading an writing of coprocessor
registers in AArch32 and AArch64. This is done by inspecting the register
string passed to the intrinsic and then lowering to the appropriate
instruction.

Patch by Luke Cheeseman.

Differential Revision: http://reviews.llvm.org/D9699



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237579 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-18 16:23:33 +00:00
James Molloy
d63e0fc2d9 Mark SMIN/SMAX/UMIN/UMAX nodes as legal and add patterns for them.
The new [SU]{MIN,MAX} SDNodes can be lowered directly to instructions for
most NEON datatypes - the big exclusion being v2i64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237455 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-15 16:15:57 +00:00
Artyom Skrobov
9ce56af1eb Re-apply r237247 - [AArch64] Codegen VMAX/VMIN for safe math cases
No longer breaks SPEC2000/2006



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237361 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-14 12:59:46 +00:00
Silviu Baranga
b937dadfb2 Revert r237247 - [AArch64] Codegen VMAX/VMIN.. as it is causing failures in SPEC2000/2006
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237256 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-13 14:03:18 +00:00
Artyom Skrobov
e8dceea402 [AArch64] Codegen VMAX/VMIN for safe math cases
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237247 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-13 12:01:09 +00:00
Sunil Srivastava
561c44fc33 Changed renaming of local symbols by inserting a dot vefore the numeric suffix.
One code change and several test changes to match that
details in http://reviews.llvm.org/D9481


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237150 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-12 16:47:30 +00:00
NAKAMURA Takumi
31e094b55d llvm/test/CodeGen/AArch64/tailcall_misched_graph.ll: s/REQUIRE/REQUIRES/
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236928 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-09 05:59:00 +00:00
Arnold Schwaighofer
75e36e847e ScheduleDAGInstrs: In functions with tail calls PseudoSourceValues are not non-aliasing distinct objects
The code that builds the dependence graph assumes that two PseudoSourceValues
don't alias. In a tail calling function two FixedStackObjects might refer to the
same location. Worse 'immutable' fixed stack objects like function arguments are
not immutable and will be clobbered.

Change this so that a load from a FixedStackObject is not invariant in a tail
calling function and don't return a PseudoSourceValue for an instruction in tail
calling functions when building the dependence graph so that we handle function
arguments conservatively.

Fix for PR23459.

rdar://20740035

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236916 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-08 23:52:00 +00:00
Pete Cooper
d90099d36c Clear kill flags on all used registers when sinking instructions.
The test here was sinking the AND here to a lower BB:

	%vreg7<def> = ANDWri %vreg8, 0; GPR32common:%vreg7,%vreg8
	TBNZW %vreg8<kill>, 0, <BB#1>; GPR32common:%vreg8

which meant that vreg8 was read after it was killed.

This commit changes the code from clearing kill flags on the AND to clearing flags on all registers used by the AND.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236886 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-08 17:54:32 +00:00
Pete Cooper
d887b7ef2f [AArch64] Fix sext/zext folding in address arithmetic.
We were accidentally folding a sign/zero extend in to address arithmetic in a different BB when the extend wasn't available there.

Cross BB fast-isel isn't safe, so restrict this to only when the extend is in the same BB as the use.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236764 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-07 19:21:36 +00:00
Tim Northover
84b8c10729 CodeGen: move over-zealous assert into actual if statement.
It's quite possible to encounter an insertvalue instruction that's more deeply
nested than the value we're looking for, but when that happens we really
mustn't compare beyond the end of the index array.

Since I couldn't see any guarantees about what comparisons std::equal makes, we
probably need to directly check the size beforehand. In practice, I suspect
most std::equal implementations would probably bail early, which would be OK.
But just in case...

rdar://20834485

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236635 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 20:07:38 +00:00
Quentin Colombet
2f7322b348 [ShrinkWrap] Add (a simplified version) of shrink-wrapping.
This patch introduces a new pass that computes the safe point to insert the
prologue and epilogue of the function.
The interest is to find safe points that are cheaper than the entry and exits
blocks.

As an example and to avoid regressions to be introduce, this patch also
implements the required bits to enable the shrink-wrapping pass for AArch64.


** Context **

Currently we insert the prologue and epilogue of the method/function in the
entry and exits blocks. Although this is correct, we can do a better job when
those are not immediately required and insert them at less frequently executed
places.
The job of the shrink-wrapping pass is to identify such places.


** Motivating example **

Let us consider the following function that perform a call only in one branch of
a if:
define i32 @f(i32 %a, i32 %b)  {
 %tmp = alloca i32, align 4
 %tmp2 = icmp slt i32 %a, %b
 br i1 %tmp2, label %true, label %false

true:
 store i32 %a, i32* %tmp, align 4
 %tmp4 = call i32 @doSomething(i32 0, i32* %tmp)
 br label %false

false:
 %tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ]
 ret i32 %tmp.0
}

On AArch64 this code generates (removing the cfi directives to ease
readabilities):
_f:                                     ; @f
; BB#0:
  stp x29, x30, [sp, #-16]!
  mov  x29, sp
  sub sp, sp, #16             ; =16
  cmp  w0, w1
  b.ge  LBB0_2
; BB#1:                                 ; %true
  stur  w0, [x29, #-4]
  sub x1, x29, #4             ; =4
  mov  w0, wzr
  bl  _doSomething
LBB0_2:                                 ; %false
  mov  sp, x29
  ldp x29, x30, [sp], #16
  ret

With shrink-wrapping we could generate:
_f:                                     ; @f
; BB#0:
  cmp  w0, w1
  b.ge  LBB0_2
; BB#1:                                 ; %true
  stp x29, x30, [sp, #-16]!
  mov  x29, sp
  sub sp, sp, #16             ; =16
  stur  w0, [x29, #-4]
  sub x1, x29, #4             ; =4
  mov  w0, wzr
  bl  _doSomething
  add sp, x29, #16            ; =16
  ldp x29, x30, [sp], #16
LBB0_2:                                 ; %false
  ret

Therefore, we would pay the overhead of setting up/destroying the frame only if
we actually do the call.


** Proposed Solution **

This patch introduces a new machine pass that perform the shrink-wrapping
analysis (See the comments at the beginning of ShrinkWrap.cpp for more details).
It then stores the safe save and restore point into the MachineFrameInfo
attached to the MachineFunction.
This information is then used by the PrologEpilogInserter (PEI) to place the
related code at the right place. This pass runs right before the PEI.

Unlike the original paper of Chow from PLDI’88, this implementation of
shrink-wrapping does not use expensive data-flow analysis and does not need hack
to properly avoid frequently executed point. Instead, it relies on dominance and
loop properties.

The pass is off by default and each target can opt-in by setting the
EnableShrinkWrap boolean to true in their derived class of TargetPassConfig.
This setting can also be overwritten on the command line by using
-enable-shrink-wrap.

Before you try out the pass for your target, make sure you properly fix your
emitProlog/emitEpilog/adjustForXXX method to cope with basic blocks that are not
necessarily the entry block.


** Design Decisions **

1. ShrinkWrap is its own pass right now. It could frankly be merged into PEI but
for debugging and clarity I thought it was best to have its own file.
2. Right now, we only support one save point and one restore point. At some
point we can expand this to several save point and restore point, the impacted
component would then be:
- The pass itself: New algorithm needed.
- MachineFrameInfo: Hold a list or set of Save/Restore point instead of one
  pointer.
- PEI: Should loop over the save point and restore point.
Anyhow, at least for this first iteration, I do not believe this is interesting
to support the complex cases. We should revisit that when we motivating
examples.

Differential Revision: http://reviews.llvm.org/D9210

<rdar://problem/3201744>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236507 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-05 17:38:16 +00:00
Tim Northover
7f88b179b1 CodeGen: match up correct insertvalue indices when assessing tail calls.
When deciding whether a value comes from the aggregate or inserted value of an
insertvalue instruction, we compare the indices against those of the location
we're interested in. One of the lists needs reversing because the input data is
backwards (so that modifications take place at the end of the SmallVector), but
we were reversing both before leading to incorrect results.

Should fix PR23408

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236457 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-04 20:41:51 +00:00
Quentin Colombet
32675bbfd0 [AArch64][FastISel] Variant of the logical instructions that use two input
registers cannot write on SP.

rdar://problem/20748715


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236352 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-01 21:34:57 +00:00
Quentin Colombet
4688a0507c [AArch64][FastISel] Fix the setting of kill flags for MUL -> UMULH sequences.
rdar://problem/20748715


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236346 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-01 20:57:11 +00:00
Quentin Colombet
3a0fccf6a0 [AArch64] Fix bad register class constraint in fast-isel for TST instruction.
rdar://problem/20748715


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236273 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-30 22:27:20 +00:00
Duncan P. N. Exon Smith
e56023a059 IR: Give 'DI' prefix to debug info metadata
Finish off PR23080 by renaming the debug info IR constructs from `MD*`
to `DI*`.  The last of the `DIDescriptor` classes were deleted in
r235356, and the last of the related typedefs removed in r235413, so
this has all baked for about a week.

Note: If you have out-of-tree code (like a frontend), I recommend that
you get everything compiling and tests passing with the *previous*
commit before updating to this one.  It'll be easier to keep track of
what code is using the `DIDescriptor` hierarchy and what you've already
updated, and I think you're extremely unlikely to insert bugs.  YMMV of
course.

Back to *this* commit: I did this using the rename-md-di-nodes.sh
upgrade script I've attached to PR23080 (both code and testcases) and
filtered through clang-format-diff.py.  I edited the tests for
test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns
were off-by-three.  It should work on your out-of-tree testcases (and
code, if you've followed the advice in the previous paragraph).

Some of the tests are in badly named files now (e.g.,
test/Assembler/invalid-mdcompositetype-missing-tag.ll should be
'dicompositetype'); I'll come back and move the files in a follow-up
commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236120 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-29 16:38:44 +00:00
Ahmed Bougacha
ae618e7873 [AArch64] Also combine vector selects fed by non-i1 SETCCs.
After legalization, scalar SETCC has an i32 result type on AArch64.
The i1 requirement seems too conservative, replace it with an assert.

This also means that we now can run after legalization. That should also
be fine, since the ops legalizer runs again after each combine, and
all types created all have the same sizes as the (legal) inputs.

Exposed by r235917; while there, robustize its tests (bsl also uses the
register it defines).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235922 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-27 21:43:12 +00:00
Ahmed Bougacha
bc92b2ca37 [AArch64] Don't assert when combining (v3f32 select (setcc f64)).
When the setcc has f64 operands, we can't build a vector setcc mask
to feed a vselect, because f64 doesn't divide v3f32 evenly.
Just bail out when that happens.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235917 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-27 21:01:20 +00:00
Yaron Keren
d5df7d3c7b Teach AArch64\lit.local.cfg the new triple names windows-gnu and windows-msvc.
Tests were failing when built with -DLLVM_DEFAULT_TARGET_TRIPLE=i686-pc-windows-gnu.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235733 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-24 17:14:16 +00:00
Pirama Arumuga Nainar
dab5145cb3 [AArch64] Add nvcast patterns for v4f16 and v8f16
Summary:
Constant stores of f16 vectors can create NvCast nodes from various
operand types to v4f16 or v8f16 depending on patterns in the stored
constants.  This patch adds nvcast rules with v4f16 and v8f16 values.

AArchISelLowering::LowerBUILD_VECTOR has the details on which constant
patterns generate the nvcast nodes.

Reviewers: jmolloy, srhines, ab

Subscribers: rengolin, aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D9201

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235610 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 17:32:25 +00:00
Pirama Arumuga Nainar
b7db5f28c5 [AArch64] Handle vec4, vec8, vec16 *itofp for half
Summary:
Set operation action for SINT_TO_FP and UINT_TO_FP nodes with v4i32,
v8i8, v8i16 inputs to allow promotion of v4f16 results.

Add tests for sitofp and uitofp for vec4, vec8, vec16, and i8, i16, i32,
and i64 vectors.  Only missing tests are for v16i8 and v16i16 as the
shift operations are too complicated to write a proper check sequence.

The conversions from v4i64 to v4f16 do not depend on this patch - v4i64
is split and the conversion gets handled while lowering v2i64.  I am
adding a test here for completeness.

Reviewers: aemerson, rengolin, ab, jmolloy, srhines

Subscribers: rengolin, aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D9166

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235609 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 17:16:27 +00:00