Commit Graph

579 Commits

Author SHA1 Message Date
Oliver Stannard
5e487f8dc7 Teach the AArch64 backend about v4f16 and v8f16
This teaches the AArch64 backend to deal with the operations required
to deal with the operations on v4f16 and v8f16 which are exposed by
NEON intrinsics, plus the add, sub, mul and div operations.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216555 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 16:16:04 +00:00
Juergen Ributzka
fc03e72b4f [FastISel][AArch64] Fix address simplification.
When a shift with extension or an add with shift and extension cannot be folded
into the memory operation, then the address calculation has to be materialized
separately. While doing so the code forgot to consider a possible sign-/zero-
extension. This fix folds now also the sign-/zero-extension into the add or
shift instruction which is used to materialize the address.

This fixes rdar://problem/18141718.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216511 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 00:58:30 +00:00
Juergen Ributzka
836f4bd090 [FastISel][AArch64] Fold Sign-/Zero-Extend into the shift immediate instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216510 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 00:58:26 +00:00
Juergen Ributzka
5e34dffb9c [FastISel][AArch64] Add support for variable shift.
This adds the missing variable shift support for value type i8, i16, and i32.

This fixes <rdar://problem/18095685>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216242 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 23:06:07 +00:00
Juergen Ributzka
5d6365c80c [FastISel][AArch64] Use the correct register class to make the MI verifier happy.
This is mostly achieved by providing the correct register class manually,
because getRegClassFor always returns the GPR*AllRegClass for MVT::i32 and
MVT::i64.

Also cleanup the code to use the FastEmitInst_* method whenever possible. This
makes sure that the operands' register class is properly constrained. For all
the remaining cases this adds the missing constrainOperandRegClass calls for
each operand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216225 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 20:57:57 +00:00
Quentin Colombet
ad3c6289b6 [AArch64] Run a peephole pass right after AdvSIMD pass.
The AdvSIMD pass may produce copies that are not coalescer-friendly. The
peephole optimizer knows how to fix that as demonstrated in the test case.

<rdar://problem/12702965>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216200 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 18:10:07 +00:00
Juergen Ributzka
69ec09b61f [FastISel][AArch64] Remove redundant test.
These tests and many more are already covered by fast-isel-addressing-modes.ll.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216186 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 16:40:05 +00:00
Jiangning Liu
150cef218f Revert r216066, "Optimize ZERO_EXTEND and SIGN_EXTEND in both SelectionDAG Builder and type".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216147 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 01:59:30 +00:00
Juergen Ributzka
5273295751 [FastISel][AArch64] Don't fold the sign-/zero-extend from i1 into the compare.
This fixes a bug I introduced in a previous commit (r216033). Sign-/Zero-
extension from i1 cannot be folded into the ADDS/SUBS instructions. Instead both
operands have to be sign-/zero-extended with separate instructions.

Related to <rdar://problem/17913111>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216073 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 16:34:15 +00:00
Jiangning Liu
e04455d72b Optimize ZERO_EXTEND and SIGN_EXTEND in both SelectionDAG Builder and type
legalization stage. With those two optimizations, fewer signed/zero extension
instructions can be inserted, and then we can expose more opportunities to
Machine CSE pass in back-end.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216066 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 12:05:15 +00:00
Yi Kong
40f9d11ccc ARM: Fix codegen for rbit intrinsic
LLVM generates illegal `rbit r0, #352` instruction for rbit intrinsic.
According to ARM ARM, rbit only takes register as argument, not immediate.
The correct instruction should be rbit <Rd>, <Rm>.

The bug was originally introduced in r211057.

Differential Revision: http://reviews.llvm.org/D4980

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216064 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 10:40:20 +00:00
Juergen Ributzka
3ef392c4e2 [FastISel][AArch64] Use the proper FMOV instruction to materialize a +0.0.
Use FMOVWSr/FMOVXDr instead of FMOVSr/FMOVDr, which have the proper register
class to be used with the zero register. This makes the MachineInstruction
verifier happy again.

This is related to <rdar://problem/18027157>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216040 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 01:10:36 +00:00
Juergen Ributzka
ae9a7964ef [FastISel][AArch64] Factor out ADDS/SUBS instruction emission and add support for extensions and shift folding.
Factor out the ADDS/SUBS instruction emission code into helper functions and
make the helper functions more clever to support most of the different ADDS/SUBS
instructions the architecture support. This includes better immedediate support,
shift folding, and sign-/zero-extend folding.

This fixes <rdar://problem/17913111>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216033 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 22:29:55 +00:00
Juergen Ributzka
1d58f989d4 [FastISel][AArch64] Extend floating-point materialization test.
This adds the missing test that I promised for r215753 to test the
materialization of the floating-point value +0.0.

Related to <rdar://problem/18027157>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216019 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 20:35:07 +00:00
Juergen Ributzka
06bb1ca1e0 Reapply [FastISel][AArch64] Add support for more addressing modes (r215597).
Note: This was originally reverted to track down a buildbot error. Reapply
without any modifications.

Original commit message:
FastISel didn't take much advantage of the different addressing modes available
to it on AArch64. This commit allows the ComputeAddress method to recognize more
addressing modes that allows shifts and sign-/zero-extensions to be folded into
the memory operation itself.

For Example:
  lsl x1, x1, #3     --> ldr x0, [x0, x1, lsl #3]
  ldr x0, [x0, x1]

  sxtw x1, w1
  lsl x1, x1, #3     --> ldr x0, [x0, x1, sxtw #3]
  ldr x0, [x0, x1]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216013 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 19:44:17 +00:00
Juergen Ributzka
78f686d37c Reapply [FastISel][AArch64] Make use of the zero register when possible (r215591).
Note: This was originally reverted to track down a buildbot error. Reapply
without any modifications.

Original commit message:
This change materializes now the value "0" from the zero register.
The zero register can be folded by several instruction, so no
materialization is need at all.

Fixes <rdar://problem/17924413>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216009 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 19:44:02 +00:00
Juergen Ributzka
8841fb5f25 [FastISel][AArch64] Fix a few BuildMI callsites where the result register was added as an operand register.
This fixes a few BuildMI callsites where the result register was added by
using addReg, which is per default a use and therefore an operand register.

Also use the zero register as result register when emitting a compare
instruction (SUBS with unused result register).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215997 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 17:41:53 +00:00
Oliver Stannard
eb922109f9 Teach the AArch64 backend to handle f16
This allows the AArch64 backend to handle fadd, fsub, fmul and fdiv
operations on f16 (half-precision) types by promoting to f32.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215891 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-18 14:22:39 +00:00
Oliver Stannard
802d420792 [ARM,AArch64] Do not tail-call to an externally-defined function with weak linkage
Externally-defined functions with weak linkage should not be
tail-called on ARM or AArch64, as the AAELF spec requires normal calls
to undefined weak functions to be replaced with a NOP or jump to the
next instruction. The behaviour of branch instructions in this
situation (as used for tail calls) is implementation-defined, so we
cannot rely on the linker replacing the tail call with a return.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215890 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-18 12:42:15 +00:00
Amara Emerson
cef3ad6720 [AArch64] Narrow arguments passed in wrong position on the stack in
big-endian mode.

Patch by Asiri Rathnayake.

Differential Revision: http://reviews.llvm.org/D4922

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215716 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 14:29:57 +00:00
Juergen Ributzka
6398a7f5fd Revert several FastISel commits to track down a buildbot error.
This reverts:
r215595 "[FastISel][X86] Add large code model support for materializing floating-point constants."
r215594 "[FastISel][X86] Use XOR to materialize the "0" value."
r215593 "[FastISel][X86] Emit more efficient instructions for integer constant materialization."
r215591 "[FastISel][AArch64] Make use of the zero register when possible."
r215588 "[FastISel] Let the target decide first if it wants to materialize a constant."
r215582 "[FastISel][AArch64] Cleanup constant materialization code. NFCI."

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215673 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 19:56:28 +00:00
Juergen Ributzka
14bc045838 Revert "[FastISel][AArch64] Add support for more addressing modes."
This reverts commits r215597, because it might have broken the build bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215659 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 17:10:54 +00:00
Akira Hatanaka
d0ddfb0896 [AArch64, fast-isel] Fall back to SelectionDAG to select tail calls.
Certain functions such as objc_autoreleaseReturnValue have to be called as
tail-calls even at -O0. Since normal fast-isel doesn't emit calls as tail calls,
we have to fall back to SelectionDAG to select calls that are marked as tail.

<rdar://problem/17991614>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215600 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 23:23:58 +00:00
Juergen Ributzka
8c9a0319bb [FastISel][AArch64] Add support for more addressing modes.
FastISel didn't take much advantage of the different addressing modes available
to it on AArch64. This commit allows the ComputeAddress method to recognize more
addressing modes that allows shifts and sign-/zero-extensions to be folded into
the memory operation itself.

For Example:
  lsl x1, x1, #3     --> ldr x0, [x0, x1, lsl #3]
  ldr x0, [x0, x1]

  sxtw x1, w1
  lsl x1, x1, #3     --> ldr x0, [x0, x1, sxtw #3]
  ldr x0, [x0, x1]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215597 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 22:53:29 +00:00
Juergen Ributzka
dc408e8069 [FastISel][AArch64] Make use of the zero register when possible.
This change materializes now the value "0" from the zero register.
The zero register can be folded by several instruction, so no
materialization is need at all.

Fixes <rdar://problem/17924413>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215591 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 22:13:14 +00:00
Gerolf Hoflehner
392f7d970c [MachineCombiner] Fix for ICE bug 20598
The combiner ignored DBG nodes when checking
the uses of a virtual register.

It combined a sequence like
   %vreg1 = madd %vreg2, %vreg3,...
   DBG_VALUE (%vreg1 ...)
   %vreg4 = add %vreg1,...
to
  %vreg4 = madd %vreg2, %vreg3

leaving behind a dangling DBG_VALUE with
a definition. This triggered an assertion
in the MachineTraceMetrics.cpp module.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215431 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 07:54:12 +00:00
Quentin Colombet
7f4f923aa5 [AArch64] Fix registerAllocator assigns same register for base and wback in
pre/post-index load and store.

Patch by Steven Wu <stevenwu@apple.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215390 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 21:39:53 +00:00
Jiangning Liu
0679d2d0a4 In Machine CSE pass, the source register of a COPY machine instruction can
be propagated to all its users, and this propagation could increase the 
probability of finding common subexpressions. If the COPY has only one user,
the COPY itself can be removed.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215344 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 05:17:19 +00:00
Jiangning Liu
3b85f30319 [AArch64] Fix a type conversion bug for anlyzing compare.
The bug can cause spec2006/483.xalancbmk failure.

Patched by David Xu.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215206 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 14:19:29 +00:00
James Molloy
3a106a2813 [AArch64] Add an FP load balancing pass for Cortex-A57
For best-case performance on Cortex-A57, we should try to use a balanced mix of odd and even D-registers when performing a critical sequence of independent, non-quadword FP/ASIMD floating-point multiply or multiply-accumulate operations.

This pass attempts to detect situations where the register allocation may adversely affect this load balancing and to change the registers used so as to better utilize the CPU.

Ideally we'd just take each multiply or multiply-accumulate in turn and allocate it alternating even or odd registers. However, multiply-accumulates are most efficiently performed in the same functional unit as their accumulation operand. Therefore this pass tries to find maximal sequences ("Chains") of multiply-accumulates linked via their accumulation operand, and assign them all the same "color" (oddness/evenness).

This optimization affects S-register and D-register floating point multiplies and FMADD/FMAs, as well as vector (floating point only) muls and FMADD/FMA. Q register instructions (and 128-bit vector instructions) are not affected.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215199 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 12:33:21 +00:00
Tim Northover
bbdf1e0432 AArch64: stop trying to take control of all UnknownArch triples.
This short-circuited our error reporting for incorrectly specified
target triples (you'd get AArch64 code instead).

Should fix PR20567.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215191 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 08:27:44 +00:00
Akira Hatanaka
43f6ce9289 [stack protector] Look through bitcasts to get global variable
__stack_chk_guard.

Handle the case where the pointer operand of the load instruction that loads the
stack guard is not a global variable but instead a bitcast.

%StackGuard = load i8** bitcast (i64** @__stack_chk_guard to i8**)
call void @llvm.stackprotector(i8* %StackGuard, i8** %StackGuardSlot)

Original test case provided by Ana Pazos.

This fixes PR20558.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215167 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 23:08:24 +00:00
Gerolf Hoflehner
e4fa341dde MachineCombiner Pass for selecting faster instruction sequence on AArch64
Re-commit of r214832,r21469 with a work-around that
avoids the previous problem with gcc build compilers

The work-around is to use SmallVector instead of ArrayRef
of basic blocks in preservesResourceLen()/MachineCombiner.cpp



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215151 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 21:40:58 +00:00
James Molloy
e0243fb42d [AArch64] Add a testcase for r214957.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214965 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 13:31:32 +00:00
Yi Kong
68b3d680f2 AArch64: Add support for instruction prefetch intrinsic
Instruction prefetch is not implemented for AArch64, it is incorrectly
translated into data prefetch instruction.

Differential Revision: http://reviews.llvm.org/D4777


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214860 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 12:46:47 +00:00
Juergen Ributzka
eee659a076 [FastISel][AArch64] Implement the FastLowerArguments hook.
This implements basic argument lowering for AArch64 in FastISel. It only
handles a small subset of the C calling convention. It supports simple
arguments that can be passed in GPR and FPR registers.

This should cover most of the trivial cases without falling back to
SelectionDAG.

This fixes <rdar://problem/17890986>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214846 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 05:43:48 +00:00
Kevin Qin
6739735812 Revert "r214832 - MachineCombiner Pass for selecting faster instruction"
It broke compiling of most Benchmark and internal test, as clang got
clashed by segmentation fault or assertion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214845 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 05:43:47 +00:00
Juergen Ributzka
7e9c0bc511 [FastISel][AArch64] Don't perform sign-/zero-extension for function arguments that have already been sign-/zero-extended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214844 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 05:43:44 +00:00
Gerolf Hoflehner
c2328d552c MachineCombiner Pass for selecting faster instruction
sequence on AArch64

Re-commit of r214669 without changes to test cases
LLVM::CodeGen/AArch64/arm64-neon-mul-div.ll and
LLVM:: CodeGen/AArch64/dp-3source.ll
This resolves the reported compfails of the original commit.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214832 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 01:16:13 +00:00
Juergen Ributzka
2c68cde701 [FastISel][AArch64] Fix shift lowering for i8 and i16 value types.
This fix changes the parameters #r and #s that are passed to the UBFM/SBFM
instruction to get the zero/sign-extension for free.

The original problem was that the shift left would use the 32-bit shift even for
i8/i16 value types, which could leave the upper bits set with "garbage" values.

The arithmetic shift right on the other side would use the wrong MSB as sign-bit
to determine what bits to shift into the value.

This fixes <rdar://problem/17907720>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214788 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 21:49:51 +00:00
Chad Rosier
82c93451f3 [AArch64] Extend the number of scalar instructions supported in the AdvSIMD
scalar integer instruction pass.

This is a patch I had lying around from a few months ago.  The pass is
currently disabled by default, so nothing to interesting.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214779 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 21:20:25 +00:00
Kevin Qin
534100b31e Revert "r214669 - MachineCombiner Pass for selecting faster instruction"
This commit broke "make check" for several hours, so get it reverted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214697 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 05:10:33 +00:00
Gerolf Hoflehner
48e1bd7287 MachineCombiner Pass for selecting faster instruction
sequence -  AArch64 target support

 This patch turns off madd/msub generation in the DAGCombiner and generates
 them in the MachineCombiner instead. It replaces the original code sequence
 with the combined sequence when it is beneficial to do so.

 When there is no machine model support it always generates the madd/msub
 instruction. This is true also when the objective is to optimize for code
 size: when the combined sequence is shorter is always chosen and does not
 get evaluated.

 When there is a machine model the combined instruction sequence
 is evaluated for critical path and resource length using machine
 trace metrics and the original code sequence is replaced when it is
 determined to be faster.

 rdar://16319955



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214669 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-03 22:03:40 +00:00
James Molloy
14d596a382 Update test to use a more modern AArch64 triple, as requested by Renato.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214637 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-02 17:15:11 +00:00
James Molloy
e411c38de9 [AArch64] Teach DAGCombiner that converting two consecutive loads into a vector load is not a good transform when paired loads are available.
The combiner was creating Q-register loads and stores, which then had to be spilled because there are no callee-save Q registers!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214634 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-02 14:51:24 +00:00
Juergen Ributzka
8d3bf10dd3 [FastISel][AArch64] Fold offset into the memory operation.
Fold simple offsets into the memory operation:
  add x0, x0, #8
  ldr x0, [x0]
-->
  ldr x0, [x0, #8]

Fixes <rdar://problem/17887945>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214545 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-01 19:40:16 +00:00
Juergen Ributzka
3d253a3f80 [FastISel][AArch64] Add branch weights.
Add branch weights to branch instructions, so that the following passes can
optimize based on it (i.e. basic block ordering).

Fixes <rdar://problem/17887137>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214537 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-01 18:39:24 +00:00
Chad Rosier
b4cded4926 [AArch64] Fix test from r214518 in an attempt to appease buildbots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214521 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-01 15:30:41 +00:00
Chad Rosier
4175e2f597 [AArch64] Generate tbz/tbnz when comparing against zero.
The tbz/tbnz checks the sign bit to convert

op w1, w1, w10
cmp w1, #0
b.lt .LBB0_0

to

op w1, w1, w10
tbnz w1, #31, .LBB0_0

Differential Revision: http://reviews.llvm.org/D4440

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214518 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-01 14:48:56 +00:00
Juergen Ributzka
74ac16386b [FastISel][AArch64] Fix the immediate versions of the {s|u}{add|sub}.with.overflow intrinsics.
ADDS and SUBS cannot encode negative immediates or immediates larger than 12bit.
This fix checks if the immediate version can be used under this constraints and
if we can convert ADDS to SUBS or vice versa to support negative immediates.

Also update the test cases to test the immediate versions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214470 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-01 01:25:55 +00:00
Juergen Ributzka
95ec2761f3 [FastISel][AArch64] Add basic bitcast support for conversion between float and int.
Fixes <rdar://problem/17867078>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214389 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-31 06:25:37 +00:00
Juergen Ributzka
a60ed423a6 [FastISel][AArch64] Add sqrt intrinsic support.
Fixes <rdar://problem/17867067>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214388 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-31 06:25:33 +00:00
Juergen Ributzka
e482ebc147 [FastISel][AArch64] Update and enable patchpoint and stackmap intrinsic tests for FastISel.
This commit updates the existing SelectionDAG tests for the stackmap and patchpoint
intrinsics and enables FastISel testing. It also splits up the tests into separate
files, due to different codegen between SelectionDAG and FastISel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214382 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-31 04:10:43 +00:00
Juergen Ributzka
e3a75015d7 [FastISel][AArch64] Add MachO large code model support for function calls.
Currently the large code model for MachO uses the GOT to make function calls.
Emit the required adrp and ldr instructions to load the address from the GOT.

Related to <rdar://problem/17733076>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214381 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-31 04:10:40 +00:00
Juergen Ributzka
cb99212bc1 [FastISel][AArch64] Add select folding support for the XALU intrinsics.
This improves the code generation for the XALU intrinsics when the
condition is feeding a select instruction.

This also updates and enables the XALU unit tests for FastISel.

This fixes <rdar://problem/17831117>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214350 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-30 22:04:37 +00:00
Juergen Ributzka
b0dba10fa6 [FastISel][AArch64] Add support for shift-immediate.
Currently the shift-immediate versions are not supported by tblgen and
hopefully this can be later removed, once the required support has been
added to tblgen.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214345 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-30 22:04:22 +00:00
Jiangning Liu
a3b8caf8a7 Implement AArch64 TTI interface isAsCheapAsAMove.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214159 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-29 02:09:26 +00:00
Tim Northover
54a7f7f9e0 AArch64: fix conversion of 'J' inline asm constraints.
'J' represents a negative number suitable for an add/sub alias
instruction, but while preparing it to become an int64_t we were
mangling the sign extension. So "i32 -1" became 0xffffffffLL, for
example.

Should fix one half of PR20456.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214052 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-27 07:10:29 +00:00
Akira Hatanaka
0651a556fe [stack protector] Fix a potential security bug in stack protector where the
address of the stack guard was being spilled to the stack.

Previously the address of the stack guard would get spilled to the stack if it
was impossible to keep it in a register. This patch introduces a new target
independent node and pseudo instruction which gets expanded post-RA to a
sequence of instructions that load the stack guard value. Register allocator
can now just remat the value when it can't keep it in a register. 

<rdar://problem/12475629>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213967 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-25 19:31:34 +00:00
Juergen Ributzka
06640d93e0 [FastISel][AArch64] Add support for frameaddress intrinsic.
This commit implements the frameaddress intrinsic for the AArch64 architecture
in FastISel.

There were two test cases that pretty much tested the same, so I combined them
to a single test case.

Fixes <rdar://problem/17811834>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213959 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-25 17:47:14 +00:00
Chandler Carruth
d24d326705 [SDAG] Introduce a combined set to the DAG combiner which tracks nodes
which have successfully round-tripped through the combine phase, and use
this to ensure all operands to DAG nodes are visited by the combiner,
even if they are only added during the combine phase.

This is critical to have the combiner reach nodes that are *introduced*
during combining. Previously these would sometimes be visited and
sometimes not be visited based on whether they happened to end up on the
worklist or not. Now we always run them through the combiner.

This fixes quite a few bad codegen test cases lurking in the suite while
also being more principled. Among these, the TLS codegeneration is
particularly exciting for programs that have this in the critical path
like TSan-instrumented binaries (although I think they engineer to use
a different TLS that is faster anyways).

I've tried to check for compile-time regressions here by running llc
over a merged (but not LTO-ed) clang bitcode file and observed at most
a 3% slowdown in llc. Given that this is essentially a worst case (none
of opt or clang are running at this phase) I think this is tolerable.
The actual LTO case should be even less costly, and the cost in normal
compilation should be negligible.

With this combining logic, it is possible to re-legalize as we combine
which is necessary to implement PSHUFB formation on x86 as
a post-legalize DAG combine (my ultimate goal).

Differential Revision: http://reviews.llvm.org/D4638

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213898 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-24 22:15:28 +00:00
Kevin Qin
2daff76c05 [AArch64] Fix a bug generating incorrect instruction when building small vector.
This bug is introduced by r211144. The element of operand may be
smaller than the element of result, but previous commit can
only handle the contrary condition. This commit is to handle this
scenario and generate optimized codes like ZIP1.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213830 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-24 02:05:42 +00:00
Jiangning Liu
1bc34d71b7 [AArch64] Disable some optimization cases for type conversion from sint to fp, because those optimization cases are micro-architecture dependent and only make sense for Cyclone. A new predicate Cyclone is introduced in .td file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213827 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-24 01:29:59 +00:00
Jim Grosbach
8c0cf3e110 Use an explicit triple in testcase.
Make the test work better on non-darwin hosts. Hopefully.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213801 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 20:46:32 +00:00
Jim Grosbach
adb6a3649e [X86,AArch64] Extend vcmp w/ unary op combine to work w/ more constants.
The transform to constant fold unary operations with an AND across a
vector comparison applies when the constant is not a splat of a scalar
as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213800 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 20:41:43 +00:00
Jim Grosbach
4070037e3c X86: restrict combine to when type sizes are safe.
The folding of unary operations through a vector compare and mask operation
is only safe if the unary operation result is of the same size as its input.
For example, it's not safe for [su]itofp from v4i32 to v4f64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213799 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 20:41:38 +00:00
Juergen Ributzka
4fa6ecc26f [FastISel][AArch64] Fix return type in FastLowerCall.
I used the wrong method to obtain the return type inside FinishCall. This fix
simply uses the return type from FastLowerCall, which we already determined to
be a valid type.

Reduced test case from Chad. Thanks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213788 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 20:03:13 +00:00
Chad Rosier
67c325e9f0 [AArch64] Lower sdiv x, pow2 using add + select + shift.
The target-independent DAGcombiner will generate:
asr w1, X, #31 w1 = splat sign bit.
add X, X, w1, lsr #28 X = X + 0 or pow2-1
asr w0, X, asr #4 w0 = X/pow2

However, the add + shifts is expensive, so generate:
add w0, X, 15 w0 = X + pow2-1
cmp X, wzr X - 0
csel X, w0, X, lt X = (X < 0) ? X + pow2-1 : X;
asr w0, X, asr 4 w0 = X/pow2

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213758 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 14:57:52 +00:00
Tim Northover
8b6257629a AArch64: remove "arm64_be" support in favour of "aarch64_be".
There really is no arm64_be: it was a useful fiction to test big-endian support
while both backends existed in parallel, but now the only platform that uses
the name (iOS) doesn't have a big-endian variant, let alone one called
"arm64_be".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213748 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 12:58:11 +00:00
Juergen Ributzka
7edf396977 [FastIsel][AArch64] Add support for the FastLowerCall and FastLowerIntrinsicCall target-hooks.
This commit modifies the existing call lowering functions to be used as the
FastLowerCall and FastLowerIntrinsicCall target-hooks instead.

This enables patchpoint intrinsic lowering for AArch64.

This fixes <rdar://problem/17733076>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213704 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 23:14:58 +00:00
Juergen Ributzka
be8c68d72d [AArch64] Use CHECK-LABEL in ARM64 ABI unit tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213703 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 23:14:54 +00:00
Tim Northover
f8d927f22b CodeGen: emit IR-level f16 conversion intrinsics as fptrunc/fpext
This makes the first stage DAG for @llvm.convert.to.fp16 an fptrunc,
and correspondingly @llvm.convert.from.fp16 an fpext. The legalisation
path is now uniform, regardless of the input IR:

  fptrunc -> FP_TO_FP16 (if f16 illegal) -> libcall
  fpext -> FP16_TO_FP (if f16 illegal) -> libcall

Each target should be able to select the version that best matches its
operations and not be required to duplicate patterns for both fptrunc
and FP_TO_FP16 (for example).

As a result we can remove some redundant AArch64 patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213507 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 09:13:56 +00:00
Tim Northover
e72ff8829e AArch64: implement efficient f16 bitcasts
Because i16 is illegal, there's no native DAG method to
represent a bitcast to or from an f16 type. This meant LLVM was
inserting a stack store/load pair which is really not ideal.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213378 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-18 13:07:05 +00:00
Tim Northover
1a8bcdb72e AArch64: support f16 extend/trunc operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213375 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-18 13:01:31 +00:00
Tim Northover
0afed03229 CodeGen: soften f16 type by default instead of marking legal.
Actual support for softening f16 operations is still limited, and can be added
when it's needed.  But Soften is much closer to being a useful thing to try
than keeping it Legal when no registers can actually hold such values.

Longer term, we probably want something between Soften and Promote semantics
for most targets, it'll be more efficient to promote the 4 basic operations to
f32 than libcall them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213372 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-18 12:41:46 +00:00
Jim Grosbach
f4e104f5eb AArch64: Constant fold converting vector setcc results to float.
Since the result of a SETCC for AArch64 is 0 or -1 in each lane, we can
move unary operations, in this case [su]int_to_fp through the mask
operation and constant fold the operation away. Generally speaking:
  UNARYOP(AND(VECTOR_CMP(x,y), constant))
      --> AND(VECTOR_CMP(x,y), constant2)
where constant2 is UNARYOP(constant).

This implements the transform where UNARYOP is [su]int_to_fp.

For example, consider the simple function:
define <4 x float> @foo(<4 x float> %val, <4 x float> %test) nounwind {
  %cmp = fcmp oeq <4 x float> %val, %test
  %ext = zext <4 x i1> %cmp to <4 x i32>
  %result = sitofp <4 x i32> %ext to <4 x float>
  ret <4 x float> %result
}

Before this change, the code is generated as:
  fcmeq.4s  v0, v0, v1
  movi.4s v1, #0x1        // Integer splat value.
  and.16b v0, v0, v1      // Mask lanes based on the comparison.
  scvtf.4s  v0, v0        // Convert each lane to f32.
  ret

After, the code is improved to:
  fcmeq.4s  v0, v0, v1
  fmov.4s v1, #1.00000000 // f32 splat value.
  and.16b v0, v0, v1      // Mask lanes based on the comparison.
  ret

The svvtf.4s has been constant folded away and the floating point 1.0f
vector lanes are materialized directly via fmov.4s.

Rather than do the folding manually in the target code, teach getNode()
in the generic SelectionDAG to handle folding constant operands of
vector [su]int_to_fp nodes. It is reasonable (as noted in a FIXME) to do
additional constant folding there as well, but I don't have test cases
for those operations, so leaving them for another time when it becomes
appropriate.

rdar://17693791

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213341 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-18 00:40:52 +00:00
Tim Northover
3e61ccdded CodeGen: extend f16 conversions to permit types > float.
This makes the two intrinsics @llvm.convert.from.f16 and
@llvm.convert.to.f16 accept types other than simple "float". This is
only strictly needed for the truncate operation, since otherwise
double rounding occurs and there's no way to represent the strict IEEE
conversion. However, for symmetry we allow larger types in the extend
too.

During legalization, we can expand an "fp16_to_double" operation into
two extends for convenience, but abort when the truncate isn't legal. A new
libcall is probably needed here.

Even after this commit, various target tweaks are needed to actually use the
extended intrinsics. I've put these into separate commits for clarity, so there
are no actual tests of f64 conversion here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213248 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-17 10:51:23 +00:00
Yi Kong
f33a30cdd0 Port memory barriers intrinsics to AArch64
Memory barrier __builtin_arm_[dmb, dsb, isb] intrinsics are required to
implement their corresponding ACLE and MSVC intrinsics.

This patch ports ARM dmb, dsb, isb intrinsic to AArch64.

Differential Revision: http://reviews.llvm.org/D4520


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213247 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-17 10:50:20 +00:00
Tim Northover
fbb631183a AArch64: fall back to generic code for out of range extract/insert.
rdar://problem/17624784

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213059 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-15 10:00:26 +00:00
Tim Northover
26012cec89 AArch64: remove unnecessary pseudo-instruction.
Sufficiently twisted use of TableGen lets us write patterns directly for f16
(as an i16 promoted to i32) -> f32 conversion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212933 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-14 11:16:02 +00:00
Saleem Abdulrasool
01c06d7954 AArch64: add support for llvm.aarch64.hint intrinsic
This adds a llvm.aarch64.hint intrinsic to mirror the llvm.arm.hint in order to
support the various hint intrinsic functions in the ACLE.

Add an optional pattern field that permits the subclass to specify the pattern
that matches the selection.  The intrinsic pattern is set as mayLoad, mayStore,
so overload the value for the definition of the hint instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212883 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-12 21:20:49 +00:00
Oliver Stannard
cb047f2a74 ARM: Allow __fp16 as a function arg or return type for AArch64
ACLE 2.0 allows __fp16 to be used as a function argument or return
type. This enables this for AArch64.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212812 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-11 13:33:46 +00:00
Tim Northover
6b0ac2aa02 AArch64: correctly fast-isel i8 & i16 multiplies
We were asking for a register for type i8 or i16 which caused an assert.

rdar://problem/17620015

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212718 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-10 14:18:46 +00:00
Hao Liu
a3c15c19b8 [AArch64]Fix an assertion failure in DAG Combiner about concating 2 build_vector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212677 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-10 03:41:50 +00:00
Jim Grosbach
a3edd6a038 AArch64: Better codegen for storing to __fp16.
Storing will generally be immediately preceded by rounding from an f32
or f64, so make sure to match those patterns directly to convert into the
FPR16 register class directly rather than going through the integer GPRs.

This also eliminates an extra step in the convert-from-f64 path
which was first converting to f32 and then to f16 from there.

rdar://17594379

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212638 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-09 18:55:52 +00:00
Jim Grosbach
05bb7c5045 AArch64: Better codegen for loading from __fp16.
Loading will generally extend to an f32 or an 64, so make sure
to match those patterns directly to load into the FPR16 register
class directly rather than going through the integer GPRs.

This also eliminates an extra step in the convert-to-f64 path
which was first converting to f32 and then to f64 from there.

rdar://17594379

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212573 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-08 23:28:48 +00:00
Louis Gerbarg
e7f8191b18 Allow AArch64FastISel to degrade graceully in the presence of an MVT::i128
Currently AArch64FastISel crashes if it tries to extend an integer into an
MVT::i128. This can happen by creating 128 bit integers like so:

  typedef unsigned int uint128_t __attribute__((mode(TI)));
  typedef int sint128_t __attribute__((mode(TI)));

This patch makes EmitIntExt check for their presence and then falls back to
SelectionDAG.

Tests included.

rdar://17516686

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212492 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-07 21:37:51 +00:00
Tim Northover
3e16b022be CodeGen: it turns out that NAND is not the same thing as BIC. At all.
We've been performing the wrong operation on ARM for "atomicrmw nand" for
years, since "a NAND b" is "~(a & b)" rather than ARM's very tempting "a & ~b".
This bled over into the generic expansion pass.

So I assume no-one has ever actually tried to do an atomic nand in the real
world. Oh well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212443 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-07 09:06:35 +00:00
Kevin Qin
307e97d066 [AArch64] Normalize all constants to build a vector.
The value of constant operands will be truncated to fit element width.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212428 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-07 02:45:40 +00:00
Chandler Carruth
c179202bb4 [aarch64] Add a test that should have been in r212242 but I forgot to
add it. Sorry about that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212251 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-03 02:12:26 +00:00
Chandler Carruth
70968365db [codegen,aarch64] Add a target hook to the code generator to control
vector type legalization strategies in a more fine grained manner, and
change the legalization of several v1iN types and v1f32 to be widening
rather than scalarization on AArch64.

This fixes an assertion failure caused by scalarizing nodes like "v1i32
trunc v1i64". As v1i64 is legal it will fail to scalarize v1i32.

This also provides a foundation for other targets to have more granular
control over how vector types are legalized.

Patch by Hao Liu, reviewed by Tim Northover. I'm committing it to allow
some work to start taking place on top of this patch as it adds some
really important hooks to the backend that I'd like to immediately start
using. =]

http://reviews.llvm.org/D4322

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212242 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-03 00:23:43 +00:00
Duncan P. N. Exon Smith
9b4509a759 AArch64: Re-enable AArch64AddressTypePromotion
This reverts commits r212189 and r212190.

While this pass was accidentally disabled (until r212073), r205437
slipped in a use of `auto` that should have been `auto&`.

This fixes PR20188.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212201 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-02 18:17:40 +00:00
Duncan P. N. Exon Smith
0c4d7a1ba5 XFAIL the test to go with r202189
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212190 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-02 17:07:03 +00:00
Chad Rosier
5ee1c8f6f0 Revert "Revert "MachineScheduler: better book-keeping for asserts.""
This reverts commit r212109, which reverted r212088.

However, disable the assert as it's not necessary for correctness.  There are
several corner cases that the assert needed to handle better for in-order
scheduling, but none of them are incorrect scheduler behavior. The assert is
mainly there to collect good unit tests like this and ensure that the
target-independent scheduler is working as expected with the various machine
models.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212187 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-02 16:46:08 +00:00
Chad Rosier
582dd1e608 Revert "MachineScheduler: better book-keeping for asserts."
This reverts commit r212088, which is causing a number of spec
failures.  Will provide reduced test cases shortly.
PR20057

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212109 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-01 17:23:11 +00:00
Andrew Trick
5c7ec29ebc MachineScheduler: better book-keeping for asserts.
Fixes another test case under PR20057.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212088 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-01 03:23:13 +00:00
Duncan P. N. Exon Smith
91fa94884d AArch64: Actually do address type promotion
AArch64AddressTypePromotion was doing nothing because it was using the
old semantics of `Use` and `uses()`, when it really wanted to get at the
`users()`.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212073 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-30 23:42:14 +00:00
Chad Rosier
99f2d6fcc2 [AArch64] Unsized types don't specify an alignment.
PR20109

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212045 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-30 15:03:00 +00:00
Chad Rosier
e7dfa85e85 [AArch64] Convert mul x, -(pow2 +/- 1) to shift + add/sub.
The combine for mul x, pow2 +/- 1 is unchanged. Test cases for
both combines as well as mul x, pow2 have been added as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212044 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-30 14:51:14 +00:00
Chad Rosier
d7be29696d [AArch64] Fix memset ICE when memset value is f128.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211960 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 21:05:09 +00:00
Andrew Trick
e8f8db1c5a MachineScheduler: add some book-keeping to fix an assert.
Fixe for Bug 20057 - Assertion failied in llvm::SUnit* llvm::SchedBoundary::pickOnlyChoice(): Assertion `i <= (HazardRec->getMaxLookAhead() + MaxObservedStall) && "permanent hazard"'

Thanks to Chad for the test case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211865 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 04:57:05 +00:00
Weiming Zhao
c33b4883b3 Resubmit commit r211533
"Fix PR20056: Implement pseudo LDR <reg>, =<literal/label> for AArch64"
Missed files are added in this commit.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211605 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 16:21:38 +00:00
Kevin Qin
8c0787e83a [AArch64] Fix a build_vector pattern match fail
caused by defect in isBuildVectorAllZeros().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211567 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 05:37:27 +00:00
Arnold Schwaighofer
5d5ddf9663 Add a triple so that right syntax is choosen on mac osx systems
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211188 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 17:20:49 +00:00
Kevin Qin
74287ec34c [AArch64] Fix a pattern match failure caused by creating improper CONCAT_VECTOR.
ReconstructShuffle() may wrongly creat a CONCAT_VECTOR trying to
concat 2 of v2i32 into v4i16. This commit is to fix this issue and
try to generate UZP1 instead of lots of MOV and INS.
Patch is initalized by Kevin Qin, and refactored by Tim Northover.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211144 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 05:54:42 +00:00
Tim Northover
c22960dba6 AArch64: estimate inline asm length during branch relaxation
To make sure branches are in range, we need to do a better job of estimating
the length of an inline assembly block than "it's probably 1 instruction, who'd
write asm with more than that?".

Fortunately there's already a (highly suspect, see how many ways you can think
of to break it!) callback for this purpose, which is used by the other targets.

rdar://problem/17277590

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211095 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-17 11:31:42 +00:00
James Molloy
b3820b4289 [AArch64] Fix a fencepost error in lowering for llvm.aarch64.neon.uqshl.
Patch by Jiangning Liu!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211014 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-16 10:39:21 +00:00
Tim Northover
8bfc50e4a9 AArch64: improve handling & modelling of FP_TO_XINT nodes.
There's probably no acatual change in behaviour here, just updating
the LowerFP_TO_INT function to be more similar to the reverse
implementation and updating costs to current CodeGen.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210985 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-15 09:27:15 +00:00
Tim Northover
94fe5c1fe2 AArch64: improve vector [su]itofp handling.
This somehow got missed in the AArch64 merge, so should fix a
performance regression since 3.4.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210984 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-15 09:27:06 +00:00
Jiangning Liu
c5bc067a0f Move GlobalMerge from Transform to CodeGen.
This patch is to move GlobalMerge pass from Transform/Scalar                                                           
to CodeGen, because GlobalMerge depends on TargetMachine.
In the mean time, the macro INITIALIZE_TM_PASS is also moved
to CodeGen/Passes.h. With this fix we can avoid making
libScalarOpts depend on libCodeGen.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210951 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-13 22:57:59 +00:00
Tim Northover
8f2a85e099 IR: add "cmpxchg weak" variant to support permitted failure.
This commit adds a weak variant of the cmpxchg operation, as described
in C++11. A cmpxchg instruction with this modifier is permitted to
fail to store, even if the comparison indicated it should.

As a result, cmpxchg instructions must return a flag indicating
success in addition to their original iN value loaded. Thus, for
uniformity *all* cmpxchg instructions now return "{ iN, i1 }". The
second flag is 1 when the store succeeded.

At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been
added as the natural representation for the new cmpxchg instructions.
It is a strong cmpxchg.

By default this gets Expanded to the existing ATOMIC_CMP_SWAP during
Legalization, so existing backends should see no change in behaviour.
If they wish to deal with the enhanced node instead, they can call
setOperationAction on it. Beware: as a node with 2 results, it cannot
be selected from TableGen.

Currently, no use is made of the extra information provided in this
patch. Test updates are almost entirely adapting the input IR to the
new scheme.

Summary for out of tree users:
------------------------------

+ Legacy Bitcode files are upgraded during read.
+ Legacy assembly IR files will be invalid.
+ Front-ends must adapt to different type for "cmpxchg".
+ Backends should be unaffected by default.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210903 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-13 14:24:07 +00:00
Chad Rosier
2e4cd27799 [AArch64] Basic Sched Model for Cortex-A57.
Patch by Dave Estes<cestes@codeaurora.org>
Differential Revision: http://reviews.llvm.org/D4008

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210705 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 21:06:56 +00:00
Jiangning Liu
f847ccb87a Global merge for global symbols.
This commit is to improve global merge pass and support global symbol merge.
The global symbol merge is not enabled by default. For aarch64, we need some
more back-end fix to make it really benifit ADRP CSE.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210640 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 06:44:53 +00:00
Chad Rosier
0db9526c1a [AArch64] Emit .ident compiler version attribute.
Patch by Ana Pazos<apazos@codeaurora.org>!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210535 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 14:32:08 +00:00
Tim Northover
e1db6ac10b AArch64: disallow x30 & x29 as the destination for indirect tail calls
As Ana Pazos pointed out, these have to be restored to their incoming values
before a function returns; i.e. before the tail call. So they can't be used
correctly as the destination register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210525 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 10:50:24 +00:00
Tim Northover
46b3076cd0 AArch64: teach FastISel how to handle offset FrameIndices
Previously we were abandonning the attempt, leading to some combination of
extra work (when selection of a load/store fails completely) and inferior code
(when this leads to a real memcpy call instead of inlining).

rdar://problem/17187463

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210520 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 09:52:44 +00:00
Tim Northover
292c7c6a48 AArch64: make FastISel memcpy emission more robust.
We were hitting an assert if FastISel couldn't create the load or store we
requested. Currently this happens for large frame-local addresses, though
CodeGen could be improved there.

rdar://problem/17187463

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210519 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 09:52:40 +00:00
Alp Toker
8aeca44558 Reduce verbiage of lit.local.cfg files
We can just split targets_to_build in one place and make it immutable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210496 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-09 22:42:55 +00:00
Chad Rosier
451cc566c1 [AArch64] Fix the ordering of the accumulate operand in SchedRW list.
Patch by Dave Estes <cestes@codeaurora.org>
http://reviews.llvm.org/D4037

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210446 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-09 01:54:00 +00:00
Chad Rosier
0607e82c0a [AArch64] When combining constant mul of power of 2 plus/minus 1, prefer shift
plus add.  The shift can be folded into the add.  This only effects codegen
when the constant is 3.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210445 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-09 01:25:51 +00:00
Tilmann Scheller
9f039304b3 [AArch64] Add regression tests for the load/store optimizer which cover post-index update folding with sub rather than add.
The tests check that the following transform happens:

  (ldr|str) X, [x20]
   ...
  sub x20, x20, #16
   ->
  (ldr|str) X, [x20], #-16

with X being either w0, x0, s0, d0 or q0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210113 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-03 16:03:00 +00:00
Tim Northover
1410a2c906 AArch64: mark small types (i1, i8, i16) as promoted
This means the output of LowerFormalArguments returns a lowered
SDValue with the correct type (expected in SelectionDAGBuilder).
Without this, an assertion under a DEBUG macro triggers when those
types are passed on the stack.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210102 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-03 13:54:53 +00:00
Jiangning Liu
9a2d239740 [AArch64] Correctly deal with VPR stack parameter passing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210067 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-03 03:25:09 +00:00
Tilmann Scheller
cca9c26920 [AArch64] Add some more regression tests for store pre-index update folding in the load/store optimizer.
Add tests for the following transform:

 add x8, x8, #16
  ...
 str X, [x8]
  ->
 str X, [x8, #16]!

with X being either w0, x0, s0, d0 or q0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210021 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-02 12:33:33 +00:00
Tilmann Scheller
ef8a99c807 [AArch64] Add some more regression tests for load pre-index update folding in the load/store optimizer.
Add tests for the following transform:

 add x8, x8, #16
  ...
 ldr X, [x8]
  ->
 ldr X, [x8, #16]!

with X being either w0, x0, s0, d0 or q0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210018 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-02 11:57:09 +00:00
Tim Northover
d0dbe02fd2 ARM & AArch64: make use of common cmpxchg idioms after expansion
The C and C++ semantics for compare_exchange require it to return a bool
indicating success. This gets mapped to LLVM IR which follows each cmpxchg with
an icmp of the value loaded against the desired value.

When lowered to ldxr/stxr loops, this extra comparison is redundant: its
results are implicit in the control-flow of the function.

This commit makes two changes: it replaces that icmp with appropriate PHI
nodes, and then makes sure earlyCSE is called after expansion to actually make
use of the opportunities revealed.

I've also added -{arm,aarch64}-enable-atomic-tidy options, so that
existing fragile tests aren't perturbed too much by the change. Many
of them either rely on undef/unreachable too pervasively to be
restored to something well-defined (particularly while making sure
they test the same obscure assert from many years ago), or depend on a
particular CFG shape, which is disrupted by SimplifyCFG.

rdar://problem/16227836

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209883 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-30 10:09:59 +00:00
Tim Northover
7be505ae88 AArch64 & ARM: remove undefined behaviour from some tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209880 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-30 08:59:55 +00:00
Hao Liu
fd481d05be Test cases named with dates is a legacy rule not used now. Rename several test cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209877 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-30 05:58:19 +00:00
Hao Liu
086a708135 Rename a test case to contain correct date info.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209799 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-29 09:21:23 +00:00
Hao Liu
bb7f18abf8 Fix an assertion failure caused by v1i64 in DAGCombiner Shrink.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209798 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-29 09:19:07 +00:00
Hal Finkel
9b77161927 Revert "[DAGCombiner] Split up an indexed load if only the base pointer value is live"
This reverts r208640 (I've just XFAILed the test) because it broke ppc64/Linux
self-hosting. Because nearly every regression test triggers a segfault, I hope
this will be easy to fix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209747 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-28 15:33:19 +00:00
Tilmann Scheller
d8ba67b97b [AArch64] Add store post-index update folding regression tests for the load/store optimizer.
Add regression tests for the following transformation:

  str X, [x20]
   ...
  add x20, x20, #32
   ->
  str X, [x20], #32

with X being either w0, x0, s0, d0 or q0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209715 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-28 06:43:00 +00:00
Tilmann Scheller
4cbbe0d97e [AArch64] Add load post-index update folding regression tests for the load/store optimizer.
Add regression tests for the following transformation:

 ldr X, [x20]
  ...
 add x20, x20, #32
  ->
 ldr X, [x20], #32

 with X being either w0, x0, s0, d0 or q0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209711 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-28 05:44:14 +00:00
Tim Northover
fb26356bed AArch64: add test for NZCV cross-copy save.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209665 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-27 16:50:09 +00:00
Tim Northover
be5c8baeb6 AArch64: add AArch64-specific test for 'c' and 'n'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209664 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-27 16:50:03 +00:00
Tim Northover
e0c2787cb7 AArch64: force i1 to be zero-extended at an ABI boundary.
This commit is debatable. There are two possible approaches, neither
of which is really satisfactory:

1. Use "@foo(i1 zeroext)" to mean an extension to 32-bits on Darwin,
   and 8 bits otherwise.
2. Redefine "@foo(i1)" to mean that the i1 is extended by the caller
   to 8 bits. This goes against the spirit of "zeroext" I think, but
   it's a bit of a vague construct anyway (by definition you're going
   to extend to the amount required by the ABI, that's why it's the
   ABI!).

This implements option 2. The DAG machinery really isn't setup for the
first (there's a fairly strong assumption that "zeroext" goes to at
least the smallest register size), and even if it was the resulting
DAG looks like it would be inferior in many cases.

Theoretically we could add AssertZext nodes in the consumers of
ABI-passed values too now, but this actually seems to make the code
worse in practice by making truncation proceed in two steps. The code
produced is equally valid if we continue to assume only the low bit is
defined.

Should fix PR19850

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209637 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-26 17:22:07 +00:00
Tim Northover
4146695fb2 AArch64: simplify calling conventions slightly.
We can eliminate the custom C++ code in favour of some TableGen to
check the same things. Functionality should be identical, except for a
buffer overrun that was present in the C++ code and meant webkit
failed if any small argument needed to be passed on the stack.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209636 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-26 17:21:53 +00:00
Tilmann Scheller
7aac65de51 [AArch64] Add store + add folding regression tests for the load/store optimization pass.
Add tests for the following transform:

 str X, [x0, #32]
  ...
 add x0, x0, #32
  ->
 str X, [x0, #32]!

with X being either w1, x1, s0, d0 or q0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209627 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-26 13:36:47 +00:00
Tilmann Scheller
3390e6c4a8 [AArch64] Add more regression tests for the load/store optimization pass.
Cover the following cases:

  ldr X, [x0, #32]
   ...
  add x0, x0, #32
   ->
  ldr X, [x0, #32]!

with X being either w1, x1, s0, d0 or q0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209624 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-26 12:15:51 +00:00
Tilmann Scheller
852c0cc64f Remove accidentally committed whitespace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209619 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-26 09:40:40 +00:00
Tilmann Scheller
47e3b43768 [AArch64] Add a regression test for the load store optimizer.
We have a couple of regression tests for load/store pairing, but (to my knowledge) there are no regression tests for the load/store + add/sub folding.

As a first step towards increased test coverage of this area, this commit adds a test for one instance of a load + add to pre-indexed load transformation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209618 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-26 09:37:19 +00:00
Tim Northover
29f94c7201 AArch64/ARM64: move ARM64 into AArch64's place
This commit starts with a "git mv ARM64 AArch64" and continues out
from there, renaming the C++ classes, intrinsics, and other
target-local objects for consistency.

"ARM64" test directories are also moved, and tests that began their
life in ARM64 use an arm64 triple, those from AArch64 use an aarch64
triple. Both should be equivalent though.

This finishes the AArch64 merge, and everyone should feel free to
continue committing as normal now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209577 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-24 12:50:23 +00:00
Tim Northover
9105f66d6f AArch64/ARM64: remove AArch64 from tree prior to renaming ARM64.
I'm doing this in two phases for a better "git blame" record. This
commit removes the previous AArch64 backend and redirects all
functionality to ARM64. It also deduplicates test-lines and removes
orphaned AArch64 tests.

The next step will be "git mv ARM64 AArch64" and rewire most of the
tests.

Hopefully LLVM is still functional, though it would be even better if
no-one ever had to care because the rename happens straight
afterwards.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209576 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-24 12:42:26 +00:00
Tim Northover
de9e4c88c8 AArch64/ARM64: enable more AArch64 tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209408 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-22 07:40:55 +00:00
Rafael Espindola
21cfedee05 Revert "Implement global merge optimization for global variables."
This reverts commit r208934.

The patch depends on aliases to GEPs with non zero offsets. That is not
supported and fairly broken.

The good news is that GlobalAlias is being redesigned and will have support
for offsets, so this patch should be a nice match for it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208978 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-16 13:02:18 +00:00
Tim Northover
a9a94ce839 TableGen: fix operand counting for aliases
TableGen has a fairly dubious heuristic to decide whether an alias should be
printed: does the alias have lest operands than the real instruction. This is
bad enough (particularly with no way to override it), but it should at least be
calculated consistently for both strings.

This patch implements that logic: first get the *correct* string for the
variant, in the same way as the Matcher, without guessing; then count the
number of whitespace chars.

There are basically 4 changes this brings about after the previous
commits; all of these appear to be good, so I have changed the tests:

+ ARM64: we print "neg X, Y" instead of "sub X, xzr, Y".
+ ARM64: we skip implicit "uxtx" and "uxtw" modifiers.
+ Sparc: we print "mov A, B" instead of "or %g0, A, B".
+ Sparc: we print "fcmpX A, B" instead of "fcmpX %fcc0, A, B"

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208969 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-16 09:42:04 +00:00
Jiangning Liu
d5db8765d6 Implement global merge optimization for global variables.
This commit implements two command line switches -global-merge-on-external
and -global-merge-aligned, and both of them are false by default, so this
optimization is disabled by default for all targets.

For ARM64, some back-end behaviors need to be tuned to get this optimization
further enabled.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208934 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-15 23:45:42 +00:00
Tim Northover
0a088b1fc5 ARM64: print correct aliases for NEON mov & mvn instructions
In all cases, if a "mov" alias exists, it is the canonical form of the
instruction. Now that TableGen can support aliases containing syntax variants,
we can enable them and improve the quality of the asm output.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208874 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-15 12:11:02 +00:00
Tim Northover
f61a467a59 TableGen/ARM64: print aliases even if they have syntax variants.
To get at least one use of the change (and some actual tests) in with its
commit, I've enabled the AArch64 & ARM64 NEON mov aliases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208867 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-15 11:16:32 +00:00
Jiangning Liu
66b123f0d8 [ARM64] Support aggressive fastcc/tailcallopt breaking ABI by popping out argument stack from callee.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208837 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-15 01:33:17 +00:00
Tim Northover
d6cd0381f6 TableGen: use PrintMethods to print more aliases
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208607 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-12 18:04:06 +00:00
Tim Northover
04a359f768 AArch64/ARM64: optimise vector selects & enable test
When performing a scalar comparison that feeds into a vector select,
it's actually better to do the comparison on the vector side: the
scalar route would be "CMP -> CSEL -> DUP", the vector is "CM -> DUP"
since the vector comparisons are all mask based.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208210 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-07 14:10:27 +00:00
James Molloy
104629cc7c [ARM64-BE] Make big endian (scalar) argument passing work correctly.
This completes the port of r204814 (cpirker "AArch64_BE function argument
passing for ARM ABI") from AArch64 to ARM64, and fixes a bunch of issues
found during later development along the way. The biggest of these was
that the alignment fixup logic wasn't replicated into all the places it
should have been.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208192 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-07 11:28:36 +00:00
Tim Northover
0d427515b3 AArch64/ARM64: run test on ARM64 too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208188 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-07 10:47:04 +00:00
Tim Northover
5a1d5139b8 AArch64/ARM64: put annotation in test
It makes finding already covered tests much easier with "grep -L
arm64".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208187 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-07 10:47:00 +00:00
Renato Golin
22f779d1fd Implememting named register intrinsics
This patch implements the infrastructure to use named register constructs in
programs that need access to specific registers (bare metal, kernels, etc).

So far, only the stack pointer is supported as a technology preview, but as it
is, the intrinsic can already support all non-allocatable registers from any
architecture.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208104 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-06 16:51:25 +00:00
Rafael Espindola
b889448841 Special case aliases in GlobalValue::getAlignment.
An alias has the address of what it points to, so it also has the same
alignment.

This allows a few optimizations to see past aliases for free.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208103 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-06 16:48:58 +00:00
Tim Northover
f2f35a9ca3 AArch64/ARM64: print BFM instructions as BFI or BFXIL
The canonical form of the BFM instruction is always one of the more explicit
extract or insert operations, which makes reading output much easier.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207752 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-01 12:29:38 +00:00
Tim Northover
d805bf8d61 AArch64/ARM64: use HS instead of CS & LO instead of CC.
On instructions using the NZCV register, a couple of conditions have dual
representations: HS/CS and LO/CC (meaning unsigned-higher-or-same/carry-set and
unsigned-lower/carry-clear). The first of these is more descriptive in most
circumstances, so we should print it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207644 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-30 13:14:03 +00:00
Tim Northover
ebde5a5e49 ARM64: use hex immediates for movz/movk instructions
Since these are mostly used in "lsl #16", "lsl #32", "lsl #48" combinations to
piece together an immediate in 16-bit chunks, hex is probably the most
appropriate format.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207635 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-30 11:19:40 +00:00
Tim Northover
2a2cce79be ARM64: print canonical syntax for add/sub (imm) instructions.
Since these instructions only accept a 12-bit immediate, possibly shifted left
by 12, the canonical syntax used by the architecture reference manual is "#N {,
lsl #12 }". We should accept an immediate that has already been shifted, (e.g.

Also, print a comment giving the full addend since it can be helpful.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207633 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-30 11:19:15 +00:00
James Molloy
c447befac4 [ARM64] Ensure arm64_be is dealt with when emitting debug info.
This is a partial port of r204816 (cpirker "Elf support for MC-JIT
runtime dynamic linker") from AArch64 to ARM64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207625 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-30 10:15:35 +00:00
Benjamin Kramer
43705683fd AArch64: Mark vector long multiplication as expand.
There are no patterns for this. This was already fixed for ARM64 but I forgot
to apply it to AArch64 too.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207515 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-29 09:37:54 +00:00
Bradley Smith
8aa927abb5 [ARM64] Print preferred aliases for SFBM/UBFM in InstPrinter
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207219 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-25 10:25:29 +00:00
Kevin Qin
435b9bd9fb [ARM64] Add RUN lines for "–target arm64 –mattr=-fp-armv8" on AArch64 no-fp test.
This patch is a supplement of implementing predicate of FP, enabling aarch64 backend
no-fp tests on arm64 target for verification. During this, one bug is exposed and
fixed by this patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207215 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-25 09:44:20 +00:00
Tim Northover
a05d37e1f4 AArch64: print NEON lists with a space.
This matches ARM64 behaviour, which I think is clearer. It also puts all the
churn from that difference into one easily ignored commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207116 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-24 14:06:20 +00:00
Tim Northover
00b214a406 AArch64/ARM64: port bitfield test to ARM64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207103 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-24 12:11:56 +00:00
Tim Northover
b62ba5eca0 AArch64/ARM64: implement BFI optimisation
ARM64 was not producing pure BFI instructions for bitfield insertion
operations, unlike AArch64. The approach had to be a little different (in
ISelDAGToDAG rather than ISelLowering), and the outcomes aren't identical but
hopefully this gives it similar power.

This should address PR19424.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207102 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-24 12:11:53 +00:00
Tim Northover
fe6f4e4d31 AArch64/ARM64: port more tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207101 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-24 12:11:46 +00:00
Tim Northover
2872e118b3 AArch64/ARM64: more testing from AArch64 to ARM64
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206889 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-22 12:45:47 +00:00
Tim Northover
8b36f98fd5 AArch64/ARM64: make use of ANDS and BICS instructions for comparisons.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206888 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-22 12:45:42 +00:00
Tim Northover
c499ecd1d1 AArch64/ARM64: add extra testing from AArch64 to ARM64
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206887 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-22 12:45:32 +00:00
Tim Northover
ba61446a56 AArch64/ARM64: enable various AArch64 tests on ARM64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206877 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-22 10:10:26 +00:00
Tim Northover
0e277d18bb AArch64/ARM64: add patterns for scalar_to_vector/extract pairs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206876 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-22 10:10:18 +00:00
Tim Northover
85974bc77e AArch64/ARM64: mark fmul intrinsic as commutative.
This gives DAG patterns matching indexed patterns where either side is an
indexed vector.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206875 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-22 10:10:14 +00:00
Jiangning Liu
0240286c23 [AArch64] Enable global merge pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206861 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-22 03:33:26 +00:00
Tim Northover
7b4b261611 AArch64/ARM64: add more NEON tests.
Mostly no testing this time, since they were just wrangling
target-specific intrinsics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206613 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-18 14:54:53 +00:00
Tim Northover
1d5a2ad8a6 ARM64: add extra NEG pattern.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206609 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-18 14:54:35 +00:00
Tim Northover
936285440b AArch64/ARM64: port more AArch64 tests to ARM64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206592 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-18 13:16:55 +00:00
Tim Northover
7b4b522ec8 AArch64/ARM64: improve spotting of EXT instructions from VECTOR_SHUFFLE.
We couldn't cope if the first mask element was UNDEF before, which
isn't ideal.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206588 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-18 12:50:58 +00:00
Tim Northover
fb96efa7dd AArch64/ARM64: port atomics test to ARM64.
Covers quite a few extra instructions (like any of the max/min ones
which were broken until recently on ARM64).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206575 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-18 09:31:31 +00:00
Tim Northover
0d6995985a AArch64/ARM64: spot a greater variety of concat_vector operations.
Code mostly copied from AArch64, just tidied up a trifle and plumbed
into the ARM64 way of doing things.

This also enables the AArch64 tests which inspired the previous
untested commits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206574 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-18 09:31:27 +00:00
Tim Northover
66643da8fc AArch64/ARM64: emit all vector FP comparisons as such.
ARM64 was scalarizing some vector comparisons which don't quite map to
AArch64's compare and mask instructions. AArch64's approach of sacrificing a
little efficiency to emulate them with the limited set available was better, so
I ported it across.

More "inspired by" than copy/paste since the backend's internal expectations
were a bit different, but the tests were invaluable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206570 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-18 09:31:07 +00:00
Tim Northover
937290d7ed AArch64/ARM64: port BSL logic from AArch64 & enable test.
I enhanced it a little in the process. The decision shouldn't really be beased
on whether a BUILD_VECTOR is a splat: any set of constants will do the job
provided they're related in the correct way.

Also, the BUILD_VECTOR could be any operand of the incoming AND nodes, so it's
best to check for all 4 possibilities rather than assuming it'll be the RHS.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206569 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-18 09:31:01 +00:00
Tim Northover
2f5d14af9d AArch64/ARM64: copy byval implementation from AArch64.
It's not actually used to handle C or C++ ABI rules on ARM64, but could well be
emitted by other language front-ends, so it's as well to have a sensible
implementation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206568 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-18 09:30:52 +00:00
Jiangning Liu
bc3655f9c8 This is one of the optimizations ported from ARM64 to AArch64 to address the performance gap between these two back ends. The test case newly added for AArch64 already exists in ARM64.
Patched by Z.Zheng



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206559 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-18 05:58:09 +00:00
Jiangning Liu
532a5ffe4c This commit enables unaligned memory accesses of vector types on AArch64 back end. This should boost vectorized code performance.
Patched by Z. Zheng



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206557 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-18 03:58:38 +00:00
Tim Northover
f539725734 AArch64/ARM64: port some NEON tests to ARM64
These ones used completely different sets of intrinsics, so the only way to do
it is create a separate ARM64 copy and change them all.

Other than that, CodeGen was straightforward, no deficiencies detected here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206392 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-16 15:28:02 +00:00
Tim Northover
1a8adcb569 AArch64/ARM64: add another set of tests from AArch64
Another batch with no code changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206381 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-16 11:53:07 +00:00
Tim Northover
1a44333f0e AArch64/ARM64: port across stub handling for ELF C++ exceptions.
The most important part here is that we should actuall emit the stubs we refer
to in the exception table, but as a side issue this uses more sensible & GCC
compatible representations for some of the bits of information.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206380 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-16 11:52:55 +00:00
Tim Northover
fef8e383eb ARM64: use 32-bit moves for constants where possible.
If we know that a particular 64-bit constant has all high bits zero, then we
can rely on the fact that 32-bit ARM64 instructions automatically zero out the
high bits of an x-register. This gives the expansion logic less constraints to
satisfy and so sometimes allows it to pick better sequences.

Came up while porting test/CodeGen/AArch64/movw-consts.ll: this will allow a
32-bit MOVN to be used in @test8 soon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206379 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-16 11:52:51 +00:00
Tim Northover
ea9988a812 ARM64: use the integrated assembler on ELF.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206378 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-16 11:52:40 +00:00
Tim Northover
7474c171e1 DAGCombiner: don't optimise non-existant litpool load
This particular DAG combine is designed to kick in when both ConstantFPs will
end up being loaded via a litpool, however those nodes have a semi-legal
status, dictated by isFPImmLegal so in some cases there wouldn't have been a
litpool in the first place. Don't try to be clever in those circumstances.

Picked up while merging some AArch64 tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206365 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-16 09:03:09 +00:00
Quentin Colombet
49ad5d5dd5 [ARM64] Set default CPU to generic instead of cyclone.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206313 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-15 19:08:46 +00:00
Tim Northover
1e3f66afe8 AArch64/ARM64: enable more AArch64 tests on ARM64.
No code changes for this bunch, just some test rejigs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206291 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-15 14:00:29 +00:00
Tim Northover
5f8234943d AArch64/ARM64: add missing pattern for extending load.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206290 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-15 14:00:19 +00:00
Tim Northover
2a83cb71ad AArch64/ARM64: only mangle MOVZ/MOVN during encoding when needed
Sometimes we need emit the bits that would actually be a MOVN when producing a
relocated MOVZ instruction (don't ask). But not always, a check which ARM64 got
wrong until now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206289 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-15 14:00:15 +00:00
Tim Northover
5080ae2e21 AArch64/ARM64: add support for large code-model jump tables.
I've left the MachO CodeGen as it is, there's a reasonable chance it should use
the GOT like ConstPools, but I'm not certain.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206288 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-15 14:00:11 +00:00
Tim Northover
e9cae9b79f AArch64/ARM64: add patterns for various commutations of FNMADD.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206287 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-15 14:00:06 +00:00
Tim Northover
e8bc8a7d58 AArch64/ARM64: add half as a storage type on ARM64.
This brings it into line with the AArch64 behaviour and should open the way for
certain OpenCL features.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206286 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-15 14:00:03 +00:00