Commit Graph

976 Commits

Author SHA1 Message Date
Juergen Ributzka
b11c5b1078 [FastISel][AArch64] Use 'cbz' also for null values (pointers).
The pattern matching for a 'ConstantInt' value was too restrictive. Checking for
a 'Constant' with a bull value is sufficient for using an 'cbz/cbnz' instruction.

This fixes rdar://problem/18784732.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220709 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-27 19:38:05 +00:00
Juergen Ributzka
5745cad861 [FastISel][AArch64] Don't fold the 'and' instruction into the 'tbz/tbnz' instruction if it is in a different basic block.
This fixes a bug where the input register was not defined for the 'tbz/tbnz'
instruction. This happened, because we folded the 'and' instruction from a
different basic block.

This fixes rdar://problem/18784013.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220704 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-27 19:16:48 +00:00
Juergen Ributzka
d3a04223e8 [FastISel][AArch64] Fix load/store with frame indices.
At higher optimization levels the LLVM IR may contain more complex patterns for
loads/stores from/to frame indices. The 'computeAddress' function wasn't able to
handle this and triggered an assertion.

This fix extends the possible addressing modes for frame indices.

This fixes rdar://problem/18783298.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220700 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-27 18:21:58 +00:00
Lang Hames
57902cc070 [PBQP] Unique allowed-sets for nodes in the PBQP graph and use pairs of these
sets as keys into a cache of interference matrice values in the Interference
constraint adder.

Creating interference matrices was one of the large remaining time-sinks in
PBQP. Caching them reduces the total compile time (when using PBQP) on the
nightly test suite by ~10%.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220688 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-27 17:44:25 +00:00
Oliver Stannard
9bb3f37aa4 [AArch64] Fix fast-isel of cbz of i1, i8, i16
This fixes a miscompilation in the AArch64 fast-isel which was
triggered when a branch is based on an icmp with condition eq or ne,
and type i1, i8 or i16. The cbz instruction compares the whole 32-bit
register, so values with the bottom 1, 8 or 16 bits clear would cause
the wrong branch to be taken.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220553 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-24 09:54:41 +00:00
Chad Rosier
fa16693864 [AArch64] Add support for the .inst directive.
This has been implement using the MCTargetStreamer interface as is done in the
ARM, Mips and PPC backends.

Phabricator: http://reviews.llvm.org/D5891
PR20964

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220422 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-22 20:35:57 +00:00
Arnaud A. de Grandmaison
c9ada07cca [AArch64] Cleanup A57PBQPConstraints
And add a long awaited testcase.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220381 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-22 12:40:20 +00:00
Arnaud A. de Grandmaison
de246de958 [PBQP] Teach PassConfig to tell if the default register allocator is used.
This enables targets to adapt their pass pipeline to the register
allocator in use. For example, with the AArch64 backend, using PBQP
with the cortex-a57, the FPLoadBalancing pass is no longer necessary.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220321 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-21 20:47:22 +00:00
James Molloy
7023b85187 [AArch64] Fix a silent codegen fault in BUILD_VECTOR lowering.
We should be talking about the number of source elements, not the number of destination elements, given we know at this point that the source and dest element numbers are not the same.

While we're at it, avoid writing to std::vector::end()...

Bug found with random testing and a lot of coffee.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220051 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 17:06:31 +00:00
Juergen Ributzka
c40dab2069 [AArch64] Fix miscompile of sdiv-by-power-of-2.
When the constant divisor was larger than 32bits, then the optimized code
generated for the AArch64 backend would emit the wrong code, because the shift
was defined as a shift of a 32bit constant '(1<<Lg2(divisor))' and we would
loose the upper 32bits.

This fixes rdar://problem/18678801.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219934 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 16:41:15 +00:00
Juergen Ributzka
7440a83e60 Reapply "[FastISel][AArch64] Add custom lowering for GEPs."
This is mostly a copy of the existing FastISel GEP code, but we have to
duplicate it for AArch64, because otherwise we would bail out even for simple
cases. This is because the standard fastEmit functions don't cover MUL at all
and ADD is lowered very inefficientily.

The original commit had a bug in the add emit logic, which has been fixed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219831 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 18:58:07 +00:00
Juergen Ributzka
32c5fde3f1 [FastISel][AArch64] Factor out add with immediate emission into a helper function. NFC.
Simplify add with immediate emission by factoring it out into a helper function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219830 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 18:58:02 +00:00
Rafael Espindola
90ce9f70e2 Simplify handling of --noexecstack by using getNonexecutableStackSection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219799 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 16:12:52 +00:00
Juergen Ributzka
0081070cfd Revert "[FastISel][AArch64] Add custom lowering for GEPs."
This breaks our internal build bots. Reverting it to get the bots green again.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219776 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 04:55:48 +00:00
Eric Christopher
c6d2db4db1 Remove unused variable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219750 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 00:09:07 +00:00
Gerolf Hoflehner
cd27f3fb33 [AArch64] Wrong CC access in CSINC-conditional branch sequence
This is a follow up to commit r219742. It removes the CCInMI variable
and accesses the CC in CSCINC directly. In the case of a conditional
branch accessing the CC with CCInMI was wrong.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219748 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 23:55:00 +00:00
Gerolf Hoflehner
2bddd7cf65 [AAarch64] Optimize CSINC-branch sequence
Peephole optimization that generates a single conditional branch
for csinc-branch sequences like in the examples below. This is
possible when the csinc sets or clears a register based on a condition
code and the branch checks that register. Also the condition
code may not be modified between the csinc and the original branch.

Examples:

1. Convert csinc w9, wzr, wzr, <CC>;tbnz w9, #0, 0x44
   to b.<invCC>

2. Convert csinc w9, wzr, wzr, <CC>; tbz w9, #0, 0x44
   to b.<CC>


rdar://problem/18506500



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219742 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 23:07:53 +00:00
Juergen Ributzka
569c5b62af [FastISel][AArch64] Add custom lowering for GEPs.
This is mostly a copy of the existing FastISel GEP code, but on AArch64 we bail
out even for simple cases, because the standard fastEmit functions don't cover
MUL and ADD is lowered inefficientily.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219726 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 21:41:23 +00:00
Juergen Ributzka
40017084f7 [FastISel][AArch64] Fix sign-/zero-extend folding when SelectionDAG is involved.
Sign-/zero-extend folding depended on the load and the integer extend to be
both selected by FastISel. This cannot always be garantueed and SelectionDAG
might interfer. This commit adds additonal checks to load and integer extend
lowering to catch this.

Related to rdar://problem/18495928.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219716 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 20:36:02 +00:00
Bradley Smith
5051f6033d [AArch64] Fix crash with empty/pseudo-only blocks in A53 erratum (835769) workaround
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219684 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 14:02:41 +00:00
Hao Liu
75ad488c41 [AArch64]Select wide immediate offset into [Base+XReg] addressing mode
e.g Currently we'll generate following instructions if the immediate is too wide:
    MOV  X0, WideImmediate
    ADD  X1, BaseReg, X0
    LDR  X2, [X1, 0]

    Using [Base+XReg] addressing mode can save one ADD as following:
    MOV  X0, WideImmediate
    LDR  X2, [BaseReg, X0]

    Differential Revision: http://reviews.llvm.org/D5477


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219665 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 06:50:36 +00:00
Bradley Smith
7e67a4b0cb [AArch64] Add workaround for Cortex-A53 erratum (835769)
Some early revisions of the Cortex-A53 have an erratum (835769) whereby it is
possible for a 64-bit multiply-accumulate instruction in AArch64 state to
generate an incorrect result.  The details are quite complex and hard to
determine statically, since branches in the code may exist in some
 circumstances, but all cases end with a memory (load, store, or prefetch)
instruction followed immediately by the multiply-accumulate operation.

The safest work-around for this issue is to make the compiler avoid emitting
multiply-accumulate instructions immediately after memory instructions and the
simplest way to do this is to insert a NOP.

This patch implements such work-around in the backend, enabled via the option
-aarch64-fix-cortex-a53-835769.

The work-around code generation is not enabled by default.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219603 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-13 10:12:35 +00:00
Lang Hames
80bfef5ce1 [PBQP] Add missing headers from r219421.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219425 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-09 18:36:59 +00:00
Lang Hames
54d63b4fd5 [PBQP] Replace PBQPBuilder with composable constraints (PBQPRAConstraint).
This patch removes the PBQPBuilder class and its subclasses and replaces them
with a composable constraints class: PBQPRAConstraint. This allows constraints
that are only required for optimisation (e.g. coalescing, soft pairing) to be
mixed and matched.

This patch also introduces support for target writers to supply custom
constraints for their targets by overriding a TargetSubtargetInfo method:

std::unique_ptr<PBQPRAConstraints> getCustomPBQPConstraints() const;

This patch should have no effect on allocations.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219421 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-09 18:20:51 +00:00
Kevin Qin
f73400a864 [AArch64] Enable partial & runtime unrolling on cortex-a57.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219401 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-09 10:13:27 +00:00
Chad Rosier
2929da99a9 [AArch64] Generate vector signed/unsigned mul and mla/mls long.
Phabricator Revision: http://reviews.llvm.org/D5589
Patch by Balaram Makam <bmakam@codeaurora.org>!!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219276 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 02:31:24 +00:00
Juergen Ributzka
301d3d04f0 [FastISel][AArch64] Teach the address computation code to also fold sign-/zero-extends.
The code already folds sign-/zero-extends, but only if they are arguments to
mul and shift instructions. This extends the code to also fold them when they
are direct inputs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219187 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 03:40:06 +00:00
Juergen Ributzka
3692081566 [FastISel][AArch64] Teach the address computation to also fold sub instructions.
Tiny enhancement to the address computation code to also fold sub instructions
if the rhs is constant and can be folded into the offset.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219186 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 03:40:03 +00:00
Juergen Ributzka
ca07e256f6 [FastISel][AArch64] Fix "Fold sign-/zero-extends into the load instruction."
This commit fixes an issue with sign-/zero-extending loads that was discovered
by Richard Barton.

We use now the correct load instructions for sign-extending loads to 64bit. Also
updated and added more unit tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219185 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-07 03:39:59 +00:00
Eric Christopher
e6f32be8df Add subtarget caches to aarch64, arm, ppc, and x86.
These will make it easier to test further changes to the
code generation and optimization pipelines as those are
moved to subtargets initialized with target feature and
target cpu.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219106 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-06 06:45:36 +00:00
Benjamin Kramer
814a2ffc7c Make AAMDNodes ctor and operator bool (!!!) explicit, mop up bugs and weirdness exposed by it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219068 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-04 22:44:29 +00:00
Jingyue Wu
ea1cccd0b6 Add fake use to suppress defined-but-unused warnings
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219045 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-04 03:50:10 +00:00
Benjamin Kramer
63688e622c Eliminate some deep std::vector copies. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218999 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 18:33:16 +00:00
Eric Christopher
8f09464bc9 constify TargetMachine parameter.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218934 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 00:42:41 +00:00
Juergen Ributzka
b3f91b0af7 [Stackmaps] Make ithe frame-pointer required for stackmaps.
Do not eliminate the frame pointer if there is a stackmap or patchpoint in the
function. All stackmap references should be FP relative.

This fixes PR21107.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218920 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-02 22:21:49 +00:00
Adrian Prantl
02474a32eb Move the complex address expression out of DIVariable and into an extra
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.

Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.

By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.

The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)

This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.

What this patch doesn't do:

This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.

http://reviews.llvm.org/D4919
rdar://problem/17994491

Thanks to dblaikie and dexonsmith for reviewing this patch!

Note: I accidentally committed a bogus older version of this patch previously.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218787 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 18:55:02 +00:00
Adrian Prantl
10c4265675 Revert r218778 while investigating buldbot breakage.
"Move the complex address expression out of DIVariable and into an extra"

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218782 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 18:10:54 +00:00
Adrian Prantl
076fd5dfc1 Move the complex address expression out of DIVariable and into an extra
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.

Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.

By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.

The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)

This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.

What this patch doesn't do:

This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.

http://reviews.llvm.org/D4919
rdar://problem/17994491

Thanks to dblaikie and dexonsmith for reviewing this patch!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218778 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 17:55:39 +00:00
Tom Coxon
01649dea92 [AArch64] Allow access to all system registers with MRS/MSR instructions.
The A64 instruction set includes a generic register syntax for accessing
implementation-defined system registers. The syntax for these registers is:
    S<op0>_<op1>_<CRn>_<CRm>_<op2>

The encoding space permitted for implementation-defined system registers
is:
    op0 op1  CRn   CRm   op2
    11  xxx  1x11  xxxx  xxx

The full encoding space can now be accessed:
    op0 op1  CRn   CRm   op2
    xx  xxx  xxxx  xxxx  xxx

This is useful to anyone needing to write assembly code supporting new
system registers before the assembler has learned the official names for
them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218753 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 10:13:59 +00:00
Asiri Rathnayake
e9bbacd0a8 Add missing natual vector cast.
Summary: The natual vector cast node (similar to bitcast) AArch64ISD::NVCAST
was introduced in r217159 and r217138. This patch adds a missing cast from
v2f32 to v1i64 which is causing some compilation failures. Also added test
cases to cover various modimm types and BUILD_VECTORs with i64 elements.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218751 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 09:59:45 +00:00
Juergen Ributzka
9952c922c2 Recommit r218010 [FastISel][AArch64] Fold bit test and branch into TBZ and TBNZ.
Note: This version fixed an issue with the TBZ/TBNZ instructions that were
generated in FastISel. The issue was that the 64bit version of TBZ (TBZX)
automagically sets the upper bit of the immediate field that is used to specify
the bit we want to test. To test for any of the lower 32bits we have to first
extract the subregister and use the 32bit version of the TBZ instruction (TBZW).

Original commit message:
Teach selectBranch to fold bit test and branch into a single instruction (TBZ or
TBNZ).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218693 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 19:59:35 +00:00
Tom Coxon
8a23890385 [AArch64] Remove unnecessary whitespace. (Test commit)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218680 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 16:23:16 +00:00
Juergen Ributzka
a0af4b0271 [FastISel][AArch64] Fold sign-/zero-extends into the load instruction.
The sign-/zero-extension of the loaded value can be performed by the memory
instruction for free. If the result of the load has only one use and the use is
a sign-/zero-extend, then we emit the proper load instruction. The extend is
only a register copy and will be optimized away later on.

Other instructions that consume the sign-/zero-extended value are also made
aware of this fact, so they don't fold the extend too.

This fixes rdar://problem/18495928.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218653 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 00:49:58 +00:00
Juergen Ributzka
e8a9706eda [FastISel][AArch64] Factor out scale factor calculation. NFC.
Factor out the code that determines the implicit scale factor of memory
operations for a given value type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218652 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-30 00:49:54 +00:00
Dave Estes
f657899174 [AArch64] Refines the Cortex-A57 Machine Model
Primarily refines all of the instructions with accurate latency
and micro-op information. Refinements largely focus on the NEON
instructions.

Additionally, a few advanced features are modeled, including
forwarding for MAC instructions and hazards for floating point SQRT
and DIV.

Lastly, the issue-width is reduced to three so that the scheduler
will better accommodate the narrower decode and dispatch width.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218627 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 21:27:36 +00:00
Chad Rosier
ea64dce261 [AArch64] Improve cost model to handle sdiv by a pow-of-two.
This patch improves the target-specific cost model to better handle signed
division by a power of two. The immediate result is that this enables the SLP
vectorizer to do a better job.

http://reviews.llvm.org/D5469
PR20714

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218607 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-29 13:59:31 +00:00
Jim Grosbach
bd847644b3 AArch64: allow constant expressions for shifted reg literals
e.g., add w1, w2, w3, lsl #(2 - 1)

This sort of thing comes up in pre-processed assembly playing macro games.
Still validate that it's an assembly time constant. The early exit error check
was just a bit overzealous and disallowed a left paren.

rdar://18430542

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218336 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-23 22:16:02 +00:00
Oliver Stannard
abe1cb7985 Fix segfault in AArch64 backend with -g and -mbig-endian
Fix a null pointer dereference when trying to swap the endianness of
fixups in the .eh_frame section in the AArch64 backend.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218311 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-23 15:38:11 +00:00
Juergen Ributzka
af989653e0 [FastISel][AArch64] Also allow folding of sign-/zero-extend and shift-left for booleans (i1).
Shift-left immediate with sign-/zero-extensions also works for boolean values.
Update the assert and the test cases to reflect that fact.

This should fix a bug found by Chad.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218275 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-22 21:08:53 +00:00
Juergen Ributzka
faf93a6e0c [FastIsel][AArch64] Fix a think-o in address computation.
When looking through sign/zero-extensions the code would always assume there is
such an extension instruction and use the wrong operand for the address.

There was also a minor issue in the handling of 'AND' instructions. I
accidentially used a 'cast' instead of a 'dyn_cast'.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218161 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-19 22:23:46 +00:00
Aaron Ballman
c21e4e197d Reverting NFC changes from r218050. Instead, the warning was disabled for GCC in r218059, so these changes are no longer required.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218062 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-18 17:34:23 +00:00
Aaron Ballman
cf5bea8e4a Fixing a bunch of -Woverloaded-virtual warnings due to hiding getSubtargetImpl from the base class. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218050 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-18 13:27:14 +00:00
Juergen Ributzka
f789dac2dd Revert "[FastISel][AArch64] Fold bit test and branch into TBZ and TBNZ."
Reverting it until I have time to investigate a regression.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218035 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-18 08:07:40 +00:00
Juergen Ributzka
ef48b51126 Fix previous commit: [FastISel][AArch64] Simplify XALU multiplies.
When folding the intrinsic flag into the branch or select we also have to
consider the fact if the intrinsic got simplified, because it changes the
flag we have to check for.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218034 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-18 07:26:26 +00:00
Juergen Ributzka
e7fba004ce [FastISel][AArch64] Simplify XALU multiplies.
Simplify {s|u}mul.with.overflow to {s|u}add.with.overflow when possible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218033 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-18 07:04:54 +00:00
Juergen Ributzka
4b6f00ad18 [FastISel][AArch64] Followup commit for 218031 to handle negative offsets too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218032 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-18 07:04:49 +00:00
Juergen Ributzka
22b557d942 [FastISel][AArch64] Try to fold the offset into the add instruction when simplifying a memory address.
Small optimization in 'simplifyAddress'. When the offset cannot be encoded in
the load/store instruction, then we need to materialize the address manually.
The add instruction can encode a wider range of immediates than the load/store
instructions. This change tries to fold the offset into the add instruction
first before materializing the offset in a register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218031 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-18 05:40:47 +00:00
Juergen Ributzka
ffbd4879eb [FastISel][AArch64] Fold 'AND' instruction during the address computation.
The 'AND' instruction could be used to mask out the lower 32 bits of a register.
If this is done inside an address computation we might be able to fold the
instruction into the memory instruction itself.

and  x1, x1, #0xffffffff   ---> ldrb x0, [x0, w1, uxtw]
ldrb x0, [x0, x1]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218030 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-18 05:40:41 +00:00
Juergen Ributzka
710fc316fb [FastISel][AArch64] Fold bit test and branch into TBZ and TBNZ.
Teach selectBranch to fold bit test and branch into a single instruction (TBZ or
TBNZ).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218010 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-18 02:44:13 +00:00
Juergen Ributzka
7516444a26 [FastISel][AArch64] Custom lower sdiv by power-of-2.
Emit an optimized instruction sequence for sdiv by power-of-2 depending on the
exact flag.

This fixes rdar://problem/18224511.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217986 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-17 21:55:55 +00:00
Juergen Ributzka
580875d39d [FastISel][AArch64] Simplify mul to shift when possible.
This is related to rdar://problem/18369687.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217980 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-17 20:35:41 +00:00
Juergen Ributzka
46d6fd2908 [FastISel][AArch64] Fold mul into add/sub and logical operations.
Try to fold the multiply into the add/sub or logical operations (when
possible).

This is related to rdar://problem/18369687.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217978 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-17 19:51:38 +00:00
Juergen Ributzka
5461af97bc [FastISel][AArch64] Fold mul into the address computation of memory operations.
Teach 'computeAddress' to also fold multiplies into the address computation
(when possible).

This fixes rdar://problem/18369443.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217977 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-17 19:19:31 +00:00
Juergen Ributzka
07c9ae576c [FastISel][AArch64] Fold compare with zero and branch into CBZ and CBNZ.
This takes advanatage of the CBZ and CBNZ instruction to further optimize the
common null check pattern into a single instruction.

This is related to rdar://problem/18358882.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217972 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-17 18:05:34 +00:00
Juergen Ributzka
17e0ee5078 [FastISel][AArch64] Improve branch selection to support all FP conditions.
This adds the last two missing floating-point condition codes (FCMP_UEQ and
FCMP_ONE) also to the branch selection. In these two cases an additonal branch
instruction is required.

This also adds unit tests to checks all the different condition codes.

This is related o rdar://problem/18358882.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217966 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-17 17:46:47 +00:00
Robin Morisset
5c16c4e45a [X86] Use the generic AtomicExpandPass instead of X86AtomicExpandPass
This required a new hook called hasLoadLinkedStoreConditional to know whether
to expand atomics to LL/SC (ARM, AArch64, in a future patch Power) or to
CmpXchg (X86).

Apart from that, the new code in AtomicExpandPass is mostly moved from
X86AtomicExpandPass. The main result of this patch is to get rid of that
pass, which had lots of code duplicated with AtomicExpandPass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217928 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-17 00:06:58 +00:00
Juergen Ributzka
c9bc145e31 [FastISel][AArch64] Add vector support to argument lowering.
Lower the first 8 vector arguments too.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217850 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-16 00:25:30 +00:00
Juergen Ributzka
488f228a4f [FastISel][AArch64] Allow handling of vectors during return lowering for little endian machines.
Allow handling of vectors during return lowering at least for little endian machines.
This was restricted in r208200 to fix it for big endian machines (according to
the comment), but it also disabled it for little endian too.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217846 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-15 23:40:10 +00:00
Juergen Ributzka
d8629f313e [FastISel][AArch64] Update function and variable names to follow the coding standard. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217845 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-15 23:20:17 +00:00
Juergen Ributzka
61c9638f41 [FastISel][AArch64] Make AArch64FastISel class final. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217840 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-15 22:33:11 +00:00
Juergen Ributzka
df445d7af2 [FastISel][AArch64] Lower sin/cos/pow to runtime lib calls.
Also lower sin/cos/pow to runtime lib calls.

This fixes rdar://problem/18343468.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217839 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-15 22:33:06 +00:00
Juergen Ributzka
323445f706 [FastISel][AArch64] Add lowering support for frem.
This lowers frem to a runtime libcall inside fast-isel.

The test case also checks the CallLoweringInfo bug that was exposed by this
change.

This fixes rdar://problem/18342783.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217833 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-15 22:07:49 +00:00
Juergen Ributzka
05cd1489c0 [FastISel][AArch64] Refactor selectAddSub, selectLogicalOp, and SelectShift. NFC.
Small refactor to tidy up the code a little.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217827 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-15 21:27:56 +00:00
Juergen Ributzka
4e10936b38 [FastISel][AArch64] Refactor code to use isTypeSupported. NFC.
Gets rid of isLoadStoreTypeLegal and replace it with isTypeSupported.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217826 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-15 21:27:54 +00:00
Juergen Ributzka
86bdc1efbe [FastISel][AArch64] Improve floating-point compare support.
Add support for the last two missing fcmp condition codes: UEQ and ONE.

This fixes rdar://problem/18341575.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217823 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-15 20:47:16 +00:00
James Molloy
4fb9bbe72f [A57FPLoadBalancing] Modify r217689 - actually we do need to check defs
... Just make sure we check uses first so we see the kill first. It
turns out ignoring defs gives some pretty nasty runtime failures.
I'm certain this is the fix but I'm still reducing a testcase.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217735 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-14 18:24:26 +00:00
Juergen Ributzka
5bf1f01c15 [FastISel][AArch64] Add support for non-native types for logical ops.
Extend the logical ops selection to also support non-native types such as i1,
i8, and i16.

Fixes rdar://problem/18330589.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217732 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-13 23:46:28 +00:00
Chad Rosier
995738064a [AArch64] Don't enable the post-RA MI scheduler at OptNone.
Hopefully, this will appease the bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217712 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-12 22:17:28 +00:00
Chad Rosier
4fb3a966d0 [AArch64] Enable post-RA MI scheduler.
Phabricator Revision: http://reviews.llvm.org/D5278
Patch by Sanjin Sijaric!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217693 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-12 17:40:39 +00:00
James Molloy
ca332457ba [A57FPLoadBalancing] Remove support for vector types
Vector MUL/MLAs have tied operands, which gives us extra constraints
that we currently can't handle. Instead of silently doing the wrong
thing, remove support to be readded later properly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217690 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-12 16:55:32 +00:00
James Molloy
b4fbcbc288 [A57FPLoadBalancing] Ignore <def>s when checking if a chain may be killed.
Defs are seen before uses, so a def without the kill flag doesn't necessarily
mean that the register is not killed on that instruction. It may be killed
in a later use operand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217689 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-12 16:55:26 +00:00
James Molloy
9204ab0456 [A57LoadBalancing] unique_ptr-ify.
Thanks to David Blakie for the in-depth review!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217682 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-12 14:35:17 +00:00
Patrik Hagglund
4a93e8dd02 Fix gcc -Wpedantic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217669 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-12 12:32:08 +00:00
Gerolf Hoflehner
e7648ae68d [AArch64] Revert r216141 for cyclone
The increase of the interleave factor to 4 has side-effects
like performance losses eg. due to reminder loops being executed
more frequently and may increase code size. It requires more
analysis and careful heuristic tuning. Expect double digit gains
in small benchmarks like lowercase.c and losses in puzzle.c.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217540 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-10 20:31:57 +00:00
Sanjay Patel
87c977a52b Rename getMaximumUnrollFactor -> getMaxInterleaveFactor; also rename option names controlling this variable.
"Unroll" is not the appropriate name for this variable. Clang already uses 
the term "interleave" in pragmas and metadata for this.

Differential Revision: http://reviews.llvm.org/D5066



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217528 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-10 17:58:16 +00:00
Arnaud A. de Grandmaison
2c925e0e8f [AArch64] Address Chad's post commit review comments for r217504 (PBQP experimental support)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217518 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-10 17:03:25 +00:00
Arnaud A. de Grandmaison
ac67a5fd18 [AArch64] Pacify lld buildbot complaining about an unused static function in release build.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217505 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-10 14:24:02 +00:00
Arnaud A. de Grandmaison
438669ca81 [AArch64] Add experimental PBQP support
This adds target specific support for using the PBQP register allocator on the
AArch64, for the A57 cpu.

By default, the PBQP allocator is not used, unless explicitely required
on the command line with "-aarch64-pbqp".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217504 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-10 14:06:10 +00:00
Asiri Rathnayake
3babc141b2 [AArch 64] Use a constant pool load for weak symbol references when
using static relocation model and small code model.

Summary: currently we generate GOT based relocations for weak symbol
references regardless of the underlying relocation model. This should
be change so that in static relocation model we use a constant pool
load instead.

Patch from: Keith Walker

Reviewers: Renato Golin, Tim Northover

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217503 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-10 13:54:38 +00:00
Chad Rosier
c3c0c6df2a [AArch64] Enabled AA support for Cortex-A57.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217381 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-08 15:34:16 +00:00
Chad Rosier
b30d031de4 [AArch64] Improve AA to remove unneeded edges in the AA MI scheduling graph.
Patch by Sanjin Sijaric <ssijaric@codeaurora.org>!
Phabricator Review: http://reviews.llvm.org/D5103

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217371 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-08 14:43:48 +00:00
Chad Rosier
1ef487d463 [AArch64] Enabled AA support for Cortex-A53.
Patch by Sanjin Sijaric <ssijaric@codeaurora.org>!
Phabricator Review: http://reviews.llvm.org/D5103

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217370 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-08 14:31:49 +00:00
Jiangning Liu
b20b9bf9fd [AArch64] Add pass to enable additional comparison optimizations by CSE.
Patched by Sergey Dmitrouk.

This pass tries to make consecutive compares of values use same operands to
allow CSE pass to remove duplicated instructions. For this it analyzes
branches and adjusts comparisons with immediate values by converting:

GE -> GT
GT -> GE
LT -> LE
LE -> LT

and adjusting immediate values appropriately. It basically corrects two
immediate values towards each other to make them equal.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217220 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 02:55:24 +00:00
Tim Northover
8dcac5d77a AArch64: fix vector-immediate BIC/ORR on big-endian devices.
Follow up to r217138, extending the logic to other NEON-immediate instructions.
As before, the instruction already performs the correct operation and we're
just using a different type for convenience, so we want a true nop-cast.

Patch by Asiri Rathnayake.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217159 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 15:05:24 +00:00
Tim Northover
dfe4e3e706 AArch64: fix big-endian immediate materialisation
We were materialising big-endian constants using DAG nodes with types different
from what was requested, followed by a bitcast. This is fine on little-endian
machines where bitcasting is a nop, but we need a slightly different
representation for big-endian. This adds a new set of NVCAST (natural-vector
cast) operations which are always nops.

Patch by Asiri Rathnayake.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217138 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 09:46:14 +00:00
Juergen Ributzka
319b5d0e8e [FastISel][AArch64] Cleanup and simplify 'fastSelectInstruction'. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217119 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 01:29:21 +00:00
Juergen Ributzka
68a4ab08b3 [FastISel][AArch64] Add target-specific lowering for logical operations.
This change adds support for immediate and shift-left folding into logical
operations.

This fixes rdar://problem/18223183.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217118 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 01:29:18 +00:00
Robin Morisset
1ad925ccf8 Refactor AtomicExpandPass and add a generic isAtomic() method to Instruction
Summary:
Split shouldExpandAtomicInIR() into different versions for Stores/Loads/RMWs/CmpXchgs.
Makes runOnFunction cleaner (no more redundant checking/casting), and will help moving
the X86 backend to this pass.

This requires a way of easily detecting which instructions are atomic.
I followed the pattern of mayReadFromMemory, mayWriteOrReadMemory, etc.. in making
isAtomic() a method of Instruction implemented by a switch on the opcodes.

Test Plan: make check

Reviewers: jfb

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D5035

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217080 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 21:29:59 +00:00
Juergen Ributzka
ecadea992a [FastISel][tblgen] Rename tblgen generated FastISel functions. NFC.
This is the final round of renaming. This changes tblgen to emit lower-case
function names for FastEmitInst_* and FastEmit_*, and updates all its uses
in the source code.

Reviewed by Eric

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217075 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 20:56:59 +00:00
Juergen Ributzka
6042034603 [FastISel] Rename public visible FastISel functions. NFC.
This commit renames the following public FastISel functions:
LowerArguments -> lowerArguments
SelectInstruction -> selectInstruction
TargetSelectInstruction -> fastSelectInstruction
FastLowerArguments -> fastLowerArguments
FastLowerCall -> fastLowerCall
FastLowerIntrinsicCall -> fastLowerIntrinsicCall
FastEmitZExtFromI1 -> fastEmitZExtFromI1
FastEmitBranch -> fastEmitBranch
UpdateValueMap -> updateValueMap
TargetMaterializeConstant -> fastMaterializeConstant
TargetMaterializeAlloca -> fastMaterializeAlloca
TargetMaterializeFloatZero -> fastMaterializeFloatZero
LowerCallTo -> lowerCallTo

Reviewed by Eric

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217074 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 20:56:52 +00:00
Eric Christopher
c24df453b0 Remove unnecessary getTarget call now that the subtarget is cached
on the machine function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217070 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 20:36:26 +00:00
Juergen Ributzka
39af4e655a [FastISel] Some long overdue spring cleaning of FastISel.
Things got a little bit messy over the years and it is time for a little bit
spring cleaning.

This first commit is focused on the FastISel base class itself. It doxyfies all
comments, C++11fies the code where it makes sense, renames internal methods to
adhere to the coding standard, and clang-formats the files.

Reviewed by Eric

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217060 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 18:46:45 +00:00
Juergen Ributzka
1892eabdef [FastISel][AArch64] Move unconditional branch handling into 'SelectBranch'. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217054 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 17:58:10 +00:00
Benjamin Kramer
a80ff26688 Add override to overriden virtual methods, remove virtual keywords.
No functionality change. Changes made by clang-tidy + some manual cleanup.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217028 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 11:41:21 +00:00
Juergen Ributzka
847547086d Reapply r216805 "[MachineCombiner][AArch64] Use the correct register class for MADD, SUB, and OR.""
This reapplies r216805 with a fix to a copy-past error, which resulted in an
incorrect register class.

Original commit message:
Select the correct register class for the various instructions that are
generated when combining instructions and constrain the registers to the
appropriate register class.

This fixes rdar://problem/18183707.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217019 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 07:07:10 +00:00
Juergen Ributzka
dd7a7107c1 [FastISel][AArch64] Add target-dependent instruction selection for Add/Sub.
There is already target-dependent instruction selection support for Adds/Subs to
support compares and the intrinsics with overflow check. This takes advantage of
the existing infrastructure to also support Add/Sub, which allows the folding of
immediates, sign-/zero-extends, and shifts.

This fixes rdar://problem/18207316.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217007 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 01:38:36 +00:00
Juergen Ributzka
79ec2ed417 [FastISel][AArch64] Use the target-dependent selection code for shifts first.
This uses the target-dependent selection code for shifts first, which allows us
to create better code for shifts with immediates and sign-/zero-extend folding.

Vector type are not handled yet and the code falls back to target-independent
instruction selection for these cases.

This fixes rdar://problem/17907920.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216985 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 22:33:57 +00:00
Juergen Ributzka
4919b697c2 [FastISel][AArch64] Use a new helper function to determine if a value type is supported. NFCI.
FastISel for AArch64 supports more value types than are actually legal. Use a
dedicated helper function to reflect this.

It is very similar to the isLoadStoreTypeLegal function, with the exception
that vector types are not supported yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216984 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 22:33:53 +00:00
Eric Christopher
d5dd8ce2a5 Reinstate "Nuke the old JIT."
Approved by Jim Grosbach, Lang Hames, Rafael Espindola.

This reinstates commits r215111, 215115, 215116, 215117, 215136.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216982 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 22:28:02 +00:00
Juergen Ributzka
a2e22cd4cd [FastISel][AArch64] Move over to target-dependent instruction selection only.
This change moves FastISel for AArch64 to target-dependent instruction selection
only. This change replicates the existing target-independent behavior, therefore
there are no changes to the unit tests or new tests.

Future changes will take advantage of this change and update functionality
and unit tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216955 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 21:32:54 +00:00
Pete Cooper
6de6c6aae4 Change MCSchedModel to be a struct of statically initialized data.
This removes static initializers from the backends which generate this data, and also makes this struct match the other Tablegen generated structs in behaviour

Reviewed by Andy Trick and Chandler C

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216919 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 17:43:54 +00:00
Alexey Samsonov
4648487925 Fix left shifts of negative integers in AArch64 InstPrinter/Disassembler
Summary:
Left shift of negative integer is an undefined behavior, and
is reported by UBSan. It's ok for imm values to be negative, so we can
just replace left shifts with multiplications.

Test Plan: check-llvm test suite

Reviewers: t.p.northover

Reviewed By: t.p.northover

Subscribers: aemerson, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D5132

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216910 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 16:19:41 +00:00
Aaron Ballman
edc658d686 Silencing an MSVC C4334 warning ('<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)). NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216902 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 12:19:02 +00:00
David Xu
4e2b661005 Merge Extend and Shift into a UBFX
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216899 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 09:33:56 +00:00
Craig Topper
3af13568fb Remove 'virtual' keyword from methods markedwith 'override' keyword.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216823 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-30 16:48:34 +00:00
Juergen Ributzka
bcbae3d680 Revert r216805 "[MachineCombiner][AArch64] Use the correct register class for MADD, SUB, and OR."
I think this broke the build bot. Reverting it for now until I have time to take a closer look.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216813 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-30 06:16:26 +00:00
Juergen Ributzka
4e92383b67 [MachineCombiner][AArch64] Use the correct register class for MADD, SUB, and OR.
Select the correct register class for the various instructions that are
generated when combining instructions and constrain the registers to the
appropriate register class.

This fixes rdar://problem/18183707.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216805 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 23:48:09 +00:00
Juergen Ributzka
e7f301e079 [FastISel][AArch64] Use the correct register class for branches.
Also constrain the register class for branches.

This fixes rdar://problem/18181496.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216804 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 23:48:06 +00:00
Alexey Samsonov
c5198dc17a Make isValidMCLOHType take unsigned instead of enum to avoid loading invalid enum values
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216797 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 22:34:28 +00:00
Reid Kleckner
61123cf20a AArch64: Silence -Wabsolute-value warning with std::abs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216794 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 22:14:26 +00:00
Robin Morisset
217b38e19a Fix typos in comments, NFC
Summary: Just fixing comments, no functional change.

Test Plan: N/A

Reviewers: jfb

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D5130

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216784 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 21:53:01 +00:00
Louis Gerbarg
6393b3a677 Remove spurious mask operations from AArch64 add->compares on 16 and 8 bit values
This patch checks for DAG patterns that are an add or a sub followed by a
compare on 16 and 8 bit inputs. Since AArch64 does not support those types
natively they are legalized into 32 bit values, which means that mask operations
are inserted into the DAG to emulate overflow behaviour. In many cases those
masks do not change the result of the processing and just introduce a dependent
operation, often in the middle of a hot loop.

This patch detects the relevent DAG patterns and then tests to see if the
transforms are equivalent with and without the mask, removing the mask if
possible. The exact mechanism of this patch was discusses in
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-July/074444.html

There is a reasonably good chance there are missed oppurtunities due to similiar
(but not identical) DAG patterns that could be funneled into this test, adding
them should be simple if we see test cases.

Tests included.

rdar://13754426

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216776 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 21:00:22 +00:00
Juergen Ributzka
d8835d09ec [FastISel][AArch64] Fix an incorrect kill flag due to a bug in SelectTrunc.
When we select a trunc instruction we don't emit any code if the type is already
i32 or smaller. This is because the instruction that uses the truncated value
will deal with it.

This behavior can incorrectly transfer a kill flag, which was meant for the
result of the truncate, onto the source register.

%2 = trunc i32 %1 to i16
... = ... %2                -> ... = ... vreg1 <kill>
... = ... %1                   ... = ... vreg1

This commit fixes this by emitting a COPY instruction, so that the result and
source register are distinct virtual registers.

This fixes rdar://problem/18178188.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216750 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 17:58:16 +00:00
Tim Northover
1e77dc84c4 AArch64: only try to get operand of a known node.
A bug in r216725 meant we tried to discover the type of a SETCC before
confirming the node actually was a SETCC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216734 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 15:34:58 +00:00
Tim Northover
1f70cb9c14 AArch64: skip select/setcc combine in complex case.
In an llvm-stress generated test, we were trying to create a v0iN type and
asserting when that failed. This case could probably be handled by the
function, but not without added complexity and the situation it arises in is
sufficiently odd that there's probably no benefit anyway.

Should fix PR20775.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216725 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 13:05:18 +00:00
Arnaud A. de Grandmaison
123c92cbe0 [AArch64] FPLoadBalancing: move ownership of the chain to its current accumulator register
and forget about the previously used accumulator.

Coming up with a simple testcase is not easy, as this highly depends on
what the register allocator is doing: this issue showed up while working
with the PBQP allocator, which produced a different allocation scheme.
A testcase would need to come up with chain starting in D[0-7], then
moving to D[8-15], followed by a call to a function whose regmask
clobbers the starting accumulator in D[0-7], then another use of the chain.

Fixed some formatting, added some invariant checks while there.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216721 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 09:54:11 +00:00
Jiangning Liu
3cd73a5ded [AArch64] Fix some failures exposed by value type v4f16 and v8f16.
1) Add some missing bitcast patterns for v8f16.
2) Add type promotion for operand of ld/st operations.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216706 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 01:31:42 +00:00
Juergen Ributzka
cf45151b2c [FastISel][AArch64] Don't fold instructions that are not in the same basic block.
This fix checks first if the instruction to be folded (e.g. sign-/zero-extend,
or shift) is in the same machine basic block as the instruction we are folding
into.

Not doing so can result in incorrect code, because the value might not be
live-out of the basic block, where the value is defined.

This fixes rdar://problem/18169495.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216700 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 00:19:21 +00:00
Jim Grosbach
0d34b1ed26 AArch64: More correctly constrain target vector extend lowering.
The AArch64 target lowering for [zs]ext of vectors is set up to handle
input simple types and expects the generic SDag path to do something reasonable
with anything that's not a simple type. The code, however, was only
checking that the result type was a simple type and assuming that
implied that the source type would also be a simple type. That's not a
valid assumption, as operations like "zext <1 x i1> %0 to <1 x i32>"
demonstrate. The fix is to simply explicitly validate the source type
as well as the result type.

PR20791

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216689 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 22:08:28 +00:00
David Xu
5ca793561e Generate CMN when comparing a short int with minus
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216651 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 04:59:53 +00:00
Juergen Ributzka
a26b1bdcc8 Revert "[FastISel][AArch64] Don't fold instructions too aggressively into the memory operation."
Quentin pointed out that this is not the correct approach and there is a better and easier solution.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216632 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 23:09:40 +00:00
Juergen Ributzka
1f5263e43f [FastISel][AArch64] Don't fold instructions too aggressively into the memory operation.
Currently instructions are folded very aggressively into the memory operation,
which can lead to the use of killed operands:
  %vreg1<def> = ADDXri %vreg0<kill>, 2
  %vreg2<def> = LDRBBui %vreg0, 2
  ... = ... %vreg1 ...

This usually happens when the result is also used by another non-memory
instruction in the same basic block, or any instruction in another basic block.

If the computed address is used by only memory operations in the same basic
block, then it is safe to fold them. This is because all memory operations will
fold the address computation and the original computation will never be emitted.

This fixes rdar://problem/18142857.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216629 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 22:52:33 +00:00
Juergen Ributzka
2c4b9ea282 [FastISel][AArch64] Fix a comment in my previous commit (r216617).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216622 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 21:40:50 +00:00
Juergen Ributzka
ccf53013cd [FastISel][AArch64] Fix simplify address when the address comes from a shift.
When the address comes directly from a shift instruction then the address
computation cannot be folded into the memory instruction, because the zero
register is not available as a base register. Simplify addess needs to emit the
shift instruction and use the result as base register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216621 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 21:38:33 +00:00
Juergen Ributzka
d445e4acdb [FastISel][AArch64] Use the zero register for stores.
Use the zero register directly when possible to avoid an unnecessary register
copy and a wasted register at -O0. This also uses integer stores to store a
positive floating-point zero. This saves us from materializing the positive zero
in a register and then storing it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216617 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 21:04:52 +00:00
Oliver Stannard
5e487f8dc7 Teach the AArch64 backend about v4f16 and v8f16
This teaches the AArch64 backend to deal with the operations required
to deal with the operations on v4f16 and v8f16 which are exposed by
NEON intrinsics, plus the add, sub, mul and div operations.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216555 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 16:16:04 +00:00
Craig Topper
3512034554 Simplify creation of a bunch of ArrayRefs by using None, makeArrayRef or just letting them be implicitly created.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216525 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 05:25:25 +00:00
Juergen Ributzka
fc03e72b4f [FastISel][AArch64] Fix address simplification.
When a shift with extension or an add with shift and extension cannot be folded
into the memory operation, then the address calculation has to be materialized
separately. While doing so the code forgot to consider a possible sign-/zero-
extension. This fix folds now also the sign-/zero-extension into the add or
shift instruction which is used to materialize the address.

This fixes rdar://problem/18141718.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216511 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 00:58:30 +00:00
Juergen Ributzka
836f4bd090 [FastISel][AArch64] Fold Sign-/Zero-Extend into the shift immediate instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216510 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 00:58:26 +00:00
James Molloy
698eabd8b7 Change the return value of "getEnd()" from a MachineInstr* to a MachineBasicBlock::iterator.
It seems on Darwin the illegal round-trip ::iterator -> MachineInstr* -> ::iterator breaks execution horribly when the iterator is not a real MachineInstr, like ::end().



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216455 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-26 13:41:31 +00:00
Dylan Noblesmith
2db57b44de AArch64: use std::fill instead of memset
Followup based on review.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216436 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-26 03:33:26 +00:00
Dylan Noblesmith
4a3195c40d Revert "AArch64: use std::vector for temp array"
This reverts commit r216365.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216433 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-26 02:03:43 +00:00
Juergen Ributzka
670885e56e [FastISel][AArch64] Refactor float zero materialization. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216403 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-25 19:58:05 +00:00
Karthik Bhat
e637d65af3 Allow vectorization of division by uniform power of 2.
This patch adds support to recognize division by uniform power of 2 and modifies the cost table to vectorize division by uniform power of 2 whenever possible.
Updates Cost model for Loop and SLP Vectorizer.The cost table is currently only updated for X86 backend.
Thanks to Hal, Andrea, Sanjay for the review. (http://reviews.llvm.org/D4971)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216371 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-25 04:56:54 +00:00
Dylan Noblesmith
2c6a26dc68 AArch64: unique_ptr-ify map structures
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216366 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-25 01:59:38 +00:00
Dylan Noblesmith
7fb86b0705 AArch64: use std::vector for temp array
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216365 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-25 01:59:36 +00:00
Juergen Ributzka
5e34dffb9c [FastISel][AArch64] Add support for variable shift.
This adds the missing variable shift support for value type i8, i16, and i32.

This fixes <rdar://problem/18095685>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216242 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 23:06:07 +00:00
Robin Morisset
cf165c36ee Rename AtomicExpandLoadLinked into AtomicExpand
AtomicExpandLoadLinked is currently rather ARM-specific. This patch is the first of
a group that aim at making it more target-independent. See
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075873.html
for details

The command line option is "atomic-expand"

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216231 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 21:50:01 +00:00
Juergen Ributzka
5d6365c80c [FastISel][AArch64] Use the correct register class to make the MI verifier happy.
This is mostly achieved by providing the correct register class manually,
because getRegClassFor always returns the GPR*AllRegClass for MVT::i32 and
MVT::i64.

Also cleanup the code to use the FastEmitInst_* method whenever possible. This
makes sure that the operands' register class is properly constrained. For all
the remaining cases this adds the missing constrainOperandRegClass calls for
each operand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216225 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 20:57:57 +00:00
Quentin Colombet
ad3c6289b6 [AArch64] Run a peephole pass right after AdvSIMD pass.
The AdvSIMD pass may produce copies that are not coalescer-friendly. The
peephole optimizer knows how to fix that as demonstrated in the test case.

<rdar://problem/12702965>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216200 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 18:10:07 +00:00
Juergen Ributzka
60aadd5d8b [FastISel][AArch64] Factor out ANDWri instruction generation into a helper function. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216199 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 18:02:25 +00:00
James Molloy
824431f680 [LoopVectorize] Up the maximum unroll factor to 4 for AArch64
Only for Cortex-A57 and Cyclone for now, where it has shown wins.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216141 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 00:02:51 +00:00
Juergen Ributzka
5273295751 [FastISel][AArch64] Don't fold the sign-/zero-extend from i1 into the compare.
This fixes a bug I introduced in a previous commit (r216033). Sign-/Zero-
extension from i1 cannot be folded into the ADDS/SUBS instructions. Instead both
operands have to be sign-/zero-extended with separate instructions.

Related to <rdar://problem/17913111>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216073 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 16:34:15 +00:00
Aaron Ballman
0e9c68e6bc Silencing an MSVC C4334 warning ('<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)). NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216067 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 12:14:35 +00:00
Juergen Ributzka
3ef392c4e2 [FastISel][AArch64] Use the proper FMOV instruction to materialize a +0.0.
Use FMOVWSr/FMOVXDr instead of FMOVSr/FMOVDr, which have the proper register
class to be used with the zero register. This makes the MachineInstruction
verifier happy again.

This is related to <rdar://problem/18027157>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216040 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 01:10:36 +00:00
Juergen Ributzka
ae9a7964ef [FastISel][AArch64] Factor out ADDS/SUBS instruction emission and add support for extensions and shift folding.
Factor out the ADDS/SUBS instruction emission code into helper functions and
make the helper functions more clever to support most of the different ADDS/SUBS
instructions the architecture support. This includes better immedediate support,
shift folding, and sign-/zero-extend folding.

This fixes <rdar://problem/17913111>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216033 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 22:29:55 +00:00
Alexey Samsonov
97eadf21c8 Delete unused argument in AArch64MCInstLower constructor: it doesn't
use Mangler, and Mangler is in fact not even created when AArch64MCInstLower
is constructed.

This bug is reported by UBSan.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216030 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 21:51:08 +00:00
Juergen Ributzka
06bb1ca1e0 Reapply [FastISel][AArch64] Add support for more addressing modes (r215597).
Note: This was originally reverted to track down a buildbot error. Reapply
without any modifications.

Original commit message:
FastISel didn't take much advantage of the different addressing modes available
to it on AArch64. This commit allows the ComputeAddress method to recognize more
addressing modes that allows shifts and sign-/zero-extensions to be folded into
the memory operation itself.

For Example:
  lsl x1, x1, #3     --> ldr x0, [x0, x1, lsl #3]
  ldr x0, [x0, x1]

  sxtw x1, w1
  lsl x1, x1, #3     --> ldr x0, [x0, x1, sxtw #3]
  ldr x0, [x0, x1]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216013 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 19:44:17 +00:00
Juergen Ributzka
78f686d37c Reapply [FastISel][AArch64] Make use of the zero register when possible (r215591).
Note: This was originally reverted to track down a buildbot error. Reapply
without any modifications.

Original commit message:
This change materializes now the value "0" from the zero register.
The zero register can be folded by several instruction, so no
materialization is need at all.

Fixes <rdar://problem/17924413>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216009 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 19:44:02 +00:00
Alexey Samsonov
2ac376ba34 Hide two different AlignMode enums in anonymous namespaces. This bug is reported by UBSan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216001 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 18:40:39 +00:00
Juergen Ributzka
8841fb5f25 [FastISel][AArch64] Fix a few BuildMI callsites where the result register was added as an operand register.
This fixes a few BuildMI callsites where the result register was added by
using addReg, which is per default a use and therefore an operand register.

Also use the zero register as result register when emitting a compare
instruction (SUBS with unused result register).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215997 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 17:41:53 +00:00
Robin Morisset
af2fa71a64 Make use of isAtLeastRelease/Acquire in the ARM/AArch64 backends
Summary:
Make use of isAtLeastRelease/Acquire in the ARM/AArch64 backends
These helper functions are introduced in D4844.
Depends D4844

Test Plan: make check-all passes

Reviewers: jfb

Subscribers: aemerson, llvm-commits, mcrosier, reames

Differential Revision: http://reviews.llvm.org/D4937

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215902 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-18 16:48:58 +00:00
Oliver Stannard
eb922109f9 Teach the AArch64 backend to handle f16
This allows the AArch64 backend to handle fadd, fsub, fmul and fdiv
operations on f16 (half-precision) types by promoting to f32.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215891 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-18 14:22:39 +00:00
Oliver Stannard
802d420792 [ARM,AArch64] Do not tail-call to an externally-defined function with weak linkage
Externally-defined functions with weak linkage should not be
tail-called on ARM or AArch64, as the AAELF spec requires normal calls
to undefined weak functions to be replaced with a NOP or jump to the
next instruction. The behaviour of branch instructions in this
situation (as used for tail calls) is implementation-defined, so we
cannot rely on the linker replacing the tail call with a return.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215890 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-18 12:42:15 +00:00
Tim Northover
049ffbbdf2 TableGen: allow use of uint64_t for available features mask.
ARM in particular is getting dangerously close to exceeding 32 bits worth of
possible subtarget features. When this happens, various parts of MC start to
fail inexplicably as masks get truncated to "unsigned".

Mostly just refactoring at present, and there's probably no way to test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215887 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-18 11:49:42 +00:00
Robin Morisset
c51ec911e5 Fix typos in comments
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215777 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 22:17:28 +00:00
Juergen Ributzka
9ad3f9b8e1 [FastISel][AArch64] Fix a latent bug in floating-point materialization.
The floating-point value positive zero (+0.0) is a valid immedate value
according to isFPImmLegal. As a result AArch64 FastISel went ahead and
used the immediate version of fmov to materialize the constant.

The problem is that the immediate version of fmov cannot encode an imediate for
postive zero. Instead a fmov from the zero register was supposed to be used in
this case.

This fix adds handling for this special case and uses fmov from the zero
register to materialize a positive zero (negative zeroes go to the constant
pool).

There is no test case for this, because this code is currently dead. It will be
enabled in a future commit and I will add a test case in a separate commit
after that.

This fixes <rdar://problem/18027157>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215753 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 18:55:55 +00:00
Juergen Ributzka
db9a5cb2ea Reapplying [FastISel][AArch64] Cleanup constant materialization code. NFCI.
Note: This reapplies r215582 without any modifications. The refactoring wasn't
responsible for the buildbot failures.

Original commit message:
Cleanup and prepare constant materialization code for future commits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215752 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 18:55:52 +00:00
Amara Emerson
cef3ad6720 [AArch64] Narrow arguments passed in wrong position on the stack in
big-endian mode.

Patch by Asiri Rathnayake.

Differential Revision: http://reviews.llvm.org/D4922

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215716 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 14:29:57 +00:00
Rafael Espindola
a348fc7fda Remove HasLEB128.
We already require CFI, so it should be safe to require .leb128 and .uleb128.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215712 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 14:01:07 +00:00
Juergen Ributzka
6398a7f5fd Revert several FastISel commits to track down a buildbot error.
This reverts:
r215595 "[FastISel][X86] Add large code model support for materializing floating-point constants."
r215594 "[FastISel][X86] Use XOR to materialize the "0" value."
r215593 "[FastISel][X86] Emit more efficient instructions for integer constant materialization."
r215591 "[FastISel][AArch64] Make use of the zero register when possible."
r215588 "[FastISel] Let the target decide first if it wants to materialize a constant."
r215582 "[FastISel][AArch64] Cleanup constant materialization code. NFCI."

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215673 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 19:56:28 +00:00
Juergen Ributzka
14bc045838 Revert "[FastISel][AArch64] Add support for more addressing modes."
This reverts commits r215597, because it might have broken the build bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215659 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 17:10:54 +00:00
Moritz Roth
d8b0f99f87 Testing commit access.
Remove a trailing whitespace.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215653 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 16:20:50 +00:00
Aaron Ballman
f6964a7df9 Silencing an MSVC C4334 warning ('<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)). NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215642 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 13:43:57 +00:00
David Majnemer
8c651f5c26 AArch64: Silence warning in AArch64FastISel
GCC was emitting a signed vs unsigned comparison warning.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215620 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 06:44:51 +00:00
Akira Hatanaka
d0ddfb0896 [AArch64, fast-isel] Fall back to SelectionDAG to select tail calls.
Certain functions such as objc_autoreleaseReturnValue have to be called as
tail-calls even at -O0. Since normal fast-isel doesn't emit calls as tail calls,
we have to fall back to SelectionDAG to select calls that are marked as tail.

<rdar://problem/17991614>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215600 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 23:23:58 +00:00
Juergen Ributzka
8c9a0319bb [FastISel][AArch64] Add support for more addressing modes.
FastISel didn't take much advantage of the different addressing modes available
to it on AArch64. This commit allows the ComputeAddress method to recognize more
addressing modes that allows shifts and sign-/zero-extensions to be folded into
the memory operation itself.

For Example:
  lsl x1, x1, #3     --> ldr x0, [x0, x1, lsl #3]
  ldr x0, [x0, x1]

  sxtw x1, w1
  lsl x1, x1, #3     --> ldr x0, [x0, x1, sxtw #3]
  ldr x0, [x0, x1]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215597 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 22:53:29 +00:00
Juergen Ributzka
dc408e8069 [FastISel][AArch64] Make use of the zero register when possible.
This change materializes now the value "0" from the zero register.
The zero register can be folded by several instruction, so no
materialization is need at all.

Fixes <rdar://problem/17924413>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215591 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 22:13:14 +00:00
Gerolf Hoflehner
2205044968 [MachineCombiner] Removal of dangling DBG_VALUES after combining [20598]
This is a cleaner solution to the problem described in r215431.
When instructions are combined a dangling DBG_VALUE is removed.
This resolves bug 20598.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215587 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 22:07:36 +00:00
Juergen Ributzka
eac0bae1e8 [FastISel][AArch64] Cleanup constant materialization code. NFCI.
Cleanup and prepare constant materialization code for future commits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215582 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 21:34:04 +00:00
Benjamin Kramer
00e08fcaa0 Canonicalize header guards into a common format.
Add header guards to files that were missing guards. Remove #endif comments
as they don't seem common in LLVM (we can easily add them back if we decide
they're useful)

Changes made by clang-tidy with minor tweaks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215558 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 16:26:38 +00:00
Gerolf Hoflehner
392f7d970c [MachineCombiner] Fix for ICE bug 20598
The combiner ignored DBG nodes when checking
the uses of a virtual register.

It combined a sequence like
   %vreg1 = madd %vreg2, %vreg3,...
   DBG_VALUE (%vreg1 ...)
   %vreg4 = add %vreg1,...
to
  %vreg4 = madd %vreg2, %vreg3

leaving behind a dangling DBG_VALUE with
a definition. This triggered an assertion
in the MachineTraceMetrics.cpp module.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215431 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 07:54:12 +00:00
Jim Grosbach
ce9ccc00d3 Add missing closing namespace comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215402 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 22:42:31 +00:00
Jim Grosbach
3392b33ccd AArch64: Tidy up a few comments.
Have the comments match the actual parameter names. Found via clang-tidy.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215401 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 22:42:28 +00:00
Quentin Colombet
7f4f923aa5 [AArch64] Fix registerAllocator assigns same register for base and wback in
pre/post-index load and store.

Patch by Steven Wu <stevenwu@apple.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215390 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 21:39:53 +00:00
Sylvestre Ledru
a7f0941b83 Fix typos:
* libaries => libraries
* avaiable => available



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215366 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 18:04:46 +00:00
Joerg Sonnenberger
b2b363408b If available, pass down the Fixup object to EvaluateAsRelocatable.
At least on PowerPC, the interpretation of certain modifiers depends on
the context they appear in.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215310 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-10 11:35:12 +00:00
Aaron Ballman
c399631168 Resolving some type truncation warnings in MSVC (enum to bool in this case). No functional changes intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215293 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-09 19:53:34 +00:00
Tim Northover
1968324efa AArch64: avoid deleting the current iterator in a loop.
std::map invalidates the iterator to any element that gets deleted, which means
we can't increment it correctly afterwards. This was causing Darwin test
failures.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215233 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 17:31:52 +00:00
Juergen Ributzka
5714db474a [FastISel][AArch64] Attach MachineMemOperands to load and store instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215231 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 17:24:10 +00:00
NAKAMURA Takumi
b6273f0b1a AArch64A57FPLoadBalancing.cpp: Define ColorNames in !NDEBUG.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215226 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 17:00:59 +00:00
Jiangning Liu
3b85f30319 [AArch64] Fix a type conversion bug for anlyzing compare.
The bug can cause spec2006/483.xalancbmk failure.

Patched by David Xu.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215206 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 14:19:29 +00:00
James Molloy
3a106a2813 [AArch64] Add an FP load balancing pass for Cortex-A57
For best-case performance on Cortex-A57, we should try to use a balanced mix of odd and even D-registers when performing a critical sequence of independent, non-quadword FP/ASIMD floating-point multiply or multiply-accumulate operations.

This pass attempts to detect situations where the register allocation may adversely affect this load balancing and to change the registers used so as to better utilize the CPU.

Ideally we'd just take each multiply or multiply-accumulate in turn and allocate it alternating even or odd registers. However, multiply-accumulates are most efficiently performed in the same functional unit as their accumulation operand. Therefore this pass tries to find maximal sequences ("Chains") of multiply-accumulates linked via their accumulation operand, and assign them all the same "color" (oddness/evenness).

This optimization affects S-register and D-register floating point multiplies and FMADD/FMAs, as well as vector (floating point only) muls and FMADD/FMA. Q register instructions (and 128-bit vector instructions) are not affected.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215199 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 12:33:21 +00:00
Tim Northover
bbdf1e0432 AArch64: stop trying to take control of all UnknownArch triples.
This short-circuited our error reporting for incorrectly specified
target triples (you'd get AArch64 code instead).

Should fix PR20567.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215191 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 08:27:44 +00:00
NAKAMURA Takumi
4f8cb33c7a AArch64InstrInfo.cpp: Fix \param(s). [-Wdocumentation]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215180 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 02:04:18 +00:00
Eric Christopher
aa5b9c0f6f Temporarily Revert "Nuke the old JIT." as it's not quite ready to
be deleted. This will be reapplied as soon as possible and before
the 3.6 branch date at any rate.

Approved by Jim Grosbach, Lang Hames, Rafael Espindola.

This reverts commits r215111, 215115, 215116, 215117, 215136.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215154 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 22:02:54 +00:00
Gerolf Hoflehner
e4fa341dde MachineCombiner Pass for selecting faster instruction sequence on AArch64
Re-commit of r214832,r21469 with a work-around that
avoids the previous problem with gcc build compilers

The work-around is to use SmallVector instead of ArrayRef
of basic blocks in preservesResourceLen()/MachineCombiner.cpp



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215151 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 21:40:58 +00:00
Rafael Espindola
875710a2fd Nuke the old JIT.
I am sure we will be finding bits and pieces of dead code for years to
come, but this is a good start.

Thanks to Lang Hames for making MCJIT a good replacement!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215111 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 14:21:18 +00:00
Eric Christopher
41612a9b85 Remove the target machine from CCState. Previously it was only used
to get the subtarget and that's accessible from the MachineFunction
now. This helps clear the way for smaller changes where we getting
a subtarget will require passing in a MachineFunction/Function as
well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214988 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 18:45:26 +00:00
Chad Rosier
af4e76402f [AArch64] Add a few isTarget* API to AArch64 Subtarget.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214977 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 16:56:58 +00:00
Chad Rosier
b47a215dd0 [AArch64] Fix OS ABI flag for aarch64-linux-gnu target.
For triple aarch64-linux-gnu we were incorrectly setting IRIX.
For triple aarch64 we are correctly setting SYSV.

Patch by Ana Pazos <apazos@codeaurora.org>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214974 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 16:05:02 +00:00
James Molloy
e0243fb42d [AArch64] Add a testcase for r214957.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214965 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 13:31:32 +00:00
James Molloy
c2482dfc39 [AArch64] Conditional selects are expensive on out-of-order cores.
Specifically Cortex-A57. This probably applies to Cyclone too but I haven't enabled it for that as I can't test it.

This gives ~4% improvement on SPEC 174.vpr, and ~1% in 471.omnetpp.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214957 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 10:42:18 +00:00
Yi Kong
68b3d680f2 AArch64: Add support for instruction prefetch intrinsic
Instruction prefetch is not implemented for AArch64, it is incorrectly
translated into data prefetch instruction.

Differential Revision: http://reviews.llvm.org/D4777


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214860 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 12:46:47 +00:00
James Molloy
72035e9a8e Teach the SLP Vectorizer that keeping some values live over a callsite can have a cost.
Some types, such as 128-bit vector types on AArch64, don't have any callee-saved registers. So if a value needs to stay live over a callsite, it must be spilled and refilled. This cost is now taken into account.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214859 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 12:30:34 +00:00
Juergen Ributzka
baa40c68ac [FastIsel][AArch64] Fix previous commit r214844 (Don't perform sign-/zero-extension for function arguments that have already been sign-/zero-extended.)
The original code would fail for unsupported value types like i1, i8, and i16.
This fix changes the code to only create a sub-register copy for i64 value types
and all other types (i1/i8/i16/i32) just use the source register without any
modifications.

getRegClassFor() is now guarded by the i64 value type check, that guarantees
that we always request a register for a valid value type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214848 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 07:31:30 +00:00
Juergen Ributzka
eee659a076 [FastISel][AArch64] Implement the FastLowerArguments hook.
This implements basic argument lowering for AArch64 in FastISel. It only
handles a small subset of the C calling convention. It supports simple
arguments that can be passed in GPR and FPR registers.

This should cover most of the trivial cases without falling back to
SelectionDAG.

This fixes <rdar://problem/17890986>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214846 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 05:43:48 +00:00
Kevin Qin
6739735812 Revert "r214832 - MachineCombiner Pass for selecting faster instruction"
It broke compiling of most Benchmark and internal test, as clang got
clashed by segmentation fault or assertion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214845 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 05:43:47 +00:00
Juergen Ributzka
7e9c0bc511 [FastISel][AArch64] Don't perform sign-/zero-extension for function arguments that have already been sign-/zero-extended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214844 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 05:43:44 +00:00
Eric Christopher
6035518e3b Have MachineFunction cache a pointer to the subtarget to make lookups
shorter/easier and have the DAG use that to do the same lookup. This
can be used in the future for TargetMachine based caching lookups from
the MachineFunction easily.

Update the MIPS subtarget switching machinery to update this pointer
at the same time it runs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214838 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 02:39:49 +00:00
Gerolf Hoflehner
c2328d552c MachineCombiner Pass for selecting faster instruction
sequence on AArch64

Re-commit of r214669 without changes to test cases
LLVM::CodeGen/AArch64/arm64-neon-mul-div.ll and
LLVM:: CodeGen/AArch64/dp-3source.ll
This resolves the reported compfails of the original commit.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214832 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-05 01:16:13 +00:00
Juergen Ributzka
2c68cde701 [FastISel][AArch64] Fix shift lowering for i8 and i16 value types.
This fix changes the parameters #r and #s that are passed to the UBFM/SBFM
instruction to get the zero/sign-extension for free.

The original problem was that the shift left would use the 32-bit shift even for
i8/i16 value types, which could leave the upper bits set with "garbage" values.

The arithmetic shift right on the other side would use the wrong MSB as sign-bit
to determine what bits to shift into the value.

This fixes <rdar://problem/17907720>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214788 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 21:49:51 +00:00
Eric Christopher
9f85dccfc6 Remove the TargetMachine forwards for TargetSubtargetInfo based
information and update all callers. No functional change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214781 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 21:25:23 +00:00
Chad Rosier
82c93451f3 [AArch64] Extend the number of scalar instructions supported in the AdvSIMD
scalar integer instruction pass.

This is a patch I had lying around from a few months ago.  The pass is
currently disabled by default, so nothing to interesting.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214779 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 21:20:25 +00:00
Kevin Qin
534100b31e Revert "r214669 - MachineCombiner Pass for selecting faster instruction"
This commit broke "make check" for several hours, so get it reverted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214697 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-04 05:10:33 +00:00
Gerolf Hoflehner
48e1bd7287 MachineCombiner Pass for selecting faster instruction
sequence -  AArch64 target support

 This patch turns off madd/msub generation in the DAGCombiner and generates
 them in the MachineCombiner instead. It replaces the original code sequence
 with the combined sequence when it is beneficial to do so.

 When there is no machine model support it always generates the madd/msub
 instruction. This is true also when the objective is to optimize for code
 size: when the combined sequence is shorter is always chosen and does not
 get evaluated.

 When there is a machine model the combined instruction sequence
 is evaluated for critical path and resource length using machine
 trace metrics and the original code sequence is replaced when it is
 determined to be faster.

 rdar://16319955



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214669 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-03 22:03:40 +00:00
Juergen Ributzka
8d3bf10dd3 [FastISel][AArch64] Fold offset into the memory operation.
Fold simple offsets into the memory operation:
  add x0, x0, #8
  ldr x0, [x0]
-->
  ldr x0, [x0, #8]

Fixes <rdar://problem/17887945>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214545 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-01 19:40:16 +00:00
Juergen Ributzka
3d253a3f80 [FastISel][AArch64] Add branch weights.
Add branch weights to branch instructions, so that the following passes can
optimize based on it (i.e. basic block ordering).

Fixes <rdar://problem/17887137>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214537 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-01 18:39:24 +00:00
Renato Golin
de97719d77 Add missing breaks to AArch64InstrInfo::isGPRCopy
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214528 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-01 17:27:31 +00:00
Chad Rosier
4175e2f597 [AArch64] Generate tbz/tbnz when comparing against zero.
The tbz/tbnz checks the sign bit to convert

op w1, w1, w10
cmp w1, #0
b.lt .LBB0_0

to

op w1, w1, w10
tbnz w1, #31, .LBB0_0

Differential Revision: http://reviews.llvm.org/D4440

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214518 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-01 14:48:56 +00:00
Juergen Ributzka
74ac16386b [FastISel][AArch64] Fix the immediate versions of the {s|u}{add|sub}.with.overflow intrinsics.
ADDS and SUBS cannot encode negative immediates or immediates larger than 12bit.
This fix checks if the immediate version can be used under this constraints and
if we can convert ADDS to SUBS or vice versa to support negative immediates.

Also update the test cases to test the immediate versions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214470 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-01 01:25:55 +00:00
Louis Gerbarg
7d54c5b0f2 Make sure no loads resulting from load->switch DAGCombine are marked invariant
Currently when DAGCombine converts loads feeding a switch into a switch of
addresses feeding a load the new load inherits the isInvariant flag of the left
side. This is incorrect since invariant loads can be reordered in cases where it
is illegal to reoarder normal loads.

This patch adds an isInvariant parameter to getExtLoad() and updates all call
sites to pass in the data if they have it or false if they don't. It also
changes the DAGCombine to use that data to make the right decision when
creating the new load.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214449 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-31 21:45:05 +00:00
Aaron Ballman
43a151dbf9 Fixing an -Woverloaded-virtual warnings by exposing the hidden virtual function as well. No functional changes intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214400 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-31 12:58:50 +00:00
Juergen Ributzka
95ec2761f3 [FastISel][AArch64] Add basic bitcast support for conversion between float and int.
Fixes <rdar://problem/17867078>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214389 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-31 06:25:37 +00:00
Juergen Ributzka
a60ed423a6 [FastISel][AArch64] Add sqrt intrinsic support.
Fixes <rdar://problem/17867067>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214388 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-31 06:25:33 +00:00
Juergen Ributzka
e3a75015d7 [FastISel][AArch64] Add MachO large code model support for function calls.
Currently the large code model for MachO uses the GOT to make function calls.
Emit the required adrp and ldr instructions to load the address from the GOT.

Related to <rdar://problem/17733076>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214381 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-31 04:10:40 +00:00
Juergen Ributzka
450c33b212 [FastISel][AArch64 and X86] Don't emit stores for UNDEF arguments during function call lowering.
UNDEF arguments are not ment to be touched - especially for the webkit_js
calling convention. This fix reproduces the already existing behavior of
SelectionDAG in FastISel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214366 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-31 00:11:11 +00:00
Juergen Ributzka
cb99212bc1 [FastISel][AArch64] Add select folding support for the XALU intrinsics.
This improves the code generation for the XALU intrinsics when the
condition is feeding a select instruction.

This also updates and enables the XALU unit tests for FastISel.

This fixes <rdar://problem/17831117>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214350 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-30 22:04:37 +00:00
Juergen Ributzka
29c424cb12 [FastISel][AArch64] Add branch folding support for the XALU intrinsics.
This improves the code generation for the XALU intrinsics when the
condition is feeding a branch instruction.

This is related to <rdar://problem/17831117>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214349 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-30 22:04:34 +00:00
Juergen Ributzka
2235d7b9a7 [FastISel][AArch64] Add {s|u}{add|sub|mul}.with.overflow intrinsic support.
This commit adds support for the {s|u}{add|sub|mul}.with.overflow intrinsics.
The unit tests for FastISel will be enabled in a later commit, once there is
also branch and select folding support.

This is related to <rdar://problem/17831117>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214348 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-30 22:04:31 +00:00
Juergen Ributzka
24383df3ea [FastISel][AArch64] Create helper functions to create the various multiplies on AArch64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214346 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-30 22:04:25 +00:00
Juergen Ributzka
b0dba10fa6 [FastISel][AArch64] Add support for shift-immediate.
Currently the shift-immediate versions are not supported by tblgen and
hopefully this can be later removed, once the required support has been
added to tblgen.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214345 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-30 22:04:22 +00:00
Jiangning Liu
a3b8caf8a7 Implement AArch64 TTI interface isAsCheapAsAMove.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214159 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-29 02:09:26 +00:00
Matt Arsenault
2dd264c8a3 Add alignment value to allowsUnalignedMemoryAccess
Rename to allowsMisalignedMemoryAccess.

On R600, 8 and 16 byte accesses are mostly OK with 4-byte alignment,
and don't need to be split into multiple accesses. Vector loads with
an alignment of the element type are not uncommon in OpenCL code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214055 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-27 17:46:40 +00:00
Tim Northover
54a7f7f9e0 AArch64: fix conversion of 'J' inline asm constraints.
'J' represents a negative number suitable for an add/sub alias
instruction, but while preparing it to become an int64_t we were
mangling the sign extension. So "i32 -1" became 0xffffffffLL, for
example.

Should fix one half of PR20456.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214052 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-27 07:10:29 +00:00
Akira Hatanaka
0651a556fe [stack protector] Fix a potential security bug in stack protector where the
address of the stack guard was being spilled to the stack.

Previously the address of the stack guard would get spilled to the stack if it
was impossible to keep it in a register. This patch introduces a new target
independent node and pseudo instruction which gets expanded post-RA to a
sequence of instructions that load the stack guard value. Register allocator
can now just remat the value when it can't keep it in a register. 

<rdar://problem/12475629>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213967 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-25 19:31:34 +00:00
Juergen Ributzka
06640d93e0 [FastISel][AArch64] Add support for frameaddress intrinsic.
This commit implements the frameaddress intrinsic for the AArch64 architecture
in FastISel.

There were two test cases that pretty much tested the same, so I combined them
to a single test case.

Fixes <rdar://problem/17811834>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213959 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-25 17:47:14 +00:00
Benjamin Kramer
ce63ab327a Run sort_includes.py on the AArch64 backend.
No functionality change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213938 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-25 11:42:14 +00:00
Tim Northover
3715084d47 AArch64: refactor ReconstructShuffle function
Quite a bit of cruft had accumulated as we realised the various different cases
it had to handle and squeezed them in where possible. This refactoring mostly
flattens the logic and special-cases. The result is slightly longer, but I
think clearer.

Should be no functionality change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213867 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-24 15:39:55 +00:00
NAKAMURA Takumi
a4abc43a56 Prune redundant libdeps.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213857 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-24 11:45:27 +00:00
NAKAMURA Takumi
9fda421ee4 Update library dependencies.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213832 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-24 02:10:42 +00:00
Kevin Qin
2daff76c05 [AArch64] Fix a bug generating incorrect instruction when building small vector.
This bug is introduced by r211144. The element of operand may be
smaller than the element of result, but previous commit can
only handle the contrary condition. This commit is to handle this
scenario and generate optimized codes like ZIP1.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213830 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-24 02:05:42 +00:00
Jiangning Liu
1bc34d71b7 [AArch64] Disable some optimization cases for type conversion from sint to fp, because those optimization cases are micro-architecture dependent and only make sense for Cyclone. A new predicate Cyclone is introduced in .td file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213827 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-24 01:29:59 +00:00
Rafael Espindola
34c5b5b952 Finish inverting the MC -> Object dependency.
There were still some disassembler bits in lib/MC, but their use of Object
was only visible in the includes they used, not in the symbols.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213808 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 22:26:07 +00:00
Jim Grosbach
adb6a3649e [X86,AArch64] Extend vcmp w/ unary op combine to work w/ more constants.
The transform to constant fold unary operations with an AND across a
vector comparison applies when the constant is not a splat of a scalar
as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213800 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 20:41:43 +00:00
Jim Grosbach
4070037e3c X86: restrict combine to when type sizes are safe.
The folding of unary operations through a vector compare and mask operation
is only safe if the unary operation result is of the same size as its input.
For example, it's not safe for [su]itofp from v4i32 to v4f64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213799 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 20:41:38 +00:00
Juergen Ributzka
4fa6ecc26f [FastISel][AArch64] Fix return type in FastLowerCall.
I used the wrong method to obtain the return type inside FinishCall. This fix
simply uses the return type from FastLowerCall, which we already determined to
be a valid type.

Reduced test case from Chad. Thanks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213788 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 20:03:13 +00:00
Chad Rosier
67c325e9f0 [AArch64] Lower sdiv x, pow2 using add + select + shift.
The target-independent DAGcombiner will generate:
asr w1, X, #31 w1 = splat sign bit.
add X, X, w1, lsr #28 X = X + 0 or pow2-1
asr w0, X, asr #4 w0 = X/pow2

However, the add + shifts is expensive, so generate:
add w0, X, 15 w0 = X + pow2-1
cmp X, wzr X - 0
csel X, w0, X, lt X = (X < 0) ? X + pow2-1 : X;
asr w0, X, asr 4 w0 = X/pow2

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213758 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 14:57:52 +00:00
Tim Northover
8b6257629a AArch64: remove "arm64_be" support in favour of "aarch64_be".
There really is no arm64_be: it was a useful fiction to test big-endian support
while both backends existed in parallel, but now the only platform that uses
the name (iOS) doesn't have a big-endian variant, let alone one called
"arm64_be".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213748 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 12:58:11 +00:00
Tim Northover
9231148e69 AArch64: remove arm64 triple enumerator.
Having both Triple::arm64 and Triple::aarch64 is extremely confusing, and
invites bugs where only one is checked. In reality, the only legitimate
difference between the two (arm64 usually means iOS) is also present in the OS
part of the triple and that's what should be checked.

We still parse the "arm64" triple, just canonicalise it to Triple::aarch64, so
there aren't any LLVM-side test changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213743 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 12:32:47 +00:00
Juergen Ributzka
7edf396977 [FastIsel][AArch64] Add support for the FastLowerCall and FastLowerIntrinsicCall target-hooks.
This commit modifies the existing call lowering functions to be used as the
FastLowerCall and FastLowerIntrinsicCall target-hooks instead.

This enables patchpoint intrinsic lowering for AArch64.

This fixes <rdar://problem/17733076>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213704 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-22 23:14:58 +00:00
Tim Northover
f8d927f22b CodeGen: emit IR-level f16 conversion intrinsics as fptrunc/fpext
This makes the first stage DAG for @llvm.convert.to.fp16 an fptrunc,
and correspondingly @llvm.convert.from.fp16 an fpext. The legalisation
path is now uniform, regardless of the input IR:

  fptrunc -> FP_TO_FP16 (if f16 illegal) -> libcall
  fpext -> FP16_TO_FP (if f16 illegal) -> libcall

Each target should be able to select the version that best matches its
operations and not be required to duplicate patterns for both fptrunc
and FP_TO_FP16 (for example).

As a result we can remove some redundant AArch64 patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213507 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 09:13:56 +00:00
David Peixotto
12f33da20b MC: support different sized constants in constant pools
On AArch64 the pseudo instruction ldr <reg>, =... supports both
32-bit and 64-bit constants. Add support for 64 bit constants for
the pools to support the pseudo instruction fully.

Changes the AArch64 ldr-pseudo tests to use 32-bit registers and
adds tests with 64-bit registers.

Patch by Janne Grunau!

Differential Revision: http://reviews.llvm.org/D4279



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213387 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-18 16:05:14 +00:00
Tim Northover
e72ff8829e AArch64: implement efficient f16 bitcasts
Because i16 is illegal, there's no native DAG method to
represent a bitcast to or from an f16 type. This meant LLVM was
inserting a stack store/load pair which is really not ideal.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213378 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-18 13:07:05 +00:00
Tim Northover
1a8bcdb72e AArch64: support f16 extend/trunc operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213375 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-18 13:01:31 +00:00
Jim Grosbach
f4e104f5eb AArch64: Constant fold converting vector setcc results to float.
Since the result of a SETCC for AArch64 is 0 or -1 in each lane, we can
move unary operations, in this case [su]int_to_fp through the mask
operation and constant fold the operation away. Generally speaking:
  UNARYOP(AND(VECTOR_CMP(x,y), constant))
      --> AND(VECTOR_CMP(x,y), constant2)
where constant2 is UNARYOP(constant).

This implements the transform where UNARYOP is [su]int_to_fp.

For example, consider the simple function:
define <4 x float> @foo(<4 x float> %val, <4 x float> %test) nounwind {
  %cmp = fcmp oeq <4 x float> %val, %test
  %ext = zext <4 x i1> %cmp to <4 x i32>
  %result = sitofp <4 x i32> %ext to <4 x float>
  ret <4 x float> %result
}

Before this change, the code is generated as:
  fcmeq.4s  v0, v0, v1
  movi.4s v1, #0x1        // Integer splat value.
  and.16b v0, v0, v1      // Mask lanes based on the comparison.
  scvtf.4s  v0, v0        // Convert each lane to f32.
  ret

After, the code is improved to:
  fcmeq.4s  v0, v0, v1
  fmov.4s v1, #1.00000000 // f32 splat value.
  and.16b v0, v0, v1      // Mask lanes based on the comparison.
  ret

The svvtf.4s has been constant folded away and the floating point 1.0f
vector lanes are materialized directly via fmov.4s.

Rather than do the folding manually in the target code, teach getNode()
in the generic SelectionDAG to handle folding constant operands of
vector [su]int_to_fp nodes. It is reasonable (as noted in a FIXME) to do
additional constant folding there as well, but I don't have test cases
for those operations, so leaving them for another time when it becomes
appropriate.

rdar://17693791

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213341 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-18 00:40:52 +00:00
Arnaud A. de Grandmaison
08f689e9b0 [AArch64] Cleanup AsmParser: no need to use dyn_cast + assert. cast does it for us.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213296 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-17 19:08:14 +00:00
Tim Northover
3e61ccdded CodeGen: extend f16 conversions to permit types > float.
This makes the two intrinsics @llvm.convert.from.f16 and
@llvm.convert.to.f16 accept types other than simple "float". This is
only strictly needed for the truncate operation, since otherwise
double rounding occurs and there's no way to represent the strict IEEE
conversion. However, for symmetry we allow larger types in the extend
too.

During legalization, we can expand an "fp16_to_double" operation into
two extends for convenience, but abort when the truncate isn't legal. A new
libcall is probably needed here.

Even after this commit, various target tweaks are needed to actually use the
extended intrinsics. I've put these into separate commits for clarity, so there
are no actual tests of f64 conversion here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213248 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-17 10:51:23 +00:00
Yi Kong
f33a30cdd0 Port memory barriers intrinsics to AArch64
Memory barrier __builtin_arm_[dmb, dsb, isb] intrinsics are required to
implement their corresponding ACLE and MSVC intrinsics.

This patch ports ARM dmb, dsb, isb intrinsic to AArch64.

Differential Revision: http://reviews.llvm.org/D4520


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213247 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-17 10:50:20 +00:00
Tim Northover
fbb631183a AArch64: fall back to generic code for out of range extract/insert.
rdar://problem/17624784

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213059 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-15 10:00:26 +00:00
Tim Northover
26012cec89 AArch64: remove unnecessary pseudo-instruction.
Sufficiently twisted use of TableGen lets us write patterns directly for f16
(as an i16 promoted to i32) -> f32 conversion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212933 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-14 11:16:02 +00:00
Saleem Abdulrasool
01c06d7954 AArch64: add support for llvm.aarch64.hint intrinsic
This adds a llvm.aarch64.hint intrinsic to mirror the llvm.arm.hint in order to
support the various hint intrinsic functions in the ACLE.

Add an optional pattern field that permits the subclass to specify the pattern
that matches the selection.  The intrinsic pattern is set as mayLoad, mayStore,
so overload the value for the definition of the hint instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212883 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-12 21:20:49 +00:00
Oliver Stannard
cb047f2a74 ARM: Allow __fp16 as a function arg or return type for AArch64
ACLE 2.0 allows __fp16 to be used as a function argument or return
type. This enables this for AArch64.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212812 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-11 13:33:46 +00:00
Arnaud A. de Grandmaison
a9af0558b2 [AArch64] Add logical alias instructions to MC AsmParser
This patch teaches the AsmParser to accept some logical+immediate
instructions and convert them as shown:

  bic  Rd, Rn, #imm  ->  and Rd, Rn, #~imm
  bics Rd, Rn, #imm  ->  ands Rd, Rn, #~imm
  orn  Rd, Rn, #imm  ->  orr Rd, Rn, #~imm
  eon  Rd, Rn, #imm  ->  eor Rd, Rn, #~imm

Those instructions are an alternate syntax available to assembly coders,
and are needed in order to support code already compiling with some other
assemblers. For example, the bic construct is used by the linux kernel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212722 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-10 15:12:26 +00:00
Tim Northover
6b0ac2aa02 AArch64: correctly fast-isel i8 & i16 multiplies
We were asking for a register for type i8 or i16 which caused an assert.

rdar://problem/17620015

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212718 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-10 14:18:46 +00:00
Jim Grosbach
a3edd6a038 AArch64: Better codegen for storing to __fp16.
Storing will generally be immediately preceded by rounding from an f32
or f64, so make sure to match those patterns directly to convert into the
FPR16 register class directly rather than going through the integer GPRs.

This also eliminates an extra step in the convert-from-f64 path
which was first converting to f32 and then to f16 from there.

rdar://17594379

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212638 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-09 18:55:52 +00:00
Louis Gerbarg
479af7ce0c Make AArch64FastISel::EmitIntExt explicitly check its source and destination types
This is a follow up to r212492. There should be no functional difference, but
this patch makes it clear that SrcVT must be an i1/i8/16/i32 and DestVT must be
an i8/i16/i32/i64.

rdar://17516686

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212633 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-09 17:54:32 +00:00
Jim Grosbach
05bb7c5045 AArch64: Better codegen for loading from __fp16.
Loading will generally extend to an f32 or an 64, so make sure
to match those patterns directly to load into the FPR16 register
class directly rather than going through the integer GPRs.

This also eliminates an extra step in the convert-to-f64 path
which was first converting to f32 and then to f64 from there.

rdar://17594379

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212573 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-08 23:28:48 +00:00
Arnaud A. de Grandmaison
60d8767211 Truncate the immediate in logical operation to the register width
And continue to produce an error if the 32 most significant bits are not all ones or zeros.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212520 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-08 09:53:04 +00:00
Louis Gerbarg
e7f8191b18 Allow AArch64FastISel to degrade graceully in the presence of an MVT::i128
Currently AArch64FastISel crashes if it tries to extend an integer into an
MVT::i128. This can happen by creating 128 bit integers like so:

  typedef unsigned int uint128_t __attribute__((mode(TI)));
  typedef int sint128_t __attribute__((mode(TI)));

This patch makes EmitIntExt check for their presence and then falls back to
SelectionDAG.

Tests included.

rdar://17516686

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212492 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-07 21:37:51 +00:00
Kevin Qin
307e97d066 [AArch64] Normalize all constants to build a vector.
The value of constant operands will be truncated to fit element width.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212428 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-07 02:45:40 +00:00
Saleem Abdulrasool
36019bbc3a AArch64: whitespace cleanup
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212420 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-06 22:13:26 +00:00
Chandler Carruth
70968365db [codegen,aarch64] Add a target hook to the code generator to control
vector type legalization strategies in a more fine grained manner, and
change the legalization of several v1iN types and v1f32 to be widening
rather than scalarization on AArch64.

This fixes an assertion failure caused by scalarizing nodes like "v1i32
trunc v1i64". As v1i64 is legal it will fail to scalarize v1i32.

This also provides a foundation for other targets to have more granular
control over how vector types are legalized.

Patch by Hao Liu, reviewed by Tim Northover. I'm committing it to allow
some work to start taking place on top of this patch as it adds some
really important hooks to the backend that I'd like to immediately start
using. =]

http://reviews.llvm.org/D4322

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212242 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-03 00:23:43 +00:00
Duncan P. N. Exon Smith
9b4509a759 AArch64: Re-enable AArch64AddressTypePromotion
This reverts commits r212189 and r212190.

While this pass was accidentally disabled (until r212073), r205437
slipped in a use of `auto` that should have been `auto&`.

This fixes PR20188.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212201 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-02 18:17:40 +00:00
Duncan P. N. Exon Smith
a95253080b AArch64: Remove unnecessary parens
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212199 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-02 18:14:03 +00:00
Duncan P. N. Exon Smith
8ec4365621 AArch64: Merge isa with dyn_cast
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212194 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-02 17:26:39 +00:00
Duncan P. N. Exon Smith
2a56dd87c8 AArch64: Temporarily disable AArch64AddressTypePromotion
Temporarily disable AArch64AddressTypePromotion, which was effectively
re-enabled in r212073 and r212075, while I look into PR20188.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212189 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-02 17:03:16 +00:00
Saleem Abdulrasool
8f9108459e aarch64: support target-specific .req assembler directive
Based on the support for .req on ARM. The aarch64 variant has to keep track if
the alias register was a vector register (v0-31) or a general purpose or
VFP/Advanced SIMD ([bhsdq]0-31) register.

Patch by Janne Grunau!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212161 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-02 04:50:23 +00:00
Juergen Ributzka
75909261f0 [DAG] Pass the argument list to the CallLoweringInfo via move semantics. NFCI.
The argument list vector is never used after it has been passed to the
CallLoweringInfo and moving it to the CallLoweringInfo is cleaner and
pretty much as cheap as keeping a pointer to it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212135 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-01 22:01:54 +00:00
Tim Northover
4f5fa35b59 AArch64: fix comment typo
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212120 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-01 19:47:09 +00:00
Duncan P. N. Exon Smith
c3c8e7b79d AArch64: Follow-up to r212073
In r212073 I missed a call of `use_begin()` that assumed the wrong
semantics.  It's not clear to me at all what this code does without the
fix, so I'm not sure how to write a testcase.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212075 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-01 00:05:37 +00:00
Duncan P. N. Exon Smith
91fa94884d AArch64: Actually do address type promotion
AArch64AddressTypePromotion was doing nothing because it was using the
old semantics of `Use` and `uses()`, when it really wanted to get at the
`users()`.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212073 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-30 23:42:14 +00:00
Chad Rosier
99f2d6fcc2 [AArch64] Unsized types don't specify an alignment.
PR20109

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212045 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-30 15:03:00 +00:00
Chad Rosier
e7dfa85e85 [AArch64] Convert mul x, -(pow2 +/- 1) to shift + add/sub.
The combine for mul x, pow2 +/- 1 is unchanged. Test cases for
both combines as well as mul x, pow2 have been added as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212044 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-30 14:51:14 +00:00
Eric Christopher
db1c494276 Remove extraneous includes from the target machines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211800 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-26 19:30:05 +00:00
Rafael Espindola
c7abd27294 Move expression visitation logic up to MCStreamer.
Remove the duplicate from MCRecordStreamer. No functionality change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211714 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 15:45:33 +00:00
Rafael Espindola
d4feaf82bc Simplify the visitation of target expressions. No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211707 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 15:29:54 +00:00
Weiming Zhao
c33b4883b3 Resubmit commit r211533
"Fix PR20056: Implement pseudo LDR <reg>, =<literal/label> for AArch64"
Missed files are added in this commit.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211605 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 16:21:38 +00:00
Rafael Espindola
7e7e89f178 This reverts commit r211533 and r211539.
Revert "Fix PR20056: Implement pseudo LDR <reg>, =<literal/label> for AArch64"
 Revert "Fix cmake build."

It was missing a file.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211540 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 21:20:58 +00:00
Juergen Ributzka
af5c54f140 Fix cmake build.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211539 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 21:15:55 +00:00
Weiming Zhao
3cffac5061 Fix PR20056: Implement pseudo LDR <reg>, =<literal/label> for AArch64
This patch is based on the changes from ARM target [1,2]

Based on ARM doc [3], if the literal value can be loaded with a valid MOV,
it can emit that instruction. This is implemented in this patch.

[1] Fix PR18345: ldr= pseudo instruction produces incorrect code when using in inline assembly
Author: David Peixotto <dpeixott@codeaurora.org>
commit b92cca2228 (r200777)
[2] Implement the ldr-pseudo opcode for ARM assembly
Author: David Peixotto <dpeixott@codeaurora.org>
commit 0fa193b086 (r197708)
[3] http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0802a/CJAHAIBC.html

Differential Revision: http://reviews.llvm.org/D4163


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211533 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 20:44:16 +00:00
Craig Topper
bd01df2487 Convert some assert(0) to llvm_unreachable or fold an 'if' condition into the assert.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211254 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-19 06:10:58 +00:00
Kevin Qin
74287ec34c [AArch64] Fix a pattern match failure caused by creating improper CONCAT_VECTOR.
ReconstructShuffle() may wrongly creat a CONCAT_VECTOR trying to
concat 2 of v2i32 into v4i16. This commit is to fix this issue and
try to generate UZP1 instead of lots of MOV and INS.
Patch is initalized by Kevin Qin, and refactored by Tim Northover.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211144 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 05:54:42 +00:00
Craig Topper
10d664fee7 Replace some assert(0)'s with llvm_unreachable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211141 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 05:05:13 +00:00
Tim Northover
c22960dba6 AArch64: estimate inline asm length during branch relaxation
To make sure branches are in range, we need to do a better job of estimating
the length of an inline assembly block than "it's probably 1 instruction, who'd
write asm with more than that?".

Fortunately there's already a (highly suspect, see how many ways you can think
of to break it!) callback for this purpose, which is used by the other targets.

rdar://problem/17277590

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211095 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-17 11:31:42 +00:00
Jim Grosbach
44d2cdcbf3 AArch64: Add backend intrinsic for rbit.
Define an intrinsic for the frontend to use and pattern match it to
the RBIT instruction.

rdar://9283021

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211058 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-16 21:55:35 +00:00
Tilmann Scheller
d492c19894 [AArch64] Remove dead code.
Both function declarations lack a callee and an implementation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211029 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-16 15:15:41 +00:00
James Molloy
b3820b4289 [AArch64] Fix a fencepost error in lowering for llvm.aarch64.neon.uqshl.
Patch by Jiangning Liu!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211014 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-16 10:39:21 +00:00
Tim Northover
8bfc50e4a9 AArch64: improve handling & modelling of FP_TO_XINT nodes.
There's probably no acatual change in behaviour here, just updating
the LowerFP_TO_INT function to be more similar to the reverse
implementation and updating costs to current CodeGen.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210985 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-15 09:27:15 +00:00
Tim Northover
94fe5c1fe2 AArch64: improve vector [su]itofp handling.
This somehow got missed in the AArch64 merge, so should fix a
performance regression since 3.4.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210984 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-15 09:27:06 +00:00