Commit Graph

10544 Commits

Author SHA1 Message Date
Louis Gerbarg
7a9fbab182 Add custom lowering for add/sub with overflow intrinsics to ARM
This patch adds support to ARM for custom lowering of the
llvm.{u|s}add.with.overflow.i32 intrinsics for i32/i64. This is particularly useful
for handling idiomatic saturating math functions as generated by
InstCombineCompare.

Test cases included.

rdar://14853450

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208435 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-09 17:02:49 +00:00
Tom Stellard
3f26d366a4 R600/SI: Teach SIInstrInfo::moveToVALU() how to move S_LOAD_*_IMM instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208432 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-09 16:42:22 +00:00
Tom Stellard
300094fd84 R600/SI: Fix SMRD pattern for offsets > 32 bits
We were dropping the high bits of 64-bit immediate offsets.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208431 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-09 16:42:21 +00:00
Tom Stellard
561bb44525 R600: Expand i64 SELECT_CC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208430 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-09 16:42:19 +00:00
Tom Stellard
87b983680c R600: Move MIN/MAX matching from LowerOperation() to PerformDAGCombine()
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208429 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-09 16:42:16 +00:00
James Molloy
bfaccd494f Attempt to pacify the bots - this commit requires asserts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208424 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-09 16:20:53 +00:00
Oliver Stannard
e2948385b9 ARM: HFAs must be passed in consecutive registers
When using the ARM AAPCS, HFAs (Homogeneous Floating-point Aggregates) must
be passed in a block of consecutive floating-point registers, or on the stack.
This means that unused floating-point registers cannot be back-filled with
part of an HFA, however this can currently happen. This patch, along with the
corresponding clang patch (http://reviews.llvm.org/D3083) prevents this.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208413 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-09 14:01:47 +00:00
Daniel Sanders
32650944eb [mips][mips64r6] Add experimental support for MIPS32r6 and MIPS64r6
Summary:
Adds MIPS32r6/MIPS64r6 and checks the compatibility requirements for these
processors.

I've also included comments to describe removed and re-encoded instructions,
along with placeholder def's for the new instructions but there are no
functional changes to codegen at this point.

Reviewers: jkolek, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3622

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208399 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-09 09:46:21 +00:00
Saleem Abdulrasool
74d614a6fc ARM: support PIC on Windows on ARM
Handle lowering of global addresses for PIC mode compilation on Windows.  Always
use the movw/movt load to load the address as Windows on ARM requires ARMv7+ and
is a pure Thumb environment.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208385 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-09 00:58:32 +00:00
Filipe Cabecinhas
e4a3254c02 Optimize shufflevector that copies an i64/f64 and zeros the rest.
Summary:
Also ran clang-format on the function. The code added is the last else
if block.

Reviewers: nadav, craig.topper, delena

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3518

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208372 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-08 23:16:08 +00:00
Justin Bogner
73773ce844 test/CodeGen: Check that the correct register is used in a store
This tightens up r208351 to ensure that a store is fed with the
correct value.

Thanks to Quentin Colombet for spotting this!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208368 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-08 22:45:07 +00:00
Justin Bogner
8115f93cdb Make a CodeGen test more robust against vector register selection
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208351 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-08 18:53:56 +00:00
Andrea Di Biagio
2360e51fd0 [X86] Add target specific combine rules to fold SSE2/AVX2 packed arithmetic shift intrinsics.
This patch teaches the backend how to combine packed SSE2/AVX2 arithmetic shift
intrinsics.

The rules are:
 - Always fold a packed arithmetic shift by zero to its first operand;
 - Convert a packed arithmetic shift intrinsic dag node into a ISD::SRA only if
   the shift count is known to be smaller than the vector element size.

This patch also teaches to function 'getTargetVShiftByConstNode' how fold
target specific vector shifts by zero.

Added two new tests to verify that the DAGCombiner is able to fold
sequences of SSE2/AVX2 packed arithmetic shift calls.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208342 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-08 17:44:04 +00:00
Saleem Abdulrasool
f37151a2fd test: fix test on Windows
When building on Windows, the default target is Windows.  Windows on ARM does
not support ARM mode compilation, resulting in test failures.  Simply specify a
triple to ensure that we are testing the correct behaviour.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208340 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-08 17:11:29 +00:00
Christian Pirker
c60a59cad3 ARM big endian function argument passing
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208316 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-08 14:06:24 +00:00
James Molloy
00c4dbd10e [ARM64-BE] Teach fast-isel about how to set up sub-word stack arguments for big endian calls.
SelectionDAG already knows about this, but fast-isel was ignorant.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208307 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-08 12:53:50 +00:00
Tim Northover
291cd09645 ARM64: make sure FastISel emits SSA MachineInstrs
We need to use a temporary register for a 2-step operation like REM.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208297 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-08 10:30:56 +00:00
Hao Liu
1c2f863df9 AArch64/ARM64: Port NEON post-increment load/store with 2/3/4 vectors to ARM64 backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208284 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-08 07:38:13 +00:00
Filipe Cabecinhas
b19c087aa7 Lower certain build_vectors to insertps instructions
Summary:
Vectors built with zeros and elements in the same order as another
(source) vector are optimized to be built using a single insertps
instruction.
Also optimize when we move one element in a vector to a different place
in that vector while zeroing out some of the other elements.

Further optimizations are possible, described in TODO comments.
I will be implementing at least some of them in the near future.

Added some tests for different cases where this optimization triggers.

Reviewers: nadav, delena, craig.topper

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3521

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208271 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-08 00:25:16 +00:00
Quentin Colombet
57b4c5d473 [X86] Add a test case for r208252.
Prior to r208252, the FMA 231 family was marked as isCommutable. However the
memory variants of this family are not commutable. Therefore, we did not
implemented the findCommutedOpIndices for those variants and missed that
the default implementation (more or less: commute indices 1 and 2) was
firing behind our back.
As a result, as demonstrated in the test case before the fix, we were
transforming a = b * c + a into a = a * c + b.

I.e., before r208252 we were generating for this test case:
vmovaps %xmm0, %xmm1
vmoss (%rsi), %xmm0
vfmadd231ss (%rdi), %xmm1, %xmm0

Instead of:
vmoss (%rsi), %xmm1
vfmadd231ss (%rdi), %xmm1, %xmm0

<rdar://problem/16800495> 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208260 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-07 22:52:58 +00:00
Chad Rosier
8f0f458824 [ARM64][fast-isel] Disable target specific optimizations at -O0. Functionally,
this patch disables the dead register elimination pass and the load/store pair
optimization pass at -O0.  The ILP optimizations don't require the optimization
level to be checked because the call to addILPOpts is predicated with the
necessary check.  The AdvSIMDScalar pass is disabled by default at all
optimization levels.  This patch leaves that pass disabled by default.

Also, move command-line options into ARM64TargetMachine.cpp and add a few
additional flags to aid in debugging.  This fixes an issue with the
-debug-pass=Structure flag where passes were printed, but not actually run
(i.e., AdvSIMDScalar pass).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208223 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-07 16:41:55 +00:00
Tim Northover
04a359f768 AArch64/ARM64: optimise vector selects & enable test
When performing a scalar comparison that feeds into a vector select,
it's actually better to do the comparison on the vector side: the
scalar route would be "CMP -> CSEL -> DUP", the vector is "CM -> DUP"
since the vector comparisons are all mask based.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208210 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-07 14:10:27 +00:00
James Molloy
2712c87cfe [ARM64-BE] Fix fast-isel, and add appropriate RUN lines to appropriate tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208200 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-07 12:33:55 +00:00
James Molloy
d93d214a67 [ARM64-BE] Fix variable-argument saving.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208199 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-07 12:33:48 +00:00
James Molloy
fca7f5c585 [ARM64-BE] Implement the lane-twiddling logic at AAPCS boundaries for big endian.
The AAPCS states that values passed in registers must have a value as though
they had been loaded with "LDR". LDR is equivalent to "LD1.64 vX.1D" - that is,
loading scalars to vector registers and loading 1-element vectors is equivalent.

The logic implemented here is to ensure that at all call boundaries and during
formal argument lowering all vectors are treated as their bitwidth-based floating
point scalar counterpart, which is always one of f64 or f128 (v2i32 -> f64,
v4i32 -> f128 etc). A BITCAST is inserted so that the appropriate REV will be
generated during code generation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208198 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-07 12:33:41 +00:00
James Molloy
737c2ac4fc [ARM64-BE] Implement the crazy bitcast handling for big endian vectors.
Because we've canonicalised on using LD1/ST1, every time we do a bitcast
between vector types we must do an equivalent lane reversal.

Consider a simple memory load followed by a bitconvert then a store.
  v0 = load v2i32
  v1 = BITCAST v2i32 v0 to v4i16
       store v4i16 v2

In big endian mode every memory access has an implicit byte swap. LDR and
STR do a 64-bit byte swap, whereas LD1/ST1 do a byte swap per lane - that
is, they treat the vector as a sequence of elements to be byte-swapped.
The two pairs of instructions are fundamentally incompatible. We've decided
to use LD1/ST1 only to simplify compiler implementation.

LD1/ST1 perform the equivalent of a sequence of LDR/STR + REV. This makes
the original code sequence:  v0 = load v2i32

  v1 = REV v2i32                  (implicit)
  v2 = BITCAST v2i32 v1 to v4i16
  v3 = REV v4i16 v2               (implicit)
       store v4i16 v3

But this is now broken - the value stored is different to the value loaded
due to lane reordering. To fix this, on every BITCAST we must perform two
other REVs:

  v0 = load v2i32
  v1 = REV v2i32                  (implicit)
  v2 = REV v2i32
  v3 = BITCAST v2i32 v2 to v4i16
  v4 = REV v4i16
  v5 = REV v4i16 v4               (implicit)
       store v4i16 v5

This means an extra two instructions, but actually in most cases the two REV
instructions can be combined into one. For example:
  (REV64_2s (REV64_4h X)) === (REV32_4h X)

There is also no 128-bit REV instruction. This must be synthesized with an
EXT instruction.

Most bitconverts require some sort of conversion. The only exceptions are:
  a) Identity conversions -  vNfX <-> vNiX
  b) Single-lane-to-scalar - v1fX <-> fX or v1iX <-> iX

Even though there are hundreds of changed lines, I have a fairly high confidence
that they are somewhat correct. The changes to add two REV instructions per
bitcast were pretty mechanical, and once I'd done that I threw the resulting
.td at a script I wrote which combined the two REVs together (and added
an EXT instruction, for f128) based on an instruction description I gave it.

This was much less prone to error than doing it all manually, plus my brain
would not just have melted but would have vapourised.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208194 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-07 11:28:53 +00:00
James Molloy
104629cc7c [ARM64-BE] Make big endian (scalar) argument passing work correctly.
This completes the port of r204814 (cpirker "AArch64_BE function argument
passing for ARM ABI") from AArch64 to ARM64, and fixes a bunch of issues
found during later development along the way. The biggest of these was
that the alignment fixup logic wasn't replicated into all the places it
should have been.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208192 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-07 11:28:36 +00:00
Tim Northover
0d427515b3 AArch64/ARM64: run test on ARM64 too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208188 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-07 10:47:04 +00:00
Tim Northover
5a1d5139b8 AArch64/ARM64: put annotation in test
It makes finding already covered tests much easier with "grep -L
arm64".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208187 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-07 10:47:00 +00:00
Joerg Sonnenberger
2ecdcdc026 Allow using normal .eh_frame based unwinding on ARM. Use the same
encodings as x86. Use this exception model for NetBSD.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208166 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-07 07:49:34 +00:00
Saleem Abdulrasool
0029e2d665 ARM: fix WoA PEI instruction selection
The ARM::BLX instruction is an ARM mode instruction.  The Windows on ARM target
is limited to Thumb instructions.  Correctly use the thumb mode tBLXr
instruction.  This would manifest as an errant write into the object file as the
instruction is 4-bytes in length rather than 2.  The result would be a corrupted
object file that would eventually result in an executable that would crash at
runtime.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208152 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-07 03:03:27 +00:00
Joerg Sonnenberger
b84f890bc3 If a function needs a frame pointer, but r11 (aka fp) has not been used,
remove it from the list of unspilled registers. Otherwise the following
attempt to keep the stack aligned by picking an extra GPR register to
spill will not work as it picks up r11.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208129 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-06 20:43:01 +00:00
Andrea Di Biagio
8a712ba229 [X86] Improve the lowering of BITCAST dag nodes from type f64 to type v2i32 (and vice versa).
Before this patch, the backend always emitted a store+load sequence to
bitconvert from f64 to i64 the input operand of a ISD::BITCAST dag node that
performed a bitconvert from type MVT::f64 to type MVT::v2i32. The resulting
i64 node was then used to build a v2i32 vector.

With this patch, the backend now produces a cheaper SCALAR_TO_VECTOR from
MVT::f64 to MVT::v2f64. That SCALAR_TO_VECTOR is then followed by a "free"
bitcast to type MVT::v4i32. The elements of the resulting
v4i32 are then extracted to build a v2i32 vector (which is illegal and
therefore promoted to MVT::v2i64).

This is in general cheaper than emitting a stack store+load sequence
to bitconvert the operand from type f64 to type i64.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208107 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-06 17:09:03 +00:00
Renato Golin
22f779d1fd Implememting named register intrinsics
This patch implements the infrastructure to use named register constructs in
programs that need access to specific registers (bare metal, kernels, etc).

So far, only the stack pointer is supported as a technology preview, but as it
is, the intrinsic can already support all non-allocatable registers from any
architecture.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208104 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-06 16:51:25 +00:00
Rafael Espindola
b889448841 Special case aliases in GlobalValue::getAlignment.
An alias has the address of what it points to, so it also has the same
alignment.

This allows a few optimizations to see past aliases for free.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208103 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-06 16:48:58 +00:00
Kevin Qin
03145ebd88 [ARM64] Enable alignment control option in front-end for ARM64.
This is the modification in llvm part.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208074 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-06 09:48:52 +00:00
Reid Kleckner
9ad48c11b1 Fix i128 div/mod on mingw64
The Win64 docs are very clear that anything larger than 8 bytes is
passed by reference, and GCC MinGW64 honors that for __modti3 and
friends.

Patch by Jameson Nash!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208029 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-06 01:20:42 +00:00
Tom Stellard
4b84b524e5 R600: Expand i64 ISD:SUB
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208005 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-05 21:47:15 +00:00
Filipe Cabecinhas
75ea413a1b Revert "Optimize shufflevector that copies an i64/f64 and zeros the rest."
This reverts commit 207992. I misread the phab number on the LGTM.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207993 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-05 19:40:36 +00:00
Filipe Cabecinhas
a0fa9eb606 Optimize shufflevector that copies an i64/f64 and zeros the rest.
Summary:
Also ran clang-format on the function. The code added is the last else
if block.

Reviewers: nadav, craig.topper

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3518

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207992 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-05 19:36:28 +00:00
Michael Zolotukhin
830195e5fb Move test from r207969 to another folder and rename it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207984 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-05 18:10:15 +00:00
Rafael Espindola
c6e42b5590 Remove the -disable-cfi option.
This also add a release note about it. If this stays I will cleanup MC
next week.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207977 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-05 17:33:26 +00:00
Rafael Espindola
3b4c8c2b2a Modify test to not use -disable-cfi.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207974 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-05 16:47:07 +00:00
Rafael Espindola
260b6b05b9 Convert a CodeGen test into a MC test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207971 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-05 15:34:13 +00:00
Saleem Abdulrasool
8d538f1122 CodeGen: correct memset emittance for WoA
Windows on ARM does not conform to AEABI.  However, memset would be emitted
using the AEABI signature, resulting in inverted parameters.  Handle this
special case appropriately.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207943 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-04 23:13:21 +00:00
Saleem Abdulrasool
4fc5273a49 CodeGen: strengthen WoA AEABI avoidance tests
Add additional test cases for WoA AEABI avoidance checking.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207942 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-04 23:13:18 +00:00
Elena Demikhovsky
8a3751f813 AVX-512: minor change in rndscale intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207937 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-04 13:35:37 +00:00
Saleem Abdulrasool
f3b2ed7498 X86: repair export compatibility with MinGW/cygwin
Both MinGW and cygwin (i686) construct export directives without the global
leader prefix.  This is mostly due to the fact that they use GNU ld which does
not correctly handle the export directive.  This apparently has been been broken
for a while.  However, this was recently reported as being broken by
mingwandroid and diorcety of the msys2 project.

Remove the global leader prefix if targeting MinGW or cygwin, otherwise, retain
the global leader prefix.  Add an explicit test for cygwin's behaviour of export
directives.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207926 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-04 00:03:48 +00:00
Joey Gouly
72e96a51bf [ARM64] Correctly select ANDWri in FastISel.
http://reviews.llvm.org/D3598


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207917 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-03 17:27:06 +00:00
Tim Northover
b20252764d DAGCombine: prevent formation of illegal ConstantFP nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207850 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-02 17:25:02 +00:00