Commit Graph

10980 Commits

Author SHA1 Message Date
Bill Schmidt
808d878a96 [PPC64] Fix PR20071 (fctiduz generated for targets lacking that instruction)
PR20071 identifies a problem in PowerPC's fast-isel implementation for
floating-point conversion to integer.  The fctiduz instruction was added in
Power ISA 2.06 (i.e., Power7 and later).  However, this instruction is being
generated regardless of which 64-bit PowerPC target is selected.

The intent is for fast-isel to punt to DAG selection when this instruction is
not available.  This patch implements that change.  For testing purposes, the
existing fast-isel-conversion.ll test adds a RUN line for -mcpu=970 and tests
for the expected code generation.  Additionally, the existing test
fast-isel-conversion-p5.ll was found to be incorrectly expecting the
unavailable instruction to be generated.  I've removed these test variants
since we have adequate coverage in fast-isel-conversion.ll.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211627 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 20:05:18 +00:00
Robert Khasanov
031ad1b930 vpblend intrinsics combines as shifts intrinsics due to absence return stmt between them
Fix PR20088

Differential Revision: http://reviews.llvm.org/D4277


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211617 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 18:08:04 +00:00
Weiming Zhao
c33b4883b3 Resubmit commit r211533
"Fix PR20056: Implement pseudo LDR <reg>, =<literal/label> for AArch64"
Missed files are added in this commit.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211605 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 16:21:38 +00:00
Christian Pirker
01c8340c3d ARM: Fix TPsoft for Thumb mode
Reviewed at http://reviews.llvm.org/D4230



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211601 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 15:45:59 +00:00
Kevin Qin
8c0787e83a [AArch64] Fix a build_vector pattern match fail
caused by defect in isBuildVectorAllZeros().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211567 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 05:37:27 +00:00
Juergen Ributzka
20732d55c2 [FastISel][X86] Lower unsupported selects to control-flow.
The extends the select lowering coverage by emiting pseudo cmov
instructions. These insturction will be later on lowered to control-flow to
simulate the select.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211545 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 21:55:44 +00:00
Juergen Ributzka
d0976a3d20 [FastISel][X86] Add support for floating-point select.
This extends the select lowering to support floating-point selects. The
lowering depends on SSE instructions and that the conditon comes from a
floating-point compare. Under this conditions it is possible to emit an
optimized instruction sequence that doesn't require any branches to
simulate the select.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211544 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 21:55:40 +00:00
Juergen Ributzka
5f4e6e1ec0 [FastISel][X86] Optimize selects when the condition comes from a compare.
Optimize the select instructions sequence to use the EFLAGS directly from a
compare when possible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211543 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 21:55:36 +00:00
Rafael Espindola
5e761eb4ae [Mips] Add a target streamer when creating a null streamer.
Should fix DebugInfo/global.ll on the mips bot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211527 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 19:43:40 +00:00
Matt Arsenault
ed143b7c0c R600/SI: Fix div_scale intrinsic.
The operand that must match one of the others does matter,
and implement selecting for it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211523 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 18:28:28 +00:00
Christian Pirker
737f207468 ARMEB: Vector extend operations
Reviewed at http://reviews.llvm.org/D4043



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211520 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 18:05:53 +00:00
Matt Arsenault
9ad2c7ef92 R600: Move add/sub with overflow out of AMDILISelLowering
Add more tests for these.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211517 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 18:00:49 +00:00
Matt Arsenault
c4471e9248 R600/SI: Handle i64 sub.
We can handle it the same way as add

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211514 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 18:00:38 +00:00
Ulrich Weigand
9a154bfe94 [PowerPC] Allow stack frames without parameter save area
The PPCFrameLowering::determineFrameLayout routine currently ensures
that every function that allocates a stack frame provides space for the
parameter save area (via PPCFrameLowering::getMinCallFrameSize).

This is actually not necessary.  There may be functions that never call
another routine but still allocate a frame; those do not require the
parameter save area.  In the future, with the ELFv2 ABI, even some
routines that do call other functions do not need to allocate the
parameter save area.

While it is not a bug to allocate the parameter area when it is not
needed, it is better to avoid it to save stack space.

Note that when any particular function call requires the parameter save
area, this space will already have been included by ABI code in the size
the CALLSEQ_START insn is annotated with, and therefore included in the
size returned by MFI->getMaxCallFrameSize().

This means that determineFrameLayout simply does not need to care about
the parameter save area.  (It still needs to ensure that every frame
provides the linkage area.)  This is implemented by this patch.

Note that this exposed a bug in the new fast-isel code where the parameter
area was *not* included in the CALLSEQ_START size; this is also fixed.

A couple of test cases needed to be adapted for the new (smaller) stack
frame size those tests now see.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211495 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 13:47:52 +00:00
Ulrich Weigand
fdb6eb65c7 [PowerPC] Fix on-stack AltiVec arguments with 64-bit SVR4
Current 64-bit SVR4 code seems to have some remnants of Darwin code
in AltiVec argument handing.  This had the effect that AltiVec arguments
(or subsequent arguments) were not correctly placed in the parameter area
in some cases.

The correct behaviour with the 64-bit SVR4 ABI is:
- All AltiVec arguments take up space in the parameter area, just like
  any other arguments, whether vararg or not.
- They are always 16-byte aligned, skipping a parameter area doubleword
  (and the associated GPR, if any), if necessary.

This patch implements the correct behaviour and adds a test case.
(Verified against GCC behaviour via the ABI compat test suite.)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211492 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 12:36:34 +00:00
NAKAMURA Takumi
9124b45918 Revert r211399, "Generate native unwind info on Win64"
It broke Legacy JIT Tests on x86_64-{mingw32|msvc}, aka Windows x64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211480 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-22 22:00:56 +00:00
Jan Vesely
728ea0c91b R600: Add udivrem test
v2: move < %s to the end of the line
    space after ;
    add v4i32 test

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211476 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-22 21:42:58 +00:00
Filipe Cabecinhas
7798d5992a Fix PR20087 by using the source index when changing the vector load
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211472 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-22 17:21:37 +00:00
Benjamin Kramer
636a9bece4 Legalizer: Add support for splitting insert_subvectors.
We handle this by spilling the whole thing to the stack and doing the
insertion as a store.

PR19492. This happens in real code because the vectorizer creates v2i128 when AVX is enabled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211435 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-21 12:56:42 +00:00
Andrea Di Biagio
5d0ff9c928 [X86] Add ISel patterns to select SSE3/AVX ADDSUB instructions.
This patch adds ISel patterns to select SSE3/AVX ADDSUB instructions
from a sequence of "vadd + vsub + blend".

Example:

///
typedef float float4 __attribute__((ext_vector_type(4)));

float4 foo(float4 A, float4 B) {
  float4 X = A - B;
  float4 Y = A + B;
  return (float4){X[0], Y[1], X[2], Y[3]};
}
///

Before this patch, (with flag -mcpu=corei7) llc produced the following
assembly sequence:
  movaps  %xmm0, %xmm2
  addps   %xmm1, %xmm2
  subps   %xmm1, %xmm0
  blendps $10, %xmm2, %xmm0


With this patch, we now get a single
  addsubps  %xmm1, %xmm0



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211427 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-21 01:31:15 +00:00
Reid Kleckner
5b8e73ef81 Generate native unwind info on Win64
This patch enables LLVM to emit Win64-native unwind info rather than
DWARF CFI.  It handles all corner cases (I hope), including stack
realignment.

Because the unwind info is not flexible enough to describe stack frames
with a gap of unknown size in the middle, such as the one caused by
stack realignment, I modified register spilling code to place all spills
into the fixed frame slots, so that they can be accessed relative to the
frame pointer.

Patch by Vadim Chugunov!

Reviewed By: rnk

Differential Revision: http://reviews.llvm.org/D4081

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211399 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-20 20:35:47 +00:00
Tom Stellard
c0bf939e80 R600/SI: Add patterns for ctpop inside a branch
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211378 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-20 17:06:11 +00:00
Tom Stellard
61d64acd0c R600/SI: Add a pattern for f32 ftrunc
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211377 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-20 17:06:09 +00:00
Tom Stellard
2cda6e8ca6 R600: Expand vector flog2
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211376 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-20 17:06:07 +00:00
Tom Stellard
2d245e2da4 R600: Expand vector fexp2
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211375 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-20 17:06:05 +00:00
Tom Stellard
538c95179c R600/SI: Add a VALU pattern for i64 xor
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211373 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-20 17:05:57 +00:00
Ulrich Weigand
69e4786797 [PowerPC] Fix small argument stack slot offset for LE
When small arguments (structures < 8 bytes or "float") are passed in a
stack slot in the ppc64 SVR4 ABI, they must reside in the least
significant part of that slot.  On BE, this means that an offset needs
to be added to the stack address of the parameter, but on LE, the least
significant part of the slot has the same address as the slot itself.

This changes the PowerPC back-end ABI code to only add the small
argument stack slot offset for BE.  It also adds test cases to verify
the correct behavior on both BE and LE.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211368 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-20 16:34:05 +00:00
Rafael Espindola
e54c32ebc6 Move test so that it is skipped if the ARM target is not enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211366 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-20 15:30:38 +00:00
Oliver Stannard
e5241cc488 Emit the ARM build attributes ABI_PCS_wchar_t and ABI_enum_size.
Emit the ARM build attributes ABI_PCS_wchar_t and ABI_enum_size based on
module flags metadata.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211349 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-20 10:08:11 +00:00
Zoran Jovanovic
a5efeb6b39 ps][mips64r6] Added LSA/DLSA instructions
Differential Revision: http://reviews.llvm.org/D3897


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211346 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-20 09:28:09 +00:00
Alp Toker
d06976aba7 Fix typos
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211304 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-19 19:41:26 +00:00
Andrea Di Biagio
cfdf805286 [X86] Teach how to combine horizontal binop even in the presence of undefs.
Before this change, the backend was unable to fold a build_vector dag
node with UNDEF operands into a single horizontal add/sub.

This patch teaches how to combine a build_vector with UNDEF operands into a
horizontal add/sub when possible. The algorithm conservatively avoids to combine
a build_vector with only a single non-UNDEF operand.

Added test haddsub-undef.ll to verify that we correctly fold horizontal binop
even in the presence of UNDEFs.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211265 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-19 10:29:41 +00:00
Matt Arsenault
64429cefba R600: Add a few tests I forgot to add.
These belong with r210827

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211253 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-19 04:24:43 +00:00
Matt Arsenault
d9b35435b8 R600/SI: Add intrinsics for various math instructions.
These will be used for custom lowering and for library
implementations of various math functions, so it's useful
to expose these as builtins.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211247 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-19 01:19:19 +00:00
Matt Arsenault
ce09bda96e R600: Handle fnearbyint
The difference from rint isn't really relevant here,
so treat them as equivalent. OpenCL doesn't have nearbyint,
so this is sort of pointless other than for completeness.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211229 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 22:03:45 +00:00
Marek Olsak
f286d63757 R600/SI: add gather4 and getlod intrinsics (v3)
This contains all the previous patches + getlod support on top of it.
It doesn't use SDNodes anymore, so it's quite small.
It also adds v16i8 to SReg_128, which is used for the sampler descriptor.

Reviewed-by: Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211228 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 22:00:29 +00:00
Jan Vesely
52b6c2d6ef R600: Expand vector fceil
Move fp64 fceil tests to fceil64.ll

v2: rebase

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211194 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 17:57:29 +00:00
Ulrich Weigand
0c57babfc6 [PowerPC] Simplify and improve loading into TOC register
During an indirect function call sequence on the 64-bit SVR4 ABI,
generate code must load and then restore the TOC register.

This does not use a regular LOAD instruction since the TOC
register r2 is marked as reserved.  Instead, the are two
special instruction patterns:

 let RST = 2, DS = 2 in
 def LDinto_toc: DSForm_1a<58, 0, (outs), (ins g8rc:$reg),
                     "ld 2, 8($reg)", IIC_LdStLD,
                     [(PPCload_toc i64:$reg)]>, isPPC64;
 
 let RST = 2, DS = 10, RA = 1 in
 def LDtoc_restore : DSForm_1a<58, 0, (outs), (ins),
                     "ld 2, 40(1)", IIC_LdStLD,
                     [(PPCtoc_restore)]>, isPPC64;

Note that these not only restrict the destination of the
load to r2, but they also restrict the *source* of the
load to particular address combinations.  The latter is
a problem when we want to support the ELFv2 ABI, since
there the TOC save slot is no longer at 40(1).

This patch replaces those two instructions with a single
instruction pattern that only hard-codes r2 as destination,
but supports generic addresses as source.  This will allow
supporting the ELFv2 ABI, and also helps generate more
efficient code for calls to absolute addresses (allowing
simplification of the ppc64-calls.ll test case).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211193 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 17:52:49 +00:00
Ulrich Weigand
496e7ea119 [PowerPC] Add back test case for absolute calls (removed in r211174)
As requested by Hal Finkel, this adds back a test for calls to
a known-constant function pointer value, and verifies that the
64-bit SVR4 indirect function call sequence is used.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211190 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 17:28:56 +00:00
Arnold Schwaighofer
5d5ddf9663 Add a triple so that right syntax is choosen on mac osx systems
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211188 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 17:20:49 +00:00
Matt Arsenault
2b6e6fc1a8 R600/SI: Add intrinsics for brev instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211187 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 17:13:57 +00:00
Matt Arsenault
795ae8615f R600/SI: Prettier operand printing for 64-bit ops.
Copy what is done for 32-bit already so the order is about the same.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211186 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 17:13:51 +00:00
Matheus Almeida
95f1fa7ec3 [mips] SYNC $stype instruction was added in Mips32
but SYNC with an implied operand ($stype = 0) is valid since Mips2.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211185 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 17:10:30 +00:00
Matt Arsenault
debd831223 R600: Implement f64 ftrunc, ffloor and fceil.
CI has instructions for these, so this fixes them for older hardware.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211183 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 17:05:30 +00:00
Matt Arsenault
a5395c03f0 R600: Custom lower f64 frint for pre-CI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211182 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 17:05:26 +00:00
Adam Nemet
f1b790f791 [X86] AVX512: Add non-temporal stores
Note that I followed the AVX2 convention here and didn't add LLVM intrinsics
for stores.  These can be generated with the nontemporal hint on LLVM IR
stores (see new test). The GCC builtins are lowered directly into nontemporal
stores.

<rdar://problem/17082571>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211176 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 16:51:10 +00:00
Ulrich Weigand
336da8cdc5 [PowerPC] Do not use BLA with the 64-bit SVR4 ABI
The PowerPC back-end uses BLA to implement calls to functions at
known-constant addresses, which is apparently used for certain
system routines on Darwin.

However, with the 64-bit SVR4 ABI, this is actually incorrect.
An immediate function pointer value on this platform is not
directly usable as a target address for BLA:
- in the ELFv1 ABI, the function pointer value refers to the
  *function descriptor*, not the code address
- in the ELFv2 ABI, the function pointer value refers to the
  global entry point, but BL(A) would only be correct when
  calling the *local* entry point

This bug didn't show up since using immediate function pointer
values is not usually done in the 64-bit SVR4 ABI in the first
place.  However, I ran into this issue with a certain use case
of LLVM as JIT, where immediate function pointer values were
uses to implement callbacks from JITted code to helpers in
statically compiled code.

Fixed by simply not using BLA with the 64-bit SVR4 ABI.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211174 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 16:14:04 +00:00
Cameron McInally
c52345c0fc Add pattern for unsigned v4i32->v4f64 convert on AVX512.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211164 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 14:04:37 +00:00
Jan Vesely
c32d52df24 R600: Implement 64bit SRA
v2: Use capitalized variable name

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211159 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 12:27:17 +00:00
Jan Vesely
2d06e73d88 R600: Implement 64bit SRL
v2: use C++ style comment

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211158 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 12:27:15 +00:00
Jan Vesely
a64058f3eb R600: Implement 64bit SHL
v2: Use c++ style comment

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211157 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 12:27:13 +00:00
Tim Northover
5393254646 DAG: move sret demotion into most basic LowerCallTo implementation.
It looks like there are two versions of LowerCallTo here: the
SelectionDAGBuilder one is designed to operate on LLVM IR, and the
TargetLowering one in the case where everything is at DAG level.

Previously, only the SelectionDAGBuilder variant could handle demoting
an impossible return to sret semantics (before delegating to the
TargetLowering version), but this functionality is also useful for
certain libcalls (e.g. 128-bit operations on 32-bit x86).  So this
commit moves the sret handling down a level.

rdar://problem/17242889

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211155 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 11:52:44 +00:00
Kevin Qin
74287ec34c [AArch64] Fix a pattern match failure caused by creating improper CONCAT_VECTOR.
ReconstructShuffle() may wrongly creat a CONCAT_VECTOR trying to
concat 2 of v2i32 into v4i16. This commit is to fix this issue and
try to generate UZP1 instead of lots of MOV and INS.
Patch is initalized by Kevin Qin, and refactored by Tim Northover.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211144 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 05:54:42 +00:00
Louis Gerbarg
41b33299cf Allow X86FastIsel to cope with 64 bit absolute relocations
This patch is a follow up to r211040 & r211052. Rather than bailing out of fast
isel this patch will generate an alternate instruction (movabsq) instead of the
leaq. While this will always have enough room to handle the 64 bit displacment
it is generally over kill for internal symbols (most displacements will be
within 32 bits) but since we have no way of communicating the code model to the
the assmebler in order to avoid flagging an absolute leal/leaq as illegal when
using a symbolic displacement.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211130 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-17 23:22:41 +00:00
Juergen Ributzka
e8cb2ee1cd [FastISel][X86] Optimize predicates and fold CMP instructions.
This optimizes predicates for certain compares, such as fcmp oeq %x, %x to
fcmp ord %x, %x. The latter one is more efficient to generate.

The same optimization is applied to conditional branches.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211126 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-17 21:55:43 +00:00
Matt Arsenault
3f1f259c22 R600/SI: Match cttz_zero_undef
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211116 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-17 17:36:27 +00:00
Matt Arsenault
62e378b057 R600/SI: Match ctlz_zero_undef
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211115 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-17 17:36:24 +00:00
Tom Stellard
f56e7678d1 R600: Use LDS and vectors for private memory
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211110 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-17 16:53:14 +00:00
Tom Stellard
bae98b1b45 SelectionDAG: Expand i64 = FP_TO_SINT i32
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211108 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-17 16:53:07 +00:00
Juergen Ributzka
1d5ff6bb7a [FastISel][X86] Fix previous refactoring commit (r211077)
Overlooked that fcmp_une uses an "or" instead of an "and" for combining the
flags.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211104 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-17 14:47:45 +00:00
Tim Northover
c22960dba6 AArch64: estimate inline asm length during branch relaxation
To make sure branches are in range, we need to do a better job of estimating
the length of an inline assembly block than "it's probably 1 instruction, who'd
write asm with more than that?".

Fortunately there's already a (highly suspect, see how many ways you can think
of to break it!) callback for this purpose, which is used by the other targets.

rdar://problem/17277590

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211095 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-17 11:31:42 +00:00
Juergen Ributzka
408691f967 [FastISel][X86] Refactor the code to get the X86 condition from a helper function. NFC.
Make use of helper functions to simplify the branch and compare instruction
selection in FastISel. Also add test cases for compare and conditonal branch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211077 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-16 23:58:24 +00:00
Reed Kotler
8b3a8d6343 Add load/store functionality
Summary:
This patches allows non conversions like i1=i2; where both are global ints.
In addition, arithmetic and other things start to work since fast-isel will use
existing patterns for non fast-isel from tablegen files where applicable.

In addition i8, i16 will work in this limited context for assignment without the need
for sign extension (zero or signed). It does not matter how i8 or i16 are loaded (zero or sign extended)
since only the 8 or 16 relevant bits are used and clang will ask for sign extension before using them in
arithmetic. This is all made more complete in forthcoming patches.

for example:
  int i, j=1, k=3;
 
  void foo() {
    i = j + k;
  }

Keep in mind that this pass is not enabled right now and is an experimental pass
It can only be enabled with a hidden option to llvm of -mips-fast-isel.

Test Plan: Run test-suite, loadstore2.ll and I will run some executable tests.

Reviewers: dsanders

Subscribers: mcrosier

Differential Revision: http://reviews.llvm.org/D3856

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211061 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-16 22:05:47 +00:00
Bill Schmidt
212ec3a739 [PPC64] Fix PR19893 - improve code generation for local function addresses
Rafael opened http://llvm.org/bugs/show_bug.cgi?id=19893 to track non-optimal
code generation for forming a function address that is local to the compile
unit.  The existing code was treating both local and non-local functions
identically.

This patch fixes the problem by properly identifying local functions and
generating the proper addis/addi code.  I also noticed that Rafael's earlier
changes to correct the surrounding code in PPCISelLowering.cpp were also
needed for fast instruction selection in PPCFastISel.cpp, so this patch
fixes that code as well.

The existing test/CodeGen/PowerPC/func-addr.ll is modified to test the new
code generation.  I've added a -O0 run line to test the fast-isel code as
well.

Tested on powerpc64[le]-unknown-linux-gnu with no regressions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211056 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-16 21:36:02 +00:00
Tim Northover
73b142e656 ARM: implement correct atomic operations on v7M
ARM v7M has ldrex/strex but not ldrexd/strexd. This means 32-bit
operations should work as normal, but 64-bit ones are almost certainly
doomed.

Patch by Phoebe Buckheister.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211042 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-16 18:49:36 +00:00
Louis Gerbarg
a564159d85 Fix illegal relocations in X86FastISel
On x86_86  the lea instruction can only use a 32 bit immediate value. When
the code is compiled statically the RIP register is not used, meaning the
immediate is all that can be used for the relocation, which is not sufficient
in the case of targets more than +/- 2GB away. This patch bails out of fast
isel in those cases and reverts to DAG which does the right thing.

Test case included.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211040 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-16 17:35:40 +00:00
Cameron McInally
5d57928a32 Hook up vector int_ctlz for AVX512.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211024 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-16 14:12:28 +00:00
Daniel Sanders
77ae274ae7 [mips][mips64r6] cl[oz], and dcl[oz] are re-encoded in MIPS32r6/MIPS64r6
Summary:
There is no change to the restrictions, just the result register is stored
once in the encoding rather than twice. The rt field is zero in
MIPS32r6/MIPS64r6.

Depends on D4119

Reviewers: zoran.jovanovic, jkolek, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D4120

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211019 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-16 13:18:59 +00:00
Daniel Sanders
af0d72a6f9 [mips][mips64r6] ll, sc, lld, and scd are re-encoded on MIPS32r6/MIPS64r6.
Summary:
The linked-load, store-conditional operations have been re-encoded such
that have a 9-bit offset instead of the 16-bit offset they have prior to
MIPS32r6/MIPS64r6.

While implementing this, I noticed that the atomic load/store pseudos always
emit a sign extension using sll and sra. I have improved this to use seb/seh
when they are available (MIPS32r2/MIPS64r2 and above).

Depends on D4118

Reviewers: jkolek, zoran.jovanovic, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D4119

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211018 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-16 13:13:03 +00:00
James Molloy
b3820b4289 [AArch64] Fix a fencepost error in lowering for llvm.aarch64.neon.uqshl.
Patch by Jiangning Liu!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211014 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-16 10:39:21 +00:00
Daniel Sanders
b0e4096481 [mips] Merge most of the big/little endian checks in atomic.ll
Summary:
There is very little difference between the big and little endian cases in
test/CodeGen/Mips/atomic.ll. Merge them together using multiple
FileCheck prefixes.

Depends on D4117

Reviewers: jkolek, zoran.jovanovic, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D4118

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211013 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-16 10:25:17 +00:00
Christian Pirker
467e6ad2e5 ARMEB: Fix trunc store for vector types
Reviewed at http://reviews.llvm.org/D4135



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211010 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-16 09:17:30 +00:00
Jingyue Wu
f6eb7e3175 Canonicalize addrspacecast ConstExpr between different pointer types
As a follow-up to r210375 which canonicalizes addrspacecast
instructions, this patch canonicalizes addrspacecast constant
expressions.

Given clang uses ConstantExpr::getAddrSpaceCast to emit addrspacecast
cosntant expressions, this patch is also a step towards having the
frontend emit canonicalized addrspacecasts.

Piggyback a minor refactor in InstCombineCasts.cpp

Update three affected tests in addrspacecast-alias.ll,
access-non-generic.ll and constant-fold-gep.ll and added one new test in
constant-fold-address-space-pointer.ll


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211004 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-15 21:40:57 +00:00
Matt Arsenault
3c11f69001 R600: Add a rotr testcase I forgot to add
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211002 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-15 21:09:00 +00:00
Matt Arsenault
fa848ccd09 R600: Remove a few more things from AMDILISelLowering
Try to keep all the setOperationActions for integer ops
together.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211001 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-15 21:08:58 +00:00
Matt Arsenault
e2480a202f R600: Fix assert on vector sdiv
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211000 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-15 21:08:54 +00:00
Matt Arsenault
61bfbc4d96 R600: Report that integer division is expensive.
Divides by weird constants now emit much better code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210995 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-15 19:48:16 +00:00
Tim Northover
8bfc50e4a9 AArch64: improve handling & modelling of FP_TO_XINT nodes.
There's probably no acatual change in behaviour here, just updating
the LowerFP_TO_INT function to be more similar to the reverse
implementation and updating costs to current CodeGen.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210985 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-15 09:27:15 +00:00
Tim Northover
94fe5c1fe2 AArch64: improve vector [su]itofp handling.
This somehow got missed in the AArch64 merge, so should fix a
performance regression since 3.4.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210984 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-15 09:27:06 +00:00
NAKAMURA Takumi
e12d893d7a Don't expect tests always crashing. Add "REQUIRES:asserts".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210983 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-15 01:01:11 +00:00
Matt Arsenault
7a3a020478 R600: Add failing testcases.
These are reduced from assert in the
OpenCV CvtColor8u.BGR5652GRAY test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210969 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-14 04:26:09 +00:00
Matt Arsenault
aac455af99 R600: Fix asserts related to constant initializers
This would assert if a constant address space was extern
and therefore didn't have an initializer. If the initializer
was undef, it would hit the unreachable unhandled initializer case.

An extern global should never really occur since we don't have
machine linking, but bugpoint likes to remove initializers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210967 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-14 04:26:05 +00:00
Jiangning Liu
c5bc067a0f Move GlobalMerge from Transform to CodeGen.
This patch is to move GlobalMerge pass from Transform/Scalar                                                           
to CodeGen, because GlobalMerge depends on TargetMachine.
In the mean time, the macro INITIALIZE_TM_PASS is also moved
to CodeGen/Passes.h. With this fix we can avoid making
libScalarOpts depend on libCodeGen.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210951 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-13 22:57:59 +00:00
David Blaikie
0fcb9cb1c1 DebugInfo: Following up to r209677, refactor local variable emission to delay the choice between emitting the definition attributes or using DW_AT_abstract_definition
This doesn't fix the abstract variable handling yet, but it introduces a
similar delay mechanism as was added for subprograms, causing
DW_AT_location to be reordered to the beginning of the attribute list
for local variables, and fixes all the test fallout for that.

A subsequent commit will remove the abstract variable handling in
DbgVariable and just do the abstract variable lookup at module end to
ensure that abstract variables introduced after their concrete
counterparts are appropriately referenced by the concrete variable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210943 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-13 22:18:23 +00:00
Tim Northover
eee7a7a836 X86: lower ATOMIC_CMP_SWAP_WITH_SUCCESS directly
Lowering this new node allows us to fold the almost universal
comparison for success before it's even formed. Instead we can create
a copy from EFLAGS and an X86ISD::SETCC operation since all "cmpxchg"
instructions set the zero-flag to the correct value.

rdar://problem/13201607

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210923 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-13 17:29:39 +00:00
Tim Northover
33fe993f2e Atomics: make use of the "cmpxchg weak" instruction.
This also simplifies the IR we create slightly: instead of working out
where success & failure should go manually, it turns out we can just
always jump to a success/failure block created for the purpose. Later
phases will sort out the mess without much difficulty.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210917 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-13 16:45:52 +00:00
Tim Northover
8f2a85e099 IR: add "cmpxchg weak" variant to support permitted failure.
This commit adds a weak variant of the cmpxchg operation, as described
in C++11. A cmpxchg instruction with this modifier is permitted to
fail to store, even if the comparison indicated it should.

As a result, cmpxchg instructions must return a flag indicating
success in addition to their original iN value loaded. Thus, for
uniformity *all* cmpxchg instructions now return "{ iN, i1 }". The
second flag is 1 when the store succeeded.

At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been
added as the natural representation for the new cmpxchg instructions.
It is a strong cmpxchg.

By default this gets Expanded to the existing ATOMIC_CMP_SWAP during
Legalization, so existing backends should see no change in behaviour.
If they wish to deal with the enhanced node instead, they can call
setOperationAction on it. Beware: as a node with 2 results, it cannot
be selected from TableGen.

Currently, no use is made of the extra information provided in this
patch. Test updates are almost entirely adapting the input IR to the
new scheme.

Summary for out of tree users:
------------------------------

+ Legacy Bitcode files are upgraded during read.
+ Legacy assembly IR files will be invalid.
+ Front-ends must adapt to different type for "cmpxchg".
+ Backends should be unaffected by default.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210903 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-13 14:24:07 +00:00
Cameron McInally
1d5e37946d Fix bad copy-and-paste from r210652. AVX512 masked leading zero intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210901 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-13 13:20:01 +00:00
NAKAMURA Takumi
c5865cb4fc llvm/test/CodeGen/X86/fast-isel-args-fail2.ll: Don't expect to fail with -Asserts. It might or might not crash.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210894 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-13 12:05:06 +00:00
Tim Northover
d5d864b553 CPP backend: set volatile property on atomic instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210890 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-13 09:14:50 +00:00
Oliver Stannard
a19c2d4a6d ARM: Fix fastcc calling convention for Thumb1
When targetting Thumb1 on a processor which has a VFP unit (which
is not accessible from Thumb1), we were converting the fastcc calling
convention to AAPCS-VFP, which is not possible.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210889 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-13 08:33:03 +00:00
Matt Arsenault
d344c6bcf9 R600/SI: Fix selection error on i64 rotl / rotr.
Evergreen is still broken due to missing shl_parts.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210885 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-13 04:00:30 +00:00
Juergen Ributzka
e431884ed7 [FastISel][X86] Add support for cvttss2si/cvttsd2si intrinsics.
This adds support for the cvttss2si/cvttsd2si intrinsics. Preceding
insertelement instructions are folded into the conversion instruction (if
possible).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210870 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-13 02:21:58 +00:00
Juergen Ributzka
7f8d138f50 [FastISel][X86] - Add branch weights
Add branch weights to branch instructions, so that the following passes can
optimize based on it (i.e. basic block ordering).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210863 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-13 00:45:11 +00:00
Juergen Ributzka
4eddf94a14 [FastISel][X86] Add MachineMemOperand to load/store instructions.
This commit adds MachineMemOperands to load and store instructions. This allows
the peephole optimizer to fold load instructions. Unfortunatelly the peephole
optimizer currently doesn't run at -O0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210858 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 23:27:57 +00:00
Juergen Ributzka
cc738f8367 Update test case to use "not" instead of "XFAIL".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210829 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 21:17:40 +00:00
Matt Arsenault
00c3986254 R600: Mostly remove remaining AMDIL intrinsics.
Delete all unused ones, and add new AMDGPU named intrinsics for
the ones that are. Handle the old AMDIL names for comptability (although
remove their GCCBuiltin names) and add tests since there weren't any
for these before.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210827 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 21:15:44 +00:00
Juergen Ributzka
c7147a3b6d [FastISel][X86] Argument lowering test case
This test case is supposed to xfail, because we do not handle structs or byval
arguments.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210816 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 20:34:09 +00:00
Juergen Ributzka
3f2b28dcaf [FastIsel][X86] Add support for lowering the first 8 floating-point arguments.
Recommit with fixed argument attribute checking code, which is required to bail
out of all the cases we don't handle yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210815 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 20:12:34 +00:00
Saleem Abdulrasool
3455c40f91 CodeGen: enable mov.w/mov.t pairs with minsize for WoA
Windows on ARM uses COFF/PE which is intrinsically position independent.  For
the case of 32-bit immediates, use a pair-wise relocation as otherwise we may
exceed the range of operators.  This fixes a code generation crash when using
-Oz when targeting Windows on ARM.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210814 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 20:06:33 +00:00
Juergen Ributzka
a15b05e1aa Revert "[FastIsel][X86] Add support for lowering the first 8 floating-point arguments."
Reverting it because it breaks several tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210810 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 19:21:43 +00:00
Tom Stellard
82a51defb6 Revert "SelectionDAG: Enable (and (setcc x), (setcc y)) -> (setcc (and x, y)) for vectors"
This reverts commit r210540, adds a testcase for the regression it
caused, and marks the R600 test it was supposed to fix as XFAIL.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210792 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 16:04:47 +00:00
James Molloy
5eba90a861 Disable the load/store optimization pass for Thumb-1.
Moritz's changes have improved codegen a lot, but further testing showed significant correctness problems. Disable by default until these have been worked out.

Patch by Moritz Roth!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210789 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 15:18:33 +00:00
Daniel Sanders
1f4c755c2c [mips][mips64r6] bc1[tf] are not available on MIPS32r6/MIPS64r6
Summary:
Also tightened up the acceptable condition operand for these instructions
on MIPS-I to MIPS-III. Support for $fcc[1-7] was added in MIPS-IV. Prior
to that only $fcc0 is acceptable.

We currently don't optimize (BEQZ (NOT $a), $target) and similar. It's
probably best to do this in InstCombine.

Depends on D4111

Reviewers: jkolek, zoran.jovanovic, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D4112

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210787 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 15:00:17 +00:00
Daniel Sanders
28002c2f82 [mips][mips64r6] [sl][duw]xc1 are not available on MIPS32r6/MIPS64r6
Summary:
Folded mips64-fp-indexed-ls.ll into fp-indexed-ls.ll. To do so, the zext's in
mips64-fp-indexed-ls.ll were changed to implicit sign extensions (performed
by getelementptr). This does not affect the purpose of the test.

Depends on D4004

Reviewers: zoran.jovanovic, jkolek, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D4110

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210784 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 14:19:28 +00:00
Daniel Sanders
8007133f3e [mips][mips64r6] c.cond.fmt, mov[fntz], and mov[fntz].[ds] are not available on MIPS32r6/MIPS64r6
Summary:
c.cond.fmt has been replaced by cmp.cond.fmt. Where c.cond.fmt wrote to
dedicated condition registers, cmp.cond.fmt writes 1 or 0 to normal FGR's
(like the GPR comparisons).

mov[fntz] have been replaced by seleqz and selnez. These instructions
conditionally zero a register based on a bool in a GPR. The results can
then be or'd together to act as a select without, for example, requiring a third
register read port.

mov[fntz].[ds] have been replaced with sel.[ds]

MIPS64r6 currently generates unnecessary sign-extensions for most selects.
This is because the result of a SETCC is currently an i32. Bits 32-63 are
undefined in i32 and the behaviour of seleqz/selnez would otherwise depend
on undefined bits. Later, we will fix this by making the result of SETCC an
i64 on MIPS64 targets.

Depends on D3958

Reviewers: jkolek, vmedic, zoran.jovanovic

Reviewed By: vmedic, zoran.jovanovic

Differential Revision: http://reviews.llvm.org/D4003

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210777 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 13:39:06 +00:00
Daniel Sanders
14e97c4f51 [mips] Use MTHC1 when it is available (MIPS32r2 and later) for both FP32 and FP64
Summary:
To make this work for both AFGR64 and FGR64 register sets, I've had to make the
instruction definition consistent with the white lie (that it reads the lower
32-bits of the register) when they are generated by expandBuildPairF64().

Corrected the definition of hasMips32r2() and hasMips64r2() to include
MIPS32r6 and MIPS64r6.

Depends on D3956

Reviewers: jkolek, zoran.jovanovic, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3957

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210771 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 11:55:58 +00:00
Daniel Sanders
a61aa38ee1 [mips][mips64r6] madd.[ds], msub.[ds], nmadd.[ds], and nmsub.[ds] are not available on MIPS32r6/MIPS64r6
Summary:
This patch updates both the assembler and the code generator.

MIPS32r6/MIPS64r6 replaces them with maddf.[ds] and msubf.[ds] which are fused
multiply-add/sub operations. We don't emit these yet, this patch only prevents the removed instructions from being emitted.

Depends on D3955

Reviewers: jkolek, zoran.jovanovic, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3956

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210763 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 11:04:18 +00:00
Daniel Sanders
d94bc707c4 [mips][mips64r6] madd/maddu/msub/msubu are not available on MIPS32r6/MIPS64r6
Summary:
This patch disables madd/maddu/msub/msubu in both the assembler and code
generator.

Depends on D3896

Reviewers: jkolek, zoran.jovanovic, vmedic

Reviewed By: vmedic

Differential Revision: http://reviews.llvm.org/D3955

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210762 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 10:54:16 +00:00
Andrea Di Biagio
bf4e625cf1 [X86] Teach how to combine AVX and AVX2 horizontal binop on packed 256-bit vectors.
This patch adds target combine rules to match:
 - [AVX] Horizontal add/sub of packed single/double precision floating point
   values from 256-bit vectors;
 - [AVX2] Horizontal add/sub of packed integer values from 256-bit vectors.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210761 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 10:53:48 +00:00
Daniel Sanders
38b2a0bfdd [mips][mips64r6] Replace m[tf]hi, m[tf]lo, mult, multu, dmult, dmultu, div, ddiv, divu, ddivu for MIPS32r6/MIPS64.
Summary:
The accumulator-based (HI/LO) multiplies and divides from earlier ISA's have
been removed and replaced with GPR-based equivalents. For example:
  div $1, $2
  mflo $3
is now:
  div $3, $1, $2

This patch disables the accumulator-based multiplies and divides for
MIPS32r6/MIPS64r6 and uses the GPR-based equivalents instead.

Renamed expandPseudoDiv to insertDivByZeroTrap to better describe the
behaviour of the function.

MipsDelaySlotFiller now invalidates the liveness information when moving
instructions to the delay slot. Without this, divrem.ll will abort since
%GP ends up used before it is defined.

Reviewers: vmedic, zoran.jovanovic, jkolek

Reviewed By: jkolek

Differential Revision: http://reviews.llvm.org/D3896

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210760 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 10:44:10 +00:00
Matt Arsenault
0b87955888 R600/SI: Use a register set to -1 for data0 on ds_inc*/ds_dec*
There is not such thing as a 0-data ds instruction, and the data
operand needs to be a vgpr set to something meaningful.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210756 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 08:21:54 +00:00
Juergen Ributzka
729fc8facc [FastISel][x86] Add testcase for r210719.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210746 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 03:54:05 +00:00
Juergen Ributzka
f66c14f656 [x86] Improve frameaddress test from r210709.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210743 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 03:29:29 +00:00
Juergen Ributzka
2c9a12f081 [FastISel] Add support for the stackmap intrinsic.
This implements target-independent FastISel lowering for the stackmap intrinsic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210742 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-12 03:29:26 +00:00
Juergen Ributzka
02503401b4 [FastISel][X86] Add support for the sqrt intrinsic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210720 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 23:11:02 +00:00
Juergen Ributzka
a2d36a20fa [FastISel][X86] Add support for the frameaddress intrinsic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210709 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 21:44:44 +00:00
Chad Rosier
2e4cd27799 [AArch64] Basic Sched Model for Cortex-A57.
Patch by Dave Estes<cestes@codeaurora.org>
Differential Revision: http://reviews.llvm.org/D4008

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210705 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 21:06:56 +00:00
Matt Arsenault
7fa80b45eb R600/SI: Fix bitcast between v2i32 and f64
This is the same problem fixed in r210664 for more types.

The test passes without this fix. For some reason
I'm only hitting this when creating selects lowered
to v2i32 selects.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210692 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 19:31:13 +00:00
Matt Arsenault
c9dbd0da7a R600/SI: Add common 64-bit LDS atomics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210680 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 18:08:54 +00:00
Matt Arsenault
6b19a3a474 R600/SI: Add 32-bit LDS atomic cmpxchg
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210678 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 18:08:48 +00:00
Matt Arsenault
a396a70c1d R600/SI: Use LDS atomic inc / dec
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210677 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 18:08:45 +00:00
Matt Arsenault
2da1a85cbb R600/SI: Add other LDS atomic operations
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210676 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 18:08:42 +00:00
Matt Arsenault
4a19dd468d R600/SI: Fix backwards names for local atomic instructions.
The manual lists them as *_RTN_U32, not *_U32_RTN, which is more
consistent with how every other sized instruction is named.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210674 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 18:08:37 +00:00
Matt Arsenault
b97095b94f R600/SI: Refactor local atomics.
Use patterns that will also match the immediate offset to
match the normal read / writes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210673 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 18:08:34 +00:00
Matt Arsenault
8a9df8f92c R600/SI: Use v_cvt_f32_ubyte* instructions
This eliminates extra extract instructions when loading an i8 vector to
a float vector.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210666 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 17:50:44 +00:00
Matt Arsenault
a2dca4cc04 R600/SI: Fix selection failure on scalar_to_vector
There seem to be only 2 places that produce these,
and it's kind of tricky to hit them.

Also fixes failure to bitcast between i64 and v2f32,
although this for some reason wasn't actually broken in the
simple bitcast testcase, but did in the scalar_to_vector one.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210664 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 17:40:32 +00:00
Daniel Sanders
0ee5398b7f [mips][mips64r6] Improve tests affected by the changes to multiplies and divides
Summary:
MIPS32r6/MIPS64r6 support has not been added yet.

inlineasm-cnstrnt-reg.ll:
  Explicitly specify the CPU since it will not work on MIPS32r6/MIPS64r6
  when -integrated-as is the default. We can't change the mnemonic since the
  LO register is an implicit def of mtlo and MIPS32r6/MIPS64r6 has no
  instructions that use LO.

2008-08-01-AsmInline.ll:
  Explicitly specify the CPU since MIPS32r6/MIPS64r6 will correctly emit
  different code and this is a regression test.

mips64instrs.ll and mips64muldiv.ll
  Check registers and the way the multiply is used in m1

divrem.ll
  Check registers and use multiple filecheck prefixes to limit redundancy

Reviewers: vmedic, jkolek, zoran.jovanovic, matheusalmeida

Reviewed By: matheusalmeida

Subscribers: matheusalmeida

Differential Revision: http://reviews.llvm.org/D3894

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210656 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 15:48:00 +00:00
Cameron McInally
998d8f50a7 Add AVX512 masked leadz instrinsic support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210652 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 12:54:45 +00:00
Andrea Di Biagio
a069e64112 [X86] Refactor the logic to select horizontal adds/subs to a helper function.
This patch moves part of the logic implemented by the target specific
combine rules added at r210477 to a separate helper function.
This should make easier to add more rules for matching AVX/AVX2 horizontal
adds/subs.

This patch also fixes a problem caused by a wrong check performed on indices
of extract_vector_elt dag nodes in input to the scalar adds/subs.

New tests have been added to verify that we correctly check indices of
extract_vector_elt dag nodes when selecting a horizontal operation.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210644 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 07:57:50 +00:00
Jiangning Liu
f847ccb87a Global merge for global symbols.
This commit is to improve global merge pass and support global symbol merge.
The global symbol merge is not enabled by default. For aarch64, we need some
more back-end fix to make it really benifit ADRP CSE.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210640 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-11 06:44:53 +00:00
Juergen Ributzka
0adbcf3ba9 [FastISel][X86] Extend support for {s|u}{add|sub|mul}.with.overflow intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210610 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 23:52:44 +00:00
Matt Arsenault
1f4772305a R600: Use BCNT_INT for evergreen
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210569 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 19:18:28 +00:00
Matt Arsenault
69891c0115 R600/SI: Implement i64 ctpop
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210568 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 19:18:24 +00:00
Matt Arsenault
ee9772d9dd R600/SI: Use bcnt instruction for ctpop
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210567 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 19:18:21 +00:00
Matt Arsenault
bfd00e21b7 R600: Handle fcopysign
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210564 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 19:00:20 +00:00
Matt Arsenault
0ba78a9121 R600/SI: Handle sign_extend and zero_extend to i64 with patterns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210563 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 18:54:59 +00:00
Reed Kotler
805c9e4943 Do Materialize Floating Point in Mips Fast-Isel
Summary:
Implement materialize of floating point literals in Mips Fast-Isel

Reopened version of D3659

Test Plan: simplestorefp1.ll

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D4071

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210546 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 16:45:44 +00:00
Andrea Di Biagio
c0edcf7de8 [X86] Improved target combine rules for selecting horizontal add/sub.
This patch slightly changes the algorithm introduced at revision 210477
to fix a problem where the algorithm was producing incorrect code for 
the VEX.256 encoded versions of horizontal add/sub.

For these cases, we now try to split the two 256-bit vectors into
128-bit chunks before emitting horizontal add/sub dag nodes.

Added a new test case into haddsub-2.ll.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210545 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 16:42:57 +00:00
Adam Nemet
8dea1c4167 [X86] AVX512: Add vmovntdqa
Along with the corresponding intrinsic and tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210543 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 16:39:53 +00:00
Renato Golin
2d89932fb2 Fix a bug in the Thumb1 ARM Load/Store optimizer
Previously, the basic block was searched for future uses of the base register,
and if necessary any writeback to the base register was reset using a SUB
instruction (e.g. before calling a function) just before such a use. However,
this step happened *before* the merged LDM/STM instruction was built. So if
there was (e.g.) a function call directly after the not-yet-formed LDM/STM,
the pass would first insert a SUB instruction to reset the base register,
and then (at the same location, incorrectly) insert the LDM/STM itself.

This patch fixes PR19972. Patch by Moritz Roth.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210542 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 16:39:21 +00:00
Tom Stellard
102d0f3e3f SelectionDAG: Don't use MVT::Other to determine legality of ISD::SELECT_CC
The SelectionDAG bad a special case for ISD::SELECT_CC, where it would
allow targets to specify:

setOperationAction(ISD::SELECT_CC, MVT::Other, Expand);

to indicate that they wanted to expand ISD::SELECT_CC for all types.
This wasn't applied correctly everywhere, and it makes writing new
DAG patterns with ISD::SELECT_CC difficult.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210541 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 16:01:29 +00:00
Tom Stellard
f586a260ca SelectionDAG: Expand SELECT_CC to SELECT + SETCC
This consolidates code from the Hexagon, R600, and XCore targets.

No functionality change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210539 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 16:01:22 +00:00
Bill Schmidt
b02d95cb66 [PPC64LE] Recognize shufflevector patterns for little endian
Various masks on shufflevector instructions are recognizable as
specific PowerPC instructions (vector pack, vector merge, etc.).
There is existing code in PPCISelLowering.cpp to recognize the correct
patterns for big endian code.  The masks for these instructions are
different for little endian code due to the big-endian numbering
employed by these instructions.  This patch adds the recognition code
for little endian.

I've added a new test case test/CodeGen/PowerPC/vec_shuffle_le.ll for
this.  The existing recognizer test (vec_shuffle.ll) is unnecessarily
verbose and difficult to read, so I felt it was better to add a new
test rather than modify the old one.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210536 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 14:35:01 +00:00
Chad Rosier
0db9526c1a [AArch64] Emit .ident compiler version attribute.
Patch by Ana Pazos<apazos@codeaurora.org>!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210535 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 14:32:08 +00:00
Tim Northover
e1db6ac10b AArch64: disallow x30 & x29 as the destination for indirect tail calls
As Ana Pazos pointed out, these have to be restored to their incoming values
before a function returns; i.e. before the tail call. So they can't be used
correctly as the destination register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210525 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 10:50:24 +00:00
Tim Northover
efbf7d1ceb Revert "X86: elide comparisons after cmpxchg instructions."
This reverts commit r210523. It was committed prematurely without waiting for
review.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210524 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 10:50:11 +00:00
Tim Northover
984ee65445 X86: elide comparisons after cmpxchg instructions.
The C++ and C semantics of the compare_and_swap operations actually
require us to return a boolean "success" value. In LLVM terms this
means a second comparison of the output of "cmpxchg" against the input
desired value.

However, x86's "cmpxchg" instruction sets all flags for the comparison
formed, so we can skip any secondary comparison. (N.b. this isn't true
for cmpxchg8b/16b, which only set ZF).

rdar://problem/13201607

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210523 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 10:49:07 +00:00
Tim Northover
46b3076cd0 AArch64: teach FastISel how to handle offset FrameIndices
Previously we were abandonning the attempt, leading to some combination of
extra work (when selection of a load/store fails completely) and inferior code
(when this leads to a real memcpy call instead of inlining).

rdar://problem/17187463

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210520 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 09:52:44 +00:00
Tim Northover
292c7c6a48 AArch64: make FastISel memcpy emission more robust.
We were hitting an assert if FastISel couldn't create the load or store we
requested. Currently this happens for large frame-local addresses, though
CodeGen could be improved there.

rdar://problem/17187463

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210519 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-10 09:52:40 +00:00