11143 Commits

Author SHA1 Message Date
Juergen Ributzka
d01f1c4054 [FastISel][X86] Only fold the cmp into the select when both instructions are in the same basic block.
If the cmp is in a different basic block, then it is possible that not all
operands of that compare have defined registers. This can happen when one of
the operands to the cmp is a load and the load gets folded into the cmp. In
this case FastISel will skip the load instruction and the vreg is never
defined.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211730 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 20:06:12 +00:00
Andrea Di Biagio
cae1ea691d [X86] Always prefer to lower a VECTOR_SHUFFLE into a BLENDI instead of SHUFP (or VPERM2X128).
This patch teaches method 'LowerVECTOR_SHUFFLE' to give higher precedence to
the check for 'isBlendMask'; the idea is that, when possible, we should firstly
check if a shuffle performs a blend, and in case, try to lower it into a BLENDI
instead of selecting a SHUFP or (worse) a VPERM2X128.

In general:
 - AVX VBLENDPS/D always have better latency and throughput than VPERM2F128;
 - BLENDPS/D instructions tend to always have better 'reciprocal throughput'
   than the equivalent SHUFPS/D;
 - Both BLENDPS/D and SHUFPS/D are often decoded into the same number of
   m-ops; however, a m-op obtained from a BLENDPS/D can be scheduled to more
   than one execution port.

This patch:
 - Moves the check for 'isBlendMask' immediately before the check for
   'isSHUFPMask' within method 'LowerVECTOR_SHUFFLE';
 - Updates existing tests for sse/avx shuffle/blend instructions to verify
   that we select (v)blendps/d when possible (instead of (v)shufps/d or
   vperm2f128).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211720 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 17:41:58 +00:00
Eli Bendersky
bb167336b3 Rename loop unrolling and loop vectorizer metadata to have a common prefix.
[LLVM part]

These patches rename the loop unrolling and loop vectorizer metadata
such that they have a common 'llvm.loop.' prefix.  Metadata name
changes:

llvm.vectorizer.* => llvm.loop.vectorizer.*
llvm.loopunroll.* => llvm.loop.unroll.*

This was a suggestion from an earlier review
(http://reviews.llvm.org/D4090) which added the loop unrolling
metadata. 

Patch by Mark Heffernan.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211710 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 15:41:00 +00:00
Chandler Carruth
2edf5e45ec [x86] Add intrinsics for the pshufd, pshuflw, and pshufhw instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211694 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 13:12:54 +00:00
NAKAMURA Takumi
b720a3d15c Re-apply r211399, "Generate native unwind info on Win64" with a fix to ignore SEH pseudo ops in X86 JIT emitter.
--
This patch enables LLVM to emit Win64-native unwind info rather than
DWARF CFI.  It handles all corner cases (I hope), including stack
realignment.

Because the unwind info is not flexible enough to describe stack frames
with a gap of unknown size in the middle, such as the one caused by
stack realignment, I modified register spilling code to place all spills
into the fixed frame slots, so that they can be accessed relative to the
frame pointer.

Patch by Vadim Chugunov!

Reviewed By: rnk

Differential Revision: http://reviews.llvm.org/D4081

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211691 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 12:41:52 +00:00
Andrea Di Biagio
3e5582cc15 [X86] Add target combine rule to select ADDSUB instructions from a build_vector
This patch teaches the backend how to combine a build_vector that implements
an 'addsub' between packed float vectors into a sequence of vector add
and vector sub followed by a VSELECT.

The new VSELECT is expected to be lowered into a BLENDI.
At ISel stage, the sequence 'vector add + vector sub + BLENDI' is
pattern-matched against ISel patterns added at r211427 to select
'addsub' instructions.
Added three more ISel patterns for ADDSUB.

Added test sse3-avx-addsub-2.ll to verify that we correctly emit 'addsub'
instructions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211679 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 10:02:21 +00:00
Rafael Espindola
5178d868c8 Fix a regression from r211653.
The method was empty in the null streamer but I mistakenly replaced it with
the aborting one in MCStreamer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211666 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 05:31:22 +00:00
NAKAMURA Takumi
9a18ab013f CodeGen/X86/pr20088.ll: Add -march=x86-64, or llc fails due to non-x86 default target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211659 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 03:05:47 +00:00
Juergen Ributzka
35a6a81407 [FastISel][X86] Fold XALU condition into branch and compare.
Optimize the codegen of select and branch instructions to directly use the
EFLAGS from the {s|u}{add|sub|mul}.with.overflow intrinsics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211645 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 23:51:21 +00:00
Tom Stellard
78d1e95201 R600: Promote i64 stores to v2i32
Now we need only one 64-bit pattern for stores.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211643 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 23:33:04 +00:00
Rafael Espindola
4186005edc Print a=b as an assignment.
In assembly the expression a=b is parsed as an assignment, so it should be
printed as one.

This remove a truly horrible hack for producing a label with "a=.". It would
be used by codegen but would never be reached by the asm parser. Sorry I
missed this when it was first committed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211639 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 22:45:16 +00:00
Matt Arsenault
95eb45c5d9 R600: Fix inconsistency in rsq instructions.
R600 was using a clamped version of rsq, but SI was not. Add a
new rsq_clamped intrinsic and use them consistently.

It's unclear to me from the documentation what behavior
the R600 instructions have, so I assume they have the legacy behavior
described by the SI documents. For R600, use RECIPSQRT_IEEE
for both llvm.AMDGPU.rsq.legacy and llvm.AMDGPU.rsq. R600 also
has RECIPSQRT_FF, which I'm not sure how it fits in here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211637 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 22:13:39 +00:00
David Blaikie
639c71bafb Fix up scoping in a few tests (and delete one that validates unnecessary behavior).
Most of this is just tests that were silently succeeding in spite of
schema changes I made over a year ago. Cleaning them up as they lead to
failures in a change I'm working on/will come soon.

test/DebugInfo/2010-01-19-DbgScope.ll was removed as it tested miscoping
where a DebugLoc described a location not in the current function. The
test case doesn't describe why this is a valid situation and should be
supported, so I'm removing it and shortly going to commit changes that
make this firmly unsupported/assert-fail.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211628 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 20:10:27 +00:00
Bill Schmidt
808d878a96 [PPC64] Fix PR20071 (fctiduz generated for targets lacking that instruction)
PR20071 identifies a problem in PowerPC's fast-isel implementation for
floating-point conversion to integer.  The fctiduz instruction was added in
Power ISA 2.06 (i.e., Power7 and later).  However, this instruction is being
generated regardless of which 64-bit PowerPC target is selected.

The intent is for fast-isel to punt to DAG selection when this instruction is
not available.  This patch implements that change.  For testing purposes, the
existing fast-isel-conversion.ll test adds a RUN line for -mcpu=970 and tests
for the expected code generation.  Additionally, the existing test
fast-isel-conversion-p5.ll was found to be incorrectly expecting the
unavailable instruction to be generated.  I've removed these test variants
since we have adequate coverage in fast-isel-conversion.ll.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211627 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 20:05:18 +00:00
Robert Khasanov
031ad1b930 vpblend intrinsics combines as shifts intrinsics due to absence return stmt between them
Fix PR20088

Differential Revision: http://reviews.llvm.org/D4277


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211617 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 18:08:04 +00:00
Weiming Zhao
c33b4883b3 Resubmit commit r211533
"Fix PR20056: Implement pseudo LDR <reg>, =<literal/label> for AArch64"
Missed files are added in this commit.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211605 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 16:21:38 +00:00
Christian Pirker
01c8340c3d ARM: Fix TPsoft for Thumb mode
Reviewed at http://reviews.llvm.org/D4230



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211601 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 15:45:59 +00:00
Kevin Qin
8c0787e83a [AArch64] Fix a build_vector pattern match fail
caused by defect in isBuildVectorAllZeros().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211567 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 05:37:27 +00:00
Juergen Ributzka
20732d55c2 [FastISel][X86] Lower unsupported selects to control-flow.
The extends the select lowering coverage by emiting pseudo cmov
instructions. These insturction will be later on lowered to control-flow to
simulate the select.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211545 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 21:55:44 +00:00
Juergen Ributzka
d0976a3d20 [FastISel][X86] Add support for floating-point select.
This extends the select lowering to support floating-point selects. The
lowering depends on SSE instructions and that the conditon comes from a
floating-point compare. Under this conditions it is possible to emit an
optimized instruction sequence that doesn't require any branches to
simulate the select.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211544 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 21:55:40 +00:00
Juergen Ributzka
5f4e6e1ec0 [FastISel][X86] Optimize selects when the condition comes from a compare.
Optimize the select instructions sequence to use the EFLAGS directly from a
compare when possible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211543 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 21:55:36 +00:00
Rafael Espindola
5e761eb4ae [Mips] Add a target streamer when creating a null streamer.
Should fix DebugInfo/global.ll on the mips bot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211527 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 19:43:40 +00:00
Matt Arsenault
ed143b7c0c R600/SI: Fix div_scale intrinsic.
The operand that must match one of the others does matter,
and implement selecting for it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211523 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 18:28:28 +00:00
Christian Pirker
737f207468 ARMEB: Vector extend operations
Reviewed at http://reviews.llvm.org/D4043



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211520 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 18:05:53 +00:00
Matt Arsenault
9ad2c7ef92 R600: Move add/sub with overflow out of AMDILISelLowering
Add more tests for these.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211517 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 18:00:49 +00:00
Matt Arsenault
c4471e9248 R600/SI: Handle i64 sub.
We can handle it the same way as add

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211514 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 18:00:38 +00:00
Ulrich Weigand
9a154bfe94 [PowerPC] Allow stack frames without parameter save area
The PPCFrameLowering::determineFrameLayout routine currently ensures
that every function that allocates a stack frame provides space for the
parameter save area (via PPCFrameLowering::getMinCallFrameSize).

This is actually not necessary.  There may be functions that never call
another routine but still allocate a frame; those do not require the
parameter save area.  In the future, with the ELFv2 ABI, even some
routines that do call other functions do not need to allocate the
parameter save area.

While it is not a bug to allocate the parameter area when it is not
needed, it is better to avoid it to save stack space.

Note that when any particular function call requires the parameter save
area, this space will already have been included by ABI code in the size
the CALLSEQ_START insn is annotated with, and therefore included in the
size returned by MFI->getMaxCallFrameSize().

This means that determineFrameLayout simply does not need to care about
the parameter save area.  (It still needs to ensure that every frame
provides the linkage area.)  This is implemented by this patch.

Note that this exposed a bug in the new fast-isel code where the parameter
area was *not* included in the CALLSEQ_START size; this is also fixed.

A couple of test cases needed to be adapted for the new (smaller) stack
frame size those tests now see.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211495 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 13:47:52 +00:00
Ulrich Weigand
fdb6eb65c7 [PowerPC] Fix on-stack AltiVec arguments with 64-bit SVR4
Current 64-bit SVR4 code seems to have some remnants of Darwin code
in AltiVec argument handing.  This had the effect that AltiVec arguments
(or subsequent arguments) were not correctly placed in the parameter area
in some cases.

The correct behaviour with the 64-bit SVR4 ABI is:
- All AltiVec arguments take up space in the parameter area, just like
  any other arguments, whether vararg or not.
- They are always 16-byte aligned, skipping a parameter area doubleword
  (and the associated GPR, if any), if necessary.

This patch implements the correct behaviour and adds a test case.
(Verified against GCC behaviour via the ABI compat test suite.)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211492 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 12:36:34 +00:00
NAKAMURA Takumi
9124b45918 Revert r211399, "Generate native unwind info on Win64"
It broke Legacy JIT Tests on x86_64-{mingw32|msvc}, aka Windows x64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211480 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-22 22:00:56 +00:00
Jan Vesely
728ea0c91b R600: Add udivrem test
v2: move < %s to the end of the line
    space after ;
    add v4i32 test

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211476 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-22 21:42:58 +00:00
Filipe Cabecinhas
7798d5992a Fix PR20087 by using the source index when changing the vector load
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211472 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-22 17:21:37 +00:00
Benjamin Kramer
636a9bece4 Legalizer: Add support for splitting insert_subvectors.
We handle this by spilling the whole thing to the stack and doing the
insertion as a store.

PR19492. This happens in real code because the vectorizer creates v2i128 when AVX is enabled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211435 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-21 12:56:42 +00:00
Andrea Di Biagio
5d0ff9c928 [X86] Add ISel patterns to select SSE3/AVX ADDSUB instructions.
This patch adds ISel patterns to select SSE3/AVX ADDSUB instructions
from a sequence of "vadd + vsub + blend".

Example:

///
typedef float float4 __attribute__((ext_vector_type(4)));

float4 foo(float4 A, float4 B) {
  float4 X = A - B;
  float4 Y = A + B;
  return (float4){X[0], Y[1], X[2], Y[3]};
}
///

Before this patch, (with flag -mcpu=corei7) llc produced the following
assembly sequence:
  movaps  %xmm0, %xmm2
  addps   %xmm1, %xmm2
  subps   %xmm1, %xmm0
  blendps $10, %xmm2, %xmm0


With this patch, we now get a single
  addsubps  %xmm1, %xmm0



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211427 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-21 01:31:15 +00:00
Reid Kleckner
5b8e73ef81 Generate native unwind info on Win64
This patch enables LLVM to emit Win64-native unwind info rather than
DWARF CFI.  It handles all corner cases (I hope), including stack
realignment.

Because the unwind info is not flexible enough to describe stack frames
with a gap of unknown size in the middle, such as the one caused by
stack realignment, I modified register spilling code to place all spills
into the fixed frame slots, so that they can be accessed relative to the
frame pointer.

Patch by Vadim Chugunov!

Reviewed By: rnk

Differential Revision: http://reviews.llvm.org/D4081

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211399 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-20 20:35:47 +00:00
Tom Stellard
c0bf939e80 R600/SI: Add patterns for ctpop inside a branch
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211378 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-20 17:06:11 +00:00
Tom Stellard
61d64acd0c R600/SI: Add a pattern for f32 ftrunc
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211377 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-20 17:06:09 +00:00
Tom Stellard
2cda6e8ca6 R600: Expand vector flog2
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211376 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-20 17:06:07 +00:00
Tom Stellard
2d245e2da4 R600: Expand vector fexp2
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211375 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-20 17:06:05 +00:00
Tom Stellard
538c95179c R600/SI: Add a VALU pattern for i64 xor
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211373 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-20 17:05:57 +00:00
Ulrich Weigand
69e4786797 [PowerPC] Fix small argument stack slot offset for LE
When small arguments (structures < 8 bytes or "float") are passed in a
stack slot in the ppc64 SVR4 ABI, they must reside in the least
significant part of that slot.  On BE, this means that an offset needs
to be added to the stack address of the parameter, but on LE, the least
significant part of the slot has the same address as the slot itself.

This changes the PowerPC back-end ABI code to only add the small
argument stack slot offset for BE.  It also adds test cases to verify
the correct behavior on both BE and LE.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211368 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-20 16:34:05 +00:00
Rafael Espindola
e54c32ebc6 Move test so that it is skipped if the ARM target is not enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211366 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-20 15:30:38 +00:00
Oliver Stannard
e5241cc488 Emit the ARM build attributes ABI_PCS_wchar_t and ABI_enum_size.
Emit the ARM build attributes ABI_PCS_wchar_t and ABI_enum_size based on
module flags metadata.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211349 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-20 10:08:11 +00:00
Zoran Jovanovic
a5efeb6b39 ps][mips64r6] Added LSA/DLSA instructions
Differential Revision: http://reviews.llvm.org/D3897


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211346 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-20 09:28:09 +00:00
Alp Toker
d06976aba7 Fix typos
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211304 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-19 19:41:26 +00:00
Andrea Di Biagio
cfdf805286 [X86] Teach how to combine horizontal binop even in the presence of undefs.
Before this change, the backend was unable to fold a build_vector dag
node with UNDEF operands into a single horizontal add/sub.

This patch teaches how to combine a build_vector with UNDEF operands into a
horizontal add/sub when possible. The algorithm conservatively avoids to combine
a build_vector with only a single non-UNDEF operand.

Added test haddsub-undef.ll to verify that we correctly fold horizontal binop
even in the presence of UNDEFs.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211265 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-19 10:29:41 +00:00
Matt Arsenault
64429cefba R600: Add a few tests I forgot to add.
These belong with r210827

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211253 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-19 04:24:43 +00:00
Matt Arsenault
d9b35435b8 R600/SI: Add intrinsics for various math instructions.
These will be used for custom lowering and for library
implementations of various math functions, so it's useful
to expose these as builtins.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211247 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-19 01:19:19 +00:00
Matt Arsenault
ce09bda96e R600: Handle fnearbyint
The difference from rint isn't really relevant here,
so treat them as equivalent. OpenCL doesn't have nearbyint,
so this is sort of pointless other than for completeness.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211229 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 22:03:45 +00:00
Marek Olsak
f286d63757 R600/SI: add gather4 and getlod intrinsics (v3)
This contains all the previous patches + getlod support on top of it.
It doesn't use SDNodes anymore, so it's quite small.
It also adds v16i8 to SReg_128, which is used for the sampler descriptor.

Reviewed-by: Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211228 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 22:00:29 +00:00
Jan Vesely
52b6c2d6ef R600: Expand vector fceil
Move fp64 fceil tests to fceil64.ll

v2: rebase

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211194 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-18 17:57:29 +00:00