Commit Graph

11322 Commits

Author SHA1 Message Date
Robert Khasanov
232202439a [SKX] Extended non-temporal load/store instructions for AVX512VL subsets.
Added avx512_movnt_vl multiclass for handling 256/128-bit forms of instruction.
Added encoding and lowering tests.

Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215536 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 10:46:00 +00:00
Elena Demikhovsky
4c97c1420b AVX-512: Fixed a bug in shufflevector lowering.
PALIGNR instruction does not exist in AVX-512F set.
Added a test.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215526 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 07:58:43 +00:00
Chandler Carruth
6bb093bbe7 [x86] Rewrite a core part of the new vector shuffle lowering to handle
one pesky test case correctly.

This test case caused the old code to infloop occilating between solving
the low-half and the high-half. The 'side balancing' part of
single-input v8 shuffle lowering didn't handle the one pattern which can
cause it to occilate. Fortunately the fuzz testing found this case.
Unfortuately it was *terrible* to handle. I'm really sorry for the
amount and density of the code here, I'd love suggestions on how to
simplify it. I feel like there *must* be a simpler form here, but after
a lot of days I've not found it. This is the only one I've found that
even works. I've added the one pesky test case along with some nice
comments explaining the core problem that we have to solve here.

So far this has survived approximately 32k test cases. More strenuous
fuzzing commencing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215519 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 01:25:45 +00:00
Hal Finkel
e693d3c558 [PowerPC] Implement PPCTargetLowering::getTgtMemIntrinsic
This implements PPCTargetLowering::getTgtMemIntrinsic for Altivec load/store
intrinsics. As with the construction of the MachineMemOperands for the
intrinsic calls used for unaligned load/store lowering, the only slight
complication is that we need to represent a larger memory range than the
loaded/stored value-type size (because the address is rounded down to an
aligned address, and we need to conservatively represent the entire possible
range of the actual access). This required adding an extra size field to
TargetLowering::IntrinsicInfo, and this was done in a way that required no
modifications to other targets (the size defaults to the store size of the
provided memory data type).

This fixes test/CodeGen/PowerPC/unal-altivec-wint.ll (so it can be un-XFAILed).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215512 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 01:15:40 +00:00
Hal Finkel
695e914c03 Fix classof for ISD::INTRINSIC_W_CHAIN and INTRINSIC_VOID
Unfortunately, our use of the SDNode class hierarchy for INTRINSIC_W_CHAIN and
INTRINSIC_VOID nodes is somewhat broken right now. These nodes sometimes are
used for memory intrinsics (those with MachineMemOperands), and sometimes not.
When not, the nodes are not created as instances of MemIntrinsicSDNode, but
rather created as some other subclass of SDNode using DAG::getNode. When they
are memory intrinsics, they are created using DAG::getMemIntrinsicNode as
instances of MemIntrinsicSDNode. MemIntrinsicSDNode is a subclass of
MemSDNode, but prior to r214452, we had a non-self-consistent setup whereby
MemIntrinsicSDNode::classof on INTRINSIC_W_CHAIN and INTRINSIC_VOID would
return true but MemSDNode::classof on INTRINSIC_W_CHAIN and INTRINSIC_VOID
would return false. In r214452, MemSDNode::classof was changed to return true
for INTRINSIC_W_CHAIN and INTRINSIC_VOID, which is now self-consistent. The
problem is that neither the pre-r214452 logic and the post-r214452 logic are
really right. The truth is that not all INTRINSIC_W_CHAIN and INTRINSIC_VOID
nodes are instances of MemIntrinsicSDNode (or MemSDNode for that matter), and
the return value from classof needs to reflect that. This was broken before
r214452 (because MemIntrinsicSDNode::classof always returned true), and was
broken afterward (because MemSDNode::classof also always returned true), and
will now be correct.

The minimal solution is to grab one of the SubclassData bits (there is one left
for MemIntrinsicSDNode nodes) and use it to store whether or not a particular
INTRINSIC_W_CHAIN or INTRINSIC_VOID is really an instance of
MemIntrinsicSDNode or not. Doing this allows both MemIntrinsicSDNode::classof
and MemSDNode::classof to return the correct answer for the underlying object
for both the memory-intrinsic and non-memory-intrinsic cases.

This fixes the problem that r214452 created in the SelectionDAGDumper (thanks
to Matt Arsenault for pointing it out).

Because PowerPC does not implement getTgtMemIntrinsic, this change breaks
test/CodeGen/PowerPC/unal-altivec-wint.ll. I've XFAILed it for now, and will
fix it in a follow-up commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215511 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 01:15:37 +00:00
Adam Nemet
4c9467ea5d [AVX512] Verify the code generated for the intrinsic _mm512_broadcastsd_pd
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215487 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 00:30:05 +00:00
Adam Nemet
6ea7f36872 [AVX512] Handle valign masking intrinsic via C++ lowering
I think that this will scale better in most cases than adding a Pat<> for each
mapping from the intrinsic DAG to the intruction (i.e. rri, rrik, rrikz).  We
can just lower to the SDNode and have the resulting DAG be matches by the DAG
patterns.

Alternatively (long term), we could keep the Pat<>s but generate them via the
new AVX512_masking multiclass.  The difficulty is that in order to formulate
that we would have to concatenate DAGs.  Currently this is only supported if
the operators of the input DAGs are identical.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215473 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 21:13:12 +00:00
Jan Vesely
3c57820bbb R600: Use optimized 24bit path in udivrem
v2: drop enum keyword
    use correct extension mode
    don't bother computing the sign in unsinged case

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215462 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 17:31:20 +00:00
Jan Vesely
b40562c0ec R600: Use i24 optimized path for SREM
v2: add tests
    rename LowerSDIV24 to LowerSDIVREM24
    handle the rem part in this function

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215460 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 17:31:17 +00:00
Gerolf Hoflehner
392f7d970c [MachineCombiner] Fix for ICE bug 20598
The combiner ignored DBG nodes when checking
the uses of a virtual register.

It combined a sequence like
   %vreg1 = madd %vreg2, %vreg3,...
   DBG_VALUE (%vreg1 ...)
   %vreg4 = add %vreg1,...
to
  %vreg4 = madd %vreg2, %vreg3

leaving behind a dangling DBG_VALUE with
a definition. This triggered an assertion
in the MachineTraceMetrics.cpp module.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215431 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 07:54:12 +00:00
Michael J. Spencer
4935833df5 [x86] Fold extract_vector_elt of a load into the Load's address computation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215409 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 23:49:33 +00:00
Tom Stellard
13f4476c55 R600/SI: Add a ComplexPattern for selecting MUBUF _OFFSET variant
This saves us from having to copy a 64-bit 0 value into VGPRs for
BUFFER_* instruction which only have a 12-bit immediate offset.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215399 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 22:18:17 +00:00
Tom Stellard
68e9ebbe44 R600/SI: Add check for low 32 bits of encoding to mubuf tests
There are no variable values like registers encoded in the low 32 bits of MUBUF
instructions, so it is relatively easy to check these bits, and it will
help prevent us from introducing encoding bugs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215397 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 22:18:11 +00:00
Tom Stellard
728d0e4218 R600/SI: Clear lds bit on MUBUF instructions used for private stores
This bit was left uninitialized, which was causing some random failures
of piglit tests.

NOTE: This is a candidate for the 3.5 branch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215396 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 22:18:09 +00:00
Tom Stellard
0df264a0fd R600/SI: Fix broken test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215395 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 22:18:05 +00:00
Quentin Colombet
7f4f923aa5 [AArch64] Fix registerAllocator assigns same register for base and wback in
pre/post-index load and store.

Patch by Steven Wu <stevenwu@apple.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215390 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 21:39:53 +00:00
Saleem Abdulrasool
6c2be4ff95 ARM: try harder to detect non-IT eligible instructions
For many Thumb-1 register register instructions, setting the CPSR is not
permitted inside an IT block.  We would not correctly flag those instructions.
The previous change to identify this scenario was insufficient as it did not
actually catch all the instances.  The current list is formed by manual
inspection of the ARMv6M ARM.

The change to the Thumb2 IT block test is due to the fact that the new more
stringent checking of the MIs results in the If Conversion pass being prevented
from executing (since not all the instructions in the BB are predicable).  This
results in code gen changes.

Thanks to Tim Northover for pointing out that the previous patch was
insufficient and hinting that the use of the v6M ARM would be much easier to use
than the v7 or v8!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215382 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 20:13:25 +00:00
Sanjay Patel
7c0fa0cfab Correct a missing RUN line in the ARM codegen test for fneg ops. We should also explicitly specify +/-neonfp.
The bug was introduced at r99570 when use of "-arm-use-neon-fp" was removed.

Differential Revision: http://reviews.llvm.org/D4846



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215377 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 19:04:28 +00:00
Oliver Stannard
17ef00ea94 ARM: __gnu_h2f_ieee and __gnu_f2h_ieee always use the soft-float calling convention
By default, LLVM uses the "C" calling convention for all runtime
library functions. The half-precision FP conversion functions use the
soft-float calling convention, and are needed for some targets which
use the hard-float convention by default, so must have their calling
convention explicitly set.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215348 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 09:12:32 +00:00
Jiangning Liu
0679d2d0a4 In Machine CSE pass, the source register of a COPY machine instruction can
be propagated to all its users, and this propagation could increase the 
probability of finding common subexpressions. If the COPY has only one user,
the COPY itself can be removed.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215344 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-11 05:17:19 +00:00
Petar Jovanovic
97b0c63f6b Add support for scalarizing cttz_zero_undef
Follow up to r214266. Add missing case in ScalarizeVectorResult() for
cttz_zero_undef.

Differential Revision: http://reviews.llvm.org/D4813


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215330 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-10 22:49:54 +00:00
Saleem Abdulrasool
3e5734dc38 ARM: correct isPredicable for MULS in ThHUMB mode
The ARM ARM states that CPSR may not be updated by a MUL in thumb mode.  Due to
an ordering of Thumb 2 Size Reduction and If Conversion, we would end up
generating a THUMB MULS inside an IT block.

The If Conversion pass uses the TTI isPredicable method to ensure that it can
transform a Basic Block.  However, because we only check for IT handling on
Thumb2 functions, we may miss some cases.  Even then, it only validates that the
CPSR is not *live* rather than it is not accessed.  This corrects the handling
for that particular case since the same restriction does not hold on the vast
majority of the instructions.

This does prevent the IfConversion optimization from kicking in in certain
cases, but generating correct code is more valuable.  Addresses PR20555.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215328 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-10 22:20:37 +00:00
Tom Stellard
4e8a136db8 R600/SI: Custom lower CONCAT_VECTORS
This will lower them using register copies rather than loads and stores
to the stack.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215270 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-09 01:06:56 +00:00
Tom Stellard
f1ba587963 R600/SI: Update concat_vectors.ll to check for scratch usage
These tests were using SI-NOT: MOVREL to make sure concat vectors
weren't being lowered to stack loads and stores, but we are using
scratch buffers for the stack now instead of registers, so we need
to add an additional SI-NOT check for scratch buffers.

With this change I was able to uncover one broken test which will
be fixed in a future commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215269 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-09 01:06:53 +00:00
Joerg Sonnenberger
f0b70e2fbc Provide an implementation of getNoopForMachoTarget for PPC, otherwise
empty functions will assert in the MC object writer.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215238 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 19:13:23 +00:00
Juergen Ributzka
0980ef248e [FastISel][X86] Fix INC/DEC optimization (r215230)
I accidentally also used INC/DEC for unsigned arithmetic which doesn't work,
because INC/DEC don't set the required flag which is used for the overflow
check.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215237 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 18:47:04 +00:00
Juergen Ributzka
cbda4b32c6 [FastISel][X86] Use INC/DEC when possible for {sadd|ssub}.with.overflow intrinsics.
This is a small peephole optimization to emit INC/DEC when possible.

Fixes <rdar://problem/17952308>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215230 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 17:21:37 +00:00
Daniel Sanders
a19fc6deb8 [mips] Invert the abicalls feature bit to be noabicalls so that it's possible for -mno-abicalls to take effect.
Also added the testcase that should have been in r215194.

This behaviour has surprised me a few times now. The problem is that the
generated MipsSubtarget::ParseSubtargetFeatures() contains code like this:

   if ((Bits & Mips::FeatureABICalls) != 0) IsABICalls = true;

so '-abicalls' means 'leave it at the default' and '+abicalls' means 'set it to
true'. In this case, (and the similar -modd-spreg case) I'd like the code to be

  IsABICalls = (Bits & Mips::FeatureABICalls) != 0;

or possibly:

   if ((Bits & Mips::FeatureABICalls) != 0)
     IsABICalls = true;
   else
     IsABICalls = false;

and preferably arrange for 'Bits & Mips::FeatureABICalls' to be true by default
(on some triples).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215211 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 15:47:17 +00:00
Jiangning Liu
3b85f30319 [AArch64] Fix a type conversion bug for anlyzing compare.
The bug can cause spec2006/483.xalancbmk failure.

Patched by David Xu.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215206 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 14:19:29 +00:00
Daniel Sanders
1952807c9c [mips] Remove reason for XFAIL from a test that isn't actually XFAILed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215201 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 12:58:17 +00:00
James Molloy
3a106a2813 [AArch64] Add an FP load balancing pass for Cortex-A57
For best-case performance on Cortex-A57, we should try to use a balanced mix of odd and even D-registers when performing a critical sequence of independent, non-quadword FP/ASIMD floating-point multiply or multiply-accumulate operations.

This pass attempts to detect situations where the register allocation may adversely affect this load balancing and to change the registers used so as to better utilize the CPU.

Ideally we'd just take each multiply or multiply-accumulate in turn and allocate it alternating even or odd registers. However, multiply-accumulates are most efficiently performed in the same functional unit as their accumulation operand. Therefore this pass tries to find maximal sequences ("Chains") of multiply-accumulates linked via their accumulation operand, and assign them all the same "color" (oddness/evenness).

This optimization affects S-register and D-register floating point multiplies and FMADD/FMAs, as well as vector (floating point only) muls and FMADD/FMA. Q register instructions (and 128-bit vector instructions) are not affected.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215199 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 12:33:21 +00:00
Tim Northover
bbdf1e0432 AArch64: stop trying to take control of all UnknownArch triples.
This short-circuited our error reporting for incorrectly specified
target triples (you'd get AArch64 code instead).

Should fix PR20567.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215191 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 08:27:44 +00:00
Patrik Hagglund
cf403861a3 [pr19635] Revert most of r170537, and add new testcase.
Patch provided by Andrey Kuharev.

Sorry, r170537 was obviously wrong.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215190 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-08 08:21:19 +00:00
Adam Nemet
a8e1cda622 [AVX512] Add zero-masking variant to AVX512_masking multiclass
This completes one item from the todo-list of r215125 "Generate masking
instruction variants with tablegen".

The AddedComplexity is needed just like for the k variant.

Added a codegen test based on valignq.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215173 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 23:53:38 +00:00
Adam Nemet
690499ed49 [AVX512] Add codegen test for the masking variant of valign
The AddedComplexity is needed just like in avx512_perm_3src.  There may be a
bug in the complexity computation...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215168 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 23:18:18 +00:00
Akira Hatanaka
43f6ce9289 [stack protector] Look through bitcasts to get global variable
__stack_chk_guard.

Handle the case where the pointer operand of the load instruction that loads the
stack guard is not a global variable but instead a bitcast.

%StackGuard = load i8** bitcast (i64** @__stack_chk_guard to i8**)
call void @llvm.stackprotector(i8* %StackGuard, i8** %StackGuardSlot)

Original test case provided by Ana Pazos.

This fixes PR20558.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215167 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 23:08:24 +00:00
Adrian Prantl
7f48f056f7 Make these regexes stricter by disallowing any additional characters in the output.
Thanks to dblaikie for pointing this out!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215166 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 23:04:07 +00:00
Adrian Prantl
2ead89ae61 Reflow this comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215160 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 22:44:24 +00:00
Reed Kotler
cf76da912c fix materialization of one bit constants and global values which are accessed through
a base GOT entry.

Summary:
get tip of tree mips fast-isel to pass test-suite

Two bugs were fixed:

1) one bit booleans were treated as 1 bit signed integers and so the literal '1' could become sign extended.
2) mips uses got for pic but in certain cases, as with string constants for example, many items can be referenced from the same got entry and this case was not handled properly.

Test Plan: test-suite

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: mcrosier

Differential Revision: http://reviews.llvm.org/D4801

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215155 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 22:09:01 +00:00
Gerolf Hoflehner
e4fa341dde MachineCombiner Pass for selecting faster instruction sequence on AArch64
Re-commit of r214832,r21469 with a work-around that
avoids the previous problem with gcc build compilers

The work-around is to use SmallVector instead of ArrayRef
of basic blocks in preservesResourceLen()/MachineCombiner.cpp



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215151 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 21:40:58 +00:00
Akira Hatanaka
70b56056a1 [Branch probability] Recompute branch weights of tail-merged basic blocks.
BranchFolderPass was not correctly setting the basic block branch weights when
tail-merging created or merged blocks. This patch recomutes the weights of
tail-merged blocks using the following formula:

branch_weight(merged block to successor j) =
sum(block_frequency(bb) * branch_probability(bb -> j))

bb is a block that is in the set of merged blocks.

<rdar://problem/16256423>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215135 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 19:30:13 +00:00
Chandler Carruth
0e89fbb120 [x86] Fix another miscompile found through fuzz testing the new vector
shuffle lowering.

This is closely related to the previous one. Here we failed to use the
source offset when swapping in the other case -- where we end up
swapping the *final* shuffle. The cause of this bug is a bit different:
I simply wasn't thinking about the fact that this mask is actually
a slice of a wide mask and thus has numbers that need SourceOffset
applied. Simple fix. Would be even more simple with an algorithm-y thing
to use here, but correctness first. =]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215095 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 10:37:35 +00:00
Chandler Carruth
0651861b7b [x86] Fix another miscompile in the new vector shuffle lowering found
via the fuzz tester.

Here I missed an offset when round-tripping a value through a shuffle
mask. I got it right 2 lines below. See a problem? I do. ;] I'll
probably be adding a little "swap" algorithm which accepts a range and
two values and swaps those values where they occur in the range. Don't
really have a name for it, let me know if you do.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215094 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 10:14:27 +00:00
Chandler Carruth
b3364512fc [x86] Fix another miscompile in the new vector shuffle lowering found
through the new fuzzer.

This one is great: bad operator precedence led the modulus to happen at
the wrong point. All the asserts didn't fire because there were usually
the right values past the end of the 4 element region we were looking
at. Probably could have gotten a crash here with ASan + fuzzing, but the
correctness tests pinpointed this really nicely.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215092 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 09:45:02 +00:00
Pavel Chupin
5d8c984e54 [x32] Use ebp/esp as frame and stack pointer
Summary:
Since pointers are 32-bit on x32 we can use ebp and esp as frame and stack
pointer. Some operations like PUSH/POP and CFI_INSTRUCTION still
require 64-bit register, so using 64-bit MachineFramePtr where required.

X86_64 NaCl uses 64-bit frame/stack pointers, however it's been found that
both isTarget64BitLP64 and isTarget64BitILP32 are true for NaCl. Addressing
this issue here as well by making isTarget64BitLP64 false.

Also mark hasReservedSpillSlot unreachable on X86. See inlined comments.

Test Plan: Add one new simple test and upgrade 2 existing with x32 target case.

Reviewers: nadav, dschuff

Subscribers: llvm-commits, zinovy.nis

Differential Revision: http://reviews.llvm.org/D4617

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215091 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 09:41:19 +00:00
Chandler Carruth
15d82b7d33 [x86] Fix a miscompile in the new shuffle lowering found through the new
fuzz testing.

The function which tested for adjacency did what it said on the tin, but
when I called it, I wanted it to do something more thorough: I wanted to
know if the *pairs* of shuffle elements were adjacent and started at
0 mod 2. In one place I had the decency to try to test for this, but in
the other it was completely skipped, miscompiling this test case. Fix
this by making the helper actually do what I wanted it to do everywhere
I called it (and removing the now redundant code in one place).

I *really* dislike the name "canWidenShuffleElements" for this
predicate. If anyone can come up with a better name, please let me know.
The other name I thought about was "canWidenShuffleMask" but is it
really widening the mask to reduce the number of lanes shuffled? I don't
know. Naming things is hard.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215089 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 08:11:31 +00:00
Sanjay Patel
b9736caa6a Fix a test that has no checks.
X86 doesn't have fneg, so check for xor.

Differential Revision: http://reviews.llvm.org/D4812


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214992 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 20:45:30 +00:00
Matt Arsenault
60178b180f R600: Cleanup fadd and fsub tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214991 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 20:27:55 +00:00
Reid Kleckner
7911f2db78 Add a triple to this test to get the right IR mangling
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214982 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 18:09:15 +00:00
Reid Kleckner
5d04e520c0 Don't count inreg params when mangling fastcall functions
This is consistent with MSVC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214981 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-06 18:09:04 +00:00