Commit Graph

6185 Commits

Author SHA1 Message Date
Richard Sandiford
086791eca2 Add TargetLowering::prepareVolatileOrAtomicLoad
One unusual feature of the z architecture is that the result of a
previous load can be reused indefinitely for subsequent loads, even if
a cache-coherent store to that location is performed by another CPU.
A special serializing instruction must be used if you want to force
a load to be reattempted.

Since volatile loads are not supposed to be omitted in this way,
we should insert a serializing instruction before each such load.
The same goes for atomic loads.

The patch implements this at the IR->DAG boundary, in a similar way
to atomic fences.  It is a no-op for targets other than SystemZ.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196905 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-10 10:36:34 +00:00
Reid Kleckner
cc8d39acf5 Revert "Fix miscompile of MS inline assembly with stack realignment"
This reverts commit r196876.  Its tests failed on the bots, so I'll
figure it out tomorrow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196879 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-10 05:31:27 +00:00
Reid Kleckner
ec4d326aad Fix miscompile of MS inline assembly with stack realignment
For stack frames requiring realignment, three pointers may be needed:
- ebp to address incoming arguments
- esi (could be any callee-saved register) to address locals
- esp to address outgoing arguments

We would use esi unconditionally without verifying that it did not
conflict with inline assembly.

This change doesn't do the verification, it simply emits a fatal error
on functions that use stack realignment, dynamic SP adjustments, and
inline assembly.

Because stack realignment is common on Windows, we also no longer assume
that MS inline assembly clobbers esp.  Instead, we analyze the inline
instructions for implicit definitions and check if esp is there.  If so,
we require the use of a base pointer and consider it in the condition
above.

Mostly fixes PR16830, but we could try harder to find a non-conflicting
base pointer.

Reviewers: sunfish

Differential Revision: http://llvm-reviews.chandlerc.com/D1317

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196876 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-10 05:12:23 +00:00
Nadav Rotem
6806e11612 Fix PR18162 - Incorrect assertion assumed that the SDValue resno is zero.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196858 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-10 01:13:59 +00:00
Alp Toker
087ab613f4 Correct word hyphenations
This patch tries to avoid unrelated changes other than fixing a few
hyphen-related ambiguities and contractions in nearby lines.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196471 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-05 05:44:44 +00:00
Rafael Espindola
ba2a226fab Try harder to get a consistent floating point results.
This just extends the existing hack. It should be enough to get a reproducible bootstrap
on 32 bits.

I will open a bug to track getting a real fix for this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196462 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-05 04:14:33 +00:00
Andrew Trick
501aeea325 StackMap: Implement support for DirectMemRefOp.
A Direct stack map location records the address of frame index. This
address is itself the value that the runtime requested. This differs
from IndirectMemRefOp locations, which refer to a stack locations from
which the requested values must be loaded. Direct locations can
directly communicate the address if an alloca, while IndirectMemRefOp
handle register spills.

For example:

entry:
  %a = alloca i64...
  llvm.experimental.stackmap(i32 <ID>, i32 <shadowBytes>, i64* %a)

Since both the alloca and stackmap intrinsic are in the entry block,
and the intrinsic takes the address of the alloca, the runtime can
assume that LLVM will not substitute alloca with any intervening
value. This must be verified by the runtime by checking that the stack
map's location is a Direct location type. The runtime can then
determine the alloca's relative location on the stack immediately after
compilation, or at any time thereafter. This differs from Register and
Indirect locations, because the runtime can only read the values in
those locations when execution reaches the instruction address of the
stack map.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195712 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-26 02:03:25 +00:00
Bill Wendling
5df09f0367 Unrevert r195599 with testcase fix.
I'm not sure how it was checking for the wrong values...
PR18023.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195670 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-25 18:05:22 +00:00
Amara Emerson
99812474c3 Revert r195599 as it broke the builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195636 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-25 11:24:18 +00:00
Daniel Sanders
4ac67fa809 Fixed tryFoldToZero() for vector types that need expansion.
Summary:
Moved the requirement for SelectionDAG::getConstant() to return legally
typed nodes slightly earlier. There were two optional DAGCombine passes
that were missed out and were required to produce type-legal DAGs.

Simplified a code-path in tryFoldToZero() to use SelectionDAG::getConstant().
This provides support for both promoted and expanded vector types whereas the
previous code only supported promoted vector types.

Fixes a "Type for zero vector elements is not legal" assertion detected by
an llvm-stress generated test.

Reviewers: resistor

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2251

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195635 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-25 11:14:43 +00:00
Bill Wendling
dfc615f284 Don't look past volatile loads.
A volatile load should block us from trying to coalesce stores.
PR18023

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195599 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-25 05:01:21 +00:00
Paul Robinson
16c7e0b48c Teach ISel not to optimize 'optnone' functions (revised).
Improvements over r195317:
- Set/restore EnableFastISel flag instead of just running FastISel within
  SelectAllBasicBlocks; the flag is checked in various places, and
  FastISel won't run properly if those places don't do the right thing.
- Test looks for normal ISel versus FastISel behavior, and not
  something more subtle that doesn't work everywhere.

Based on work by Andrea Di Biagio.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195491 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-22 19:11:24 +00:00
Andrew Trick
ed20bf5ef8 patchpoint: factor SD builder code for live vars. Plain stackmap also optimizes Constant values now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195488 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-22 19:07:36 +00:00
Andrew Trick
3419a7dde3 patchpoint: eliminate hard coded operand indices.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195487 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-22 19:07:33 +00:00
Tom Stellard
0ffcaa0d54 SelectionDAG: Optimize expansion of vec_type = BITCAST scalar_type
The legalizer can now do this type of expansion for more
type combinations without loading and storing to and
from the stack.

NOTE: This is a candidate for the 3.4 branch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195398 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-22 00:41:05 +00:00
Tom Stellard
b7bad852f4 Split SETCC if VSELECT requires splitting too.
This patch is a rewrite of the original patch commited in r194542. Instead of
relying on the type legalizer to do the splitting for us, we now peform the
splitting ourselves in the DAG combiner. This is necessary for the case where
the vector mask is a legal type after promotion and still wouldn't require
splitting.

Patch by: Juergen Ributzka

NOTE: This is a candidate for the 3.4 branch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195397 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-22 00:39:23 +00:00
Daniel Sanders
f89ddfccc0 Add support for legalizing SETNE/SETEQ by inverting the condition code and the result of the comparison.
Summary:
LegalizeSetCCCondCode can now legalize SETEQ and SETNE by returning the inverse
condition and requesting that the caller invert the result of the condition.

The caller of LegalizeSetCCCondCode must handle the inverted CC, and they do
so as follows:
  SETCC, BR_CC:
    Invert the result of the SETCC with SelectionDAG::getNOT()
  SELECT_CC:
    Swap the true/false operands.

This is necessary for MSA which lacks an integer SETNE instruction.

Reviewers: resistor

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2229

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195355 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-21 13:24:49 +00:00
NAKAMURA Takumi
b05bddb4ba Revert r195317 (and r195333), "Teach ISel not to optimize 'optnone' functions."
It broke, at least, i686 target. It is reproducible with "llc -mtriple=i686-unknown".

FYI, it didn't appear to add either "-O0" or "-fast-isel".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195339 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-21 10:55:15 +00:00
Paul Robinson
6079f00035 Teach ISel not to optimize 'optnone' functions.
Based on work by Andrea Di Biagio.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195317 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-21 06:33:32 +00:00
Jack Carter
1357859c0d long line correction
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195179 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-20 00:32:32 +00:00
Jack Carter
2f0f121732 long lines and white space correction
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195170 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-19 23:43:22 +00:00
Juergen Ributzka
217baac774 [DAG] Refactor vector splitting code in SelectionDAG. No functional change intended.
Reviewed by Tom

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195156 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-19 21:20:17 +00:00
Andrew Trick
0b843861c6 Fix patchpoint comments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195103 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-19 05:05:43 +00:00
Benjamin Kramer
d5ae5b0186 DAGCombiner: Partially revert r192795, getNOT was fixed not to create illegal constants.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194959 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-17 10:40:03 +00:00
Matt Arsenault
ca1b7799aa Use more getZExtOrTruncs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194945 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-17 02:31:26 +00:00
Matt Arsenault
91053d585a Use getZExtOrTrunc instead of repeating the same logic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194944 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-17 02:24:21 +00:00
Matt Arsenault
94437c9691 Use right address space pointer size
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194940 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-17 00:06:39 +00:00
Matt Arsenault
e6e811277f Fix assert on unaligned access to global with different address space size.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194934 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-16 20:50:54 +00:00
Matt Arsenault
4fe5b640ee Fix codegen for null different sized pointer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194932 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-16 20:24:41 +00:00
Bob Wilson
cc7052343e Avoid illegal integer promotion in fastisel
Stop folding constant adds into GEP when the type size doesn't match.
Otherwise, the adds' operands are effectively being promoted, changing the
conditions of an overflow.  Results are different when:

    sext(a) + sext(b) != sext(a + b)

Problem originally found on x86-64, but also fixed issues with ARM and PPC,
which used similar code.

<rdar://problem/15292280>

Patch by Duncan Exon Smith!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194840 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-15 19:09:27 +00:00
Daniel Sanders
ea28aafa83 Fix illegal DAG produced by SelectionDAG::getConstant() for v2i64 type
Summary:
When getConstant() is called for an expanded vector type, it is split into
multiple scalar constants which are then combined using appropriate build_vector
and bitcast operations.

In addition to the usual big/little endian differences, the case where the
element-order of the vector does not have the same endianness as the elements
themselves is also accounted for.  For example, for v4i32 on big-endian MIPS,
the byte-order of the vector is <3210,7654,BA98,FEDC>. For little-endian, it is
<0123,4567,89AB,CDEF>.
Handling this case turns out to be a nop since getConstant() returns a splatted
vector (so reversing the element order doesn't change the value)

This fixes a number of cases in MIPS MSA where calling getConstant() during
operation legalization introduces illegal types (e.g. to legalize v2i64 UNDEF
into a v2i64 BUILD_VECTOR of illegal i64 zeros). It should also handle bigger
differences between illegal and legal types such as legalizing v2i64 into v8i16.

lowerMSASplatImm() in the MIPS backend no longer needs to avoid calling
getConstant() so this function has been updated in the same patch.

For the sake of transparency, the steps I've taken since the review are:
* Added 'virtual' to isVectorEltOrderLittleEndian() as requested. This revealed
  that the MIPS tests were falsely passing because a polymorphic function was
  not actually polymorphic in the reviewed patch.
* Fixed the tests that were now failing. This involved deleting the code to
  handle the MIPS MSA element-order (which was previously doing an byte-order
  swap instead of an element-order swap). This left
  isVectorEltOrderLittleEndian() unused and it was deleted.
* Fixed build failures caused by rebasing beyond r194467-r194472. These build
  failures involved the bset, bneg, and bclr instructions added in these commits
  using lowerMSASplatImm() in a way that was no longer valid after this patch.
  Some of these were fixed by calling SelectionDAG::getConstant() instead,
  others were fixed by a new function getBuildVectorSplat() that provided the
  removed functionality of lowerMSASplatImm() in a more sensible way.

Reviewers: bkramer

Reviewed By: bkramer

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D1973

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194811 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-15 12:56:49 +00:00
Matt Arsenault
509a492442 Add target hook to prevent folding some bitcasted loads.
This is to avoid this transformation in some cases:
fold (conv (load x)) -> (load (conv*)x)

On architectures that don't natively support some vector
loads efficiently casting the load to a smaller vector of
larger types and loading is more efficient.

Patch by Micah Villmow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194783 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-15 04:42:23 +00:00
Matt Arsenault
59d3ae6cdc Add addrspacecast instruction.
Patch by Michele Scandale!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194760 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-15 01:34:59 +00:00
Andrew Trick
72cf01cc7c Minor extension to llvm.experimental.patchpoint: don't require a call.
If a null call target is provided, don't emit a dummy call. This
allows the runtime to reserve as little nop space as it needs without
the requirement of emitting a call.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194676 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-14 06:54:10 +00:00
Juergen Ributzka
c7e77f91fe SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too.
This patch reapplies r193676 with an additional fix for the Hexagon backend. The
SystemZ backend has already been fixed by r194148.

The Type Legalizer recognizes that VSELECT needs to be split, because the type
is to wide for the given target. The same does not always apply to SETCC,
because less space is required to encode the result of a comparison. As a result
VSELECT is split and SETCC is unrolled into scalar comparisons.

This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG
Combiner. If a matching pattern is found, then the result mask of SETCC is
promoted to the expected vector mask type for the given target. Now the type
legalizer will split both VSELECT and SETCC.

This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX
pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.

Reviewed by Nadav

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194542 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-13 01:57:54 +00:00
Daniel Sanders
028e4d27b1 Vector forms of SHL, SRA, and SRL can be constant folded using SimplifyVBinOp too
Reviewers: dsanders

Reviewed By: dsanders

CC: llvm-commits, nadav

Differential Revision: http://llvm-reviews.chandlerc.com/D1958

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194393 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-11 17:23:41 +00:00
Juergen Ributzka
d4f5a61567 [Stackmap] Materialize the jump address within the patchpoint noop slide.
This patch moves the jump address materialization inside the noop slide. This
enables patching of the materialization itself or its complete removal. This
patch also adds the ability to define scratch registers that can be used safely
by the code called from the patchpoint intrinsic. At least one scratch register
is required, because that one is used for the materialization of the jump
address. This patch depends on D2009.

Differential Revision: http://llvm-reviews.chandlerc.com/D2074

Reviewed by Andy

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194306 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-09 01:51:33 +00:00
Juergen Ributzka
623d2e618f [Stackmap] Add AnyReg calling convention support for patchpoint intrinsic.
The idea of the AnyReg Calling Convention is to provide the call arguments in
registers, but not to force them to be placed in a paticular order into a
specified set of registers. Instead it is up tp the register allocator to assign
any register as it sees fit. The same applies to the return value (if
applicable).

Differential Revision: http://llvm-reviews.chandlerc.com/D2009

Reviewed by Andy

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194293 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-08 23:28:16 +00:00
Andrew Trick
dc8224def3 Slightly change the way stackmap and patchpoint intrinsics are lowered.
MorphNodeTo is not safe to call during DAG building. It eagerly
deletes dependent DAG nodes which invalidates the NodeMap. We could
expose a safe interface for morphing nodes, but I don't think it's
worth it. Just create a new MachineNode and replaceAllUsesWith.

My understaning of the SD design has been that we want to support
early target opcode selection. That isn't very well supported, but
generally works. It seems reasonable to rely on this feature even if
it isn't widely used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194102 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-05 22:44:04 +00:00
Juergen Ributzka
26cc826a0e [Stackmap] Remove erroneous assert.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193871 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-01 17:53:27 +00:00
Aaron Ballman
307cfaeaf0 Commenting out this assert because it is causing the build bots to fail. This effectively reverts r193861, but needs to be fixed as part of r193769.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193862 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-01 15:12:23 +00:00
Aaron Ballman
8d13de3f09 Fixing an order of evaluation error in an assert.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193861 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-01 14:53:14 +00:00
Andrew Trick
3d74dea4bd Add support for stack map generation in the X86 backend.
Originally implemented by Lang Hames.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193811 91177308-0d34-0410-b5e6-96231b3b80d8
2013-10-31 22:11:56 +00:00
Andrew Trick
2343e3b228 Lower stackmap intrinsics directly to their target opcode in the DAG builder.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193769 91177308-0d34-0410-b5e6-96231b3b80d8
2013-10-31 17:18:24 +00:00
Andrew Trick
cf940ceff7 whitespace
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193765 91177308-0d34-0410-b5e6-96231b3b80d8
2013-10-31 17:18:07 +00:00
Jim Grosbach
0e536ee4ca Legalize: Improve legalization of long vector extends.
When an extend more than doubles the size of the elements (e.g., a zext
from v16i8 to v16i32), the normal legalization method of splitting the
vectors will run into problems as by the time the destination vector is
legal, the source vector is illegal. The end result is the operation
often becoming scalarized, with the typical horrible performance. For
example, on x86_64, the simple input of:
define void @bar(<16 x i8> %a, <16 x i32>* %p) nounwind {
  %tmp = zext <16 x i8> %a to <16 x i32>
  store <16 x i32> %tmp, <16 x i32>*%p
  ret void
}

Generates:
  .section  __TEXT,__text,regular,pure_instructions
  .section  __TEXT,__const
  .align  5
LCPI0_0:
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .long 255                     ## 0xff
  .section  __TEXT,__text,regular,pure_instructions
  .globl  _bar
  .align  4, 0x90
_bar:
  vpunpckhbw  %xmm0, %xmm0, %xmm1
  vpunpckhwd  %xmm0, %xmm1, %xmm2
  vpmovzxwd %xmm1, %xmm1
  vinsertf128 $1, %xmm2, %ymm1, %ymm1
  vmovaps LCPI0_0(%rip), %ymm2
  vandps  %ymm2, %ymm1, %ymm1
  vpmovzxbw %xmm0, %xmm3
  vpunpckhwd  %xmm0, %xmm3, %xmm3
  vpmovzxbd %xmm0, %xmm0
  vinsertf128 $1, %xmm3, %ymm0, %ymm0
  vandps  %ymm2, %ymm0, %ymm0
  vmovaps %ymm0, (%rdi)
  vmovaps %ymm1, 32(%rdi)
  vzeroupper
  ret

So instead we can check if there are legal types that enable us to split
more cleverly when the input vector is already legal such that we don't
turn it into an illegal type. If the extend is such that it's more than
doubling the size of the input we check if
  - the number of vector elements is even,
  - the source type is legal,
  - the type of a split source is illegal,
  - the type of an extended (by doubling element size) source is legal, and
  - the type of that extended source when split is legal.
If the conditions are met, instead of just splitting both the
destination and the source types, we create an extend that only goes up
one "step" (doubling the element width), and the continue legalizing the
rest of the operation normally. The result is that this operates as a
new, more effecient, termination condition for the loop of "split the
operation until the destination type is legal."

With this change, the above example now compiles to:
_bar:
  vpxor %xmm1, %xmm1, %xmm1
  vpunpcklbw  %xmm1, %xmm0, %xmm2
  vpunpckhwd  %xmm1, %xmm2, %xmm3
  vpunpcklwd  %xmm1, %xmm2, %xmm2
  vinsertf128 $1, %xmm3, %ymm2, %ymm2
  vpunpckhbw  %xmm1, %xmm0, %xmm0
  vpunpckhwd  %xmm1, %xmm0, %xmm3
  vpunpcklwd  %xmm1, %xmm0, %xmm0
  vinsertf128 $1, %xmm3, %ymm0, %ymm0
  vmovaps %ymm0, 32(%rdi)
  vmovaps %ymm2, (%rdi)
  vzeroupper
  ret

This generalizes a custom lowering that was added a while back to the
ARM backend. That lowering is no longer necessary, and is removed. The
testcases for it, however, provide excellent ARM tests for this change
and so remain.

rdar://14735100

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193727 91177308-0d34-0410-b5e6-96231b3b80d8
2013-10-31 00:20:48 +00:00
Matt Arsenault
4f17f88071 Fix CodeGen for unaligned loads with address spaces
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193721 91177308-0d34-0410-b5e6-96231b3b80d8
2013-10-30 23:30:05 +00:00
Juergen Ributzka
9a5df73e32 Revert "SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too."
Now Hexagon and SystemZ are not happy with it :-(

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193677 91177308-0d34-0410-b5e6-96231b3b80d8
2013-10-30 06:36:19 +00:00
Juergen Ributzka
4eced19c50 SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too.
The Type Legalizer recognizes that VSELECT needs to be split, because the type
is to wide for the given target. The same does not always apply to SETCC,
because less space is required to encode the result of a comparison. As a result
VSELECT is split and SETCC is unrolled into scalar comparisons.

This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG
Combiner. If a matching pattern is found, then the result mask of SETCC is
promoted to the expected vector mask type for the given target. This mask has
usually the same size as the VSELECT return type (except for Intel KNL). Now the
type legalizer will split both VSELECT and SETCC.

This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX
pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.

Reviewed by Nadav

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193676 91177308-0d34-0410-b5e6-96231b3b80d8
2013-10-30 05:48:18 +00:00
Alp Toker
18a988e3a7 Fix "existant" typos
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193579 91177308-0d34-0410-b5e6-96231b3b80d8
2013-10-29 02:35:28 +00:00