Commit Graph

10245 Commits

Author SHA1 Message Date
Tim Northover
badb137729 ARM: expand atomic ldrex/strex loops in IR
The previous situation where ATOMIC_LOAD_WHATEVER nodes were expanded
at MachineInstr emission time had grown to be extremely large and
involved, to account for the subtly different code needed for the
various flavours (8/16/32/64 bit, cmpxchg/add/minmax).

Moving this transformation into the IR clears up the code
substantially, and makes future optimisations much easier:

1. an atomicrmw followed by using the *new* value can be more
   efficient. As an IR pass, simple CSE could handle this
   efficiently.
2. Making use of cmpxchg success/failure orderings only has to be done
   in one (simpler) place.
3. The common "cmpxchg; did we store?" idiom can be exposed to
   optimisation.

I intend to gradually improve this situation within the ARM backend
and make sure there are no hidden issues before moving the code out
into CodeGen to be shared with (at least ARM64/AArch64, though I think
PPC & Mips could benefit too).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205525 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-03 11:44:58 +00:00
Silviu Baranga
3f11cd0d25 [ARM] When generating a vpaddl node the input lane type is not always the type of the
add operation since extract_vector_elt can perform an extend operation. Get the input lane
type from the vector on which we're performing the vpaddl operation on and extend or
truncate it to the output type of the original add node.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205523 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-03 10:44:27 +00:00
Tim Northover
107283d7cb ARM64: add regression test for r205519.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205520 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-03 09:36:05 +00:00
Tim Northover
b642eb5dbc ARM64: don't generate __sincos_stret calls unless on MachO
This should fix PR19314.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205514 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-03 07:06:13 +00:00
Richard Trieu
1498ceee9e Fix test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205492 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-03 00:14:18 +00:00
Lang Hames
ed2154b816 [CodeGen] Teach the peephole optimizer to remember (and exploit) all folding
opportunities in the current basic block, rather than just the last one seen.

<rdar://problem/16478629>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205481 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-02 22:59:58 +00:00
Juergen Ributzka
75cea2c73e Add comments and test case for [DAG] Keep the opaque constant flag when performing unary constant folding operations (r204737).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205474 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-02 22:21:01 +00:00
Saleem Abdulrasool
6d4e5ab349 ARM: fixup tests to specify the target more explicitly
This changes the tests that were targeting ARM EABI to explicitly specify the
environment rather than relying on the default.  This breaks with the new
Windows on ARM support when running the tests on Windows where the default
environment is no longer EABI.

Take the opportunity to avoid a pointless redirect (helps when trying to debug
with providing a command line invocation which can be copy and pasted) and
removing a few greps in favour of FileCheck.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205465 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-02 21:22:03 +00:00
Saleem Abdulrasool
396e5e328c ARM: update subtarget information for Windows on ARM
Update the subtarget information for Windows on ARM.  This enables using the MC
layer to target Windows on ARM.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205459 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-02 20:32:05 +00:00
Tom Stellard
adb852ddf3 TargetLibraryInfo: Disable memcpy and memset on R600
There are no implementations of these for R600.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205455 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-02 19:53:29 +00:00
Jim Grosbach
acb6d9834a ARM: cortex-m0 doesn't support unaligned memory access.
Unlike other v6+ processors, cortex-m0 never supports unaligned accesses.
From the v6m ARM ARM:

"A3.2 Alignment support: ARMv6-M always generates a fault when an unaligned
access occurs."

rdar://16491560

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205452 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-02 19:28:13 +00:00
Kai Nacke
b96fc4a5ea [mips] Add more Octeon cnMips instructions
Adds the instructions ext/ext32/cins/cins32.
It also changes pop/dpop to accept the two operand version and
adds a simple pattern to generate baddu.
Tests for the two operand versions (including baddu/dmul/dpop/pop)
and the code generation pattern for baddu are included.

Reviewed by: Daniel.Sanders@imgtec.com


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205449 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-02 18:40:43 +00:00
Oliver Stannard
af48fc4136 ARM: Add support for segmented stacks
Patch by Alex Crichton, ILyoan, Luqman Aden and Svetoslav.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205430 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-02 16:10:33 +00:00
Tim Northover
6584d94610 ARM64: use GOT for weak symbols & PIC.
Weak symbols cannot use the small code model's usual ADRP sequences since the
instruction simply may not be able to encode a value of 0.

This redirects them to use the GOT, which hopefully linkers are able to cope
with even in the static relocation model.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205426 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-02 14:39:11 +00:00
Tim Northover
671c92d886 ARM64: fix lowering of fp128 fptosi/fptoui
We were creating libcall nodes that returned an MVT::f128, when these
particular operations actually return an int of some stripe.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205425 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-02 14:39:07 +00:00
Tim Northover
3844cadc9a ARM64: make sure first argument to INSERT_SUBVECTOR has right type.
Again, coalescing and other optimisations swiftly made the MachineInstrs
consistent again, but when compiled at -O0 a bad INSERT_SUBREGISTER was
produced.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205423 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-02 14:38:58 +00:00
Tim Northover
87e824120d ARM64: convert fp16 narrowing ISel to pseudo-instruction
The previous attempt was fine with optimisations, but was actually rather
cavalier with its types. When compiled at -O0, it produced invalid COPY
MachineInstrs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205422 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-02 14:38:54 +00:00
Job Noorman
4e7ec2b053 Mark FPB as a reserved register when needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205421 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-02 13:13:56 +00:00
Renato Golin
421397ac00 Remove duplicated DMB instructions
ARM specific optimiztion, finding places in ARM machine code where 2 dmbs
follow one another, and eliminating one of them.

Patch by Reinoud Elhorst.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205409 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-02 09:03:43 +00:00
Hal Finkel
4a6c0afc52 [PowerPC] Add some missing VSX bitcast patterns
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205352 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-01 19:24:27 +00:00
Reid Kleckner
f319a2c3b3 Support segmented stacks on Win64
Identical to Win32 method except the GS segment register is used for TLS
instead of FS and pvArbitrary is at TEB offset 0x28 instead of 0x14.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205342 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-01 18:34:21 +00:00
Matt Arsenault
a173c01556 Fix missing RUN line in test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205341 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-01 18:34:13 +00:00
Matt Arsenault
eb8eac6be2 Make isSetCCEquivalent respect the TargetBooleanContents
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205336 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-01 18:13:26 +00:00
Tim Northover
c077472250 ARM: add cyclone CPU with ZeroCycleZeroing feature.
The Cyclone CPU is similar to swift for most LLVM purposes, but does have two
preferred instructions for zeroing a VFP register. This teaches LLVM about
them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205309 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-01 13:22:02 +00:00
Tim Northover
ed839120c4 ARM64: add intrinsic for pmull (p64 x p64 = p128) operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205302 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-01 12:22:37 +00:00
Tim Northover
acfb679618 ARM64: add patterns for more lane-wise ld1/st1 operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205294 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-01 10:37:09 +00:00
Tim Northover
233c567ea9 ARM64: fix bug in ld3r (1d) SelectionDAG.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205293 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-01 10:37:03 +00:00
Alexey Volkov
1d75829ddc [x86] Do not convert to cmp32 for Atom arch by Sergey Okunev
Differential Revision: http://llvm-reviews.chandlerc.com/D2824


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205288 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-01 08:13:07 +00:00
David Blaikie
a07c1ab4e6 DebugInfo: Avoid creating unnecessary/empty line tables and remove the special case of '0' in DwarfCompileUnit::initStmtList by just always using a label difference
This moves one case of raw text checking down into the MCStreamer
interfaces in the form of a virtual function, even if we ultimately end
up consolidating on the one-or-many line tables issue one day, this is
nicer in the interim. This just generally streamlines a bunch of use
cases into a common code path.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205287 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-01 08:07:52 +00:00
Juergen Ributzka
d8c9577764 [Stackmaps] Update the stackmap format to use 64-bit relocations for the function address and properly align all entries.
This commit updates the stackmap format to version 1 to indicate the
reorganizaion of several fields. This was done in order to align stackmap
entries to their natural alignment and to minimize padding.

Fixes <rdar://problem/16005902>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205254 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-31 22:14:04 +00:00
Matt Arsenault
193c3e91b9 R600: Compute masked bits for min and max
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205242 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-31 19:35:33 +00:00
Matt Arsenault
828bfc7350 R600: Add BFE, BFI, and BFM intrinsics to help with writing tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205236 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-31 18:21:18 +00:00
Hal Finkel
e2b3751924 [PowerPC] Don't ever expand BUILD_VECTOR of v2i64 with shuffles
If we have two unique values for a v2i64 build vector, this will always result
in two vector loads if we expand using shuffles. Only one is necessary.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205231 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-31 17:48:16 +00:00
Yaron Keren
b6f3f9ba45 Two updated tests for MinGW 32 and 64 exception handling code generation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205227 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-31 17:34:15 +00:00
Eli Bendersky
416a0e993f Fix for PR19099 - NVPTX produces invalid symbol names.
This is a more thorough fix for the issue than r203483. An IR pass will run
before NVPTX codegen to make sure there are no invalid symbol names that can't
be consumed by the ptxas assembler.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205212 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-31 15:56:26 +00:00
Tim Northover
93b3fcae28 ARM64: add extra patterns for scalar shifts
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205209 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-31 15:46:46 +00:00
Tim Northover
812a174ba4 ARM64: add extra scalar neg pattern & tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205208 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-31 15:46:42 +00:00
Tim Northover
b35ffdff16 ARM64: add patterns for scalar sqdmlal & sqdmlsl.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205207 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-31 15:46:38 +00:00
Tim Northover
9542846832 ARM64: add more patterns for commuted fmsub operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205206 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-31 15:46:34 +00:00
Tim Northover
4417e07e39 ARM64: shuffle patterns around for fmin/fmax & add tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205205 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-31 15:46:30 +00:00
Tim Northover
576c3f709f ARM64: add more scalar patterns for usqadd & suqadd.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205204 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-31 15:46:26 +00:00
Tim Northover
13277a78bc ARM64: add more scalar patterns for reciprocal ops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205203 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-31 15:46:22 +00:00
Tim Northover
8f93e159ed ARM64: add i64 scalar pattern for @llvm.arm64.abs
This will be used by the Clang front-end code for vabsd_s64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205202 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-31 15:46:17 +00:00
Tom Stellard
1f143aa3e9 R600/SI: Lower i64 SELECT by bitcasting to a vector type
This allows allows us to replace ISD::EXTRACT_ELEMENT, which is lowered
using shifts, with ISD::EXTRACT_VECTOR_ELT, which is a no-op.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205187 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-31 14:01:55 +00:00
Zoran Jovanovic
077aa54e4e Fixed issue with microMIPS JAL instruction.
Differential Revision: http://llvm-reviews.chandlerc.com/D3200


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205185 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-31 14:00:10 +00:00
Hal Finkel
65fafbb109 Look at shuffles of build_vectors in DAGCombiner::visitEXTRACT_VECTOR_ELT
When the loop vectorizer vectorizes code that uses the loop induction variable,
we often end up with IR like this:

  %b1 = insertelement <2 x i32> undef, i32 %v, i32 0
  %b2 = shufflevector <2 x i32> %b1, <2 x i32> undef, <2 x i32> zeroinitializer
  %i = add <2 x i32> %b2, <i32 2, i32 3>

If the add in this example is not legal (as is the case on PPC with VSX), it
will be scalarized, and we'll end up with a number of extract_vector_elt nodes
with the vector shuffle as the input operand, and that vector shuffle is fed by
one or more build_vector nodes. By the time that vector operations are
expanded, visitEXTRACT_VECTOR_ELT will not create new extract_vector_elt by
looking through the vector shuffle (to make sure that no illegal operations are
created), and so the extract_vector_elt -> vector shuffle -> build_vector is
never simplified to an operand of the build vector.

By looking at build_vectors through a shuffle we fix this particular situation,
preventing a vector from being built, only to be deconstructed again (for the
scalarized add) -- an expensive proposition when this all needs to be done via
the stack. We probably want a more comprehensive fix here where we look back
recursively through any shuffles to any build_vectors or scalar_to_vectors,
etc. but that can come later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205179 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-31 11:43:19 +00:00
Daniel Sanders
dd67fd636c [mips] Check emitted code for llvm.bswap.i32 on MIPS16/MIPS64 and llvm.bswap.i64 on MIPS16.
While reviewing r204163, I noticed that the MIPS16 test only checked for a .ent
directive and didn't actually check the code emitted. Fixed this and added a
check for llvm.bswap.i32 on MIPS64 at the same time.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205177 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-31 11:00:04 +00:00
Chandler Carruth
2530cd31f4 [ARM64] Fix materialization of an fp128 zero immediate. There currently
is not a pattern to lower this with clever instructions that zero the
register, so restrict the zero immediate legality special case to f64
and f32 (the only two sizes which fmov seems to directly support). Fixes
backend errors when building code such as libxml.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205161 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-31 00:02:10 +00:00
Hal Finkel
111bcf9b59 Make use of previously generated stores in SelectionDAGLegalize::ExpandExtractFromVectorThroughStack
When expanding EXTRACT_VECTOR_ELT and EXTRACT_SUBVECTOR using
SelectionDAGLegalize::ExpandExtractFromVectorThroughStack, we store the entire
vector and then load the piece we want. This is fine in isolation, but
generating a new store (and corresponding stack slot) for each extraction ends
up producing code of poor quality. When we scalarize a vector operation (using
SelectionDAG::UnrollVectorOp for example) we generate one EXTRACT_VECTOR_ELT
for each element in the vector. This used to generate one stored copy of the
vector for each element in the vector. Now we search the uses of the vector for
a suitable store before generating a new one, which results in much more
efficient scalarization code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205153 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-30 15:10:18 +00:00
Hal Finkel
ee8e48d4c9 [PowerPC] Handle VSX v2i64 SIGN_EXTEND_INREG
sitofp from v2i32 to v2f64 ends up generating a SIGN_EXTEND_INREG v2i64 node
(and similarly for v2i16 and v2i8). Even though there are no sign-extension (or
algebraic shifts) for v2i64 types, we can handle v2i32 sign extensions by
converting two and from v2i64. The small trick necessary here is to shift the
i32 elements into the right lanes before the i32 -> f64 step. This is because
of the big Endian nature of the system, we need the i32 portion in the high
word of the i64 elements.

For v2i16 and v2i8 we can do the same, but we first use the default Altivec
shift-based expansion from v2i16 or v2i8 to v2i32 (by casting to v4i32) and
then apply the above procedure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205146 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-30 13:22:59 +00:00