9709 Commits

Author SHA1 Message Date
Chad Rosier
72800f3a06 [AArch64] Refactor the Neon vector/scalar floating-point convert implementation.
Specifically, reuse the ARM intrinsics when possible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196926 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-10 15:35:33 +00:00
Andrea Di Biagio
4b3fcc21ec Ensure that the backend no longer emits unnecessary vector insert instructions
immediately after SSE scalar fp instructions like addss or mulss.

Added patterns to select SSE scalar fp arithmetic instructions from a scalar
fp operation followed by a blend.

For example, given the following code:
  __m128 foo(__m128 A, __m128 B) {
    A[0] += B[0];
    return A;
  }

previously we generated:
  addss %xmm0, %xmm1
  movss %xmm1, %xmm0

now we generate:
  addss %xmm1, %xmm0



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196925 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-10 15:22:48 +00:00
Vincent Lejeune
a563c91840 R600: Fix an infinite loop when trying to reorganize export/tex vector input
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196923 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-10 14:43:31 +00:00
Vincent Lejeune
8ff689b443 R600: Fix input modifiers lost for Cayman
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196922 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-10 14:43:27 +00:00
Reed Kotler
526522728c Next step in Mips16 prologue/epilogue cleanup.
Save S2(reg 18) only when we are calling floating point stubs that
have a return value of float or complex. Some more work to make this
better but this is the first step.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196921 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-10 14:29:38 +00:00
Elena Demikhovsky
8a8581ca4b AVX-512: changed intrinsics for mask operations
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196918 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-10 13:53:10 +00:00
Elena Demikhovsky
89458ced87 AVX-512: Changed intrinsics of VPCONFLICT to match GCC builtin form
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196914 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-10 11:58:35 +00:00
Daniel Sanders
dafdc80765 [mips][msa] Correct sld and sldi builtins.
Summary: The result register of these instructions is also the first operand.

Reviewers: jacksprat, dsanders

Reviewed By: dsanders

Differential Revision: http://llvm-reviews.chandlerc.com/D2362
Differential Revision: http://llvm-reviews.chandlerc.com/D2363



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196910 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-10 11:37:00 +00:00
Richard Sandiford
aedb288d86 Add TargetLowering::prepareVolatileOrAtomicLoad
One unusual feature of the z architecture is that the result of a
previous load can be reused indefinitely for subsequent loads, even if
a cache-coherent store to that location is performed by another CPU.
A special serializing instruction must be used if you want to force
a load to be reattempted.

Since volatile loads are not supposed to be omitted in this way,
we should insert a serializing instruction before each such load.
The same goes for atomic loads.

The patch implements this at the IR->DAG boundary, in a similar way
to atomic fences.  It is a no-op for targets other than SystemZ.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196906 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-10 10:49:34 +00:00
Richard Sandiford
086791eca2 Add TargetLowering::prepareVolatileOrAtomicLoad
One unusual feature of the z architecture is that the result of a
previous load can be reused indefinitely for subsequent loads, even if
a cache-coherent store to that location is performed by another CPU.
A special serializing instruction must be used if you want to force
a load to be reattempted.

Since volatile loads are not supposed to be omitted in this way,
we should insert a serializing instruction before each such load.
The same goes for atomic loads.

The patch implements this at the IR->DAG boundary, in a similar way
to atomic fences.  It is a no-op for targets other than SystemZ.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196905 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-10 10:36:34 +00:00
Kevin Qin
3171b8df48 [AArch64 NEON] Support poly128_t and implement relevant intrinsic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196887 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-10 06:48:35 +00:00
Reid Kleckner
cc8d39acf5 Revert "Fix miscompile of MS inline assembly with stack realignment"
This reverts commit r196876.  Its tests failed on the bots, so I'll
figure it out tomorrow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196879 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-10 05:31:27 +00:00
Reid Kleckner
ec4d326aad Fix miscompile of MS inline assembly with stack realignment
For stack frames requiring realignment, three pointers may be needed:
- ebp to address incoming arguments
- esi (could be any callee-saved register) to address locals
- esp to address outgoing arguments

We would use esi unconditionally without verifying that it did not
conflict with inline assembly.

This change doesn't do the verification, it simply emits a fatal error
on functions that use stack realignment, dynamic SP adjustments, and
inline assembly.

Because stack realignment is common on Windows, we also no longer assume
that MS inline assembly clobbers esp.  Instead, we analyze the inline
instructions for implicit definitions and check if esp is there.  If so,
we require the use of a base pointer and consider it in the condition
above.

Mostly fixes PR16830, but we could try harder to find a non-conflicting
base pointer.

Reviewers: sunfish

Differential Revision: http://llvm-reviews.chandlerc.com/D1317

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196876 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-10 05:12:23 +00:00
Nadav Rotem
6806e11612 Fix PR18162 - Incorrect assertion assumed that the SDValue resno is zero.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196858 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-10 01:13:59 +00:00
Chad Rosier
e02fa056d9 [AArch64] Refactor the NEON scalar reduce pairwise intrinsics, so that they use
float/double rather than the vector equivalents when appropriate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196833 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-09 22:47:38 +00:00
Chad Rosier
97eda18693 [AArch64] Refactor NEON scalar reduce pairwise front-end codegen to remove
unnecessary patterns in tablegen.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196832 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-09 22:47:34 +00:00
Chad Rosier
6c6344e6a9 [AArch64] Remove q and non-q intrinsic definitions in the NEON scalar reduce
pairwise implementation, using an overloaded definition instead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196831 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-09 22:47:31 +00:00
Rafael Espindola
5201d61654 Don't add suffixes for stdcall/fastcall on 64 coff.
This matches the behavior of both msvc and mingw.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196814 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-09 20:44:48 +00:00
Ana Pazos
ddf4eb3d03 Fix pattern match for movi with 0D result
Patch by Jiangning Liu.

With some test case changes:
- intrinsic test added to the existing /test/CodeGen/AArch64/neon-aba-abd.ll.
- New test cases to cover movi 1D scenario without using the intrinsic in
test/CodeGen/AArch64/neon-mov.ll.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196806 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-09 19:29:14 +00:00
Daniel Sanders
68138dc9a8 [mips][msa] Fix invalid generated code when lowering FrameIndex involving unaligned offsets.
Summary:
The MSA ld.[bhwd] and st.[bhwd] instructions scale the immediate by the
element size before use as an offset. The offset must therefore be a
multiple of the element size to be valid in these instructions. However,
an unaligned base address is valid in MSA.

This commit causes the compiler to emit valid code when the calculated
offset is not a multiple of the element size by accounting for the offset
using addiu and using a zero offset in the load/store.

Depends on D2338

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D2339

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196777 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-09 12:47:12 +00:00
Daniel Sanders
897268d931 [mips][msa] Fix suboptimal FrameIndex lowering for ld.[hwd] and st.[hwd]
Summary:
The immediate in these instructions is scaled before use as an offset.
They therefore have a wider reach than ld.b/st.b.

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D2338

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196775 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-09 11:50:16 +00:00
Venkatraman Govindaraju
847b5d976d [SPARCV9]: Adjust the resultant pointer of DYNAMIC_STACKALLOC with the stack BIAS on sparcV9.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196755 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-09 05:13:25 +00:00
Venkatraman Govindaraju
dc50e9af4b [Sparc]: Implement getSetCCResultType() in SparcTargetLowering so that umulo/smulo can be lowered on sparcv9 without an assertion error.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196751 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-09 04:02:15 +00:00
Hao Liu
a339740cb8 [AArch64]Add missing pair intrinsics such as:
int32_t vminv_s32(int32x2_t a)
which should be compiled into SMINP Vd.2S,Vn.2S,Vm.2S


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196749 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-09 03:51:42 +00:00
Hao Liu
2f3f02f6f5 [AArch64]Pattern match failures for truncate store and extend load
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196748 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-09 03:34:08 +00:00
Venkatraman Govindaraju
e0dc442801 [SparcV9]: Expand MULHU/MULHS:i64 and UMUL_LOHI/SMUL_LOHI:i64 on sparcv9.
This fixes PR18150.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196735 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-08 22:06:07 +00:00
Reed Kotler
c9ea75ee5b Cleaning up of prologue/epilogue code for Mips16. First step
here is to make save/restore into variable number of argument instructions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196726 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-08 16:51:52 +00:00
Tim Northover
7c4342e90b ARM: fix folding of stack-adjustment (yet again).
When trying to eliminate an "sub sp, sp, #N" instruction by folding
it into an existing push/pop using dummy registers, we need to account
for the fact that this might affect precisely how "fp" gets set in the
prologue.

We were attempting this, but assuming that *whenever* we performed a
fold it would make a difference. This is false, for example, in:
    push {r4, r7, lr}
    add fp, sp, #4
    vpush {d8}
    sub sp, sp, #8

we can fold the "sub" into the "vpush", forming "vpush {d7, d8}".
However, in that case the "add fp" instruction mustn't change, which
we were getting wrong before.

Should fix PR18160.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196725 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-08 15:56:50 +00:00
Akira Hatanaka
ca37060166 [mips] Fix test case.
Indent the command lines to indicate they continue from previous lines. Also,
fix incorrect uses of CHECK-DAG and CHECK-NOT.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196636 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-07 02:48:29 +00:00
Vincent Lejeune
d254d3111e Add a RequireStructuredCFG Field to TargetMachine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196634 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-07 01:49:19 +00:00
Weiming Zhao
1c6611db44 Bug 18149: [AArch32] VSel instructions has no ARMCC field
The current peephole optimizing for compare inst assumes an instr that
uses CPSR has an MO for ARM Cond code.However, for VSEL instructions
(vseqeq, vselgt, vselgt, vselvs), there is no such operand nor do
they support the modification of Cond Code.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196588 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-06 17:56:48 +00:00
Cameron McInally
febc28b529 Update AVX512 vector blend intrinsic names.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196581 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-06 13:35:35 +00:00
Richard Sandiford
9f9758935a [SystemZ] Use LOAD AND TEST for comparisons with -0
...since it os equivalent to comparison with +0.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196580 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-06 09:59:12 +00:00
Richard Sandiford
8bf51dc72b [SystemZ] Extend the use of C(L)GFR
instcombine prefers to put extended operands first, so this patch
handles that case for C(L)GFR.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196579 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-06 09:56:50 +00:00
Richard Sandiford
e3a804ba21 [SystemZ] Optimize selects between 0 and -1
Since z has no setcc instruction as such, the choice of setBooleanContents
is a bit arbitrary.  Currently it's set to ZeroOrOneBooleanContent,
so we produced a branch-free form when selecting between 0 and 1,
but not when selecting between 0 and -1.  This patch handles the latter
case too.

At some point I'd like to measure whether it's better to use conditional
moves for constant selects on z196, but that's future work.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196578 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-06 09:53:09 +00:00
Juergen Ributzka
fca7695903 [Stackmap] Update stackmap unit test to use AnyRegCC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196552 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-06 00:28:54 +00:00
Andrew Trick
573931394f MI-Sched: handle latency of in-order operations with the new machine model.
The per-operand machine model allows the target to define "unbuffered"
processor resources. This change is a quick, cheap way to model stalls
caused by the latency of operations that use such resources. This only
applies when the processor's micro-op buffer size is non-zero
(Out-of-Order). We can't precisely model in-order stalls during
out-of-order execution, but this is an easy and effective
heuristic. It benefits cortex-a9 scheduling when using the new
machine model, which is not yet on by default.

MI-Sched for armv7 was evaluated on Swift (and only not enabled because
of a performance bug related to predication). However, we never
evaluated Cortex-A9 performance on MI-Sched in its current form. This
change adds MI-Sched functionality to reach performance goals on
A9. The only remaining change is to allow MI-Sched to run as a PostRA
pass.

I evaluated performance using a set of options to estimate the performance impact once MI sched is default on armv7:
-mcpu=cortex-a9 -disable-post-ra -misched-bench -scheditins=false

For a simple saxpy loop I see a 1.7x speedup. Here are the llvm-testsuite results:
(min run time over 2 runs, filtering tiny changes)

Speedups:
| Benchmarks/BenchmarkGame/recursive         |  52.39% |
| Benchmarks/VersaBench/beamformer           |  20.80% |
| Benchmarks/Misc/pi                         |  19.97% |
| Benchmarks/Misc/mandel-2                   |  19.95% |
| SPEC/CFP2000/188.ammp                      |  18.72% |
| Benchmarks/McCat/08-main/main              |  18.58% |
| Benchmarks/Misc-C++/Large/sphereflake      |  18.46% |
| Benchmarks/Olden/power                     |  17.11% |
| Benchmarks/Misc-C++/mandel-text            |  16.47% |
| Benchmarks/Misc/oourafft                   |  15.94% |
| Benchmarks/Misc/flops-7                    |  14.99% |
| Benchmarks/FreeBench/distray               |  14.26% |
| SPEC/CFP2006/470.lbm                       |  14.00% |
| mediabench/mpeg2/mpeg2dec/mpeg2decode      |  12.28% |
| Benchmarks/SmallPT/smallpt                 |  10.36% |
| Benchmarks/Misc-C++/Large/ray              |   8.97% |
| Benchmarks/Misc/fp-convert                 |   8.75% |
| Benchmarks/Olden/perimeter                 |   7.10% |
| Benchmarks/Bullet/bullet                   |   7.03% |
| Benchmarks/Misc/mandel                     |   6.75% |
| Benchmarks/Olden/voronoi                   |   6.26% |
| Benchmarks/Misc/flops-8                    |   5.77% |
| Benchmarks/Misc/matmul_f64_4x4             |   5.19% |
| Benchmarks/MiBench/security-rijndael       |   5.15% |
| Benchmarks/Misc/flops-6                    |   5.10% |
| Benchmarks/Olden/tsp                       |   4.46% |
| Benchmarks/MiBench/consumer-lame           |   4.28% |
| Benchmarks/Misc/flops-5                    |   4.27% |
| Benchmarks/mafft/pairlocalalign            |   4.19% |
| Benchmarks/Misc/himenobmtxpa               |   4.07% |
| Benchmarks/Misc/lowercase                  |   4.06% |
| SPEC/CFP2006/433.milc                      |   3.99% |
| Benchmarks/tramp3d-v4                      |   3.79% |
| Benchmarks/FreeBench/pifft                 |   3.66% |
| Benchmarks/Ptrdist/ks                      |   3.21% |
| Benchmarks/Adobe-C++/loop_unroll           |   3.12% |
| SPEC/CINT2000/175.vpr                      |   3.12% |
| Benchmarks/nbench                          |   2.98% |
| SPEC/CFP2000/183.equake                    |   2.91% |
| Benchmarks/Misc/perlin                     |   2.85% |
| Benchmarks/Misc/flops-1                    |   2.82% |
| Benchmarks/Misc-C++-EH/spirit              |   2.80% |
| Benchmarks/Misc/flops-2                    |   2.77% |
| Benchmarks/NPB-serial/is                   |   2.42% |
| Benchmarks/ASC_Sequoia/CrystalMk           |   2.33% |
| Benchmarks/BenchmarkGame/n-body            |   2.28% |
| Benchmarks/SciMark2-C/scimark2             |   2.27% |
| Benchmarks/Olden/bh                        |   2.03% |
| skidmarks10/skidmarks                      |   1.81% |
| Benchmarks/Misc/flops                      |   1.72% |

Slowdowns:
| Benchmarks/llubenchmark/llu                | -14.14% |
| Benchmarks/Polybench/stencils/seidel-2d    |  -5.67% |
| Benchmarks/Adobe-C++/functionobjects       |  -5.25% |
| Benchmarks/Misc-C++/oopack_v1p8            |  -5.00% |
| Benchmarks/Shootout/hash                   |  -2.35% |
| Benchmarks/Prolangs-C++/ocean              |  -2.01% |
| Benchmarks/Polybench/medley/floyd-warshall |  -1.98% |
| Polybench/linear-algebra/kernels/3mm       |  -1.95% |
| Benchmarks/McCat/09-vor/vor                |  -1.68% |

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196516 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-05 17:55:58 +00:00
Justin Holewinski
7add5421a6 [NVPTX] Fix off-by-one error when creating the VT list for an SDNode
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196503 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-05 12:58:00 +00:00
Matheus Almeida
bc7114feab [mips] Small code generation improvement for conditional operator (select)
in case the operands are constants and its difference is |1|.
It should be possible in those cases to rematerialize the result using
MIPS's slt and similar instructions.

The small update to some of the tests in cmov.ll, sel1c.ll and sel2c.ll was needed
otherwise the optimization implemented in this patch would have been triggered
(difference between the operands was 1) and that would have changed the semantic
of the tests.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196498 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-05 12:07:05 +00:00
Tim Northover
52123d1842 ARM: fix yet another stack-folding bug
We were trying to fold the stack adjustment into the wrong instruction in the
situation where the entire basic-block was epilogue code. Really, it can only
ever be valid to do the folding precisely where the "add sp, ..." would be
placed so there's no need for a separate iterator to track that.

Should fix PR18136.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196493 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-05 11:02:02 +00:00
Alp Toker
087ab613f4 Correct word hyphenations
This patch tries to avoid unrelated changes other than fixing a few
hyphen-related ambiguities and contractions in nearby lines.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196471 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-05 05:44:44 +00:00
Rafael Espindola
9155b17815 Hide the stub created for MO_ExternalSymbol too.
given

declare void @llvm.memset.p0i8.i32(i8* nocapture, i8, i32, i32, i1)
declare void @foo()
define void @bar() {
  call void @foo()
  call void @llvm.memset.p0i8.i32(i8* null, i8 0, i32 188, i32 1, i1 false)
  ret void
}

We used to produce

L_foo$stub:
        .indirect_symbol        _foo
        .ascii  "\364\364\364\364\364"

_memset$stub:
        .indirect_symbol        _memset
        .ascii  "\364\364\364\364\364"

We not produce a private stub for memset too.

Stubs are not needed with recent linkers, but we still produce them for darwin8.

Thanks to David Fang for confirming that gcc used to do this too.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196468 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-05 05:19:12 +00:00
Matt Arsenault
87234703e8 R600/SI: Add comments for number of used registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196467 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-05 05:15:35 +00:00
Jiangning Liu
4fd58529ab For AArch64, add missing register cost calculation for big value types like v4i64 and v8i64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196456 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-05 02:12:01 +00:00
Cameron McInally
f6770bcee8 Add FileCheck statements for r196435.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196449 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-05 01:20:36 +00:00
Cameron McInally
6c8faddaf5 Add AVX512 patterns for v16i32 broadcast and v2i64 zero extend load.
Patch by Aleksey Bader.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196435 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-05 00:11:25 +00:00
David Peixotto
0fc8c68b11 Add support for parsing ARM symbol variants on ELF targets
ARM symbol variants are written with parens instead of @ like this:

  .word __GLOBAL_I_a(target1)

This commit adds support for parsing these symbol variants in
expressions. We introduce a new flag to MCAsmInfo that indicates the
parser should use parens to parse the symbol variant. The expression
parser is modified to look for symbol variants using parens instead
of @ when the corresponding MCAsmInfo flag is true.

The MCAsmInfo parens flag is enabled only for ARM on ELF.

By adding this flag to MCAsmInfo, we are able to get rid of
redundant ARM-specific symbol variants and use the generic variants
instead (e.g. VK_GOT instead of VK_ARM_GOT). We use the new
UseParensForSymbolVariant attribute in MCAsmInfo to correctly print
the symbol variants for arm.

To achive this we need to keep a handle to the MCAsmInfo in the
MCSymbolRefExpr class that we can check when printing the symbol
variant.

Updated Tests:
  Changed case of symbol variant to match the generic kind.
  test/CodeGen/ARM/tls-models.ll
  test/CodeGen/ARM/tls1.ll
  test/CodeGen/ARM/tls2.ll
  test/CodeGen/Thumb2/tls1.ll
  test/CodeGen/Thumb2/tls2.ll

PR18080


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196424 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-04 22:43:20 +00:00
Cameron McInally
6d3d93c40b Fix assembly syntax for AVX512 vector blend instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196393 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-04 18:05:36 +00:00
Cameron McInally
80955805e4 Suppress '(x < y) ? a : 0 -> (x < y) & a' transform on X86 architectures with dedicated mask registers.
Patch by Aleksey Bader.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196386 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-04 14:52:33 +00:00
Kevin Qin
dd302615b1 [AArch64 Neon] Add ACLE intrinsic vceqz_f64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196362 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-04 08:02:34 +00:00