Commit Graph

22519 Commits

Author SHA1 Message Date
Tom Stellard
cbf79028c3 R600: Correctly handle vertex fetch clauses the precede ENDIFs
The control flow finalizer would sometimes use an ALU_POP_AFTER
instruction before the vetex fetch clause instead of using a POP
instruction after it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199917 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-23 18:49:31 +00:00
Tom Stellard
01df2fa3c4 R600: Unconditionally unroll loops that contain GEPs with alloca pointers
Implement the getUnrollingPreferences() function for
AMDGPUTargetTransformInfo so that loops that do address calculations
on pointers derived from alloca are unconditionally unrolled.

Unrolling these loops makes it more likely that SROA will be able to
eliminate the allocas, which is a big win for R600 since memory
allocated by alloca (private memory) is really slow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199916 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-23 18:49:28 +00:00
Andrew Trick
fc7edee631 Move a unit test into the correct dir. Sorry if it broke Mips-only builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199911 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-23 17:47:57 +00:00
Rafael Espindola
35b78fd04c Remove tail marker when changing an argument to an alloca.
Argument promotion can replace an argument of a call with an alloca. This
requires clearing the tail marker as it is very likely that the callee is now
using an alloca in the caller.

This fixes pr14710.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199909 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-23 17:19:42 +00:00
Tom Stellard
7d3b9d96b6 R600: Recommit 199842: Add work-around for the CF stack entry HW bug
The unit test is now disabled on non-asserts builds.

The CF stack can be corrupted if you use CF_ALU_PUSH_BEFORE,
CF_ALU_ELSE_AFTER, CF_ALU_BREAK, or CF_ALU_CONTINUE when the number of
sub-entries on the stack is greater than or equal to the stack entry
size and sub-entries modulo 4 is either 0 or 3 (on cedar the bug is
present when number of sub-entries module 8 is either 7 or 0)

We choose to be conservative and always apply the work-around when the
number of sub-enries is greater than or equal to the stack entry size,
so that we can safely over-allocate the stack when we are unsure of the
stack allocation rules.

reviewed-by: Vincent Lejeune <vljn at ovi.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199905 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-23 16:18:02 +00:00
Simon Atanasyan
03b94a460d [Object][ELF][Mips] Print symbol name for MIPS ELF relocations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199898 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-23 15:05:45 +00:00
Elena Demikhovsky
e1a621d84f AVX-512: added VPERM2D VPERM2Q VPERM2PS VPERM2PD instructions,
they give better sequences than VPERMI


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199893 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-23 14:27:26 +00:00
Tim Northover
1334acd8c6 ARM: use litpools for normal i32 imms when compiling minsize.
With constant-sharing, litpool loads consume 4 + N*2 bytes of code, but
movw/movt pairs consume 8*N. This means litpools are better than movw/movt even
with just one use. Other materialisation strategies can still be better though,
so the logic is a little odd.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199891 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-23 13:43:47 +00:00
Artyom Skrobov
93681fa6c6 Prevent repetitive warnings for unrecognized processors and features
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199886 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-23 11:31:38 +00:00
Chandler Carruth
aaf44af769 [LPM] Make LoopSimplify no longer a LoopPass and instead both a utility
function and a FunctionPass.

This has many benefits. The motivating use case was to be able to
compute function analysis passes *after* running LoopSimplify (to avoid
invalidating them) and then to run other passes which require
LoopSimplify. Specifically passes like unrolling and vectorization are
critical to wire up to BranchProbabilityInfo and BlockFrequencyInfo so
that they can be profile aware. For the LoopVectorize pass the only
things in the way are LoopSimplify and LCSSA. This fixes LoopSimplify
and LCSSA is next on my list.

There are also a bunch of other benefits of doing this:
- It is now very feasible to make more passes *preserve* LoopSimplify
  because they can simply run it after changing a loop. Because
  subsequence passes can assume LoopSimplify is preserved we can reduce
  the runs of this pass to the times when we actually mutate a loop
  structure.
- The new pass manager should be able to more easily support loop passes
  factored in this way.
- We can at long, long last observe that LoopSimplify is preserved
  across SCEV. This *halves* the number of times we run LoopSimplify!!!

Now, getting here wasn't trivial. First off, the interfaces used by
LoopSimplify are all over the map regarding how analysis are updated. We
end up with weird "pass" parameters as a consequence. I'll try to clean
at least some of this up later -- I'll have to have it all clean for the
new pass manager.

Next up I discovered a really frustrating bug. LoopUnroll *claims* to
preserve LoopSimplify. That's actually a lie. But the way the
LoopPassManager ends up running the passes, it always ran LoopSimplify
on the unrolled-into loop, rectifying this oversight before any
verification could kick in and point out that in fact nothing was
preserved. So I've added code to the unroller to *actually* simplify the
surrounding loop when it succeeds at unrolling.

The only functional change in the test suite is that we now catch a case
that was previously missed because SCEV and other loop transforms see
their containing loops as simplified and thus don't miss some
opportunities. One test case has been converted to check that we catch
this case rather than checking that we miss it but at least don't get
the wrong answer.

Note that I have #if-ed out all of the verification logic in
LoopSimplify! This is a temporary workaround while extracting these bits
from the LoopPassManager. Currently, there is no way to have a pass in
the LoopPassManager which preserves LoopSimplify along with one which
does not. The LPM will try to verify on each loop in the nest that
LoopSimplify holds but the now-Function-pass cannot distinguish what
loop is being verified and so must try to verify all of them. The inner
most loop is clearly no longer simplified as there is a pass which
didn't even *attempt* to preserve it. =/ Once I get LCSSA out (and maybe
LoopVectorize and some other fixes) I'll be able to re-enable this check
and catch any places where we are still failing to preserve
LoopSimplify. If this causes problems I can back this out and try to
commit *all* of this at once, but so far this seems to work and allow
much more incremental progress.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199884 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-23 11:23:19 +00:00
Hao Liu
fa6e5cb511 [AArch64]Add CHECK for two test cases testing scalar_to_vector committed in r199461.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199861 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-23 02:09:30 +00:00
Jack Carter
2ccf523ce7 [Mips] TargetStreamer Support for .set mips16.
This patch updates .set mips16 support which
affects the ELF ABI and its flags. In addition the patch uses
a common interface for both the MipsTargetSteamer and
MipsObjectStreamer that the assembler uses for
both ELF and ASCII output for these directives.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199851 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 23:08:42 +00:00
Owen Anderson
9920bd341a Revert r162101 and replace it with a solution that works for targets where the pointer type is illegal.
This is a horrible bit of code.  We're calling a simplification routine *in the middle* of type legalization.  We tell the
simplification routine that it's running after legalization, but some of the types it will encounter will be illegal!  The
fix is only to invoke the simplification if the types in question were legal, so that none of its invariants will be violated.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199847 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 22:34:17 +00:00
Matt Arsenault
4c72b6da97 Add CHECK-LABELs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199846 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 22:32:58 +00:00
Tom Stellard
8a4f11e3b6 Revert "R600: Add work-around for the CF stack entry HW bug"
This reverts commit 35b8331cad6eb512a2506adbc394201181da94ba.

The -debug-only flag for llc doesn't appear to be available in
all build configurations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199845 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 22:20:54 +00:00
Rafael Espindola
a73e920f09 Provide a dummy section to fix a crash with inline assembly in LTO.
Fixes pr18508.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199843 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 22:11:14 +00:00
Tom Stellard
efa1355495 R600: Add work-around for the CF stack entry HW bug
The CF stack can be corrupted if you use CF_ALU_PUSH_BEFORE,
CF_ALU_ELSE_AFTER, CF_ALU_BREAK, or CF_ALU_CONTINUE when the number of
sub-entries on the stack is greater than or equal to the stack entry
size and sub-entries modulo 4 is either 0 or 3 (on cedar the bug is
present when number of sub-entries module 8 is either 7 or 0)

We choose to be conservative and always apply the work-around when the
number of sub-enries is greater than or equal to the stack entry size,
so that we can safely over-allocate the stack when we are unsure of the
stack allocation rules.

reviewed-by: Vincent Lejeune <vljn at ovi.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199842 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 21:55:46 +00:00
Tom Stellard
5c0c884e42 R600: Refactor stack size calculation
reviewed-by: Vincent Lejeune <vljn at ovi.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199840 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 21:55:43 +00:00
Matt Arsenault
88a9f0476c Handle an addrspacecast case in memcpyopt
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199836 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 21:53:19 +00:00
Alp Toker
1214e71d77 Eliminate inappropriate use of FindProgramByName() from lli
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199835 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 21:52:35 +00:00
Quentin Colombet
8b594ba851 Add a testcase for r199430.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199831 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 20:11:50 +00:00
Tom Stellard
b1d24c51fc R600: MOVA is vector only
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199827 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 19:24:24 +00:00
Tom Stellard
e7d4e83702 R600: Take alignment into account when calculating the stack offset
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199826 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 19:24:23 +00:00
Tom Stellard
9c3e0ede1d R600: Add support for global addresses with constant initializers
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199825 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 19:24:21 +00:00
Tom Stellard
655ba251b5 R600: Begin private memory at the second GPR.
This way private memory does not over-write work group information
stored in GPRs 0 and 1.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199824 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 19:24:19 +00:00
Tom Stellard
7dd37ae57a R600/SI: Add support for i8 and i16 private loads/stores
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199823 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 19:24:14 +00:00
Matt Arsenault
79e3fb53d6 Bug 18228 - Fix accepting bitcasts between vectors of pointers with a
different number of elements.

Bitcasts were passing with vectors of pointers with different number of
elements since the number of elements was checking
SrcTy->getVectorNumElements() == SrcTy->getVectorNumElements() which
isn't helpful. The addrspacecast was also wrong, but that case at least
is caught by the verifier. Refactor bitcast and addrspacecast handling
in castIsValid to be more readable and fix this problem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199821 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 19:21:33 +00:00
Greg Fitzgerald
148c7f286c Fix inline assembly that switches between ARM and Thumb modes
This patch restores the ARM mode if the user's inline assembly
does not.  In the object streamer, it ensures that instructions
following the inline assembly are encoded correctly and that
correct mapping symbols are emitted.  For the asm streamer, it
emits a .arm or .thumb directive.

This patch does not ensure that the inline assembly contains
the ADR instruction to switch modes at runtime.

The problem we need to solve is code like this:

  int foo(int a, int b) {
    int r = a + b;
    asm volatile(
        ".align 2     \n"
        ".arm         \n"
        "add r0,r0,r0 \n"
    : : "r"(r));
    return r+1;
  }

If we compile this function in thumb mode then the inline assembly
will switch to arm mode. We need to make sure that we switch back to
thumb mode after emitting the inline assembly or we will incorrectly
encode the instructions that follow (i.e. the assembly instructions
for return r+1).

Based on patch by David Peixotto

Change-Id: Ib57f6d2d78a22afad5de8693fba6230ff56ba48b

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199818 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 18:32:35 +00:00
David Woodhouse
0ff018e500 [x86] Allow segment and address-size overrides for INS[BWLQ] (PR9385)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199809 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 15:08:55 +00:00
David Woodhouse
af588b9f0e [x86] Allow segment and address-size overrides for OUTS[BWLQ] (PR9385)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199808 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 15:08:49 +00:00
David Woodhouse
51cd16cbd5 [x86] Allow segment and address-size overrides for MOVS[BWLQ] (PR9385)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199807 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 15:08:42 +00:00
David Woodhouse
674140fc3e ]x86] Allow segment and address-size overrides for CMPS[BWLQ] (PR9385)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199806 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 15:08:36 +00:00
David Woodhouse
6abfcfe155 [x86] Allow address-size overrides for SCAS{8,16,32,64} (PR9385)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199805 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 15:08:27 +00:00
David Woodhouse
ccbfd5b18a [x86] Allow address-size overrides for STOS[BWLQ] (PR9385)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199804 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 15:08:21 +00:00
David Woodhouse
db9fa461d7 [x86] Allow segment and address-size overrides for LODS[BWLQ] (PR9385)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199803 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 15:08:08 +00:00
Elena Demikhovsky
c75a44cda7 AVX512: combining setcc and zext is wrong on AVX512
because vector compare instruction puts result in mask register.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199798 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 12:26:19 +00:00
James Molloy
bb96dfcf4f MachineCopyPropagation has special logic for removing COPY instructions. It will remove plain COPYs using eraseFromParent(), but if the COPY has imp-defs/imp-uses it will convert it to a KILL, to keep the imp-def around.
This actually totally breaks and causes the machine verifier to cry in several cases, one of which being:

%RAX<def> = COPY %RCX<kill>
%ECX<def> = COPY %EAX<kill>, %RAX<imp-use,kill>

These subregister copies are together identified as noops, so are both removed. However, the second one as it has an imp-use gets converted into a kill:

%ECX<def> = KILL %EAX<kill>, %RAX<imp-use,kill>

As the original COPY has been removed, the verifier goes into tears at the use of undefined EAX and RAX.

There are several hacky solutions to this hacky problem (which is all to do with imp-use/def weirdnesses), but the least hacky I've come up with is to *always* remove COPYs by converting to KILLs. KILLs are no-ops to the code generator so the generated code doesn't change (which is why they were partially used in the first place), but using them also keeps the def/use and imp-def/imp-use chains alive:

%RAX<def> = KILL %RCX<kill>
%ECX<def> = KILL %EAX<kill>, %RAX<imp-use,kill>

The patch passes all test cases including the ones that check the removal of MOVs in this circumstance, along with an extra test I added to check subregister behaviour (which made the machine verifier fall over before my patch).

The patch also adds some DEBUG() statements because the file hadn't got any.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199797 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 09:12:27 +00:00
Kevin Qin
0af7a7db53 [AArch64 NEON] Try to generate CONCAT_VECTOR when lowering BUILD_VECTOR or SHUFFLE_VECTOR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199791 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 06:11:03 +00:00
Venkatraman Govindaraju
6a0fffd799 [Sparc] Add support for inline assembly constraints which specify registers by their aliases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199786 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 03:18:42 +00:00
Venkatraman Govindaraju
29b1a24a42 [Sparc] Add support for inline assembly constraint 'I'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199781 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 01:29:51 +00:00
Venkatraman Govindaraju
6220c8f960 [Sparc] Do not add PC to _GLOBAL_OFFSET_TABLE_ address to access GOT in absolute code.
Fixes PR#18521


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199775 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 00:13:18 +00:00
Duncan P. N. Exon Smith
dcaab3b681 CodeGen: Stop treating vectors as aggregates
Fix a crash in SjLjEHPrepare::lowerIncomingArguments caused by treating
VectorType like an aggregate.  It's first-class!

<rdar://problem/15854596>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199768 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-21 22:46:46 +00:00
Chandler Carruth
182a02a8dc Tweak the spelling of the asserts requirement a bit more. This makes it
match the (reasonably prevelant) usage in Clang's test suite and so
seems more "canonical".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199767 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-21 22:39:19 +00:00
Andrew Trick
10afb02d48 Fix PR18572 - llc crash during GenericScheduler::initPolicy().
Generalized the heuristic that looks at the (very rough) size of the
register file before enabling regpressure tracking.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199766 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-21 21:27:37 +00:00
David Majnemer
ce5f07f33c Forgot to add testcase for r198590
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199765 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-21 20:39:11 +00:00
Hal Finkel
da3446099d Fix pointer info on PPC byval stores
For PPC64 SVR (and Darwin), the stores that take byval aggregate parameters
from registers into the stack frame had MachinePointerInfo objects with
incorrect offsets. These offsets are relative to the object itself, not to the
stack frame base.

This fixes self hosting on PPC64 when compiling with -enable-aa-sched-mi.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199763 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-21 20:15:58 +00:00
Justin Holewinski
c93ed1707a [NVPTX] Add missing patterns for div.approx with immediate denominator
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199746 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-21 14:40:05 +00:00
Saleem Abdulrasool
d28c094c80 tools: support decoding ARM EHABI opcodes in readobj
Add support to llvm-readobj to decode the actual opcodes.  The ARM EHABI opcodes
are a variable length instruction set that describe the operations required for
properly unwinding stack frames.

The primary motivation for this change is to ease the creation of tests for the
ARM EHABI object emission as well as the unwinding directive handling in the ARM
IAS.

Thanks to Logan Chien for an extra test case!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199708 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-21 02:33:15 +00:00
Saleem Abdulrasool
e502a6aad3 ARM IAS: add support for .unwind_raw directive
This implements the unwind_raw directive for the ARM IAS.  The unwind_raw
directive takes the form of a stack offset value followed by one or more bytes
representing the opcodes to be emitted.  The opcode emitted will interpreted as
if it were assembled by the opcode assembler via the standard unwinding
directives.

Thanks to Logan Chien for an extra test!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199707 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-21 02:33:10 +00:00
Saleem Abdulrasool
27276437ae ARM IAS: support .personalityindex
The .personalityindex directive is equivalent to the .personality directive with
the ARM EABI personality with the specific index (0, 1, 2).  Both of these
directives indicate personality routines, so enhance the personality directive
handling to take into account personalityindex.

Bonus fix: flush the UnwindContext at the beginning of a new function.

Thanks to Logan Chien for additional tests!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199706 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-21 02:33:02 +00:00
Kevin Qin
9fe8c2b527 [AArch64 NEON] Fix a bug caused by undef lane when generating VEXT.
It was commited as r199628 but reverted in r199628 as causing
regression test failed. It's because of old vervsion of patch
I used to commit. Sorry for mistake.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199704 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-21 01:48:52 +00:00
Andrea Di Biagio
825b93b2df [X86] Teach how to combine a vselect into a movss/movsd
Add target specific rules for combining vselect dag nodes into movss/movsd
when possible.

If the vector type of the vselect dag node in input is either MVT::v4i13 or
MVT::v4f32, then try to fold according to rules:

  1) fold (vselect (build_vector (0, -1, -1, -1)), A, B) -> (movss A, B)
  2) fold (vselect (build_vector (-1, 0, 0, 0)), A, B) -> (movss B, A)

If the vector type of the vselect dag node in input is either MVT::v2i64 or
MVT::v2f64 (and we have SSE2), then try to fold according to rules:

  3) fold (vselect (build_vector (0, -1)), A, B) -> (movsd A, B)
  4) fold (vselect (build_vector (-1, 0)), A, B) -> (movsd B, A)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199683 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-20 19:35:22 +00:00
Adrian Prantl
16d00e4b64 Debug info: On ARM ensure that all __TEXT sections come before the
optional DWARF sections, so compiling with -g does not result in
different code being generated for PC-relative loads.

This is reapplying a diet r197922 (__TEXT-only).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199681 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-20 19:15:59 +00:00
Adrian Prantl
6e08a410aa Revert "Debug info: On ARM ensure that the data sections come before the"
Cut back on the cargo cult. The order of __DATA sections doesn't affect
generated code.

This reverts commit r197922.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199680 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-20 19:15:55 +00:00
James Molloy
61a7bb039a Remove the useless pseudo instructions VDUPfdf and VDUPfqf, replacing them with patterns to match VDUPLN.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199675 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-20 17:14:48 +00:00
Hal Finkel
54c251c77f Fix misched-aa-colored.ll to require asserts (trying again)
Perhaps it needs to be in caps.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199661 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-20 14:15:28 +00:00
Hal Finkel
42691ad862 Fix misched-aa-colored.ll to require asserts.
-misched=shuffle is NDEBUG only. Maybe we should change that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199659 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-20 14:09:34 +00:00
Hal Finkel
48f7a2389e Update IR when merging slots in stack coloring
The way that stack coloring updated MMOs when merging stack slots, while
correct, is suboptimal, and is incompatible with the use of AA during
instruction scheduling. The solution, which involves the use of const_cast (and
more importantly, updating the IR from within an MI-level pass), obviously
requires some explanation:

When the stack coloring pass was originally committed, the code in
ScheduleDAGInstrs::buildSchedGraph tracked possible alias sets by using
GetUnderlyingObject, and all load/store and store/store memory control
dependencies where added between SUs at the object level (where only one
object, that returned by GetUnderlyingObject, was used to identify the object
associated with each MMO). When stack coloring merged stack slots, it would
replace MMOs derived from the remapped alloca with the alloca with which the
remapped alloca was being replaced. Because ScheduleDAGInstrs only used single
objects, and tracked alias sets at the object level, this was a fine solution.

In r169744, (Andy and) I updated the code in ScheduleDAGInstrs to use
GetUnderlyingObjects, and track alias sets using, potentially, multiple
underlying objects for each MMO. This was done, primarily, to provide the
ability to look through PHIs, and provide better scheduling for
induction-variable-dependent loads and stores inside loops. At this point, the
MMO-updating code in stack coloring became suboptimal, because it would clear
the MMOs for (i.e. completely pessimize) all instructions for which r169744
might help in scheduling. Updating the IR directly is the simplest fix for this
(and the one with, by far, the least compile-time impact), but others are
possible (we could give each MMO a small vector of potential values, or make
use of a remapping table, constructed from MFI, inside ScheduleDAGInstrs).

Unfortunately, replacing all MMO values derived from the remapped alloca with
the base replacement alloca fundamentally breaks our ability to use AA during
instruction scheduling (which is critical to performance on some targets). The
reason is that the original MMO might have had an offset (either constant or
dynamic) from the base remapped alloca, and that offset is not present in the
updated MMO. One possible way around this would be to use
GetPointerBaseWithConstantOffset, and update not only the MMO's value, but also
its offset based on the original offset. Unfortunately, this solution would
only handle constant offsets, and for safety (because AA is not completely
restricted to deducing relationships with constant offsets), we would need to
clear all MMOs without constant offsets over the entire function. This would be
an even worse pessimization than the current single-object restriction. Any
other solution would involve passing around a vector of remapped allocas, and
teaching AA to use it, introducing additional complexity and overhead into AA.

Instead, when remapping an alloca, we replace all IR uses of that alloca as
well (optionally inserting a bitcast as necessary). This is even more efficient
that the old MMO-updating code in the stack coloring pass (because it removes
the need to call GetUnderlyingObject on all MMO values), removes the
single-object pessimization in the default configuration, and enables the
correct use of AA during instruction scheduling (all without any additional
overhead).

LLVM now no longer miscompiles itself on x86_64 when using -enable-misched
-enable-aa-sched-mi -misched-bottomup=0 -misched-topdown=0 -misched=shuffle!
Fixed PR18497.

Because the alloca replacement is now done at the IR level, unless the MMO
directly refers to the remapped alloca, the change cannot be seen at the MI
level. As a result, there is no good way to fix test/CodeGen/X86/pr14090.ll.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199658 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-20 14:03:16 +00:00
David Woodhouse
9334b07527 [x86] Fix disassembly of MOV16ao16 et al.
The addition of IC_OPSIZE_ADSIZE in r198759 wasn't quite complete. It
also turns out to have been unnecessary. The disassembler handles the
AdSize prefix for itself, and doesn't care about the difference between
(e.g.) MOV8ao8 and MOB8ao8_16 definitions. So just let them coexist and
don't worry about it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199654 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-20 12:02:53 +00:00
David Woodhouse
a3fb0f9773 [x86] Fix 16-bit disassembly of JCXZ/JECXZ
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199653 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-20 12:02:48 +00:00
David Woodhouse
fc19ac9654 [x86] Rename MOVSD/STOSD/LODSD/OUTSD to MOVSL/STOSL/LODSL/OUTSL
The disassembler has a special case for 'L' vs. 'W' in its heuristic for
checking for 32-bit and 16-bit equivalents. We could expand the heuristic,
but better just to be consistent in using the 'L' suffix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199652 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-20 12:02:44 +00:00
David Woodhouse
d1c3f6664e [x86] Fix disassembly of callw instruction
Not quite sure why this was marked isAsmParserOnly, but it means that the
disassembler can't see it either.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199651 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-20 12:02:40 +00:00
David Woodhouse
7360e8caa3 [x86] Fix 16-bit handling of OpSize bit
When disassembling in 16-bit mode the meaning of the OpSize bit is
inverted. Instructions found in the IC_OPSIZE context will actually
*not* have the 0x66 prefix, and instructions in the IC context will
have the 0x66 prefix. Make use of the existing special-case handling
for the 0x66 prefix being in the wrong place, to cope with this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199650 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-20 12:02:35 +00:00
David Woodhouse
70ece0ada7 [x86] Support i386-*-*-code16 triple for emitting 16-bit code
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199648 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-20 12:02:25 +00:00
Chandler Carruth
1d9ab25560 [PM] Wire up the Verifier for the new pass manager and connect it to the
various opt verifier commandline options.

Mostly mechanical wiring of the verifier to the new pass manager.
Exercises one of the more unusual aspects of it -- a pass can be either
a module or function pass interchangably. If this is ever problematic,
we can make things more constrained, but for things like the verifier
where there is an "obvious" applicability at both levels, it seems
convenient.

This is the next-to-last piece of basic functionality left to make the
opt commandline driving of the new pass manager minimally functional for
testing and further development. There is still a lot to be done there
(notably the factoring into .def files to kill the current boilerplate
code) but it is relatively uninteresting. The only interesting bit left
for minimal functionality is supporting the registration of analyses.
I'm planning on doing that on top of the .def file switch mostly because
the boilerplate for the analyses would be significantly worse.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199646 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-20 11:34:08 +00:00
Kai Nacke
843fa74d38 ARM: add tlsldo relocation
Add support for the symbol(tlsldo) relocation. This is required in order to 
solve PR18554.

Reviewed by R. Golin, A. Korobeynikov.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199644 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-20 11:00:40 +00:00
Artyom Skrobov
3767c7446e [ARM] Do not generate Tag_DIV_use=AllowDIVExt when hardware div is non-optional: it should have the default value of AllowDIVIfExists
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199638 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-20 10:18:42 +00:00
Chandler Carruth
ce30a8106d Revert r199628: "[AArch64 NEON] Fix a bug caused by undef lane when generating VEXT."
This test fails the newly added regression tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199631 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-20 08:18:01 +00:00
Owen Anderson
1e1446bf84 Fix all the remaining lost-fast-math-flags bugs I've been able to find. The most important of these are cases in the generic logic for combining BinaryOperators.
This logic hadn't been updated to handle FastMathFlags, and it took me a while to detect it because it doesn't show up in a simple search for CreateFAdd.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199629 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-20 07:44:53 +00:00
Kevin Qin
f55ec9ac18 [AArch64 NEON] Fix a bug caused by undef lane when generating VEXT.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199628 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-20 07:32:26 +00:00
Kevin Qin
7582d8d76f [AArch64 NEON] Accept both #0.0 and #0 for comparing with floating point zero in asm parser.
For FCMEQ, FCMGE, FCMGT, FCMLE and FCMLT, floating point zero will be
printed as #0.0 instead of #0. To support the history codes using #0,
we consider to let asm parser accept both #0.0 and #0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199621 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-20 02:14:05 +00:00
Benjamin Kramer
b45edea9b3 InstCombine: Modernize a bunch of cast combines.
Also make them vector-aware.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199608 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-19 20:05:13 +00:00
Benjamin Kramer
c7645e860a InstCombine: Replace a hand-rolled version of isKnownToBeAPowerOfTwo with the real thing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199604 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-19 16:48:41 +00:00
Benjamin Kramer
0487faa97b InstCombine: Teach most integer add/sub/mul/div combines how to deal with vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199602 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-19 15:24:22 +00:00
Benjamin Kramer
3f6a9d705a InstCombine: Refactor fmul/fdiv combines to handle vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199598 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-19 13:36:27 +00:00
Chandler Carruth
e1a5243053 Fix a really nasty SROA bug with how we handled out-of-bounds memcpy
intrinsics.

Reported on the list by Evan with a couple of attempts to fix, but it
took a while to dig down to the root cause. There are two overlapping
bugs here, both centering around the circumstance of discovering
a memcpy operand which is known to be completely outside the bounds of
the alloca.

First, we need to kill the *other* side of the memcpy if it was added to
this alloca. Otherwise we'll factor it into our slicing and try to
rewrite it even though we know for a fact that it is dead. This is made
more tricky because we can visit the sides in either order. So we have
to both kill the other side and skip instructions marked as dead. The
latter really should be goodness in every case, but here is a matter of
correctness.

Second, we need to actually remove the *uses* of the alloca by the
memcpy when queuing it for later deletion. Otherwise it may still be
using the alloca when we go to promote it (if the rewrite re-uses the
existing alloca instruction). Do this by factoring out the
use-clobbering used when for nixing a Phi argument and re-using it
across the operands of a to-be-deleted instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199590 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-19 12:16:54 +00:00
Saleem Abdulrasool
d0fb7e49cc ARM ELF: ensure that the tag types are corrected
Ensure that the tag types are reflected on a replacement.  This is particularly
important for the compatibility tag which has multiple representations where the
last definition wins.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199577 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-19 08:25:41 +00:00
Saleem Abdulrasool
70c092f3ec ARM: update build attributes for ABI r2.09
Update names for the names as per the current ABI errata.  Mark deprecated tags
as such.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199576 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-19 08:25:35 +00:00
Arnold Schwaighofer
2becaaf3a1 LoopVectorizer: A reduction that has multiple uses of the reduction value is not
a reduction.

Really. Under certain circumstances (the use list of an instruction has to be
set up right - hence the extra pass in the test case) we would not recognize
when a value in a potential reduction cycle was used multiple times by the
reduction cycle.

Fixes PR18526.
radar://15851149

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199570 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-19 03:18:31 +00:00
Chandler Carruth
e608d695de [PM] Make the verifier work independently of any pass manager.
This makes the 'verifyFunction' and 'verifyModule' functions totally
independent operations on the LLVM IR. It also cleans up their API a bit
by lifting the abort behavior into their clients and just using an
optional raw_ostream parameter to control printing.

The implementation of the verifier is now just an InstVisitor with no
multiple inheritance. It also is significantly more const-correct, and
hides the const violations internally. The two layers that force us to
break const correctness are building a DomTree and dispatching through
the InstVisitor.

A new VerifierPass is used to implement the legacy pass manager
interface in terms of the other pieces.

The error messages produced may be slightly different now, and we may
have slightly different short circuiting behavior with different usage
models of the verifier, but generally everything works equivalently and
this unblocks wiring the verifier up to the new pass manager.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199569 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-19 02:22:18 +00:00
Nick Lewycky
6d2bd95ff1 Don't refuse to transform constexpr(call(arg, ...)) to call(constexpr(arg), ...)) just because the function has multiple return values even if their return types are the same. Patch by Eduard Burtescu!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199564 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-18 22:47:12 +00:00
Benjamin Kramer
c975958498 ARM: Let the assembler reject v5 instructions in v4 mode.
PR18524.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199559 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-18 19:03:19 +00:00
NAKAMURA Takumi
ee760949ab [CMake] Add llvm-tblgen to dependencies of check-llvm.
llvm-tblgen is not built when external LLVM_TABLEGEN is specified.
Even then, llvm-tblgen should be built for testing tblgen itself.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199558 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-18 19:01:08 +00:00
Benjamin Kramer
8e937c39bb InstCombine: Make the (fmul X, -1.0) -> (fsub -0.0, X) transform handle vectors too.
PR18532.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199553 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-18 16:43:14 +00:00
Adrian Prantl
f9af5bdd63 Debug info (LTO): Move the creation of accessibility flags to
getOrCreateSubprogramDIE to avoid attributes being added twice when DIEs
are merged.

rdar://problem/15842330.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199536 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-18 02:12:00 +00:00
Owen Anderson
774cec5748 Fix more instances of dropped fast math flags when optimizing FADD instructions. All found by inspection (aka grep).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199528 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-18 00:48:14 +00:00
Reid Kleckner
3cbfa1617f Add an inalloca flag to allocas
Summary:
The only current use of this flag is to mark the alloca as dynamic, even
if its in the entry block.  The stack adjustment for the alloca can
never be folded into the prologue because the call may clear it and it
has to be allocated at the top of the stack.

Reviewers: majnemer

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D2571

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199525 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-17 23:58:17 +00:00
Rui Ueyama
9106d365f5 llvm-objdump/COFF: Print ordinal base number.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199518 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-17 22:02:24 +00:00
Juergen Ributzka
ceaf829339 Add two new calling conventions for runtime calls
This patch adds two new target-independent calling conventions for runtime
calls - PreserveMost and PreserveAll.
The target-specific implementation for X86-64 is defined as following:
  - Arguments are passed as for the default C calling convention
  - The same applies for the return value(s)
  - PreserveMost preserves all GPRs - except R11
  - PreserveAll preserves all GPRs and all XMMs/YMMs - except R11

Reviewed by Lang and Philip

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199508 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-17 19:47:03 +00:00
Daniel Sanders
41e1c04201 [mips][msa] Correct pattern for LSA
Summary:
$rs and $rt were the wrong way round in the .td and the testcase wasn't
strict enough to detect the mistake.

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D2554

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199498 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-17 15:40:05 +00:00
Renato Golin
3e2346341c Add MLA alias for ARMv4 support.
Fix MLA defs to use register class GPRnopc.
Add encoding tests for multiply instructions.
(Alias for MUL/SMLAL/UMLAL added by r199026.)

Patch by Zhaoshi.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199491 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-17 13:53:08 +00:00
Kostya Serebryany
bd2c711cdd [asan] extend asan-coverage (still experimental).
- add a mode for collecting per-block coverage (-asan-coverage=2).
   So far the implementation is naive (all blocks are instrumented),
   the performance overhead on top of asan could be as high as 30%.
 - Make sure the one-time calls to __sanitizer_cov are moved to function buttom,
   which in turn required to copy the original debug info into the call insn.

Here is the performance data on SPEC 2006
(train data, comparing asan with asan-coverage={0,1,2}):

                             asan+cov0     asan+cov1      diff 0-1    asan+cov2       diff 0-2      diff 1-2
       400.perlbench,        65.60,        65.80,         1.00,        76.20,         1.16,         1.16
           401.bzip2,        65.10,        65.50,         1.01,        75.90,         1.17,         1.16
             403.gcc,         1.64,         1.69,         1.03,         2.04,         1.24,         1.21
             429.mcf,        21.90,        22.60,         1.03,        23.20,         1.06,         1.03
           445.gobmk,       166.00,       169.00,         1.02,       205.00,         1.23,         1.21
           456.hmmer,        88.30,        87.90,         1.00,        91.00,         1.03,         1.04
           458.sjeng,       210.00,       222.00,         1.06,       258.00,         1.23,         1.16
      462.libquantum,         1.73,         1.75,         1.01,         2.11,         1.22,         1.21
         464.h264ref,       147.00,       152.00,         1.03,       160.00,         1.09,         1.05
         471.omnetpp,       115.00,       116.00,         1.01,       140.00,         1.22,         1.21
           473.astar,       133.00,       131.00,         0.98,       142.00,         1.07,         1.08
       483.xalancbmk,       118.00,       120.00,         1.02,       154.00,         1.31,         1.28
            433.milc,        19.80,        20.00,         1.01,        20.10,         1.02,         1.01
            444.namd,        16.20,        16.20,         1.00,        17.60,         1.09,         1.09
          447.dealII,        41.80,        42.20,         1.01,        43.50,         1.04,         1.03
          450.soplex,         7.51,         7.82,         1.04,         8.25,         1.10,         1.05
          453.povray,        14.00,        14.40,         1.03,        15.80,         1.13,         1.10
             470.lbm,        33.30,        34.10,         1.02,        34.10,         1.02,         1.00
         482.sphinx3,        12.40,        12.30,         0.99,        13.00,         1.05,         1.06


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199488 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-17 11:00:30 +00:00
Kevin Qin
b9536ac581 [AArch64 NEON] Expand vector for UDIV/SDIV/UREM/SREM/FREM as neon doesn't support these operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199485 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-17 09:54:30 +00:00
Craig Topper
50a2b1672d Teach x86 asm parser to handle 'opaque ptr' in Intel syntax.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199477 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-17 07:44:10 +00:00
Craig Topper
9d0b786f72 Teach X86 asm parser to understand 'ZMMWORD PTR' in Intel syntax.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199476 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-17 07:37:39 +00:00
Craig Topper
5d59bb44ee Fix intel syntax for 64-bit version of FXSAVE/FXRSTOR to use '64' suffix instead of 'q'
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199474 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-17 07:25:39 +00:00
Hao Liu
84887ceca3 [AArch64]Fix the problem can't select f16_to_f32 and f32_to_f16.
Also add copy support for FPR16.
Also add a missing test case file belongs to commit r197361.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199463 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-17 06:23:30 +00:00
Kevin Qin
16511208f2 [AArch64 NEON] Custom lower conversion between vector integer and vector floating point if element bit-width doesn't match.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199462 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-17 05:52:35 +00:00
Hao Liu
555f57f67b [AArch64]Fix the problem can't select concat_vectors of two v1i32 types.
Also fix the problem can't select scalar_to_vector from f32 to v2f32/v4f32.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199461 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-17 05:44:46 +00:00
Reid Kleckner
ad60d3c304 Change inalloca rules to make it only apply to the last parameter
This makes things a lot easier, because we can now talk about the
"argument allocation", which allocates all the memory for the call in
one shot.

The only functional change is to the verifier for a feature that hasn't
shipped yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199434 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-16 22:59:24 +00:00