Commit Graph

24917 Commits

Author SHA1 Message Date
Justin Holewinski
3c81367a5d [NVPTX] Error out if initializer is given for variable in an address space that does not support initialization
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211943 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:36:01 +00:00
Justin Holewinski
0ded57ccc5 [NVPTX] Add support for .managed variables for UVM
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211942 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:58 +00:00
Justin Holewinski
2a8dc35cca [NVPTX] Emit .weak linkage for link_once, weak, available_externally, and common linkage
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211941 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:56 +00:00
Justin Holewinski
cb8f98382b [NVPTX] Fix handling of ldg/ldu intrinsics.
The address space of the pointer must be global (1) for these intrinsics.  There must also be alignment metadata attached to the intrinsic calls, e.g.

%val = tail call i32 @llvm.nvvm.ldu.i.global.i32.p1i32(i32 addrspace(1)* %ptr), !align !0

!0 = metadata !{i32 4}

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211939 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:51 +00:00
Justin Holewinski
8992274412 [NVPTX] Clean up argument lowering code and properly handle alignment for structs and vectors
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211938 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:44 +00:00
Justin Holewinski
863b0d45a5 [NVPTX] Add support for [SHL,SRA,SRL]_PARTS
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211936 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:40 +00:00
Justin Holewinski
10da1651ed [NVPTX] Implement fma and imad contraction as target DAGCombiner patterns
This also introduces DAGCombiner patterns for mul.wide to multiply two smaller integers and produce a larger integer

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211935 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:37 +00:00
Justin Holewinski
508c80f11f [NVPTX] Add support for efficient rotate instructions on SM 3.2+
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211934 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:33 +00:00
Justin Holewinski
1f75f4a0ee [NVPTX] Add missing isel patterns for 64-bit atomics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211933 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:30 +00:00
Justin Holewinski
ef92cf50d6 [NVPTX] Add isel patterns for bit-field extract (bfe)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211932 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:27 +00:00
Justin Holewinski
de7bbdff33 [NVPTX] Add support for isspacep instruction
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211931 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:24 +00:00
Justin Holewinski
1571d272c8 [NVPTX] Add support for envreg reads
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211930 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:21 +00:00
Justin Holewinski
a54609ed93 [NVPTX] Emit .weak when linkage is not external, internal, or private
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211926 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:10 +00:00
Chandler Carruth
75504d45ec [x86] Fix a miscompile in the new shuffle lowering uncovered by
a bootstrap.

I managed to mis-remember how PACKUS worked on x86, and was using undef
for the high bytes instead of zero. The fix is fairly obvious.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211922 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:25:23 +00:00
David Majnemer
c8a1169c93 IR: Add COMDATs to the IR
This new IR facility allows us to represent the object-file semantic of
a COMDAT group.

COMDATs allow us to tie together sections and make the inclusion of one
dependent on another. This is required to implement features like MS
ABI VFTables and optimizing away certain kinds of initialization in C++.

This functionality is only representable in COFF and ELF, Mach-O has no
similar mechanism.

Differential Revision: http://reviews.llvm.org/D4178

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211920 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:19:56 +00:00
David Blaikie
e70cdf9468 Fix test so it doesn't try to write out temporary files into the test tree.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211916 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 17:45:43 +00:00
David Majnemer
d7d427732e MC: Fix associative sections on COFF
COFF sections in MC were represented by a tuple of section-name and
COMDAT-name.  This is not sufficient to represent a .text section
associated with another .text section; we need a way to distinguish
between the key section and the one marked associative.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211913 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 17:19:44 +00:00
Matt Arsenault
ee5d4a7b73 R600: Don't crash on unhandled instruction in promote alloca
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211906 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 16:52:49 +00:00
Ulrich Weigand
1edaab996f [PowerPC] Constrain base register in PPCRegisterInfo::resolveFrameIndex
I've run into a bug where current LLVM at -O0 (with fast-isel)
generated invalid code like:

        ld 0, 20936(1)                  # 8-byte Folded Reload
        stw 12, 10348(0)
        stw 12, 10344(0)

The underlying vreg had been introduced as base register by the
Local Stack Slot Allocation pass.  That register was constrained
to G8RC by PPCRegisterInfo::materializeFrameBaseRegister to match
the ADDI instruction used to set it, but it was *not* constrained
to G8RC_NOX0 to fit the *use* of the register in an address.

That should have happened in PPCRegisterInfo::resolveFrameIndex.
This patch adds an appropriate constrainRegClass call.

Reviewed by Hal Finkel.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211897 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 13:04:12 +00:00
Chandler Carruth
c5114dbcc3 [x86] Teach the target combine step to aggressively fold pshufd insturcions.
Summary:
This allows it to fold pshufd instructions across intervening
half-shuffles and other noise. This pattern actually shows up in the
generic lowering tests, but I've also added direct tests using
intrinsics to make sure that the specific desired functionality is
working even if the lowering stuff changes in the future.

Differential Revision: http://reviews.llvm.org/D4292

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211892 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 11:40:13 +00:00
Simon Atanasyan
fc9897f66f [ELF][Mips] Fix recognition of MIPS 64-bit arch in the ELFObjectFile:getArch() method.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211891 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 11:36:45 +00:00
Chandler Carruth
4363b0729b [x86] Teach the target-specific combining how to aggressively fold
half-shuffles, even looking through intervening instructions in a chain.

Summary:
This doesn't happen to show up with any test cases I've found for the current
shuffle lowering, but previous attempts would benefit from this and it seems
generally useful. I've tested it directly using intrinsics, which also shows
that it will work with hand vectorized code as well.

Note that even though pshufd isn't directly used in these tests, it gets
exercised because we combine some of the half shuffles into a pshufd
first, and then merge them.

Differential Revision: http://reviews.llvm.org/D4291

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211890 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 11:34:40 +00:00
Chandler Carruth
f91161874e [x86] Teach the X86 backend to DAG-combine SSE2 shuffles that are
trivially redundant.

This fixes several cases in the new vector shuffle lowering algorithm
which would generate redundant shuffle instructions for the sake of
simplicity.

I'm also deleting a testcase which was somewhat ridiculous. It was
checking for a bug in 2007 about incorrectly transforming shuffles by
looking for the string "-86" in the output of a pretty substantial
function. This test case doesn't seem to have any value at this point.

Differential Revision: http://reviews.llvm.org/D4240

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211889 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 11:27:52 +00:00
Chandler Carruth
050d187bc8 [x86] Begin a significant overhaul of how vector lowering is done in the
x86 backend.

This sketches out a new code path for vector lowering, hidden behind an
off-by-default flag while it is under development. The fundamental idea
behind the new code path is to aggressively break down the problem space
in ways that ease selecting the odd set of instructions available on
x86, and carefully avoid scalarizing code even when forced to use older
ISAs. Notably, this starts off restricting itself to SSE2 and implements
the complete vector shuffle and blend space for 128-bit vectors in SSE2
without scalarizing. The plan is to layer on top of this ISA extensions
where we can bail out of the complex SSE2 lowering and opt for
a cheaper, specialized instruction (or set of instructions). It also
needs to be generalized to AVX and AVX512 vector widths.

Currently, this does a decent but not perfect job for SSE2. There are
some specific shortcomings that I plan to address:
- We need a peephole combine to fold together shuffles where possible.
  There are cases where a previous shuffle could be modified slightly to
  arrange for elements to be in the correct position and a later shuffle
  eliminated. Doing this eagerly added quite a bit of complexity, and
  so my plan is to combine away these redundancies afterward.
- There are a lot more clever ways to use unpck and pack that need to be
  added. This is essential for real world shuffles as it turns out...

Once SSE2 is polished a bit I should be able to get interesting numbers
on performance improvements on benchmarks conducive to vectorization.
All of this will be off by default until it is functionally equivalent
of course.

Differential Revision: http://reviews.llvm.org/D4225

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211888 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 11:23:44 +00:00
Dinesh Dwivedi
22e371c74e Added instruction combine to transform few more negative values addition to subtraction (Part 3)
This patch enables transforms for

(x + (~(y | c) + 1) --> x - (y | c) if c is odd

Differential Revision: http://reviews.llvm.org/D4210



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211881 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 07:47:35 +00:00
David Majnemer
d023a353c1 GlobalOpt: Fix constantfold-initializers.ll test
The test added in r211762 was sloppy, the correct initializer wasn't
added to @llvm.global_ctors

Spotted by Pasi Parviainen!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211879 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 07:36:26 +00:00
David Blaikie
0263debd04 Revert "Revert "Revert "PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location."""
Reverting this again, didn't mean to commit it - while r211872 fixes one
of the issues here, there are still others to figure out and address.

This reverts commit r211871.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211873 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 05:34:05 +00:00
David Blaikie
effea626e2 ArgumentPromotion: Propagate debug locations on calls for which arguments are promoted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211872 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 05:32:09 +00:00
David Blaikie
150f5b8920 Revert "Revert "PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location.""
This reverts commit r211724.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211871 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 05:31:49 +00:00
Andrew Trick
e8f8db1c5a MachineScheduler: add some book-keeping to fix an assert.
Fixe for Bug 20057 - Assertion failied in llvm::SUnit* llvm::SchedBoundary::pickOnlyChoice(): Assertion `i <= (HazardRec->getMaxLookAhead() + MaxObservedStall) && "permanent hazard"'

Thanks to Chad for the test case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211865 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 04:57:05 +00:00
Matt Arsenault
88a3c72e25 R600: Add some testcases for promote alloca pass.
More complicated GEPs are skipped. Add some tests to
actually stress this skipping.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211859 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 03:55:55 +00:00
Adam Nemet
a50d4dd9b0 [X86] AVX512: Add vbroadcasti*
For now I used a separate template for these sub-vector/tuple broadcasts
rather than sharing the mem variants with avx512_int_broadcast_rm.

<rdar://problem/17402869>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211828 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 00:43:38 +00:00
Juergen Ributzka
b1b6d10d09 [StackMaps] Enable patchpoint liveness analysis per default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211817 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-26 23:39:52 +00:00
Juergen Ributzka
307a6447e5 [Stackmaps] Remove the liveness calculation for stackmap intrinsics.
There is no need to calculate the liveness information for stackmaps. The
liveness information is still available for the patchpoint intrinsic and
that is also the intended usage model.

Related to <rdar://problem/17473725>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211816 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-26 23:39:44 +00:00
Arnold Schwaighofer
c2d93c4048 GVN: Preserve invariant.load metadata
If both instructions to be replaced are marked invariant the resulting
instruction is invariant.

rdar://13358910

Fix by Erik Eckstein!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211801 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-26 19:51:19 +00:00
Matt Arsenault
3cd8cf6bbd R600/SI: Add FP mode bits to binary.
The default rounding mode to initialize the mode register needs
to be reported to the runtime. Fill in other bits a kernel
may be interested in setting for future use.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211791 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-26 17:22:30 +00:00
Renato Golin
7eacf4e813 Added parsing co-processor names starting with "cr"
Additional compliant GAS names for coprocessor register name
are enabled for all instruction with parameter MCK_CoprocReg:
LDC,LDC2,STC,STC2,CDP,CDP2,MCR,MCR2,MCRR,MCRR2,MRC,MRC2,MRRC,MRRC2

Patch by Andrey Kuharev.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211776 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-26 13:10:53 +00:00
Andrea Di Biagio
817812f61c [X86] Improve the selection of SSE3/AVX addsub instructions.
This patch teaches the backend how to canonicalize a shuffle vectors
according to the rule:

 - (shuffle (FADD A, B), (FSUB A, B), Mask) ->
       (shuffle (FSUB A, -B), (FADD A, -B), Mask)

Where 'Mask' is:
  <0,5,2,7>            ;; for v4f32 and v4f64 shuffles.
  <0,3>                ;; for v2f64 shuffles.
  <0,9,2,11,4,13,6,15> ;; for v8f32 shuffles.

In general, ISel only knows how to pattern-match a canonical
'fadd + fsub + blendi' dag node sequence into an ADDSUB instruction.

This new rule allows to convert a non-canonical dag sequence into a
canonical one that will be matched by a single ADDSUB at ISel stage.

The idea of converting a non-canonical ADDSUB into a canonical one by
swapping the first two operands of the shuffle, and then negating the
second operand of the FADD and FSUB, was originally proposed by Hal Finkel.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211771 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-26 10:45:21 +00:00
Dinesh Dwivedi
c2b11baf5f This patch removed duplicate code for matching patterns
which are now handled in SimplifyUsingDistributiveLaws() 
(after r211261)

Differential Revision: http://reviews.llvm.org/D4253



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211768 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-26 08:57:33 +00:00
Dinesh Dwivedi
0bf7c06b63 Added instruction combine to transform few more negative values addition to subtraction (Part 2)
This patch enables transforms for

(x + (~(y | c) + 1)   -->   x - (y | c) if c is even

Differential Revision: http://reviews.llvm.org/D4209



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211765 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-26 05:40:22 +00:00
David Majnemer
29640bcb7d GlobalOpt: Don't optimize thread_local for initializers
Folding a reference to a thread_local variable into another global
variable's initializer is very problematic, there is no relocation that
exists to represent such an access.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211762 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-26 03:02:19 +00:00
Matt Arsenault
b0f5a0e7e7 R600: Fix vector FMA
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211757 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-26 01:28:05 +00:00
Hans Wennborg
0545f16700 Don't build switch tables for dllimport and TLS variables in GEPs
This is a follow-up to r211331, which failed to notice that we were
returning early from ValidLookupTableConstant for GEPs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211753 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-26 00:30:52 +00:00
Adam Nemet
f93fe90504 [X86] AVX512: Fix asm syntax for packed vcmp
The *_alt defs for vcmp are used by the InstParser (the asm string in the main
def is used by the InstPrinter) .  The former was accepting vector registers
as destination rather than mask registers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211750 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-26 00:21:12 +00:00
Juergen Ributzka
d01f1c4054 [FastISel][X86] Only fold the cmp into the select when both instructions are in the same basic block.
If the cmp is in a different basic block, then it is possible that not all
operands of that compare have defined registers. This can happen when one of
the operands to the cmp is a load and the load gets folded into the cmp. In
this case FastISel will skip the load instruction and the vreg is never
defined.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211730 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 20:06:12 +00:00
David Blaikie
276ef73f4a Revert "PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location."
This reverts commit r211723.

Breaks the ASan/compiler-rt build... guess I didn't test very far at all
:/.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211724 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 18:20:54 +00:00
David Blaikie
c39f36df61 PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location.
This situation does bad things when inlined, so I've fixed Clang not to
produce inlinable call sites without locations when the caller has debug
info (in the one case where I could find that this occurred). This
updates the PR20038 test case to be what clang now produces, and readds
the assertion that had to be removed due to this bug.

I've also beefed up the debug info verifier to help diagnose these
issues in the future, and I hope to add checks to the inliner to just
assert-fail if it encounters this situation. If, in the future, we
decide we have to cope with this situation, the right thing to do is
probably to just remove all the DebugLocs from the inlined instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211723 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 18:03:10 +00:00
Tyler Nowicki
d5a8fa72bb Add Rpass-missed and Rpass-analysis reports to the loop vectorizer. The remarks give the vector width of vectorized loops and a brief analysis of loops that fail to be vectorized. For example, an analysis will be generated for loops containing control flow that cannot be simplified to a select. The optimization remarks also give the debug location of expressions that cannot be vectorized, for example the location of an unvectorizable call.
Reviewed by: Arnold Schwaighofer


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211721 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 17:50:15 +00:00
Andrea Di Biagio
cae1ea691d [X86] Always prefer to lower a VECTOR_SHUFFLE into a BLENDI instead of SHUFP (or VPERM2X128).
This patch teaches method 'LowerVECTOR_SHUFFLE' to give higher precedence to
the check for 'isBlendMask'; the idea is that, when possible, we should firstly
check if a shuffle performs a blend, and in case, try to lower it into a BLENDI
instead of selecting a SHUFP or (worse) a VPERM2X128.

In general:
 - AVX VBLENDPS/D always have better latency and throughput than VPERM2F128;
 - BLENDPS/D instructions tend to always have better 'reciprocal throughput'
   than the equivalent SHUFPS/D;
 - Both BLENDPS/D and SHUFPS/D are often decoded into the same number of
   m-ops; however, a m-op obtained from a BLENDPS/D can be scheduled to more
   than one execution port.

This patch:
 - Moves the check for 'isBlendMask' immediately before the check for
   'isSHUFPMask' within method 'LowerVECTOR_SHUFFLE';
 - Updates existing tests for sse/avx shuffle/blend instructions to verify
   that we select (v)blendps/d when possible (instead of (v)shufps/d or
   vperm2f128).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211720 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 17:41:58 +00:00
Eli Bendersky
1ca9d7610d Add some test files for r211710.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211711 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 15:41:39 +00:00
Eli Bendersky
bb167336b3 Rename loop unrolling and loop vectorizer metadata to have a common prefix.
[LLVM part]

These patches rename the loop unrolling and loop vectorizer metadata
such that they have a common 'llvm.loop.' prefix.  Metadata name
changes:

llvm.vectorizer.* => llvm.loop.vectorizer.*
llvm.loopunroll.* => llvm.loop.unroll.*

This was a suggestion from an earlier review
(http://reviews.llvm.org/D4090) which added the loop unrolling
metadata. 

Patch by Mark Heffernan.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211710 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 15:41:00 +00:00
Evgeniy Stepanov
6ce4a9f175 [msan] Fix bad interaction between with-calls mode and chained origin tracking.
Origin history should only be recorded for uninitialized values, because it is
meaningless otherwise. This change moves __msan_chain_origin to the runtime
library side and makes it conditional on the corresponding shadow value.

Previous code was correct, but _very_ inefficient.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211700 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 14:41:57 +00:00
Chandler Carruth
2edf5e45ec [x86] Add intrinsics for the pshufd, pshuflw, and pshufhw instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211694 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 13:12:54 +00:00
NAKAMURA Takumi
b720a3d15c Re-apply r211399, "Generate native unwind info on Win64" with a fix to ignore SEH pseudo ops in X86 JIT emitter.
--
This patch enables LLVM to emit Win64-native unwind info rather than
DWARF CFI.  It handles all corner cases (I hope), including stack
realignment.

Because the unwind info is not flexible enough to describe stack frames
with a gap of unknown size in the middle, such as the one caused by
stack realignment, I modified register spilling code to place all spills
into the fixed frame slots, so that they can be accessed relative to the
frame pointer.

Patch by Vadim Chugunov!

Reviewed By: rnk

Differential Revision: http://reviews.llvm.org/D4081

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211691 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 12:41:52 +00:00
Andrea Di Biagio
3e5582cc15 [X86] Add target combine rule to select ADDSUB instructions from a build_vector
This patch teaches the backend how to combine a build_vector that implements
an 'addsub' between packed float vectors into a sequence of vector add
and vector sub followed by a VSELECT.

The new VSELECT is expected to be lowered into a BLENDI.
At ISel stage, the sequence 'vector add + vector sub + BLENDI' is
pattern-matched against ISel patterns added at r211427 to select
'addsub' instructions.
Added three more ISel patterns for ADDSUB.

Added test sse3-avx-addsub-2.ll to verify that we correctly emit 'addsub'
instructions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211679 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 10:02:21 +00:00
Evgeniy Stepanov
98726c311b [LICM] Don't create more than one copy of an instruction per loop exit block when sinking.
Fixes exponential compilation complexity in PR19835, caused by
LICM::sink not handling the following pattern well:

f = op g
e = op f, g
d = op e
c = op d, e
b = op c
a = op b, c

When an instruction with N uses is sunk, each of its operands gets N
new uses (all of them - phi nodes). In the example above, if a had 1
use, c would have 2, e would have 4, and g would have 8.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211673 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 07:54:58 +00:00
Rafael Espindola
aa2e057bc1 Fix another asserting method in the null streamer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211668 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 05:37:58 +00:00
Rafael Espindola
5178d868c8 Fix a regression from r211653.
The method was empty in the null streamer but I mistakenly replaced it with
the aborting one in MCStreamer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211666 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 05:31:22 +00:00
NAKAMURA Takumi
9a18ab013f CodeGen/X86/pr20088.ll: Add -march=x86-64, or llc fails due to non-x86 default target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211659 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-25 03:05:47 +00:00
Juergen Ributzka
35a6a81407 [FastISel][X86] Fold XALU condition into branch and compare.
Optimize the codegen of select and branch instructions to directly use the
EFLAGS from the {s|u}{add|sub|mul}.with.overflow intrinsics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211645 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 23:51:21 +00:00
Tom Stellard
78d1e95201 R600: Promote i64 stores to v2i32
Now we need only one 64-bit pattern for stores.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211643 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 23:33:04 +00:00
NAKAMURA Takumi
e572ec68d2 ldr-pseudo-obj-errors.s: Fix silly copypasto.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211642 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 23:18:07 +00:00
NAKAMURA Takumi
d14e98ad5b llvm/test/MC/AArch64/ldr-pseudo-obj-errors.s: Add -triple=aarch64-linux. AArch64 is unaware of PECOFF for now.
FIXME: This should pass for also targeting aarch64-darwin.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211640 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 23:11:42 +00:00
Rafael Espindola
4186005edc Print a=b as an assignment.
In assembly the expression a=b is parsed as an assignment, so it should be
printed as one.

This remove a truly horrible hack for producing a label with "a=.". It would
be used by codegen but would never be reached by the asm parser. Sorry I
missed this when it was first committed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211639 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 22:45:16 +00:00
Matt Arsenault
95eb45c5d9 R600: Fix inconsistency in rsq instructions.
R600 was using a clamped version of rsq, but SI was not. Add a
new rsq_clamped intrinsic and use them consistently.

It's unclear to me from the documentation what behavior
the R600 instructions have, so I assume they have the legacy behavior
described by the SI documents. For R600, use RECIPSQRT_IEEE
for both llvm.AMDGPU.rsq.legacy and llvm.AMDGPU.rsq. R600 also
has RECIPSQRT_FF, which I'm not sure how it fits in here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211637 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 22:13:39 +00:00
David Blaikie
639c71bafb Fix up scoping in a few tests (and delete one that validates unnecessary behavior).
Most of this is just tests that were silently succeeding in spite of
schema changes I made over a year ago. Cleaning them up as they lead to
failures in a change I'm working on/will come soon.

test/DebugInfo/2010-01-19-DbgScope.ll was removed as it tested miscoping
where a DebugLoc described a location not in the current function. The
test case doesn't describe why this is a valid situation and should be
supported, so I'm removing it and shortly going to commit changes that
make this firmly unsupported/assert-fail.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211628 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 20:10:27 +00:00
Bill Schmidt
808d878a96 [PPC64] Fix PR20071 (fctiduz generated for targets lacking that instruction)
PR20071 identifies a problem in PowerPC's fast-isel implementation for
floating-point conversion to integer.  The fctiduz instruction was added in
Power ISA 2.06 (i.e., Power7 and later).  However, this instruction is being
generated regardless of which 64-bit PowerPC target is selected.

The intent is for fast-isel to punt to DAG selection when this instruction is
not available.  This patch implements that change.  For testing purposes, the
existing fast-isel-conversion.ll test adds a RUN line for -mcpu=970 and tests
for the expected code generation.  Additionally, the existing test
fast-isel-conversion-p5.ll was found to be incorrectly expecting the
unavailable instruction to be generated.  I've removed these test variants
since we have adequate coverage in fast-isel-conversion.ll.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211627 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 20:05:18 +00:00
Robert Khasanov
031ad1b930 vpblend intrinsics combines as shifts intrinsics due to absence return stmt between them
Fix PR20088

Differential Revision: http://reviews.llvm.org/D4277


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211617 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 18:08:04 +00:00
Weiming Zhao
1357f0e1a7 Fix test case in r211605/r211533
The test case in
"Fix PR20056: Implement pseudo LDR <reg>, =<literal/label> for AArch64" should
only work with Linux.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211613 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 17:05:43 +00:00
Diego Novillo
10ec44d87a Add new debug kind LocTrackingOnly.
Summary:
This new debug emission kind supports emitting line location
information in all instructions, but stops code generation
from emitting debug info to the final output.

This mode is useful when the backend wants to track source
locations during code generation, but it does not want to
produce debug info. This is currently used by optimization
remarks (-pass-remarks, -pass-remarks-missed and
-pass-remarks-analysis).

To prevent debug info emission, DIBuilder never inserts the
annotation 'llvm.dbg.cu' when LocTrackingOnly is enabled.

Reviewers: echristo, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D4234

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211609 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 17:02:03 +00:00
Weiming Zhao
c33b4883b3 Resubmit commit r211533
"Fix PR20056: Implement pseudo LDR <reg>, =<literal/label> for AArch64"
Missed files are added in this commit.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211605 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 16:21:38 +00:00
Christian Pirker
01c8340c3d ARM: Fix TPsoft for Thumb mode
Reviewed at http://reviews.llvm.org/D4230



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211601 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 15:45:59 +00:00
Daniel Sanders
90be077d09 [mips] Added support for assembling sdbbp.
Summary:
This instruction is re-encoded in MIPS32r6/MIPS64r6 without changing the
restrictions. We hadn't implemented it for earlier ISA's so it has been added to those too.

Differential Revision: http://reviews.llvm.org/D4265


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211590 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 13:00:32 +00:00
Benjamin Kramer
0e6156a1a2 InstCombine: Disable umul.with.overflow recognition for vectors.
It doesn't make a lot on most targets and the code isn't ready for it. PR20113.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211583 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 10:47:52 +00:00
Benjamin Kramer
9c88403625 InstCombine: Don't try to reorder shuffles where the mask is a ConstantExpr.
We can't analyze the individual values of a vector expression. PR20114.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211581 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 10:38:10 +00:00
David Majnemer
f396732d9b GlobalOpt: Don't optimize dllimport for initializers
Referencing a dllimport variable requires actually instructions, not
just a relocation.  This fixes PR19955.

Differential Revision: http://reviews.llvm.org/D4249

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211571 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 06:53:45 +00:00
Kevin Qin
8c0787e83a [AArch64] Fix a build_vector pattern match fail
caused by defect in isBuildVectorAllZeros().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211567 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 05:37:27 +00:00
Adam Nemet
f36c3de849 [Disasm][AVX512] Implement decoding of top bit for non-destructive reg fields
V' bit in the P2 byte of the EVEX prefix provides the top bit of the NDD and
NDS register fields.  This was simply not used in the decoder until now.

Fixes <rdar://problem/17402661>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211565 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-24 01:42:32 +00:00
Juergen Ributzka
20732d55c2 [FastISel][X86] Lower unsupported selects to control-flow.
The extends the select lowering coverage by emiting pseudo cmov
instructions. These insturction will be later on lowered to control-flow to
simulate the select.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211545 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 21:55:44 +00:00
Juergen Ributzka
d0976a3d20 [FastISel][X86] Add support for floating-point select.
This extends the select lowering to support floating-point selects. The
lowering depends on SSE instructions and that the conditon comes from a
floating-point compare. Under this conditions it is possible to emit an
optimized instruction sequence that doesn't require any branches to
simulate the select.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211544 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 21:55:40 +00:00
Juergen Ributzka
5f4e6e1ec0 [FastISel][X86] Optimize selects when the condition comes from a compare.
Optimize the select instructions sequence to use the EFLAGS directly from a
compare when possible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211543 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 21:55:36 +00:00
NAKAMURA Takumi
63a0ff93c0 nm-trivial-object.test requires shell since Lit internal runner isn't capable of chdir.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211537 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 21:07:04 +00:00
Kevin Enderby
138222e56f Change the default input for llvm-nm to be a.out instead of standard input
to match llvm-size and other UNIX systems for their nm(1).

Tweak test cases that used llvm-nm with standard input to add a "-" to
indicate that and add a test case to check the default of a.out for llvm-nm.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211529 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 20:27:53 +00:00
Rafael Espindola
5e761eb4ae [Mips] Add a target streamer when creating a null streamer.
Should fix DebugInfo/global.ll on the mips bot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211527 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 19:43:40 +00:00
Matt Arsenault
ed143b7c0c R600/SI: Fix div_scale intrinsic.
The operand that must match one of the others does matter,
and implement selecting for it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211523 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 18:28:28 +00:00
Christian Pirker
737f207468 ARMEB: Vector extend operations
Reviewed at http://reviews.llvm.org/D4043



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211520 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 18:05:53 +00:00
Matt Arsenault
9ad2c7ef92 R600: Move add/sub with overflow out of AMDILISelLowering
Add more tests for these.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211517 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 18:00:49 +00:00
Matt Arsenault
c4471e9248 R600/SI: Handle i64 sub.
We can handle it the same way as add

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211514 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 18:00:38 +00:00
Rafael Espindola
67c5f45306 Delete utils/FileUpdate.
It is unused and it looks like it was never used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211508 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 17:58:39 +00:00
Rafael Espindola
88a564f55e Allow using .cfi_startproc without a leading symbol.
This is possible now that we don't produce .eh symbols. This fixes pr19430.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211502 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 15:34:32 +00:00
Rafael Espindola
a118083442 Stop producing func.eh symbols on Darwin.
According Nick Kledzik (http://llvm.org/bugs/show_bug.cgi?id=19430#c2):
"... mach-o no longer needs names in the __eh_frame section (and has not for
years)."

Iain Sandoe confirms it is also unnecessary for their old darwin support.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211500 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 15:13:23 +00:00
Ulrich Weigand
9a154bfe94 [PowerPC] Allow stack frames without parameter save area
The PPCFrameLowering::determineFrameLayout routine currently ensures
that every function that allocates a stack frame provides space for the
parameter save area (via PPCFrameLowering::getMinCallFrameSize).

This is actually not necessary.  There may be functions that never call
another routine but still allocate a frame; those do not require the
parameter save area.  In the future, with the ELFv2 ABI, even some
routines that do call other functions do not need to allocate the
parameter save area.

While it is not a bug to allocate the parameter area when it is not
needed, it is better to avoid it to save stack space.

Note that when any particular function call requires the parameter save
area, this space will already have been included by ABI code in the size
the CALLSEQ_START insn is annotated with, and therefore included in the
size returned by MFI->getMaxCallFrameSize().

This means that determineFrameLayout simply does not need to care about
the parameter save area.  (It still needs to ensure that every frame
provides the linkage area.)  This is implemented by this patch.

Note that this exposed a bug in the new fast-isel code where the parameter
area was *not* included in the CALLSEQ_START size; this is also fixed.

A couple of test cases needed to be adapted for the new (smaller) stack
frame size those tests now see.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211495 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 13:47:52 +00:00
Ulrich Weigand
fdb6eb65c7 [PowerPC] Fix on-stack AltiVec arguments with 64-bit SVR4
Current 64-bit SVR4 code seems to have some remnants of Darwin code
in AltiVec argument handing.  This had the effect that AltiVec arguments
(or subsequent arguments) were not correctly placed in the parameter area
in some cases.

The correct behaviour with the 64-bit SVR4 ABI is:
- All AltiVec arguments take up space in the parameter area, just like
  any other arguments, whether vararg or not.
- They are always 16-byte aligned, skipping a parameter area doubleword
  (and the associated GPR, if any), if necessary.

This patch implements the correct behaviour and adds a test case.
(Verified against GCC behaviour via the ABI compat test suite.)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211492 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 12:36:34 +00:00
Tim Northover
6f7e87c751 ARM: mark UBFX as not allowing PC.
Strictly, it's unpredictable. But we don't quite model that yet and an error is
better than ignoring the issue. This one somehow got left out before though.

rdar://problem/15997748

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211490 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-23 09:20:02 +00:00
Saleem Abdulrasool
7a3698ecc9 MC: adjust text section flags for WoA
Correct the section flags for code built for Windows on ARM with
`-ffunction-sections`.  Windows on ARM uses solely Thumb-2 instructions, and
indicates that the function is thumb by placing it in a text section that has
IMAGE_SCN_MEM_16BIT flag set.

When we encounter a .section directive, a new section is constructed.  This may
be a text segment.  In order to identify that we need the additional flag,
expose the target triple through the ObjectFileInfo as this information is lost
otherwise.

Since any modern ARM targeting environment on Windows would be Thumb-2 (Windows
ARM NT or Windows Embedded Compact), introducing a new flag to indicate the
section attribute seems to be a bit overkill.  Simply depend on the target
triple.  Since there is one location that this information is currently needed,
creating a target specific assembly parser and delegating the parsing of section
switches also feels a bit heavy handed.  If it turns out that this information
ends up changing additional behaviour, then it may be worth considering that
alternative.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211481 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-22 22:25:01 +00:00
NAKAMURA Takumi
9124b45918 Revert r211399, "Generate native unwind info on Win64"
It broke Legacy JIT Tests on x86_64-{mingw32|msvc}, aka Windows x64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211480 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-22 22:00:56 +00:00
Jan Vesely
728ea0c91b R600: Add udivrem test
v2: move < %s to the end of the line
    space after ;
    add v4i32 test

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211476 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-22 21:42:58 +00:00
Filipe Cabecinhas
7798d5992a Fix PR20087 by using the source index when changing the vector load
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211472 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-22 17:21:37 +00:00
NAKAMURA Takumi
fe755c57f8 Introduce a Lit feature "debug_frame" and apply it to llvm/test/MC/ELF/cfi-version.ll.
.debug_frame is not emitted for targeting Windows x64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211466 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-22 12:35:39 +00:00
Benjamin Kramer
84dd75d8d2 Add a description to the test from r211433 explaining why it's written that way.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211465 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-22 12:22:04 +00:00