Commit Graph

25818 Commits

Author SHA1 Message Date
Chandler Carruth
da3a293313 [x86] Clean up some tests to use FileCheck and combine two into a single
file.

Changing code that is covered by these tests is currently just too hard to debug,
and now the nature of the changes will be clear.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216643 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 03:41:28 +00:00
David Majnemer
b11fff1d8a InstSimplify: Move a transform from InstCombine to InstSimplify
Several combines involving icmp (shl C2, %X) C1 can be simplified
without introducing any new instructions.  Move them to InstSimplify;
while we are at it, make them more powerful.
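
For illustration, a hypothetical pattern of the kind this covers (a sketch, not taken from the commit itself):

define i1 @icmp_shl_example(i32 %x) {
  %s = shl nuw i32 2, %x        ; nuw: no bits are shifted out, so %s is always >= 2
  %c = icmp eq i32 %s, 0
  ret i1 %c                     ; can simplify to false without creating any new instructions
}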

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216642 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 03:34:28 +00:00
Juergen Ributzka
4a76317ebb [FastISel] Undo phi node updates when falling-back to SelectionDAG.
The included test case would fail, because the MI PHI node would have two
operands from the same predecessor.

This problem occurs when a switch instruction couldn't be selected. This always
happens, because there is no switch support in FastISel to begin with.

The problem was that FastISel would first add the operand to the PHI nodes and
then fall-back to SelectionDAG, which would then in turn add the same operands
to the PHI nodes again.

This fix removes these duplicate PHI node operands by resetting the
PHINodesToUpdate to its original state before FastISel tried to select the
instruction.
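
A hypothetical IR shape that exercises this path might look like the following sketch (the switch forces the fall-back to SelectionDAG):

define i32 @switch_to_phi(i32 %x) {
entry:
  switch i32 %x, label %other [ i32 0, label %merge ]  ; not selectable by FastISel
other:
  br label %merge
merge:
  ; the operand for %entry must not be added a second time when falling back
  %p = phi i32 [ 1, %entry ], [ 2, %other ]
  ret i32 %p
}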

This fixes <rdar://problem/18155224>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216640 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 02:06:55 +00:00
Juergen Ributzka
d24494d672 [FastISel]
Currently instructions are folded very aggressively for AArch64 into the memory
operation, which can lead to the use of killed operands:
  %vreg1<def> = ADDXri %vreg0<kill>, 2
  %vreg2<def> = LDRBBui %vreg0, 2
  ... = ... %vreg1 ...

This usually happens when the result is also used by another non-memory
instruction in the same basic block, or any instruction in another basic block.

This fix teaches hasTrivialKill to not only check in the LLVM IR that the value
has a single use, but also to check whether the register that represents that
value has already been used. This can happen when the instruction with the use
was folded into another instruction (in this particular case a load instruction).

This fixes rdar://problem/18142857.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216634 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 00:09:46 +00:00
Juergen Ributzka
a26b1bdcc8 Revert "[FastISel][AArch64] Don't fold instructions too aggressively into the memory operation."
Quentin pointed out that this is not the correct approach and there is a better and easier solution.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216632 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 23:09:40 +00:00
Juergen Ributzka
1f5263e43f [FastISel][AArch64] Don't fold instructions too aggressively into the memory operation.
Currently instructions are folded very aggressively into the memory operation,
which can lead to the use of killed operands:
  %vreg1<def> = ADDXri %vreg0<kill>, 2
  %vreg2<def> = LDRBBui %vreg0, 2
  ... = ... %vreg1 ...

This usually happens when the result is also used by another non-memory
instruction in the same basic block, or any instruction in another basic block.

If the computed address is used by only memory operations in the same basic
block, then it is safe to fold them. This is because all memory operations will
fold the address computation and the original computation will never be emitted.

This fixes rdar://problem/18142857.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216629 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 22:52:33 +00:00
Juergen Ributzka
ccf53013cd [FastISel][AArch64] Fix simplify address when the address comes from a shift.
When the address comes directly from a shift instruction, the address
computation cannot be folded into the memory instruction, because the zero
register is not available as a base register. Simplify address needs to emit the
shift instruction and use the result as the base register.
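
A minimal sketch of such an address (hypothetical example, where the shifted value is used directly as the address):

define i32 @load_from_shift(i64 %idx) {
  %addr = shl i64 %idx, 2
  %ptr  = inttoptr i64 %addr to i32*
  ; there is no base register here (it would have to be the zero register),
  ; so the shift must be emitted separately and its result used as the base
  %val  = load i32* %ptr
  ret i32 %val
}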

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216621 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 21:38:33 +00:00
Juergen Ributzka
d445e4acdb [FastISel][AArch64] Use the zero register for stores.
Use the zero register directly when possible to avoid an unnecessary register
copy and a wasted register at -O0. This also uses integer stores to store a
positive floating-point zero. This saves us from materializing the positive zero
in a register and then storing it.
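
For example, stores of the following form can use the zero register directly (hypothetical sketch):

define void @store_zeros(i32* %p, float* %q) {
  store i32 0, i32* %p        ; the zero register can be used as the source
  store float 0.0, float* %q  ; +0.0 can be stored as an integer zero, no FP materialization
  ret void
}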

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216617 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 21:04:52 +00:00
Reid Kleckner
2ab3b563da X86 MC: Handle instructions like fxsave that match multiple operand sizes
Instructions like 'fxsave' and control flow instructions like 'jne'
match any operand size. The loop I added to the Intel syntax matcher
assumed that using a different size would give a different instruction.
Now it handles the case where we get the same instruction for different
memory operand sizes.

This also allows us to remove the hack we had for unsized absolute
memory operands, because we can successfully match things like 'jnz'
without reporting ambiguity.  Removing this hack uncovered a test case
involving 'fadd' that was ambiguous: the memory operand could have been
single or double precision.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216604 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 20:10:38 +00:00
David Majnemer
8ee308f499 InstCombine: Combine gep X, (Y-X) to Y
We try to perform this transform in InstSimplify but we aren't always
able to.  Sometimes, we need to insert a bitcast if X and Y don't have
the same type.
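
A sketch of the differing-type case that InstCombine now handles by inserting the bitcast (hypothetical example):

define i8* @gep_diff(i8* %x, i32* %y) {
  %xi = ptrtoint i8* %x to i64
  %yi = ptrtoint i32* %y to i64
  %d  = sub i64 %yi, %xi
  %p  = getelementptr i8* %x, i64 %d   ; X + (Y - X): combines to a bitcast of %y to i8*
  ret i8* %p
}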

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216598 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 20:08:37 +00:00
David Majnemer
48164ed24d InstSimplify: Don't simplify gep X, (Y-X) to Y if types differ
It's incorrect to perform this simplification if the types differ.
A bitcast would need to be inserted for this to work.

This fixes PR20771.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216597 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 20:08:34 +00:00
Nico Weber
1c768816d7 Reland r216439 and r216441; majnemer has a real fix for PR20771.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216586 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 20:06:19 +00:00
Nico Weber
b2f71836eb Revert r216439 (and r216441, else the former doesn't revert cleanly).
It caused PR 20771. I'll land a test on the clang side.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216582 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 20:00:13 +00:00
David Majnemer
d8e448bd27 InstSimplify: Compute comparison ranges for left shift instructions
'shl nuw CI, x' produces [CI, CI << CLZ(CI)]
'shl nsw CI, x' produces [CI << CLO(CI)-1, CI] if CI is negative
'shl nsw CI, x' produces [CI, CI << CLZ(CI)-1] if CI is non-negative
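
As a hypothetical example of how such a range can fold a comparison (sketch):

define i1 @shl_nuw_range(i32 %x) {
  %s = shl nuw i32 16, %x     ; range is [16, 16 << CLZ(16)], so %s is always >= 16
  %c = icmp ult i32 %s, 16
  ret i1 %c                   ; the computed range lets this fold to false
}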

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216570 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 18:03:46 +00:00
Oliver Stannard
5e487f8dc7 Teach the AArch64 backend about v4f16 and v8f16
This teaches the AArch64 backend to handle the operations on v4f16 and
v8f16 that are exposed by the NEON intrinsics, plus the add, sub, mul and
div operations.
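
A minimal sketch of the kind of operation that now lowers for these types (hypothetical example):

define <4 x half> @add_v4f16(<4 x half> %a, <4 x half> %b) {
  %r = fadd <4 x half> %a, %b   ; v4f16 arithmetic now handled by the AArch64 backend
  ret <4 x half> %r
}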



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216555 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 16:16:04 +00:00
Michael Zolotukhin
b8c95a89e6 [SLP] Re-enable vectorization of GEP expressions (re-apply r210342 with a fix).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216549 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 15:01:18 +00:00
Chandler Carruth
7e3dc40fab [x86] Fix a regression introduced with r213897 for 32-bit targets where
we stopped efficiently lowering sextload using the SSE41 instructions
for that operation.

This is a consequence of a bad predicate I used when thinking about the memory
access needs. The code actually handles the cases where the predicate
doesn't apply, and handles them much better. =] Simple fix and a test
case added. Fixes PR20767.
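
A sketch of the kind of sextload affected (hypothetical example, assuming an SSE4.1-capable 32-bit target):

define <4 x i32> @sextload_v4i16(<4 x i16>* %p) {
  %v = load <4 x i16>* %p
  %e = sext <4 x i16> %v to <4 x i32>   ; should lower to a single pmovsxwd from memory
  ret <4 x i32> %e
}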

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216538 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 11:39:47 +00:00
Chandler Carruth
963a5e6c61 [SDAG] Re-instate r215611 with a fix to a pesky X86 DAG combine.
This combine is essentially combining target-specific nodes back into target
independent nodes that it "knows" will be combined yet again by a target
independent DAG combine into a different set of target-independent nodes that
are legal (not custom though!) and thus "ok". This seems... deeply flawed. The
crux of the problem is that we don't combine un-legalized shuffles that are
introduced by legalizing other operations, and thus we don't see a very
profitable combine opportunity. So the backend just forces the input to that
combine to re-appear.

However, for this to work, the conditions detected to re-form the unlegalized
nodes must be *exactly* right. Previously, failing this would have caused poor
code (if you're lucky) or a crasher when we failed to select instructions.
After r215611 we would fall back into the legalizer. In some cases, this just
"fixed" the crasher by produces bad code. But in the test case added it caused
the legalizer and the dag combiner to iterate forever.

The fix is to make the alignment checking in the x86 side of things match the
alignment checking in the generic DAG combine exactly. This isn't really a
satisfying or principled fix, but it at least make the code work as intended.
It also highlights that it would be nice to detect the availability of under
aligned loads for a given type rather than bailing on this optimization. I've
left a FIXME to document this.

Original commit message for r215611, which covers the rest of the change:
  [SDAG] Fix a case where we would iteratively legalize a node during
  combining by replacing it with something else but not re-process the
  node afterward to remove it.

  In a truly remarkable stroke of bad luck, this would (in the test case
  attached) end up getting some other node combined into it without ever
  getting re-processed. By adding it back on to the worklist, in addition
  to deleting the dead nodes more quickly we also ensure that if it
  *stops* being dead for any reason it makes it back through the
  legalizer. Without this, the test case will end up failing during
  instruction selection due to an and node with a type we don't have an
  instruction pattern for.

It took many millions of runs of the shuffle fuzz tester to find this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216537 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 11:22:16 +00:00
Robert Khasanov
e79a94a839 [SKX] Added new versions of cmp instructions in avx512_icmp_cc multiclass, added VL multiclass.
Added encoding tests


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216532 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 09:34:37 +00:00
Elena Demikhovsky
fe0c6ead85 AVX-512: Added intrinsic for VMOVSS store form with mask.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216530 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 07:38:43 +00:00
David Majnemer
dd5456bd01 InstCombine: Optimize GEP's involving ptrtoint better
We supported transforming:
(gep i8* X, -(ptrtoint Y))

to:
(inttoptr (sub (ptrtoint X), (ptrtoint Y)))

However, this only fired if 'X' had type i8*.  Generalize this to
support various types of different sizes.  This results in much better
CodeGen, especially for pointers to packed structs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216523 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 05:16:04 +00:00
David Blaikie
ad2a271781 Remove type unit skeletons. GDB no longer needs them & this saves a heap of space.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216521 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 05:04:14 +00:00
Juergen Ributzka
fc03e72b4f [FastISel][AArch64] Fix address simplification.
When a shift with extension or an add with shift and extension cannot be folded
into the memory operation, the address calculation has to be materialized
separately. While doing so, the code forgot to consider a possible sign-/zero-
extension. This fix now also folds the sign-/zero-extension into the add or
shift instruction that is used to materialize the address.
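
A hypothetical shape of such an address computation (sketch; whether the extension folds into the load itself or into the materialized add depends on the addressing mode):

define i64 @load_sext_index(i64* %base, i32 %idx) {
  %idx64 = sext i32 %idx to i64
  %addr  = getelementptr i64* %base, i64 %idx64   ; base + (sext %idx) * 8
  ; when this cannot be folded into the load, the add/shift that
  ; materializes the address should absorb the sign-extension
  %val   = load i64* %addr
  ret i64 %val
}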

This fixes rdar://problem/18141718.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216511 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 00:58:30 +00:00
Juergen Ributzka
836f4bd090 [FastISel][AArch64] Fold Sign-/Zero-Extend into the shift immediate instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216510 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 00:58:26 +00:00
David Blaikie
c04b78627d Fix a couple of debug info test cases to match the metadata schema change in r216239
Found these while testing something else.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216505 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 00:04:16 +00:00
Reid Kleckner
3c92309f0d MC: Split the x86 asm matcher implementations by dialect
The existing matcher has lots of AT&T assembly dialect assumptions baked
into it.  In particular, the hack for resolving the size of a memory
operand by appending the four most common suffixes doesn't work at all.
The Intel assembly dialect mnemonic table has ambiguous entries, so we
need to try matching multiple times with different operand sizes, since
that's the only way to choose different instruction variants.

This makes us more compatible with gas's implementation of Intel
assembly syntax.  MSVC assumes you want byte-sized operations for the
instructions that we reject as ambiguous.

Reviewed By: grosbach

Differential Revision: http://reviews.llvm.org/D4747

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216481 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-26 20:32:34 +00:00
Joerg Sonnenberger
1d5cdfd751 Revert r210342 and r210343, add test case for the crasher.
PR 20642.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216475 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-26 19:06:41 +00:00
Yi Kong
2282afa6cc ARM: Add patterns for dbg
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216451 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-26 12:47:26 +00:00
David Majnemer
8058ffeb18 InstSimplify: Fold gep X, (sub 0, ptrtoint(X)) to null
Save InstCombine some work if we can perform this fold during
InstSimplify.
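
A minimal sketch of the folded pattern (hypothetical example):

define i8* @fold_to_null(i8* %x) {
  %xi  = ptrtoint i8* %x to i64
  %neg = sub i64 0, %xi
  %p   = getelementptr i8* %x, i64 %neg   ; X + (0 - ptrtoint(X)) simplifies to null
  ret i8* %p
}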

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216441 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-26 07:08:03 +00:00
David Majnemer
594e4a1dd3 InstSimplify: Simplify trivial pointer expressions like b + (e - b)
consider:
long long *f(long long *b, long long *e) {
  return b + (e - b);
}

we would lower this to something like:
define i64* @f(i64* %b, i64* %e) {
  %1 = ptrtoint i64* %e to i64
  %2 = ptrtoint i64* %b to i64
  %3 = sub i64 %1, %2
  %4 = ashr exact i64 %3, 3
  %5 = getelementptr inbounds i64* %b, i64 %4
  ret i64* %5
}

This should fold away to just 'e'.

N.B.  This adds m_SpecificInt as a convenient way to match against a
particular 64-bit integer when using LLVM's match interface.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216439 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-26 05:55:16 +00:00
Reid Kleckner
fad8d818db musttail: Don't eliminate varargs packs if there is a forwarding call
Also clean up and beef up this grep test for the feature.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216425 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-26 00:59:51 +00:00
Reid Kleckner
44b3a0b411 Declare that musttail calls in variadic functions forward the ellipsis
Summary:
There is no functionality change here except in the way we assemble and
dump musttail calls in variadic functions. There's really no need to
separate out the bits for musttail and "is forwarding varargs" on call
instructions. A musttail call by definition has to forward the ellipsis
or it would fail verification.
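
A sketch of how such a call might be written in a variadic function (hypothetical names; the trailing '...' in the argument list is the forwarded ellipsis described above):

declare void @target(i8*, ...)

define void @thunk(i8* %this, ...) {
  musttail call void (i8*, ...)* @target(i8* %this, ...)  ; forwards the variadic arguments by definition
  ret void
}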

Reviewers: chandlerc, nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D4892

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216423 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-26 00:33:28 +00:00
Reid Kleckner
9d1f8b1b21 ArgPromotion: Don't touch variadic functions
Adding, removing, or changing non-pack parameters can change the ABI
classification of pack parameters. Clang and other frontends encode the
classification in the IR of the call site, but the callee side
determines it dynamically based on the number of registers consumed so
far. Changing the prototype affects the number of registers consumed, and so
would break such code.
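
A hypothetical sketch of why promotion is unsafe here (all names are illustrative):

define internal i32 @vararg_callee(i32* %p, ...) {
  %v = load i32* %p
  ret i32 %v
}

define i32 @caller(i32* %p, i32 %extra) {
  ; promoting %p to a by-value i32 could change how the variadic
  ; argument %extra is classified and passed on the callee side
  %r = call i32 (i32*, ...)* @vararg_callee(i32* %p, i32 %extra)
  ret i32 %r
}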

Dead argument elimination performs a similar task and already has a
similar check to avoid this problem.

Patch by Thomas Jablin!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216421 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-25 23:58:48 +00:00
Bruno Cardoso Lopes
ff69509f94 Remove dangling initializers in GlobalDCE
GlobalDCE deletes global vars and sets their initializers to nullptr,
leaving the underlying constants to be cleaned up later by their users.
That cleanup may never happen; fix this by forcing it every time it is
safe to destroy the constants.
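
A sketch of the situation (hypothetical example): deleting @dead should also destroy its initializer, a constant GEP that still refers to @base.

@base = internal global [16 x i8] zeroinitializer
@dead = internal global i8* getelementptr inbounds ([16 x i8]* @base, i32 0, i32 4)
; when GlobalDCE drops @dead, the dangling ConstantExpr initializer is now
; destroyed as well, instead of lingering as a use of @base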

Final patch by Rafael Espindola
http://reviews.llvm.org/D4931

<rdar://problem/17523868>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216390 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-25 17:51:14 +00:00
Chad Rosier
373fc00835 [AArch32] Add patterns for VCVT{A,N,P,M}.
Patterns for lowering libm calls to VCVT{A,N,P,M} are also included.
Phabricator Revision: http://reviews.llvm.org/D5033

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216388 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-25 16:56:33 +00:00
Robert Khasanov
cc4b123a47 [SKX] avx512_icmp_packed multiclass extension
Extended the avx512_icmp_packed multiclass with masking versions.
Added the avx512_icmp_packed_rmb multiclass for embedded broadcast versions.
Added the corresponding _vl multiclasses.
Added encoding tests for VPCMP{EQ|GT}* instructions.
Added more fields to X86VectorVTInfo.
Added AVX512VLVectorVTInfo, which includes X86VectorVTInfo for the 512/256/128-bit versions.

Differential Revision: http://reviews.llvm.org/D5024


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216383 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-25 14:49:34 +00:00
Karthik Bhat
e637d65af3 Allow vectorization of division by uniform power of 2.
This patch adds support for recognizing division by a uniform power of 2 and modifies the cost table to vectorize division by a uniform power of 2 whenever possible.
Updates the cost model for the Loop and SLP Vectorizers. The cost table is currently only updated for the X86 backend.
Thanks to Hal, Andrea, Sanjay for the review. (http://reviews.llvm.org/D4971)
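
A hypothetical sketch of straight-line code of the kind the SLP vectorizer can now consider (the divisor is a uniform power of 2):

define void @sdiv_by_16(i32* %a) {
  %p0 = getelementptr i32* %a, i64 0
  %p1 = getelementptr i32* %a, i64 1
  %p2 = getelementptr i32* %a, i64 2
  %p3 = getelementptr i32* %a, i64 3
  %v0 = load i32* %p0
  %v1 = load i32* %p1
  %v2 = load i32* %p2
  %v3 = load i32* %p3
  %d0 = sdiv i32 %v0, 16   ; same power-of-2 divisor in every lane
  %d1 = sdiv i32 %v1, 16
  %d2 = sdiv i32 %v2, 16
  %d3 = sdiv i32 %v3, 16
  store i32 %d0, i32* %p0
  store i32 %d1, i32* %p1
  store i32 %d2, i32* %p2
  store i32 %d3, i32* %p3
  ret void
}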



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216371 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-25 04:56:54 +00:00
David Majnemer
5cbd5a13a4 InstCombine: Properly optimize or'ing bittests together
CFE, with -O3, would turn:
bool f(unsigned x) {
  bool a = x & 1;
  bool b = x & 2;
  return a | b;
}

into:
  %1 = lshr i32 %x, 1
  %2 = or i32 %1, %x
  %3 = and i32 %2, 1
  %4 = icmp ne i32 %3, 0

This sort of thing exposes a nasty pathology in GCC, ICC and LLVM.

Instead, we would rather want:
  %1 = and i32 %x, 3
  %2 = icmp ne i32 %1, 0

Things get a bit more interesting in the following case:
  %1 = lshr i32 %x, %y
  %2 = or i32 %1, %x
  %3 = and i32 %2, 1
  %4 = icmp ne i32 %3, 0

Replacing it with the following sequence is better:
  %1 = shl nuw i32 1, %y
  %2 = or i32 %1, 1
  %3 = and i32 %2, %x
  %4 = icmp ne i32 %3, 0

This sequence is preferable because %1 doesn't involve %x and could
potentially be hoisted out of loops if it is invariant; only perform
this transform in the non-constant case if we know we won't increase
register pressure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216343 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-24 09:10:57 +00:00
Hal Finkel
7ca2a7d742 [PowerPC] Add support for dcbtst and icbt (prefetch)
Adds code generation support for dcbtst (data cache prefetch for write) and
icbt (instruction cache prefetch for read - Book E cores only).
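
A hypothetical example of a prefetch intrinsic form that now selects (per the commit, a write prefetch of the data cache should become dcbtst):

declare void @llvm.prefetch(i8*, i32, i32, i32)

define void @prefetch_write(i8* %p) {
  ; address, rw = 1 (write), locality = 3, cache type = 1 (data)
  call void @llvm.prefetch(i8* %p, i32 1, i32 3, i32 1)
  ret void
}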

We still end up with a 'cannot select' error for the non-supported prefetch
intrinsic forms. This will be fixed in a later commit.

Fixes PR20692.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216339 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-23 23:21:04 +00:00
Chad Rosier
8eb867e97d Revert "ARM: improve RTABI 4.2 conformance on Linux"
This reverts commit r215862 due to nightly failures.  Will work on getting a
reduced test case, but I wanted to get our bots green in the meantime.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216325 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-23 18:29:43 +00:00
Chandler Carruth
bfed08e41f [x86] Start fixing a really subtle and terrible form of miscompile in
these DAG combines.

The DAG auto-CSE thing is truly terrible. Due to it, when RAUW-ing
a node with its operand, you can cause its uses to CSE to itself, which
then causes their uses to become your uses which causes them to be
picked up by the RAUW. For nodes that are determined to be "no-ops",
this is "fine". But if the RAUW is one of several steps to enact
a transformation, this causes the DAG to really silently eat and discard
nodes that you would never expect. It took days for me to actually
pinpoint a test case triggering this and a really frustrating amount of
time to even comprehend the bug because I never even thought about the
ability of RAUW to iteratively consume nodes due to CSE-ing them into
itself.

To fix this, we have to build up a brand-new chain of operations any
time we are combining across (potentially) intervening nodes. But once
the logic is added to do this, another issue surfaces: CombineTo eagerly
deletes the one node combined, *but no others*. This is... really
frustrating. If deleting it makes its operands become dead, those
operand nodes often won't go onto the worklist in the
order you would want -- they're already on it and not near the top. That
means things higher on the worklist will get combined prior to these
dead nodes being GCed out of the worklist, and if the chain is long, the
immediate users won't be enough to re-detect where the root of the chain
is that became single-use again after deleting the dead nodes. The
better way to do this is to never immediately delete nodes, and instead
to just enqueue them so we can recursively delete them. The
combined-from node is typically not on the worklist anyways by virtue of
having been popped off.... But that in turn breaks other tests that
*require* CombineTo to delete unused nodes. :: sigh ::

Fortunately, there is a better way. This whole routine should have been
returning the replacement rather than using CombineTo which is quite
hacky. Switch to that, and all the pieces fall together.

I suspect the same kind of miscompile is possible in the half-shuffle
folding code, and potentially the recursive folding code. I'll be
switching those over to a pattern more like this one for safety's sake
even though I don't immediately have any test cases for them. Note that
the only way I got a test case for this instance was with *heavily* DAG
combined 256-bit shuffle sequences generated by my fuzzer. ;]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216319 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-23 10:25:15 +00:00
Alex Lorenz
e62541e6d6 llvm-cov: test: add xfail for the big-endian buildbots
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216310 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-23 00:47:24 +00:00
Nick Lewycky
f591b9c33e Revert r215611 because it caused the infinite loop in bug 20736. There is a reduced testcase in that bug.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216307 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-23 00:45:03 +00:00
Yunzhong Gao
9fe92725af Add a test case for SROA where the store size is bigger than the slice size. The
test case was fixed in r216248.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216303 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 23:27:04 +00:00
Rafael Espindola
03d2823e02 Add support for comdats to the gold plugin.
There are two parts to this. First, the plugin needs to tell gold the comdat by
setting comdat_key.

What gets things a bit more complicated is that gold only sees
symbols. In particular, if A is an alias to B, it only sees the symbols
A and B. It can then ask us to keep symbol A but drop symbol B. What
we have to do instead is to create an internal version of B and make A
an alias to that.

At some point some of this logic should be moved to lib/Linker so that
we don't map a Constant to an internal version just to have lib/Linker
map that again to the destination module.

The reason for implementing this in tools/gold for now is simplicity.
With it in place it should be possible to update clang to use comdats
for constructors and destructors on ELF without breaking the LTO
bootstrap. Once that is done I intend to come back and improve the
interface lib/Linker exposes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216302 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 23:26:10 +00:00
Alex Lorenz
6c7a6a1ba2 llvm-cov: add code coverage tool that's based on coverage mapping format and clang's pgo.
This commit expands llvm-cov's functionality by adding support for a new code coverage
tool that uses LLVM's coverage mapping format and clang's instrumentation-based profiling.
The gcov-compatible tool can be invoked by supplying the 'gcov' command as the first argument,
or by modifying the tool's name to end with 'gcov'.

Differential Revision: http://reviews.llvm.org/D4445


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216300 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 22:56:03 +00:00
Jingyue Wu
8be5600f0a [SROA] Fold a PHI node if all its incoming values are the same
Summary:
Fixes PR20425.

During slice building, if all of the incoming values of a PHI node are the same, replace the PHI node with the common value. This simplification makes alloca's used by PHI nodes easier to promote.
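
A minimal sketch of such a PHI node (hypothetical example):

define i32 @phi_same_incoming(i1 %c) {
entry:
  %a = alloca i32
  store i32 7, i32* %a
  br i1 %c, label %then, label %merge
then:
  br label %merge
merge:
  ; every incoming value is %a, so the PHI can be replaced with %a,
  ; which lets SROA promote the alloca
  %p = phi i32* [ %a, %entry ], [ %a, %then ]
  %v = load i32* %p
  ret i32 %v
}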

Test Plan: Added three more tests in phi-and-select.ll

Reviewers: nlewycky, eliben, meheff, chandlerc

Reviewed By: chandlerc

Subscribers: zinovy.nis, hfinkel, baldrick, llvm-commits

Differential Revision: http://reviews.llvm.org/D4659

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216299 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 22:45:57 +00:00
Reid Kleckner
d89c0abc07 ARM / x86_64 varargs: Don't save regparms in prologue without va_start
There's no need to do this if the user doesn't call va_start. In the
future, we're going to have thunks that forward these register
parameters with musttail calls, and they won't need these spills for
handling va_start.
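
A hypothetical sketch of a variadic function that no longer needs the register save area spills:

define void @takes_varargs(i32 %a, ...) {
  ; no call to @llvm.va_start, so the incoming register parameters
  ; never need to be spilled to the register save area
  ret void
}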

Most of the test suite changes are adding va_start calls to existing
tests to keep things working.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216294 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 21:59:26 +00:00
Kevin Enderby
d9757b42ad Add the start of the support for llvm-objdump’s -private-headers for Mach-O files.
This adds the printing of the mach header. Load command printing will be next.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216285 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 20:35:18 +00:00
Tom Stellard
f50f927d65 R600/SI: Use READ2/WRITE2 instructions for 64-bit mem ops with 32-bit alignment
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216279 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 18:49:35 +00:00