Commit Graph

17145 Commits

Author SHA1 Message Date
Dylan Noblesmith
c97e8d8ddc CodeGen: switch raw array to std::vector
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216355 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-25 00:28:31 +00:00
Craig Topper
273fd11da9 Use range based for loops to avoid needing to re-mention SmallPtrSet size.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216351 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-24 23:23:06 +00:00
Nick Lewycky
f591b9c33e Revert r215611 because it caused the infinite loop in bug 20736. There is a reduced testcase in that bug.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216307 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-23 00:45:03 +00:00
Reid Kleckner
d89c0abc07 ARM / x86_64 varargs: Don't save regparms in prologue without va_start
There's no need to do this if the user doesn't call va_start. In the
future, we're going to have thunks that forward these register
parameters with musttail calls, and they won't need these spills for
handling va_start.

Most of the test suite changes are adding va_start calls to existing
tests to keep things working.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216294 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 21:59:26 +00:00
David Blaikie
c7260209a8 Use DILexicalBlockFile, rather than DILexicalBlock, to track discriminator changes to ensure discriminator changes don't introduce new DWARF DW_TAG_lexical_blocks.
Somewhat unnoticed in the original implementation of discriminators, but
it could cause instructions to end up in new, small,
DW_TAG_lexical_blocks due to the use of DILexicalBlock to track
discriminator changes.

Instead, use DILexicalBlockFile which we already use to track file
changes without introducing new scopes, so it works well to track
discriminator changes in the same way.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216239 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 22:45:21 +00:00
Sanjay Patel
d1a09c47d2 name change: isPow2DivCheap -> isPow2SDivCheap
isPow2DivCheap

That name doesn't specify signed or unsigned.

Lazy as I am, I eventually read the function and variable comments. It turns out that this is strictly about signed div. But I discovered that the comments are wrong:

   srl/add/sra

is not the general sequence for signed integer division by power-of-2. We need one more 'sra':

   sra/srl/add/sra

That's the sequence produced in DAGCombiner. The first 'sra' may be removed when dividing by exactly '2', but that's a special case.

This patch corrects the comments, changes the name of the flag bit, and changes the name of the accessor methods.

No functional change intended.

Differential Revision: http://reviews.llvm.org/D5010


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216237 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 22:31:48 +00:00
Quentin Colombet
bd66db27fd [PeepholeOptimizer] Enable the advanced copy optimization by default.
The advanced copy optimization does not yield any difference on the whole llvm
test-suite + SPECs, either in compile time or runtime (binaries are identical),
but has a big potential when data go back and forth between register files as
demonstrated with test/CodeGen/ARM/adv-copy-opt.ll.

Note: This was measured for both Os and O3 for armv7s, arm64, and x86_64.

<rdar://problem/12702965>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216236 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 22:23:52 +00:00
Robin Morisset
cf165c36ee Rename AtomicExpandLoadLinked into AtomicExpand
AtomicExpandLoadLinked is currently rather ARM-specific. This patch is the first of
a group that aim at making it more target-independent. See
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075873.html
for details

The command line option is "atomic-expand"

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216231 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 21:50:01 +00:00
Quentin Colombet
4921d1af7d [PeepholeOptimizer] Update the kill flags when extending the live-range of the
source of a copy.

<rdar://problem/12702965>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216229 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 21:34:06 +00:00
David Blaikie
95ca0fb247 Explicitly pass ownership of the MemoryBuffer to AddNewSourceBuffer using std::unique_ptr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216223 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 20:44:56 +00:00
Benjamin Kramer
daada81e5c DAGCombiner: Make concat_vector combine safe for EVTs and concat_vectors with many arguments.
PR20677

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216175 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 13:28:02 +00:00
Oliver Stannard
760a46522a [ARM] Enable DP copy, load and store instructions for FPv4-SP
The FPv4-SP floating-point unit is generally referred to as
single-precision only, but it does have double-precision registers and
load, store and GPR<->DPR move instructions which operate on them.
This patch enables the use of these registers, the main advantage of
which is that we now comply with the AAPCS-VFP calling convention.
This partially reverts r209650, which added some AAPCS-VFP support,
but did not handle return values or alignment of double arguments in
registers.

This patch also adds tests for Thumb2 code generation for
floating-point instructions and intrinsics, which previously only
existed for ARM.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216172 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 12:50:31 +00:00
Craig Topper
431bdfc4c1 Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216158 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 05:55:13 +00:00
Jiangning Liu
150cef218f Revert r216066, "Optimize ZERO_EXTEND and SIGN_EXTEND in both SelectionDAG Builder and type".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216147 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 01:59:30 +00:00
Quentin Colombet
e817bdd304 [PeepholeOptimizer] Take advantage of the isInsertSubreg property in the
advanced copy optimization.

This is the final step patch toward transforming:
udiv    r0, r0, r2
udiv    r1, r1, r3
vmov.32 d16[0], r0
vmov.32 d16[1], r1
vmov    r0, r1, d16
bx      lr

into:
udiv    r0, r0, r2
udiv    r1, r1, r3
bx      lr

Indeed, thanks to this patch, this optimization is able to look through
vmov.32 d16[0], r0
vmov.32 d16[1], r1

and is able to rewrite the following sequence:
vmov.32 d16[0], r0
vmov.32 d16[1], r1
vmov    r0, r1, d16

into simple generic GPR copies that the coalescer managed to remove.

<rdar://problem/12702965>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216144 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 00:19:16 +00:00
Quentin Colombet
0d15213307 Add isInsertSubreg property.
This patch adds a new property: isInsertSubreg and the related target hooks:
TargetIntrInfo::getInsertSubregInputs and
TargetInstrInfo::getInsertSubregLikeInputs to specify that a target specific
instruction is a (kind of) INSERT_SUBREG.

The approach is similar to r215394.

<rdar://problem/12702965>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216139 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 23:49:36 +00:00
Quentin Colombet
8ccce6b8fe [PeepholeOptimizer] Take advantage of the isExtractSubreg property in the
advanced copy optimization.

This patch is a step toward transforming:
udiv	r0, r0, r2
udiv	r1, r1, r3
vmov.32	d16[0], r0
vmov.32	d16[1], r1
vmov	r0, r1, d16
bx	lr

into:
udiv	r0, r0, r2
udiv	r1, r1, r3
bx	lr

Indeed, thanks to this patch, this optimization is able to look through
vmov r0, r1, d16
but it does not understand yet
vmov.32 d16[0], r0
vmov.32 d16[1], r1

Comming patches will fix that and update the related test case.

<rdar://problem/12702965>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216136 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 23:13:02 +00:00
Quentin Colombet
dac67649f2 Add isExtractSubreg property.
This patch adds a new property: isExtractSubreg and the related target hooks:
TargetIntrInfo::getExtractSubregInputs and
TargetInstrInfo::getExtractSubregLikeInputs to specify that a target specific
instruction is a (kind of) EXTRACT_SUBREG.

The approach is similar to r215394.

<rdar://problem/12702965>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216130 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 21:51:26 +00:00
Alexey Samsonov
fa210acc3b Fix null reference creation in SelectionDAG constructor.
Store TargetSelectionDAGInfo as a pointer instead of a reference:
getSelectionDAGInfo() may not be implemented for certain backends
(e.g. it's not currently implemented for R600).

This bug is reported by UBSan.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216129 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 21:40:15 +00:00
Alexey Samsonov
a046b4149c Cleanup: Delete seemingly unused reference to MachineDominatorTree from ScheduleDAGInstrs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216124 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 20:57:26 +00:00
Alexey Samsonov
ada5f2a2c7 Fix null reference creation in ScheduleDAGInstrs constructor call.
Both MachineLoopInfo and MachineDominatorTree may be null in ScheduleDAGMI
constructor call. It is undefined behavior to take references to these values.

This bug is reported by UBSan.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216118 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 19:36:05 +00:00
Sanjay Patel
3deb3e32b2 critical-anti-dependency breaker: don't use reg def info from kill insts (PR20308)
In PR20308 ( http://llvm.org/bugs/show_bug.cgi?id=20308 ), the critical-anti-dependency breaker
caused a miscompile because it broke a WAR hazard using a register that it thinks is available
based on info from a kill inst. Until PR18663 is solved, we shouldn't use any def/use info from
a kill because they are really just nops.

This patch adds guard checks for kills around calls to ScanInstruction() where the DefIndices
array is set. For good measure, add an assert in ScanInstruction() so we don't hit this bug again.

The test case is a reduced version of the code from the bug report.

Differential Revision: http://reviews.llvm.org/D4977



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216114 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 18:03:00 +00:00
Quentin Colombet
dcd3cbea54 [PeepholeOptimizer] Refactor the advanced copy optimization to take advantage of
the isRegSequence property.

This is a follow-up of r215394 and r215404, which respectively introduces the
isRegSequence property and uses it for ARM.

Thanks to the property introduced by the previous commits, this patch is able
to optimize the following sequence:
vmov	d0, r2, r3
vmov	d1, r0, r1
vmov	r0, s0
vmov	r1, s2
udiv	r0, r1, r0
vmov	r1, s1
vmov	r2, s3
udiv	r1, r2, r1
vmov.32	d16[0], r0
vmov.32	d16[1], r1
vmov	r0, r1, d16
bx	lr

into:
udiv	r0, r0, r2
udiv	r1, r1, r3
vmov.32	d16[0], r0
vmov.32	d16[1], r1
vmov	r0, r1, d16
bx	lr

This patch refactors how the copy optimizations are done in the peephole
optimizer. Prior to this patch, we had one copy-related optimization that
replaced a copy or bitcast by a generic, more suitable (in terms of register
file), copy.

With this patch, the peephole optimizer features two copy-related optimizations:
1. One for rewriting generic copies to generic copies:
PeepholeOptimizer::optimizeCoalescableCopy.
2. One for replacing non-generic copies with generic copies:
PeepholeOptimizer::optimizeUncoalescableCopy.

The goals of these two optimizations are slightly different: one rewrite the
operand of the instruction (#1), the other kills off the non-generic instruction
and replace it by a (sequence of) generic instruction(s).

Both optimizations rely on the ValueTracker introduced in r212100.

The ValueTracker has been refactored to use the information from the
TargetInstrInfo for non-generic instruction. As part of the refactoring, we
switched the tracking from the index of the definition to the actual register
(virtual or physical). This one change is to provide better consistency with
register related APIs and to ease the use of the TargetInstrInfo.

Moreover, this patch introduces a new helper class CopyRewriter used to ease the
rewriting of generic copies (i.e., #1).

Finally, this patch adds a dead code elimination pass right after the peephole
optimizer to get rid of dead code that may appear after rewriting.

This is related to <rdar://problem/12702965>.

Review: http://reviews.llvm.org/D4874


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216088 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 17:41:48 +00:00
Jiangning Liu
e04455d72b Optimize ZERO_EXTEND and SIGN_EXTEND in both SelectionDAG Builder and type
legalization stage. With those two optimizations, fewer signed/zero extension
instructions can be inserted, and then we can expose more opportunities to
Machine CSE pass in back-end.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216066 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 12:05:15 +00:00
Juergen Ributzka
f08cddcf56 Reapply [FastISel] Let the target decide first if it wants to materialize a constant (215588).
Note: This was originally reverted to track down a buildbot error. This commit
exposed a latent bug that was fixed in r215753. Therefore it is reapplied
without any modifications.

I run it through SPEC2k and SPEC2k6 for AArch64 and it didn't introduce any new
regeressions.

Original commit message:
This changes the order in which FastISel tries to materialize a constant.
Originally it would try to use a simple target-independent approach, which
can lead to the generation of inefficient code.

On X86 this would result in the use of movabsq to materialize any 64bit
integer constant - even for simple and small values such as 0 and 1. Also
some very funny floating-point materialization could be observed too.

On AArch64 it would materialize the constant 0 in a register even the
architecture has an actual "zero" register.

On ARM it would generate unnecessary mov instructions or not use mvn.

This change simply changes the order and always asks the target first if it
likes to materialize the constant. This doesn't fix all the issues
mentioned above, but it enables the targets to implement such
optimizations.

Related to <rdar://problem/17420988>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216006 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 19:05:24 +00:00
Oliver Stannard
eb922109f9 Teach the AArch64 backend to handle f16
This allows the AArch64 backend to handle fadd, fsub, fmul and fdiv
operations on f16 (half-precision) types by promoting to f32.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215891 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-18 14:22:39 +00:00
Craig Topper
db77b82ed5 Revert "Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size."
Getting a weird buildbot failure that I need to investigate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215870 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-18 00:24:38 +00:00
Craig Topper
f06c7072c2 Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215868 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-17 23:47:00 +00:00
Matt Arsenault
5f8a9ae17c Fix fmul combines with constant splat vectors
Fixes things like fmul x, 2 -> fadd x, x

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215820 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-16 10:14:19 +00:00
Andrea Di Biagio
89cea3c36b [DAGCombiner] Improve the folding of target independet shuffles to Undef.
When combining a pair of shuffle nodes, check if the combined shuffle mask is
trivially Undef. In case, immediately fold that pair of shuffles to Undef.

The lack of checks for undef masks was the root-cause of a poor-codegen bug
in the dag combiner.

Example:
  %1 = shufflevector <4 x i32> %A, <4 x i32> %B, <4 x i32> <i32 4, i32 1, i32 1, i32 6>
  %2 = shufflevector <4 x i32> %1, <4 x i32> undef, <4 x i32> <i32 0, i32 4, i32 1, i32 6>
  %3 = shufflevector <4 x i32> %2, <4 x i32> undef, <4 x i32> <i32 1, i32 5, i32 3, i32 3>

Before this patch, on x86 (with -mcpu=corei7) we failed to fold the entire
sequence to Undef value and therefore we generated:
  shufps $-123, %xmm1, $xmm0
  pshufd $-46, %xmm0, %xmm0

With this patch, the entire shuffle sequence is folded to Undef and no
shuffles are generated in the output assembly.

Added new test cases to test 'combine-vec-shuffle-5.ll'.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215797 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-16 00:29:44 +00:00
Hal Finkel
227df4bca0 Make isAliased property for fixed-offset stack objects adjustable
We used to assume that any fixed-offset stack object was not aliased. This
meant that no IR value could point to the memory contained in such an object.
This is a reasonable default, but is not a universally-correct
target-independent fact. For example, on PowerPC (both Darwin and non-Darwin),
some byval arguments are allocated at fixed offsets by the ABI. These, however,
certainly can be pointed to by IR values. This change moves the 'isAliased'
logic out of FixedStackPseudoSourceValue and into MFI, and allows the isAliased
property to be overridden for fixed-offset objects.

This will be used by an upcoming commit to the PowerPC backend to fix PR20280.

No functionality change intended (the behavior of
FixedStackPseudoSourceValue::isAliased has been made more conservative for
callers that don't pass an MFI object, but I don't see any in-tree callers that
do that).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215794 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-16 00:17:02 +00:00
Robin Morisset
c51ec911e5 Fix typos in comments
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215777 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 22:17:28 +00:00
Juergen Ributzka
f92cdd62c0 [FastISel] Remove an performance debugging assert.
As Jim pointed out this assert isn't really needed to test for correctness,
because the code right afterwards does the same check and falls-back to
SelectionDAG - as intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215735 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 17:36:30 +00:00
Rafael Espindola
607e03b0d4 Delete dead code. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215720 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 14:58:22 +00:00
Juergen Ributzka
6398a7f5fd Revert several FastISel commits to track down a buildbot error.
This reverts:
r215595 "[FastISel][X86] Add large code model support for materializing floating-point constants."
r215594 "[FastISel][X86] Use XOR to materialize the "0" value."
r215593 "[FastISel][X86] Emit more efficient instructions for integer constant materialization."
r215591 "[FastISel][AArch64] Make use of the zero register when possible."
r215588 "[FastISel] Let the target decide first if it wants to materialize a constant."
r215582 "[FastISel][AArch64] Cleanup constant materialization code. NFCI."

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215673 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 19:56:28 +00:00
Sanjay Patel
9615d702ad optimize vector fneg of bitcasted integer value
This patch allows a vector fneg of a bitcasted integer value to be optimized in the same way that we already optimize a scalar fneg. If the integer variable is a constant, we can precompute the result and not require any logic ops.

This patch is very similar to a fabs patch committed at r214892.

Differential Revision: http://reviews.llvm.org/D4852



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215646 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 15:15:28 +00:00
Chandler Carruth
369e0ef67d [SDAG] Fix a bug in the DAG combiner where we would fail to return the
input node after manually adding it to the worklist and using CombineTo.

Once we use CombineTo the input node may have been deleted. Despite this
being *completely confusing* and somewhat broken, the only way to
"correctly" return from a DAG combine after potentially deleting the
input node is to return *that exact node*....

But really, this code should just never have used CombineTo. It won't do
what it wants (returning the node as mentioned above just causes the
combine to infloop). The correct way to combine away a casted load to
a load of the correct type is to RAUW the chain directly and then return
the loaded value to replace the actual value node.

I managed to find this with the vector shuffle fuzzer even though it
clearly has nothing at all to do with vector shuffles and rather those
happen to trigger a load of a constant pool that hits this combine *just
right*. I've included the test as it is small and a nice stress test
that the infrastructure isn't asserting.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215622 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 08:18:34 +00:00
Chandler Carruth
14ee003f1a [SDAG] Fix a case where we would iteratively legalize a node during
combining by replacing it with something else but not re-process the
node afterward to remove it.

In a truly remarkable stroke of bad luck, this would (in the test case
attached) end up getting some other node combined into it without ever
getting re-processed. By adding it back on to the worklist, in addition
to deleting the dead nodes more quickly we also ensure that if it
*stops* being dead for any reason it makes it back through the
legalizer. Without this, the test case will end up failing during
instruction selection due to an and node with a type we don't have an
instruction pattern for.

It took many million runs of the shuffle fuzz tester to find this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215611 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 01:07:37 +00:00
Juergen Ributzka
eb1c51f8b3 [FastISel] Let the target decide first if it wants to materialize a constant.
This changes the order in which FastISel tries to materialize a constant.
Originally it would try to use a simple target-independent approach, which
can lead to the generation of inefficient code.

On X86 this would result in the use of movabsq to materialize any 64bit
integer constant - even for simple and small values such as 0 and 1. Also
some very funny floating-point materialization could be observed too.

On AArch64 it would materialize the constant 0 in a register even the
architecture has an actual "zero" register.

On ARM it would generate unnecessary mov instructions or not use mvn.

This change simply changes the order and always asks the target first if it
likes to materialize the constant. This doesn't fix all the issues
mentioned above, but it enables the targets to implement such
optimizations.

Related to <rdar://problem/17420988>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215588 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 22:08:02 +00:00
Gerolf Hoflehner
2205044968 [MachineCombiner] Removal of dangling DBG_VALUES after combining [20598]
This is a cleaner solution to the problem described in r215431.
When instructions are combined a dangling DBG_VALUE is removed.
This resolves bug 20598.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215587 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 22:07:36 +00:00
Gerolf Hoflehner
4e917a2923 [Cleanup] Utility function to erase instruction and mark DBG_Values
New function to erase a machine instruction and mark DBG_VALUE
for removal. A DBG_VALUE is marked for removal when it references
an operand defined in the instruction.
Use the new function to cleanup code in dead machine instruction
removal pass.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215580 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 21:15:23 +00:00
Quentin Colombet
49128636b6 [MachineDominatorTree] Provide a method to inform a MachineDominatorTree that a
critical edge has been split. The MachineDominatorTree will when lazy update the
underlying dominance properties when require.

** Context **

This is a follow-up of r215410.
Each time a critical edge is split this invalidates the dominator tree
information. Thus, subsequent queries of that interface will be slow until the
underlying information is actually recomputed (costly).

** Problem **

Prior to this patch, splitting a critical edge needed to query the dominator
tree to update the dominator information.
Therefore, splitting a bunch of critical edges will likely produce poor
performance as each query to the dominator tree will use the slow query path.
This happens a lot in passes like MachineSink and PHIElimination.

** Proposed Solution **

Splitting a critical edge is a local modification of the CFG. Moreover, as soon
as a critical edge is split, it is not critical anymore and thus cannot be a
candidate for critical edge splitting anymore. In other words, the predecessor
and successor of a basic block inserted on a critical edge cannot be inserted by
critical edge splitting.

Using these observations, we can pile up the splitting of critical edge and
apply then at once before updating the DT information.

The core of this patch moves the update of the MachineDominatorTree information
from MachineBasicBlock::SplitCriticalEdge to a lazy MachineDominatorTree.

** Performance **

Thanks to this patch, the motivating example compiles in 4- minutes instead of
6+ minutes. No test case added as the motivating example as nothing special but
being huge!

The binaries are strictly identical for all the llvm test-suite + SPECs with and
without this patch for both Os and O3.

Regarding compile time, I observed only noise, although on average I saw a
small improvement.

<rdar://problem/17894619>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215576 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 21:00:07 +00:00
Benjamin Kramer
00e08fcaa0 Canonicalize header guards into a common format.
Add header guards to files that were missing guards. Remove #endif comments
as they don't seem common in LLVM (we can easily add them back if we decide
they're useful)

Changes made by clang-tidy with minor tweaks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215558 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 16:26:38 +00:00
Andrea Di Biagio
05a76eb9f2 [DAGCombiner] Improved target independent vector shuffle combine rule.
This patch improves the existing algorithm in DAGCombiner that
attempts to fold shuffles according to rule:
  shuffle(shuffle(x, y, M1), undef, M2) -> shuffle(y, undef, M3)

Before this change, there were cases where the DAGCombiner conservatively
avoided folding shuffles even if the resulting mask would have been legal.
That is because the algorithm wrongly assumed that commuting
an illegal shuffle mask would always produce an illegal mask.

With this change, we now correctly compute the commuted shuffle mask before
calling method 'isShuffleMaskLegal' on it.
On X86, this improves for example the codegen for the following function:

define <4 x i32> @test(<4 x i32> %A, <4 x i32> %B) {
  %1 = shufflevector <4 x i32> %B, <4 x i32> %A, <4 x i32> <i32 1, i32 2, i32 6, i32 7>
  %2 = shufflevector <4 x i32> %1, <4 x i32> undef, <4 x i32> <i32 2, i32 3, i32 2, i32 3>
  ret <4 x i32> %2
}

Before this change the X86 backend (-mcpu=corei7) generated
the following assembly code for function @test:
  shufps $-23, %xmm0, %xmm1  # xmm1 = xmm1[1,2],xmm0[2,3]
  movhlps %xmm1, %xmm1       # xmm1 = xmm1[1,1]
  movaps %xmm1, %xmm0

Now we produce:
  movhlps %xmm0, %xmm0       # xmm0 = xmm0[1,1]

Added extra test cases in combine-vec-shuffle-2.ll to verify that we correctly
fold according to the above-mentioned rule.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215555 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 16:09:40 +00:00
Hal Finkel
e693d3c558 [PowerPC] Implement PPCTargetLowering::getTgtMemIntrinsic
This implements PPCTargetLowering::getTgtMemIntrinsic for Altivec load/store
intrinsics. As with the construction of the MachineMemOperands for the
intrinsic calls used for unaligned load/store lowering, the only slight
complication is that we need to represent a larger memory range than the
loaded/stored value-type size (because the address is rounded down to an
aligned address, and we need to conservatively represent the entire possible
range of the actual access). This required adding an extra size field to
TargetLowering::IntrinsicInfo, and this was done in a way that required no
modifications to other targets (the size defaults to the store size of the
provided memory data type).

This fixes test/CodeGen/PowerPC/unal-altivec-wint.ll (so it can be un-XFAILed).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215512 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-13 01:15:40 +00:00
Adrian Prantl
ee8b879822 Remove a condition that can never be true, as wittnessed by the assert
above.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215477 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 21:55:58 +00:00
Quentin Colombet
693defb6ea Fix a parentheses warning introduced in r215394.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215459 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 17:11:26 +00:00
Eric Christopher
99cd10fe11 Have MachineRegisterInfo take and store the MachineFunction it
was created for rather than the TargetMachine since we only
needed the TM for the subtarget and we can get that from the
MF.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215432 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 08:00:56 +00:00
Adrian Prantl
a6bd9bd30e DebugLocEntry: Restore the comparison predicate from before the
refactoring in 215384. This way it can unique multiple entries describing
the same piece even if they don't have the exact same location.
(The same piece may get merged in and be added from OpenRanges).
There ought to be a more elegant solution for this, though.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215418 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 01:07:53 +00:00
David Blaikie
524cd2eb7b Revert "Partially revert r214761 that asserted that all concrete debug info variables had DIEs, due to a failure on Darwin."
I believe this was addressed by r215157 and r215227, so let's have
another go at the bots, etc.

This reverts commit r214880.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215412 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-12 00:00:31 +00:00