Commit Graph

17794 Commits

Author SHA1 Message Date
Matthias Braun
095ca8f493 RegisterCoalescer: Turn some impossible conditions into asserts
This is a fixed version of reverted r225500. It fixes the too early
if() continue; of the last patch and adds a comment to the unorthodox
loop.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225652 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-12 19:10:17 +00:00
Ahmed Bougacha
2cec3e9c11 [SimplifyLibCalls] Factor out fortified libcall handling.
This lets us remove CGP duplicate.

Differential Revision: http://reviews.llvm.org/D6541


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225640 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-12 17:22:43 +00:00
Joerg Sonnenberger
3073d3960a Revert r225500, it leads to infinite loops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225590 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-10 21:49:36 +00:00
Lang Hames
4c553e0367 Recommit r224935 with a fix for the ObjC++/AArch64 bug that that revision
introduced.

A test case for the bug was already committed in r225385.

Patch by Rafael Espindola.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225534 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-09 18:55:42 +00:00
Matthias Braun
c41acffe22 RegisterCoalescer: Fix removeCopyByCommutingDef with subreg liveness
The code that eliminated additional coalescable copies in
removeCopyByCommutingDef() used MergeValueNumberInto() which internally
may merge A into B or B into A. In this case A and B had different Def
points, so we have to reset ValNo.Def to the intended one after merging.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225503 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-09 03:01:31 +00:00
Matthias Braun
4493e98154 RegisterCoalescer: Some cleanup in removeCopyByCommutingDef(), NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225502 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-09 03:01:28 +00:00
Matthias Braun
fdddd47cc6 RegisterCoalescer: No need to set kill flags, they are recompute later anyway
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225501 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-09 03:01:26 +00:00
Matthias Braun
c25a1509c3 RegisterCoalescer: Turn some impossible conditions into asserts
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225500 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-09 03:01:23 +00:00
Hal Finkel
8e1d151abe [DAGCombine] Remainder of fix to r225380 (More FMA folding opportunities)
As pointed out by Aditya (and Owen), when we elide an FP extend to form an FMA,
we need to extend the incoming operands so that the resulting node will really
be legal. This is currently enabled only for PowerPC, and it happens to work
there regardless, but this should fix the functionality for everyone else
should anyone else wish to use it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225492 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-09 01:29:29 +00:00
Hal Finkel
40ddb2ce8f Partial fix to r225380 (More FMA folding opportunities)
As pointed out by Aditya (and Owen), there are two things wrong with this code.
First, it adds patterns which elide FP extends when forming FMAs, and that might
not be profitable on all targets (it belongs behind the pre-existing
aggressive-FMA-formation flag). This is fixed by this change.

Second, the resulting nodes might have operands of different types (the
extensions need to be re-added). That will be fixed in the follow-up commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225485 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-09 00:45:54 +00:00
Hal Finkel
9d1500e68f [MachineLICM] A command-line option to hoist even cheap instructions
Add a command-line option to enable hoisting even cheap instructions (in
low-register-pressure situations). This is turned off by default, but has
proved useful for testing purposes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225470 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 22:10:48 +00:00
Duncan P. N. Exon Smith
089a4ba180 CodeGen: Use handy new-fangled post-increment, NFC
Drive-by cleanup; I noticed this when reviewing the patch that became
r225466.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225468 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 21:07:55 +00:00
Duncan P. N. Exon Smith
8826cb9526 CodeGen: Use range-based for loops, NFC
Patch by Ramkumar Ramachandra!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225466 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 20:44:33 +00:00
Elena Demikhovsky
6e8b53da17 Masked Load/Store - fixed a bug in type legalization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225441 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 12:29:19 +00:00
Michael Kuperstein
757d230f7d Fix include ordering, NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225439 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 11:59:43 +00:00
Michael Kuperstein
1cea749780 Move SPAdj logic from PEI into the targets (NFC)
PEI tries to keep track of how much starting or ending a call sequence adjusts the stack pointer by, so that it can resolve frame-index references. Currently, it takes a very simplistic view of how SP adjustments are done - both FrameStartOpcode and FrameDestroyOpcode adjust it exactly by the amount written in its first argument.

This view is in fact incorrect for some targets (e.g. due to stack re-alignment, or because it may want to adjust the stack pointer in multiple steps). However, that doesn't cause breakage, because most targets (the only in-tree exception appears to be 32-bit ARM) rely on being able to simplify the call frame pseudo-instructions earlier, so this code is never hit. 

Moving the computation into TargetInstrInfo allows targets to override the way the adjustment is computed if they need to have a non-zero SPAdj.

Differential Revision: http://reviews.llvm.org/D6863

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225437 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 11:04:38 +00:00
Quentin Colombet
9d60e0ff0a [RegAllocGreedy] Introduce a late pass to repair broken hints.
A broken hint is a copy where both ends are assigned different colors. When a
variable gets evicted in the neighborhood of such copies, it is likely we can
reconcile some of them.


** Context **

Copies are inserted during the register allocation via splitting. These split
points are required to relax the constraints on the allocation problem. When
such a point is inserted, both ends of the copy would not share the same color
with respect to the current allocation problem. When variables get evicted,
the allocation problem becomes different and some split point may not be
required anymore. However, the related variables may already have been colored.

This usually shows up in the assembly with pattern like this:
def A
...
save A to B
def A
use A
restore A from B
...
use B

Whereas we could simply have done:
def B
...
def A
use A
...
use B


** Proposed Solution **

A variable having a broken hint is marked for late recoloring if and only if
selecting a register for it evict another variable. Indeed, if no eviction
happens this is pointless to look for recoloring opportunities as it means the
situation was the same as the initial allocation problem where we had to break
the hint.

Finally, when everything has been allocated, we look for recoloring
opportunities for all the identified candidates.
The recoloring is performed very late to rely on accurate copy cost (all
involved variables are allocated).
The recoloring is simple unlike the last change recoloring. It propagates the
color of the broken hint to all its copy-related variables. If the color is
available for them, the recoloring uses it, otherwise it gives up on that hint
even if a more complex coloring would have worked.

The recoloring happens only if it is profitable. The profitability is evaluated
using the expected frequency of the copies of the currently recolored variable
with a) its current color and b) with the target color. If a) is greater or
equal than b), then it is profitable and the recoloring happen.


** Example **

Consider the following example:
BB1:
  a =
  b =
BB2:
  ...
   = b
   = a
Let us assume b gets split:
BB1:
  a =
  b =
BB2:
  c = b
  ...
  d = c
  = d
  = a
Because of how the allocation work, b, c, and d may be assigned different
colors. Now, if a gets evicted to make room for c, assuming b and d were
assigned to something different than a.
We end up with:
BB1:
  a =
  st a, SpillSlot
  b =
BB2:
  c = b
  ...
  d = c
  = d
  e = ld SpillSlot
  = e
This is likely that we can assign the same register for b, c, and d,
getting rid of 2 copies.


** Performances **

Both ARM64 and x86_64 show performance improvements of up to 3% for the
llvm-testsuite + externals with Os and O3. There are a few regressions too that
comes from the (in)accuracy of the block frequency estimate.

<rdar://problem/18312047>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225422 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 01:16:39 +00:00
Ahmed Bougacha
7fac1d945f [SelectionDAG] Allow targets to specify legality of extloads' result
type (in addition to the memory type).

The *LoadExt* legalization handling used to only have one type, the
memory type.  This forced users to assume that as long as the extload
for the memory type was declared legal, and the result type was legal,
the whole extload was legal.

However, this isn't always the case.  For instance, on X86, with AVX,
this is legal:
    v4i32 load, zext from v4i8
but this isn't:
    v4i64 load, zext from v4i8
Whereas v4i64 is (arguably) legal, even without AVX2.

Note that the same thing was done a while ago for truncstores (r46140),
but I assume no one needed it yet for extloads, so here we go.

Calls to getLoadExtAction were changed to add the value type, found
manually in the surrounding code.

Calls to setLoadExtAction were mechanically changed, by wrapping the
call in a loop, to match previous behavior.  The loop iterates over
the MVT subrange corresponding to the memory type (FP vectors, etc...).
I also pulled neighboring setTruncStoreActions into some of the loops;
those shouldn't make a difference, as the additional types are illegal.
(e.g., i128->i1 truncstores on PPC.)

No functional change intended.

Differential Revision: http://reviews.llvm.org/D6532


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225421 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 00:51:32 +00:00
Matthias Braun
9f6a38fc70 RegisterCoalescer: Do not remove IMPLICIT_DEFS if they are required for subranges.
The register coalescer used to remove implicit_defs when they are
covered by the main range anyway. With subreg liveness tracking we can't
do that anymore in places where the IMPLICIT_DEF is required as begin of
a subregister liverange.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225416 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 00:21:23 +00:00
Matthias Braun
a065cf13cd RegisterCoalescer: Fix valuesIdentical() in some subrange merge cases.
I got confused and assumed SrcIdx/DstIdx of the CoalescerPair is a
subregister index in SrcReg/DstReg, but they are actually subregister
indices of the coalesced register that get you back to SrcReg/DstReg
when applied.

Fixed the bug, improved comments and simplified code accordingly.

Testcase by Tom Stellard!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225415 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 23:58:38 +00:00
Matthias Braun
b6eccdd6b0 LiveInterval: Implement feedback by Quentin Colombet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225413 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 23:35:11 +00:00
Adrian Prantl
31208c1591 Update a comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225399 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 21:35:13 +00:00
Ahmed Bougacha
8065738154 [CodeGen] Use MVT iterator_ranges in legality loops. NFC intended.
A few loops do trickier things than just iterating on an MVT subset,
so I'll leave them be for now.
Follow-up of r225387.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225392 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 21:27:10 +00:00
Olivier Sallenave
033a537a84 More FMA folding opportunities.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225380 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 20:54:17 +00:00
Adrian Prantl
7950596b55 Debug info: Allow aggregate types to be described by constants.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225378 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 20:48:58 +00:00
Olivier Sallenave
bc2572cac5 Test commit
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225368 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 19:45:17 +00:00
Philip Reames
22e5a23311 Add a missing file from 225365
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225366 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 19:13:28 +00:00
Philip Reames
28fa9e1e9f Introduce an example statepoint GC strategy
This change includes the most basic possible GCStrategy for a GC which is using the statepoint lowering code. At the moment, this GCStrategy doesn't really do much - aside from actually generate correct stackmaps that is - but I went ahead and added a few extra correctness checks as proof of concept. It's mostly here to provide documentation on how to do one, and to provide a point for various optimization legality hooks I'd like to add going forward. (For context, see the TODOs in InstCombine around gc.relocate.)

Most of the validation logic added here as proof of concept will soon move in to the Verifier.  That move is dependent on http://reviews.llvm.org/D6811

There was discussion in the review thread about addrspace(1) being reserved for something.  I'm going to follow up on a seperate llvmdev thread.  If needed, I'll update all the code at once.

Note that I am deliberately not making a GCStrategy required to use gc.statepoints with this change. I want to give folks out of tree - including myself - a chance to migrate. In a week or two, I'll make having a GCStrategy be required for gc.statepoints. To this end, I added the gc tag to one of the test cases but not others.

Differential Revision: http://reviews.llvm.org/D6808



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225365 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 19:07:50 +00:00
Jonas Paulsson
18a7c886c0 New method SDep::isNormalMemoryOrBarrier() in ScheduleDAGInstrs.cpp.
Used to iterate over previously added memory dependencies in
adjustChainDeps() and iterateChainSucc().

SDep::isCtrl() was previously used in these places, that also gave
anti and output edges. The code may be worse if these are followed,
because MisNeedChainEdge() will conservatively return true since a
non-memory instruction has no memory operands, and a false chain dep
will be added. It is also unnecessary since all memory accesses of
interest will be reached by memory dependencies, and there is a budget
limit for the number of edges traversed.

This problem was found on an out-of-tree target with enabled alias
analysis. No test case for an in-tree target has been found.

Reviewed by Hal Finkel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225351 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 13:38:29 +00:00
Jonas Paulsson
650c3a453d Fix typos in comment and option help texts.
For -enable-aa-sched-mi and -use-tbaa-in-sched-mi.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225350 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 13:20:57 +00:00
Lang Hames
84acf09f32 Revert r224935 "Refactor duplicated code. No intended functionality change."
This is affecting the behavior of some ObjC++ / AArch64 test cases on Darwin.
Reverting to get the bots green while I track down the source of the changed
behavior.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225311 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 23:04:36 +00:00
Mehdi Amini
35300ff218 SelectionDAGBuilder: move constant initialization out of loop
No semantic change intended.

Reviewers: resistor

Differential Revision: http://reviews.llvm.org/D6834



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225278 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 18:20:04 +00:00
Andrea Di Biagio
e46783d5b7 [CodeGenPrepare] Improved logic to speculate calls to cttz/ctlz.
This patch improves the logic added at revision 224899 (see review D6728) that
teaches the backend when it is profitable to speculate calls to cttz/ctlz.

The original algorithm conservatively avoided speculating more than one
instruction from a basic block in a control flow grap modelling an if-statement.
In particular, the only allowed instruction (excluding the terminator) was a
call to cttz/ctlz. However, there are cases where we could be less conservative
and still be able to speculate a call to cttz/ctlz.

With this patch, CodeGenPrepare now tries to speculate a cttz/ctlz if the
result is zero extended/truncated in the same basic block, and the zext/trunc
instruction is "free" for the target.

Added new test cases to CodeGen/X86/cttz-ctlz.ll

Differential Revision: http://reviews.llvm.org/D6853


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225274 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 17:41:18 +00:00
Frederic Riss
0c36e97b78 Make DIE.h a public CodeGen header.
dsymutil would like to use all the AsmPrinter/MCStreamer infrastructure
to stream out the DWARF. In order to do so, it will reuse the DIE object
and so this header needs to be public.

The interface exposed here has some corners that cannot be used without a
DwarfDebug object, but clients that want to stream Dwarf can just avoid
these.

Differential Revision: http://reviews.llvm.org/D6695

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225208 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 21:29:41 +00:00
Craig Topper
9bf73516cb Replace several 'assert(false' with 'llvm_unreachable' or fold a condition into the assert.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225160 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 10:15:49 +00:00
Hal Finkel
e05b232c20 [PowerPC/BlockPlacement] Allow target to provide a per-loop alignment preference
The existing code provided for specifying a global loop alignment preference.
However, the preferred loop alignment might depend on the loop itself. For
recent POWER cores, loops between 5 and 8 instructions should have 32-byte
alignment (while the others are better with 16-byte alignment) so that the
entire loop will fit in one i-cache line.

To support this, getPrefLoopAlignment has been made virtual, and can be
provided with an optional MachineLoop* so the target can inspect the loop
before answering the query. The default behavior, as before, is to return the
value set with setPrefLoopAlignment. MachineBlockPlacement now queries the
target for each loop instead of only once per function. There should be no
functional change for other targets.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225117 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-03 17:58:24 +00:00
Alexey Samsonov
c0319dd9c2 Revert "merge consecutive stores of extracted vector elements"
This reverts commit r224611. This change causes crashes
in X86 DAG->DAG Instruction Selection.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225031 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-31 00:40:28 +00:00
David Blaikie
1d68fc5021 DebugInfo: Omit is_stmt from line table entries on the same line.
GCC does this for non-zero discriminators and since GCC doesn't produce
column info, that was the only place it comes up there. For LLVM, since
we can emit discriminators and/or column info, it makes more sense to
invert the condition and just test for changes in line number.

This should resolve at least some of the GDB 7.5 test suite failures
created by recent Clang changes that increase the location fidelity
(which, since Clang defaults to including column info on Linux by
default created a bunch of cases that confused GDB).

In theory we could do this better/differently by grouping actual source
statements together in a similar manner to the way lexical scopes are
handled but given that GDB isn't really in a position to consume that (&
users are probably somewhat used to different lines being different
'statements') this seems the safest and cheapest change. (I'm concerned
that doing this 'right' would bloat the debugloc data even further -
something Duncan's working hard to address)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225011 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 22:47:13 +00:00
Peter Collingbourne
d8ae3e1fee x86_64: Fix calls to __morestack under the large code model.
Under the large code model, we cannot assume that __morestack lives within
2^31 bytes of the call site, so we cannot use pc-relative addressing. We
cannot perform the call via a temporary register, as the rax register may
be used to store the static chain, and all other suitable registers may be
either callee-save or used for parameter passing. We cannot use the stack
at this point either because __morestack manipulates the stack directly.

To avoid these issues, perform an indirect call via a read-only memory
location containing the address.

This solution is not perfect, as it assumes that the .rodata section
is laid out within 2^31 bytes of each function body, but this seems to
be sufficient for JIT.

Differential Revision: http://reviews.llvm.org/D6787

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225003 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 20:05:19 +00:00
Michael Kuperstein
08c26613e1 [COFF] Don't try to add quotes to already quoted linker directives
If a linker directive is already quoted, don't try to quote it again, otherwise it creates a mess.
This pops up in places like:
#pragma comment(linker,"\"/foo bar'\"")

Differential Revision: http://reviews.llvm.org/D6792

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224998 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 19:23:48 +00:00
Rafael Espindola
2a1c1c9dea Refactor duplicated code.
No intended functionality change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224935 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-29 15:18:31 +00:00
Andrea Di Biagio
70a7cda495 [CodeGenPrepare] Teach when it is profitable to speculate calls to @llvm.cttz/ctlz.
If the control flow is modelling an if-statement where the only instruction in
the 'then' basic block (excluding the terminator) is a call to cttz/ctlz,
CodeGenPrepare can try to speculate the cttz/ctlz call and simplify the control
flow graph.

Example:
\code
entry:
  %cmp = icmp eq i64 %val, 0
  br i1 %cmp, label %end.bb, label %then.bb

then.bb:
  %c = tail call i64 @llvm.cttz.i64(i64 %val, i1 true)
  br label %end.bb

end.bb:
  %cond = phi i64 [ %c, %then.bb ], [ 64, %entry]
\code

In this example, basic block %then.bb is taken if value %val is not zero.
Also, the phi node in %end.bb would propagate the size-of in bits of %val
only if %val is equal to zero.

With this patch, CodeGenPrepare will try to hoist the call to cttz from %then.bb
into basic block %entry only if cttz is cheap to speculate for the target.

Added two new hooks in TargetLowering.h to let targets customize the behavior
(i.e. decide whether it is cheap or not to speculate calls to cttz/ctlz). The
two new methods are 'isCheapToSpeculateCtlz' and 'isCheapToSpeculateCttz'.
By default, both methods return 'false'.
On X86, method 'isCheapToSpeculateCtlz' returns true only if the target has
LZCNT. Method 'isCheapToSpeculateCttz' only returns true if the target has BMI.

Differential Revision: http://reviews.llvm.org/D6728


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224899 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-28 11:07:35 +00:00
Elena Demikhovsky
8499a501e4 Scalarizer for masked load and store intrinsics.
Masked vector intrinsics are a part of common LLVM IR, but they are really supported on AVX2 and AVX-512 targets. I added a code that translates masked intrinsic for all other targets. The masked vector intrinsic is converted to a chain of scalar operations inside conditional basic blocks.

http://reviews.llvm.org/D6436



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224897 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-28 08:54:45 +00:00
Timur Iskhodzhanov
f4076dc995 Band-aid fix for PR22032: don't emit DWARF debug info if AddressSanitizer is enabled on Windows
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224860 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-26 17:00:51 +00:00
David Majnemer
b582949103 Silence GCC's -Wparentheses warning
No functionality change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224833 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-25 10:03:23 +00:00
Elena Demikhovsky
b31322328a Masked Load/Store - Changed the order of parameters in intrinsics.
No functional changes.
The documentation is coming.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224829 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-25 07:49:20 +00:00
David Majnemer
e277a13a71 CodeGen: Error on redefinitions instead of asserting
It's possible to have a prior definition of a symbol in module asm.
Raise an error instead of crashing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224828 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-24 23:06:55 +00:00
David Majnemer
d36cad9914 CodeGen: Allow aliases to be overridden by variables
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224827 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-24 22:44:29 +00:00
David Majnemer
e54eacce75 MC: Label definitions are permitted after .set directives
.set directives may be overridden by other .set directives as well as
label definitions.

This fixes PR22019.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224811 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-24 10:27:50 +00:00
Matthias Braun
13a193db05 LiveInterval: Remove accidentally committed debug code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224807 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-24 02:35:07 +00:00
Matthias Braun
8882414a11 LiveInterval: Introduce createMainRangeFromSubranges().
This function constructs the main liverange by merging all subranges if
subregister liveness tracking is available. This should be slightly
faster to compute instead of performing the liveness calculation again
for the main range. More importantly it avoids cases where the main
liverange would cover positions where no subrange was live. These cases
happened for partial definitions where the actual defined part was dead
and only the undefined parts used later.

The register coalescing requires that every part covered by the main
live range has at least one subrange live.

I also expect this function to become usefull later for places where the
subranges are modified in a way that it is hard to correctly fix the
main liverange in the machine scheduler, we can simply reconstruct it
from subranges then.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224806 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-24 02:11:51 +00:00
Matthias Braun
02add3f1a6 RegisterCoalescer: With subrange liveness there may be no RedefVNI for unused lanes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224805 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-24 02:11:48 +00:00
Matthias Braun
a2fd5b5fd0 LiveRangeEdit: Check for completely empy subranges after removing ValNos.
Completely empty subranges are not allowed and must be removed when
subreg liveness is enabled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224804 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-24 02:11:46 +00:00
Matthias Braun
94daeceeac LiveIntervalAnalysis: Fix performance bug that I introduced in r224663.
Without a reference the code did not remember when moving the iterators
of the subranges/registerunit ranges forward and instead would scan from
the beginning again at the next position.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224803 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-24 02:11:43 +00:00
Adrian Prantl
34f81e8bec Debug Info: In symmetry to DW_TAG_pointer_type, do not emit the byte size
of a DW_TAG_ptr_to_member_type.
This restores the behavior from before r224780-r224781.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224799 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-24 01:17:51 +00:00
Mehdi Amini
8548c2453f Always assert in DAGCombine and not only when -debug is enabled
Right now in DAG Combine check the validity of the returned type 
only when -debug is given on the command line. However usually 
the test cases in the validation does not use -debug. 
An Assert build should always check this.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224779 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-23 18:59:02 +00:00
Michael Kuperstein
1f0ddef593 [DagCombine] Improve DAGCombiner BUILD_VECTOR when it has two sources of elements
This partially fixes PR21943.

For AVX, we go from:

vmovq   (%rsi), %xmm0
vmovq   (%rdi), %xmm1
vpermilps       $-27, %xmm1, %xmm2 ## xmm2 = xmm1[1,1,2,3]
vinsertps       $16, %xmm2, %xmm1, %xmm1 ## xmm1 = xmm1[0],xmm2[0],xmm1[2,3]
vinsertps       $32, %xmm0, %xmm1, %xmm1 ## xmm1 = xmm1[0,1],xmm0[0],xmm1[3]
vpermilps       $-27, %xmm0, %xmm0 ## xmm0 = xmm0[1,1,2,3]
vinsertps       $48, %xmm0, %xmm1, %xmm0 ## xmm0 = xmm1[0,1,2],xmm0[0]

To the expected:

vmovq   (%rdi), %xmm0
vmovhpd (%rsi), %xmm0, %xmm0
retq

Fixing this for AVX2 is still open.

Differential Revision: http://reviews.llvm.org/D6749

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224759 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-23 08:59:45 +00:00
Reid Kleckner
34b7fde802 Make musttail more robust for vector types on x86
Previously I tried to plug musttail into the existing vararg lowering
code. That turned out to be a mistake, because non-vararg calls use
significantly different register lowering, even on x86. For example, AVX
vectors are usually passed in registers to normal functions and memory
to vararg functions.  Now musttail uses a completely separate lowering.

Hopefully this can be used as the basis for non-x86 perfect forwarding.

Reviewers: majnemer

Differential Revision: http://reviews.llvm.org/D6156

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224745 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-22 23:58:37 +00:00
Quentin Colombet
7b88565334 [CodeGenPrepare] Handle properly the promotion of operands when this does not
generate instructions.

Fixes PR21978.
Related to <rdar://problem/18310086>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224717 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-22 18:11:52 +00:00
Rafael Espindola
ada5f24b5f The leak detector is dead, long live asan and valgrind.
In resent times asan and valgrind have found way more memory management bugs
in llvm than the special purpose leak detector.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224703 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-22 13:00:36 +00:00
Saleem Abdulrasool
11dd9c3d55 CodeGen: minor style tweaks to SSP
Clean up some style related things in the StackProtector CodeGen.  NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224693 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-21 21:52:38 +00:00
Matt Arsenault
d796cf2e01 Enable (sext x) == C --> x == (trunc C) combine
Extend the existing code which handles this for zext. This makes this
more useful for targets with ZeroOrNegativeOne BooleanContent and
obsoletes a custom combine SI uses for i1 setcc (sext(i1), 0, setne)
since the constant will now be shrunk to i1.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224691 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-21 16:48:42 +00:00
Saleem Abdulrasool
281f568720 CodeGen: constify and use range loop for SSP
Use range-based for loop and constify the iterators.  NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224683 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-20 21:37:51 +00:00
Matthias Braun
4acc514cc2 LiveIntervalAnalysis: No kill flags for partially undefined uses.
We must not add kill flags when reading a vreg with some undefined
subregisters, if subreg liveness tracking is enabled.  This is because
the register allocator may reuse these undefined subregisters for other
values which are not killed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224664 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-20 01:54:50 +00:00
Matthias Braun
1f6bcf1b85 LiveIntervalAnalysis: cleanup addKills(), NFC
- Use more const modifiers
- Use references for things that can't be nullptr
- Improve some variable names

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224663 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-20 01:54:48 +00:00
Reid Kleckner
b60b7360f5 EH: Sink computation of local PadMap variable into function that uses it
No functionality change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224635 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-19 22:30:08 +00:00
Reid Kleckner
0f85d54670 Add the ExceptionHandling::MSVC enumeration
It is intended to be used for a family of personality functions that
have similar IR preparation requirements. Typically when interoperating
with MSVC personality functions, bits of functionality need to be
outlined from the main function into helper functions. There is also
usually more than one landing pad per invoke, which does not match the
LLVM IR landingpad representation.

None of this is implemented yet. This change just adds a new enum that
is active for *-windows-msvc and delegates to the EH removal preparation
pass.  No functionality change for other targets.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224625 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-19 22:19:48 +00:00
Sanjay Patel
3c3cd10928 merge consecutive stores of extracted vector elements
Add a path to DAGCombiner::MergeConsecutiveStores() 
to combine multiple scalar stores when the store operands
are extracted vector elements. This is a partial fix for
PR21711 ( http://llvm.org/bugs/show_bug.cgi?id=21711 ).

For the new test case, codegen improves from:

   vmovss  %xmm0, (%rdi)
   vextractps      $1, %xmm0, 4(%rdi)
   vextractps      $2, %xmm0, 8(%rdi)
   vextractps      $3, %xmm0, 12(%rdi)
   vextractf128    $1, %ymm0, %xmm0
   vmovss  %xmm0, 16(%rdi)
   vextractps      $1, %xmm0, 20(%rdi)
   vextractps      $2, %xmm0, 24(%rdi)
   vextractps      $3, %xmm0, 28(%rdi)
   vzeroupper
   retq

To:

   vmovups	%ymm0, (%rdi)
   vzeroupper
   retq

Patch reviewed by Nadav Rotem.

Differential Revision: http://reviews.llvm.org/D6698



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224611 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-19 20:23:41 +00:00
Matthias Braun
94dfce45bf RegisterCoalescer: rewrite eliminateUndefCopy().
This also fixes problems with undef copies of subregisters. I can't
attach a testcase for that as none of the targets in trunk has
subregister liveness tracking enabled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224560 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-19 01:39:46 +00:00
Adrian Prantl
7a6f0084e2 Explain why LLVM is emitting a DW_AT_containing_type inside of a class.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224555 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-19 00:01:20 +00:00
Matthias Braun
b82636bb80 LiveIntervalAnalysis: Cleanup computeDeadValues
- This also fixes a bug introduced in r223880 where values were not
  correctly marked as Dead anymore.
- Cleanup computeDeadValues(): split up SubRange code variant, simplify
  arguments.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224538 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-18 19:58:52 +00:00
Eric Christopher
c559ba7251 Add a new string member to the TargetOptions struct for the name
of the abi we should be using. For targets that don't use the
option there's no change, otherwise this allows external users
to set the ABI via string and avoid some of the -backend-option
pain in clang.

Use this option to move the ABI for the ARM port from the
Subtarget to the TargetMachine and update the testcases
accordingly since it's no longer valid to set via -mattr.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224492 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-18 02:20:58 +00:00
Matthias Braun
a1a68905b5 RegisterCoalescer: Fix stripCopies() picking up main range instead of subregister range
This fixes a problem where stripCopies() would switch to values in the
main liverange when it crossed a copy instruction. However when joining
subranges we need to stay in the respective subregister ranges.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224461 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-17 21:25:20 +00:00
Matthias Braun
cd56c19f3c ExecutionDepsFix: Correctly handle wide registers.
The ExecutionDepsFix previously mapped each register to 1 or zero
registers of the register class it was called with and therefore
simulating liveness for.  This was problematic for cases involving wider
registers like Q0 on ARM where ExecutionDepsFix gets invoked for the Dxx
registers. In these cases the wide register would get mapped to the last
matching D register, while it should have been all matching D registers.
This commit changes the AliasMap to use a SmallVector to map registers
to potentially multiple destination regclass registers. This is required
to avoid regressions with subregister liveness tracking enabled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224447 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-17 19:13:47 +00:00
Michael Kuperstein
fd350586f5 [DAGCombine] Slightly improve lowering of BUILD_VECTOR into a shuffle.
This handles the case of a BUILD_VECTOR being constructed out of elements extracted from a vector twice the size of the result vector. Previously this was always scalarized. Now, we try to construct a shuffle node that feeds on extract_subvectors.

This fixes PR15872 and provides a partial fix for PR21711.

Differential Revision: http://reviews.llvm.org/D6678

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224429 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-17 12:32:17 +00:00
Toma Tabacu
3fea427a63 [mips] Set GCC-compatible MIPS asssembler options before inline asm blocks.
Summary:
When generating MIPS assembly, LLVM always overrides the default assembler options by emitting the '.set noreorder', '.set nomacro' and '.set noat' directives,
while GCC uses the default options if an assembly-level function contains inline assembly code.

This becomes a problem when the code generated by LLVM is interleaved with inline assembly which assumes GCC-like assembler options (from Linux, for example).

This patch fixes these conflicts by setting the appropriate assembler options at the beginning of an inline asm block and popping them at the end.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6637

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224425 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-17 10:56:16 +00:00
Matthias Braun
b0de67232f RegisterCoalescer: Sprinkle some const modifiers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224409 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-17 02:18:13 +00:00
Quentin Colombet
1e2604dccc [CodeGenPrepare] Reapply r224351 with a fix for the assertion failure:
The type promotion helper does not support vector type, so when make
such it does not kick in in such cases.

Original commit message:
[CodeGenPrepare] Move sign/zero extensions near loads using type promotion.

This patch extends the optimization in CodeGenPrepare that moves a sign/zero
extension near a load when the target can combine them. The optimization may
promote any operations between the extension and the load to make that possible.

Although this optimization may be beneficial for all targets, in particular
AArch64, this is enabled for X86 only as I have not benchmarked it for other
targets yet.


** Context **

Most targets feature extended loads, i.e., loads that perform a zero or sign
extension for free. In that context it is interesting to expose such pattern in
CodeGenPrepare so that the instruction selection pass can form such loads.
Sometimes, this pattern is blocked because of instructions between the load and
the extension. When those instructions are promotable to the extended type, we
can expose this pattern.


** Motivating Example **

Let us consider an example:
define void @foo(i8* %addr1, i32* %addr2, i8 %a, i32 %b) {
  %ld = load i8* %addr1
  %zextld = zext i8 %ld to i32
  %ld2 = load i32* %addr2
  %add = add nsw i32 %ld2, %zextld
  %sextadd = sext i32 %add to i64
  %zexta = zext i8 %a to i32
  %addza = add nsw i32 %zexta, %zextld
  %sextaddza = sext i32 %addza to i64
  %addb = add nsw i32 %b, %zextld
  %sextaddb = sext i32 %addb to i64
  call void @dummy(i64 %sextadd, i64 %sextaddza, i64 %sextaddb)
  ret void
}

As it is, this IR generates the following assembly on x86_64:
[...]
  movzbl  (%rdi), %eax   # zero-extended load
  movl  (%rsi), %es      # plain load
  addl  %eax, %esi       # 32-bit add
  movslq  %esi, %rdi     # sign extend the result of add
  movzbl  %dl, %edx      # zero extend the first argument
  addl  %eax, %edx       # 32-bit add
  movslq  %edx, %rsi     # sign extend the result of add
  addl  %eax, %ecx       # 32-bit add
  movslq  %ecx, %rdx     # sign extend the result of add
[...]
The throughput of this sequence is 7.45 cycles on Ivy Bridge according to IACA.

Now, by promoting the additions to form more extended loads we would generate:
[...]
  movzbl  (%rdi), %eax   # zero-extended load
  movslq  (%rsi), %rdi   # sign-extended load
  addq  %rax, %rdi       # 64-bit add
  movzbl  %dl, %esi      # zero extend the first argument
  addq  %rax, %rsi       # 64-bit add
  movslq  %ecx, %rdx     # sign extend the second argument
  addq  %rax, %rdx       # 64-bit add
[...]
The throughput of this sequence is 6.15 cycles on Ivy Bridge according to IACA.

This kind of sequences happen a lot on code using 32-bit indexes on 64-bit
architectures.

Note: The throughput numbers are similar on Sandy Bridge and Haswell.


** Proposed Solution **

To avoid the penalty of all these sign/zero extensions, we merge them in the
loads at the beginning of the chain of computation by promoting all the chain of
computation on the extended type. The promotion is done if and only if we do not
introduce new extensions, i.e., if we do not degrade the code quality.
To achieve this, we extend the existing “move ext to load” optimization with the
promotion mechanism introduced to match larger patterns for addressing mode
(r200947).
The idea of this extension is to perform the following transformation:
ext(promotableInst1(...(promotableInstN(load))))
=>
promotedInst1(...(promotedInstN(ext(load))))

The promotion mechanism in that optimization is enabled by a new TargetLowering
switch, which is off by default. In other words, by default, the optimization
performs the “move ext to load” optimization as it was before this patch.


** Performance **

Configuration: x86_64: Ivy Bridge fixed at 2900MHz running OS X 10.10.
Tested Optimization Levels: O3/Os
Tests: llvm-testsuite + externals.
Results:
- No regression beside noise.
- Improvements:
CINT2006/473.astar:  ~2%
Benchmarks/PAQ8p: ~2%
Misc/perlin: ~3%

The results are consistent for both O3 and Os.

<rdar://problem/18310086>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224402 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-17 01:36:17 +00:00
David Blaikie
7c3af9120c PR21875: codegen for non-type template parameters of nullptr_t type
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224399 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-17 00:43:22 +00:00
Reid Kleckner
0c7f4e46b6 Revert "[CodeGenPrepare] Move sign/zero extensions near loads using type promotion."
This reverts commit r224351. It causes assertion failures when building
ICU.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224397 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-17 00:29:23 +00:00
Hans Wennborg
74b5b195fd SelectionDAG switch lowering: use 'unsigned' to count destination popularity
SwitchInst::getNumCases() returns unsinged, so using uint64_t to count cases
seems unnecessary.

Also fix a missing CHECK in the test case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224393 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-16 23:41:59 +00:00
Sanjay Patel
128b2c383c merge consecutive loads that are offset from a base address
SelectionDAG::isConsecutiveLoad() was not detecting consecutive loads
when the first load was offset from a base address. 

This patch recognizes that pattern and subtracts the offset before comparing
the second load to see if it is consecutive.

The codegen change in the new test case improves from:

vmovsd	32(%rdi), %xmm0
vmovsd	48(%rdi), %xmm1 
vmovhpd	56(%rdi), %xmm1, %xmm1
vmovhpd	40(%rdi), %xmm0, %xmm0
vinsertf128	$1, %xmm1, %ymm0, %ymm0

To:

vmovups	32(%rdi), %ymm0

An existing test case is also improved from:

vmovsd	(%rdi), %xmm0
vmovsd	16(%rdi), %xmm1
vmovsd	24(%rdi), %xmm2
vunpcklpd	%xmm2, %xmm0, %xmm0 ## xmm0 = xmm0[0],xmm2[0]
vmovhpd	8(%rdi), %xmm1, %xmm3

To:

vmovsd	(%rdi), %xmm0
vmovsd	16(%rdi), %xmm1
vmovhpd	24(%rdi), %xmm0, %xmm0
vmovhpd	8(%rdi), %xmm1, %xmm1

This patch fixes PR21771 ( http://llvm.org/bugs/show_bug.cgi?id=21771 ).

Differential Revision: http://reviews.llvm.org/D6642



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224379 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-16 21:57:18 +00:00
Matt Arsenault
8b5b0f5ee5 Move lowerConstant to AsmPrinter
This was a static function before, and NVPTX duplicated it
because it wasn't exposed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224354 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-16 19:16:14 +00:00
Quentin Colombet
93b6e016b1 [CodeGenPrepare] Move sign/zero extensions near loads using type promotion.
This patch extends the optimization in CodeGenPrepare that moves a sign/zero
extension near a load when the target can combine them. The optimization may
promote any operations between the extension and the load to make that possible.

Although this optimization may be beneficial for all targets, in particular
AArch64, this is enabled for X86 only as I have not benchmarked it for other
targets yet.


** Context **

Most targets feature extended loads, i.e., loads that perform a zero or sign
extension for free. In that context it is interesting to expose such pattern in
CodeGenPrepare so that the instruction selection pass can form such loads.
Sometimes, this pattern is blocked because of instructions between the load and
the extension. When those instructions are promotable to the extended type, we
can expose this pattern.


** Motivating Example **

Let us consider an example:
define void @foo(i8* %addr1, i32* %addr2, i8 %a, i32 %b) {
  %ld = load i8* %addr1
  %zextld = zext i8 %ld to i32
  %ld2 = load i32* %addr2
  %add = add nsw i32 %ld2, %zextld
  %sextadd = sext i32 %add to i64
  %zexta = zext i8 %a to i32
  %addza = add nsw i32 %zexta, %zextld
  %sextaddza = sext i32 %addza to i64
  %addb = add nsw i32 %b, %zextld
  %sextaddb = sext i32 %addb to i64
  call void @dummy(i64 %sextadd, i64 %sextaddza, i64 %sextaddb)
  ret void
}

As it is, this IR generates the following assembly on x86_64:
[...]
  movzbl  (%rdi), %eax   # zero-extended load
  movl  (%rsi), %es      # plain load
  addl  %eax, %esi       # 32-bit add
  movslq  %esi, %rdi     # sign extend the result of add
  movzbl  %dl, %edx      # zero extend the first argument
  addl  %eax, %edx       # 32-bit add
  movslq  %edx, %rsi     # sign extend the result of add
  addl  %eax, %ecx       # 32-bit add
  movslq  %ecx, %rdx     # sign extend the result of add
[...]
The throughput of this sequence is 7.45 cycles on Ivy Bridge according to IACA.

Now, by promoting the additions to form more extended loads we would generate:
[...]
  movzbl  (%rdi), %eax   # zero-extended load
  movslq  (%rsi), %rdi   # sign-extended load
  addq  %rax, %rdi       # 64-bit add
  movzbl  %dl, %esi      # zero extend the first argument
  addq  %rax, %rsi       # 64-bit add
  movslq  %ecx, %rdx     # sign extend the second argument
  addq  %rax, %rdx       # 64-bit add
[...]
The throughput of this sequence is 6.15 cycles on Ivy Bridge according to IACA.

This kind of sequences happen a lot on code using 32-bit indexes on 64-bit
architectures.

Note: The throughput numbers are similar on Sandy Bridge and Haswell.


** Proposed Solution **

To avoid the penalty of all these sign/zero extensions, we merge them in the
loads at the beginning of the chain of computation by promoting all the chain of
computation on the extended type. The promotion is done if and only if we do not
introduce new extensions, i.e., if we do not degrade the code quality.
To achieve this, we extend the existing “move ext to load” optimization with the
promotion mechanism introduced to match larger patterns for addressing mode
(r200947).
The idea of this extension is to perform the following transformation:
ext(promotableInst1(...(promotableInstN(load))))
=>
promotedInst1(...(promotedInstN(ext(load))))

The promotion mechanism in that optimization is enabled by a new TargetLowering
switch, which is off by default. In other words, by default, the optimization
performs the “move ext to load” optimization as it was before this patch.


** Performance **

Configuration: x86_64: Ivy Bridge fixed at 2900MHz running OS X 10.10.
Tested Optimization Levels: O3/Os
Tests: llvm-testsuite + externals.
Results:
- No regression beside noise.
- Improvements:
CINT2006/473.astar:  ~2%
Benchmarks/PAQ8p: ~2%
Misc/perlin: ~3%

The results are consistent for both O3 and Os.

<rdar://problem/18310086>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224351 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-16 19:09:03 +00:00
Aaron Ballman
51c2bdca72 Fixing -Wsign-compare warnings; NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224337 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-16 14:04:11 +00:00
Matthias Braun
68445710a2 LiveRangeCalc: Rewrite subrange calculation
This changes subrange calculation to calculate subranges sequentially
instead of in parallel. The code is easier to understand that way and
addresses the code review issues raised about LiveOutData being
hard to understand/needing more comments by removing them :)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224313 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-16 04:03:38 +00:00
Adrian Prantl
6f059afde6 ARM/AArch64: Attach the FrameSetup MIFlag to CFI instructions.
Debug info marks the first instruction without the FrameSetup flag
as being the end of the function prologue. Any CFI instructions in the
middle of the function prologue would cause debug info to end the prologue
too early and worse, attach the line number of the CFI instruction, which
incidentally is often 0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224294 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-16 00:20:49 +00:00
Matthias Braun
4151e89f1e Revert "LiveRangeCalc: Rewrite subrange calculation"
Revert until I find out why non-subreg enabled targets break.

This reverts commit 6097277eef.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224278 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-15 21:36:35 +00:00
Matthias Braun
6097277eef LiveRangeCalc: Rewrite subrange calculation
This changes subrange calculation to calculate subranges sequentially
instead of in parallel. The code is easier to understand that way and
addresses the code review issues raised about LiveOutData being
hard to understand/needing more comments by removing them :)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224272 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-15 21:16:21 +00:00
Matthias Braun
908d623dff LiveRangeCalc: use more range based for loops; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224263 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-15 19:40:46 +00:00
Michael Ilseman
9ecdca9115 Silence more static analyzer warnings.
Add in definedness checks for shift operators, null checks when
pointers are assumed by the code to be non-null, and explicit
unreachables.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224255 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-15 18:48:43 +00:00
Akira Hatanaka
ce9b37c087 Rename argument strings of codegen passes to avoid collisions with command line
options.

This commit changes the command line arguments (PassInfo::PassArgument) of two
passes, MachineFunctionPrinter and MachineScheduler, to avoid collisions with
command line options that have the same argument strings.

This bug manifests when the PassList construct (defined in opt.cpp) is used
in a tool that links with codegen passes. To reproduce the bug, paste the
following lines into llc.cpp and run llc.

#include "llvm/IR/LegacyPassNameParser.h"
static llvm:🆑:list<const llvm::PassInfo*, bool, llvm::PassNameParser>
PassList(llvm:🆑:desc("Optimizations available:"));

rdar://problem/19212448


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224186 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-13 04:52:04 +00:00
Andrea Di Biagio
de48903a20 Reapply "[MachineScheduler] Fix for PR21807: minor code difference building with/without -g."
This reapplies r224118 with a fix for test 'misched-code-difference-with-debug.ll'.
That test was failing on some buildbots because it was x86 specific but it was
missing a target triple.
Added an explicit triple to test misched-code-difference-with-debug.ll.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224126 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-12 15:09:58 +00:00
Andrea Di Biagio
bf72e14565 Revert: [MachineScheduler] Fix for PR21807: minor code difference building with/without -g.
Test 'misched-code-difference-with-debug.ll' was failing on some buildbots.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224121 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-12 13:34:03 +00:00
Andrea Di Biagio
c15d82e259 [MachineScheduler] Fix for PR21807: minor code difference building with/without -g.
This patch fixes the issue reported as PR21807. There was a minor difference
in the generated code depending on the -g flag.

The cause was that with -g the machine scheduler used a different
scheduling strategy. This decision was based on the number of instructions
in a schedule region and included debug instructions in that count.

This patch fixes the issue in MISched and provides a test.

Patch by Russell Gallop!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224118 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-12 12:41:22 +00:00
Ekaterina Romanova
e0b0363e44 A fix for PR21176.
DW_OP_const <const> doesn't describe a constant value, but a value at a constant address. 
The proper way to describe a constant value is DW_OP_constu <const>, DW_OP_stack_value. 
Added DW_OP_stack_value to the stack. 

Marked incorrect-variable-debugloc1.ll to xfail for PowerPC64, while the the failure (PR21881) 
is being investigated. 



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224098 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-12 05:11:47 +00:00
Philip Reames
b7dfa31ac8 Comment and minor code cleanup for GCStrategy (NFC)
Updating comments to reflect the current state of the world after my recent changes to ownership structure and generally better describe what a GCStrategy is and how it works.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224086 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-12 00:49:03 +00:00
Matt Arsenault
6e6318f148 Add target hook for whether it is profitable to reduce load widths
Add an option to disable optimization to shrink truncated larger type
loads to smaller type loads. On SI this prevents using scalar load
instructions in some cases, since there are no scalar extloads.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224084 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-12 00:00:24 +00:00
Duncan P. N. Exon Smith
f88e4c8e91 CodeGen: Stop using LeakDetector for MachineInstr
Since `MachineInstr` is required to have a trivial destructor, it cannot
remove itself from `LeakDetection`.  Remove the calls.

As it happens, this requirement is because `MachineFunction` allocates
all `MachineInstr`s in a custom allocator; when the `MachineFunction` is
destroyed they're dropped of the edge.  There's no benefit to detecting
leaks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224061 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-11 21:51:37 +00:00
Matthias Braun
5b17297b3d [CodeGen] Add print and verify pass after each MachineFunctionPass by default
Previously print+verify passes were added in a very unsystematic way, which is
annoying when debugging as you miss intermediate steps and allows bugs to stay
unnotice when no verification is performed.

To make this change practical I added the possibility to explicitely disable
verification. I used this option on all places where no verification was
performed previously (because alot of places actually don't pass the
MachineVerifier).
In the long term these problems should be fixed properly and verification
enabled after each pass. I'll enable some more verification in subsequent
commits.

This is the 2nd attempt at this after realizing that PassManager::add() may
actually delete the pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224059 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-11 21:26:47 +00:00
Rafael Espindola
428923cfe2 This reverts commit r224043 and r224042.
check-llvm was failing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224045 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-11 20:03:57 +00:00
Matthias Braun
71f56c4aac [CodeGen] Add print and verify pass after each MachineFunctionPass by default
Previously print+verify passes were added in a very unsystematic way, which is
annoying when debugging as you miss intermediate steps and allows bugs to stay
unnotice when no verification is performed.

To make this change practical I added the possibility to explicitely disable
verification. I used this option on all places where no verification was
performed previously (because alot of places actually don't pass the
MachineVerifier).
In the long term these problems should be fixed properly and verification
enabled after each pass. I'll enable some more verification in subsequent
commits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224042 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-11 19:42:05 +00:00
Matthias Braun
9173c775e3 [CodeGen] Let MachineVerifierPass own its banner string
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224041 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-11 19:41:51 +00:00
Patrik Hagglund
40c2ddeffe Bugfix in InlineSpiller::traceSiblingValue().
Properly determine whether or not a phi was added by splitting.
Check against the current VNInfo of OrigLI instead of against the
OrigVNI argument.

Patch provided by Jonas Paulsson. Reviewed by Quentin Colombet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224009 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-11 10:40:17 +00:00
Ekaterina Romanova
e9afc33de5 Reverting commit 223981, because the test that I added (incorrect-variable-debugloc1.ll) failed for llvm-ppc64.
The test is failing for llvm-ppc64 because for this platform the location list is not being generated at all (most likely because of the bug in PPC code optimization or generation). I will file a bug agains PPC compiler, but meanwhile, until PPC bug is fixed, I will have to revert my change.  



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224000 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-11 06:22:35 +00:00
Philip Reames
5e62b8471d GCStrategy should not own GCFunctionInfo
This change moves the ownership and access of GCFunctionInfo (the object which describes the safepoints associated with a safepoint under GCRoot) to GCModuleInfo. Previously, this was owned by GCStrategy which was in turned owned by GCModuleInfo. This made GCStrategy module specific which is 'surprising' given it's name and other purposes.

There's a few more changes needed, but we're getting towards the point we can reuse GCStrategy for gc.statepoint as well.

p.s. The style of this code ends up being a mess. I was trying to move code around without otherwise changing much. Once I get the ownership structure rearranged, I will go through and fixup spacing, naming, comments etc.

Differential Revision: http://reviews.llvm.org/D6587



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223994 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-11 01:47:23 +00:00
Matthias Braun
1bfcc2d56f LiveInterval: Use range based for loops for subregister ranges.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223991 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-11 00:59:06 +00:00
Ekaterina Romanova
de68a563bc A fix for PR21176.
DW_OP_const <const> doesn't describe a constant value, but a value at a constant address.
The proper way to describe a constant value is DW_OP_constu <const>, DW_OP_stack_value.

Added DW_OP_stack_value to the stack.

-This line, and those below, will be ignored--

M    lib/CodeGen/AsmPrinter/DwarfDebug.cpp
A    test/DebugInfo/incorrect-variable-debugloc1.ll


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223981 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 23:19:56 +00:00
Matthias Braun
218d20a48b LiveInterval: Use more range based for loops for value numbers and segments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223978 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 23:07:54 +00:00
Aaron Ballman
7e1839ff01 Silencing a -Wsequence-point warning, and the resulting undefined behavior. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223926 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 14:14:54 +00:00
Matthias Braun
5e40bd7d5f MachineVerifier: Allow physreg use if just a subreg is defined.
We can't mark partially undefined registers, so we have to allow reading
a register in the machine verifier if just parts of a register are
defined.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223896 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 01:13:13 +00:00
Matthias Braun
8f08516992 MachineVerifier: Allow LiveInterval segments to end at a partial write.
In the subregister liveness tracking case we do not create implicit
reads on partial register writes anymore, still we need to produce a new
SSA value for partial writes so the live segment has to end.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223895 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 01:13:11 +00:00
Matthias Braun
8f08002f03 VirtRegMap: Improve block live-in info if subregister liveness is available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223894 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 01:13:08 +00:00
Matthias Braun
668132490c VirtRegMap: No implicit defs/uses for super registers with subreg liveness tracking.
Adding the implicit defs/uses to the superregisters is semantically questionable
but was not dangerous before as the register allocator never assigned the same
register to two overlapping LiveIntervals even when the actually live
subregisters do not overlap. With subregister liveness tracking enabled this
does actually happen and leads to subsequent bugs if we don't stop adding
the superregister defs/uses.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223892 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 01:13:04 +00:00
Matthias Braun
0fc28990e4 LiveRegMatrix: Respect subregister liveness when allocating registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223891 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 01:13:01 +00:00
Matthias Braun
7b54b4de26 LiveIntervalUnion: Allow specification of liverange when unifying/extracting.
This allows it to add subregister ranges into the union.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223890 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 01:12:59 +00:00
Matthias Braun
08c7b086a9 RegisterCoalescer: Preserve subregister liveranges.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223888 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 01:12:52 +00:00
Matthias Braun
f2f0589b02 LiveInterval: Add removeEmptySubRanges().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223887 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 01:12:40 +00:00
Matthias Braun
6e616d2e97 LiveIntervalAnalysis: Add subregister aware variants pruneValue().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223886 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 01:12:36 +00:00
Matthias Braun
7fbeb8d1b9 Add a flag to enable/disable subregister liveness.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223884 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 01:12:30 +00:00
Matthias Braun
4402447964 LiveIntervalAnalysis: Adapt repairIntervalsInRange() to subregister liveness.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223883 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 01:12:26 +00:00
Matthias Braun
c7db66d496 LiveRangeEdit: Adapt eliminateDeadDef() to subregister liveness.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223882 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 01:12:23 +00:00
Matthias Braun
dc08729288 LiveIntervalAnalysis: Adapt handleMove() to subregister ranges.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223881 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 01:12:20 +00:00
Matthias Braun
e59399c28c LiveIntervalAnalysis: Update SubRanges in shrinkToUses().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223880 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 01:12:18 +00:00
Matthias Braun
c66fa840bf LiveIntervalAnalysis: Compute subregister ranges.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223878 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 01:12:12 +00:00
Matthias Braun
01ddf04b63 LiveInterval: Add support to track liveness of subregisters.
This code adds the required data structures. Algorithms to compute it follow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223877 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 01:12:10 +00:00
Matthias Braun
5874714ac3 LiveInterval: Add a 'covers' operation to LiveRange.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223876 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-10 01:12:06 +00:00
Philip Reames
3490d23337 Remove the Module pointer from GCStrategy and GCMetadataPrinter
In the current implementation, GCStrategy is a part of the ownership structure for the gc metadata which describes a Module. It also contains a reference to the module in question. As a result, GCStrategy instances are essentially Module specific.

I plan to transition away from this design. Instead, a GCStrategy will be owned by the LLVMContext. It will be a lightweight policy object which contains no information about the Modules or Functions involved, but can be easily reached given a Function.

The first step in this transition is to remove the direct Module reference from GCStrategy. This also requires removing the single user of this reference, the GCMetadataPrinter hierarchy. In theory, this will allow the lifetime of the printers to be scoped to the LLVMContext as well, but in practice, I'm not actually changing that. (Yet?)

An alternate design would have been to move the direct Module reference into the GCMetadataPrinter and change the keying of the owning maps to explicitly key off both GCStrategy and Module. I'm open to doing it that way instead, but didn't see much value in preserving the per Module association for GCMetadataPrinters.

The next change in this sequence will be to start unwinding the intertwined ownership between GCStrategy, GCModuleInfo, and GCFunctionInfo.

Differential Revision: http://reviews.llvm.org/D6566



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223859 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-09 23:57:54 +00:00
Duncan P. N. Exon Smith
dad20b2ae2 IR: Split Metadata from Value
Split `Metadata` away from the `Value` class hierarchy, as part of
PR21532.  Assembly and bitcode changes are in the wings, but this is the
bulk of the change for the IR C++ API.

I have a follow-up patch prepared for `clang`.  If this breaks other
sub-projects, I apologize in advance :(.  Help me compile it on Darwin
I'll try to fix it.  FWIW, the errors should be easy to fix, so it may
be simpler to just fix it yourself.

This breaks the build for all metadata-related code that's out-of-tree.
Rest assured the transition is mechanical and the compiler should catch
almost all of the problems.

Here's a quick guide for updating your code:

  - `Metadata` is the root of a class hierarchy with three main classes:
    `MDNode`, `MDString`, and `ValueAsMetadata`.  It is distinct from
    the `Value` class hierarchy.  It is typeless -- i.e., instances do
    *not* have a `Type`.

  - `MDNode`'s operands are all `Metadata *` (instead of `Value *`).

  - `TrackingVH<MDNode>` and `WeakVH` referring to metadata can be
    replaced with `TrackingMDNodeRef` and `TrackingMDRef`, respectively.

    If you're referring solely to resolved `MDNode`s -- post graph
    construction -- just use `MDNode*`.

  - `MDNode` (and the rest of `Metadata`) have only limited support for
    `replaceAllUsesWith()`.

    As long as an `MDNode` is pointing at a forward declaration -- the
    result of `MDNode::getTemporary()` -- it maintains a side map of its
    uses and can RAUW itself.  Once the forward declarations are fully
    resolved RAUW support is dropped on the ground.  This means that
    uniquing collisions on changing operands cause nodes to become
    "distinct".  (This already happened fairly commonly, whenever an
    operand went to null.)

    If you're constructing complex (non self-reference) `MDNode` cycles,
    you need to call `MDNode::resolveCycles()` on each node (or on a
    top-level node that somehow references all of the nodes).  Also,
    don't do that.  Metadata cycles (and the RAUW machinery needed to
    construct them) are expensive.

  - An `MDNode` can only refer to a `Constant` through a bridge called
    `ConstantAsMetadata` (one of the subclasses of `ValueAsMetadata`).

    As a side effect, accessing an operand of an `MDNode` that is known
    to be, e.g., `ConstantInt`, takes three steps: first, cast from
    `Metadata` to `ConstantAsMetadata`; second, extract the `Constant`;
    third, cast down to `ConstantInt`.

    The eventual goal is to introduce `MDInt`/`MDFloat`/etc. and have
    metadata schema owners transition away from using `Constant`s when
    the type isn't important (and they don't care about referring to
    `GlobalValue`s).

    In the meantime, I've added transitional API to the `mdconst`
    namespace that matches semantics with the old code, in order to
    avoid adding the error-prone three-step equivalent to every call
    site.  If your old code was:

        MDNode *N = foo();
        bar(isa             <ConstantInt>(N->getOperand(0)));
        baz(cast            <ConstantInt>(N->getOperand(1)));
        bak(cast_or_null    <ConstantInt>(N->getOperand(2)));
        bat(dyn_cast        <ConstantInt>(N->getOperand(3)));
        bay(dyn_cast_or_null<ConstantInt>(N->getOperand(4)));

    you can trivially match its semantics with:

        MDNode *N = foo();
        bar(mdconst::hasa               <ConstantInt>(N->getOperand(0)));
        baz(mdconst::extract            <ConstantInt>(N->getOperand(1)));
        bak(mdconst::extract_or_null    <ConstantInt>(N->getOperand(2)));
        bat(mdconst::dyn_extract        <ConstantInt>(N->getOperand(3)));
        bay(mdconst::dyn_extract_or_null<ConstantInt>(N->getOperand(4)));

    and when you transition your metadata schema to `MDInt`:

        MDNode *N = foo();
        bar(isa             <MDInt>(N->getOperand(0)));
        baz(cast            <MDInt>(N->getOperand(1)));
        bak(cast_or_null    <MDInt>(N->getOperand(2)));
        bat(dyn_cast        <MDInt>(N->getOperand(3)));
        bay(dyn_cast_or_null<MDInt>(N->getOperand(4)));

  - A `CallInst` -- specifically, intrinsic instructions -- can refer to
    metadata through a bridge called `MetadataAsValue`.  This is a
    subclass of `Value` where `getType()->isMetadataTy()`.

    `MetadataAsValue` is the *only* class that can legally refer to a
    `LocalAsMetadata`, which is a bridged form of non-`Constant` values
    like `Argument` and `Instruction`.  It can also refer to any other
    `Metadata` subclass.

(I'll break all your testcases in a follow-up commit, when I propagate
this change to assembly.)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223802 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-09 18:38:53 +00:00
Juergen Ributzka
0ff2059da6 [CGP] Rewrite pattern match for splitBranchCondition to work with Values instead.
Rewrite the pattern match code to work also with Values instead with
Instructions only. Also remove the no longer need matcher (m_Instruction).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223797 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-09 17:50:10 +00:00
Juergen Ributzka
f3ac16dcb4 Revert "Move function to obtain branch weights into the BranchInst class. NFC."
This reverts commit r223784 and copies the 'ExtractBranchMetadata' to CodeGenPrepare.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223795 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-09 17:32:12 +00:00
Juergen Ributzka
b49ee78320 [CodeGenPrepare] Split branch conditions into multiple conditional branches.
This optimization transforms code like:
bb1:
  %0 = icmp ne i32 %a, 0
  %1 = icmp ne i32 %b, 0
  %or.cond = or i1 %0, %1
  br i1 %or.cond, label %TrueBB, label %FalseBB

into a multiple branch instructions like:

bb1:
  %0 = icmp ne i32 %a, 0
  br i1 %0, label %TrueBB, label %bb2
bb2:
  %1 = icmp ne i32 %b, 0
  br i1 %1, label %TrueBB, label %FalseBB

This optimization is already performed by SelectionDAG, but not by FastISel.
FastISel cannot perform this optimization, because it cannot generate new
MachineBasicBlocks.

Performing this optimization at CodeGenPrepare time makes it available to both -
SelectionDAG and FastISel - and the implementation in SelectiuonDAG could be
removed. There are currenty a few differences in codegen for X86 and PPC, so
this commmit only enables it for FastISel.

Reviewed by Jim Grosbach

This fixes rdar://problem/19034919.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223786 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-09 16:36:13 +00:00
Owen Anderson
59bf8e81f3 Fix a few instances found in SelectionDAG where we were not handling F16 at parity with F32 and F64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223760 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-09 06:50:39 +00:00
Hal Finkel
014b06e7b2 Handle early-clobber registers in the aggressive anti-dep breaker
The aggressive anti-dep breaker, used by the PowerPC backend during post-RA
scheduling (but is available to all targets), did not handle early-clobber MI
operands (at all). When constructing the list of available registers for the
replacement of some def operand, check the using instructions, and remove
registers assigned to early-clobbered defs from the set.

Fixes PR21452.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223727 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-09 01:00:59 +00:00
Tom Stellard
653ef32216 MISched: Fix moving stores across barriers
This fixes an issue with ScheduleDAGInstrs::buildSchedGraph
where stores without an underlying object would not be added
as a predecessor to the current BarrierChain.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223717 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-08 23:36:48 +00:00
Justin Bogner
70b0751080 InstrProf: An intrinsic and lowering for instrumentation based profiling
Introduce the ``llvm.instrprof_increment`` intrinsic and the
``-instrprof`` pass. These provide the infrastructure for writing
counters for profiling, as in clang's ``-fprofile-instr-generate``.

The implementation of the instrprof pass is ported directly out of the
CodeGenPGO classes in clang, and with the followup in clang that rips
that code out to use these new intrinsics this ends up being NFC.

Doing the instrumentation this way opens some doors in terms of
improving the counter performance. For example, this will make it
simple to experiment with alternate lowering strategies, and allows us
to try handling profiling specially in some optimizations if we want
to.

Finally, this drastically simplifies the frontend and puts all of the
lowering logic in one place.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223672 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-08 18:02:35 +00:00
Hans Wennborg
a421ac689f SelectionDAG switch lowering: Replace unreachable default with most popular case.
This can significantly reduce the size of the switch, allowing for more
efficient lowering.

I also worked with the idea of exploiting unreachable defaults by
omitting the range check for jump tables, but always ended up with a
non-neglible binary size increase. It might be worth looking into some more.

SimplifyCFG currently does this transformation, but I'm working towards changing
that so we can optimize harder based on unreachable defaults.

Differential Revision: http://reviews.llvm.org/D6510

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223566 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-06 01:28:50 +00:00
Eric Christopher
7a76d76e0c These two calls were grabbing the same register info. Unify them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223502 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 19:23:55 +00:00
Ahmed Bougacha
bf6e012f8e [CodeGenPrepare] Use variables for reused values. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223491 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 18:04:40 +00:00
Hal Finkel
138d5bf371 Revert "r223440 - Consider subregs when calling MI::registerDefIsDead for phys deps"
Reverting this because, while it fixes the problem in the reduced test case, it
does not fix the problem in the full test case from the bug report.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223442 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 02:07:35 +00:00
Hal Finkel
d281f1443a Consider subregs when calling MI::registerDefIsDead for phys deps
The scheduling dependency graph is built bottom-up within each scheduling
region, and ScheduleDAGInstrs::addPhysRegDeps is called to add output/anti
dependencies, based on physical registers, to the SUs for instructions
based on those that come before them.

In the test case, we start before post-RA scheduling with a block that looks
like this:

...
	INLINEASM <...
andc $0,$0,$2
stdcx. $0,0,$3
bne- 1b
> [sideeffect] [mayload] [maystore] [attdialect], $0:[regdef-ec:G8RC], %X6<earlyclobber,def,dead>, $1:[mem], %X3<kill>, $2:[reguse:G8RC], %X5<kill>, $3:[reguse:G8RC], %X3, $4:[mem], %X3, $5:[clobber], %CC<earlyclobber,imp-def,dead>, <<badref>>
	...
	%X4<def,dead> = ANDIo8 %X4<kill>, 1, %CR0<imp-def,dead>, %CR0GT<imp-def>
	...
	%R29<def> = ISEL %R3<undef>, %R4<kill>, %CR0GT<kill>

where it is relevant that %CC is an alias to %CR0, and that %CR0GT is a
subregister of %CR0. However, for post-RA scheduling, no dependency was added
to prevent the INLINEASM from being scheduled in between the ANDIo8 and the
ISEL (which communicate via the %CR0GT register).

In ScheduleDAGInstrs::addPhysRegDeps, when called for the %CC operand, we'd
iterate over all of its aliases (which include %CC itself and also %CR0), and
look for previously-encountered defs of those registers. We'd find the ANDIo8,
but decide not to add a dependency between the INLINEASM and the ANDIo8 because
both the INLINEASM's def of %CC is dead, and also the ANDIo8 def of %CR0 is
dead. This ignores, however, that ANDIo8 has a non-dead def of %CR0GT, a
subregister of %CR0, and thus a dependency still must exist.

To fix this problem, when calling registerDefIsDead on the SU with the def, we
also check all subregisters for possible non-dead defs, and add the dependency
if any are found.

Fixes PR21742.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223440 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 01:57:22 +00:00
Adrian Prantl
74fb019dd0 Cleanup: Calls to getDwarfRegNum() may actually fail, if there is
no DWARF register number mapping, or if the register was a virtual
register that was never materialized. Previously, we would just emit a
bogus location, after this patch we don't emit a location at all by
doing an early exit.

After my bugfix in r223401 today, this doesn't actually happen on any
target that I tested this with, but it's still preferable to make the
possibility of a failure explicit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223428 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 01:02:46 +00:00
Adrian Prantl
9e8083744d Simplify implementation and testcase of r223401 based on feedback from dblaikie.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223405 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 22:58:41 +00:00
Adrian Prantl
fb8dcb45c6 Debug info: If the RegisterCoalescer::reMaterializeTrivialDef() is
eliminating all uses of a vreg, update any DBG_VALUE describing that vreg
to point to the rematerialized register instead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223401 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 22:29:04 +00:00
Patrik Hagglund
cfb121f286 Use DomTree in MachineSink to sink over diamonds.
According to a previous FIXME comment we now not only look at MBB
successors, but also handle code sinking past them:

  x = computation
  if () {} else {}
  use x

The instruction could be sunk over the whole diamond for the
if/then/else (or loop, etc), allowing it to be sunk into other blocks
after that.

Modified test added in r204522, due to one spill less present.

Minor fixes in comments.

Patch provided by Jonas Paulsson. Reviewed by Hal Finkel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223350 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 10:36:42 +00:00
Simon Pilgrim
94590ca4cf [InstCombine] Minor optimization for bswap with binary ops
Added instcombine optimizations for BSWAP with AND/OR/XOR ops:

OP( BSWAP(x), BSWAP(y) ) -> BSWAP( OP(x, y) )
OP( BSWAP(x), CONSTANT ) -> BSWAP( OP(x, BSWAP(CONSTANT) ) )

Since its just a one liner, I've also added BSWAP to the DAGCombiner equivalent as well:

fold (OP (bswap x), (bswap y)) -> (bswap (OP x, y))

Refactored bswap-fold tests to use FileCheck instead of just checking that the bswaps had gone.

Differential Revision: http://reviews.llvm.org/D6407



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223349 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 09:44:01 +00:00
Elena Demikhovsky
73ae1df82c Masked Load / Store Intrinsics - the CodeGen part.
I'm recommiting the codegen part of the patch.
The vectorizer part will be send to review again.

Masked Vector Load and Store Intrinsics.
Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores.
Added SDNodes for masked operations and lowering patterns for X86 code generator.
Examples:
<16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align */, <16 x i1> %mask)
declare void @llvm.masked.store.v8f64(i8* %addr, <8 x double> %value, i32 4, <8 x i1> %mask)

Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch.

http://reviews.llvm.org/D6191



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223348 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 09:40:44 +00:00
Matt Arsenault
459e595697 Allow target to specify prefix for labels
Use the MCAsmInfo instead of the DataLayout, and allow
specifying a custom prefix for labels specifically. HSAIL
requires that labels begin with @, but global symbols with &.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223323 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 00:06:57 +00:00
Quentin Colombet
331ec379a0 [RegAllocFast] Handle implicit definitions conservatively.
Prior to this commit, physical registers defined implicitly were considered free
right after their definition, i.e.. like dead definitions. Therefore, their uses
had to immediately follow their definitions, otherwise the related register may
be reused to allocate a virtual register.

This commit fixes this assumption by keeping implicit definitions alive until
they are actually used. The downside is that if the implicit definition was dead
(and not marked at such), we block an otherwise available register. This is
however conservatively correct and makes the fast register allocator much more
robust in particular regarding the scheduling of the instructions.

Fixes PR21700.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223317 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-03 23:38:08 +00:00
Peter Collingbourne
bb660fc192 Prologue support
Patch by Ben Gamari!

This redefines the `prefix` attribute introduced previously and
introduces a `prologue` attribute.  There are a two primary usecases
that these attributes aim to serve,

  1. Function prologue sigils

  2. Function hot-patching: Enable the user to insert `nop` operations
     at the beginning of the function which can later be safely replaced
     with a call to some instrumentation facility

  3. Runtime metadata: Allow a compiler to insert data for use by the
     runtime during execution. GHC is one example of a compiler that
     needs this functionality for its tables-next-to-code functionality.

Previously `prefix` served cases (1) and (2) quite well by allowing the user
to introduce arbitrary data at the entrypoint but before the function
body. Case (3), however, was poorly handled by this approach as it
required that prefix data was valid executable code.

Here we redefine the notion of prefix data to instead be data which
occurs immediately before the function entrypoint (i.e. the symbol
address). Since prefix data now occurs before the function entrypoint,
there is no need for the data to be valid code.

The previous notion of prefix data now goes under the name "prologue
data" to emphasize its duality with the function epilogue.

The intention here is to handle cases (1) and (2) with prologue data and
case (3) with prefix data.

References
----------

This idea arose out of discussions[1] with Reid Kleckner in response to a
proposal to introduce the notion of symbol offsets to enable handling of
case (3).

[1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-May/073235.html

Test Plan: testsuite

Differential Revision: http://reviews.llvm.org/D6454

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223189 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-03 02:08:38 +00:00