Commit Graph

5786 Commits

Author SHA1 Message Date
Quentin Colombet
921f0b1d66 [CodeGenPrepare] Undo changes that happened for the profitability check.
The addressing mode matcher checks at some point the profitability of folding an
instruction into the addressing mode. When the instruction to be folded has
several uses, it checks that the instruction can be folded in each use.
To do so, it creates a new matcher for each use and check if the instruction is
in the list of the matched instructions of this new matcher.

The new matchers may promote some instructions and this has to be undone to keep
the state of the original matcher consistent.

A test case will follow.

<rdar://problem/16020230>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201121 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-11 01:59:02 +00:00
Benjamin Kramer
299918ad48 Make succ_iterator a real random access iterator and clean up a couple of users.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201088 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-10 14:17:42 +00:00
Juergen Ributzka
6f1819f2e6 [Constant Hoisting] Fix insertion point for constant materialization.
The bitcast instruction during constant materialization was not placed correcly
in the presence of phi nodes. This commit fixes the insertion point to be in the
idom instead.

This fixes PR18768

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201009 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-08 00:20:49 +00:00
Juergen Ributzka
1368e659d7 [Constant Hoisting] Don't update the use list while traversing it - DOH!
This fix first traverses the whole use list of the constant expression and
keeps track of the instructions that need to be updated. Then perform the
fixup afterwards.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201008 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-08 00:20:45 +00:00
Quentin Colombet
30c0f72237 [CodeGenPrepare] Move away sign extensions that get in the way of addressing
mode.

Basically the idea is to transform code like this:
%idx = add nsw i32 %a, 1
%sextidx = sext i32 %idx to i64
%gep = gep i8* %myArray, i64 %sextidx
load i8* %gep

Into:
%sexta = sext i32 %a to i64
%idx = add nsw i64 %sexta, 1
%gep = gep i8* %myArray, i64 %idx
load i8* %gep

That way the computation can be folded into the addressing mode.

This transformation is done as part of the addressing mode matcher.
If the matching fails (not profitable, addressing mode not legal, etc.), the
matcher will revert the related promotions.

<rdar://problem/15519855>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200947 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-06 21:44:56 +00:00
Nick Lewycky
44e40408ee A memcpy out of an fresh alloca is a no-op, delete it. Patch by Patrick Walton!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200907 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-06 06:29:19 +00:00
Paul Robinson
2684ddd72e Disable most IR-level transform passes on functions marked 'optnone'.
Ideally only those transform passes that run at -O0 remain enabled,
in reality we get as close as we reasonably can.
Passes are responsible for disabling themselves, it's not the job of
the pass manager to do it for them.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200892 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-06 00:07:05 +00:00
Duncan P. N. Exon Smith
483727da48 cleanup: scc_iterator consumers should use isAtEnd
No functional change.  Updated loops from:

    for (I = scc_begin(), E = scc_end(); I != E; ++I)

to:

    for (I = scc_begin(); !I.isAtEnd(); ++I)

for teh win.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200789 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-04 19:19:07 +00:00
Nick Lewycky
e4d1a3e352 Self-memcpy-elision and memcpy of constant byte to memset transforms don't care how many bytes you were trying to transfer. Sink that safety test after those transforms. Noticed by inspection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200726 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-04 00:18:54 +00:00
Chandler Carruth
115fd30b24 [LPM] Apply a really big hammer to fix PR18688 by recursively reforming
LCSSA when we promote to SSA registers inside of LICM.

Currently, this is actually necessary. The promotion logic in LICM uses
SSAUpdater which doesn't understand how to place LCSSA PHI nodes.
Teaching it to do so would be a very significant undertaking. It may be
worthwhile and I've left a FIXME about this in the code as well as
starting a thread on llvmdev to try to figure out the right long-term
solution.

For now, the PR needs to be fixed. Short of using the promition
SSAUpdater to place both the LCSSA PHI nodes and the promoted PHI nodes,
I don't see a cleaner or cheaper way of achieving this. Fortunately,
LCSSA is relatively lazy and sparse -- it should only update
instructions which need it. We can also skip the recursive variant when
we don't promote to SSA values.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200612 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-01 13:35:14 +00:00
Chandler Carruth
a403ceb205 [LPM] Fix PR18643, another scary place where loop transforms failed to
preserve loop simplify of enclosing loops.

The problem here starts with LoopRotation which ends up cloning code out
of the latch into the new preheader it is buidling. This can create
a new edge from the preheader into the exit block of the loop which
breaks LoopSimplify form. The code tries to fix this by splitting the
critical edge between the latch and the exit block to get a new exit
block that only the latch dominates. This sadly isn't sufficient.

The exit block may be an exit block for multiple nested loops. When we
clone an edge from the latch of the inner loop to the new preheader
being built in the outer loop, we create an exiting edge from the outer
loop to this exit block. Despite breaking the LoopSimplify form for the
inner loop, this is fine for the outer loop. However, when we split the
edge from the inner loop to the exit block, we create a new block which
is in neither the inner nor outer loop as the new exit block. This is
a predecessor to the old exit block, and so the split itself takes the
outer loop out of LoopSimplify form. We need to split every edge
entering the exit block from inside a loop nested more deeply than the
exit block in order to preserve all of the loop simplify constraints.

Once we try to do that, a problem with splitting critical edges
surfaces. Previously, we tried a very brute force to update LoopSimplify
form by re-computing it for all exit blocks. We don't need to do this,
and doing this much will sometimes but not always overlap with the
LoopRotate bug fix. Instead, the code needs to specifically handle the
cases which can start to violate LoopSimplify -- they aren't that
common. We need to see if the destination of the split edge was a loop
exit block in simplified form for the loop of the source of the edge.
For this to be true, all the predecessors need to be in the exact same
loop as the source of the edge being split. If the dest block was
originally in this form, we have to split all of the deges back into
this loop to recover it. The old mechanism of doing this was
conservatively correct because at least *one* of the exiting blocks it
rewrote was the DestBB and so the DestBB's predecessors were fixed. But
this is a much more targeted way of doing it. Making it targeted is
important, because ballooning the set of edges touched prevents
LoopRotate from being able to split edges *it* needs to split to
preserve loop simplify in a coherent way -- the critical edge splitting
would sometimes find the other edges in need of splitting but not
others.

Many, *many* thanks for help from Nick reducing these test cases
mightily. And helping lots with the analysis here as this one was quite
tricky to track down.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200393 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-29 13:16:53 +00:00
Chandler Carruth
6a67a3f3ec [LPM] Fix PR18642, a pretty nasty bug in IndVars that "never mattered"
because of the inside-out run of LoopSimplify in the LoopPassManager and
the fact that LoopSimplify couldn't be "preserved" across two
independent LoopPassManagers.

Anyways, in that case, IndVars wasn't correctly preserving an LCSSA PHI
node because it thought it was rewriting (via SCEV) the incoming value
to a loop invariant value. While it may well be invariant for the
current loop, it may be rewritten in terms of an enclosing loop's
values. This in and of itself is fine, as the LCSSA PHI node in the
enclosing loop for the inner loop value we're rewriting will have its
own LCSSA PHI node if used outside of the enclosing loop. With me so
far?

Well, the current loop and the enclosing loop may share an exiting
block and exit block, and when they do they also share LCSSA PHI nodes.
In this case, its not valid to RAUW through the LCSSA PHI node.

Expected crazy test included.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200372 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-29 04:40:19 +00:00
Reid Kleckner
59bec0e3c0 Update optimization passes to handle inalloca arguments
Summary:
I searched Transforms/ and Analysis/ for 'ByVal' and updated those call
sites to check for inalloca if appropriate.

I added tests for any change that would allow an optimization to fire on
inalloca.

Reviewers: nlewycky

Differential Revision: http://llvm-reviews.chandlerc.com/D2449

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200281 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-28 02:38:36 +00:00
Benjamin Kramer
08aa38d39b ConstantHoisting: We can't insert instructions directly in front of a PHI node.
Insert before the terminating instruction of the dominating block instead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200218 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-27 13:11:43 +00:00
Chandler Carruth
3d69cf57e1 [LPM] Make LCSSA a utility with a FunctionPass that applies it to all
the loops in a function, and teach LICM to work in the presance of
LCSSA.

Previously, LCSSA was a loop pass. That made passes requiring it also be
loop passes and unable to depend on function analysis passes easily. It
also caused outer loops to have a different "canonical" form from inner
loops during analysis. Instead, we go into LCSSA form and preserve it
through the loop pass manager run.

Note that this has the same problem as LoopSimplify that prevents
enabling its verification -- loop passes which run at the end of the loop
pass manager and don't preserve these are valid, but the subsequent loop
pass runs of outer loops that do preserve this pass trigger too much
verification and fail because the inner loop no longer verifies.

The other problem this exposed is that LICM was completely unable to
handle LCSSA form. It didn't preserve it and it actually would give up
on moving instructions in many cases when they were used by an LCSSA phi
node. I've taught LICM to support detecting LCSSA-form PHI nodes and to
hoist and sink around them. This may actually let LICM fire
significantly more because we put everything into LCSSA form to rotate
the loop before running LICM. =/ Now LICM should handle that fine and
preserve it correctly. The down side is that LICM has to require LCSSA
in order to preserve it. This is just a fact of life for LCSSA. It's
entirely possible we should completely remove LCSSA from the optimizer.

The test updates are essentially accomodating LCSSA phi nodes in the
output of LICM, and the fact that we now completely sink every
instruction in ashr-crash below the loop bodies prior to unrolling.

With this change, LCSSA is computed only three times in the pass
pipeline. One of them could be removed (and potentially a SCEV run and
a separate LoopPassManager entirely!) if we had a LoopPass variant of
InstCombine that ran InstCombine on the loop body but refused to combine
away LCSSA PHI nodes. Currently, this also prevents loop unrolling from
being in the same loop pass manager is rotate, LICM, and unswitch.

There is one thing that I *really* don't like -- preserving LCSSA in
LICM is quite expensive. We end up having to re-run LCSSA twice for some
loops after LICM runs because LICM can undo LCSSA both in the current
loop and the parent loop. I don't really see good solutions to this
other than to completely move away from LCSSA and using tools like
SSAUpdater instead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200067 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-25 04:07:24 +00:00
Juergen Ributzka
943ce55f39 Revert "Revert "Add Constant Hoisting Pass" (r200034)"
This reverts commit r200058 and adds the using directive for
ARMTargetTransformInfo to silence two g++ overload warnings.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200062 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-25 02:02:55 +00:00
Hans Wennborg
503793e834 Revert "Add Constant Hoisting Pass" (r200034)
This commit caused -Woverloaded-virtual warnings. The two new
TargetTransformInfo::getIntImmCost functions were only added to the superclass,
and to the X86 subclass. The other targets were not updated, and the
warning highlighted this by pointing out that e.g. ARMTTI::getIntImmCost was
hiding the two new getIntImmCost variants.

We could pacify the warning by adding "using TargetTransformInfo::getIntImmCost"
to the various subclasses, or turning it off, but I suspect that it's wrong to
leave the functions unimplemnted in those targets. The default implementations
return TCC_Free, which I don't think is right e.g. for ARM.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200058 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-25 01:18:18 +00:00
Juergen Ributzka
96172cb4a4 Add Constant Hoisting Pass
Retry commit r200022 with a fix for the build bot errors. Constant expressions
have (unlike instructions) module scope use lists and therefore may have users
in different functions. The fix is to simply ignore these out-of-function uses.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200034 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-24 20:18:00 +00:00
Juergen Ributzka
dc6f9b9a4f Revert "Add Constant Hoisting Pass"
This reverts commit r200022 to unbreak the build bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200024 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-24 18:40:30 +00:00
Juergen Ributzka
fb282c68b7 Add Constant Hoisting Pass
This pass identifies expensive constants to hoist and coalesces them to
better prepare it for SelectionDAG-based code generation. This works around the
limitations of the basic-block-at-a-time approach.

First it scans all instructions for integer constants and calculates its
cost. If the constant can be folded into the instruction (the cost is
TCC_Free) or the cost is just a simple operation (TCC_BASIC), then we don't
consider it expensive and leave it alone. This is the default behavior and
the default implementation of getIntImmCost will always return TCC_Free.

If the cost is more than TCC_BASIC, then the integer constant can't be folded
into the instruction and it might be beneficial to hoist the constant.
Similar constants are coalesced to reduce register pressure and
materialization code.

When a constant is hoisted, it is also hidden behind a bitcast to force it to
be live-out of the basic block. Otherwise the constant would be just
duplicated and each basic block would have its own copy in the SelectionDAG.
The SelectionDAG recognizes such constants as opaque and doesn't perform
certain transformations on them, which would create a new expensive constant.

This optimization is only applied to integer constants in instructions and
simple (this means not nested) constant cast experessions. For example:
%0 = load i64* inttoptr (i64 big_constant to i64*)

Reviewed by Eric

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200022 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-24 18:23:08 +00:00
Alp Toker
ae43cab6ba Fix known typos
Sweep the codebase for common typos. Includes some changes to visible function
names that were misspelt.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200018 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-24 17:20:08 +00:00
Chandler Carruth
4296ce5662 [LPM] Fix a logic error in LICM spotted by inspection.
We completely skipped promotion in LICM if the loop has a preheader or
dedicated exits, but not *both*. We hoist if there is a preheader, and
sink if there are dedicated exits, but either hoisting or sinking can
move loop invariant code out of the loop!

I have no idea if this has a practical consequence. If anyone has ideas
for a test case, let me know.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199966 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-24 02:24:47 +00:00
Chandler Carruth
42e23de4db [cleanup] Use the type-based preservation method rather than a string
literal that bakes a pass name and forces parsing it in the pass
manager.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199963 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-24 01:59:49 +00:00
Chandler Carruth
aaf44af769 [LPM] Make LoopSimplify no longer a LoopPass and instead both a utility
function and a FunctionPass.

This has many benefits. The motivating use case was to be able to
compute function analysis passes *after* running LoopSimplify (to avoid
invalidating them) and then to run other passes which require
LoopSimplify. Specifically passes like unrolling and vectorization are
critical to wire up to BranchProbabilityInfo and BlockFrequencyInfo so
that they can be profile aware. For the LoopVectorize pass the only
things in the way are LoopSimplify and LCSSA. This fixes LoopSimplify
and LCSSA is next on my list.

There are also a bunch of other benefits of doing this:
- It is now very feasible to make more passes *preserve* LoopSimplify
  because they can simply run it after changing a loop. Because
  subsequence passes can assume LoopSimplify is preserved we can reduce
  the runs of this pass to the times when we actually mutate a loop
  structure.
- The new pass manager should be able to more easily support loop passes
  factored in this way.
- We can at long, long last observe that LoopSimplify is preserved
  across SCEV. This *halves* the number of times we run LoopSimplify!!!

Now, getting here wasn't trivial. First off, the interfaces used by
LoopSimplify are all over the map regarding how analysis are updated. We
end up with weird "pass" parameters as a consequence. I'll try to clean
at least some of this up later -- I'll have to have it all clean for the
new pass manager.

Next up I discovered a really frustrating bug. LoopUnroll *claims* to
preserve LoopSimplify. That's actually a lie. But the way the
LoopPassManager ends up running the passes, it always ran LoopSimplify
on the unrolled-into loop, rectifying this oversight before any
verification could kick in and point out that in fact nothing was
preserved. So I've added code to the unroller to *actually* simplify the
surrounding loop when it succeeds at unrolling.

The only functional change in the test suite is that we now catch a case
that was previously missed because SCEV and other loop transforms see
their containing loops as simplified and thus don't miss some
opportunities. One test case has been converted to check that we catch
this case rather than checking that we miss it but at least don't get
the wrong answer.

Note that I have #if-ed out all of the verification logic in
LoopSimplify! This is a temporary workaround while extracting these bits
from the LoopPassManager. Currently, there is no way to have a pass in
the LoopPassManager which preserves LoopSimplify along with one which
does not. The LPM will try to verify on each loop in the nest that
LoopSimplify holds but the now-Function-pass cannot distinguish what
loop is being verified and so must try to verify all of them. The inner
most loop is clearly no longer simplified as there is a pass which
didn't even *attempt* to preserve it. =/ Once I get LCSSA out (and maybe
LoopVectorize and some other fixes) I'll be able to re-enable this check
and catch any places where we are still failing to preserve
LoopSimplify. If this causes problems I can back this out and try to
commit *all* of this at once, but so far this seems to work and allow
much more incremental progress.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199884 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-23 11:23:19 +00:00
Matt Arsenault
88a9f0476c Handle an addrspacecast case in memcpyopt
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199836 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 21:53:19 +00:00
Tim Northover
b9b629cbaa Loop strength reduce: fix function name.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199801 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-22 13:27:00 +00:00
Chandler Carruth
c04f2c99ab [SROA] Fix a bug which could cause the common type finding to return
inconsistent results for different orderings of alloca slices. The
fundamental issue is that it is just always a mistake to return early
from this function. There is no effective early exit to leverage. This
patch stops trynig to do so and simplifies the code a bit as
a consequence.

Original diagnosis and patch by James Molloy with some name tweaks by me
in part reflecting feedback from Duncan Smith on the mailing list.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199771 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-21 23:16:05 +00:00
Chandler Carruth
e1a5243053 Fix a really nasty SROA bug with how we handled out-of-bounds memcpy
intrinsics.

Reported on the list by Evan with a couple of attempts to fix, but it
took a while to dig down to the root cause. There are two overlapping
bugs here, both centering around the circumstance of discovering
a memcpy operand which is known to be completely outside the bounds of
the alloca.

First, we need to kill the *other* side of the memcpy if it was added to
this alloca. Otherwise we'll factor it into our slicing and try to
rewrite it even though we know for a fact that it is dead. This is made
more tricky because we can visit the sides in either order. So we have
to both kill the other side and skip instructions marked as dead. The
latter really should be goodness in every case, but here is a matter of
correctness.

Second, we need to actually remove the *uses* of the alloca by the
memcpy when queuing it for later deletion. Otherwise it may still be
using the alloca when we go to promote it (if the rewrite re-uses the
existing alloca instruction). Do this by factoring out the
use-clobbering used when for nixing a Phi argument and re-using it
across the operands of a to-be-deleted instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199590 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-19 12:16:54 +00:00
Quentin Colombet
9b24eeee01 [opt][PassInfo] Allow opt to run passes that need target machine.
When registering a pass, a pass can now specify a second construct that takes as
argument a pointer to TargetMachine.
The PassInfo class has been updated to reflect that possibility.
If such a constructor exists opt will use it instead of the default constructor
when instantiating the pass.

Since such IR passes are supposed to be rare, no specific support has been
added to this commit to allow an easy registration of such a pass.
In other words, for such pass, the initialization function has to be
hand-written (see CodeGenPrepare for instance).

Now, codegenprepare can be tested using opt:
opt -codegenprepare -mtriple=mytriple input.ll


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199430 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-16 21:44:34 +00:00
Chandler Carruth
7f2eff792a [PM] Split DominatorTree into a concrete analysis result object which
can be used by both the new pass manager and the old.

This removes it from any of the virtual mess of the pass interfaces and
lets it derive cleanly from the DominatorTreeBase<> template. In turn,
tons of boilerplate interface can be nuked and it turns into a very
straightforward extension of the base DominatorTree interface.

The old analysis pass is now a simple wrapper. The names and style of
this split should match the split between CallGraph and
CallGraphWrapperPass. All of the users of DominatorTree have been
updated to match using many of the same tricks as with CallGraph. The
goal is that the common type remains the resulting DominatorTree rather
than the pass. This will make subsequent work toward the new pass
manager significantly easier.

Also in numerous places things became cleaner because I switched from
re-running the pass (!!! mid way through some other passes run!!!) to
directly recomputing the domtree.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199104 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-13 13:07:17 +00:00
Chandler Carruth
2073b0a63c [PM] Pull the generic graph algorithms and data structures for dominator
trees into the Support library.

These are all expressed in terms of the generic GraphTraits and CFG,
with no reliance on any concrete IR types. Putting them in support
clarifies that and makes the fact that the static analyzer in Clang uses
them much more sane. When moving the Dominators.h file into the IR
library I claimed that this was the right home for it but not something
I planned to work on. Oops.

So why am I doing this? It happens to be one step toward breaking the
requirement that IR verification can only be performed from inside of
a pass context, which completely blocks the implementation of
verification for the new pass manager infrastructure. Fixing it will
also allow removing the concept of the "preverify" step (WTF???) and
allow the verifier to cleanly flag functions which fail verification in
a way that precludes even computing dominance information. Currently,
that results in a fatal error even when you ask the verifier to not
fatally error. It's awesome like that.

The yak shaving will continue...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199095 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-13 10:52:56 +00:00
Chandler Carruth
56e1394c88 [cleanup] Move the Dominators.h and Verifier.h headers into the IR
directory. These passes are already defined in the IR library, and it
doesn't make any sense to have the headers in Analysis.

Long term, I think there is going to be a much better way to divide
these matters. The dominators code should be fully separated into the
abstract graph algorithm and have that put in Support where it becomes
obvious that evn Clang's CFGBlock's can use it. Then the verifier can
manually construct dominance information from the Support-driven
interface while the Analysis library can provide a pass which both
caches, reconstructs, and supports a nice update API.

But those are very long term, and so I don't want to leave the really
confusing structure until that day arrives.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199082 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-13 09:26:24 +00:00
Chandler Carruth
9f20a4c6ce Re-sort #include lines again, prior to moving headers around.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199080 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-13 08:04:33 +00:00
Diego Novillo
4b2b2da9c7 Extend and simplify the sample profile input file.
1- Use the line_iterator class to read profile files.

2- Allow comments in profile file. Lines starting with '#'
   are completely ignored while reading the profile.

3- Add parsing support for discriminators and indirect call samples.

   Our external profiler can emit more profile information that we are
   currently not handling. This patch does not add new functionality to
   support this information, but it allows profile files to provide it.

   I will add actual support later on (for at least one of these
   features, I need support for DWARF discriminators in Clang).

   A sample line may contain the following additional information:

   Discriminator. This is used if the sampled program was compiled with
   DWARF discriminator support
   (http://wiki.dwarfstd.org/index.php?title=Path_Discriminators). This
   is currently only emitted by GCC and we just ignore it.

   Potential call targets and samples. If present, this line contains a
   call instruction. This models both direct and indirect calls. Each
   called target is listed together with the number of samples. For
   example,

                    130: 7  foo:3  bar:2  baz:7

   The above means that at relative line offset 130 there is a call
   instruction that calls one of foo(), bar() and baz(). With baz()
   being the relatively more frequent call target.

   Differential Revision: http://llvm-reviews.chandlerc.com/D2355

4- Simplify format of profile input file.

   This implements earlier suggestions to simplify the format of the
   sample profile file. The symbol table is not necessary and function
   profiles do not need to know the number of samples in advance.

   Differential Revision: http://llvm-reviews.chandlerc.com/D2419

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198973 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-10 23:23:51 +00:00
Diego Novillo
0de8cecb84 Propagation of profile samples through the CFG.
This adds a propagation heuristic to convert instruction samples
into branch weights. It implements a similar heuristic to the one
implemented by Dehao Chen on GCC.

The propagation proceeds in 3 phases:

1- Assignment of block weights. All the basic blocks in the function
   are initial assigned the same weight as their most frequently
   executed instruction.

2- Creation of equivalence classes. Since samples may be missing from
   blocks, we can fill in the gaps by setting the weights of all the
   blocks in the same equivalence class to the same weight. To compute
   the concept of equivalence, we use dominance and loop information.
   Two blocks B1 and B2 are in the same equivalence class if B1
   dominates B2, B2 post-dominates B1 and both are in the same loop.

3- Propagation of block weights into edges. This uses a simple
   propagation heuristic. The following rules are applied to every
   block B in the CFG:

   - If B has a single predecessor/successor, then the weight
     of that edge is the weight of the block.

   - If all the edges are known except one, and the weight of the
     block is already known, the weight of the unknown edge will
     be the weight of the block minus the sum of all the known
     edges. If the sum of all the known edges is larger than B's weight,
     we set the unknown edge weight to zero.

   - If there is a self-referential edge, and the weight of the block is
     known, the weight for that edge is set to the weight of the block
     minus the weight of the other incoming edges to that block (if
     known).

Since this propagation is not guaranteed to finalize for every CFG, we
only allow it to proceed for a limited number of iterations (controlled
by -sample-profile-max-propagate-iterations). It currently uses the same
GCC default of 100.

Before propagation starts, the pass builds (for each block) a list of
unique predecessors and successors. This is necessary to handle
identical edges in multiway branches. Since we visit all blocks and all
edges of the CFG, it is cleaner to build these lists once at the start
of the pass.

Finally, the patch fixes the computation of relative line locations.
The profiler emits lines relative to the function header. To discover
it, we traverse the compilation unit looking for the subprogram
corresponding to the function. The line number of that subprogram is the
line where the function begins. That becomes line zero for all the
relative locations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198972 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-10 23:23:46 +00:00
Chandler Carruth
560e3955c3 Put the functionality for printing a value to a raw_ostream as an
operand into the Value interface just like the core print method is.
That gives a more conistent organization to the IR printing interfaces
-- they are all attached to the IR objects themselves. Also, update all
the users.

This removes the 'Writer.h' header which contained only a single function
declaration.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198836 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-09 02:29:41 +00:00
Chandler Carruth
bc65a8d518 Move the LLVM IR asm writer header files into the IR directory, as they
are part of the core IR library in order to support dumping and other
basic functionality.

Rename the 'Assembly' include directory to 'AsmParser' to match the
library name and the only functionality left their -- printing has been
in the core IR library for quite some time.

Update all of the #includes to match.

All of this started because I wanted to have the layering in good shape
before I started adding support for printing LLVM IR using the new pass
infrastructure, and commandline support for the new pass infrastructure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198688 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-07 12:34:26 +00:00
Chandler Carruth
974a445bd9 Re-sort all of the includes with ./utils/sort_includes.py so that
subsequent changes are easier to review. About to fix some layering
issues, and wanted to separate out the necessary churn.

Also comment and sink the include of "Windows.h" in three .inc files to
match the usage in Memory.inc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198685 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-07 11:48:04 +00:00
Andrew Trick
b4e0c9b85d Reapply r198654 "indvars: sink truncates outside the loop."
This doesn't seem to have actually broken anything. It was paranoia
on my part. Trying again now that bots are more stable.

This is a follow up of the r198338 commit that added truncates for
lcssa phi nodes. Sinking the truncates below the phis cleans up the
loop and simplifies subsequent analysis within the indvars pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198678 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-07 06:59:12 +00:00
Andrew Trick
2352abb7c6 Revert "indvars: sink truncates outside the loop."
This reverts commit r198654.

One of the bots reported a SciMark failure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198659 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-07 01:50:58 +00:00
Andrew Trick
ced88c5918 indvars: sink truncates outside the loop.
This is a follow up of the r198338 commit that added truncates for
lcssa phi nodes. Sinking the truncates below the phis cleans up the
loop and simplifies subsequent analysis within the indvars pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198654 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-07 01:02:55 +00:00
Andrew Trick
f86063e7ec 80 col. comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198653 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-07 01:02:52 +00:00
Alp Toker
395f7c2505 Add missed cleanup from r198456
All other uses of this macro in LLVM/clang have been moved to the function
definition so follow suite (and the usage advice) here too for consistency.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198516 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-04 22:47:48 +00:00
Nico Weber
c3d3f0c696 Add a LLVM_DUMP_METHOD macro.
The motivation is to mark dump methods as used in debug builds so that they can
be called from lldb, but to not do so in release builds so that they can be
dead-stripped.

There's lots of potential follow-up work suggested in the thread
"Should dump methods be LLVM_ATTRIBUTE_USED only in debug builds?" on cfe-dev,
but everyone seems to agreen on this subset.

Macro name chosen by fair coin toss.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198456 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-03 22:53:37 +00:00
David Peixotto
dace98d805 Fix loop rerolling pass failure with non-consant loop lower bound
The loop rerolling pass was failing with an assertion failure from a
failed cast on loops like this:

  void foo(int *A, int *B, int m, int n) {
    for (int i = m; i < n; i+=4) {
      A[i+0] = B[i+0] * 4;
      A[i+1] = B[i+1] * 4;
      A[i+2] = B[i+2] * 4;
      A[i+3] = B[i+3] * 4;
    }
  }

The code was casting the SCEV-expanded code for the new
induction variable to a phi-node. When the loop had a non-constant
lower bound, the SCEV expander would end the code expansion with an
add insted of a phi node and the cast would fail.

It looks like the cast to a phi node was only needed to get the
induction variable value coming from the backedge to compute the end
of loop condition. This patch changes the loop reroller to compare
the induction variable to the number of times the backedge is taken
instead of the iteration count of the loop. In other words, we stop
the loop when the current value of the induction variable ==
IterationCount-1. Previously, the comparison was comparing the
induction variable value from the next iteration == IterationCount.

This problem only seems to occur on 32-bit targets. For some reason,
the loop is not rerolled on 64-bit targets.

PR18290


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198425 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-03 17:20:01 +00:00
Hal Finkel
ac8ba0c0fd Disable compare sinking in CodeGenPrepare when multiple condition registers are available
As noted in the comment above CodeGenPrepare::OptimizeInst, which aggressively
sinks compares to reduce pressure on the condition register(s), for targets
such as PowerPC with multiple condition registers, this may not be the right
thing to do. This adds an HasMultipleConditionRegisters boolean to TLI, and
CodeGenPrepare::OptimizeInst is skipped when HasMultipleConditionRegisters is
true.

This functionality will be used by the PowerPC backend in an upcoming commit.
Especially when the PowerPC backend starts tracking individual condition
register bits as separate allocatable entities (which will happen in this
upcoming commit), this sinking from CodeGenPrepare::OptimizeInst is
significantly suboptimial.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198354 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-02 21:13:43 +00:00
Andrew Trick
5f8e79e6d2 indvars: cleanup the IV visitor. It does more than gather sext/zext info.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198353 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-02 21:12:11 +00:00
Andrew Trick
fcbe3d9501 indvars: insert truncate at loop boundary to avoid redundant IVs.
When widening an IV to remove s/zext, we generally try to eliminate
the original narrow IV. However, LCSSA phi nodes outside the loop were
still using the original IV. Clean this up more aggressively to avoid
redundancy in generated code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198338 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-02 19:29:38 +00:00
Andrew Trick
c7b0b7dc8f Add support to indvars for optimizing sadd.with.overflow.
Split sadd.with.overflow into add + sadd.with.overflow to allow
analysis and optimization. This should ideally be done after
InstCombine, which can perform code motion (eventually indvars should
run after all canonical instcombines). We want ISEL to recombine the
add and the check, at least on x86.

This is currently under an option for reducing live induction
variables: -liv-reduce. The next step is reducing liveness of IVs that
are live out of the overflow check paths. Once the related
optimizations are fully developed, reviewed and tested, I do expect
this to become default.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197926 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-23 23:31:49 +00:00
Richard Sandiford
166acc9489 Fix Scalarizer insertion point when replacing PHIs with insertelements
If the Scalarizer scalarized a vector PHI but could not scalarize
all uses of it, it would insert a series of insertelements to reconstruct
the vector PHI value from the scalar ones.  The problem was that it would
emit these insertelements immediately after the PHI, even if there were
other PHIs after it.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197909 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-23 14:51:56 +00:00