The commit r225977 uncovered this bug. The problem was that the vectorizer tried to
read the second operand of an already deleted instruction.
The bug didn't show up before r225977 because the freed memory still contained a non-null pointer.
With r225977 deletion of instructions is delayed and the read operand pointer is always null.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227800 91177308-0d34-0410-b5e6-96231b3b80d8
Other than moving code and adding the boilerplate for the new files, the code
being moved is unchanged.
There are a few global functions that are shared with the rest of the
LoopVectorizer. I moved these to the new module as well (emitLoopAnalysis,
stripIntegerCast, replaceSymbolicStrideSCEV) along with the Report class used
by emitLoopAnalysis. There is probably room for further improvement in this
area.
I kept DEBUG_TYPE "loop-vectorize" because it's used as the PassName with
emitOptimizationRemarkAnalysis. This will obviously have to change.
NFC. This is part of the patchset that splits out the memory dependence logic
from LoopVectorizationLegality into a new class LoopAccessAnalysis.
LoopAccessAnalysis will be used by the new Loop Distribution pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227756 91177308-0d34-0410-b5e6-96231b3b80d8
This class needs to remain public because it's used by
LoopVectorizationLegality::addRuntimeCheck.
NFC. This is part of the patchset that splits out the memory dependence logic
from LoopVectorizationLegality into a new class LoopAccessAnalysis.
LoopAccessAnalysis will be used by the new Loop Distribution pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227755 91177308-0d34-0410-b5e6-96231b3b80d8
Rather than using globals use a structure to pass parameters from the
vectorizer. This prepares the class to be moved outside the LoopVectorizer.
It's not great how all this is passed through in LoopAccessAnalysis but this
is all expected to change once the class start servicing the Loop Distribution
pass as well where some of these parameters make no sense.
NFC. This is part of the patchset that splits out the memory dependence logic
from LoopVectorizationLegality into a new class LoopAccessAnalysis.
LoopAccessAnalysis will be used by the new Loop Distribution pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227754 91177308-0d34-0410-b5e6-96231b3b80d8
Move the canVectorizeMemory functionality from LoopVectorizationLegality to a
new class LoopAccessAnalysis and forward users.
Currently the collection of the symbolic stride information is kept with
LoopVectorizationLegality and it becomes an input to LoopAccessAnalysis.
NFC. This is part of the patchset that splits out the memory dependence logic
from LoopVectorizationLegality into a new class LoopAccessAnalysis.
LoopAccessAnalysis will be used by the new Loop Distribution pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227751 91177308-0d34-0410-b5e6-96231b3b80d8
These members are moving to LoopAccessAnalysis. The accessors help to hide
this.
NFC. This is part of the patchset that splits out the memory dependence logic
from LoopVectorizationLegality into a new class LoopAccessAnalysis.
LoopAccessAnalysis will be used by the new Loop Distribution pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227750 91177308-0d34-0410-b5e6-96231b3b80d8
This class will become public in the new LoopAccessAnalysis header so the name
needs to be more global.
NFC. This is part of the patchset that splits out the memory dependence logic
from LoopVectorizationLegality into a new class LoopAccessAnalysis.
LoopAccessAnalysis will be used by the new Loop Distribution pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227749 91177308-0d34-0410-b5e6-96231b3b80d8
The logic in emitAnalysis is duplicated across multiple functions. This
splits it into a function. Another use will be added by the patchset.
NFC. This is part of the patchset that splits out the memory dependence logic
from LoopVectorizationLegality into a new class LoopAccessAnalysis.
LoopAccessAnalysis will be used by the new Loop Distribution pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227748 91177308-0d34-0410-b5e6-96231b3b80d8
RuntimePointerCheck will be used through LoopAccessAnalysis in
LoopVectorizationLegality. Later in the patchset it will become a local class
of LoopAccessAnalysis.
NFC. This is part of the patchset that splits out the memory dependence logic
from LoopVectorizationLegality into a new class LoopAccessAnalysis.
LoopAccessAnalysis will be used by the new Loop Distribution pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227747 91177308-0d34-0410-b5e6-96231b3b80d8
getTTI method used to get an actual TTI object.
No functionality changed. This just threads the argument and ensures
code like the inliner can correctly look up the callee's TTI rather than
using a fixed one.
The next change will use this to implement per-function subtarget usage
by TTI. The changes after that should eliminate the need for FTTI as that
will have become the default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227730 91177308-0d34-0410-b5e6-96231b3b80d8
This should be sufficient to replace the initial (minor) function pass
pipeline in Clang with the new pass manager. I'll probably add an (off
by default) flag to do that just to ensure we can get extra testing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227726 91177308-0d34-0410-b5e6-96231b3b80d8
I've added RUN lines both to the basic test for EarlyCSE and the
target-specific test, as this serves as a nice test that the TTI layer
in the new pass manager is in fact working well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227725 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
CUDA driver can unroll loops when jit-compiling PTX. To prevent CUDA
driver from unrolling a loop marked with llvm.loop.unroll.disable is not
unrolled by CUDA driver, we need to emit .pragma "nounroll" at the
header of that loop.
This patch also extracts getting unroll metadata from loop ID metadata
into a shared helper function.
Test Plan: test/CodeGen/NVPTX/nounroll.ll
Reviewers: eliben, meheff, jholewinski
Reviewed By: jholewinski
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D7041
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227703 91177308-0d34-0410-b5e6-96231b3b80d8
aggregate or scalar, the debug info needs to refer to the absolute offset
(relative to the entire variable) instead of storing the offset inside
the smaller aggregate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227702 91177308-0d34-0410-b5e6-96231b3b80d8
type erased interface and a single analysis pass rather than an
extremely complex analysis group.
The end result is that the TTI analysis can contain a type erased
implementation that supports the polymorphic TTI interface. We can build
one from a target-specific implementation or from a dummy one in the IR.
I've also factored all of the code into "mix-in"-able base classes,
including CRTP base classes to facilitate calling back up to the most
specialized form when delegating horizontally across the surface. These
aren't as clean as I would like and I'm planning to work on cleaning
some of this up, but I wanted to start by putting into the right form.
There are a number of reasons for this change, and this particular
design. The first and foremost reason is that an analysis group is
complete overkill, and the chaining delegation strategy was so opaque,
confusing, and high overhead that TTI was suffering greatly for it.
Several of the TTI functions had failed to be implemented in all places
because of the chaining-based delegation making there be no checking of
this. A few other functions were implemented with incorrect delegation.
The message to me was very clear working on this -- the delegation and
analysis group structure was too confusing to be useful here.
The other reason of course is that this is *much* more natural fit for
the new pass manager. This will lay the ground work for a type-erased
per-function info object that can look up the correct subtarget and even
cache it.
Yet another benefit is that this will significantly simplify the
interaction of the pass managers and the TargetMachine. See the future
work below.
The downside of this change is that it is very, very verbose. I'm going
to work to improve that, but it is somewhat an implementation necessity
in C++ to do type erasure. =/ I discussed this design really extensively
with Eric and Hal prior to going down this path, and afterward showed
them the result. No one was really thrilled with it, but there doesn't
seem to be a substantially better alternative. Using a base class and
virtual method dispatch would make the code much shorter, but as
discussed in the update to the programmer's manual and elsewhere,
a polymorphic interface feels like the more principled approach even if
this is perhaps the least compelling example of it. ;]
Ultimately, there is still a lot more to be done here, but this was the
huge chunk that I couldn't really split things out of because this was
the interface change to TTI. I've tried to minimize all the other parts
of this. The follow up work should include at least:
1) Improving the TargetMachine interface by having it directly return
a TTI object. Because we have a non-pass object with value semantics
and an internal type erasure mechanism, we can narrow the interface
of the TargetMachine to *just* do what we need: build and return
a TTI object that we can then insert into the pass pipeline.
2) Make the TTI object be fully specialized for a particular function.
This will include splitting off a minimal form of it which is
sufficient for the inliner and the old pass manager.
3) Add a new pass manager analysis which produces TTI objects from the
target machine for each function. This may actually be done as part
of #2 in order to use the new analysis to implement #2.
4) Work on narrowing the API between TTI and the targets so that it is
easier to understand and less verbose to type erase.
5) Work on narrowing the API between TTI and its clients so that it is
easier to understand and less verbose to forward.
6) Try to improve the CRTP-based delegation. I feel like this code is
just a bit messy and exacerbating the complexity of implementing
the TTI in each target.
Many thanks to Eric and Hal for their help here. I ended up blocked on
this somewhat more abruptly than I expected, and so I appreciate getting
it sorted out very quickly.
Differential Revision: http://reviews.llvm.org/D7293
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227669 91177308-0d34-0410-b5e6-96231b3b80d8
analyses back into the LTO code generator.
The pass manager builder (and the transforms library in general)
shouldn't be referencing the target machine at all.
This makes the LTO population work like the others -- the data layout
and target transform info need to be pre-populated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227576 91177308-0d34-0410-b5e6-96231b3b80d8
The validation algorithm used an incremental approach, building each
iteration's data structures temporarily, validating them, then
adding them to a global set.
This does not scale well to having multiple sets of Root nodes, as the
set of instructions used in each iteration is the union over all
the root nodes. Therefore, refactor the logic to create a single, simple
container to which later logic then refers. This makes it simpler
control-flow wise to make the creation of the container more complex with
the addition of multiple root sets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227499 91177308-0d34-0410-b5e6-96231b3b80d8
reroll() was slightly monolithic and a pain to modify. Refactor
a bunch of its state from local variables to member variables
of a helper class, and do some trivial simplification while we're
there.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227439 91177308-0d34-0410-b5e6-96231b3b80d8
Patch by: Igor Laevsky <igor@azulsystems.com>
"Currently SplitBlockPredecessors generates incorrect code in case if basic block we are going to split has a landingpad. Also seems like it is fairly common case among it's users to conditionally call either SplitBlockPredecessors or SplitLandingPadPredecessors. Because of this I think it is reasonable to add this condition directly into SplitBlockPredecessors."
Differential Revision: http://reviews.llvm.org/D7157
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227390 91177308-0d34-0410-b5e6-96231b3b80d8
abomination.
For starters, this API is incredibly slow. In order to lookup the name
of a pass it must take a memory fence to acquire a pointer to the
managed static pass registry, and then potentially acquire locks while
it consults this registry for information about what passes exist by
that name. This stops the world of LLVMs in your process no matter
how little they cared about the result.
To make this more joyful, you'll note that we are preserving many passes
which *do not exist* any more, or are not even analyses which one might
wish to have be preserved. This means we do all the work only to say
"nope" with no error to the user.
String-based APIs are a *bad idea*. String-based APIs that cannot
produce any meaningful error are an even worse idea. =/
I have a patch that simply removes this API completely, but I'm hesitant
to commit it as I don't really want to perniciously break out-of-tree
users of the old pass manager. I'd rather they just have to migrate to
the new one at some point. If others disagree and would like me to kill
it with fire, just say the word. =]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227294 91177308-0d34-0410-b5e6-96231b3b80d8
COMDATs must be identically named to the symbol. When support for COMDATs was
introduced, the symbol rewriter was not updated, resulting in rewriting failing
for symbols which were placed into COMDATs. This corrects the behaviour and
adds test cases for this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227261 91177308-0d34-0410-b5e6-96231b3b80d8
This was introduced in a faulty refactoring (r225640, mea culpa):
the tests weren't testing the return values, so, for both
__strcpy_chk and __stpcpy_chk, we would return the end of the
buffer (matching stpcpy) instead of the beginning (for strcpy).
The root cause was the prefix "__" being ignored when comparing,
which made us always pick LibFunc::stpcpy_chk.
Pass the LibFunc::Func directly to avoid this kind of error.
Also, make the testcases as explicit as possible to prevent this.
The now-useful testcases expose another, entangled, stpcpy problem,
with the further simplification. This was introduced in a
refactoring (r225640) to match the original behavior.
However, this leads to problems when successive simplifications
generate several similar instructions, none of which are removed
by the custom replaceAllUsesWith.
For instance, InstCombine (the main user) doesn't erase the
instruction in its custom RAUW. When trying to simplify say
__stpcpy_chk:
- first, an stpcpy is created (fortified simplifier),
- second, a memcpy is created (normal simplifier), but the
stpcpy call isn't removed.
- third, InstCombine later revisits the instructions,
and simplifies the first stpcpy to a memcpy. We now have
two memcpys.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227250 91177308-0d34-0410-b5e6-96231b3b80d8
Splitting a loop to make range checks redundant is profitable only if
the range check "never" fails. Make this fact a part of recognizing a
range check -- a branch is a range check only if it is expected to
pass (via branch_weights metadata).
Differential Revision: http://reviews.llvm.org/D7192
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227249 91177308-0d34-0410-b5e6-96231b3b80d8
If a memory access is unaligned, emit __tsan_unaligned_read/write
callbacks instead of __tsan_read/write.
Required to change semantics of __tsan_unaligned_read/write to not do the user memory.
But since they were unused (other than through __sanitizer_unaligned_load/store) this is fine.
Fixes long standing issue 17:
https://code.google.com/p/thread-sanitizer/issues/detail?id=17
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227231 91177308-0d34-0410-b5e6-96231b3b80d8
This patch teaches the Instruction Combiner how to fold a cttz/ctlz followed by
a icmp plus select into a single cttz/ctlz with flag 'is_zero_undef' cleared.
Added test InstCombine/select-cmp-cttz-ctlz.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227197 91177308-0d34-0410-b5e6-96231b3b80d8
Sanitizer coverage constructor must run after asan constructor (for each DSO).
Bump constructor priority to guarantee that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227195 91177308-0d34-0410-b5e6-96231b3b80d8
LoopRotate wanted to avoid live range interference by looking at the
uses of a Value in the loop latch and seeing if any lied outside of the
loop. We would wrongly perform this operation on Constants.
This fixes PR22337.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227171 91177308-0d34-0410-b5e6-96231b3b80d8
object that manages a single run of this pass.
This was already essentially how it worked. Within the run function, it
would point members at *stack local* allocations that were only live for
a single run. Instead, it seems much cleaner to have a utility object
whose lifetime is clearly bounded by the run of the pass over the
function and can use member variables in a more direct way.
This also makes it easy to plumb the analyses used into it from the pass
and will make it re-usable with the new pass manager.
No functionality changed here, its just a refactoring.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227162 91177308-0d34-0410-b5e6-96231b3b80d8
An unreachable default destination can be exploited by other optimizations and
allows for more efficient lowering. Both the SDag switch lowering and
LowerSwitch can exploit unreachable defaults.
Also make TurnSwitchRangeICmp handle switches with unreachable default.
This is kind of separate change, but it cannot be tested without the change
above, and I don't want to land the change above without this since that would
regress other tests.
Differential Revision: http://reviews.llvm.org/D6471
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227125 91177308-0d34-0410-b5e6-96231b3b80d8
when refactoring for the new pass manager without introducing too many
formatting changes into meaning full diffs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227000 91177308-0d34-0410-b5e6-96231b3b80d8
This just lifts the logic into a static helper function, sinks the
legacy pass to be a trivial wrapper of that helper fuction, and adds
a trivial wrapper for the new PM as well. Not much to see here.
I switched a test case to run in both modes, but we have to strip the
dead prototypes separately as that pass isn't in the new pass manager
(yet).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226999 91177308-0d34-0410-b5e6-96231b3b80d8
changed the IR. This is particularly easy as we can just look for the
existence of any expect intrinsic at all to know whether we've changed
the IR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226998 91177308-0d34-0410-b5e6-96231b3b80d8
for small switches, and avoid using a complex loop to set up the
weights.
We know what the baseline weights will be so we can just resize the
vector to contain all that value and clobber the one slot that is
likely. This seems much more direct than the previous code that tested
at every iteration, and started off by zeroing the vector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226995 91177308-0d34-0410-b5e6-96231b3b80d8
no members for them to use.
Also, make them accept references as there is no possibility of a null
pointer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226994 91177308-0d34-0410-b5e6-96231b3b80d8
It was already in the Scalar header and referenced extensively as being
in this library, the source file was just in the utils directory for
some reason. No actual functionality changed. I noticed as it didn't
make sense to add a pass header to the utils headers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226991 91177308-0d34-0410-b5e6-96231b3b80d8
This is exciting as this is a much more involved port. This is
a complex, existing transformation pass. All of the core logic is shared
between both old and new pass managers. Only the access to the analyses
is separate because the actual techniques are separate. This also uses
a bunch of different and interesting analyses and is the first time
where we need to use an analysis across an IR layer.
This also paves the way to expose instcombine utility functions. I've
got a static function that implements the core pass logic over
a function which might be mildly interesting, but more interesting is
likely exposing a routine which just uses instructions *already in* the
worklist and combines until empty.
I've switched one of my favorite instcombine tests to run with both as
well to make sure this keeps working.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226987 91177308-0d34-0410-b5e6-96231b3b80d8
SimplifyCFG currently does this transformation, but I'm planning to remove that
to allow other passes, such as this one, to exploit the unreachable default.
This patch takes care to keep track of what case values are unreachable even
after the transformation, allowing for more efficient lowering.
Differential Revision: http://reviews.llvm.org/D6697
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226934 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r176827.
Björn Steinbrink pointed out that this didn't actually fix the bug
(PR15555) it was attempting to fix.
With this reverted, we can now remove landingpad cleanups that
immediately resume unwinding, converting the invoke to a call.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226850 91177308-0d34-0410-b5e6-96231b3b80d8
Use the struct instead of a std::pair<Value *, Value *>. This makes a
Range an obviously immutable object, and we can now assert that a
range is well-typed (Begin->getType() == End->getType()) on its
construction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226804 91177308-0d34-0410-b5e6-96231b3b80d8
There are places where the inductive range check elimination pass
depends on two llvm::Values or llvm::SCEVs to be of the same
llvm::Type when they do not need to be. This patch relaxes those
restrictions (by bailing out of the optimization if the types
mismatch), and adds test cases to trigger those paths.
These issues were found by bootstrapping clang with IRCE running in
the -O3 pass ordering.
Differential Revision: http://reviews.llvm.org/D7082
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226793 91177308-0d34-0410-b5e6-96231b3b80d8
Even with the current limit on the number of alias checks, the containing loop has quadratic complexity.
This begins to hurt for blocks containing > 1K load/store instructions.
This commit introduces a limit for the loop count. It reduces the runtime for such very large blocks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226792 91177308-0d34-0410-b5e6-96231b3b80d8
creating a non-internal header file for the InstCombine pass.
I thought about calling this InstCombiner.h or in some way more clearly
associating it with the InstCombiner clas that it is primarily defining,
but there are several other utility interfaces defined within this for
InstCombine. If, in the course of refactoring, those end up moving
elsewhere or going away, it might make more sense to make this the
combiner's header alone.
Naturally, this is a bikeshed to a certain degree, so feel free to lobby
for a different shade of paint if this name just doesn't suit you.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226783 91177308-0d34-0410-b5e6-96231b3b80d8
ever stored to always use a legal integer type if one is available.
Regardless of whether this particular type is good or bad, it ensures we
don't get weird differences in generated code (and resulting
performance) from "equivalent" patterns that happen to end up using
a slightly different type.
After some discussion on llvmdev it seems everyone generally likes this
canonicalization. However, there may be some parts of LLVM that handle
it poorly and need to be fixed. I have at least verified that this
doesn't impede GVN and instcombine's store-to-load forwarding powers in
any obvious cases. Subtle cases are exactly what we need te flush out if
they remain.
Also note that this IR pattern should already be hitting LLVM from Clang
at least because it is exactly the IR which would be produced if you
used memcpy to copy a pointer or floating point between memory instead
of a variable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226781 91177308-0d34-0410-b5e6-96231b3b80d8
function. This is a bit tidier anyways and will make a subsquent patch
simpler as I want to add another case to this combine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226746 91177308-0d34-0410-b5e6-96231b3b80d8
When two calls from the same MDLocation are inlined they currently get
treated as one inlined function call (creating difficulty debugging,
duplicate variables, etc).
Clang worked around this by including column information on inline calls
which doesn't address LTO inlining or calls to the same function from
the same line and column (such as through a macro). It also didn't
address ctor and member function calls.
By making the inlinedAt locations distinct, every call site has an
explicitly distinct location that cannot be coalesced with any other
call.
This can produce linearly (2x in the worst case where every call is
inlined and the call instruction has a non-call instruction at the same
location) more debug locations. Any increase beyond that are in cases
where the Clang workaround was insufficient and the new scheme is
creating necessary distinct nodes that were being erroneously coalesced
previously.
After this change to LLVM the incomplete workarounds in Clang. That
should reduce the number of debug locations (in a build without column
info, the default on Darwin, not the default on Linux) by not creating
pseudo-distinct locations for every call to an inline function.
(oh, and I made the inlined-at chain rebuilding iterative instead of
recursive because I was having trouble wrapping my head around it the
way it was - open to discussion on the right design for that function
(including going back to a recursive solution))
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226736 91177308-0d34-0410-b5e6-96231b3b80d8
The return type of a thunk is meaningless, we just want the arguments
and return value to be forwarded.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226708 91177308-0d34-0410-b5e6-96231b3b80d8
Previously we always stored 4 bytes of origin at the destination address
even for 8-byte (and longer) stores.
This should fix rare missing, or incorrect, origin stacks in MSan reports.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226658 91177308-0d34-0410-b5e6-96231b3b80d8
Because in its primary function pass the combiner is run repeatedly over
the same function until doing so produces no changes, it is essentially
to not re-allocate the worklist. However, as a utility, the more common
pattern would be to put a limited set of instructions in the worklist
rather than the entire function body. That is also the more likely
pattern when used by the new pass manager.
The result is a very light weight combiner that does the visiting with
a separable worklist. This can then be wrapped up in a helper function
for users that want a combiner utility, or as I have here it can be
wrapped up in a pass which manages the iterations used when combining an
entire function's instructions.
Hopefully this removes some of the worst of the interface warts that
became apparant with the last patch here. However, there is clearly more
work. I've again left some FIXMEs for the most egregious. The ones that
stick out to me are the exposure of the worklist and IR builder as
public members, and the use of pointers rather than references. However,
fixing these is likely to be much more mechanical and less interesting
so I didn't want to touch them in this patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226655 91177308-0d34-0410-b5e6-96231b3b80d8
SimplifyLibCalls utility by sinking it into the specific call part of
the combiner.
This will avoid us needing to do any contortions to build this object in
a subsequent refactoring I'm doing and seems generally better factored.
We don't need this utility everywhere and it carries no interesting
state so we might as well build it on demand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226654 91177308-0d34-0410-b5e6-96231b3b80d8
a more direct approach: a type-erased glorified function pointer. Now we
can pass a function pointer into this for the easy case and we can even
pass a lambda into it in the interesting case in the instruction
combiner.
I'll be using this shortly to simplify the interfaces to InstCombiner,
but this helps pave the way and seems like a better design for the
libcall simplifier utility.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226640 91177308-0d34-0410-b5e6-96231b3b80d8
This creates a small internal pass which runs the InstCombiner over
a function. This is the hard part of porting InstCombine to the new pass
manager, as at this point none of the code in InstCombine has access to
a Pass object any longer.
The resulting interface for the InstCombiner is pretty terrible. I'm not
planning on leaving it that way. The key thing missing is that we need
to separate the worklist from the combiner a touch more. Once that's
done, it should be possible for *any* part of LLVM to just create
a worklist with instructions, populate it, and then combine it until
empty. The pass will just be the (obvious and important) special case of
doing that for an entire function body.
For now, this is the first increment of factoring to make all of this
work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226618 91177308-0d34-0410-b5e6-96231b3b80d8
don't get muddied up by formatting changes.
Some of these don't really seem like improvements to me, but they also
don't seem any worse and I care much more about not formatting them
manually than I do about the particular formatting. =]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226610 91177308-0d34-0410-b5e6-96231b3b80d8
This reapplies r225379.
ChangeLog:
- The assertion that this commit previously ran into about the inability
to handle indirect variables has since been removed and the backend
can handle this now.
- Testcases were upgrade to the new MDLocation format.
- Instead of keeping a DebugDeclares map, we now use
llvm::FindAllocaDbgDeclare().
Original commit message follows.
Debug info: Teach SROA how to update debug info for fragmented variables.
This allows us to generate debug info for extremely advanced code such as
typedef struct { long int a; int b;} S;
int foo(S s) {
return s.b;
}
which at -O1 on x86_64 is codegen'd into
define i32 @foo(i64 %s.coerce0, i32 %s.coerce1) #0 {
ret i32 %s.coerce1, !dbg !24
}
with this patch we emit the following debug info for this
TAG_formal_parameter [3]
AT_location( 0x00000000
0x0000000000000000 - 0x0000000000000006: rdi, piece 0x00000008, rsi, piece 0x00000004
0x0000000000000006 - 0x0000000000000008: rdi, piece 0x00000008, rax, piece 0x00000004 )
AT_name( "s" )
AT_decl_file( "/Volumes/Data/llvm/_build.ninja.release/test.c" )
Thanks to chandlerc, dblaikie, and echristo for their feedback on all
previous iterations of this patch!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226598 91177308-0d34-0410-b5e6-96231b3b80d8
The new code does not create new basic blocks in the case when shadow is a
compile-time constant; it generates either an unconditional __msan_warning
call or nothing instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226569 91177308-0d34-0410-b5e6-96231b3b80d8
along with the other analyses.
The most obvious reason why is because eventually I need to separate out
the pass layer from the rest of the instcombiner. However, it is also
probably a compile time win as every query through the pass manager
layer is pretty slow these days.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226550 91177308-0d34-0410-b5e6-96231b3b80d8
This patch fixes 2 issues in reorderInputsAccordingToOpcode
1) AllSameOpcodeLeft and AllSameOpcodeRight was being calculated incorrectly resulting in code not being vectorized in few cases.
2) Adds logic to reorder operands if we get longer chain of consecutive loads enabling vectorization. Handled the same for cases were we have AltOpcode.
Thanks Michael for inputs and review.
Review: http://reviews.llvm.org/D6677
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226547 91177308-0d34-0410-b5e6-96231b3b80d8
Now that the clone methods used by `MapMetadata()` don't do any
remapping (and return a temporary), they make more sense as member
functions on `MDNode` (and subclasses).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226541 91177308-0d34-0410-b5e6-96231b3b80d8
a DominatorTree argument as that is the analysis that it wants to
update.
This removes the last non-loop utility function in Utils/ which accepts
a raw Pass argument.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226537 91177308-0d34-0410-b5e6-96231b3b80d8
As part of PR22235, introduce `DwarfNode` and `GenericDwarfNode`. The
former is a metadata node with a DWARF tag. The latter matches our
current (generic) schema of a header with string (and stringified
integer) data and an arbitrary number of operands.
This doesn't move it into place yet; that change will require a large
number of testcase updates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226529 91177308-0d34-0410-b5e6-96231b3b80d8
As pointed out in r226501, the distinction between `MDNode` and
`UniquableMDNode` is confusing. When we need subclasses of `MDNode`
that don't use all its functionality it might make sense to break it
apart again, but until then this makes the code clearer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226520 91177308-0d34-0410-b5e6-96231b3b80d8
Take advantage of the new ability of temporary nodes to mutate to
distinct and uniqued nodes to greatly simplify the `MapMetadata()`
helper functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226511 91177308-0d34-0410-b5e6-96231b3b80d8
Change `MDTuple::getTemporary()` and `MDLocation::getTemporary()` to
return (effectively) `std::unique_ptr<T, MDNode::deleteTemporary>`, and
clean up call sites. (For now, `DIBuilder` call sites just call
`release()` immediately.)
There's an accompanying change in each of clang and polly to use the new
API.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226504 91177308-0d34-0410-b5e6-96231b3b80d8
Remove `MDNodeFwdDecl` (as promised in r226481). Aside from API
changes, there's no real functionality change here.
`MDNode::getTemporary()` now forwards to `MDTuple::getTemporary()`,
which returns a tuple with `isTemporary()` equal to true.
The main point is that we can now add temporaries of other `MDNode`
subclasses, needed for PR22235 (I introduced `MDNodeFwdDecl` in the
first place because I didn't recognize this need, and thought they were
only needed to handle forward references).
A few things left out of (or highlighted by) this commit:
- I've had to remove the (few) uses of `std::unique_ptr<>` to deal
with temporaries, since the destructor is no longer public.
`getTemporary()` should probably return the equivalent of
`std::unique_ptr<T, MDNode::deleteTemporary>`.
- `MDLocation::getTemporary()` doesn't exist yet (worse, it actually
does exist, but does the wrong thing: `MDNode::getTemporary()` is
inherited and returns an `MDTuple`).
- `MDNode` now only has one subclass, `UniquableMDNode`, and the
distinction between them is actually somewhat confusing.
I'll fix those up next.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226501 91177308-0d34-0410-b5e6-96231b3b80d8
Change `MDNode::isDistinct()` to only apply to 'distinct' nodes (not
temporaries), and introduce `MDNode::isUniqued()` and
`MDNode::isTemporary()` for the other two possibilities.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226482 91177308-0d34-0410-b5e6-96231b3b80d8
and updated.
This may appear to remove handling for things like alias analysis when
splitting critical edges here, but in fact no callers of SplitEdge
relied on this. Similarly, all of them wanted to preserve LCSSA if there
was any update of the loop info. That makes the interface much simpler.
With this, all of BasicBlockUtils.h is free of Pass arguments and
prepared for the new pass manager. This is tho majority of utilities
that relied on pass arguments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226459 91177308-0d34-0410-b5e6-96231b3b80d8
while refactoring this API for the new pass manager.
No functionality changed here, the code didn't actually support this
option.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226457 91177308-0d34-0410-b5e6-96231b3b80d8
APIs and replace it and numerous booleans with an option struct.
The critical edge splitting API has a really large surface of flags and
so it seems worth burning a small option struct / builder. This struct
can be constructed with the various preserved analyses and then flags
can be flipped in a builder style.
The various users are now responsible for directly passing along their
analysis information. This should be enough for the critical edge
splitting to work cleanly with the new pass manager as well.
This API is still pretty crufty and could be cleaned up a lot, but I've
focused on this change just threading an option struct rather than
a pass through the API.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226456 91177308-0d34-0410-b5e6-96231b3b80d8
we can while splitting critical edges.
The only code which called this and didn't require simplified loops to
be preserved is polly, and the code behaves correctly there anyways.
Without this change, it becomes really hard to share this code with the
new pass manager where things like preserving loop simplify form don't
make any sense.
If anyone discovers this code behaving incorrectly, what it *should* be
testing for is whether the loops it needs to be in simplified form are
in fact in that form. It should always be trying to preserve that form
when it exists.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226443 91177308-0d34-0410-b5e6-96231b3b80d8
In case of blocks with many memory-accessing instructions, alias checking can take lot of time
(because calculating the memory dependencies has quadratic complexity).
I chose a limit which resulted in no changes when running the benchmarks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226439 91177308-0d34-0410-b5e6-96231b3b80d8
SplitLandingPadPredecessors and remove the Pass argument from its
interface.
Another step to the utilities being usable with both old and new pass
managers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226426 91177308-0d34-0410-b5e6-96231b3b80d8
rather than relying on the pass object.
This one is a bit annoying, but will pay off. First, supporting this one
will make the next one much easier, and for utilities like LoopSimplify,
this is moving them (slowly) closer to not having to pass the pass
object around throughout their APIs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226396 91177308-0d34-0410-b5e6-96231b3b80d8
interface, removing Pass from its interface.
This also makes those analyses optional so that passes which don't even
preserve these (or use them) can skip the logic entirely.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226394 91177308-0d34-0410-b5e6-96231b3b80d8
optionally updated by MergeBlockIntoPredecessors.
No functionality changed, just refactoring to clear the way for the new
pass manager.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226392 91177308-0d34-0410-b5e6-96231b3b80d8
Instead of querying the pass every where we need to, do that once and
cache a pointer in the pass object. This is both simpler and I'm about
to add yet another place where we need to dig out that pointer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226391 91177308-0d34-0410-b5e6-96231b3b80d8
accepting a Pass and querying it for analyses.
This is necessary to allow the utilities to work both with the old and
new pass managers, and I also think this makes the interface much more
clear and helps the reader know what analyses the utility can actually
handle. I plan to repeat this process iteratively to clean up all the
pass utilities.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226386 91177308-0d34-0410-b5e6-96231b3b80d8
cleaner to derive from the generic base.
Thise removes a ton of boiler plate code and somewhat strange and
pointless indirections. It also remove a bunch of the previously needed
friend declarations. To fully remove these, I also lifted the verify
logic into the generic LoopInfoBase, which seems good anyways -- it is
generic and useful logic even for the machine side.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226385 91177308-0d34-0410-b5e6-96231b3b80d8
This was dead even before I refactored how we initialized it, but my
refactoring made it trivially dead and it is now caught by a Clang
warning. This fixes the warning and should clean up the -Werror bot
failures (sorry!).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226376 91177308-0d34-0410-b5e6-96231b3b80d8
a LoopInfoWrapperPass to wire the object up to the legacy pass manager.
This switches all the clients of LoopInfo over and paves the way to port
LoopInfo to the new pass manager. No functionality change is intended
with this iteration.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226373 91177308-0d34-0410-b5e6-96231b3b80d8
IRCE eliminates range checks of the form
0 <= A * I + B < Length
by splitting a loop's iteration space into three segments in a way
that the check is completely redundant in the middle segment. As an
example, IRCE will convert
len = < known positive >
for (i = 0; i < n; i++) {
if (0 <= i && i < len) {
do_something();
} else {
throw_out_of_bounds();
}
}
to
len = < known positive >
limit = smin(n, len)
// no first segment
for (i = 0; i < limit; i++) {
if (0 <= i && i < len) { // this check is fully redundant
do_something();
} else {
throw_out_of_bounds();
}
}
for (i = limit; i < n; i++) {
if (0 <= i && i < len) {
do_something();
} else {
throw_out_of_bounds();
}
}
IRCE can deal with multiple range checks in the same loop (it takes
the intersection of the ranges that will make each of them redundant
individually).
Currently IRCE does not do any profitability analysis. That is a
TODO.
Please note that the status of this pass is *experimental*, and it is
not part of any default pass pipeline. Having said that, I will love
to get feedback and general input from people interested in trying
this out.
This pass was originally r226201. It was reverted because it used C++
features not supported by MSVC 2012.
Differential Revision: http://reviews.llvm.org/D6693
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226238 91177308-0d34-0410-b5e6-96231b3b80d8
The change used C++11 features not supported by MSVC 2012. I will fix
the change to use things supported MSVC 2012 and recommit shortly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226216 91177308-0d34-0410-b5e6-96231b3b80d8
IRCE eliminates range checks of the form
0 <= A * I + B < Length
by splitting a loop's iteration space into three segments in a way
that the check is completely redundant in the middle segment. As an
example, IRCE will convert
len = < known positive >
for (i = 0; i < n; i++) {
if (0 <= i && i < len) {
do_something();
} else {
throw_out_of_bounds();
}
}
to
len = < known positive >
limit = smin(n, len)
// no first segment
for (i = 0; i < limit; i++) {
if (0 <= i && i < len) { // this check is fully redundant
do_something();
} else {
throw_out_of_bounds();
}
}
for (i = limit; i < n; i++) {
if (0 <= i && i < len) {
do_something();
} else {
throw_out_of_bounds();
}
}
IRCE can deal with multiple range checks in the same loop (it takes
the intersection of the ranges that will make each of them redundant
individually).
Currently IRCE does not do any profitability analysis. That is a
TODO.
Please note that the status of this pass is *experimental*, and it is
not part of any default pass pipeline. Having said that, I will love
to get feedback and general input from people interested in trying
this out.
Differential Revision: http://reviews.llvm.org/D6693
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226201 91177308-0d34-0410-b5e6-96231b3b80d8
This patch was generated by a clang tidy checker that is being open sourced.
The documentation of that checker is the following:
/// The emptiness of a container should be checked using the empty method
/// instead of the size method. It is not guaranteed that size is a
/// constant-time function, and it is generally more efficient and also shows
/// clearer intent to use empty. Furthermore some containers may implement the
/// empty method but not implement the size method. Using empty whenever
/// possible makes it easier to switch to another container in the future.
Patch by Gábor Horváth!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226161 91177308-0d34-0410-b5e6-96231b3b80d8
The pass is really just a means of accessing a cached instance of the
TargetLibraryInfo object, and this way we can re-use that object for the
new pass manager as its result.
Lots of delta, but nothing interesting happening here. This is the
common pattern that is developing to allow analyses to live in both the
old and new pass manager -- a wrapper pass in the old pass manager
emulates the separation intrinsic to the new pass manager between the
result and pass for analyses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226157 91177308-0d34-0410-b5e6-96231b3b80d8
While the term "Target" is in the name, it doesn't really have to do
with the LLVM Target library -- this isn't an abstraction which LLVM
targets generally need to implement or extend. It has much more to do
with modeling the various runtime libraries on different OSes and with
different runtime environments. The "target" in this sense is the more
general sense of a target of cross compilation.
This is in preparation for porting this analysis to the new pass
manager.
No functionality changed, and updates inbound for Clang and Polly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226078 91177308-0d34-0410-b5e6-96231b3b80d8
The bug was introduced in r225282. r225282 assumed that sub X, Y is
the same as add X, -Y. This is not correct if we are going to upgrade
the sub to sub nuw. This change fixes the issue by making the
optimization ignore sub instructions.
Differential Revision: http://reviews.llvm.org/D6979
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226075 91177308-0d34-0410-b5e6-96231b3b80d8
This speeds up the dependency calculations for blocks with many load/store/call instructions.
Beside the improved runtime, there is no functional change.
Compared to the original commit, this re-applied commit contains a bug fix which ensures that there are
no incorrect collisions in the alias cache.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225977 91177308-0d34-0410-b5e6-96231b3b80d8
I.E. more than two -> exactly two
Fix a typo function name in LoopVectorize.
I.E. collectStrideAcccess() -> collectStrideAccess()
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225935 91177308-0d34-0410-b5e6-96231b3b80d8
Although this makes the `cast<>` assert more often, the
`assert(Node->isResolved())` on the following line would assert in all
those cases. So, no functionality change here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225903 91177308-0d34-0410-b5e6-96231b3b80d8
It turns out, all callsites of the simplifier are guarded by a check for
CallInst::getCalledFunction (i.e., to make sure the callee is direct).
This check wasn't done when trying to further optimize a simplified fortified
libcall, introduced by a refactoring in r225640.
Fix that, add a testcase, and document the requirement.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225895 91177308-0d34-0410-b5e6-96231b3b80d8
The issue was introduced in r214638:
+ for (auto &BSIter : BlocksSchedules) {
+ scheduleBlock(BSIter.second.get());
+ }
Because BlocksSchedules is a DenseMap with BasicBlock* keys, blocks are
scheduled in non-deterministic order, resulting in unpredictable IR.
Patch by Daniel Reynaud!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225821 91177308-0d34-0410-b5e6-96231b3b80d8
The alias cache has a problem of incorrect collisions in case a new instruction is allocated at the same address as a previously deleted instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225790 91177308-0d34-0410-b5e6-96231b3b80d8
This speeds up the dependency calculations for blocks with many load/store/call instructions.
Beside the improved runtime, there is no functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225786 91177308-0d34-0410-b5e6-96231b3b80d8