DataLayout is no longer optional. It was initialized with or without
a DataLayout, and the DataLayout when supplied could have been the
one from the TargetMachine.
Summary:
This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.
Reviewers: echristo
Subscribers: jholewinski, llvm-commits, rafael, yaron.keren
Differential Revision: http://reviews.llvm.org/D11021
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241774 91177308-0d34-0410-b5e6-96231b3b80d8
Fix some places where the word consecutive is used but the code really
means constant-stride (i.e. not just unit stride).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241763 91177308-0d34-0410-b5e6-96231b3b80d8
This commit ([LAA] Fix estimation of number of memchecks) regressed the
logic a bit. We shouldn't quit the analysis if we encounter a pointer
without known bounds *unless* we actually need to emit a memcheck for
it.
The original code was using NumComparisons which is now computed
differently. Instead I compute NeedRTCheck from NumReadPtrChecks and
NumWritePtrChecks.
As side note, I find the separation of NeedRTCheck and CanDoRT
confusing, so I will try to merge them in a follow-up patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241756 91177308-0d34-0410-b5e6-96231b3b80d8
r239285 ([LoopAccessAnalysis] Teach LAA to check the memory dependence
between strided accesses.) introduced a new case under
MemoryDepChecker::isDependent. We normally have debug output for each
case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241707 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Often filter-like loops will do memory accesses that are
separated by constant offsets. In these cases it is
common that we will exceed the threshold for the
allowable number of checks.
However, it should be possible to merge such checks,
sice a check of any interval againt two other intervals separated
by a constant offset (a,b), (a+c, b+c) will be equivalent with
a check againt (a, b+c), as long as (a,b) and (a+c, b+c) overlap.
Assuming the loop will be executed for a sufficient number of
iterations, this will be true. If not true, checking against
(a, b+c) is still safe (although not equivalent).
As long as there are no dependencies between two accesses,
we can merge their checks into a single one. We use this
technique to construct groups of accesses, and then check
the intervals associated with the groups instead of
checking the accesses directly.
Reviewers: anemet
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10386
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241673 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Initially, these intrinsics seemed like part of a family of "frame"
related intrinsics, but now I think that's more confusing than helpful.
Initially, the LangRef specified that this would create a new kind of
allocation that would be allocated at a fixed offset from the frame
pointer (EBP/RBP). We ended up dropping that design, and leaving the
stack frame layout alone.
These intrinsics are really about sharing local stack allocations, not
frame pointers. I intend to go further and add an `llvm.localaddress()`
intrinsic that returns whatever register (EBP, ESI, ESP, RBX) is being
used to address locals, which should not be confused with the frame
pointer.
Naming suggestions at this point are welcome, I'm happy to re-run sed.
Reviewers: majnemer, nicholas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11011
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241633 91177308-0d34-0410-b5e6-96231b3b80d8
From the linker's perspective, an available_externally global is equivalent
to an external declaration (per isDeclarationForLinker()), so it is incorrect
to consider it to be a weak definition.
Also clean up some logic in the dead argument elimination pass and clarify
its comments to better explain how its behavior depends on linkage,
introduce GlobalValue::isStrongDefinitionForLinker() and start using
it throughout the optimizers and backend.
Differential Revision: http://reviews.llvm.org/D10941
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241413 91177308-0d34-0410-b5e6-96231b3b80d8
The expressions we delinearize do not necessarily have to have a SCEVAddRecExpr
at the outermost level. At this moment, the additional flexibility is not
exploited in LLVM itself, but in Polly we will soon soonish use this
functionality. For LLVM, this change should not affect existing functionality
(which is covered by test/Analysis/Delinearization/)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240952 91177308-0d34-0410-b5e6-96231b3b80d8
If we have a caller that knows a particular argument can never be null, we can exploit this fact while simplifying values in the inline cost analysis. This has the effect of reducing the cost for inlining when a null check is present in the callee, but the value is known non null in the caller. In particular, any dependent control flow can be discounted from the cost estimate.
Note that we use the parameter attributes at the call site to memoize the analysis within the caller's code. The setting of this attribute is done in InstCombine, the inline cost analysis just consumes it. This is intentional and important because we want the inline cost analysis results to be easily cachable themselves. We're not currently doing so, but initial results on LTO indicate this will quickly become important.
Differential Revision: http://reviews.llvm.org/D9129
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240828 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Scalar evolution does not propagate the non-wrapping flags to values
that are derived from a non-wrapping induction variable because
the non-wrapping property could be flow-sensitive.
This change is a first attempt to establish the non-wrapping property in
some simple cases. The main idea is to look through the operations
defining the pointer. As long as we arrive to a non-wrapping AddRec via
a small chain of non-wrapping instruction, the pointer should not wrap
either.
I believe that this essentially is what Andy described in
http://article.gmane.org/gmane.comp.compilers.llvm.cvs/220731 as the way
forward.
Reviewers: aschwaighofer, nadav, sanjoy, atrick
Reviewed By: atrick
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10472
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240798 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Because LSR happens at a late stage where mul of a power of 2 is
typically canonicalized to shl, this canonicalization emits code that
can be better CSE'ed.
Test Plan:
Transforms/LoopStrengthReduce/shl.ll shows how this change makes GVN more
powerful. Fixes some existing tests due to this change.
Reviewers: sanjoy, majnemer, atrick
Reviewed By: majnemer, atrick
Subscribers: majnemer, llvm-commits
Differential Revision: http://reviews.llvm.org/D10448
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240573 91177308-0d34-0410-b5e6-96231b3b80d8
CaptureTracking becomes very expensive in large basic blocks while
calling PointerMayBeCaptured. PointerMayBeCaptured scans the BB the
number of times equal to the number of uses of 'BeforeHere', which is
currently capped at 20 and bails out with Tracker->tooManyUses().
The bottleneck here is the number of calls to PointerMayBeCaptured * the
basic block scan. In a testcase with a 82k instruction BB,
PointerMayBeCaptured is called 130k times, leading to 'shouldExplore'
taking 527k runs, this currently takes ~12min.
To fix this we locally (within PointerMayBeCaptured) number the
instructions in the basic block using a DenseMap to cache instruction
positions/numbers. We build the cache incrementally every time we need
to scan an unexplored part of the BB, improving compile time to only
take ~2min.
This triggers in the flow: DeadStoreElimination -> MepDepAnalysis ->
CaptureTracking.
Side note: after multiple runs in the test-suite I've seen no
performance nor compile time regressions, but could note a couple of
compile time improvements:
Performance Improvements - Compile Time Delta Previous Current StdDev
SingleSource/Benchmarks/Misc-C++/bigfib -4.48% 0.8547 0.8164 0.0022
MultiSource/Benchmarks/TSVC/LoopRerolling-dbl/LoopRerolling-dbl -1.47% 1.3912 1.3707 0.0056
Differential Revision: http://reviews.llvm.org/D7010
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240560 91177308-0d34-0410-b5e6-96231b3b80d8
This will allow classes to implement the AA interface without deriving
from the class or referencing an internal enum of some other class as
their return types.
Also, to a pretty fundamental extent, concepts such as 'NoAlias',
'MayAlias', and 'MustAlias' are first class concepts in LLVM and we
aren't saving anything by scoping them heavily.
My mild preference would have been to use a scoped enum, but that
feature is essentially completely broken AFAICT. I'm extremely
disappointed. For example, we cannot through any reasonable[1] means
construct an enum class (or analog) which has scoped names but converts
to a boolean in order to test for the possibility of aliasing.
[1]: Richard Smith came up with a "solution", but it requires class
templates, and lots of boilerplate setting up the enumeration multiple
times. Something like Boost.PP could potentially bundle this up, but
even that would be quite painful and it doesn't seem realistically worth
it. The enum class solution would probably work without the need for
a bool conversion.
Differential Revision: http://reviews.llvm.org/D10495
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240255 91177308-0d34-0410-b5e6-96231b3b80d8
accurately describe what is being tracked.
While these two enums do track mod/ref information and aliasing
information, they don't represent the exact same things as either the
mod/ref enums or the alias result enum in AA. They're definitions are
dominated by the structure of their lattice and the bit's various
semantics. This patch just calls them what they are and tries to spell
out usefully distinct names for these things.
This will clear the path for using a raw unscoped enum to represent some
of these concepts across LLVM's analysis library.
No functionality changed here.
Differential Revision: http://reviews.llvm.org/D10494
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240254 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Since FunctionMap has llvm::Function pointers as keys, the order in
which the traversal happens can differ from run to run, causing spurious
FileCheck failures. Have CallGraph::print sort the CallGraphNodes by
name before printing them.
Reviewers: bogner, chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10575
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240191 91177308-0d34-0410-b5e6-96231b3b80d8
The patch is generated using this command:
tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
-checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
llvm/lib/
Thanks to Eugene Kosov for the original patch!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240137 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Currently intrinsics don't affect the creation of the call graph.
This is not accurate with respect to statepoint and patchpoint
intrinsics -- these do call (or invoke) LLVM level functions.
This change fixes this inconsistency by adding a call to the external
node for call sites that call these non-leaf intrinsics. This coupled
with the fact that these intrinsics also escape the function pointer
they call gives us a conservatively correct call graph.
Reviewers: reames, chandlerc, atrick, pgavlin
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10526
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240039 91177308-0d34-0410-b5e6-96231b3b80d8
The personality routine currently lives in the LandingPadInst.
This isn't desirable because:
- All LandingPadInsts in the same function must have the same
personality routine. This means that each LandingPadInst beyond the
first has an operand which produces no additional information.
- There is ongoing work to introduce EH IR constructs other than
LandingPadInst. Moving the personality routine off of any one
particular Instruction and onto the parent function seems a lot better
than have N different places a personality function can sneak onto an
exceptional function.
Differential Revision: http://reviews.llvm.org/D10429
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239940 91177308-0d34-0410-b5e6-96231b3b80d8
names for counts with the word 'Count' to make them less ambiguous.
This will be an actual error if we use unscoped enums for any of these,
and generally this seems much clearer to read.
Also, use clang-format to normalize the formatting of this code which
seems to have been needlessly odd.
No functionality changed here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239887 91177308-0d34-0410-b5e6-96231b3b80d8
This is now living in MemoryLocation, which is what it pertains to. It
is also an enum there rather than a static data member which is left
never defined.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239886 91177308-0d34-0410-b5e6-96231b3b80d8
that it is its own entity in the form of MemoryLocation, and update all
the callers.
This is an entirely mechanical change. References to "Location" within
AA subclases become "MemoryLocation", and elsewhere
"AliasAnalysis::Location" becomes "MemoryLocation". Hope that helps
out-of-tree folks update.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239885 91177308-0d34-0410-b5e6-96231b3b80d8
virtual interface on AliasAnalysis only deals with ModRef information.
This interface was both computing memory locations by using TLI and
other tricks to estimate the size of memory referenced by an operand,
and computing ModRef information through similar investigations. This
change narrows the scope of the virtual interface on AliasAnalysis
slightly.
Note that all of this code could live in BasicAA, and be done with
a single investigation of the argument, if it weren't for the fact that
the generic code in AliasAnalysis::getModRefBehavior for a callsite
calls into the virtual aspect of (now) getArgModRefInfo. But this
patch's arrangement seems a not terrible way to go for now.
The other interesting wrinkle is how we could reasonably extend LLVM
with support for custom memory location sizes and mod/ref behavior for
library routines. After discussions with Hal on the review, the
conclusion is that this would be best done by fleshing out the much
desired support for extensions to TLI, and support these types of
queries in that interface where we would likely be doing other library
API recognition and analysis.
Differential Revision: http://reviews.llvm.org/D10259
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239884 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When propagating mass through irregular loops, the mass flowing through
each loop header may not be equal. This was causing wrong frequencies
to be computed for irregular loop headers.
Fixed by keeping track of masses flowing through each of the headers in
an irregular loop. To do this, we now keep track of per-header backedge
weights. After the loop mass is distributed through the loop, the
backedge weights are used to re-distribute the loop mass to the loop
headers.
Since each backedge will have a mass proportional to the different
branch weights, the loop headers will end up with a more approximate
weight distribution (as opposed to the current distribution that assumes
that every loop header is the same).
Reviewers: dexonsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10348
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239843 91177308-0d34-0410-b5e6-96231b3b80d8
Any combination of +-inf/+-inf is NaN so it's already ignored with
nnan and we can skip checking for ninf. Also rephrase logic in comments
a bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239821 91177308-0d34-0410-b5e6-96231b3b80d8
This change is hopefully NFC. The only tricky part is that I changed the context instruction being used to the branch rather than the comparison. I believe both to be correct, but the branch is strictly more powerful. With the moved code, using the branch instruction is required for the basic block comparison test to return the same result. The previous code was able to directly access both the branch and the comparison where the revised code is not.
Differential Revision: http://reviews.llvm.org/D9652
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239797 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
ValueTracking used to overwrite the analysis results computed from
assumes and dominating conditions. This patch fixes this issue.
Test Plan: test/Analysis/ValueTracking/assume.ll
Reviewers: hfinkel, majnemer
Reviewed By: majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10283
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239718 91177308-0d34-0410-b5e6-96231b3b80d8
The CFLAA code currently calls ConstantExpr::getAsInstruction which creates an instruction from a constant expr.
We then pass that instruction to the InstVisitor to analyze it.
Its not necessary to create these instructions as we can just cast from Constant to Operator in the visitor. This is how other InstVisitor’s such as SelectionDAGBuilder handle ConstantExpr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239616 91177308-0d34-0410-b5e6-96231b3b80d8
Determining proper debug locations for instructions created in
PHITransAddr is tricky. We use a simple approach here and simply copy
debug locations from instructions computing load address to
"corresponding" instructions re-creating the address computation
in predecessor basic blocks.
This may not always be correct, given all the rearrangement and
simplification going on, and debug locations may jump around a lot,
as the basic blocks we copy locations between may be very far from
each other.
Still, this would work good in most simple cases (e.g. when chain
of address computing instruction is short, or our mapping turns out
to be 1-to-1), and we desire to have *some* reasonable debug locations
associated with newly inserted instructions.
See http://reviews.llvm.org/D10351 review thread for more details.
Test Plan: regression test suite
Reviewers: spatel, dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10351
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239479 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We need to add a runtime memcheck for pair of accesses (x,y) where at least one of x and y
are writes.
Assuming we have w writes and r reads, currently this number is estimated as being
w* (w+r-1). This estimation will count (write,write) pairs twice and will overestimate
the number of checks required.
This change adds a getNumberOfChecks method to RuntimePointerCheck, which
will count the number of runtime checks needed (similar in implementation to
needsAnyChecking) and uses it to produce the correct number of runtime checks.
Test Plan:
llvm test suite
spec2k
spec2k6
Performance results: no changes observed (not surprising since the formula for 1 writer is basically the same, which would covers most cases - at least with the current check limit).
Reviewers: anemet
Reviewed By: anemet
Subscribers: mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D10217
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239295 91177308-0d34-0410-b5e6-96231b3b80d8
Interleaved memory accesses are grouped and vectorized into vector load/store and shufflevector.
E.g. for (i = 0; i < N; i+=2) {
a = A[i]; // load of even element
b = A[i+1]; // load of odd element
... // operations on a, b, c, d
A[i] = c; // store of even element
A[i+1] = d; // store of odd element
}
The loads of even and odd elements are identified as an interleave load group, which will be transfered into vectorized IRs like:
%wide.vec = load <8 x i32>, <8 x i32>* %ptr
%vec.even = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
%vec.odd = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 1, i32 3, i32 5, i32 7>
The stores of even and odd elements are identified as an interleave store group, which will be transfered into vectorized IRs like:
%interleaved.vec = shufflevector <4 x i32> %vec.even, %vec.odd, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7>
store <8 x i32> %interleaved.vec, <8 x i32>* %ptr
This optimization is currently disabled by defaut. To try it by adding '-enable-interleaved-mem-accesses=true'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239291 91177308-0d34-0410-b5e6-96231b3b80d8
There were several SelectInst combines that always returned an existing
instruction instead of modifying an old one or creating a new one.
These are prime candidates for moving to InstSimplify.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239229 91177308-0d34-0410-b5e6-96231b3b80d8
port it to the new pass manager.
All this does is extract the inner "location" class used by AA into its
own full fledged type. This seems *much* cleaner as MemoryDependence and
soon MemorySSA also use this heavily, and it doesn't make much sense
being inside the AA infrastructure.
This will also make it much easier to break apart the AA infrastructure
into something that stands on its own rather than using the analysis
group design.
There are a few places where this makes APIs not make sense -- they were
taking an AliasAnalysis pointer just to build locations. I'll try to
clean those up in follow-up commits.
Differential Revision: http://reviews.llvm.org/D10228
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239003 91177308-0d34-0410-b5e6-96231b3b80d8
Unreachable values may use themselves in strange ways due to their
dominance property. Attempting to translate through them can lead to
infinite recursion, crashing LLVM. Instead, claim that we weren't able
to translate the value.
This fixes PR23096.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238702 91177308-0d34-0410-b5e6-96231b3b80d8