Specifically, if a pointer accesses different underlying objects in each
iteration, don't look through the phi node defining the pointer.
The motivating case is the underlyling-objects-2.ll testcase. Consider
the loop nest:
int **A;
for (i)
for (j)
A[i][j] = A[i-1][j] * B[j]
This loop is transformed by Load-PRE to stash away A[i] for the next
iteration of the outer loop:
Curr = A[0]; // Prev_0
for (i: 1..N) {
Prev = Curr; // Prev = PHI (Prev_0, Curr)
Curr = A[i];
for (j: 0..N)
Curr[j] = Prev[j] * B[j]
}
Since A[i] and A[i-1] are likely to be independent pointers,
getUnderlyingObjects should not assume that Curr and Prev share the same
underlying object in the inner loop.
If it did we would try to dependence-analyze Curr and Prev and the
analysis of the corresponding SCEVs would fail with non-constant
distance.
To fix this, the getUnderlyingObjects API is extended with an optional
LoopInfo parameter. This is effectively what controls whether we want
the above behavior or the original. Currently, I only changed to use
this approach for LoopAccessAnalysis.
The other testcase is to guard the opposite case where we do want to
look through the loop PHI. If we step through an array by incrementing
a pointer, the underlying object is the incoming value of the phi as the
loop is entered.
Fixes rdar://problem/19566729
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235634 91177308-0d34-0410-b5e6-96231b3b80d8
Move isDereferenceablePointer function to Analysis. This function recursively tracks dereferencability over a chain of values like other functions in ValueTracking.
This refactoring is motivated by further changes to support dereferenceable_or_null attribute (http://reviews.llvm.org/D8650). isDereferenceablePointer will be extended to perform context-sensitive analysis and IR is not a good place to have such functionality.
Patch by: Artur Pilipenko <apilipenko@azulsystems.com>
Differential Revision: reviews.llvm.org/D9075
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235611 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
MemorySSA uses this algorithm as well, and this enables us to reuse the code in both places.
There are no actual algorithm or datastructure changes in here, just code movement.
Reviewers: qcolombet, chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9118
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235406 91177308-0d34-0410-b5e6-96231b3b80d8
Bring function documentation for ScalarEvolutionExpander up to code by
not repeating the function name in the comment documenting
functionality. Reflow the edited comments where needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234847 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Runtime unrolling of loops needs to emit an expression to compute the
loop's runtime trip-count. Avoid runtime unrolling if this computation
will be expensive.
Depends on D8993.
Reviewers: atrick
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8994
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234846 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Move isHighCostExpansion from IndVarSimplify to SCEVExpander. This
exposed function will be used in a subsequent change.
Reviewers: bogner, atrick
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8995
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234844 91177308-0d34-0410-b5e6-96231b3b80d8
The patch is generated using clang-tidy misc-use-override check.
This command was used:
tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py \
-checks='-*,misc-use-override' -header-filter='llvm|clang' \
-j=32 -fix -format
http://reviews.llvm.org/D8925
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234679 91177308-0d34-0410-b5e6-96231b3b80d8
CallSite roughly behaves as a common base CallInst and InvokeInst. Bring
the behavior closer to that model by making upcasts explicit. Downcasts
remain implicit and work as before.
Following dyn_cast as a mental model checking whether a Value *V isa
CallSite now looks like this:
if (auto CS = CallSite(V)) // think dyn_cast
instead of:
if (CallSite CS = V)
This is an extra token but I think it is slightly clearer. Making the
ctor explicit has the advantage of not accidentally creating nullptr
CallSites, e.g. when you pass a Value * to a function taking a CallSite
argument.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234601 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Some optimizations such as jump threading and loop unswitching can negatively
affect performance when applied to divergent branches. The divergence analysis
added in this patch conservatively estimates which branches in a GPU program
can diverge. This information can then help LLVM to run certain optimizations
selectively.
Test Plan: test/Analysis/DivergenceAnalysis/NVPTX/diverge.ll
Reviewers: resistor, hfinkel, eliben, meheff, jholewinski
Subscribers: broune, bjarke.roune, madhur13490, tstellarAMD, dberlin, echristo, jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D8576
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234567 91177308-0d34-0410-b5e6-96231b3b80d8
(Re-apply r234361 with a fix and a testcase for PR23157)
Both run-time pointer checking and the dependence analysis are capable
of dealing with uniform addresses. I.e. it's really just an orthogonal
property of the loop that the analysis computes.
Run-time pointer checking will only try to reason about SCEVAddRec
pointers or else gives up. If the uniform pointer turns out the be a
SCEVAddRec in an outer loop, the run-time checks generated will be
correct (start and end bounds would be equal).
In case of the dependence analysis, we work again with SCEVs. When
compared against a loop-dependent address of the same underlying object,
the difference of the two SCEVs won't be constant. This will result in
returning an Unknown dependence for the pair.
When compared against another uniform access, the difference would be
constant and we should return the right type of dependence
(forward/backward/etc).
The changes also adds support to query this property of the loop and
modify the vectorizer to use this.
Patch by Ashutosh Nema!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234424 91177308-0d34-0410-b5e6-96231b3b80d8
Both run-time pointer checking and the dependence analysis are capable
of dealing with uniform addresses. I.e. it's really just an orthogonal
property of the loop that the analysis computes.
Run-time pointer checking will only try to reason about SCEVAddRec
pointers or else gives up. If the uniform pointer turns out the be a
SCEVAddRec in an outer loop, the run-time checks generated will be
correct (start and end bounds would be equal).
In case of the dependence analysis, we work again with SCEVs. When
compared against a loop-dependent address of the same underlying object,
the difference of the two SCEVs won't be constant. This will result in
returning an Unknown dependence for the pair.
When compared against another uniform access, the difference would be
constant and we should return the right type of dependence
(forward/backward/etc).
The changes also adds support to query this property of the loop and
modify the vectorizer to use this.
Patch by Ashutosh Nema!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234361 91177308-0d34-0410-b5e6-96231b3b80d8
Require the pointee type to be passed explicitly and assert that it is
correct. For now it's possible to pass nullptr here (and I've done so in
a few places in this patch) but eventually that will be disallowed once
all clients have been updated or removed. It'll be a long road to get
all the way there... but if you have the cahnce to update your callers
to pass the type explicitly without depending on a pointer's element
type, that would be a good thing to do soon and a necessary thing to do
eventually.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233938 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change teaches ScalarEvolution::isLoopBackedgeGuardedByCond to look
at edges within the loop body that dominate the latch. We don't do an
exhaustive search for all possible edges, but only a quick walk up the
dom tree.
This re-lands r233447. r233447 was reverted because it caused massive
compile-time regressions. This change has a fix for the same issue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233829 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is part 1 of fixes to address the problems described in
https://llvm.org/bugs/show_bug.cgi?id=22719.
The restriction to limit loop scales to 4,096 does not really prevent
overflows anymore, as the underlying algorithm has changed and does
not seem to suffer from this problem.
Additionally, artificially restricting loop scales to such a low number
skews frequency information, making loops of equal hotness appear to
have very different hotness properties.
The only loops that are artificially restricted to a scale of 4096 are
infinite loops (those loops with an exit mass of 0). This prevents
infinite loops from skewing the frequencies of other regions in the CFG.
At the end of propagation, frequencies are scaled to values that take no
more than 64 bits to represent. When the range of frequencies to be
represented fits within 61 bits, it pushes up the scaling factor to a
minimum of 8 to better distinguish small frequency values. Otherwise,
small frequency values are all saturated down at 1.
Tested on x86_64.
Reviewers: dexonsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8718
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233826 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change teaches isImpliedCond to infer things like "X sgt 0" => "X -
1 sgt -1". The `ConstantRange` class has the logic to do the heavy
lifting, this change simply gets ScalarEvolution to exploit that when
reasonable.
Depends on D8345
Reviewers: atrick
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8346
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232576 91177308-0d34-0410-b5e6-96231b3b80d8
then using the symbols from those anonymous namespaces from outside the
anonymous namespace).
This was "detected" by causing the modules selfhost to fail in some cases.
The corresponding Clang bug was fixed in r232455.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232457 91177308-0d34-0410-b5e6-96231b3b80d8
The C++ standard reserves all identifiers starting with an underscore
followed by an uppercase letter for the implementation for any use.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232302 91177308-0d34-0410-b5e6-96231b3b80d8
Instead, run both EH preparation passes, and have them both ignore
functions with unrecognized EH personalities. Pass delegation involved
some hacky code for creating an AnalysisResolver that we don't need now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231995 91177308-0d34-0410-b5e6-96231b3b80d8
This is the final patch that actually introduces the new parameter of
partition mapping to RuntimePointerCheck::needsChecking.
Another API (LAI::getInstructionsForAccess) is also exposed that helps
to map pointers to instructions because ultimately we partition
instructions.
The WIP version of the Loop Distribution pass in D6930 has been adapted
to use all this. See for example, how
InstrPartitionContainer::computePartitionSetForPointers sets up the
partitions using the above API and then calls to LAI::addRuntimeCheck
with the pointer partitions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231818 91177308-0d34-0410-b5e6-96231b3b80d8
Now the analysis won't "fail" if the memchecks exceed the threshold. It
is the transform pass' responsibility to perform the check.
This allows the transform pass to further analyze/eliminate the
memchecks. E.g. in Loop distribution we only need to check pointers
that end up in different partitions.
Note that there is a slight change of functionality here. The logic in
analyzeLoop is that if dependence checking fails due to non-constant
distance between the pointers, another attempt is made to prove safety
of the dependences purely using run-time checks.
Before this patch we could fail the loop due to exceeding the memcheck
threshold after the first step, now we only check the threshold in the
client after the full analysis. There is no measurable compile-time
effect but I wanted to record this here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231817 91177308-0d34-0410-b5e6-96231b3b80d8
The dependences are now expose through the new getInterestingDependences
API so we can use that with -analyze too and fix the FIXME.
This lets us remove the test that relied on -debug to check the
dependences.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231807 91177308-0d34-0410-b5e6-96231b3b80d8
Gather an array of interesting dependences rather than just failing
after the first unsafe one and regarding the loop unsafe. Loop
Distribution needs to be able to collect all dependences in order to
isolate the dependence cycles into their own partition.
Since the dependence checking algorithm is quadratic in terms of
accesses sharing the same underlying pointer, I am applying a cut-off
threshold (MaxInterestingDependence). Exceeding that, the logic reverts
back to the original approach deeming the loop unsafe upon encountering
the first unsafe dependence.
The main idea of the patch is to split isDepedent from directly
answering the question whether the dep is safe for vectorization to
return a dependence type which then gets mapped to old boolean result
using Dependence::isSafeForVectorization.
Tested that this was compile-time neutral on SpecINT2006 LTO bitcode
inputs. No assembly change on the testsuite including external.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231806 91177308-0d34-0410-b5e6-96231b3b80d8
LoopDistribution needs to query various results of the dependence
analysis. This series will expose some more APIs and state of the
dependence checker.
This patch is a simple one to just expose the DepChecker instance. The
set is compile-time neutral measured with LTO bitcode files of
SpecINT2006. Also there is no assembly change on the testsuite.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231805 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Now that the DataLayout is a mandatory part of the module, let's start
cleaning the codebase. This patch is a first attempt at doing that.
This patch is not exactly NFC as for instance some places were passing
a nullptr instead of the DataLayout, possibly just because there was a
default value on the DataLayout argument to many functions in the API.
Even though it is not purely NFC, there is no change in the
validation.
I turned as many pointer to DataLayout to references, this helped
figuring out all the places where a nullptr could come up.
I had initially a local version of this patch broken into over 30
independant, commits but some later commit were cleaning the API and
touching part of the code modified in the previous commits, so it
seemed cleaner without the intermediate state.
Test Plan:
Reviewers: echristo
Subscribers: llvm-commits
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231740 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This removes some duplicated code, and also helps optimization: e.g. in
the test case added, `%idx ULT 128` in `@x` is not currently optimized
to `true` by `-indvars` but will be, after this change.
The only functional change in ths commit is that for add recurrences,
ScalarEvolution::getRange will be more aggressive -- computing the
unsigned (resp. signed) range for a SCEVAddRecExpr will now look at the
NSW (resp. NUW) bits and check for signed (resp. unsigned) overflow.
This can be a strict improvement in some cases (such as the attached
test case), and should be no worse in other cases.
Reviewers: atrick, nlewycky
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8142
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231709 91177308-0d34-0410-b5e6-96231b3b80d8