All flags setting/getting is now done in the class with helper methods instead
of users having to get the bits in the correct order.
Reviewed by Rafael Espíndola.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239314 91177308-0d34-0410-b5e6-96231b3b80d8
While we have some code to transform specification like {ax} into
{eax}/{rax} if the operand type isn't 16bit, we should reject cases
where there is no sane way to do this, like the i128 type in the
example.
Related to rdar://21042280
Differential Revision: http://reviews.llvm.org/D10260
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239309 91177308-0d34-0410-b5e6-96231b3b80d8
The global-merge pass was crashing because it assumes that all ConstantExprs
(reached via the global variables that they use) have at least one user.
I haven't worked out a way to test this, as an unused ConstantExpr cannot be
represented by serialised IR, and global-merge can only be run in llc, which
does not run any passes which can make a ConstantExpr dead.
This (reduced to the point of silliness) C code triggers this bug when compiled
for arm-none-eabi at -O1:
static a = 7;
static volatile b[10] = {&a};
c;
main() {
c = 0;
for (; c < 10;)
printf(b[c]);
}
Differential Revision: http://reviews.llvm.org/D10314
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239308 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds support for system register MMFR4_EL1 (memory model feature register) in the assembler.
This register provides information about the implemented memory model and memory management support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239302 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We need to add a runtime memcheck for pair of accesses (x,y) where at least one of x and y
are writes.
Assuming we have w writes and r reads, currently this number is estimated as being
w* (w+r-1). This estimation will count (write,write) pairs twice and will overestimate
the number of checks required.
This change adds a getNumberOfChecks method to RuntimePointerCheck, which
will count the number of runtime checks needed (similar in implementation to
needsAnyChecking) and uses it to produce the correct number of runtime checks.
Test Plan:
llvm test suite
spec2k
spec2k6
Performance results: no changes observed (not surprising since the formula for 1 writer is basically the same, which would covers most cases - at least with the current check limit).
Reviewers: anemet
Reviewed By: anemet
Subscribers: mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D10217
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239295 91177308-0d34-0410-b5e6-96231b3b80d8
Interleaved memory accesses are grouped and vectorized into vector load/store and shufflevector.
E.g. for (i = 0; i < N; i+=2) {
a = A[i]; // load of even element
b = A[i+1]; // load of odd element
... // operations on a, b, c, d
A[i] = c; // store of even element
A[i+1] = d; // store of odd element
}
The loads of even and odd elements are identified as an interleave load group, which will be transfered into vectorized IRs like:
%wide.vec = load <8 x i32>, <8 x i32>* %ptr
%vec.even = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
%vec.odd = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 1, i32 3, i32 5, i32 7>
The stores of even and odd elements are identified as an interleave store group, which will be transfered into vectorized IRs like:
%interleaved.vec = shufflevector <4 x i32> %vec.even, %vec.odd, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7>
store <8 x i32> %interleaved.vec, <8 x i32>* %ptr
This optimization is currently disabled by defaut. To try it by adding '-enable-interleaved-mem-accesses=true'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239291 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Using some SCEV functionality helped to entirely remove SCEVCache class and FindConstantPointers SCEV visitor.
Also, this makes the code more universal - I'll take advandate of it in next patches where I start handling additional types of instructions.
Test Plan: Tests would be submitted in subsequent patches.
Reviewers: atrick, chandlerc
Reviewed By: atrick, chandlerc
Subscribers: atrick, llvm-commits
Differential Revision: http://reviews.llvm.org/D10205
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239282 91177308-0d34-0410-b5e6-96231b3b80d8
No test since the kinds of transforms this prevents seem to not really
be relevant for SI's different addressing modes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239261 91177308-0d34-0410-b5e6-96231b3b80d8
There were several SelectInst combines that always returned an existing
instruction instead of modifying an old one or creating a new one.
These are prime candidates for moving to InstSimplify.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239229 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
canUnrollCompletely takes `unsigned` values for `UnrolledCost` and
`RolledDynamicCost` but is passed in `uint64_t`s that are silently
truncated. Because of this, when `UnrolledSize` is a large integer
that has a small remainder with UINT32_MAX, LLVM tries to completely
unroll loops with high trip counts.
Reviewers: mzolotukhin, chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10293
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239218 91177308-0d34-0410-b5e6-96231b3b80d8
CVP wants to analyze the condition operand of a select along an edge.
It succeeds in getting back a Constant but not a ConstantInt. Instead,
it gets a ConstantExpr. It then assumes that the Constant must be equal
to false because it isn't equal to true.
Instead, perform an additional comparison.
This fixes PR23752.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239217 91177308-0d34-0410-b5e6-96231b3b80d8
If we have (select a, b, c), it is sometimes valid to simplify this to a
single select operand. However, doing so is only valid if the
computation doesn't inject poison into the computation.
It might be helpful to consider the following example:
(select (icmp ne %i, INT_MAX), (add nsw %i, 1), INT_MIN)
The select is equivalent to (add %i, 1) but not (add nsw %i, 1).
Self hosting on x86_64 revealed that this occurs very, very rarely so
bailing out is hopefully pretty reasonable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239215 91177308-0d34-0410-b5e6-96231b3b80d8
the overloaded version of addPass which takes Pass*.
This change enables inserting the machine printer pass when the overloaded
version of addPass that takes Pass* is called to add a pass, instead of the
one which takes AnalysisID. I need this to prevent make-check tests from
failing when I commit another patch later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239192 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r239141. This commit was an attempt to reintroduce
a previous patch that broke many self-hosting bots with clang timeouts,
but it still has slowdown issues, at least on ARM, increasing the
compilation time (stage 2, clang's) by 5x.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239175 91177308-0d34-0410-b5e6-96231b3b80d8
This change is NFC because both the ``break;`` and the fall through end
up returning immediately. However, this helps clarify intent and also
ensures correctness in case more ``case`` blocks are added later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239172 91177308-0d34-0410-b5e6-96231b3b80d8
For targets with a free fneg, this fold is always a net loss if it
ends up duplicating the multiply, so definitely avoid it.
This might be true for some targets without a free fneg too, but
I'll leave that for future investigation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239167 91177308-0d34-0410-b5e6-96231b3b80d8
The new naming is (to me) much easier to understand. Here is a summary
of the new state of the world:
- '*Threshold' is the threshold for full unrolling. It is measured
against the estimated unrolled cost as computed by getUserCost in TTI
(or CodeMetrics, etc). We will exceed this threshold when unrolling
loops where unrolling exposes a significant degree of simplification
of the logic within the loop.
- '*PercentDynamicCostSavedThreshold' is the percentage of the loop's
estimated dynamic execution cost which needs to be saved by unrolling
to apply a discount to the estimated unrolled cost.
- '*DynamicCostSavingsDiscount' is the discount applied to the estimated
unrolling cost when the dynamic savings are expected to be high.
When actually analyzing the loop, we now produce both an estimated
unrolled cost, and an estimated rolled cost. The rolled cost is notably
a dynamic estimate based on our analysis of the expected execution of
each iteration.
While we're still working to build up the infrastructure for making
these estimates, to me it is much more clear *how* to make them better
when they have reasonably descriptive names. For example, we may want to
apply estimated (from heuristics or profiles) dynamic execution weights
to the *dynamic* cost estimates. If we start doing that, we would also
need to track the static unrolled cost and the dynamic unrolled cost, as
only the latter could reasonably be weighted by profile information.
This patch is sadly not without functionality change for the new unroll
analysis logic. Buried in the heuristic management were several things
that surprised me. For example, we never subtracted the optimized
instruction count off when comparing against the unroll heursistics!
I don't know if this just got lost somewhere along the way or what, but
with the new accounting of things, this is much easier to keep track of
and we use the post-simplification cost estimate to compare to the
thresholds, and use the dynamic cost reduction ratio to select whether
we can exceed the baseline threshold.
The old values of these flags also don't necessarily make sense. My
impression is that none of these thresholds or discounts have been tuned
yet, and so they're just arbitrary placehold numbers. As such, I've not
bothered to adjust for the fact that this is now a discount and not
a tow-tier threshold model. We need to tune all these values once the
logic is ready to be enabled.
Differential Revision: http://reviews.llvm.org/D9966
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239164 91177308-0d34-0410-b5e6-96231b3b80d8