Sometimes an incidentally created instruction can duplicate a Value used
elsewhere. It then often doesn't end up in the leader table. If it's later
removed, we attempt to remove it from the leader table and segfault.
Instead we should just ignore the removal request, which won't cause any
problems. The reverse situation, where the original instruction is replaced by
the new one (which you might think could leave the leader table empty) cannot
occur, because the incidental instruction will never be found in the first
place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242199 91177308-0d34-0410-b5e6-96231b3b80d8
Volatile loads and stores are made visible in global state regardless of
what memory is involved. It is not correct to disregard the ordering
and synchronization scope because it is possible to synchronize with
memory operations performed by hardware.
This partially addresses PR23737.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242126 91177308-0d34-0410-b5e6-96231b3b80d8
Previously we would refrain from attempting to increase the linkage of
available_externally globals because they were considered weak for the
linker. Now they are treated more like a declaration instead of a weak
definition.
This was causing SSE alignment faults in Chromuim, when some code
assumed it could increase the alignment of a dllimported global that it
didn't control. http://crbug.com/509256
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242091 91177308-0d34-0410-b5e6-96231b3b80d8
This test case was breaking the hexagon elf bot. The failing lines
were actually unnecessary as checking that the store still reads the
correct value demonstrates that everything is working fine now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242073 91177308-0d34-0410-b5e6-96231b3b80d8
When spotting that a loop can use ctpop, we were incorrectly replacing all uses of a value with a value derived from ctpop.
The bug here was exposed because we were replacing a use prior to the ctpop with the ctpop value and so we have a use before def, i.e., we changed
%tobool.5 = icmp ne i32 %num, 0
store i1 %tobool.5, i1* %ptr
br i1 %tobool.5, label %for.body.lr.ph, label %for.end
to
store i1 %1, i1* %ptr
%0 = call i32 @llvm.ctpop.i32(i32 %num)
%1 = icmp ne i32 %0, 0
br i1 %1, label %for.body.lr.ph, label %for.end
Even if we inserted the ctpop so that it dominates the store here, that would still be incorrect. The store doesn’t want the result of ctpop.
The fix is very simple, and involves replacing only the branch condition with the ctpop instead of all uses.
Reviewed by Hal Finkel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242068 91177308-0d34-0410-b5e6-96231b3b80d8
Enable runtime unrolling for loops with unroll count metadata ("#pragma unroll N")
and a runtime trip count. Also, do not unroll loops with unroll full metadata if the
loop has a runtime loop count. Previously, such loops would be unrolled with a
very large threshold (pragma-unroll-threshold) if runtime unrolled happened to be
enabled resulting in a very large (and likely unwise) unroll factor.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242047 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This at least saves compile time. I also encountered a case where
ephemeral values affect whether other variables are promoted, causing
performance issues. It may be a bug in LSR, but I didn't manage to
reduce it yet. Anyhow, I believe it's in general not worth considering
ephemeral values in LSR.
Reviewers: atrick, hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11115
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242011 91177308-0d34-0410-b5e6-96231b3b80d8
There is no suitable basic block to sink instructions in loops without
exits. The only way an instruction in a loop without exits can be used
is as an incoming value to a PHI. In such cases, the incoming block for
the corresponding value is unreachable.
This fixes PR24013.
Differential Revision: http://reviews.llvm.org/D10903
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241987 91177308-0d34-0410-b5e6-96231b3b80d8
Not doing this can lead to misoptimizations down the line, e.g. because
of range metadata on the replacing load excluding values that are valid
for the load that is being replaced.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241886 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
In RewriteLoopExitValues, before expanding out an SCEV expression using
SCEVExpander, try to see if an existing LLVM IR expression already
computes the value we're interested in. If so use that existing
expression.
Apart from reducing IndVars' reliance on the rest of the compilation
pipeline, this also prevents IndVars from concluding some expressions as
"high cost" when they're not. For instance,
`InductiveRangeCheckElimination` often emits code of the following form:
```
len = umin(len_A, len_B)
loop:
...
if (i++ < len)
goto loop
outside_loop:
use(i)
```
`SCEVExpander` refuses to rewrite the use of `i` in `outside_loop`,
since it thinks the value of `i` on loop exit, `len`, is a high cost
expansion since it contains an `umax` in it. With this change,
`IndVars` can see that it can re-use `len` instead of creating a new
expression to compute `umin(len_A, len_B)`.
I considered putting this cleverness in `SCEVExpander`, but I was
worried that it may then have a deterimental effect on other passes
that use it. So I decided it was better to just do this in the one
place where it seems like an obviously good idea, with the intent of
generalizing later if needed.
Reviewers: atrick, reames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10782
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241838 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Often filter-like loops will do memory accesses that are
separated by constant offsets. In these cases it is
common that we will exceed the threshold for the
allowable number of checks.
However, it should be possible to merge such checks,
sice a check of any interval againt two other intervals separated
by a constant offset (a,b), (a+c, b+c) will be equivalent with
a check againt (a, b+c), as long as (a,b) and (a+c, b+c) overlap.
Assuming the loop will be executed for a sufficient number of
iterations, this will be true. If not true, checking against
(a, b+c) is still safe (although not equivalent).
As long as there are no dependencies between two accesses,
we can merge their checks into a single one. We use this
technique to construct groups of accesses, and then check
the intervals associated with the groups instead of
checking the accesses directly.
Reviewers: anemet
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10386
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241673 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Initially, these intrinsics seemed like part of a family of "frame"
related intrinsics, but now I think that's more confusing than helpful.
Initially, the LangRef specified that this would create a new kind of
allocation that would be allocated at a fixed offset from the frame
pointer (EBP/RBP). We ended up dropping that design, and leaving the
stack frame layout alone.
These intrinsics are really about sharing local stack allocations, not
frame pointers. I intend to go further and add an `llvm.localaddress()`
intrinsic that returns whatever register (EBP, ESI, ESP, RBX) is being
used to address locals, which should not be confused with the frame
pointer.
Naming suggestions at this point are welcome, I'm happy to re-run sed.
Reviewers: majnemer, nicholas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11011
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241633 91177308-0d34-0410-b5e6-96231b3b80d8
This checks subtarget feature compatibility for inlining by verifying
that the callee is a strict subset of the caller's features. This includes
the cpu as part of the subtarget we can get via the incoming functions as
the backend takes CPUs as feature sets.
This allows us to inline things like:
int foo() { return baz(); }
int __attribute__((target("sse4.2"))) bar() {
return foo();
}
so that generic code can be inlined into specialized functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241221 91177308-0d34-0410-b5e6-96231b3b80d8
TwoAddressInstructionPass stops after a successful commuting but 3 Addr
conversion might be good for some cases.
Consider:
int foo(int a, int b) {
return a + b;
}
Before this commit, we emit:
addl %esi, %edi
movl %edi, %eax
ret
After this commit, we try 3 Addr conversion:
leal (%rsi,%rdi), %eax
ret
Patch by Volkan Keles <vkeles@apple.com>!
Differential Revision: http://reviews.llvm.org/D10851
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241206 91177308-0d34-0410-b5e6-96231b3b80d8
This is mostly an NFC, which increases code readability (instead of
saving old terminator, generating new one in front of old, and deleting
old, we just call a function). However, it would additionaly copy
the debug location from old instruction to replacement, which
would help PR23837.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241197 91177308-0d34-0410-b5e6-96231b3b80d8
We would create a phi node with a zero initialized operand instead of
undef in the case where no value was originally available. This was
problematic for x86_mmx which has no null value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241143 91177308-0d34-0410-b5e6-96231b3b80d8
Surprisingly, this is a correctness issue: the mmx type exists for
calling convention purposes, LLVM doesn't have a zero representation for
them.
This partially fixes PR23999.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241142 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
nsw are flaky and can often be removed by optimizations. This patch enhances
nsw by leveraging @llvm.assume in the IR. Specifically, NaryReassociate now
understands that
assume(a + b >= 0) && assume(a >= 0) ==> a +nsw b
As a result, it can split more sext(a + b) into sext(a) + sext(b) for CSE.
Test Plan: nary-gep.ll
Reviewers: broune, meheff
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D10822
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241139 91177308-0d34-0410-b5e6-96231b3b80d8
Set debug location for terminator instruction in loop backedge block
(which is an unconditional jump to loop header). We can't copy debug
location from original backedges, as there can be several of them,
with different debug info locations. So, we follow the approach of
SplitBlockPredecessors, and copy the debug info from first non-PHI
instruction in the header (i.e. destination block).
This is yet another change for PR23837.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240999 91177308-0d34-0410-b5e6-96231b3b80d8
If we are dealing with a pointer induction variable, isInductionPHI
gives back a step value of Stride / size of pointer. However, we might
be indexing with a legal type wider than the pointer width.
Handle this by inserting casts where appropriate instead of crashing.
This fixes PR23954.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240877 91177308-0d34-0410-b5e6-96231b3b80d8
The PruneEH pass tries to annotate functions as 'noreturn' if it doesn't
see a ReturnInst. However, a naked function containing inline assembly
can contain control flow leaving the function.
This fixes PR23971.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240876 91177308-0d34-0410-b5e6-96231b3b80d8
It is possible for a global to be substituted with another global of a
different type or a different kind (i.e. an alias) at IR link time. One
example of this scenario is when a Microsoft ABI vtable is substituted with
an alias referring to a larger vtable containing an RTTI reference.
This will cause the global to be RAUW'd with a possibly bitcasted reference
to the other global. This will of course also affect any references to the
global in bitset metadata.
The right way to handle such metadata is simply to ignore it. This is sound
because the linked module should contain another copy of the bitset entries as
applied to the new global.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240866 91177308-0d34-0410-b5e6-96231b3b80d8
This change extends the detection of base pointers for vector constructs to handle arbitrary phi and select nodes. The existing non-vector code already handles those, so this is basically just extending the vector special case to be less special cased. It still isn't generalized vector handling since we can't handle arbitrary vector instructions (e.g. shufflevectors), but it's a lot closer.
The general structure of the change is as follows:
* Extend the base defining value relation over a subset of vector instructions and vector typed phi & select instructions.
* Move scalarization from before base pointer rewriting to after base pointer rewriting. The extension of the BDV relation is sufficient to find vector base phis for vector inputs.
* Preserve the existing special case logic for when the base of a vector element is locally obvious. This general idea could be extended to the scalar case as well.
Differential Revision: http://reviews.llvm.org/D10461#inline-84275
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240850 91177308-0d34-0410-b5e6-96231b3b80d8
If we have a caller that knows a particular argument can never be null, we can exploit this fact while simplifying values in the inline cost analysis. This has the effect of reducing the cost for inlining when a null check is present in the callee, but the value is known non null in the caller. In particular, any dependent control flow can be discounted from the cost estimate.
Note that we use the parameter attributes at the call site to memoize the analysis within the caller's code. The setting of this attribute is done in InstCombine, the inline cost analysis just consumes it. This is intentional and important because we want the inline cost analysis results to be easily cachable themselves. We're not currently doing so, but initial results on LTO indicate this will quickly become important.
Differential Revision: http://reviews.llvm.org/D9129
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240828 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Fixes PR23809. Without passing the context to SimplifyICmpInst, we would
use the assume to prove that the condition feeding the assume is
trivially true (see isValidAssumeForContext in ValueTracking.cpp),
causing the removal of the assume which may be useful for later
optimizations.
Test Plan: pr23800.ll
Reviewers: hfinkel, majnemer
Reviewed By: hfinkel
Subscribers: henryhu, llvm-commits, wengxt, broune, meheff, eliben
Differential Revision: http://reviews.llvm.org/D10695
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240683 91177308-0d34-0410-b5e6-96231b3b80d8
We performed a simple, but incomplete, intersection when it came time to
CSE instructions. It didn't handle, for example, the 'exact' flag.
This fixes PR23922.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240595 91177308-0d34-0410-b5e6-96231b3b80d8
Reassociate mutated existing instructions in order to form negations
which would create additional reassociate opportunities.
This fixes PR23926.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240593 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Because LSR happens at a late stage where mul of a power of 2 is
typically canonicalized to shl, this canonicalization emits code that
can be better CSE'ed.
Test Plan:
Transforms/LoopStrengthReduce/shl.ll shows how this change makes GVN more
powerful. Fixes some existing tests due to this change.
Reviewers: sanjoy, majnemer, atrick
Reviewed By: majnemer, atrick
Subscribers: majnemer, llvm-commits
Differential Revision: http://reviews.llvm.org/D10448
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240573 91177308-0d34-0410-b5e6-96231b3b80d8
With option OptForSize enabled, the Loop Vectorizer is not supposed to
create tail loop. The condition checking that was invalid and was not
matching to the comment above.
Patch by Marianne Mailhot-Sarrasin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240556 91177308-0d34-0410-b5e6-96231b3b80d8