We can prove that a 'sub' can be a 'sub nuw' if the left-hand side is
negative and the right-hand side is non-negative.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216045 91177308-0d34-0410-b5e6-96231b3b80d8
We can prove that a 'sub' can be a 'sub nsw' under certain conditions:
- The sign bits of the operands is the same.
- Both operands have more than 1 sign bit.
The subtraction cannot be a signed overflow in either case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216037 91177308-0d34-0410-b5e6-96231b3b80d8
While this might seem like an obvious canonicalization, there is one subtle problem with it. The result of the original expression
is undef when x is NaN (remember, fast math flags), but the result of the select is always defined when x is NaN. This means that the
new expression is strictly more defined than the original one. One unfortunate consequence of this is that the transform is not reversible!
It's always legal to make increase the defined-ness of an expression, but it's not legal to reduce it. Thus, targets that prefer the original
form of the expression cannot reverse the transform to recover it. Another way to think of it is that the transform has lost source-level
information (the fast math flags), which is undesirable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215825 91177308-0d34-0410-b5e6-96231b3b80d8
While *most* (X sdiv 1) operations will get caught by InstSimplify, it
is still possible for a sdiv to appear in the worklist which hasn't been
simplified yet.
This means that it is possible for 0 - (X sdiv 1) to get transformed
into (X sdiv -1); dividing by -1 can make the transform produce undef
values instead of the proper result.
Sorry for the lack of testcase, it's a bit problematic because it relies
on the exact order of operations in the worklist.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215818 91177308-0d34-0410-b5e6-96231b3b80d8
We can combne a mul with a div if one of the operands is a multiple of
the other:
%mul = mul nsw nuw %a, C1
%ret = udiv %mul, C2
=>
%ret = mul nsw %a, (C1 / C2)
This can expose further optimization opportunities if we end up
multiplying or dividing by a power of 2.
Consider this small example:
define i32 @f(i32 %a) {
%mul = mul nuw i32 %a, 14
%div = udiv exact i32 %mul, 7
ret i32 %div
}
which gets CodeGen'd to:
imull $14, %edi, %eax
imulq $613566757, %rax, %rcx
shrq $32, %rcx
subl %ecx, %eax
shrl %eax
addl %ecx, %eax
shrl $2, %eax
retq
We can now transform this into:
define i32 @f(i32 %a) {
%shl = shl nuw i32 %a, 1
ret i32 %shl
}
which gets CodeGen'd to:
leal (%rdi,%rdi), %eax
retq
This fixes PR20681.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215815 91177308-0d34-0410-b5e6-96231b3b80d8
Add header guards to files that were missing guards. Remove #endif comments
as they don't seem common in LLVM (we can easily add them back if we decide
they're useful)
Changes made by clang-tidy with minor tweaks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215558 91177308-0d34-0410-b5e6-96231b3b80d8
What follows bellow is a correctness proof of the transform using CVC3.
$ < t.cvc
A, B : BITVECTOR(32);
QUERY BVPLUS(32, A & B, A | B) = BVPLUS(32, A, B);
$ cvc3 < t.cvc
Valid.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215400 91177308-0d34-0410-b5e6-96231b3b80d8
We can only propagate the nsw bits if both subtraction instructions are
marked with the appropriate bit.
N.B. We only propagate the nsw bit in InstCombine because the nuw case
is already handled in InstSimplify.
This fixes PR20189.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214385 91177308-0d34-0410-b5e6-96231b3b80d8
While we can already transform A | (A ^ B) into A | B, things get bad
once we have (A ^ B) | (A ^ B ^ Cst) because reassociation will morph
this into (A ^ B) | ((A ^ Cst) ^ B). Our existing patterns fail once
this happens.
To fix this, we add a new pattern which looks through the tree of xor
binary operators to see that, in fact, there exists a redundant xor
operation.
What follows bellow is a correctness proof of the transform using CVC3.
$ cat t.cvc
A, B, C : BITVECTOR(64);
QUERY BVXOR(A, B) | BVXOR(BVXOR(B, C), A) = BVXOR(A, B) | C;
QUERY BVXOR(BVXOR(A, C), B) | BVXOR(A, B) = BVXOR(A, B) | C;
QUERY BVXOR(A, B) & BVXOR(BVXOR(B, C), A) = BVXOR(A, B) & ~C;
QUERY BVXOR(BVXOR(A, C), B) & BVXOR(A, B) = BVXOR(A, B) & ~C;
$ cvc3 < t.cvc
Valid.
Valid.
Valid.
Valid.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214342 91177308-0d34-0410-b5e6-96231b3b80d8
In order to enable the preservation of noalias function parameter information
after inlining, and the representation of block-level __restrict__ pointer
information (etc.), additional kinds of aliasing metadata will be introduced.
This metadata needs to be carried around in AliasAnalysis::Location objects
(and MMOs at the SDAG level), and so we need to generalize the current scheme
(which is hard-coded to just one TBAA MDNode*).
This commit introduces only the necessary refactoring to allow for the
introduction of other aliasing metadata types, but does not actually introduce
any (that will come in a follow-up commit). What it does introduce is a new
AAMDNodes structure to hold all of the aliasing metadata nodes associated with
a particular memory-accessing instruction, and uses that structure instead of
the raw MDNode* in AliasAnalysis::Location, etc.
No functionality change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213859 91177308-0d34-0410-b5e6-96231b3b80d8
It handles the errors which were seen in PR19958 where wrong code was being emitted due to earlier patch.
Added code for lshr as well as non-exact right shifts.
It implements :
(icmp eq/ne (ashr/lshr const2, A), const1)" ->
(icmp eq/ne A, Log2(const2/const1)) ->
(icmp eq/ne A, Log2(const2) - Log2(const1))
Differential Revision: http://reviews.llvm.org/D4068
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213678 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r213474 (and r213475), which causes a miscompile on
a stage2 LTO build. I'll reply on the list in a moment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213562 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: This patch introduces two new iterator ranges and updates existing code to use it. No functional change intended.
Test Plan: All tests (make check-all) still pass.
Reviewers: dblaikie
Reviewed By: dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D4481
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213474 91177308-0d34-0410-b5e6-96231b3b80d8
In the original version of the patch the behaviour was like described in
the comment. This behaviour was changed before committing it without
updating the comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213117 91177308-0d34-0410-b5e6-96231b3b80d8
Fix a crash in `InstCombiner::Descale()` when a multiply-by-zero gets
created as an argument to a GEP partway through an iteration, causing
-instcombine to optimize the GEP before the multiply.
rdar://problem/17615671
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212742 91177308-0d34-0410-b5e6-96231b3b80d8
isSafeToSpeculativelyExecute can optionally take a DataLayout pointer. In the
past, this was mainly used to make better decisions regarding divisions known
not to trap, and so was not all that important for users concerned with "cheap"
instructions. However, now it also helps look through bitcasts for
dereferencable loads, and will also be important if/when we add a
dereferencable pointer attribute.
This is some initial work to feed a DataLayout pointer through to callers of
isSafeToSpeculativelyExecute, generally where one was already available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212720 91177308-0d34-0410-b5e6-96231b3b80d8
In PR20059 ( http://llvm.org/pr20059 ), instcombine eliminates shuffles that are necessary before performing an operation that can trap (srem).
This patch calls isSafeToSpeculativelyExecute() and bails out of the optimization in SimplifyVectorOp() if needed.
Differential Revision: http://reviews.llvm.org/D4424
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212629 91177308-0d34-0410-b5e6-96231b3b80d8
It is not safe to negate the smallest signed integer, doing so yields
the same number back.
This fixes PR20186.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212164 91177308-0d34-0410-b5e6-96231b3b80d8
This patch reduces the stack memory consumption of the InstCombine
function "isOnlyCopiedFromConstantGlobal() ", that in certain conditions
could overflow the stack because of excessive recursiveness.
For example, in a case like this:
%0 = alloca [50025 x i32], align 4
%1 = getelementptr inbounds [50025 x i32]* %0, i64 0, i64 0
store i32 0, i32* %1
%2 = getelementptr inbounds i32* %1, i64 1
store i32 1, i32* %2
%3 = getelementptr inbounds i32* %2, i64 1
store i32 2, i32* %3
%4 = getelementptr inbounds i32* %3, i64 1
store i32 3, i32* %4
%5 = getelementptr inbounds i32* %4, i64 1
store i32 4, i32* %5
%6 = getelementptr inbounds i32* %5, i64 1
store i32 5, i32* %6
...
This piece of code crashes llvm when trying to apply instcombine on
desktop. On embedded devices this could happen with a much lower limit
of recursiveness. Some instructions (getelementptr and bitcasts) make
the function recursively call itself on their uses, which is what makes
the example above consume so much stack (it becomes a recursive
depth-first tree visit with a very big depth).
The patch changes the algorithm to be semantically equivalent, but
iterative instead of recursive and the visiting order to be from a
depth-first visit to a breadth-first visit (visit all the instructions
of the current level before the ones of the next one).
Now if a lot of memory is required a heap allocation is done instead of
the the stack allocation, avoiding the possible crash.
Reviewed By: rnk
Differential Revision: http://reviews.llvm.org/D4355
Patch by Marcello Maggioni! We don't generally commit large stress test
that look for out of memory conditions, so I didn't request that one be
added to the patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212133 91177308-0d34-0410-b5e6-96231b3b80d8