constructing ImmediateDominator is now folded into DomTree construction.
This is part of the ongoing work for PR217.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@36063 91177308-0d34-0410-b5e6-96231b3b80d8
This sinks the two stores in this example into a single store in cond_next. In this
case, it allows elimination of the load as well:
store double 0.000000e+00, double* @s.3060
%tmp3 = fcmp ogt double %tmp1, 5.000000e-01 ; <i1> [#uses=1]
br i1 %tmp3, label %cond_true, label %cond_next
cond_true: ; preds = %entry
store double 1.000000e+00, double* @s.3060
br label %cond_next
cond_next: ; preds = %entry, %cond_true
%tmp6 = load double* @s.3060 ; <double> [#uses=1]
This implements Transforms/InstCombine/store-merge.ll:test2
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@36040 91177308-0d34-0410-b5e6-96231b3b80d8
out to do! :)
This fixes a problem where LSR would insert a bunch of code into each MBB
that uses a particular subexpression (e.g. IV+base+C). The problem is that
this code cannot be CSE'd back together if inserted into different blocks.
This patch changes LSR to attempt to insert a single copy of this code and
share it, allowing codegenprepare to duplicate the code if it can be sunk
into various addressing modes. On CodeGen/ARM/lsr-code-insertion.ll,
for example, this gives us code like:
add r8, r0, r5
str r6, [r8, #+4]
..
ble LBB1_4 @cond_next
LBB1_3: @cond_true
str r10, [r8, #+4]
LBB1_4: @cond_next
...
LBB1_5: @cond_true55
ldr r6, LCPI1_1
str r6, [r8, #+4]
instead of:
add r10, r0, r6
str r8, [r10, #+4]
...
ble LBB1_4 @cond_next
LBB1_3: @cond_true
add r8, r0, r6
str r10, [r8, #+4]
LBB1_4: @cond_next
...
LBB1_5: @cond_true55
add r8, r0, r6
ldr r10, LCPI1_1
str r10, [r8, #+4]
Besides being smaller and more efficient, this makes it immediately
obvious that it is profitable to predicate LBB1_3 now :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35972 91177308-0d34-0410-b5e6-96231b3b80d8
this fixes problems where codegenprepare would sink expressions into load/stores
that are not valid, and fixes cases where it would miss important valid ones.
This fixes several serious codesize and perf issues, particularly on targets
with complex addressing modes like arm and x86. For example, now we compile
CodeGen/X86/isel-sink.ll to:
_test:
movl 8(%esp), %eax
movl 4(%esp), %ecx
cmpl $1233, %eax
ja LBB1_2 #F
LBB1_1: #T
movl $4, (%ecx,%eax,4)
movl $141, %eax
ret
LBB1_2: #F
movl (%ecx,%eax,4), %eax
ret
instead of:
_test:
movl 8(%esp), %eax
leal (,%eax,4), %ecx
addl 4(%esp), %ecx
cmpl $1233, %eax
ja LBB1_2 #F
LBB1_1: #T
movl $4, (%ecx)
movl $141, %eax
ret
LBB1_2: #F
movl (%ecx), %eax
ret
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35970 91177308-0d34-0410-b5e6-96231b3b80d8
define i32 @test(i32 %X) {
entry:
%Y = and i32 %X, 4 ; <i32> [#uses=1]
icmp eq i32 %Y, 0 ; <i1>:0 [#uses=1]
sext i1 %0 to i32 ; <i32>:1 [#uses=1]
ret i32 %1
}
by moving code out of commonIntCastTransforms into visitZExt. Simplify the
APInt gymnastics in it etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35885 91177308-0d34-0410-b5e6-96231b3b80d8
We now tolerate small amounts of undefined behavior, better emulating what
would happen if the transaction actually occurred in memory. This fixes
SingleSource/UnitTests/2007-04-10-BitfieldTest.c on PPC, at least until
Devang gets a chance to fix the CFE from doing undefined things with bitfields :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35875 91177308-0d34-0410-b5e6-96231b3b80d8
happen to be an entry, in such case, it is not a good idea to
insert new block before entry.
Also fix typo in assertion check.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35833 91177308-0d34-0410-b5e6-96231b3b80d8
some constant exprs to apints).
Thanks to Anton for tracking down a small testcase that triggered this!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35633 91177308-0d34-0410-b5e6-96231b3b80d8
isel has its own particular features that it wants in the CFG, in order to
reduce the number of times a constant is computed, etc. Make sure that we
clean up the CFG before doing any other things for isel. Doing so can
dramatically reduce the number of split edges and reduce the number of
places that constants get computed. For example, this shrinks
CodeGen/Generic/phi-immediate-factoring.ll from 44 to 37 instructions on X86,
and from 21 to 17 MBB's in the output. This is primarily a code size win,
not a performance win.
This implements CodeGen/Generic/phi-immediate-factoring.ll and PR1296.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35575 91177308-0d34-0410-b5e6-96231b3b80d8
amount is safe.
2. Use new method on ConstantInt instead of (? :) operator.
3. Use new method uge() on ConstantInt to simplify codes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35505 91177308-0d34-0410-b5e6-96231b3b80d8
1. Line out nested call of APInt::zext/trunc.
2. Make more use of APInt::getHighBitsSet/getLowBitsSet.
3. Use APInt[] operator instead of expression like "APIntVal & SignBit".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35444 91177308-0d34-0410-b5e6-96231b3b80d8
instead of using ConstantExpr::getXX.
2. Use constant reference to APInt if possible instead of expensive
APInt copy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35443 91177308-0d34-0410-b5e6-96231b3b80d8
2. Use APInt[] instead of "X & SignBit".
3. Clean up some codes.
4. Make the expression like "ShiftAmt = ShiftAmtC->getZExtValue()" safe.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35424 91177308-0d34-0410-b5e6-96231b3b80d8
1. Line out nested use of zext/trunc.
2. Make more use of getHighBitsSet/getLowBitsSet.
3. Use APInt[] != 0 instead of "(APInt & SignBit) != 0".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35408 91177308-0d34-0410-b5e6-96231b3b80d8
When converting an add/xor/and triplet into a trunc/sext, only do so if the
intermediate integer type is a bitwidth that the targets can handle.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35400 91177308-0d34-0410-b5e6-96231b3b80d8
original and new instruction. A slight performance hit with ostringstream
but it is only for debug.
Also, clean up an uninitialized variable warning noticed in a release build.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35358 91177308-0d34-0410-b5e6-96231b3b80d8
Fix SingleSource/Regression/C/2003-05-21-UnionBitFields.c by changing a
getHighBitsSet call to getLowBitsSet call that was incorrectly converted
from the original lshr constant expression.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35348 91177308-0d34-0410-b5e6-96231b3b80d8
Remove a use of getLowBitsSet that caused the mask used for replacement of
shl/lshr pairs with an AND instruction to be computed incorrectly. Its not
clear exactly why this is the case. This solves the disappearing shifts
problem, but it doesn't fix Regression/C/2003-05-21-UnionBitFields. It
seems there is more going on.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35342 91177308-0d34-0410-b5e6-96231b3b80d8
* Don't assume shift amounts are <= 64 bits
* Avoid creating an extra APInt in SubOne and AddOne by using -- and ++
* Add another use of getLowBitsSet
* Convert a series of if statements to a switch
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35339 91177308-0d34-0410-b5e6-96231b3b80d8
using the facilities of APInt. While this duplicates a tiny fraction of
the constant folding code, it also makes the code easier to read and
avoids large ConstantExpr overhead for simple, known computations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35335 91177308-0d34-0410-b5e6-96231b3b80d8
2. Use isStrictlyPositive() instead of isPositive() in two places where
they need APInt value > 0 not only >=0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35333 91177308-0d34-0410-b5e6-96231b3b80d8
* Convert the last use of a uint64_t that should have been an APInt.
* Change ComputeMaskedBits to have a const reference argument for the Mask
so that recursions don't cause unneeded temporaries. This causes temps
to be needed in other places (where the mask has to change) but this
change optimizes for the recursion which is more frequent.
* Remove two instances of &ing a Mask with getAllOnesValue. Its not
needed any more because APInt is accurate in its bit computations.
* Start using the getLowBitsSet and getHighBits set methods on APInt
instead of shifting. This makes it more clear in the code what is
going on.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35321 91177308-0d34-0410-b5e6-96231b3b80d8
Convert some calls to ConstantInt::getZExtValue() into getValue() and
use APInt facilities in the subsequent computations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35294 91177308-0d34-0410-b5e6-96231b3b80d8
* APIntify visitAdd and visitSelectInst
* Remove unused uint64_t versions of utility functions that have been
replaced with APInt versions.
This completes most of the changes for APIntification of InstCombine. This
passes llvm-test and llvm/test/Transforms/InstCombine/APInt.
Patch by Zhou Sheng.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35287 91177308-0d34-0410-b5e6-96231b3b80d8
* Re-enable the APInt version of MaskedValueIsZero.
* APIntify the Comput{Un}SignedMinMaxValuesFromKnownBits functions
* APIntify visitICmpInst.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35270 91177308-0d34-0410-b5e6-96231b3b80d8
Analyze GEPs. If the indices are all zero, transfer whether the pointer is
known to be not null through the GEP.
Add a few more cases for xor and shift instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35257 91177308-0d34-0410-b5e6-96231b3b80d8
* Fix some indentation and comments in InsertRangeTest
* Add an "IsSigned" parameter to AddWithOverflow and make it handle signed
additions. Also, APIntify this function so it works with any bitwidth.
* For the icmp pred ([us]div %X, C1), C2 transforms, exit early if the
div instruction's RHS is zero.
* Finally, for icmp pred (sdiv %X, C1), -C2, fix an off-by-one error. The
HiBound needs to be incremented in order to get the range test correct.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35247 91177308-0d34-0410-b5e6-96231b3b80d8
Add new micro-optimizations.
Add icmp predicate snuggling. Given %x ULT 4, "icmp ugt %x, 2" becomes
"icmp eq %x, 3". This doesn't apply in any non-trivial cases yet due to missing
support for NE values in ValueRanges.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35119 91177308-0d34-0410-b5e6-96231b3b80d8
"APInt::getAllOnesValue(ShiftAmt).zextOrCopy(BitWidth)",
to handle ShiftAmt == BitWidth situation, use zextOrCopy() instead of
zext().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35080 91177308-0d34-0410-b5e6-96231b3b80d8
1. Replace getSignedMinValue() with getSignBit() for better code readability.
2. Replace APIntOps::shl() with operator<<= for convenience.
3. Make APInt construction more effective.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35060 91177308-0d34-0410-b5e6-96231b3b80d8
Provide an APIntified version of MaskedValueIsZero. This will (temporarily)
cause a "defined but not used" message from the compiler. It will be used
in the next patch in this series.
Patch by Sheng Zhou.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35019 91177308-0d34-0410-b5e6-96231b3b80d8
Add a new ComputeMaskedBits function that is APIntified. We'll slowly
convert things over to use this version. When its all done, we'll remove
the existing version.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@35018 91177308-0d34-0410-b5e6-96231b3b80d8
scalarrepl things down to elements, but mem2reg can't promote elements that
are memset/memcpy'd. Until then, the code is disabled "0 &&".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@34924 91177308-0d34-0410-b5e6-96231b3b80d8
would scan the entire loop body, then scan all users of instructions in the
loop, looking for users outside the loop. Now, since we know that the
loop is in LCSSA form, we know that any users outside the loop will be LCSSA
phi nodes. Just scan them.
This speeds up indvars significantly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@34898 91177308-0d34-0410-b5e6-96231b3b80d8
the order that instcombine processed instructions in the testcase. The end
result is that instcombine finished with:
define i16 @test1(i16 %a) {
%tmp = zext i16 %a to i32 ; <i32> [#uses=2]
%tmp21 = lshr i32 %tmp, 8 ; <i32> [#uses=1]
%tmp5 = shl i32 %tmp, 8 ; <i32> [#uses=1]
%tmp.upgrd.32 = or i32 %tmp21, %tmp5 ; <i32> [#uses=1]
%tmp.upgrd.3 = trunc i32 %tmp.upgrd.32 to i16 ; <i16> [#uses=1]
ret i16 %tmp.upgrd.3
}
which can't get matched as a bswap.
This patch makes instcombine more sophisticated about removing truncating
casts, allowing it to turn this into:
define i16 @test2(i16 %a) {
%tmp211 = lshr i16 %a, 8
%tmp52 = shl i16 %a, 8
%tmp.upgrd.323 = or i16 %tmp211, %tmp52
ret i16 %tmp.upgrd.323
}
which then matches as bswap. This fixes bswap.ll and implements
InstCombine/cast2.ll:test[12]. This also implements cast elimination of
add/sub.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@34870 91177308-0d34-0410-b5e6-96231b3b80d8
a value from the worklist required scanning the entire worklist to remove all
entries. We now use a combination map+vector to prevent duplicates from
happening and prevent the scan. This speeds up instcombine on a large file
from the llvm-gcc bootstrap from 189.7s to 4.84s in a debug build and from
5.04s to 1.37s in a release build.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@34848 91177308-0d34-0410-b5e6-96231b3b80d8
First round of ConstantRange changes. This makes all CR constructors use
only APInt and not use ConstantInt. Clients are adjusted accordingly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@34756 91177308-0d34-0410-b5e6-96231b3b80d8
the Transforms library. This reduces debug library size by 132 KB, debug
binary size by 376 KB, and reduces link time for llvm tools slightly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@33939 91177308-0d34-0410-b5e6-96231b3b80d8
Learn from sext and zext. The destination value falls within the range of the
source type.
Generalize properties regarding constant ints.
Get smarter about marking blocks as unreachable. If 1 >= 2 in order for this
block to execute, then it isn't reachable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@33889 91177308-0d34-0410-b5e6-96231b3b80d8
Make the Module's dependent library use a std::vector instead of SetVector
adjust #includes in .cpp files because SetVector.h is no longer included.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@33855 91177308-0d34-0410-b5e6-96231b3b80d8
This feature is needed in order to support shifts of more than 255 bits
on large integer types. This changes the syntax for llvm assembly to
make shl, ashr and lshr instructions look like a binary operator:
shl i32 %X, 1
instead of
shl i32 %X, i8 1
Additionally, this should help a few passes perform additional optimizations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@33776 91177308-0d34-0410-b5e6-96231b3b80d8
pessimization where instcombine can sink a load (good for code size) that
prevents an alloca from being promoted by mem2reg (bad for everything).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@33771 91177308-0d34-0410-b5e6-96231b3b80d8
This occurs in C++ code like:
#include <iostream>
#include <iterator>
int a[] = { 1, 2, 3, 4, 5 };
int main() {
using namespace std;
copy(a, a + sizeof(a)/sizeof(a[0]), ostream_iterator<int>(cout, "\n"));
return 0;
}
Before we would decide the loop trip count is:
sdiv (i32 sub (i32 ptrtoint (i32* getelementptr ([5 x i32]* @a, i32 0, i32 5) to i32), i32 ptrtoint ([5 x i32]* @a to i32)), i32 4)
Now we decide it is "5". Amazing.
This code will need to be refactored, but I'm doing that as a separate
commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@33665 91177308-0d34-0410-b5e6-96231b3b80d8
Fix initializeConstant, now initializeInt. Fixes major performance
bottleneck.
X == Y || X->DominatedBy(Y) is redundant. Remove the X == Y part.
Fix crasher in makeEqual where getOrInsertNode would add a new constant,
producing an NE relationship between the two members we're trying to make
equal. This now allows us to mark more BBs as unreachable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@33612 91177308-0d34-0410-b5e6-96231b3b80d8
changes: (1) don't special case for i1 any more, (2) use the new
TargetData::getTypeSizeInBits method to ensure source and dest are the
same bit width.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@33427 91177308-0d34-0410-b5e6-96231b3b80d8
We only want to do this if the src and destination types have the same
bit width. This patch uses TargetData::getTypeSizeInBits() instead of
making a special case for integer types and avoiding the transform if
they don't match.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@33414 91177308-0d34-0410-b5e6-96231b3b80d8
This is the final patch for this PR. It implements some minor cleanup
in the use of IntegerType, to wit:
1. Type::getIntegerTypeMask -> IntegerType::getBitMask
2. Type::Int*Ty changed to IntegerType* from Type*
3. ConstantInt::getType() returns IntegerType* now, not Type*
This also fixes PR1120.
Patch by Sheng Zhou.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@33370 91177308-0d34-0410-b5e6-96231b3b80d8
transform. Change some variable names so it is clear what is source and
what is dest of the cast. Also, add an assert to ensure that the integer
to integer case is asserting if the bitwidths are different. This prevents
illegal casts from being formed and catches bitwidth bugs sooner.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@33337 91177308-0d34-0410-b5e6-96231b3b80d8
because TargetData::getTypeSize() returns the same for i1 and i8. This fix
is not right for the full generality of bitwise types, but it fixes the
regression.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@33237 91177308-0d34-0410-b5e6-96231b3b80d8
the basic block and is stable across runs in gdb or valgrind.
Make Node::update handle edges which dominate and are tighter than
existing edges.
Replace makeEqual's "squeeze theorem" code. Fixes miscompilation.
Gate the calls to defToOps and opsToDef. Before this, we were getting IG
edges about values which weren't even defined in the dominated area. This
reduces the size of the IG by about half.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@33236 91177308-0d34-0410-b5e6-96231b3b80d8
rename Type::getIntegralTypeMask to Type::getIntegerTypeMask.
This makes naming much more consistent. For example, there are now no longer any
instances of IntegerType that are not considered isInteger! :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@33225 91177308-0d34-0410-b5e6-96231b3b80d8