loop-variant components, adds must be inserted after the increment.
Keep track of the increment position for this case, and insert
these adds in the correct location.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94110 91177308-0d34-0410-b5e6-96231b3b80d8
operands exceeds the number of registers used in the initial
solution, as that wouldn't lead to a profitable solution anyway.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94107 91177308-0d34-0410-b5e6-96231b3b80d8
ValueMapper.cpp ends up calling an out of line
__ZNK4llvm12PATypeHolder3getEv, which is a template and llvm-config
arbitrarily picks the one in libipo. This sucks, but
keeping the #include is a reasonable workaround.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94103 91177308-0d34-0410-b5e6-96231b3b80d8
This new version is much more aggressive about doing "full" reduction in
cases where it reduces register pressure, and also more aggressive about
rewriting induction variables to count down (or up) to zero when doing so
reduces register pressure.
It currently uses fairly simplistic algorithms for finding reuse
opportunities, but it introduces a new framework that allows it to combine
multiple strategies at once to form hybrid solutions, instead of doing
all full-reduction or all base+index.
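As a hedged sketch (hypothetical C, not code from this patch; the function
names are made up), this is the kind of rewrite meant by counting down to
zero:
/* Counting up keeps 'n' live across the loop for the exit compare. */
void scale(float *a, int n) {
  for (int i = 0; i < n; ++i)
    a[i] *= 2.0f;
}
/* Counting down to zero tests against zero instead, which most targets
   get for free from the decrement (equivalent for n >= 0). */
void scale_down(float *a, int n) {
  float *p = a;
  for (int i = n; i != 0; --i)
    *p++ *= 2.0f;
}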
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94061 91177308-0d34-0410-b5e6-96231b3b80d8
No functional change except for the previously forgotten test for
InlineLimit.getNumOccurrences() == 0 in the CurrentThreshold2 calculation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94007 91177308-0d34-0410-b5e6-96231b3b80d8
than the scaled register. This makes it more likely that subsequent
AddrModeMatcher queries will match the new address the same way as the
old, instead of accidentally matching what had been the base register
as the new scaled register, and then failing to match the scaled register.
This fixes some problems with address-mode sinking multiple muls into a
block, which will be a lot more common with some upcoming
LoopStrengthReduction changes.
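For context, a hedged sketch of the kind of address this affects
(hypothetical C; the matcher itself works on IR, and the registers in the
comment are only an example):
/* Both loads can use a base + 4*index addressing mode. Matching them
   consistently (pointer as the base register, index as the scaled
   register) lets the second query decompose the address the same way
   as the first, e.g. (%rdi,%rdx,4) and (%rsi,%rdx,4) on x86-64. */
int sum2(int *a, int *b, long i) {
  return a[i] + b[i];
}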
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93935 91177308-0d34-0410-b5e6-96231b3b80d8
are the same. I had already fixed a similar problem where the source and
destination were different bitcasts derived from the same alloca, but the
previous fix still did not handle the case where both operands are exactly
the same value. Radar 7552893.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93848 91177308-0d34-0410-b5e6-96231b3b80d8
aggressive changed the canonical form from sext(trunc(x)) to ashr(lshr(x)),
make sure to transform a couple more things into that canonical form,
and catch a case where we missed turning zext/shl/ashr into a single sext.
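A hedged illustration of the zext/shl/ashr case (hypothetical C, assuming
32-bit int and an arithmetic right shift of signed values):
/* zext the byte, shift it to the top, then arithmetic-shift it back
   down: the net effect is a single sign extension of the byte. */
int sext_byte(unsigned char c) {
  unsigned x = c;               /* zext i8 -> i32 */
  return (int)(x << 24) >> 24;  /* shl 24 + ashr 24 == sext i8 -> i32 */
}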
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93787 91177308-0d34-0410-b5e6-96231b3b80d8
added to the FSub version. However, the original version of this xform guarded
against doing this for floating point (!Op0->getType()->isFPOrFPVector()).
This is causing LLVM to perform incorrect xforms for code like:
void func(double *rhi, double *rlo, double xh, double xl, double yh, double yl) {
  double mh, ml;
  double c = 134217729.0;
  double up, u1, u2, vp, v1, v2;
  up = xh*c;
  u1 = (xh - up) + up;
  u2 = xh - u1;
  vp = yh*c;
  v1 = (yh - vp) + vp;
  v2 = yh - v1;
  mh = xh*yh;
  ml = (((u1*v1 - mh) + (u1*v2)) + (u2*v1)) + (u2*v2);
  ml += xh*yl + xl*yh;
  *rhi = mh + ml;
  *rlo = (mh - (*rhi)) + ml;
}
The last line was optimized away, but *rlo is intended to be the difference
between the infinitely precise result of mh + ml and that result after it
has been rounded to double precision; treating the subtraction as exact
incorrectly folds the expression to zero.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93369 91177308-0d34-0410-b5e6-96231b3b80d8
in JT.
2) When cloning blocks for PHI or xor conditions, use
instsimplify to simplify the code as we go. This allows us to
squish common cases early in JT which opens up opportunities for
subsequent iterations, and allows it to completely simplify the
testcase.
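As a hedged sketch of the sort of source pattern involved (hypothetical C;
the names are made up):
/* After jump threading clones the block for each predecessor, the phi
   feeding the xor is a constant in each copy, so instsimplify folds
   1^t to !t and 0^t to t, and each clone branches directly on t. */
int f(int c, int t) {
  _Bool flag;
  if (c)
    flag = 1;
  else
    flag = 0;             /* flag becomes a phi of i1 constants */
  if (flag ^ (_Bool)t)    /* branch condition is a xor with the phi */
    return 1;
  return 0;
}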
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93253 91177308-0d34-0410-b5e6-96231b3b80d8
condition is a xor with a phi node. This eliminates nonsense
like this from 176.gcc in several places:
LBB166_84:
testl %eax, %eax
- setne %al
- xorb %cl, %al
- notb %al
- testb $1, %al
- je LBB166_85
+ je LBB166_69
+ jmp LBB166_85
This is rdar://7391699
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93221 91177308-0d34-0410-b5e6-96231b3b80d8
codegen has no apparent problem with the trunc version of this, because it turns
into a simple subreg idiom
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93202 91177308-0d34-0410-b5e6-96231b3b80d8
trunc has multiple uses. Codegen is not able to coalesce the subreg case
correctly and so this leads to higher register pressure and spilling (see PR5997).
This speeds up 256.bzip2 from 8.60 -> 8.04s on my machine, ~7%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93200 91177308-0d34-0410-b5e6-96231b3b80d8
BitsToClear case. This allows it to promote expressions which have an
and/or/xor after the lshr, promoting cases like test2 (from PR4216)
and test3 (a random example extracted from a spec benchmark).
clang now compiles the code in PR4216 into:
_test_bitfield: ## @test_bitfield
movl %edi, %eax
orl $194, %eax
movl $4294902010, %ecx
andq %rax, %rcx
orl $32768, %edi
andq $39936, %rdi
movq %rdi, %rax
orq %rcx, %rax
ret
instead of:
_test_bitfield: ## @test_bitfield
movl %edi, %eax
orl $194, %eax
movl $4294902010, %ecx
andq %rax, %rcx
shrl $8, %edi
orl $128, %edi
shlq $8, %rdi
andq $39936, %rdi
movq %rdi, %rax
orq %rcx, %rax
ret
which is still not great, but is progress.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93145 91177308-0d34-0410-b5e6-96231b3b80d8
new BitsToClear result which allows us to start promoting
expressions that end with a lshr-by-constant. This is
conservatively correct and better than what we had before
(see testcases) but still needs to be extended further.
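A hedged example of the lshr-by-constant case (hypothetical C, assuming a
32-bit unsigned int):
/* The zext can be hoisted through the shift because the lshr leaves
   the top 8 bits of the 32-bit value known clear: no 'and' is needed. */
unsigned long long widen_shift(unsigned x) {
  return (unsigned long long)(x >> 8);  /* == ((unsigned long long)x) >> 8 */
}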
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93144 91177308-0d34-0410-b5e6-96231b3b80d8
the zext dest type. This allows us to handle test52/53 in cast.ll,
and allows llvm-gcc to generate much better code for PR4216 in -m64
mode:
_test_bitfield: ## @test_bitfield
orl $32962, %edi
movl %edi, %eax
andl $-25350, %eax
ret
This also fixes a bug handling vector extends, ensuring that the
mask produced is a vector constant, not an integer constant.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93127 91177308-0d34-0410-b5e6-96231b3b80d8
elimination of a sign extend to be a win, which simplifies
the client of CanEvaluateSExtd, and allows us to eliminate
more casts (examples taken from real code).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93109 91177308-0d34-0410-b5e6-96231b3b80d8
lshr+ashr instead of trunc+sext. We want to avoid type
conversions whenever possible, since it is easier to codegen expressions
without truncates and extensions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93107 91177308-0d34-0410-b5e6-96231b3b80d8
bits known clear in the result and don't care about the # casts
eliminated. TD is also dead but keeping it for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93098 91177308-0d34-0410-b5e6-96231b3b80d8
1) don't try to optimize a sext or zext that is only used by a trunc, let
the trunc get optimized first. This avoids some pointless effort in
some common cases since instcombine scans down a block in the first pass.
2) Change the cost model for zext elimination to consider an 'and' cheaper
than a zext. This allows us to do it more aggressively, and for the next
patch to simplify the code quite a bit.
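A hedged illustration of the new cost model (hypothetical C, assuming a
32-bit unsigned int):
/* Eliminating the zext round-trip through i8 costs one mask in the
   wide type; the new model counts that 'and' as cheaper than the
   trunc+zext pair it replaces. */
unsigned low_byte(unsigned x) {
  return (unsigned)(unsigned char)x;  /* zext(trunc(x)) ... */
}
unsigned low_byte_masked(unsigned x) {
  return x & 0xff;                    /* ... becomes a single 'and' */
}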
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93097 91177308-0d34-0410-b5e6-96231b3b80d8
commonIntCastTransforms into the callers, eliminating a switch,
and allowing the static predicate methods to be moved down to
live next to the corresponding function. No functionality
change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93089 91177308-0d34-0410-b5e6-96231b3b80d8
that feeds into a zext, similar to the patch I did yesterday for sext.
There is a lot of room for extension beyond this patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92962 91177308-0d34-0410-b5e6-96231b3b80d8
to an element of a vector in a static ctor) which occurs with an
unrelated patch I'm testing. Annoyingly, EvaluateStoreInto basically
does exactly the same stuff as InsertElement constant folding, but it
now handles vectors, and you can't insertelement into a vector. It
would be 'really nice' if GEP into a vector were not legal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92889 91177308-0d34-0410-b5e6-96231b3b80d8
phi nodes when deciding which pointers point to local memory.
I actually checked long ago how useful this is, and it isn't
very: it hardly ever fires in the testsuite, but since Chris
wants it here it is!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92836 91177308-0d34-0410-b5e6-96231b3b80d8
memcpy, memset and other intrinsics that only access their arguments
to be readnone if the intrinsic's arguments all point to local memory.
This improves the testcase in the README to readonly, but it could in
theory be made readnone; however, that would involve more sophisticated
analysis that looks through the memcpy.
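A hedged sketch of the situation described (hypothetical C; the function is
made up):
#include <string.h>
/* The memcpy reads and writes only stack-local arrays, so no
   caller-visible memory is touched and the enclosing function can be
   marked readnone. */
int sum_copy(int x) {
  int src[2] = { x, x + 1 };
  int dst[2];
  memcpy(dst, src, sizeof dst);
  return dst[0] + dst[1];
}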
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92829 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, instcombine would only promote an expression tree to
the larger type if doing so eliminated two casts. This is because
it otherwise had to manually perform the sign extend after the promoted
expression tree with two shifts. Now, we keep track of whether the result of
the computation is going to be properly sign extended already. If
so, we can unconditionally promote the expression, which allows us
to zap more sext's.
This implements rdar://6598839 (aka gcc pr38751)
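For reference, the "two shifts" manual sign extend looks like this
(hypothetical C, assuming 32-bit int and an arithmetic right shift of
signed values):
/* Sign-extending the low 8 bits of a 32-bit value in place; the patch
   avoids emitting this pair when the result is already known to be
   properly sign extended. */
int sext_low8(int x) {
  return (int)((unsigned)x << 24) >> 24;
}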
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92815 91177308-0d34-0410-b5e6-96231b3b80d8
The only difference is that EvaluateInDifferentType checks to ensure
they are profitable before doing them :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92788 91177308-0d34-0410-b5e6-96231b3b80d8
RecursivelyDeleteDeadPHINode, and DeleteDeadPHIs return a flag
indicating whether they made any changes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92732 91177308-0d34-0410-b5e6-96231b3b80d8
dyn_castNotVal in the X+~X transform. dyn_castNotVal is
dramatic overkill for what the xform needed.
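The identity behind the transform is simple to state directly (plain C):
/* ~x == -x - 1 in two's complement, so x + ~x is always all-ones. */
unsigned all_ones(unsigned x) {
  return x + ~x;  /* folds to 0xFFFFFFFF for a 32-bit unsigned */
}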
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92704 91177308-0d34-0410-b5e6-96231b3b80d8
Eliminate the 'AddMaskingAnd' transformation, it is redundant with this
more general code right below it:
// A+B --> A|B iff A and B have no bits set in common.
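A hedged illustration of that rule (hypothetical C):
/* a and b are masked to disjoint bit ranges, so the addition can never
   carry and a + b == a | b. */
unsigned combine(unsigned x, unsigned y) {
  unsigned a = x & 0xFFFF0000u;
  unsigned b = y & 0x0000FFFFu;
  return a + b;  /* foldable to a | b */
}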
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92693 91177308-0d34-0410-b5e6-96231b3b80d8
that got instantiated. There is no reason for instcombine
to try this hard for simple associative optimizations. Next
up, eliminate the template completely.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92692 91177308-0d34-0410-b5e6-96231b3b80d8
when doing this transform if the GEP is not inbounds. No testcase because
it is very difficult to trigger this: instcombine already canonicalizes
GEP indices to pointer size, so it relies on specific permutations of the
instcombine worklist.
Thanks to Duncan for pointing this possible problem out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92495 91177308-0d34-0410-b5e6-96231b3b80d8