I don't know what Dan wants to do with phi-merge-gep.ll, I'll let
him deal with it because instcombine may end up sinking these.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@85739 91177308-0d34-0410-b5e6-96231b3b80d8
not the max. This didn't matter until the previous patch because
instcombine would refuse to sink loads with differenting alignments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@85738 91177308-0d34-0410-b5e6-96231b3b80d8
phis, it didn't preserve the alignment of the load. This is a missed
optimization of the alignment is high and a miscompilation when the
alignment is low.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@85736 91177308-0d34-0410-b5e6-96231b3b80d8
MergeBlockIntoPredecessor. This makes SimplifyCFG slightly more aggressive,
and makes it unnecessary for LoopUnroll to have its own copy of this code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@85667 91177308-0d34-0410-b5e6-96231b3b80d8
ArraySize * ElementSize
ElementSize * ArraySize
ArraySize << log2(ElementSize)
ElementSize << log2(ArraySize)
Refactor isArrayMallocHelper and delete isSafeToGetMallocArraySize, so that there is only 1 copy of the malloc array determining logic.
Update users of getMallocArraySize() to not bother calling isArrayMalloc() as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@85421 91177308-0d34-0410-b5e6-96231b3b80d8
with multiple return values it inserts a PHI to merge them all together.
However, if the return values are all the same, it ends up with a pointless
PHI and this pointless PHI happens to really block SRoA from happening in
at least a silly C++ example written by Doug, but probably others. This
fixes rdar://7339069.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@85206 91177308-0d34-0410-b5e6-96231b3b80d8
Update all analysis passes and transforms to treat free calls just like FreeInst.
Remove RaiseAllocations and all its tests since FreeInst no longer needs to be raised.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@84987 91177308-0d34-0410-b5e6-96231b3b80d8
exact backedge taken count, when checking for infinite loops. This allows
it to delete loops with multiple exit conditions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@84952 91177308-0d34-0410-b5e6-96231b3b80d8
non-type-safe constant initializers. This sort of thing happens
quite a bit for 4-byte loads out of string constants, unions,
bitfields, and an interesting endianness check from sqlite, which
is something like this:
const int sqlite3one = 1;
# define SQLITE_BIGENDIAN (*(char *)(&sqlite3one)==0)
# define SQLITE_LITTLEENDIAN (*(char *)(&sqlite3one)==1)
# define SQLITE_UTF16NATIVE (SQLITE_BIGENDIAN?SQLITE_UTF16BE:SQLITE_UTF16LE)
all of these macros now constant fold away.
This implements PR3152 and is based on a patch started by Eli, but heavily
modified and extended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@84936 91177308-0d34-0410-b5e6-96231b3b80d8
Most changes are cleanup, but there is 1 correctness fix:
I fixed InstCombine so that the icmp is removed only if the malloc call is removed (which requires explicit removal because the Worklist won't DCE any calls since they can have side-effects).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@84772 91177308-0d34-0410-b5e6-96231b3b80d8
in the PHI's Basic Block. This uses a conservative approach, because we don't
have dominator info in instcombine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@84754 91177308-0d34-0410-b5e6-96231b3b80d8
When an incoming value for a PHI is updated, we must also updated all other
incoming values for the same BB to match, otherwise we create invalid PHIs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@84638 91177308-0d34-0410-b5e6-96231b3b80d8
when the invoke had multiple return values: it set the lattice value only on the
extractvalue.
This caused the invoke's lattice value to remain the default (undefined), and
later propagated to extractvalue's operand, which incorrectly introduces
undefined behavior.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@84637 91177308-0d34-0410-b5e6-96231b3b80d8
where a loop's header is being split and it has predecessors which are not
contained by the most-nested loop which contains the loop.
This fixes PR5235.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@84505 91177308-0d34-0410-b5e6-96231b3b80d8
allowing it to simplify the crazy constantexprs in the testcases
down to something sensible. This allows -std-compile-opts to
completely "devirtualize" the pointers to member functions in
the testcase from PR5176.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@84368 91177308-0d34-0410-b5e6-96231b3b80d8
Update testcases that rely on malloc insts being present.
Also prematurely remove MallocInst handling from IndMemRemoval and RaiseAllocations to help pass tests in this incremental step.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@84292 91177308-0d34-0410-b5e6-96231b3b80d8
input the the mul is a zext from bool, just that it is all zeros
other than the low bit. This fixes some phase ordering issues
that would cause us to miss some xforms in mul.ll when the worklist
is visited differently.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83794 91177308-0d34-0410-b5e6-96231b3b80d8
For now the metadata of sinked/hoisted instructions is still wrong, but that'll
be fixed when instructions will have debug metadata directly attached.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83786 91177308-0d34-0410-b5e6-96231b3b80d8
done by condprop, but do it in a much more general form. The
basic idea is that we can do a limited form of tail duplication
in the case when we have a branch on a phi. Moving the branch
up in to the predecessor block makes instruction selection
much easier and encourages chained jump threadings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83759 91177308-0d34-0410-b5e6-96231b3b80d8
from GVN, this also speeds it up, inserts fewer PHI nodes (see the
testcase) and allows it to remove more loads (due to fewer PHI nodes
standing in the way).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83746 91177308-0d34-0410-b5e6-96231b3b80d8
and that will make Caller too big to inline, see if it
might be better to inline Caller into its callers instead.
This situation is described in PR 2973, although I haven't
tried the specific case in SPASS.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83602 91177308-0d34-0410-b5e6-96231b3b80d8
out of it, and jump threading, condprop and gvn are now getting
most of the benefit. This was approved by Nicholas and Nicolas.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83390 91177308-0d34-0410-b5e6-96231b3b80d8
phi nodes. Make sure to phi translate from the right block.
This fixes a llvm-building-llvm failure on GVN-PRE.cpp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82970 91177308-0d34-0410-b5e6-96231b3b80d8
the PassManager code into a regular verifyAnalysis method.
Also, reorganize loop verification. Make the LoopPass infrastructure
call verifyLoop as needed instead of having LoopInfo::verifyAnalysis
check every loop in the function after each looop pass. Add a new
command-line argument, -verify-loop-info, to enable the expensive
full checking.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82952 91177308-0d34-0410-b5e6-96231b3b80d8
simple constants for the true/false value of the select. We now
do phi translation etc. This really fixes PR4895 :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82917 91177308-0d34-0410-b5e6-96231b3b80d8
that are phi nodes. Also tighten up FoldOpIntoPhi to treat constantexpr
operands to phis just like other variables, avoiding moving constantexpr
computations around.
Patch by Daniel Dunbar.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82913 91177308-0d34-0410-b5e6-96231b3b80d8
take into consideration that the result of an invoke is only valid in
the normal dest, not the unwind dest. This caused 'PHINode::hasConstantValue'
to return true in an invalid situation, causing mem2reg to delete a phi that
was actually needed. This caused a crash building 483.xalancbmk.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82491 91177308-0d34-0410-b5e6-96231b3b80d8
This doesn't kick in too much because of phi translation issues,
but this can be resolved in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82447 91177308-0d34-0410-b5e6-96231b3b80d8
from a piece of a large store when both are in the same block.
This allows clang to compile the testcase in PR4216 to this code:
_test_bitfield:
movl 4(%esp), %eax
movl %eax, %ecx
andl $-65536, %ecx
orl $32962, %eax
andl $40186, %eax
orl %ecx, %eax
ret
This is not ideal, but is a whole lot better than the code produced
by llvm-gcc:
_test_bitfield:
movw $-32574, %ax
orw 4(%esp), %ax
andw $-25350, %ax
movw %ax, 4(%esp)
movw 7(%esp), %cx
shlw $8, %cx
movzbl 6(%esp), %edx
orw %cx, %dx
movzwl %dx, %ecx
shll $16, %ecx
movzwl %ax, %eax
orl %ecx, %eax
ret
and dramatically better than that produced by gcc 4.2:
_test_bitfield:
pushl %ebx
call L3
"L00000000001$pb":
L3:
popl %ebx
movl 8(%esp), %eax
leal 0(,%eax,4), %edx
sarb $7, %dl
movl %eax, %ecx
andl $7168, %ecx
andl $-7201, %ebx
movzbl %dl, %edx
andl $1, %edx
sall $5, %edx
orl %ecx, %ebx
orl %edx, %ebx
andl $24, %eax
andl $-58336, %ebx
orl %eax, %ebx
orl $32962, %ebx
movl %ebx, %eax
popl %ebx
ret
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82439 91177308-0d34-0410-b5e6-96231b3b80d8
early for the stated reasons: this allows it to find more
equivalences and depend less on code layout.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82404 91177308-0d34-0410-b5e6-96231b3b80d8
so that nonlocal and partially redundant loads can use it as well.
The testcase shows examples of craziness this can handle. This triggers
*many* times in 176.gcc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82403 91177308-0d34-0410-b5e6-96231b3b80d8
(and load -> load) when the base pointers must alias but when
they are different types. This occurs very very frequently in
176.gcc and other code that uses bitfields a lot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82399 91177308-0d34-0410-b5e6-96231b3b80d8
more than one phi, since that leads to higher register pressure on
entry to the phi. This is especially problematic when the phi is in
a loop header, as it increases register pressure throughout the loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@81993 91177308-0d34-0410-b5e6-96231b3b80d8
input filename so that opt doesn't print the input filename in the
output so that grep lines in the tests don't unintentionally match
strings in the input filename.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@81537 91177308-0d34-0410-b5e6-96231b3b80d8
how to fold notionally-out-of-bounds array getelementptr indices instead
of just doing these in lib/Analysis/ConstantFolding.cpp, because it can
be done in a fairly general way without TargetData, and because not all
constants are visited by lib/Analysis/ConstantFolding.cpp. This enables
more constant folding.
Also, set the "inbounds" flag when the getelementptr indices are
one-past-the-end.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@81483 91177308-0d34-0410-b5e6-96231b3b80d8
within the notional bounds of the static type of the getelementptr (which
is not the same as "inbounds") from GlobalOpt into a utility routine,
and use it in ConstantFold.cpp to check whether there are any mis-behaved
indices.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@81478 91177308-0d34-0410-b5e6-96231b3b80d8
loop exit edge -- new PHIs may be needed not only for the additional
splits that are made to preserve LoopSimplify form, but also for the
original split. Factor out the code that inserts new PHIs so that it
can be used for both. Remove LoopRotation.cpp's code for manually
updating LCSSA form, as it is now redundant. This fixes PR4934.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@81363 91177308-0d34-0410-b5e6-96231b3b80d8
that get created during loop unswitching, and fix SplitBlockPredecessors'
LCSSA updating code to create new PHIs instead of trying to just move
existing ones.
Also, optimize Loop::verifyLoop, since it gets called a lot. Use
searches on a sorted list of blocks instead of calling the "contains"
function, as is done in other places in the Loop class, since "contains"
does a linear search. Also, don't call verifyLoop from LoopSimplify or
LCSSA, as the PassManager is already calling verifyLoop as part of
LoopInfo's verifyAnalysis.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@81221 91177308-0d34-0410-b5e6-96231b3b80d8
extractelement operations into a bitcast of the pointer,
then a gep, then a scalar load. Disable this when the vector
only has one element, because it leads to infinite loops in
instcombine (PR4908).
This transformation seems like a really bad idea to me, as it
will likely disable CSE of vector load/stores etc and can be
better done in the code generator when profitable. This
goes all the way back to the first days of packed types,
r25299 specifically.
I'll let those people who care about the performance of vector
code decide what to do with this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@81185 91177308-0d34-0410-b5e6-96231b3b80d8
- I'd appreciate it if someone else eyeballs my changes to make sure I captured
the intent of the test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@81083 91177308-0d34-0410-b5e6-96231b3b80d8
instead of a bool argument, and to do the dominator check itself.
This makes it eaiser to use when DominatorTree information is
available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@80920 91177308-0d34-0410-b5e6-96231b3b80d8
simplifylibcalls optimization is thus valid for C++ but not C.
It's not important enough to worry about for C++ apps, so just
remove it.
rdar://7191924
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@80887 91177308-0d34-0410-b5e6-96231b3b80d8
and we get the original pointer type. This doesn't mean that we're
at the first pointer being indexed. Correct the predicate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@80762 91177308-0d34-0410-b5e6-96231b3b80d8
don't alias. Remove an old and poorly reduced testcase that fails
with this transform for reasons unrelated to the original test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@80693 91177308-0d34-0410-b5e6-96231b3b80d8
for sanity. This didn't turn up any bugs.
Change CallGraphNode to maintain its "callsite" information in the
call edges list as a WeakVH instead of as an instruction*. This fixes
a broad class of dangling pointer bugs, and makes CallGraph have a number
of useful invariants again. This fixes the class of problem indicated
by PR4029 and PR3601.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@80663 91177308-0d34-0410-b5e6-96231b3b80d8
changes: SimplifyDemandedBits can't use the builder yet because it
has the wrong insertion point. This fixes a crash building
MultiSource/Benchmarks/PAQ8p
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@80537 91177308-0d34-0410-b5e6-96231b3b80d8
indirect function pointer, inline it, then go to delete the body.
The problem is that the callgraph had other references to the function,
though the inliner had no way to know it, so we got a dangling pointer
and an invalid iterator out of the deal.
The fix to this is pretty simple: stop the inliner from deleting the
function by knowing that there are references to it. Do this by making
CallGraphNodes contain a refcount. This requires moving deletion of
available_externally functions to the module-level cleanup sweep where
it belongs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@80533 91177308-0d34-0410-b5e6-96231b3b80d8
is itself a bitcast. Since we have gep(bitcast(bitcast(y))) in this
case, just wait for the two bitcasts to get zapped. This prevents
instcombine from confusing some aliasing stuff, and allows it to
directly eliminate the load in the testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@80508 91177308-0d34-0410-b5e6-96231b3b80d8
calls into a function and if the calls bring in arrays, try to merge
them together to reduce stack size. For example, in the testcase
we'd previously end up with 4 allocas, now we end up with 2 allocas.
As described in the comments, this is not really the ideal solution
to this problem, but it is surprisingly effective. For example, on
176.gcc, we end up eliminating 67 arrays at "gccas" time and another
24 at "llvm-ld" time.
One piece of concern that I didn't look into: at -O0 -g with
forced inlining this will almost certainly result in worse debug
info. I think this is acceptable though given that this is a case
of "debugging optimized code", and we don't want debug info to
prevent the optimizer from doing things anyway.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@80215 91177308-0d34-0410-b5e6-96231b3b80d8
sinking code, since they are special. If the loop preheader happens
to be the entry block of a function, don't sink static allocas
out of it. This fixes PR4775.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@80010 91177308-0d34-0410-b5e6-96231b3b80d8
This change speeds up llvm-gcc by more then 6% at "-O0 -g" (measured by compiling InstructionCombining.cpp!)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@79977 91177308-0d34-0410-b5e6-96231b3b80d8
array member of a struct, it's possible to land in an arbitrary position
inside that struct, such that attempting to find further getelementptr
indices will fail. In such cases, folding cannot be done.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@79485 91177308-0d34-0410-b5e6-96231b3b80d8
static extents of the static array type, it causes GlobalOpt and
other passes to be more conservative. This canonicalization also
allows the constant folder to add "inbounds" to GEPs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@79440 91177308-0d34-0410-b5e6-96231b3b80d8
TargetData is not present. It still uses TargetData when available.
This generalization also fixed some limitations in the TargetData
case; the attached testcase covers this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@79344 91177308-0d34-0410-b5e6-96231b3b80d8
unfoldable references to a PHI node in the block being folded, and disable
the transformation in that case. The correct transformation of such PHI
nodes depends on whether BB dominates Succ, and dominance is expensive
to compute here. (Alternatively, it's possible to check whether any
uses are live, but that's also essentially a dominance calculation.
Another alternative is to use reg2mem, but it probably isn't a good idea to
use that in simplifycfg.)
Also, remove some incorrect code from CanPropagatePredecessorsForPHIs
which is made unnecessary with this patch: it didn't consider the case
where a PHI node in BB has multiple uses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@79174 91177308-0d34-0410-b5e6-96231b3b80d8
the new load by the old load instead of by the extract element because
a store could have occurred between the load and extract element.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@78891 91177308-0d34-0410-b5e6-96231b3b80d8
few places in InstCombine to use it, to fix problems handling pointer
types. This fixes the recent llvm-gcc bootstrap error.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@78005 91177308-0d34-0410-b5e6-96231b3b80d8
also apply to vectors. This allows us to compile this:
#include <emmintrin.h>
__m128i a(__m128 a, __m128 b) { return a==a & b==b; }
__m128i b(__m128 a, __m128 b) { return a!=a | b!=b; }
to:
_a:
cmpordps %xmm1, %xmm0
ret
_b:
cmpunordps %xmm1, %xmm0
ret
with clang instead of to a ton of horrible code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@76863 91177308-0d34-0410-b5e6-96231b3b80d8
with negative tests: this test wasn't checking what it thought it was
because it was grepping .bc, not .ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@76861 91177308-0d34-0410-b5e6-96231b3b80d8
Getelementptrs that are defined to wrap are virtually useless to
optimization, and getelementptrs that are undefined on any kind
of overflow are too restrictive -- it's difficult to ensure that
all intermediate addresses are within bounds. I'm going to take
a different approach.
Remove a few optimizations that depended on this flag.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@76437 91177308-0d34-0410-b5e6-96231b3b80d8
insertelement/extractelement.
I'm not entirely sure this is precisely what we want to do: should we
prefer bitcast(insertelement) or insertelement(bitcast)? Similarly. should we
prefer extractelement(bitcast) or bitcast(extractelement)?
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@76345 91177308-0d34-0410-b5e6-96231b3b80d8
operands; it's possible to end up with a constant-foldable operand to
most instructions, even those which can't trap.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75845 91177308-0d34-0410-b5e6-96231b3b80d8
the operands have pointer type, so that the resulting type matches
the original SCEV type, and so that unnecessary ptrtoints are
avoided in common cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75680 91177308-0d34-0410-b5e6-96231b3b80d8
(I think it's reasonably clear that we want to have a canonical form for
constructs like this; if anyone thinks that a select is not the best
canonical form, please tell me.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75531 91177308-0d34-0410-b5e6-96231b3b80d8
check for avoiding re-analyzing a widening cast needed to happen
earlier, as getSCEV itself may result in a isLoopGuardedByCond query.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75511 91177308-0d34-0410-b5e6-96231b3b80d8
so that all code paths get it. PR4256 was about a case where the
phi translation loop would find all preds in the Visited cache, so
it could get by without re-sorting the NonLocalPointerDeps cache.
Fix this by resorting it earlier, there is no reason not to do this.
This patch inspired by Jakub Staszak's patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@75476 91177308-0d34-0410-b5e6-96231b3b80d8