where a loop's header is being split and it has predecessors which are not
contained by the most-nested loop which contains the loop.
This fixes PR5235.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@84505 91177308-0d34-0410-b5e6-96231b3b80d8
Update testcases that rely on malloc insts being present.
Also prematurely remove MallocInst handling from IndMemRemoval and RaiseAllocations to help pass tests in this incremental step.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@84292 91177308-0d34-0410-b5e6-96231b3b80d8
identifying the malloc as a non-array malloc. This broke GlobalOpt's optimization of stores of mallocs
to global variables.
The fix is to classify malloc's into 3 categories:
1. non-array mallocs
2. array mallocs whose array size can be determined
3. mallocs that cannot be determined to be of type 1 or 2 and cannot be optimized
getMallocArraySize() returns NULL for category 3, and all users of this function must avoid their
malloc optimization if this function returns NULL.
Eventually, currently unexpected codegen for computing the malloc's size argument will be supported in
isArrayMalloc() and getMallocArraySize(), extending malloc optimizations to those examples.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@84199 91177308-0d34-0410-b5e6-96231b3b80d8
don't bother every time going around the main worklist. This speeds up a
release-asserts opt -std-compile-opts on 403.gcc by about 4% (1.5s). It
seems to speed up the most expensive instances of instcombine by ~10%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@84171 91177308-0d34-0410-b5e6-96231b3b80d8
instruction (which disqualifies stores, unreachable, etc) and at least the
first operand is a constant. This filters out a lot of obvious cases that
can't be folded. Also, switch the IRBuilder to a TargetFolder, which tries
harder.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@84170 91177308-0d34-0410-b5e6-96231b3b80d8
BasicBlocks, so that it doesn't blindly procede in the presence of
large individual BasicBlocks. This addresses a class of code-size
expansion problems.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83992 91177308-0d34-0410-b5e6-96231b3b80d8
it to visit instructions from the start of the function to the
end of the function in the first path. This greatly speeds up
some pathological cases (e.g. PR5150).
Try #3, this time with some unneeded debug info stuff removed
which was causing dead pointers to be added to the worklist.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83818 91177308-0d34-0410-b5e6-96231b3b80d8
it to visit instructions from the start of the function to the
end of the function in the first path. This greatly speeds up
some pathological cases (e.g. PR5150).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83814 91177308-0d34-0410-b5e6-96231b3b80d8
into a shuffle even if it was used by another insertelement. If the
visitation order of instcombine was wrong, this would turn a chain of
insertelements into a chain of shufflevectors, which was quite painful.
Since CollectShuffleElements handles these cases, the code can just
be nuked.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83810 91177308-0d34-0410-b5e6-96231b3b80d8
input the the mul is a zext from bool, just that it is all zeros
other than the low bit. This fixes some phase ordering issues
that would cause us to miss some xforms in mul.ll when the worklist
is visited differently.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83794 91177308-0d34-0410-b5e6-96231b3b80d8
it to visit instructions from the start of the function to the
end of the function in the first path. This greatly speeds up
some pathological cases (e.g. PR5150).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83790 91177308-0d34-0410-b5e6-96231b3b80d8
For now the metadata of sinked/hoisted instructions is still wrong, but that'll
be fixed when instructions will have debug metadata directly attached.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83786 91177308-0d34-0410-b5e6-96231b3b80d8
done by condprop, but do it in a much more general form. The
basic idea is that we can do a limited form of tail duplication
in the case when we have a branch on a phi. Moving the branch
up in to the predecessor block makes instruction selection
much easier and encourages chained jump threadings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83759 91177308-0d34-0410-b5e6-96231b3b80d8
inserted only once, just use vector. Don't compute ExitBlocks unless we
need it, change std::sort to array_pod_sort.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83747 91177308-0d34-0410-b5e6-96231b3b80d8
from GVN, this also speeds it up, inserts fewer PHI nodes (see the
testcase) and allows it to remove more loads (due to fewer PHI nodes
standing in the way).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83746 91177308-0d34-0410-b5e6-96231b3b80d8
not just at the end. Add a big comment explaining when this could
be useful (which never happens for jump threading).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83741 91177308-0d34-0410-b5e6-96231b3b80d8
DemoteRegToStack. This makes it more efficient (because it isn't
creating a ton of load/stores that are eventually removed by a later
mem2reg), and more slightly more effective (because those load/stores
don't get in the way of threading).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83706 91177308-0d34-0410-b5e6-96231b3b80d8
constants used in inlining heuristics (especially
those used in more than one file). No functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83675 91177308-0d34-0410-b5e6-96231b3b80d8
and that will make Caller too big to inline, see if it
might be better to inline Caller into its callers instead.
This situation is described in PR 2973, although I haven't
tried the specific case in SPASS.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83602 91177308-0d34-0410-b5e6-96231b3b80d8
to declare that they preserve other passes without needing to pull in
additional header file or library dependencies. Convert MachineFunctionPass
and CodeGenLICM to make use of this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83555 91177308-0d34-0410-b5e6-96231b3b80d8
already on the worklist, and print Visited when an instruction is about to be
visited. Net, on one input, this reduced the output size by at least 9x.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83510 91177308-0d34-0410-b5e6-96231b3b80d8
for inlining.
When MallocInst goes away this code will be subsumed as part of
calls and work just fine...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83434 91177308-0d34-0410-b5e6-96231b3b80d8
out of it, and jump threading, condprop and gvn are now getting
most of the benefit. This was approved by Nicholas and Nicolas.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83390 91177308-0d34-0410-b5e6-96231b3b80d8
the new predicates I added) instead of going through a context and doing a
pointer comparison. Besides being cheaper, this allows a smart compiler
to turn the if sequence into a switch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83297 91177308-0d34-0410-b5e6-96231b3b80d8
phi nodes. Make sure to phi translate from the right block.
This fixes a llvm-building-llvm failure on GVN-PRE.cpp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82970 91177308-0d34-0410-b5e6-96231b3b80d8
simple constants for the true/false value of the select. We now
do phi translation etc. This really fixes PR4895 :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82917 91177308-0d34-0410-b5e6-96231b3b80d8
that are phi nodes. Also tighten up FoldOpIntoPhi to treat constantexpr
operands to phis just like other variables, avoiding moving constantexpr
computations around.
Patch by Daniel Dunbar.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82913 91177308-0d34-0410-b5e6-96231b3b80d8
This doesn't kick in too much because of phi translation issues,
but this can be resolved in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82447 91177308-0d34-0410-b5e6-96231b3b80d8
from a piece of a large store when both are in the same block.
This allows clang to compile the testcase in PR4216 to this code:
_test_bitfield:
movl 4(%esp), %eax
movl %eax, %ecx
andl $-65536, %ecx
orl $32962, %eax
andl $40186, %eax
orl %ecx, %eax
ret
This is not ideal, but is a whole lot better than the code produced
by llvm-gcc:
_test_bitfield:
movw $-32574, %ax
orw 4(%esp), %ax
andw $-25350, %ax
movw %ax, 4(%esp)
movw 7(%esp), %cx
shlw $8, %cx
movzbl 6(%esp), %edx
orw %cx, %dx
movzwl %dx, %ecx
shll $16, %ecx
movzwl %ax, %eax
orl %ecx, %eax
ret
and dramatically better than that produced by gcc 4.2:
_test_bitfield:
pushl %ebx
call L3
"L00000000001$pb":
L3:
popl %ebx
movl 8(%esp), %eax
leal 0(,%eax,4), %edx
sarb $7, %dl
movl %eax, %ecx
andl $7168, %ecx
andl $-7201, %ebx
movzbl %dl, %edx
andl $1, %edx
sall $5, %edx
orl %ecx, %ebx
orl %edx, %ebx
andl $24, %eax
andl $-58336, %ebx
orl %eax, %ebx
orl $32962, %ebx
movl %ebx, %eax
popl %ebx
ret
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82439 91177308-0d34-0410-b5e6-96231b3b80d8
so that nonlocal and partially redundant loads can use it as well.
The testcase shows examples of craziness this can handle. This triggers
*many* times in 176.gcc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82403 91177308-0d34-0410-b5e6-96231b3b80d8
(and load -> load) when the base pointers must alias but when
they are different types. This occurs very very frequently in
176.gcc and other code that uses bitfields a lot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82399 91177308-0d34-0410-b5e6-96231b3b80d8
In getMallocArraySize(), fix bug in the case that array size is the product of 2 constants.
Extend isArrayMalloc() and getMallocArraySize() to handle case where malloc is used as char array.
Ensure that ArraySize in LowerAllocations::runOnBasicBlock() is correct type.
Extend Instruction::isSafeToSpeculativelyExecute() to handle malloc calls.
Add verification for malloc calls.
Reviewed by Dan Gohman.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82257 91177308-0d34-0410-b5e6-96231b3b80d8
constants out of loops. These aren't covered by the regular LICM
pass, because in LLVM IR constants don't require separate
instructions. They're not always covered by the MachineLICM pass
either, because it doesn't know how to unfold folded constant-pool
loads. This is somewhat experimental at this point, and off by
default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82076 91177308-0d34-0410-b5e6-96231b3b80d8
phis, similar to the FoldPHIArgGEPIntoPHI change.
Also, delete some comments that don't reflect the code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82053 91177308-0d34-0410-b5e6-96231b3b80d8
more than one phi, since that leads to higher register pressure on
entry to the phi. This is especially problematic when the phi is in
a loop header, as it increases register pressure throughout the loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@81993 91177308-0d34-0410-b5e6-96231b3b80d8
argpromote to avoid invalidating an iterator. This fixes PR4977.
All clang tests now pass with expensive checking (on my system
at least).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@81843 91177308-0d34-0410-b5e6-96231b3b80d8
within the notional bounds of the static type of the getelementptr (which
is not the same as "inbounds") from GlobalOpt into a utility routine,
and use it in ConstantFold.cpp to check whether there are any mis-behaved
indices.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@81478 91177308-0d34-0410-b5e6-96231b3b80d8
loop exit edge -- new PHIs may be needed not only for the additional
splits that are made to preserve LoopSimplify form, but also for the
original split. Factor out the code that inserts new PHIs so that it
can be used for both. Remove LoopRotation.cpp's code for manually
updating LCSSA form, as it is now redundant. This fixes PR4934.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@81363 91177308-0d34-0410-b5e6-96231b3b80d8
that get created during loop unswitching, and fix SplitBlockPredecessors'
LCSSA updating code to create new PHIs instead of trying to just move
existing ones.
Also, optimize Loop::verifyLoop, since it gets called a lot. Use
searches on a sorted list of blocks instead of calling the "contains"
function, as is done in other places in the Loop class, since "contains"
does a linear search. Also, don't call verifyLoop from LoopSimplify or
LCSSA, as the PassManager is already calling verifyLoop as part of
LoopInfo's verifyAnalysis.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@81221 91177308-0d34-0410-b5e6-96231b3b80d8
extractelement operations into a bitcast of the pointer,
then a gep, then a scalar load. Disable this when the vector
only has one element, because it leads to infinite loops in
instcombine (PR4908).
This transformation seems like a really bad idea to me, as it
will likely disable CSE of vector load/stores etc and can be
better done in the code generator when profitable. This
goes all the way back to the first days of packed types,
r25299 specifically.
I'll let those people who care about the performance of vector
code decide what to do with this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@81185 91177308-0d34-0410-b5e6-96231b3b80d8
compile-time constant integers or that are out of bounds for their
corresponding static array types. These can cause aliasing that
GlobalOpt assumes won't happen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@81165 91177308-0d34-0410-b5e6-96231b3b80d8
an aggregate store overlapping a different aggregate store, despite
the stores having distinct addresses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@81164 91177308-0d34-0410-b5e6-96231b3b80d8
is missing the inbounds flag. This is slightly conservative, but it
avoids problems with two constants pointing to the same address but
getting distinct entries in the Memory DenseMap.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@81163 91177308-0d34-0410-b5e6-96231b3b80d8