* Instead of handling dead functions specially, just nuke them.
* Be more aggressive about cleaning up after constification, in
particular, handle getelementptr instructions and constantexprs.
* Be a little bit more structured about how we process globals.
*** Delete globals that are only stored to, and never read. These are
clearly not useful, so they should go. This implements deadglobal.llx
This last one triggers quite a few times. In particular, 2208 in the
external tests, 1865 of which are in 252.eon. This shrinks eon from
1995094 to 1732341 bytes of bytecode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16802 91177308-0d34-0410-b5e6-96231b3b80d8
simplifications of the resultant program to avoid making later passes
do it all.
This allows us to constify globals that just have the same constant that
they are initialized stored into them.
Suprisingly this comes up ALL of the freaking time, dozens of times in
SPEC, 30 times in vortex alone.
For example, on 256.bzip2, it allows us to constify these two globals:
%smallMode = internal global ubyte 0 ; <ubyte*> [#uses=8]
%verbosity = internal global int 0 ; <int*> [#uses=49]
Which (with later optimizations) results in the bytecode file shrinking
from 82286 to 69686 bytes! Lets hear it for IPO :)
For the record, it's nuking lots of "if (verbosity > 2) { do lots of stuff }"
code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16793 91177308-0d34-0410-b5e6-96231b3b80d8
an instruction if it can be hoisted to a common dominator of the block.
This implements: test/Regression/Transforms/TailDup/MergeTest.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16758 91177308-0d34-0410-b5e6-96231b3b80d8
* SubOne/AddOne functions always return ConstantInt, declare them as such
* Pull code for handling setcc X, cst, where cst is at the end of the range,
or cc is LE or GE up earlier in visitSetCondInst. This reduces #iterations
in some cases.
* Fold: (div X, C1) op C2 -> range check, implementing div.ll:test6 - test9.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16588 91177308-0d34-0410-b5e6-96231b3b80d8
This takes something like this:
%A = phi int [ 3, %cond_false.0 ], [ 2, %endif.0.i ], [ 2, %endif.1.i ]
%B = div int %tmp.243, 4
and turns it into:
%A = phi int [ 3/4, %cond_false.0 ], [ 2/4, %endif.0.i ], [ 2/4, %endif.1.i ]
which is later simplified (in this case) into %A = 0.
This triggers thousands of times in spec, for example, 269 times in 176.gcc.
This is tested by InstCombine/add.ll:test23 and set.ll:test18.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16582 91177308-0d34-0410-b5e6-96231b3b80d8
Instcombine (setcc (truncate X), C1).
This occurs THOUSANDS of times in many benchmarks. Particularlly common
seem to be things like (seteq (cast bool X to int), int 0)
This turns it into (seteq bool %X, false), which then becomes (not %X).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16567 91177308-0d34-0410-b5e6-96231b3b80d8
This is important for several reasons:
1. Benchmarks have lots of code that looks like this (perlbmk in particular):
%tmp.2.i = setne int %tmp.0.i, 128 ; <bool> [#uses=1]
%tmp.6343 = seteq int %tmp.0.i, 1 ; <bool> [#uses=1]
%tmp.63 = and bool %tmp.2.i, %tmp.6343 ; <bool> [#uses=1]
we now fold away the setne, a clear improvement.
2. In the more important cases, such as (X >= 10) & (X < 20), we now produce
smaller code: (X-10) < 10.
3. Perhaps the nicest effect of this patch is that it really helps out the
code generators. In particular, for a 'range test' like the above,
instead of generating this on X86 (the difference on PPC is even more
pronounced):
cmp %EAX, 50
setge %CL
cmp %EAX, 100
setl %AL
and %CL, %AL
cmp %CL, 0
we now generate this:
add %EAX, -50
cmp %EAX, 50
Furthermore, this causes setcc's to be folded into branches more often.
These combinations trigger dozens of times in the spec benchmarks, particularly
in 176.gcc, 186.crafty, 253.perlbmk, 254.gap, & 099.go.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16559 91177308-0d34-0410-b5e6-96231b3b80d8
Implement (setcc (shl X, C1), C2) folding.
The second one occurs several dozen times in spec. The first was added
just in case. :)
These are tested by shift.ll:test2[12], and div.ll:test5
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16549 91177308-0d34-0410-b5e6-96231b3b80d8
This latent bug was exposed by recent changes, and is tested as:
llvm/test/Regression/Transforms/InstCombine/2004-09-28-BadShiftAndSetCC.llx
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16546 91177308-0d34-0410-b5e6-96231b3b80d8
where we folded (X & 254) -> X < 1 instead of X < 2. These problems were
latent problems exposed by the latest patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16528 91177308-0d34-0410-b5e6-96231b3b80d8
triggers often, for example:
6x in povray, 1x in gzip, 279x in gcc, 1x in crafty, 8x in eon, 11x in perlbmk,
362x in gap, 4x in vortex, 14 in m88ksim, 211x in 126.gcc, 1x in compress,
11x in ijpeg, and 4x in 147.vortex.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16521 91177308-0d34-0410-b5e6-96231b3b80d8
whose addresses where used by trivial phi nodes and select instructions. This
is now performed by the instcombine pass, which is more powerful, is much
simpler, and is faster. This allows the deletion of a bunch of code, two
FIXME's and two gotos.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16406 91177308-0d34-0410-b5e6-96231b3b80d8
a function being deleted. Due to optimizations done while inlining, there
can be edges from the external call node to a function node that were not
apparent any longer.
This fixes the compiler crash while compiling 175.vpr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16399 91177308-0d34-0410-b5e6-96231b3b80d8
Move include/Config and include/Support into include/llvm/Config,
include/llvm/ADT and include/llvm/Support. From here on out, all LLVM
public header files must be under include/llvm/.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@16137 91177308-0d34-0410-b5e6-96231b3b80d8
This also registers the pass with opt with a -lower-packed command line
option.
Patch contributed by Brad Jones.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@15987 91177308-0d34-0410-b5e6-96231b3b80d8
block (common in a switch), make sure to remove extra edges in successor
blocks. This fixes CodeExtractor/2004-08-12-BlockExtractPHI.ll and should
be pulled into LLVM 1.3 (though the regression test need not be, as that
would require pulling in the LoopExtract.cpp changes).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@15717 91177308-0d34-0410-b5e6-96231b3b80d8
instructions in the body of the function (not the entry block). This fixes
test/Programs/SingleSource/Regression/C/2004-08-12-InlinerAndAllocas.c
and test/Programs/External/SPEC/CINT2000/176.gcc on zion.
This should obviously be pulled into 1.3.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@15684 91177308-0d34-0410-b5e6-96231b3b80d8
dangling constant users were removed from a function, causing it to be dead,
we never removed the call graph edge from the external node to the function.
In most cases, this didn't cause a problem (by luck). This should definitely
go into 1.3
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@15570 91177308-0d34-0410-b5e6-96231b3b80d8
1. Fix a REALLY nasty cyclic replacement issue that Anshu discovered, causing
nondeterminstic crashes and memory corruption.
2. For performance, don't go inserting constantexpr casts of GV pointers.
This should definitely go into 1.3
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@15568 91177308-0d34-0410-b5e6-96231b3b80d8
assumed that a constant on the RHS of a multiplication was either an
IntConstant or an FPConstant. It checked for an IntConstant and then,
if it did not find one, did a hard cast to an FPConstant. That code
would crash if the RHS were a ConstantExpr that was neither an
IntConstant nor an FPConstant. This version replaces the hard cast
with a dyn_cast. It performs the same way for IntConstants and
FPConstants but does nothing, instead of crashing, for constant
expressions.
The regression test for this change is 2004-07-27-ConstantExprMul.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@15291 91177308-0d34-0410-b5e6-96231b3b80d8
a bug in DSE).
* Delete dead operand uses iteratively instead of recursively, using a
SetVector.
* Defer deletion of dead operand uses until the end of processing, which means
we don't have to bother with updating the AliasSetTracker. This speeds up
DSE substantially.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@15204 91177308-0d34-0410-b5e6-96231b3b80d8
can be improved in many ways. But: stop laughing, even with -basicaa it
deletes 15% of the stores in 252.eon :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@15101 91177308-0d34-0410-b5e6-96231b3b80d8
* Test for whether bits are shifted out during the optzn.
If so, the fold is illegal, though it can be handled explicitly for setne/seteq
This fixes the miscompilation of 254.gap last night, which was a latent bug
exposed by other optimizer improvements.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@15085 91177308-0d34-0410-b5e6-96231b3b80d8
actually care about. Someday when the cast instruction is gone, we can do
better here, but this will do for now. This implements
instcombine/cast.ll:test17/18 as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@15018 91177308-0d34-0410-b5e6-96231b3b80d8
night compiling cfrac. It did not realize that code like this:
int G; int *H = &G;
takes the address of G.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14973 91177308-0d34-0410-b5e6-96231b3b80d8
- Replace ConstantPointerRef usage with GlobalValue usage
- Rename methods to get ride of ConstantPointerRef usage
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14945 91177308-0d34-0410-b5e6-96231b3b80d8
Speed up SCCP substantially by processing overdefined values quickly. This
patch speeds up SCCP by about 30-40% on large testcases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14861 91177308-0d34-0410-b5e6-96231b3b80d8
This version takes about 1s longer than the previous one (down to 2.35s),
but on the positive side, it actually works :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14856 91177308-0d34-0410-b5e6-96231b3b80d8
This eliminates an N*N*logN algorithm from the loop simplify pass, replacing
it with a much simpler and faster alternative. In a debug build, this reduces
gccas time on eon from 85s to 42s.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14851 91177308-0d34-0410-b5e6-96231b3b80d8
"load (cast foo)". This allows us to compile C++ code like this:
class Bclass {
public: virtual int operator()() { return 666; }
};
class Dclass: public Bclass {
public: virtual int operator()() { return 667; }
} ;
int main(int argc, char** argv) {
Dclass x;
return x();
}
Into this:
int %main(int %argc, sbyte** %argv) {
entry:
call void %__main( )
ret int 667
}
Instead of this:
int %main(int %argc, sbyte** %argv) {
entry:
%x = alloca "struct.std::bad_typeid" ; <"struct.std::bad_typeid"*> [#uses=3]
call void %__main( )
%tmp.1.i.i = getelementptr "struct.std::bad_typeid"* %x, uint 0, uint 0, uint 0 ; <int (...)***> [#uses=1]
store int (...)** getelementptr ([3 x int (...)*]* %vtable for Bclass, int 0, long 2), int (...)*** %tmp.1.i.i
%tmp.3.i = getelementptr "struct.std::bad_typeid"* %x, int 0, uint 0, uint 0 ; <int (...)***> [#uses=1]
store int (...)** getelementptr ([3 x int (...)*]* %vtable for Dclass, int 0, long 2), int (...)*** %tmp.3.i
%tmp.5 = load int ("struct.std::bad_typeid"*)** cast (int (...)** getelementptr ([3 x int (...)*]* %vtable for Dclass, int 0, long 2) to int
("struct.std::bad_typeid"*)**) ; <int ("struct.std::bad_typeid"*)*> [#uses=1]
%tmp.6 = call int %tmp.5( "struct.std::bad_typeid"* %x ) ; <int> [#uses=1]
ret int %tmp.6
ret int 0
}
In order words, we now resolve the virtual function call.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14783 91177308-0d34-0410-b5e6-96231b3b80d8
Don't touch GEPs for which DecomposeArrayRef is not going to do anything
special (e.g., < 2 indices, or 2 indices and the last one is a constant.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14647 91177308-0d34-0410-b5e6-96231b3b80d8
Also, remove X % -1 = 0, because it's not true for unsigneds, and the
signed case is superceeded by this new handling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14637 91177308-0d34-0410-b5e6-96231b3b80d8
since May 1st. In this code, the pred iterator was being invalidated sometimes
causing the wrong entries to be added to PHI nodes.
The fix for this is to defererence and safe the *PI value before we hack on
branch instructions, which changes use/def chains, which SOMETIMES invalidates
the iterator.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14278 91177308-0d34-0410-b5e6-96231b3b80d8
of ConstantInt objects in memory used to determine which order arguments
were added in in some cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14276 91177308-0d34-0410-b5e6-96231b3b80d8
Fix another non-deterministic behavior, this one should actually speed up the
code though as it was doing silly things.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14258 91177308-0d34-0410-b5e6-96231b3b80d8
was processing blocks in whatever order they happened to end up in the
dominator tree data structure. Force an ordering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14248 91177308-0d34-0410-b5e6-96231b3b80d8
non-deterministic things like the ordering of blocks in the dominance
frontier of a BB. Unfortunately, I don't know of a better way to solve
this problem than to explicitly sort the BB's in function-order before
processing them. This is guaranteed to slow the pass down a bit, but
is absolutely necessary to get usable diffs between two different tools
executing the mem2reg or scalarrepl pass.
Before this, bazillions of spurious diff failures occurred all over the
place due to the different order of processing PHIs:
- %tmp.111 = getelementptr %struct.Connector_struct* %upcon.0.0, uint 0, uint 0
+ %tmp.111 = getelementptr %struct.Connector_struct* %upcon.0.1, uint 0, uint 0
Now, the diffs match.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14244 91177308-0d34-0410-b5e6-96231b3b80d8
nondeterministic results that depend on where these objects land in memory.
Instead, sort by the value of the constant, which is stable.
Before this patch, the -simplifycfg pass run from two different compilers
could cause different code to be generated, though it was semantically the
same:
@@ -12258,8 +12258,8 @@
%s_addr.1 = phi sbyte* [ %s, %entry ], [ %inc.0, %no_exit ] ; <sbyte*> [#uses=5]
%tmp.1 = load sbyte* %s_addr.1 ; <sbyte> [#uses=1]
switch sbyte %tmp.1, label %no_exit [
- sbyte 0, label %loopexit
sbyte 46, label %loopexit
+ sbyte 0, label %loopexit
]
We need to stomp all of this stuff out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14243 91177308-0d34-0410-b5e6-96231b3b80d8
things from happening due to
declare bool %llvm.isunordered(double, double)
declare bool %llvm.isunordered(float, float)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14219 91177308-0d34-0410-b5e6-96231b3b80d8
is write an autoconf macro that checks whether __isnan or isnan actually works
**using the C++ compiler after #include <cmath>**, instead of doing it the easy
way with AC_CHECK_FUNCS().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14171 91177308-0d34-0410-b5e6-96231b3b80d8
186.crafty, fhourstones and 132.ijpeg.
Bugpoint makes really nasty miscompilations embarassingly easy to find. It
narrowed it down to the instcombiner and this testcase (from fhourstones):
bool %l7153_l4706_htstat_loopentry_2E_4_no_exit_2E_4(int* %i, [32 x int]* %works, int* %tmp.98.out) {
newFuncRoot:
%tmp.96 = load int* %i ; <int> [#uses=1]
%tmp.97 = getelementptr [32 x int]* %works, long 0, int %tmp.96 ; <int*> [#uses=1]
%tmp.98 = load int* %tmp.97 ; <int> [#uses=2]
%tmp.99 = load int* %i ; <int> [#uses=1]
%tmp.100 = and int %tmp.99, 7 ; <int> [#uses=1]
%tmp.101 = seteq int %tmp.100, 7 ; <bool> [#uses=2]
%tmp.102 = cast bool %tmp.101 to int ; <int> [#uses=0]
br bool %tmp.101, label %codeRepl4.exitStub, label %codeRepl3.exitStub
codeRepl4.exitStub: ; preds = %newFuncRoot
store int %tmp.98, int* %tmp.98.out
ret bool true
codeRepl3.exitStub: ; preds = %newFuncRoot
store int %tmp.98, int* %tmp.98.out
ret bool false
}
... which only has one combination performed on it:
$ llvm-as < t.ll | opt -instcombine -debug | llvm-dis
IC: Old = %tmp.101 = seteq int %tmp.100, 7 ; <bool> [#uses=1]
New = setne int %tmp.100, 0 ; <bool>:<badref> [#uses=0]
IC: MOD = br bool %tmp.101, label %codeRepl3.exitStub, label %codeRepl4.exitStub
IC: MOD = %tmp.97 = getelementptr [32 x int]* %works, uint 0, int %tmp.96 ; <int*> [#uses=1]
It doesn't get much better than this. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14109 91177308-0d34-0410-b5e6-96231b3b80d8
collapse this:
bool %le(int %A, int %B) {
%c1 = setgt int %A, %B
%tmp = select bool %c1, int 1, int 0
%c2 = setlt int %A, %B
%result = select bool %c2, int -1, int %tmp
%c3 = setle int %result, 0
ret bool %c3
}
into:
bool %le(int %A, int %B) {
%c3 = setle int %A, %B ; <bool> [#uses=1]
ret bool %c3
}
which is handy, because the Java FE makes these sequences all over the place.
This is tested as: test/Regression/Transforms/InstCombine/JavaCompare.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@14086 91177308-0d34-0410-b5e6-96231b3b80d8
This code hadn't been updated after the "structs with more than 256 elements"
related changes to the GEP instruction. Also it was not handling the
ConstantAggregateZero class.
Now it does!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13834 91177308-0d34-0410-b5e6-96231b3b80d8
Add support for acos/asin/atan. 188.ammp contains three calls to acos with
constant arguments. Constant folding it allows elimination of those 3 calls
and three FP divisions of the results.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13821 91177308-0d34-0410-b5e6-96231b3b80d8
into (X & (C2 << C1)) != (C3 << C1), where the shift may be either left or
right and the compare may be any one.
This triggers 1546 times in 176.gcc alone, as it is a common pattern that
occurs for bitfield accesses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13740 91177308-0d34-0410-b5e6-96231b3b80d8
CloneTrace, and because it is primarily an operation on ValueMaps. It
is now a global (non-static) function which can be pulled in using
ValueMapper.h.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13600 91177308-0d34-0410-b5e6-96231b3b80d8
Add better comments, including a better head-of-file comment.
Prune #includes.
Fix a FIXME that Chris put here by using doInitialization().
Use DEBUG() to print out debug msgs.
Give names to basic blocks inserted by this pass.
Expand tabs.
Use InsertProfilingInitCall() from ProfilingUtils to insert the initialize call.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13581 91177308-0d34-0410-b5e6-96231b3b80d8
in the size calculation.
This is not something you want to see:
Loop Unroll: F[main] Loop %no_exit Loop Size = 2 Trip Count = 2147483648 - UNROLLING!
The problem was that 2*2147483648 == 0.
Now we get:
Loop Unroll: F[main] Loop %no_exit Loop Size = 2 Trip Count = 2147483648 - TOO LARGE: 4294967296>100
Thanks to some anonymous person playing with the demo page that repeatedly
caused zion to go into swapping land. That's one way to ensure you'll get
a quick bugfix. :)
Testcase here: Transforms/LoopUnroll/2004-05-13-DontUnrollTooMuch.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13564 91177308-0d34-0410-b5e6-96231b3b80d8
PHI node entries from multiple outside-the-region blocks. This also fixes
extraction of the entry block in a function. Yaay.
This has successfully block extracted all (but one) block from the score_move
function in obsequi (out of 33). Hrm, I wonder which block the bug is in. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13489 91177308-0d34-0410-b5e6-96231b3b80d8
* Add a stub for the severSplitPHINodes which will allow us to bbextract
bb's with PHI nodes in them soon.
* Remove unused arguments from findInputsOutputs
* Dramatically simplify the code in findInputsOutputs. In particular,
nothing really cares whether or not a PHI node is using something.
* Move moveCodeToFunction to after emitCallAndSwitchStatement as that's the
order they get called.
* Fix a bug where we would code extract a region that included a call to
vastart. Like 'alloca', calls to vastart must stay in the function that
they are defined in.
* Add some comments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13482 91177308-0d34-0410-b5e6-96231b3b80d8
from the extracted region. If the return has 0 or 1 exit blocks, the new
function returns void. If it has 2 exits, it returns bool, otherwise it
returns a ushort as before.
This allows us to use a conditional branch instruction when there are two
exit blocks, as often happens during block extraction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13481 91177308-0d34-0410-b5e6-96231b3b80d8
1. Get rid of the silly abort block. When doing bb extraction, we get one
abort block for every block extracted, which is kinda annoying.
2. If the switch ends up having a single destination, turn it into an
unconditional branch.
I would like to add support for conditional branches, but to do this we will
want to have the function return a bool instead of a ushort.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13478 91177308-0d34-0410-b5e6-96231b3b80d8
%tmp.0 = getelementptr [50 x sbyte]* %ar, uint 0, int 5 ; <sbyte*> [#uses=2]
%tmp.7 = getelementptr sbyte* %tmp.0, int 8 ; <sbyte*> [#uses=1]
together. This patch actually allows us to simplify and generalize the code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13415 91177308-0d34-0410-b5e6-96231b3b80d8
is only used by a cast, and the casted type is the same size as the original
allocation, it would eliminate the cast by folding it into the allocation.
Unfortunately, it was placing the new allocation instruction right before
the cast, which could pull (for example) alloca instructions into the body
of a function. This turns statically allocatable allocas into expensive
dynamically allocated allocas, which is bad bad bad.
This fixes the problem by placing the new allocation instruction at the same
place the old one was, duh. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13289 91177308-0d34-0410-b5e6-96231b3b80d8
the Module. The default behavior keeps functionality as before: the chosen
function is the one that remains.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13111 91177308-0d34-0410-b5e6-96231b3b80d8
loop. This eliminates the extra add from the previous case, but it's
not clear that this will be a performance win overall. Tommorows test
results will tell. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13103 91177308-0d34-0410-b5e6-96231b3b80d8
over its USES. If it's dead it doesn't have any uses! :)
Thanks to the fabulous and mysterious Bill Wendling for pointing this out. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13102 91177308-0d34-0410-b5e6-96231b3b80d8
Eventually it would be nice if CallGraph maintained an ilist of CallGraphNode's instead
of a vector of pointers to them, but today is not that day.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13100 91177308-0d34-0410-b5e6-96231b3b80d8
structure to being dynamically computed on demand. This makes updating
loop information MUCH easier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13045 91177308-0d34-0410-b5e6-96231b3b80d8
that the exit block of the loop becomes the new entry block of the function.
This was causing a verifier assertion on 252.eon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13039 91177308-0d34-0410-b5e6-96231b3b80d8
block. The primary motivation for doing this is that we can now unroll nested loops.
This makes a pretty big difference in some cases. For example, in 183.equake,
we are now beating the native compiler with the CBE, and we are a lot closer
with LLC.
I'm now going to play around a bit with the unroll factor and see what effect
it really has.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13034 91177308-0d34-0410-b5e6-96231b3b80d8
limited. Even in it's extremely simple state (it can only *fully* unroll single
basic block loops that execute a constant number of times), it already helps improve
performance a LOT on some benchmarks, particularly with the native code generators.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@13028 91177308-0d34-0410-b5e6-96231b3b80d8
Basically we were using SimplifyCFG as a huge sledgehammer for a simple
optimization. Because simplifycfg does so many things, we can't use it
for this purpose.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12977 91177308-0d34-0410-b5e6-96231b3b80d8
Instead of producing code like this:
Loop:
X = phi 0, X2
...
X2 = X + 1
if (X != N-1) goto Loop
We now generate code that looks like this:
Loop:
X = phi 0, X2
...
X2 = X + 1
if (X2 != N) goto Loop
This has two big advantages:
1. The trip count of the loop is now explicit in the code, allowing
the direct implementation of Loop::getTripCount()
2. This reduces register pressure in the loop, and allows X and X2 to be
put into the same register.
As a consequence of the second point, the code we generate for loops went
from:
.LBB2: # no_exit.1
...
mov %EDI, %ESI
inc %EDI
cmp %ESI, 2
mov %ESI, %EDI
jne .LBB2 # PC rel: no_exit.1
To:
.LBB2: # no_exit.1
...
inc %ESI
cmp %ESI, 3
jne .LBB2 # PC rel: no_exit.1
... which has two fewer moves, and uses one less register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12961 91177308-0d34-0410-b5e6-96231b3b80d8
This transforms code like this:
%C = or %A, %B
%D = select %cond, %C, %A
into:
%C = select %cond, %B, 0
%D = or %A, %C
Since B is often a constant, the select can often be eliminated. In any case,
this reduces the usage count of A, allowing subsequent optimizations to happen.
This xform applies when the operator is any of:
add, sub, mul, or, xor, and, shl, shr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12800 91177308-0d34-0410-b5e6-96231b3b80d8
that have a constant operand. This implements
add.ll:test19, shift.ll:test15*, and others that are not tested
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12794 91177308-0d34-0410-b5e6-96231b3b80d8
Now we collect all of the call sites we are interested in inlining, then inline
them. This entirely avoids issues with trying to inline a call site we got by
inlining another call site. This also eliminates iterator invalidation issues.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12770 91177308-0d34-0410-b5e6-96231b3b80d8
This also implements some new features for the indvars pass, including
linear function test replacement, exit value substitution, and it works with
a much more general class of induction variables and loops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@12620 91177308-0d34-0410-b5e6-96231b3b80d8