early in the cleanup code and one late interlaced with the inliner. The second one is
important because inlining and other scalar optzns can unpin allocas, allowing them to
be split up and promoted. While important for performance, this is also relatively
rare, and we would previously force a (non-lazy) computation of DomFrontiers, which
happened even if nothing became unpinned.
With this patch, the first pass of scalarrepl still promotes the vast bulk of allocas
in programs, but hte second pass has changed to use SSAUpdater, which is more "sparse"
and lazy. This speeds up opt -O3 time on kimwitu++ (a c++ app) by about 1%. The
numbers are interesting: the first pass promotes ~17500 allocas. The second pass
promotes about 1600. For non-C++ codes, the compile time win should be greater,
because the second pass of scalarrepl does less.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123437 91177308-0d34-0410-b5e6-96231b3b80d8
most important simplifications, as well as resolving phase ordering issues where instcombine
would inhibit important CSE'ing opportunities, for instance on BitBench/drop3.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123418 91177308-0d34-0410-b5e6-96231b3b80d8
My i386 llvm-gcc nightly tester found a regression for
SingleSource/Benchmarks/McGill/chomp that a bisect blamed on 122743.
That seems strange but apparently the combination of earlycse and instcombine
did something bad. Chris says he intended to remove the instcombine pass, so
let's go ahead and try that. We'll see if there are any performance losses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122907 91177308-0d34-0410-b5e6-96231b3b80d8
It forms memset and memcpy's, and will someday form popcount and
other stuff. All of this is bad when compiling the implementation
of memset, memcpy, popcount, etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122854 91177308-0d34-0410-b5e6-96231b3b80d8
improvement in the generated code, and speeds up 'opt -std-compile-opts'
compile time on 176.gcc from 24.84s to 23.2s (about 7%).
This also resolves a specific code quality issue in rdar://7352081 which
was generating poor code for:
int t(int a, int b) {
if (a & b & 1)
return a & b;
return 3;
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122740 91177308-0d34-0410-b5e6-96231b3b80d8
limitations, this kicks in dozens of times in the 4 specfp2000 benchmarks,
and hundreds of times in the int part. It also kicks in hundreds of times
in multisource.
This kicks in right before loop deletion, which has the pleasant effect of
deleting loops that *just* do a memset.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122664 91177308-0d34-0410-b5e6-96231b3b80d8
does normal initialization and normal chaining. Change the default
AliasAnalysis implementation to NoAlias.
Update StandardCompileOpts.h and friends to explicitly request
BasicAliasAnalysis.
Update tests to explicitly request -basicaa.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116720 91177308-0d34-0410-b5e6-96231b3b80d8
hide jump threading opportunities by turning control flow into data flow. Run an early JumpThreading pass
(adds approximately an additional 1% to optimization time on SPEC), allowing it to get a shot at these cases
first. Fixes <rdar://problem/8447345>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115099 91177308-0d34-0410-b5e6-96231b3b80d8
since none of them use it. With this, we now only run
domfrontier (an N^2 analysis) 3 times at clang -O3: once for
"early" per-function cleanup, once at the start of the
per-function pipeline to support SRoA, and once late because
the EHPrepare class uses it.
EHPrepare needs to stop using it, this is silly and wasteful.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112420 91177308-0d34-0410-b5e6-96231b3b80d8
dependence on DominanceFrontier. Instead, add an explicit DominanceFrontier
pass in StandardPasses.h to ensure that it gets scheduled at the right
time.
Declare that loop unrolling preserves ScalarEvolution, and shuffle some
getAnalysisUsages.
This eliminates one LoopSimplify and one LCCSA run in the standard
compile opts sequence.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109413 91177308-0d34-0410-b5e6-96231b3b80d8
Initial skeleton and SCEVUnknown lowering implemented,
the rest should come relatively quickly. Move testcase
to new directory.
Move pass to right before SimplifyLibCalls - which is
moved down a bit so we can take advantage of a few opts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95628 91177308-0d34-0410-b5e6-96231b3b80d8
This makes both logical sense (see below) and increases the
number of functions marked readnone/readonly by about 1-2%
in practice. The number of functions marked nocapture goes
up by about 5-10%. The reason it makes sense is shown by
the following example: if you run -functionattrs -inline on
it, then no attributes are assigned. But if you instead run
-inline -functionattrs then @f is marked readnone because the
simplifications produced by the inliner eliminate the store.
@x = external global i32
define void @w(i1 %b) {
br i1 %b, label %write, label %return
write:
store i32 1, i32 *@x
br label %return
return:
ret void
}
define void @f() {
call void @w(i1 0)
ret void
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@85893 91177308-0d34-0410-b5e6-96231b3b80d8
1. we'd run simplifycfg at the very start, even though
the per function passes have already cleaned this up.
2. In the main per-function pipeline that is interlaced with inlining
etc, we would do instcombine, jump threading, simplifycfg *before*
doing SROA. SROA is much more likely to expose opportunities for
these passes than they are for SROA, so move SRoA up earlier.
also add some comments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@85742 91177308-0d34-0410-b5e6-96231b3b80d8
GlobalOpt already deletes trivially dead functions/globals,
so GlobalDCE only adds values for cycles of dead things.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@85731 91177308-0d34-0410-b5e6-96231b3b80d8
ipconstprop and doesn't take much time. Just run it in its place.
This adds a testcase for it, which I plan to expand to cover other
"integration" cases, where we expect the optimizer to be able to
eliminate various things. Due to phase order issues we've regressed
in a number of areas and integration tests are the only way I see to
prevent this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@85729 91177308-0d34-0410-b5e6-96231b3b80d8
Update all analysis passes and transforms to treat free calls just like FreeInst.
Remove RaiseAllocations and all its tests since FreeInst no longer needs to be raised.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@84987 91177308-0d34-0410-b5e6-96231b3b80d8
this is run after the 'standard function passes', SRoA was
recently run. This saves a domfrontier construction. Thanks
to Eli for noticing this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82291 91177308-0d34-0410-b5e6-96231b3b80d8