The long awaited CAST patch. This introduces 12 new instructions into LLVM
to replace the cast instruction. Corresponding changes throughout LLVM are
provided. This passes llvm-test, llvm/test, and SPEC CPUINT2000 with the
exception of 175.vpr which fails only on a slight floating point output
difference.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@31931 91177308-0d34-0410-b5e6-96231b3b80d8
only do these transformations if there are a small number of phi's.
This speeds up Ptrdist/ks from 2.35s to 2.19s on my mac pro.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@31853 91177308-0d34-0410-b5e6-96231b3b80d8
This patch converts the old SHR instruction into two instructions,
AShr (Arithmetic) and LShr (Logical). The Shr instructions now are not
dependent on the sign of their operands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@31542 91177308-0d34-0410-b5e6-96231b3b80d8
Turn on -Wunused and -Wno-unused-parameter. Clean up most of the resulting
fall out by removing unused variables. Remaining warnings have to do with
unused functions (I didn't want to delete code without review) and unused
variables in generated code. Maintainers should clean up the remaining
issues when they see them. All changes pass DejaGnu tests and Olden.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@31380 91177308-0d34-0410-b5e6-96231b3b80d8
This patch implements the first increment for the Signless Types feature.
All changes pertain to removing the ConstantSInt and ConstantUInt classes
in favor of just using ConstantInt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@31063 91177308-0d34-0410-b5e6-96231b3b80d8
The critical edge block dominates the dest block if the destblock dominates
all edges other than the one incoming from the critical edge.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@30696 91177308-0d34-0410-b5e6-96231b3b80d8
or when splitting loops with a common header into multiple loops. In particular
the old code would always insert the preheader before the old loop header. This
is disasterous in cases where the loop hasn't been rotated. For example, it can
produce code like:
.. outside the loop...
jmp LBB1_2 #bb13.outer
LBB1_1: #bb1
movsd 8(%esp,%esi,8), %xmm1
mulsd (%edi), %xmm1
addsd %xmm0, %xmm1
addl $24, %edi
incl %esi
jmp LBB1_3 #bb13
LBB1_2: #bb13.outer
leal (%edx,%eax,8), %edi
pxor %xmm1, %xmm1
xorl %esi, %esi
LBB1_3: #bb13
movapd %xmm1, %xmm0
cmpl $4, %esi
jl LBB1_1 #bb1
Note that the loop body is actually LBB1_1 + LBB1_3, which means that the
loop now contains an uncond branch WITHIN it to jump around the inserted
loop header (LBB1_2). Doh.
This patch changes the preheader insertion code to insert it in the right
spot, producing this code:
... outside the loop, fall into the header ...
LBB1_1: #bb13.outer
leal (%edx,%eax,8), %esi
pxor %xmm0, %xmm0
xorl %edi, %edi
jmp LBB1_3 #bb13
LBB1_2: #bb1
movsd 8(%esp,%edi,8), %xmm0
mulsd (%esi), %xmm0
addsd %xmm1, %xmm0
addl $24, %esi
incl %edi
LBB1_3: #bb13
movapd %xmm0, %xmm1
cmpl $4, %edi
jl LBB1_2 #bb1
Totally crazy, no branch in the loop! :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@30587 91177308-0d34-0410-b5e6-96231b3b80d8
reachable, making it general purpose enough for use by InsertPreheaderForLoop.
Eliminate custom dominfo updating code in InsertPreheaderForLoop, using
UpdateDomInfoForRevectoredPreds instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@30586 91177308-0d34-0410-b5e6-96231b3b80d8
Call these from your backend to enjoy setjmp/longjmp goodness, see
lib/Target/IA64/IA64ISelLowering.cpp for an example
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@30095 91177308-0d34-0410-b5e6-96231b3b80d8
Not only will this take huge amounts of compile time, the resultant loop nests
won't be useful for optimization. This reduces loopsimplify time on
Transforms/LoopSimplify/2006-08-11-LoopSimplifyLongTime.ll from ~32s to ~0.4s
with a debug build of llvm on a 2.7Ghz G5.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@29647 91177308-0d34-0410-b5e6-96231b3b80d8
blocks that target loop blocks.
Before, the code was run once per loop, and depended on the number of
predecessors each block in the loop had. Unfortunately, scanning preds can
be really slow when huge numbers of phis exist or when phis with huge numbers
of inputs exist.
Now, the code is run once per function and scans successors instead of preds,
which is far faster. In addition, the new code is simpler and is goto free,
woo.
This change speeds up a nasty testcase Duraid provided me from taking hours to
taking ~72s with a debug build. The functionality this implements is already
tested in the testsuite as Transforms/CodeExtractor/2004-03-13-LoopExtractorCrash.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@29644 91177308-0d34-0410-b5e6-96231b3b80d8
down approach, inspired by discussions with Tanya.
This approach is significantly faster, because it does not need dominator
frontiers and it does not insert extraneous unused PHI nodes. For example, on
252.eon, in a release-asserts build, this speeds up LCSSA (which is the slowest
pass in gccas) from 9.14s to 0.74s on my G5. This code is also slightly smaller
and significantly simpler than the old code.
Amusingly, in a normal Release build (which includes the
"assert(L->isLCSSAForm());" assertion), asserting that the result of LCSSA
is in LCSSA form is actually slower than the LCSSA transformation pass
itself on 252.eon. I will see if Loop::isLCSSAForm can be sped up next.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@29463 91177308-0d34-0410-b5e6-96231b3b80d8
Handle this case, which doesn't require a new callgraph edge. This fixes
a crash compiling MallocBench/gs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@29121 91177308-0d34-0410-b5e6-96231b3b80d8
target CG node. This allows the inliner to properly update the callgraph
when using the pruning inliner. The pruning inliner may not copy over all
call sites from a callee to a caller, so the edges corresponding to those
call sites should not be copied over either.
This fixes PR827 and Transforms/Inline/2006-07-12-InlinePruneCGUpdate.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@29120 91177308-0d34-0410-b5e6-96231b3b80d8
cases. Ideally, this issue will go away in the future as LCSSA gets smarter
about which Phi nodes it inserts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@29076 91177308-0d34-0410-b5e6-96231b3b80d8
LCSSA is still the slowest pass when gccas'ing 252.eon, but now it only takes
39s instead of 289s. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28776 91177308-0d34-0410-b5e6-96231b3b80d8
not handling PHI nodes correctly when determining if a value was live-out.
This patch reduces the number of detected live-out variables in the testcase
from 6565 to 485.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28771 91177308-0d34-0410-b5e6-96231b3b80d8
If a single exit block has multiple predecessors within the loop, it will
appear in the exit blocks list more than once. LCSSA needs to take that into
account so that it doesn't double process that exit block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@28750 91177308-0d34-0410-b5e6-96231b3b80d8