cleans up the generated code a bit. This should have the added benefit of
not randomly renaming functions/globals like my previous patch did. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61023 91177308-0d34-0410-b5e6-96231b3b80d8
llvm[2]: Linking Release executable opt (without symbols)
...
Undefined symbols:
"llvm::APFloat::IEEEsingle", referenced from:
__ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(Constants.o)
__ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o)
__ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o)
"llvm::APFloat::IEEEdouble", referenced from:
__ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(Constants.o)
__ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o)
__ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o)
ld: symbol(s) not found
This is in release mode. To replicate, compile llvm and llvm-gcc in optimized
mode. Then build llvm, in optimized mode, with the newly created compiler.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60977 91177308-0d34-0410-b5e6-96231b3b80d8
of a pointer. This allows us to catch more equivalences. For example,
the type_lists_compatible_p function used to require two iterations of
the gvn pass (!) to delete its 18 redundant loads because the first pass
would CSE all the addressing computation cruft, which would unblock the
second memdep/gvn passes from recognizing them. This change allows
memdep/gvn to catch all 18 when run just once on the function (as is
typical :) instead of just 3.
On all of 403.gcc, this bumps up the number of redundancies found from:
63 gvn - Number of instructions PRE'd
153991 gvn - Number of instructions deleted
50069 gvn - Number of loads deleted
to:
63 gvn - Number of instructions PRE'd
154137 gvn - Number of instructions deleted
50185 gvn - Number of loads deleted
+116 loads deleted isn't bad.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60799 91177308-0d34-0410-b5e6-96231b3b80d8
MemDep::getNonLocalPointerDependency method. There are
some open issues with this (missed optimizations) and
plenty of future work, but this does allow GVN to eliminate
*slightly* more loads (49246 vs 49033).
Switching over now allows simplification of the other code
path in memdep.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60780 91177308-0d34-0410-b5e6-96231b3b80d8
jump threading has been shown to only expose problems, not to
have bugs itself. I'm sure it's completely bug free! ;-)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60725 91177308-0d34-0410-b5e6-96231b3b80d8
doesn't do its own local caching, and is slightly more aggressive about
free/store dse (see testcase). This eliminates the last external client
of MemDep::getDependenceFrom().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60619 91177308-0d34-0410-b5e6-96231b3b80d8
loops when they can be subsumed into addressing modes.
Change X86 addressing mode check to realize that
some PIC references need an extra register.
(I believe this is correct for Linux; if not, I'm sure
someone will tell me.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60608 91177308-0d34-0410-b5e6-96231b3b80d8
1. Merge the 'None' result into 'Normal', making loads
and stores return their dependencies on allocations as Normal.
2. Split the 'Normal' result into 'Clobber' and 'Def' to
distinguish the case where memdep knows the value is
produced from the case where we just know it may be changed.
3. Move some of the logic for determining whether readonly calls
are CSEs into memdep instead of it being in GVN. This still
leaves verification that the arguments are the same to GVN to
let it know about value equivalences in different contexts.
4. Change memdep's call/call dependency analysis to use
getModRefInfo(CallSite,CallSite) instead of doing something
very weak. This only really matters for things like DSA, but
someday maybe we'll have some other decent context sensitive
analyses :)
5. This reimplements the guts of memdep to handle the new results.
6. This simplifies GVN significantly:
a) readonly call CSE is slightly simpler
b) I eliminated the "getDependencyFrom" chaining for load
elimination and load CSE doesn't have to worry about
volatile (they are always clobbers) anymore.
c) GVN no longer does any 'lastLoad' caching, leaving it to
memdep.
7. The logic in DSE is simplified a bit and sped up. A potentially
unsafe case was eliminated.
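As a rough illustration of the 'Def' vs 'Clobber' split in item 2, here is a
toy model of a backwards scan over a single block. This is only a sketch, not
memdep's actual interface: ToyInst, toyAlias and toyGetDependency are made-up
names, memory locations are plain integer ids, and a negative id stands for a
pointer nothing is known about.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class AliasKind { NoAlias, MayAlias, MustAlias };
enum class DepKind { Def, Clobber, NonLocal };

// Hypothetical instruction: either a store or something we ignore.
struct ToyInst {
  bool isStore;
  int location; // abstract location id; negative means "unknown pointer"
};

struct DepResult {
  DepKind kind;
  std::size_t index; // index of the dependency, or block size if non-local
};

// Toy alias query: unknown pointers may alias anything, known locations
// alias only themselves.
AliasKind toyAlias(int a, int b) {
  if (a < 0 || b < 0)
    return AliasKind::MayAlias;
  return a == b ? AliasKind::MustAlias : AliasKind::NoAlias;
}

// Scan backwards from the load at block[loadIndex]. A must-aliasing store
// produces the loaded value (Def); a may-aliasing store only might have
// changed it (Clobber); otherwise nothing in this block is relevant.
DepResult toyGetDependency(const std::vector<ToyInst> &block,
                           std::size_t loadIndex) {
  assert(loadIndex < block.size() && "load index out of range");
  const int loc = block[loadIndex].location;
  for (std::size_t i = loadIndex; i-- > 0;) {
    if (!block[i].isStore)
      continue;
    switch (toyAlias(block[i].location, loc)) {
    case AliasKind::MustAlias:
      return {DepKind::Def, i};
    case AliasKind::MayAlias:
      return {DepKind::Clobber, i};
    case AliasKind::NoAlias:
      break; // unrelated store, keep scanning
    }
  }
  return {DepKind::NonLocal, block.size()};
}
```

A client in the style of GVN could forward the stored value on Def and would
have to give up (or look harder) on Clobber.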
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60607 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes many bugs. I will add more test cases in a separate check-in.
Some day, the code that manipulates CFG and updates dom. info could use refactoring help.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60554 91177308-0d34-0410-b5e6-96231b3b80d8
1) have it fold "br undef", which does occur with
surprising frequency as jump threading iterates.
2) teach j-t to delete dead blocks. This removes the successor
edges, reducing the in-edges of other blocks, allowing
recursive simplification.
3) Fold things like:
br COND, BBX, BBY
BBX:
br COND, BBZ, BBW
which also happens because jump threading iterates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60470 91177308-0d34-0410-b5e6-96231b3b80d8
straight-forward implementation. This does not require any extra
alias analysis queries beyond what we already do for non-local loads.
Some programs really really like load PRE. For example, SPASS triggers
this ~1000 times, ~300 times in 255.vortex, and ~1500 times on 403.gcc.
The biggest limitation to the implementation is that it does not split
critical edges. This is a huge killer on many programs and should be
addressed after the initial patch is enabled by default.
The implementation of this should incidentally speed up rejection of
non-local loads because it avoids creating the repl densemap in cases
when it won't be used for fully redundant loads.
This is currently disabled by default.
Before I turn this on, I need to fix a couple of miscompilations in
the testsuite, look at compile time performance numbers, and look at
perf impact. This is pretty close to ready though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60408 91177308-0d34-0410-b5e6-96231b3b80d8
constant. If X is a constant, then this is folded elsewhere.
- Added a note to Target/README.txt to indicate that we'd like to implement
this when we're able.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60399 91177308-0d34-0410-b5e6-96231b3b80d8
a new value numbering set after splitting a critical edge. This increases
the number of instances of PRE on 403.gcc from ~60 to ~570.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60393 91177308-0d34-0410-b5e6-96231b3b80d8
figuring out the base of the IV. This produces better
code in the example. (Addresses use (IV) instead of
(BASE,IV) - a significant improvement on low-register
machines like x86).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60374 91177308-0d34-0410-b5e6-96231b3b80d8
instead of std::sort. This shrinks the release-asserts LSR.o file
by 1100 bytes of code on my system.
We should start using array_pod_sort where possible.
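For anyone wondering where the code size goes: the sketch below shows the
general idea of sorting POD data through qsort, so that only a tiny comparator
is instantiated per type instead of a whole sort algorithm. This is a
standalone illustration under that assumption, not the array_pod_sort source;
podCompare and podSort are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <type_traits>

// Three-way comparator with the signature qsort expects. This small function
// is the only code instantiated per element type.
template <typename T>
int podCompare(const void *a, const void *b) {
  const T &lhs = *static_cast<const T *>(a);
  const T &rhs = *static_cast<const T *>(b);
  if (lhs < rhs)
    return -1;
  if (rhs < lhs)
    return 1;
  return 0;
}

// Hypothetical helper: sort [begin, end) using the shared qsort routine.
// Only valid for types that can be moved around with memcpy.
template <typename T>
void podSort(T *begin, T *end) {
  static_assert(std::is_trivially_copyable<T>::value,
                "qsort may only shuffle trivially copyable elements");
  assert(begin <= end && "invalid range");
  if (end - begin <= 1)
    return; // nothing to sort
  std::qsort(begin, static_cast<std::size_t>(end - begin), sizeof(T),
             podCompare<T>);
}
```

The tradeoff is an indirect call per comparison, which is usually fine for
cold or small sorts.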
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60335 91177308-0d34-0410-b5e6-96231b3b80d8
buggy rewrite, this notifies ScalarEvolution of a pending instruction
about to be removed and then erases it, instead of erasing it then
notifying.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60329 91177308-0d34-0410-b5e6-96231b3b80d8
new instructions it simplifies. Because we're threading jumps on edges
with constants coming in from PHI's, we inherently are exposing a lot more
constants to the new block. Folding them and deleting dead conditions
allows the cost model in jump threading to be more accurate as it iterates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60327 91177308-0d34-0410-b5e6-96231b3b80d8
elimination: when finding dependent load/stores, realize that
they are the same if aliasing claims must alias instead of relying
on the pointers to be exactly equal. This makes load elimination
more aggressive. For example, on 403.gcc, we had:
< 68 gvn - Number of instructions PRE'd
< 152718 gvn - Number of instructions deleted
< 49699 gvn - Number of loads deleted
< 6153 memdep - Number of dirty cached non-local responses
< 169336 memdep - Number of fully cached non-local responses
< 162428 memdep - Number of uncached non-local responses
now we have:
> 64 gvn - Number of instructions PRE'd
> 153623 gvn - Number of instructions deleted
> 49856 gvn - Number of loads deleted
> 5022 memdep - Number of dirty cached non-local responses
> 159030 memdep - Number of fully cached non-local responses
> 162443 memdep - Number of uncached non-local responses
That's an extra 157 loads deleted and extra 905 other instructions nuked.
This slows down GVN very slightly, from 3.91 to 3.96s.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60314 91177308-0d34-0410-b5e6-96231b3b80d8
vector instead of a densemap. This shrinks the memory usage of this thing
substantially (the high water mark) as well as making operations like
scanning it faster. This speeds up memdep slightly, gvn goes from
3.9376 to 3.9118s on 403.gcc
This also splits out the statistics for the cached non-local case to
differentiate between the dirty and clean cached case. Here's the stats
for 403.gcc:
6153 memdep - Number of dirty cached non-local responses
169336 memdep - Number of fully cached non-local responses
162428 memdep - Number of uncached non-local responses
yay for caching :)
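A minimal sketch of the vector-instead-of-map idea, assuming the entries are
kept sorted by key and looked up with a binary search (that ordering is an
assumption of this example, and SortedVectorCache with its int keys and
results is made up for illustration):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical cache entry: (block id, cached result), plain ints here.
using Entry = std::pair<int, int>;

class SortedVectorCache {
  std::vector<Entry> Entries; // invariant: sorted by block id, no duplicates

  static bool lessByBlock(const Entry &a, const Entry &b) {
    return a.first < b.first;
  }

public:
  // Insert or overwrite the result for a block, keeping the vector sorted.
  void insert(int block, int result) {
    auto it = std::lower_bound(Entries.begin(), Entries.end(),
                               Entry(block, 0), lessByBlock);
    if (it != Entries.end() && it->first == block)
      it->second = result;
    else
      Entries.insert(it, Entry(block, result));
  }

  // Binary-search lookup; returns nullptr when the block is not cached.
  const int *find(int block) const {
    assert(std::is_sorted(Entries.begin(), Entries.end(), lessByBlock));
    auto it = std::lower_bound(Entries.begin(), Entries.end(),
                               Entry(block, 0), lessByBlock);
    return (it != Entries.end() && it->first == block) ? &it->second : nullptr;
  }

  std::size_t size() const { return Entries.size(); }
};
```

The entries sit contiguously with no empty buckets, which is where the lower
high water mark and the faster scans would come from in a layout like this.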
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60313 91177308-0d34-0410-b5e6-96231b3b80d8
Note that the FoldOpIntoPhi call is dead because it's impossible for the
first operand of a subtraction to be both a ConstantInt and a PHINode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60306 91177308-0d34-0410-b5e6-96231b3b80d8
"For signed integers, the determination of overflow of x*y is not so simple. If
x and y have the same sign, then overflow occurs iff xy > 2**31 - 1. If they
have opposite signs, then overflow occurs iff xy < -2**31."
In this case, x == -1.
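The quoted rule can be sketched for 32-bit values by forming the product in
64 bits, where it cannot overflow, and comparing against the two limits. The
function names here are hypothetical and this is only an illustration of the
rule, not the code in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if x * y does not fit in a signed 32-bit integer.
// Same-sign operands can only exceed 2**31 - 1; opposite-sign operands can
// only fall below -2**31. Checking both bounds covers both cases.
bool mulOverflowsInt32(std::int32_t x, std::int32_t y) {
  const std::int64_t wide =
      static_cast<std::int64_t>(x) * static_cast<std::int64_t>(y);
  return wide > INT32_MAX || wide < INT32_MIN;
}

// Spot checks, including the x == -1 case.
void checkMulOverflowExamples() {
  assert(!mulOverflowsInt32(-1, INT32_MAX)); // -(2**31 - 1) fits
  assert(mulOverflowsInt32(-1, INT32_MIN));  // 2**31 does not fit
  assert(mulOverflowsInt32(65536, 65536));   // 2**32 does not fit
  assert(!mulOverflowsInt32(-32768, 65536)); // exactly -2**31 fits
}
```

With x == -1 the only overflowing y is the minimum value, since negating it
gives 2**31.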
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60278 91177308-0d34-0410-b5e6-96231b3b80d8
overflowed on negation. This commit checks to make sure that neither C nor X
overflows. This requires that the RHS of X (a subtract instruction) be a
constant integer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60275 91177308-0d34-0410-b5e6-96231b3b80d8
If we see that a load depends on the allocation of its memory with no
intervening stores, we now return a 'None' dependency instead of 'Normal'.
This tweaks GVN to do its optimization with the new result.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60267 91177308-0d34-0410-b5e6-96231b3b80d8
former does caching, the latter doesn't. This dramatically simplifies
the logic in getDependency and getDependencyFrom.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60234 91177308-0d34-0410-b5e6-96231b3b80d8
query. This makes it crystal clear what cases can escape from MemDep that
the clients have to handle. This also gives the clients a nice simplified
interface to it that is easy to poke at.
This patch also makes DepResultTy and MemoryDependenceAnalysis::DepType
private, yay.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60231 91177308-0d34-0410-b5e6-96231b3b80d8
of a pointer/int pair instead of a manually bitmangled pointer.
This forces clients to think a little more about checking the
appropriate pieces and will be useful for internal
implementation improvements later.
I'm not particularly happy with this. After going through this
I don't think that the clients of memdep should be exposed to
the internal type at all. I'll fix this in a subsequent commit.
This has no functionality change.
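For reference, the general shape of a pointer/int pair looks something like
the sketch below: the small integer lives in the low bits that alignment
leaves free, but clients go through accessors instead of masking by hand.
TinyPointerIntPair is a hypothetical stand-in written for this note, not the
class memdep uses.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical minimal pointer/int pair. IntBits low bits of the pointer
// must be zero (guaranteed by alignment) so they can hold the integer.
template <typename T, unsigned IntBits>
class TinyPointerIntPair {
  static_assert(IntBits > 0 && IntBits < 8, "unreasonable number of int bits");
  static constexpr std::uintptr_t IntMask =
      (std::uintptr_t(1) << IntBits) - 1;

  std::uintptr_t Value = 0;

public:
  TinyPointerIntPair() = default;
  TinyPointerIntPair(T *ptr, unsigned i) { set(ptr, i); }

  void set(T *ptr, unsigned i) {
    const std::uintptr_t raw = reinterpret_cast<std::uintptr_t>(ptr);
    assert((raw & IntMask) == 0 && "pointer is not sufficiently aligned");
    assert(i <= IntMask && "integer does not fit in the spare bits");
    Value = raw | i;
  }

  // Clients must pick the piece they want; neither is the raw word.
  T *getPointer() const { return reinterpret_cast<T *>(Value & ~IntMask); }
  unsigned getInt() const { return static_cast<unsigned>(Value & IntMask); }
};
```

Because the raw word is private, a client cannot accidentally compare or
dereference the mangled value, which is the "think a little more" part.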
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60230 91177308-0d34-0410-b5e6-96231b3b80d8
nothing to do with dead instruction elimination. No tests in
dejagnu depend on this, so I don't know what it was needed for.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60202 91177308-0d34-0410-b5e6-96231b3b80d8
wrappers around the interesting code and use an obscure iterator
abstraction that dates back many many years.
Move EraseDeadInstructions to Transforms/Utils and name it
RecursivelyDeleteTriviallyDeadInstructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60191 91177308-0d34-0410-b5e6-96231b3b80d8
1. Make it fold blocks separated by an unconditional branch. This enables
jump threading to see a broader scope.
2. Make jump threading able to eliminate locally redundant loads when they
feed the branch condition of a block. This frequently occurs due to
reg2mem running.
3. Make jump threading able to eliminate *partially redundant* loads when
they feed the branch condition of a block. This is common in code with
lots of loads and stores like C++ code and 255.vortex.
This implements thread-loads.ll and rdar://6402033.
Per the fixme's, several pieces of this should be moved into Transforms/Utils.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60148 91177308-0d34-0410-b5e6-96231b3b80d8
performance in most cases on the Grawp tester, but does speed some
things up (like shootout/hash by 15%). This also doesn't impact
compile time in a noticeable way on the Grawp tester.
It also, of course, gets the testcase it was designed for right :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60120 91177308-0d34-0410-b5e6-96231b3b80d8
heuristic: the value is already live at the new memory operation if
it is used by some other instruction in the memop's block. This is
cheap and simple to compute (more so than full liveness).
This improves the new heuristic even more. For example, it cuts two
out of three new instructions out of 255.vortex:DbmFileInGrpHdr,
which is one of the functions that the heuristic regressed. This
overall eliminates another 40 instructions from 403.gcc and visibly
reduces register pressure in 255.vortex (though this only actually
ends up saving the 2 instructions from the whole program).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60084 91177308-0d34-0410-b5e6-96231b3b80d8
phrased in terms of liveness instead of as a horrible hack. :)
In practice, this doesn't change the generated code for either
255.vortex or 403.gcc, but it could cause minor code changes in
theory. This is the framework for coming changes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60082 91177308-0d34-0410-b5e6-96231b3b80d8
-enable-smarter-addr-folding to llc) that gives CGP a better
cost model for when to sink computations into addressing modes.
The basic observation is that sinking increases register
pressure when part of the addr computation has to be available
for other reasons, such as having a use that is a non-memory
operation. In cases where it works, it can substantially reduce
register pressure.
This code is currently an overall win on 403.gcc and 255.vortex
(the two things I've been looking at), but there are several
things I want to do before enabling it by default:
1. This isn't doing any caching of results, so it is much slower
than it could be. It currently slows down release-asserts llc
by 1.7% on 176.gcc: 27.12s -> 27.60s.
2. This doesn't think about inline asm memory operands yet.
3. The cost model botches the case when the needed value is live
across the computation for other reasons.
I'll continue poking at this, and eventually turn it on as llcbeta.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60074 91177308-0d34-0410-b5e6-96231b3b80d8
optimize addressing modes. This allows us to optimize things like isel-sink2.ll
into:
movl 4(%esp), %eax
cmpb $0, 4(%eax)
jne LBB1_2 ## F
LBB1_1: ## TB
movl $4, %eax
ret
LBB1_2: ## F
movzbl 7(%eax), %eax
ret
instead of:
_test:
movl 4(%esp), %eax
cmpb $0, 4(%eax)
leal 4(%eax), %eax
jne LBB1_2 ## F
LBB1_1: ## TB
movl $4, %eax
ret
LBB1_2: ## F
movzbl 3(%eax), %eax
ret
This shrinks (e.g.) 403.gcc from 1133510 to 1128345 lines of .s.
Note that the 2008-10-16-SpillerBug.ll testcase is dubious at best; I doubt
it is really testing what it thinks it is.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60068 91177308-0d34-0410-b5e6-96231b3b80d8
can recursively match things) and scales by 0 by ignoring them.
This triggers once in 403.gcc, saving 1 (!!!!) instruction in the
whole huge app.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60013 91177308-0d34-0410-b5e6-96231b3b80d8
into a new AddressingModeMatcher class. This makes it easier
to reason about and reduces passing around of stuff, but has
no functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60012 91177308-0d34-0410-b5e6-96231b3b80d8
g++ -m32 -c -g -DIN_GCC -W -Wall -Wwrite-strings -Wmissing-format-attribute -fno-common -mdynamic-no-pic -DHAVE_CONFIG_H -Wno-unused -DTARGET_NAME=\"i386-apple-darwin9.5.0\" -I. -I. -I../../llvm-gcc.src/gcc -I../../llvm-gcc.src/gcc/. -I../../llvm-gcc.src/gcc/../include -I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include -I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include -DENABLE_LLVM -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/../llvm.src/include -D_DEBUG -D_GNU_SOURCE -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -I. -I. -I../../llvm-gcc.src/gcc -I../../llvm-gcc.src/gcc/. -I../../llvm-gcc.src/gcc/../include -I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include -I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include ../../llvm-gcc.src/gcc/llvm-types.cpp -o llvm-types.o
../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemCpy(llvm::Value*, llvm::Value*, llvm::Value*, unsigned int)':
../../llvm-gcc.src/gcc/llvm-convert.cpp:1496: error: 'memcpy_i32' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp:1496: error: 'memcpy_i64' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemMove(llvm::Value*, llvm::Value*, llvm::Value*, unsigned int)':
../../llvm-gcc.src/gcc/llvm-convert.cpp:1512: error: 'memmove_i32' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp:1512: error: 'memmove_i64' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemSet(llvm::Value*, llvm::Value*, llvm::Value*, unsigned int)':
../../llvm-gcc.src/gcc/llvm-convert.cpp:1528: error: 'memset_i32' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp:1528: error: 'memset_i64' is not a member of 'llvm::Intrinsic'
make[3]: *** [llvm-convert.o] Error 1
make[3]: *** Waiting for unfinished jobs....
rm fsf-funding.pod gcov.pod gfdl.pod cpp.pod gpl.pod gcc.pod
make[2]: *** [all-stage1-gcc] Error 2
make[1]: *** [stage1-bubble] Error 2
make: *** [all] Error 2
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@59809 91177308-0d34-0410-b5e6-96231b3b80d8
it is likely that the optimizer deleted code in between these
two intrinsics. Keep only the last llvm.dbg.stoppoint in this case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@59657 91177308-0d34-0410-b5e6-96231b3b80d8