foldMemoryOperand how to "fold" them, by converting them into constant-pool
loads. When they aren't folded, they use xorps/cmpeqd, but for example when
register pressure is high, they may now be folded as memory operands, which
reduces register pressure.
Also, mark V_SET0 isAsCheapAsAMove so that two-address-elimination will
remat it instead of copying zeros around (V_SETALLONES was already marked).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60461 91177308-0d34-0410-b5e6-96231b3b80d8
is set but mayLoad is not set. Fix all the problems this turned up.
Change code to not use isSimpleLoad instead of mayLoad unless it
really wants isSimpleLoad.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60459 91177308-0d34-0410-b5e6-96231b3b80d8
delegates to the regular x86-32 convention which handles byval, but only
after it handles a few cases, and it's necessary to handle byval before
handling those cases. This fixes PR3122 (and rdar://6400815), llvm-gcc
miscompiling LLVM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60453 91177308-0d34-0410-b5e6-96231b3b80d8
1. ppcf128 select is expanded to f64 select's.
2. f64 select operand 0 is an i1 truncate, it's promoted to i32 zero_extend.
3. f64 select is updated. It's changed back to a "NewNode" and being re-analyzed.
4. f64 select operands are being processed. Operand 0 is a "NewNode". It's being expunged out of ReplacedValues map.
5. ExpungeNode tries to remap f64 select and notice it's a "NewNode" and assert.
Duncan, please take a look. Thanks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60443 91177308-0d34-0410-b5e6-96231b3b80d8
- Incorporate Tilmann Scheller's ISD::TRUNCATE custom lowering patch
- Update SPU calling convention info, even if it's not used yet (but can be
at some point or another)
- Ensure that any-extended f32 loads are custom lowered, especially when
they're promoted for use in printf.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60438 91177308-0d34-0410-b5e6-96231b3b80d8
straight-forward implementation. This does not require any extra
alias analysis queries beyond what we already do for non-local loads.
Some programs really really like load PRE. For example, SPASS triggers
this ~1000 times, ~300 times in 255.vortex, and ~1500 times on 403.gcc.
The biggest limitation to the implementation is that it does not split
critical edges. This is a huge killer on many programs and should be
addressed after the initial patch is enabled by default.
The implementation of this should incidentally speed up rejection of
non-local loads because it avoids creating the repl densemap in cases
when it won't be used for fully redundant loads.
This is currently disabled by default.
Before I turn this on, I need to fix a couple of miscompilations in
the testsuite, look at compile time performance numbers, and look at
perf impact. This is pretty close to ready though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60408 91177308-0d34-0410-b5e6-96231b3b80d8
constant. If X is a constant, then this is folded elsewhere.
- Added a note to Target/README.txt to indicate that we'd like to implement
this when we're able.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60399 91177308-0d34-0410-b5e6-96231b3b80d8
a new value numbering set after splitting a critical edge. This increases
the number of instances of PRE on 403.gcc from ~60 to ~570.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60393 91177308-0d34-0410-b5e6-96231b3b80d8
- LowerXADDO lowers [SU]ADDO into an ADD with an implicit EFLAGS define. The
EFLAGS are fed into a SETCC node which has the conditional COND_O or COND_C,
depending on the type of ADDO requested.
- LowerBRCOND now recognizes if it's coming from a SETCC node with COND_O or
COND_C set.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60388 91177308-0d34-0410-b5e6-96231b3b80d8
figuring out the base of the IV. This produces better
code in the example. (Addresses use (IV) instead of
(BASE,IV) - a significant improvement on low-register
machines like x86).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60374 91177308-0d34-0410-b5e6-96231b3b80d8
- Fix v2[if]64 vector insertion code before IBM files a bug report.
- Ensure that zero (0) offsets relative to $sp don't trip an assert
(add $sp, 0 gets legalized to $sp alone, tripping an assert)
- Shuffle masks passed to SPUISD::SHUFB are now v16i8 or v4i32
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60358 91177308-0d34-0410-b5e6-96231b3b80d8
MERGE_VALUES node with only one operand, so get
rid of special code that only existed to handle
that possibility.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60349 91177308-0d34-0410-b5e6-96231b3b80d8
ReplaceNodeResults: rather than returning a node which
must have the same number of results as the original
node (which means mucking around with MERGE_VALUES,
and which is also easy to get wrong since SelectionDAG
folding may mean you don't get the node you expect),
return the results in a vector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60348 91177308-0d34-0410-b5e6-96231b3b80d8
instead of std::sort. This shrinks the release-asserts LSR.o file
by 1100 bytes of code on my system.
We should start using array_pod_sort where possible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60335 91177308-0d34-0410-b5e6-96231b3b80d8
buggy rewrite, this notifies ScalarEvolution of a pending instruction
about to be removed and then erases it, instead of erasing it then
notifying.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60329 91177308-0d34-0410-b5e6-96231b3b80d8
new instructions it simplifies. Because we're threading jumps on edges
with constants coming in from PHI's, we inherently are exposing a lot more
constants to the new block. Folding them and deleting dead conditions
allows the cost model in jump threading to be more accurate as it iterates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60327 91177308-0d34-0410-b5e6-96231b3b80d8
prevents the passmgr from adding yet-another domtree invocation
for Verifier if there is already one live.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60326 91177308-0d34-0410-b5e6-96231b3b80d8
elimination: when finding dependent load/stores, realize that
they are the same if aliasing claims must alias instead of relying
on the pointers to be exactly equal. This makes load elimination
more aggressive. For example, on 403.gcc, we had:
< 68 gvn - Number of instructions PRE'd
< 152718 gvn - Number of instructions deleted
< 49699 gvn - Number of loads deleted
< 6153 memdep - Number of dirty cached non-local responses
< 169336 memdep - Number of fully cached non-local responses
< 162428 memdep - Number of uncached non-local responses
now we have:
> 64 gvn - Number of instructions PRE'd
> 153623 gvn - Number of instructions deleted
> 49856 gvn - Number of loads deleted
> 5022 memdep - Number of dirty cached non-local responses
> 159030 memdep - Number of fully cached non-local responses
> 162443 memdep - Number of uncached non-local responses
That's an extra 157 loads deleted and extra 905 other instructions nuked.
This slows down GVN very slightly, from 3.91 to 3.96s.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60314 91177308-0d34-0410-b5e6-96231b3b80d8
vector instead of a densemap. This shrinks the memory usage of this thing
substantially (the high water mark) as well as making operations like
scanning it faster. This speeds up memdep slightly, gvn goes from
3.9376 to 3.9118s on 403.gcc
This also splits out the statistics for the cached non-local case to
differentiate between the dirty and clean cached case. Here's the stats
for 403.gcc:
6153 memdep - Number of dirty cached non-local responses
169336 memdep - Number of fully cached non-local responses
162428 memdep - Number of uncached non-local responses
yay for caching :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60313 91177308-0d34-0410-b5e6-96231b3b80d8
Note that the FoldOpIntoPhi call is dead because it's impossible for the
first operand of a subtraction to be both a ConstantInt and a PHINode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60306 91177308-0d34-0410-b5e6-96231b3b80d8
getAnalysis<>. getAnalysis<> is apparently extremely expensive.
Doing this speeds up GVN on 403.gcc by 16%!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60304 91177308-0d34-0410-b5e6-96231b3b80d8
multiplies.
Some more cleverness would be nice, though. It would be nice if we
could do this transformation on illegal types. Also, we would
prefer a narrower constant when possible so that we can use a narrower
multiply, which can be cheaper.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60283 91177308-0d34-0410-b5e6-96231b3b80d8
"For signed integers, the determination of overflow of x*y is not so simple. If
x and y have the same sign, then overflow occurs iff xy > 2**31 - 1. If they
have opposite signs, then overflow occurs iff xy < -2**31."
In this case, x == -1.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60278 91177308-0d34-0410-b5e6-96231b3b80d8
nearby FIXME.
I'm not sure what the right way to fix the Cell test was; if the
approach I used isn't okay, please let me know.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60277 91177308-0d34-0410-b5e6-96231b3b80d8
overflowed on negation. This commit checks to make sure that neithe C nor X
overflows. This requires that the RHS of X (a subtract instruction) be a
constant integer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60275 91177308-0d34-0410-b5e6-96231b3b80d8
ReverseLocalDeps when we update it. This fixes a regression test
failure from my last commit.
Second, for each non-local cached information structure, keep a bit that
indicates whether it is dirty or not. This saves us a scan over the whole
thing in the common case when it isn't dirty.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60274 91177308-0d34-0410-b5e6-96231b3b80d8
instead of containing them by value. This increases the density
(!) of NonLocalDeps as well as making the reallocation case
faster. This speeds up gvn on 403.gcc by 2% and makes room for
future improvements.
I'm not super thrilled with having to explicitly manage the new/delete
of the map, but it is necesary for the next change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60271 91177308-0d34-0410-b5e6-96231b3b80d8
If we see that a load depends on the allocation of its memory with no
intervening stores, we now return a 'None' depedency instead of "Normal".
This tweaks GVN to do its optimization with the new result.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60267 91177308-0d34-0410-b5e6-96231b3b80d8
method that returns its result as a DepResultTy instead of as a
MemDepResult. This reduces conversion back and forth.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60266 91177308-0d34-0410-b5e6-96231b3b80d8
dependencies. The basic situation was this: consider if we had:
store1
...
store2
...
store3
Where memdep thinks that store3 depends on store2 and store2 depends
on store1. The problem happens when we delete store2: The code in
question was updating dep info for store3 to be store1. This is a
spiffy optimization, but is not safe at all, because aliasing isn't
transitive. This bug isn't exposed today with DSE because DSE will only
zap store2 if it is identifical to store 3, and in this case, it is
safe to update it to depend on store1. However, memcpyopt is not so
fortunate, which is presumably why the "dropInstruction" code used to
exist.
Since this doesn't actually provide a speedup in practice, just rip the
code out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60263 91177308-0d34-0410-b5e6-96231b3b80d8
an entry in the nonlocal deps map, don't reset entries
referencing that instruction to [dirty, null], instead, set
them to [dirty,next] where next is the instruction after the
deleted one. Use this information in the non-local deps
code to avoid rescanning entire blocks.
This speeds up GVN slightly by avoiding pointless work. On
403.gcc this makes GVN 1.5% faster.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60256 91177308-0d34-0410-b5e6-96231b3b80d8
Put a some code back to handle buggy behavior that GVN expects: it wants
loads to depend on each other, and accesses to depend on their allocations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60240 91177308-0d34-0410-b5e6-96231b3b80d8
former does caching, the later doesn't. This dramatically simplifies
the logic in getDependency and getDependencyFrom.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60234 91177308-0d34-0410-b5e6-96231b3b80d8
Document the Dirty value more precisely, use it for the uninitialized
DepResultTy value. Change reverse mappings to be from an instruction*
instead of DepResultTy, and stop tracking other forms. This makes it more
clear that we only care about the instruction cases.
Eliminate a DepResultTy,bool pair by using Dirty in the local case as well,
shrinking the map and simplifying the code.
This speeds up GVN by ~3% on 403.gcc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60232 91177308-0d34-0410-b5e6-96231b3b80d8
query. This makes it crystal clear what cases can escape from MemDep that
the clients have to handle. This also gives the clients a nice simplified
interface to it that is easy to poke at.
This patch also makes DepResultTy and MemoryDependenceAnalysis::DepType
private, yay.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60231 91177308-0d34-0410-b5e6-96231b3b80d8
of a pointer/int pair instead of a manually bitmangled pointer.
This forces clients to think a little more about checking the
appropriate pieces and will be useful for internal
implementation improvements later.
I'm not particularly happy with this. After going through this
I don't think that the clients of memdep should be exposed to
the internal type at all. I'll fix this in a subsequent commit.
This has no functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60230 91177308-0d34-0410-b5e6-96231b3b80d8
properly updates the reverse dependency map when it installs updated
dependencies for instructions that depend on the removed instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60222 91177308-0d34-0410-b5e6-96231b3b80d8
but it doesn't make any sense at all.
Also make the method const, private, and fit in 80 cols while we're at it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60215 91177308-0d34-0410-b5e6-96231b3b80d8
nothing to do with dead instruction elimination. No tests in
dejagnu depend on this, so I don't know what it was needed for.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60202 91177308-0d34-0410-b5e6-96231b3b80d8
wrappers around the interesting code and use an obscure iterator
abstraction that dates back many many years.
Move EraseDeadInstructions to Transforms/Utils and name it
RecursivelyDeleteTriviallyDeadInstructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60191 91177308-0d34-0410-b5e6-96231b3b80d8
Despite changing the order of evaluation, this doesn't actually change the
meaning of the statement.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60177 91177308-0d34-0410-b5e6-96231b3b80d8
1. Make it fold blocks separated by an unconditional branch. This enables
jump threading to see a broader scope.
2. Make jump threading able to eliminate locally redundant loads when they
feed the branch condition of a block. This frequently occurs due to
reg2mem running.
3. Make jump threading able to eliminate *partially redundant* loads when
they feed the branch condition of a block. This is common in code with
lots of loads and stores like C++ code and 255.vortex.
This implements thread-loads.ll and rdar://6402033.
Per the fixme's, several pieces of this should be moved into Transforms/Utils.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60148 91177308-0d34-0410-b5e6-96231b3b80d8