the frame reference. This will help post-RA scheduling determine
that spills to distinct stack slots are independent.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60486 91177308-0d34-0410-b5e6-96231b3b80d8
1) have it fold "br undef", which does occur with
surprising frequency as jump threading iterates.
2) teach j-t to delete dead blocks. This removes the successor
edges, reducing the in-edges of other blocks, allowing
recursive simplification.
3) Fold things like:
br COND, BBX, BBY
BBX:
br COND, BBZ, BBW
which also happens because jump threading iterates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60470 91177308-0d34-0410-b5e6-96231b3b80d8
foldMemoryOperand how to "fold" them, by converting them into constant-pool
loads. When they aren't folded, they use xorps/cmpeqd, but for example when
register pressure is high, they may now be folded as memory operands, which
reduces register pressure.
Also, mark V_SET0 isAsCheapAsAMove so that two-address-elimination will
remat it instead of copying zeros around (V_SETALLONES was already marked).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60461 91177308-0d34-0410-b5e6-96231b3b80d8
is set but mayLoad is not set. Fix all the problems this turned up.
Change code to not use isSimpleLoad instead of mayLoad unless it
really wants isSimpleLoad.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60459 91177308-0d34-0410-b5e6-96231b3b80d8
delegates to the regular x86-32 convention which handles byval, but only
after it handles a few cases, and it's necessary to handle byval before
handling those cases. This fixes PR3122 (and rdar://6400815), llvm-gcc
miscompiling LLVM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60453 91177308-0d34-0410-b5e6-96231b3b80d8
1. ppcf128 select is expanded to f64 select's.
2. f64 select operand 0 is an i1 truncate, it's promoted to i32 zero_extend.
3. f64 select is updated. It's changed back to a "NewNode" and being re-analyzed.
4. f64 select operands are being processed. Operand 0 is a "NewNode". It's being expunged out of ReplacedValues map.
5. ExpungeNode tries to remap f64 select and notice it's a "NewNode" and assert.
Duncan, please take a look. Thanks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60443 91177308-0d34-0410-b5e6-96231b3b80d8
- Incorporate Tilmann Scheller's ISD::TRUNCATE custom lowering patch
- Update SPU calling convention info, even if it's not used yet (but can be
at some point or another)
- Ensure that any-extended f32 loads are custom lowered, especially when
they're promoted for use in printf.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60438 91177308-0d34-0410-b5e6-96231b3b80d8
straight-forward implementation. This does not require any extra
alias analysis queries beyond what we already do for non-local loads.
Some programs really really like load PRE. For example, SPASS triggers
this ~1000 times, ~300 times in 255.vortex, and ~1500 times on 403.gcc.
The biggest limitation to the implementation is that it does not split
critical edges. This is a huge killer on many programs and should be
addressed after the initial patch is enabled by default.
The implementation of this should incidentally speed up rejection of
non-local loads because it avoids creating the repl densemap in cases
when it won't be used for fully redundant loads.
This is currently disabled by default.
Before I turn this on, I need to fix a couple of miscompilations in
the testsuite, look at compile time performance numbers, and look at
perf impact. This is pretty close to ready though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60408 91177308-0d34-0410-b5e6-96231b3b80d8
constant. If X is a constant, then this is folded elsewhere.
- Added a note to Target/README.txt to indicate that we'd like to implement
this when we're able.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60399 91177308-0d34-0410-b5e6-96231b3b80d8
a new value numbering set after splitting a critical edge. This increases
the number of instances of PRE on 403.gcc from ~60 to ~570.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60393 91177308-0d34-0410-b5e6-96231b3b80d8
- LowerXADDO lowers [SU]ADDO into an ADD with an implicit EFLAGS define. The
EFLAGS are fed into a SETCC node which has the conditional COND_O or COND_C,
depending on the type of ADDO requested.
- LowerBRCOND now recognizes if it's coming from a SETCC node with COND_O or
COND_C set.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60388 91177308-0d34-0410-b5e6-96231b3b80d8
figuring out the base of the IV. This produces better
code in the example. (Addresses use (IV) instead of
(BASE,IV) - a significant improvement on low-register
machines like x86).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60374 91177308-0d34-0410-b5e6-96231b3b80d8
- Fix v2[if]64 vector insertion code before IBM files a bug report.
- Ensure that zero (0) offsets relative to $sp don't trip an assert
(add $sp, 0 gets legalized to $sp alone, tripping an assert)
- Shuffle masks passed to SPUISD::SHUFB are now v16i8 or v4i32
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60358 91177308-0d34-0410-b5e6-96231b3b80d8
MERGE_VALUES node with only one operand, so get
rid of special code that only existed to handle
that possibility.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60349 91177308-0d34-0410-b5e6-96231b3b80d8
ReplaceNodeResults: rather than returning a node which
must have the same number of results as the original
node (which means mucking around with MERGE_VALUES,
and which is also easy to get wrong since SelectionDAG
folding may mean you don't get the node you expect),
return the results in a vector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60348 91177308-0d34-0410-b5e6-96231b3b80d8
instead of std::sort. This shrinks the release-asserts LSR.o file
by 1100 bytes of code on my system.
We should start using array_pod_sort where possible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60335 91177308-0d34-0410-b5e6-96231b3b80d8
buggy rewrite, this notifies ScalarEvolution of a pending instruction
about to be removed and then erases it, instead of erasing it then
notifying.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60329 91177308-0d34-0410-b5e6-96231b3b80d8
new instructions it simplifies. Because we're threading jumps on edges
with constants coming in from PHI's, we inherently are exposing a lot more
constants to the new block. Folding them and deleting dead conditions
allows the cost model in jump threading to be more accurate as it iterates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60327 91177308-0d34-0410-b5e6-96231b3b80d8
prevents the passmgr from adding yet-another domtree invocation
for Verifier if there is already one live.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60326 91177308-0d34-0410-b5e6-96231b3b80d8
elimination: when finding dependent load/stores, realize that
they are the same if aliasing claims must alias instead of relying
on the pointers to be exactly equal. This makes load elimination
more aggressive. For example, on 403.gcc, we had:
< 68 gvn - Number of instructions PRE'd
< 152718 gvn - Number of instructions deleted
< 49699 gvn - Number of loads deleted
< 6153 memdep - Number of dirty cached non-local responses
< 169336 memdep - Number of fully cached non-local responses
< 162428 memdep - Number of uncached non-local responses
now we have:
> 64 gvn - Number of instructions PRE'd
> 153623 gvn - Number of instructions deleted
> 49856 gvn - Number of loads deleted
> 5022 memdep - Number of dirty cached non-local responses
> 159030 memdep - Number of fully cached non-local responses
> 162443 memdep - Number of uncached non-local responses
That's an extra 157 loads deleted and extra 905 other instructions nuked.
This slows down GVN very slightly, from 3.91 to 3.96s.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60314 91177308-0d34-0410-b5e6-96231b3b80d8
vector instead of a densemap. This shrinks the memory usage of this thing
substantially (the high water mark) as well as making operations like
scanning it faster. This speeds up memdep slightly, gvn goes from
3.9376 to 3.9118s on 403.gcc
This also splits out the statistics for the cached non-local case to
differentiate between the dirty and clean cached case. Here's the stats
for 403.gcc:
6153 memdep - Number of dirty cached non-local responses
169336 memdep - Number of fully cached non-local responses
162428 memdep - Number of uncached non-local responses
yay for caching :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60313 91177308-0d34-0410-b5e6-96231b3b80d8
Note that the FoldOpIntoPhi call is dead because it's impossible for the
first operand of a subtraction to be both a ConstantInt and a PHINode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60306 91177308-0d34-0410-b5e6-96231b3b80d8
getAnalysis<>. getAnalysis<> is apparently extremely expensive.
Doing this speeds up GVN on 403.gcc by 16%!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60304 91177308-0d34-0410-b5e6-96231b3b80d8