nodes. This allows it to do fairly general phi insertion if a
load from a pointer global wants to be SRAd but the load is used
by (recursive) phi nodes. This fixes a pessimization on ppc
introduced by Load PRE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61123 91177308-0d34-0410-b5e6-96231b3b80d8
temporary workaround for an obscure bug. When node cloning is
used, it is possible that more SUnits will be created, and
if the SUnits std::vector has to reallocate, it will
invalidate all the graph edges.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61122 91177308-0d34-0410-b5e6-96231b3b80d8
DAGTypeLegalizer::ExpandShiftWithKnownAmountBit.
In terms of restoring the optimization, the best fix here isn't
obvious... any ideas?
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61119 91177308-0d34-0410-b5e6-96231b3b80d8
can be negative. Keep track of whether all uses of
an IV are outside the loop. Some cosmetics; no
functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61109 91177308-0d34-0410-b5e6-96231b3b80d8
consistently for deleting branches. In addition to being slightly
more readable, this makes SimplifyCFG a bit better
about cleaning up after itself when it makes conditions unused.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61100 91177308-0d34-0410-b5e6-96231b3b80d8
position in the critical path during the main instruction walk. This
eliminates the need for the CritialAntiDep DenseMap.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61096 91177308-0d34-0410-b5e6-96231b3b80d8
which source/line a certain BB/instruction comes from, original variable names,
and original (unmangled) C++ name of functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61085 91177308-0d34-0410-b5e6-96231b3b80d8
visited set before they are used. If used, their blocks need to be
added to the visited set so that subsequent queries don't use conflicting
pointer values in the cache result blocks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61080 91177308-0d34-0410-b5e6-96231b3b80d8
one of its aliases defined. This is conservative, but tricky subreg
corner cases are outside the primary aim of this pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61077 91177308-0d34-0410-b5e6-96231b3b80d8
computation code. Also, avoid adding output-depenency edges when both
defs are dead, which frequently happens with EFLAGS defs.
Compute Depth and Height lazily, and always in terms of edge latency
values. For the schedulers that don't care about latency, edge latencies
are set to 1.
Eliminate Cycle and CycleBound, and LatencyPriorityQueue's Latencies array.
These are all subsumed by the Depth and Height fields.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61073 91177308-0d34-0410-b5e6-96231b3b80d8
alignment attribute such that 0 means unaligned.
This will probably require a rebuild of llvm-gcc because of the change to
Attributes.h. If you see many test failures on "make check", please rebuild
your llvm-gcc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61030 91177308-0d34-0410-b5e6-96231b3b80d8
and insert vector element. Modified extract vector element to extend the
result to match the expected promoted type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61029 91177308-0d34-0410-b5e6-96231b3b80d8
CFG when there is exactly one predecessor where the load is not available.
This is designed to not increase code size but still eliminate partially
redundant loads. This fires 1765 times on 403.gcc even though it doesn't
do critical edge splitting yet (the most common reason for it to fail).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61027 91177308-0d34-0410-b5e6-96231b3b80d8
cleans up the generated code a bit. This should have the added benefit of
not randomly renaming functions/globals like my previous patch did. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61023 91177308-0d34-0410-b5e6-96231b3b80d8
memdep keeps track of how PHIs affect the pointer in dep queries, which
allows it to eliminate the load in cases like rle-phi-translate.ll, which
basically end up being:
BB1:
X = load P
br BB3
BB2:
Y = load Q
br BB3
BB3:
R = phi [P] [Q]
load R
turning "load R" into a phi of X/Y. In addition to additional exposed
opportunities, this makes memdep safe in many cases that it wasn't before
(which is required for load PRE) and also makes it substantially more
efficient. For example, consider:
bb1: // has many predecessors.
P = some_operator()
load P
In this example, previously memdep would scan all the predecessors of BB1
to see if they had something that would mustalias P. In some cases (e.g.
test/Transforms/GVN/rle-must-alias.ll) it would actually find them and end
up eliminating something. In many other cases though, it would scan and not
find anything useful. MemDep now stops at a block if the pointer is defined
in that block and cannot be phi translated to predecessors. This causes it
to miss the (rare) cases like rle-must-alias.ll, but makes it faster by not
scanning tons of stuff that is unlikely to be useful. For example, this
speeds up GVN as a whole from 3.928s to 2.448s (60%)!. IMO, scalar GVN
should be enhanced to simplify the rle-must-alias pointer base anyway, which
would allow the loads to be eliminated.
In the future, this should be enhanced to phi translate through geps and
bitcasts as well (as indicated by FIXMEs) making memdep even more powerful.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61022 91177308-0d34-0410-b5e6-96231b3b80d8
callee will not introduce any new aliases of that pointer.
The attributes had all bits allocated already, so I decided to collapse
alignment. Alignment was previously stored as a 16-bit integer from bits 16 to
32 of the attribute, but it was required to be a power of 2. Now it's stored in
log2 encoded form in five bits from 16 to 21. That gives us 11 more bits of
space.
You may have already noticed that you only need four bits to encode a 16-bit
power of two, so why five bits? Because the AsmParser accepted 32-bit
alignments, even though we couldn't store them (they were silently discarded).
Now we can store them in memory, but not in the bitcode.
The bitcode format was already storing these as 64-bit VBR integers. So, the
bitcode format stays the same, keeping the alignment values stored as 16 bit
raw values. There's some hideous code in the reader and writer that deals with
this, waiting to be ripped out the moment we run out of bits again and have to
replace the parameter attributes table encoding.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61019 91177308-0d34-0410-b5e6-96231b3b80d8
llvm[2]: Linking Release executable opt (without symbols)
...
Undefined symbols:
"llvm::APFloat::IEEEsingle", referenced from:
__ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(Constants.o)
__ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o)
__ZN4llvm7APFloat10IEEEsingleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o)
"llvm::APFloat::IEEEdouble", referenced from:
__ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(Constants.o)
__ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(AsmWriter.o)
__ZN4llvm7APFloat10IEEEdoubleE$non_lazy_ptr in libLLVMCore.a(ConstantFold.o)
ld: symbol(s) not found
This is in release mode. To replicate, compile llvm and llvm-gcc in optimized
mode. Then build llvm, in optimized mode, with the newly created compiler.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60977 91177308-0d34-0410-b5e6-96231b3b80d8
width register load followed by a truncating
store for the copy, since the load will not place
the value in the lower bits. Probably partial
loads/stores can never happen here, but fix it
anyway.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60972 91177308-0d34-0410-b5e6-96231b3b80d8
use of illegal integer types: instead, use a stack slot
and copying via integer registers. The existing code
is still used if the bitconvert is to a legal integer
type.
This fires on the PPC testcases 2007-09-08-unaligned.ll
and vec_misaligned.ll. It looks like equivalent code
is generated with these changes, just permuted, but
it's hard to tell.
With these changes, nothing in LegalizeDAG produces
illegal integer types anymore. This is a prerequisite
for removing the LegalizeDAG type legalization code.
While there I noticed that the existing code doesn't
handle trunc store of f64 to f32: it turns this into
an i64 store, which represents a 4 byte stack smash.
I added a FIXME about this. Hopefully someone more
motivated than I am will take care of it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60964 91177308-0d34-0410-b5e6-96231b3b80d8
which are identical to the original patterns.
- Change the multiply with overflow so that we distinguish between signed and
unsigned multiplication. Currently, unsigned multiplication with overflow
isn't working!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60963 91177308-0d34-0410-b5e6-96231b3b80d8
do an extending load of the 4 bytes rather than a
potentially illegal (type) i32 load followed by a
sign extend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60945 91177308-0d34-0410-b5e6-96231b3b80d8
ISD::ADD to emit an implicit EFLAGS. This was horribly broken. Instead, replace
the intrinsic with an ISD::SADDO node. Then custom lower that into an
X86ISD::ADD node with a associated SETCC that checks the correct condition code
(overflow or carry). Then that gets lowered into the correct X86::ADDOvf
instruction.
Similar for SUB and MUL instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60915 91177308-0d34-0410-b5e6-96231b3b80d8
for promoted integer types, eg: i16 on ppc-32, or
i24 on any platform. Complete support for arbitrary
precision integers would require handling expanded
integer types, eg: i128, but I couldn't be bothered.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60834 91177308-0d34-0410-b5e6-96231b3b80d8
parallel, allowing it to decide that P/Q must alias if A/B
must alias in things like:
P = gep A, 0, i, 1
Q = gep B, 0, i, 1
This allows GVN to delete 62 more instructions out of 403.gcc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60820 91177308-0d34-0410-b5e6-96231b3b80d8
node latencies. Use CalcLatency instead of manual code in
CalculatePriorities to keep it consistent. Previously it
computed slightly different results.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60817 91177308-0d34-0410-b5e6-96231b3b80d8
- Emit DW_AT_byte_size for struct and union of size zero.
- Emit DW_AT_declaration for forward type declaration.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60812 91177308-0d34-0410-b5e6-96231b3b80d8
- Fix bug 3185, with misc other cleanups.
- Needed to implement SPUInstrInfo::InsertBranch(). CAUTION: Not sure what
gets or needs to get passed to InsertBranch() to insert a conditional
branch. This will abort for now until a good test case shows up.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60811 91177308-0d34-0410-b5e6-96231b3b80d8
overflow/carry from the "arithmetic with overflow" intrinsics. It searches the
machine basic block from bottom to top to find the SETO/SETC instruction that is
its conditional. If an instruction modifies EFLAGS before it reaches the
SETO/SETC instruction, then it defaults to the normal instruction emission.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60807 91177308-0d34-0410-b5e6-96231b3b80d8
The Cost field is removed. It was only being used in a very limited way,
to indicate when the scheduler should attempt to protect a live register,
and it isn't really needed to do that. If we ever want the scheduler to
start inserting copies in non-prohibitive situations, we'll have to
rethink some things anyway.
A Latency field is added. Instead of giving each node a single
fixed latency, each edge can have its own latency. This will eventually
be used to model various micro-architecture properties more accurately.
The PointerIntPair class and an internal union are now used, which
reduce the overall size.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60806 91177308-0d34-0410-b5e6-96231b3b80d8
target-independent way of determining overflow on multiplication. It's very
tricky. Patch by Zoltan Varga!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60800 91177308-0d34-0410-b5e6-96231b3b80d8
of a pointer. This allows is to catch more equivalencies. For example,
the type_lists_compatible_p function used to require two iterations of
the gvn pass (!) to delete its 18 redundant loads because the first pass
would CSE all the addressing computation cruft, which would unblock the
second memdep/gvn passes from recognizing them. This change allows
memdep/gvn to catch all 18 when run just once on the function (as is
typical :) instead of just 3.
On all of 403.gcc, this bumps up the # reundandancies found from:
63 gvn - Number of instructions PRE'd
153991 gvn - Number of instructions deleted
50069 gvn - Number of loads deleted
to:
63 gvn - Number of instructions PRE'd
154137 gvn - Number of instructions deleted
50185 gvn - Number of loads deleted
+120 loads deleted isn't bad.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60799 91177308-0d34-0410-b5e6-96231b3b80d8
essential problem was that the DAG can contain
random unused nodes which were never analyzed.
When remapping a value of a node being processed,
such a node may become used and need to be analyzed;
however due to operands being transformed during
analysis the node may morph into a different one.
Users of the morphing node need to be updated, and
this wasn't happening. While there I added a bunch
of documentation and sanity checks, so I (or some
other poor soul) won't have to scratch their head
over this stuff so long trying to remember how it
was all supposed to work next time some obscure
problem pops up! The extra sanity checking exposed
a few places where invariants weren't being preserved,
so those are fixed too. Since some of the sanity
checking is expensive, I added a flag to turn it
on. It is also turned on when building with
ENABLE_EXPENSIVE_CHECKS=1.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60797 91177308-0d34-0410-b5e6-96231b3b80d8
tricks based on readnone/readonly functions.
Teach memdep to look past readonly calls when analyzing
deps for a readonly call. This allows elimination of a
few more calls from 403.gcc:
before:
63 gvn - Number of instructions PRE'd
153986 gvn - Number of instructions deleted
50069 gvn - Number of loads deleted
after:
63 gvn - Number of instructions PRE'd
153991 gvn - Number of instructions deleted
50069 gvn - Number of loads deleted
5 calls isn't much, but this adds plumbing for the next change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60794 91177308-0d34-0410-b5e6-96231b3b80d8
load dependence queries. This allows GVN to eliminate a few more
instructions on 403.gcc:
152598 gvn - Number of instructions deleted
49240 gvn - Number of loads deleted
after:
153986 gvn - Number of instructions deleted
50069 gvn - Number of loads deleted
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60786 91177308-0d34-0410-b5e6-96231b3b80d8
MemDep::getNonLocalPointerDependency method. There are
some open issues with this (missed optimizations) and
plenty of future work, but this does allow GVN to eliminate
*slightly* more loads (49246 vs 49033).
Switching over now allows simplification of the other code
path in memdep.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60780 91177308-0d34-0410-b5e6-96231b3b80d8
the first block of a query specially. This makes the "complete query
caching" subsystem more effective, avoiding predecessor queries. This
speeds up GVN another 4%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60752 91177308-0d34-0410-b5e6-96231b3b80d8
Fix the shift amount when unrolling a vector shift into scalar shifts.
Fix problem in getShuffleScalarElt where it assumes that the input of
a bit convert must be a vector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60740 91177308-0d34-0410-b5e6-96231b3b80d8
- Change default scheduling preference to list-burr, which produces somewhat
better code than the default. Could also use list-tdrr, but need to ask
dev list about the appropriate handy mnemonic before commiting.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60738 91177308-0d34-0410-b5e6-96231b3b80d8
complete. For instance, it lowers the common case into this less-than-optimal
code:
addl %ecx, %eax
seto %cl
testb %cl, %cl
jne LBB1_2 ## overflow
instead of:
addl %ecx, %eax
jo LBB1_2 ## overflow
That will come in a future commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60737 91177308-0d34-0410-b5e6-96231b3b80d8
jump threading has been shown to only expose problems not
have bugs itself. I'm sure it's completely bug free! ;-)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60725 91177308-0d34-0410-b5e6-96231b3b80d8
and use it in x86 address mode folding. Also, make
getRegForValue return 0 for illegal types even if it has a
ValueMap for them, because Argument values are put in the
ValueMap. This fixes PR3181.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60696 91177308-0d34-0410-b5e6-96231b3b80d8
track of whether the CachedNonLocalPointerInfo for a block is specific
to a block. If so, just return it without any pred scanning. This is
good for a 6% speedup on GVN (when it uses this lookup method, which
it doesn't right now).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60695 91177308-0d34-0410-b5e6-96231b3b80d8
method. This will eventually take over load/store dep
queries from getNonLocalDependency. For now it works
fine, but is incredibly slow because it does no caching.
Lets not switch GVN to use it until that is fixed :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60649 91177308-0d34-0410-b5e6-96231b3b80d8
duplication of logic (in 2 places) to determine what pointer a
load/store touches. This will be addressed in a future commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60648 91177308-0d34-0410-b5e6-96231b3b80d8
clobber with the current implementation. Instead of returning
a "precise clobber" just return a fuzzy one. This doesn't
matter to any clients anyway and should speed up analysis time
very very slightly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60641 91177308-0d34-0410-b5e6-96231b3b80d8
doesn't do its own local caching, and is slightly more aggressive about
free/store dse (see testcase). This eliminates the last external client
of MemDep::getDependenceFrom().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60619 91177308-0d34-0410-b5e6-96231b3b80d8
since %p isn't formatted consistently, but obviously plain %x is wrong.
PRIxPTR with a cast to uintptr_t would work here, but that requires
inconvenient build-system changes. %lu works on all current and
foreseable future hosts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60616 91177308-0d34-0410-b5e6-96231b3b80d8
loops when they can be subsumed into addressing modes.
Change X86 addressing mode check to realize that
some PIC references need an extra register.
(I believe this is correct for Linux, if not, I'm sure
someone will tell me.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60608 91177308-0d34-0410-b5e6-96231b3b80d8
1. Merge the 'None' result into 'Normal', making loads
and stores return their dependencies on allocations as Normal.
2. Split the 'Normal' result into 'Clobber' and 'Def' to
distinguish between the cases when memdep knows the value is
produced from when we just know if may be changed.
3. Move some of the logic for determining whether readonly calls
are CSEs into memdep instead of it being in GVN. This still
leaves verification that the arguments are hte same to GVN to
let it know about value equivalences in different contexts.
4. Change memdep's call/call dependency analysis to use
getModRefInfo(CallSite,CallSite) instead of doing something
very weak. This only really matters for things like DSA, but
someday maybe we'll have some other decent context sensitive
analyses :)
5. This reimplements the guts of memdep to handle the new results.
6. This simplifies GVN significantly:
a) readonly call CSE is slightly simpler
b) I eliminated the "getDependencyFrom" chaining for load
elimination and load CSE doesn't have to worry about
volatile (they are always clobbers) anymore.
c) GVN no longer does any 'lastLoad' caching, leaving it to
memdep.
7. The logic in DSE is simplified a bit and sped up. A potentially
unsafe case was eliminated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60607 91177308-0d34-0410-b5e6-96231b3b80d8
1. GlobalBaseReg may have been spilled.
2. It may not be live at the use.
3. Spiller doesn't know this is happening so it won't prevent GlobalBaseReg from being spilled later (That by itself is a nasty hack. It's needed because we don't insert the reload until later).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60595 91177308-0d34-0410-b5e6-96231b3b80d8
changing the stack slots on an instruction, to keep them
consistent with the actual memory addresses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60584 91177308-0d34-0410-b5e6-96231b3b80d8
While they appear to provide a normal clobbering def, they don't
in the case of the awkward IMPLICIT_DEF+INSERT_SUBREG idiom. It
would be good to change INSERT_SUBREG; until then, this change
allows post-regalloc scheduling to cope in a mildly conservative
way.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60583 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes many bugs. I will add more test cases in a separate check-in.
Some day, the code that manipulates CFG and updates dom. info could use refactoring help.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60554 91177308-0d34-0410-b5e6-96231b3b80d8
- Add v4f32, v2f64 to LowerVECTOR_SHUFFLE
- Look for vector rotate in shuffle elements, generate a vector rotate
instead of a full-blown shuffle when opportunity presents itself.
- Generate larger test harness and fix a few interesting but obscure bugs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60552 91177308-0d34-0410-b5e6-96231b3b80d8
- First patch from Nehal Desai, a new contributor at Aerospace. Nehal's patch
fixes sign/zero/any-extending loads for integers and floating point. Example
code, compiled w/o debugging or optimization where he first noticed the bug:
int main(void) {
float a = 99.0;
printf("%d\n", a);
return 0;
}
Verified that this code actually works on a Cell SPU.
Changes by Scott Michel:
- Fix bug in the value type list constructed by SPUISD::LDRESULT to include
both the load result's result and chain, not just the chain alone.
- Simplify LowerLOAD and remove extraneous and unnecessary chains.
- Remove unused SPUISD pseudo instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60526 91177308-0d34-0410-b5e6-96231b3b80d8
the frame reference. This will help post-RA scheduling determine
that spills to distinct stack slots are independent.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60486 91177308-0d34-0410-b5e6-96231b3b80d8
1) have it fold "br undef", which does occur with
surprising frequency as jump threading iterates.
2) teach j-t to delete dead blocks. This removes the successor
edges, reducing the in-edges of other blocks, allowing
recursive simplification.
3) Fold things like:
br COND, BBX, BBY
BBX:
br COND, BBZ, BBW
which also happens because jump threading iterates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60470 91177308-0d34-0410-b5e6-96231b3b80d8
foldMemoryOperand how to "fold" them, by converting them into constant-pool
loads. When they aren't folded, they use xorps/cmpeqd, but for example when
register pressure is high, they may now be folded as memory operands, which
reduces register pressure.
Also, mark V_SET0 isAsCheapAsAMove so that two-address-elimination will
remat it instead of copying zeros around (V_SETALLONES was already marked).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60461 91177308-0d34-0410-b5e6-96231b3b80d8
is set but mayLoad is not set. Fix all the problems this turned up.
Change code to not use isSimpleLoad instead of mayLoad unless it
really wants isSimpleLoad.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60459 91177308-0d34-0410-b5e6-96231b3b80d8
delegates to the regular x86-32 convention which handles byval, but only
after it handles a few cases, and it's necessary to handle byval before
handling those cases. This fixes PR3122 (and rdar://6400815), llvm-gcc
miscompiling LLVM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60453 91177308-0d34-0410-b5e6-96231b3b80d8
1. ppcf128 select is expanded to f64 select's.
2. f64 select operand 0 is an i1 truncate, it's promoted to i32 zero_extend.
3. f64 select is updated. It's changed back to a "NewNode" and being re-analyzed.
4. f64 select operands are being processed. Operand 0 is a "NewNode". It's being expunged out of ReplacedValues map.
5. ExpungeNode tries to remap f64 select and notice it's a "NewNode" and assert.
Duncan, please take a look. Thanks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60443 91177308-0d34-0410-b5e6-96231b3b80d8
- Incorporate Tilmann Scheller's ISD::TRUNCATE custom lowering patch
- Update SPU calling convention info, even if it's not used yet (but can be
at some point or another)
- Ensure that any-extended f32 loads are custom lowered, especially when
they're promoted for use in printf.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60438 91177308-0d34-0410-b5e6-96231b3b80d8
straight-forward implementation. This does not require any extra
alias analysis queries beyond what we already do for non-local loads.
Some programs really really like load PRE. For example, SPASS triggers
this ~1000 times, ~300 times in 255.vortex, and ~1500 times on 403.gcc.
The biggest limitation to the implementation is that it does not split
critical edges. This is a huge killer on many programs and should be
addressed after the initial patch is enabled by default.
The implementation of this should incidentally speed up rejection of
non-local loads because it avoids creating the repl densemap in cases
when it won't be used for fully redundant loads.
This is currently disabled by default.
Before I turn this on, I need to fix a couple of miscompilations in
the testsuite, look at compile time performance numbers, and look at
perf impact. This is pretty close to ready though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60408 91177308-0d34-0410-b5e6-96231b3b80d8
constant. If X is a constant, then this is folded elsewhere.
- Added a note to Target/README.txt to indicate that we'd like to implement
this when we're able.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60399 91177308-0d34-0410-b5e6-96231b3b80d8
a new value numbering set after splitting a critical edge. This increases
the number of instances of PRE on 403.gcc from ~60 to ~570.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60393 91177308-0d34-0410-b5e6-96231b3b80d8
- LowerXADDO lowers [SU]ADDO into an ADD with an implicit EFLAGS define. The
EFLAGS are fed into a SETCC node which has the conditional COND_O or COND_C,
depending on the type of ADDO requested.
- LowerBRCOND now recognizes if it's coming from a SETCC node with COND_O or
COND_C set.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60388 91177308-0d34-0410-b5e6-96231b3b80d8
figuring out the base of the IV. This produces better
code in the example. (Addresses use (IV) instead of
(BASE,IV) - a significant improvement on low-register
machines like x86).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60374 91177308-0d34-0410-b5e6-96231b3b80d8
- Fix v2[if]64 vector insertion code before IBM files a bug report.
- Ensure that zero (0) offsets relative to $sp don't trip an assert
(add $sp, 0 gets legalized to $sp alone, tripping an assert)
- Shuffle masks passed to SPUISD::SHUFB are now v16i8 or v4i32
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60358 91177308-0d34-0410-b5e6-96231b3b80d8
MERGE_VALUES node with only one operand, so get
rid of special code that only existed to handle
that possibility.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60349 91177308-0d34-0410-b5e6-96231b3b80d8
ReplaceNodeResults: rather than returning a node which
must have the same number of results as the original
node (which means mucking around with MERGE_VALUES,
and which is also easy to get wrong since SelectionDAG
folding may mean you don't get the node you expect),
return the results in a vector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60348 91177308-0d34-0410-b5e6-96231b3b80d8
instead of std::sort. This shrinks the release-asserts LSR.o file
by 1100 bytes of code on my system.
We should start using array_pod_sort where possible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60335 91177308-0d34-0410-b5e6-96231b3b80d8
buggy rewrite, this notifies ScalarEvolution of a pending instruction
about to be removed and then erases it, instead of erasing it then
notifying.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60329 91177308-0d34-0410-b5e6-96231b3b80d8
new instructions it simplifies. Because we're threading jumps on edges
with constants coming in from PHI's, we inherently are exposing a lot more
constants to the new block. Folding them and deleting dead conditions
allows the cost model in jump threading to be more accurate as it iterates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60327 91177308-0d34-0410-b5e6-96231b3b80d8
prevents the passmgr from adding yet-another domtree invocation
for Verifier if there is already one live.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60326 91177308-0d34-0410-b5e6-96231b3b80d8
elimination: when finding dependent load/stores, realize that
they are the same if aliasing claims must alias instead of relying
on the pointers to be exactly equal. This makes load elimination
more aggressive. For example, on 403.gcc, we had:
< 68 gvn - Number of instructions PRE'd
< 152718 gvn - Number of instructions deleted
< 49699 gvn - Number of loads deleted
< 6153 memdep - Number of dirty cached non-local responses
< 169336 memdep - Number of fully cached non-local responses
< 162428 memdep - Number of uncached non-local responses
now we have:
> 64 gvn - Number of instructions PRE'd
> 153623 gvn - Number of instructions deleted
> 49856 gvn - Number of loads deleted
> 5022 memdep - Number of dirty cached non-local responses
> 159030 memdep - Number of fully cached non-local responses
> 162443 memdep - Number of uncached non-local responses
That's an extra 157 loads deleted and extra 905 other instructions nuked.
This slows down GVN very slightly, from 3.91 to 3.96s.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60314 91177308-0d34-0410-b5e6-96231b3b80d8
vector instead of a densemap. This shrinks the memory usage of this thing
substantially (the high water mark) as well as making operations like
scanning it faster. This speeds up memdep slightly, gvn goes from
3.9376 to 3.9118s on 403.gcc
This also splits out the statistics for the cached non-local case to
differentiate between the dirty and clean cached case. Here's the stats
for 403.gcc:
6153 memdep - Number of dirty cached non-local responses
169336 memdep - Number of fully cached non-local responses
162428 memdep - Number of uncached non-local responses
yay for caching :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60313 91177308-0d34-0410-b5e6-96231b3b80d8