Previously, LiveIntervalAnalysis would infer phi joins by looking for multiply
defined registers. That doesn't work if the phi join is implicitly defined in
all but one of the predecessors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96994 91177308-0d34-0410-b5e6-96231b3b80d8
operators.
The test difference is just due to the multiplication operands
being commuted (and thus requiring a more elaborate match). In
optimized code, that expression would be folded.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96816 91177308-0d34-0410-b5e6-96231b3b80d8
during a tail call. A parameter might overwrite this stack slot during the tail
call.
The sequence during a tail call is:
1.) load return address to temp reg
2.) move parameters (might involve storing to return address stack slot)
3.) store return address to new location from temp reg
If the stack location is marked immutable CodeGen can colocate load (1) with the
store (3).
This fixes bug 6225.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96783 91177308-0d34-0410-b5e6-96231b3b80d8
SSE min and max instructions. The real thing this code needs to be
concerned about is negative zero.
Update the sse-minmax.ll test accordingly, and add tests for
-enable-unsafe-fp-math mode as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96775 91177308-0d34-0410-b5e6-96231b3b80d8
induction variable value and a loop-variant value, don't force the
insert position to be at the post-increment position, because it may
not be dominated by the loop-variant value. This fixes a
use-before-def problem noticed on PPC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96774 91177308-0d34-0410-b5e6-96231b3b80d8
dragonegg self-host build. I reverted 96640 in order to revert
96556 (96640 goes on top of 96556), but it also looks like with
both of them applied the breakage happens even earlier. The
symptom of the 96556 miscompile is the following crash:
llvm[3]: Compiling AlphaISelLowering.cpp for Release build
cc1plus: /home/duncan/tmp/tmp/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:4982: void llvm::SelectionDAG::ReplaceAllUsesWith(llvm::SDNode*, llvm::SDNode*, llvm::SelectionDAG::DAGUpdateListener*): Assertion `(!From->hasAnyUseOfValue(i) || From->getValueType(i) == To->getValueType(i)) && "Cannot use this version of ReplaceAllUsesWith!"' failed.
Stack dump:
0. Running pass 'X86 DAG->DAG Instruction Selection' on function '@_ZN4llvm19AlphaTargetLowering14LowerOperationENS_7SDValueERNS_12SelectionDAGE'
g++: Internal error: Aborted (program cc1plus)
This occurs when building LLVM using LLVM built by LLVM (via
dragonegg). Probably LLVM has miscompiled itself, though it
may have miscompiled GCC and/or dragonegg itself: at this point
of the self-host build, all of GCC, LLVM and dragonegg were built
using LLVM. Unfortunately this kind of thing is extremely hard
to debug, and while I did rummage around a bit I didn't find any
smoking guns, aka obviously miscompiled code.
Found by bisection.
r96556 | evancheng | 2010-02-18 03:13:50 +0100 (Thu, 18 Feb 2010) | 5 lines
Some dag combiner goodness:
Transform br (xor (x, y)) -> br (x != y)
Transform br (xor (xor (x,y), 1)) -> br (x == y)
Also normalize (and (X, 1) == / != 1 -> (and (X, 1)) != / == 0 to match to "test on x86" and "tst on arm"
r96640 | evancheng | 2010-02-19 01:34:39 +0100 (Fri, 19 Feb 2010) | 16 lines
Transform (xor (setcc), (setcc)) == / != 1 to
(xor (setcc), (setcc)) != / == 1.
e.g. On x86_64
%0 = icmp eq i32 %x, 0
%1 = icmp eq i32 %y, 0
%2 = xor i1 %1, %0
br i1 %2, label %bb, label %return
=>
testl %edi, %edi
sete %al
testl %esi, %esi
sete %cl
cmpb %al, %cl
je LBB1_2
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96672 91177308-0d34-0410-b5e6-96231b3b80d8
strides in foreign loops. This helps locate reuse opportunities
with existing induction variables in foreign loops and reduces
the need for inserting new ones. This fixes rdar://7657764.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96629 91177308-0d34-0410-b5e6-96231b3b80d8
which is not always true if the mask contains undefs. Modified it to return
the first non undef value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96621 91177308-0d34-0410-b5e6-96231b3b80d8
Moderate the weight given to very small intervals.
The spill weight given to new intervals created when spilling was not
normalized in the same way as the original spill weights calculated by
CalcSpillWeights. That meant that restored registers would tend to hang around
because they had a much higher spill weight that unspilled registers.
This improves the runtime of a few tests by up to 10%, and there are no
significant regressions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96613 91177308-0d34-0410-b5e6-96231b3b80d8
checking whether AnalyzeBranch disagrees with the CFG
directly, rather than looking for EH_LABEL instructions.
EH_LABEL instructions aren't always at the end of the
block, due to FP_REG_KILL and other things. This fixes
an infinite loop compiling MultiSource/Benchmarks/Bullet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96611 91177308-0d34-0410-b5e6-96231b3b80d8
into a roundss intrinsic, producing a cyclic dag. The root cause
of this is badness handling ComplexPattern nodes in the old dagisel
that I noticed through inspection. Eliminate a copy of the of the
code that handled ComplexPatterns by making EmitChildMatchCode call
into EmitMatchCode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96408 91177308-0d34-0410-b5e6-96231b3b80d8
If there exists a use of a build_vector that's the bitwise complement of the mask,
then transform the node to
(and (xor x, (build_vector -1,-1,-1,-1)), (build_vector ~c1,~c2,~c3,~c4)).
Since this transformation is only useful when 1) the given build_vector will
become a load from constpool, and 2) (and (xor x -1), y) matches to a single
instruction, I decided this is appropriate as a x86 specific transformation.
rdar://7323335
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96389 91177308-0d34-0410-b5e6-96231b3b80d8
non-temporal. Fix from r96241 for botched encoding of MOVNTDQ.
Add documentation for !nontemporal metadata.
Add a simpler movnt testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96386 91177308-0d34-0410-b5e6-96231b3b80d8
branch in ARM v4 code, since it gets clobbered by the return address before
it is used. Instead of adding a new register class containing all the GPRs
except LR, just use the existing tGPR class.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96360 91177308-0d34-0410-b5e6-96231b3b80d8
as it also peeks at which registers are being used by other uses. This
makes LSR less sensitive to use-list order.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96308 91177308-0d34-0410-b5e6-96231b3b80d8
A virtual register can be used before it is defined in the same MBB if the MBB
is part of a loop. Teach the implicit-def pass about this case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96279 91177308-0d34-0410-b5e6-96231b3b80d8
When coalescing with a physreg, remember to add imp-def and imp-kill when
dealing with sub-registers.
Also fix a related bug in VirtRegRewriter where substitutePhysReg may
reallocate the operand list on an instruction and invalidate the reg_iterator.
This can happen when a register is mentioned twice on the same instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96072 91177308-0d34-0410-b5e6-96231b3b80d8
phi cycles. Adjust a few tests to keep dead instructions from being optimized
away. This (together with my previous change for phi cycles) fixes Apple
radar 7627077.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96057 91177308-0d34-0410-b5e6-96231b3b80d8
stack frame, the prolog/epilog code was using the same
register for the copy of CR and the address of the save slot. Oops.
This is fixed here for Darwin, sort of, by reserving R2 for this case.
A better way would be to do the store before the decrement of SP,
which is safe on Darwin due to the red zone.
SVR4 probably has the same problem, but I don't know how to fix it;
there is no red zone and R2 is already used for something else.
I'm going to leave it to someone interested in that target.
Better still would be to rewrite the CR-saving code completely;
spilling each CR subregister individually is horrible code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96015 91177308-0d34-0410-b5e6-96231b3b80d8
bug fixes, and with improved heuristics for analyzing foreign-loop
addrecs.
This change also flattens IVUsers, eliminating the stride-oriented
groupings, which makes it easier to work with.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95975 91177308-0d34-0410-b5e6-96231b3b80d8
reduce down to a single value. InstCombine already does this transformation
but DAG legalization may introduce new opportunities. This has turned out to
be important for ARM where 64-bit values are split up during type legalization:
InstCombine is not able to remove the PHI cycles on the 64-bit values but
the separate 32-bit values can be optimized. I measured the compile time
impact of this (running llc on 176.gcc) and it was not significant.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95951 91177308-0d34-0410-b5e6-96231b3b80d8
lowering and requires that certain types exist in ValueTypes.h. Modified widening to
check if an op can trap and if so, the widening algorithm will apply only the op on
the defined elements. It is safer to do this in widening because the optimizer can't
guarantee removing unused ops in some cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95823 91177308-0d34-0410-b5e6-96231b3b80d8
legalization even when the IR-level optimizer has removed dead phis, such
as when the high half of an i64 value is unused on a 32-bit target.
I had to adjust a few test cases that had dead phis.
This is a partial fix for Radar 7627077.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95816 91177308-0d34-0410-b5e6-96231b3b80d8
in global initializers. Instead of aborting, attempt to fold them on the
spot. If folding succeeds, emit the folded expression instead.
This fixes PR6255.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95583 91177308-0d34-0410-b5e6-96231b3b80d8
only run for x86 with fastisel. I've found it being very effective in
eliminating some obvious dead code as result of formal parameter lowering
especially when tail call optimization eliminated the need for some of the loads
from fixed frame objects. It also shrinks a number of the tests. A couple of
tests no longer make sense and are now eliminated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95493 91177308-0d34-0410-b5e6-96231b3b80d8
following it. However, the EmitGlobalConstant method wasn't emitting a body for
the constant. The assembler doesn't like that. Before, we were generating this:
.zerofill __DATA, __common, __cmd, 1, 3
This fix puts us back to that semantic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95336 91177308-0d34-0410-b5e6-96231b3b80d8
where callee's arguments are already in the caller's own caller's stack and
they line up perfectly. e.g.
extern int foo(int a, int b, int c);
int bar(int a, int b, int c) {
return foo(a, b, c);
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95053 91177308-0d34-0410-b5e6-96231b3b80d8
as output. Needed for (functional) correctness in inline asm,
and should be generally beneficial. 7361612.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95050 91177308-0d34-0410-b5e6-96231b3b80d8
Even if they are suported by the core, they can be disabled
(this is just a configuration bit inside some register).
Allow unaligned memops on darwin and conservatively disallow them otherwise.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94889 91177308-0d34-0410-b5e6-96231b3b80d8
runOnMachineFunction, and switch PPC to use EmitFunctionBody.
The two ppc asmprinters now don't heave to define
runOnMachineFunction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94722 91177308-0d34-0410-b5e6-96231b3b80d8
even when -tailcallopt is not specified and it does not require changing ABI.
First case is the most trivial one. Perform tail call optimization when both
the caller and callee do not return values and when the callee does not take
any input arguments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94664 91177308-0d34-0410-b5e6-96231b3b80d8
to MCExpr then emit them through MCStreamer with EmitValue. I think all
global variable initializers are now going through mcstreamer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94293 91177308-0d34-0410-b5e6-96231b3b80d8
This new version is much more aggressive about doing "full" reduction in
cases where it reduces register pressure, and also more aggressive about
rewriting induction variables to count down (or up) to zero when doing so
reduces register pressure.
It currently uses fairly simplistic algorithms for finding reuse
opportunities, but it introduces a new framework allows it to combine
multiple strategies at once to form hybrid solutions, instead of doing
all full-reduction or all base+index.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94061 91177308-0d34-0410-b5e6-96231b3b80d8
of int initializers), change some methods to be static functions,
use raw_ostream::write_hex instead of a smallstring dance with
APValue::toStringUnsigned(S, 16).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93991 91177308-0d34-0410-b5e6-96231b3b80d8
doing global variable classification anymore) and hookized, sink almost
all target targets global variable emission code into AsmPrinter and out
of each target.
Some notes:
1. PIC16 does completely custom and crazy stuff, so it is not changed.
2. XCore has some custom handling for extra directives. I'll look at it next.
3. This switches linux/ppc to use .globl instead of .global. If .globl is
actually wrong, let me know and I'll fix it.
4. This makes linux/ppc get a lot of random cases right which were obviously
wrong before, it is probably now a bit healthier.
5. Blackfin will probably start getting .comm and other things that it didn't
before. If this is undesirable, it should explicitly opt out of these
things by clearing the relevant fields of MCAsmInfo.
This leads to a nice diffstat:
14 files changed, 127 insertions(+), 830 deletions(-)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93858 91177308-0d34-0410-b5e6-96231b3b80d8
GCC would put weak zero initialized mutable data in the .bss section,
we would put it into a crasy '.gnu.linkonce.b.test,"aw",@nobits'
section. Fixing this will allow simplifications next up.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93844 91177308-0d34-0410-b5e6-96231b3b80d8
Instcombine does this but apparently there are situations where this pattern will escape the optimizer and / or created by isel. Here is a case that's seen in JavaScriptCore:
%t1 = sub i32 0, %a
%t2 = add i32 %t1, -1
The dag combiner pattern: ((c1-A)+c2) -> (c1+c2)-A
will fold it to -1 - %a.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93773 91177308-0d34-0410-b5e6-96231b3b80d8
adding an "i" to the suffix, indicating that the elements are integers, is
accepted but not part of the standard syntax. This helps us pass a few more
of the Neon tests from gcc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93677 91177308-0d34-0410-b5e6-96231b3b80d8