payloads. APFloat's internal folding routines always make QNaNs now,
instead of sometimes making QNaNs and sometimes SNaNs depending on the
type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97364 91177308-0d34-0410-b5e6-96231b3b80d8
The PowerPC floating point registers can represent both f32 and f64 via the
two register classes F4RC and F8RC. F8RC is considered a subclass of F4RC to
allow cross-class coalescing. This coalescing only affects whether registers
are spilled as f32 or f64.
Spill slots must be accessed with load/store instructions corresponding to the
class of the spilled register. PPCInstrInfo::foldMemoryOperandImpl was looking
at the instruction opcode which is wrong.
X86 has similar floating point register classes, but doesn't try to fold
memory operands, so there is no problem there.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97262 91177308-0d34-0410-b5e6-96231b3b80d8
Previously LoopStrengthReduce would sometimes be unable to find
a legal formula, causing an assertion failure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97226 91177308-0d34-0410-b5e6-96231b3b80d8
about this, but it can be useful for users who use ccache, since the LLVMC tests
are fond of calling gcc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97171 91177308-0d34-0410-b5e6-96231b3b80d8
instead of to have a chained series of scope nodes. This makes
the generated table smaller, improves the efficiency of the
interpreter, and make the factoring optimization much more
reasonable to implement.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97160 91177308-0d34-0410-b5e6-96231b3b80d8
section with TextAlignFillValue and calls EmitCodeAlignment() instead of
calling EmitValueToAlignment(). This allows x86 assembly code to be aligned
with optimal nops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97158 91177308-0d34-0410-b5e6-96231b3b80d8
terms of store and load, which means bitcasting between scalar
integer and vector has endian-specific results, which undermines
this whole approach.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97137 91177308-0d34-0410-b5e6-96231b3b80d8
which branch on undef to branch on a boolean constant for the edge
exiting the loop. This helps ScalarEvolution compute trip counts for
loops.
Teach ScalarEvolution to recognize single-value PHIs, when safe, and
ForgetSymbolicName to forget such single-value PHI nodes as apprpriate
in ForgetSymbolicName.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97126 91177308-0d34-0410-b5e6-96231b3b80d8
- Function uses all scratch registers AND
- Function does not use any callee saved registers AND
- Stack size is too big to address with immediate offsets.
In this case a register must be scavenged to calculate the address of a stack
object, and the scavenger needs a spare register or emergency spill slot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97071 91177308-0d34-0410-b5e6-96231b3b80d8
greater-than-or-equal SELECT_CCs to NEON vmin/vmax instructions. This is
only allowed when UnsafeFPMath is set or when at least one of the operands
is known to be nonzero.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97065 91177308-0d34-0410-b5e6-96231b3b80d8
the number of value bits, not the number of bits of allocation for in-memory
storage.
Make getTypeStoreSize and getTypeAllocSize work consistently for arrays and
vectors.
Fix several places in CodeGen which compute offsets into in-memory vectors
to use TargetData information.
This fixes PR1784.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97064 91177308-0d34-0410-b5e6-96231b3b80d8
necessary to swap the operands to handle NaN and negative zero properly.
Also, reintroduce logic for checking for NaN conditions when forming
SSE min and max instructions, fixed to take into consideration NaNs and
negative zeros. This allows forming min and max instructions in more
cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97025 91177308-0d34-0410-b5e6-96231b3b80d8
to adding them in a determinstic order (bottom up from
the root) based on the structure of the graph itself.
This updates tests for some random changes, interesting
bits: CodeGen/Blackfin/promote-logic.ll no longer crashes.
I have no idea why, but that's good right?
CodeGen/X86/2009-07-16-LoadFoldingBug.ll also fails, but
now compiles to have one fewer constant pool entry, making
the expected load that was being folded disappear. Since it
is an unreduced mass of gnast, I just removed it.
This fixes PR6370
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97023 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, LiveIntervalAnalysis would infer phi joins by looking for multiply
defined registers. That doesn't work if the phi join is implicitly defined in
all but one of the predecessors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96994 91177308-0d34-0410-b5e6-96231b3b80d8
getelementptr. Despite only doing so in the case where x is a known
array object and c can be converted to an index within range, this
could still be invalid if c is actually the address of an object
allocated outside of LLVM. Also, SCEVExpander, the original motivation
for this code, has since been improved to avoid inttoptr+ptroint in
more cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96950 91177308-0d34-0410-b5e6-96231b3b80d8
operators.
The test difference is just due to the multiplication operands
being commuted (and thus requiring a more elaborate match). In
optimized code, that expression would be folded.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96816 91177308-0d34-0410-b5e6-96231b3b80d8
during a tail call. A parameter might overwrite this stack slot during the tail
call.
The sequence during a tail call is:
1.) load return address to temp reg
2.) move parameters (might involve storing to return address stack slot)
3.) store return address to new location from temp reg
If the stack location is marked immutable CodeGen can colocate load (1) with the
store (3).
This fixes bug 6225.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96783 91177308-0d34-0410-b5e6-96231b3b80d8
SSE min and max instructions. The real thing this code needs to be
concerned about is negative zero.
Update the sse-minmax.ll test accordingly, and add tests for
-enable-unsafe-fp-math mode as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96775 91177308-0d34-0410-b5e6-96231b3b80d8
induction variable value and a loop-variant value, don't force the
insert position to be at the post-increment position, because it may
not be dominated by the loop-variant value. This fixes a
use-before-def problem noticed on PPC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96774 91177308-0d34-0410-b5e6-96231b3b80d8
true or false as its exit condition. These are usually eliminated by
SimplifyCFG, but the may be left around during a pass which wishes
to preserve the CFG.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96683 91177308-0d34-0410-b5e6-96231b3b80d8
dragonegg self-host build. I reverted 96640 in order to revert
96556 (96640 goes on top of 96556), but it also looks like with
both of them applied the breakage happens even earlier. The
symptom of the 96556 miscompile is the following crash:
llvm[3]: Compiling AlphaISelLowering.cpp for Release build
cc1plus: /home/duncan/tmp/tmp/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:4982: void llvm::SelectionDAG::ReplaceAllUsesWith(llvm::SDNode*, llvm::SDNode*, llvm::SelectionDAG::DAGUpdateListener*): Assertion `(!From->hasAnyUseOfValue(i) || From->getValueType(i) == To->getValueType(i)) && "Cannot use this version of ReplaceAllUsesWith!"' failed.
Stack dump:
0. Running pass 'X86 DAG->DAG Instruction Selection' on function '@_ZN4llvm19AlphaTargetLowering14LowerOperationENS_7SDValueERNS_12SelectionDAGE'
g++: Internal error: Aborted (program cc1plus)
This occurs when building LLVM using LLVM built by LLVM (via
dragonegg). Probably LLVM has miscompiled itself, though it
may have miscompiled GCC and/or dragonegg itself: at this point
of the self-host build, all of GCC, LLVM and dragonegg were built
using LLVM. Unfortunately this kind of thing is extremely hard
to debug, and while I did rummage around a bit I didn't find any
smoking guns, aka obviously miscompiled code.
Found by bisection.
r96556 | evancheng | 2010-02-18 03:13:50 +0100 (Thu, 18 Feb 2010) | 5 lines
Some dag combiner goodness:
Transform br (xor (x, y)) -> br (x != y)
Transform br (xor (xor (x,y), 1)) -> br (x == y)
Also normalize (and (X, 1) == / != 1 -> (and (X, 1)) != / == 0 to match to "test on x86" and "tst on arm"
r96640 | evancheng | 2010-02-19 01:34:39 +0100 (Fri, 19 Feb 2010) | 16 lines
Transform (xor (setcc), (setcc)) == / != 1 to
(xor (setcc), (setcc)) != / == 1.
e.g. On x86_64
%0 = icmp eq i32 %x, 0
%1 = icmp eq i32 %y, 0
%2 = xor i1 %1, %0
br i1 %2, label %bb, label %return
=>
testl %edi, %edi
sete %al
testl %esi, %esi
sete %cl
cmpb %al, %cl
je LBB1_2
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96672 91177308-0d34-0410-b5e6-96231b3b80d8
strides in foreign loops. This helps locate reuse opportunities
with existing induction variables in foreign loops and reduces
the need for inserting new ones. This fixes rdar://7657764.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96629 91177308-0d34-0410-b5e6-96231b3b80d8
which is not always true if the mask contains undefs. Modified it to return
the first non undef value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96621 91177308-0d34-0410-b5e6-96231b3b80d8
Moderate the weight given to very small intervals.
The spill weight given to new intervals created when spilling was not
normalized in the same way as the original spill weights calculated by
CalcSpillWeights. That meant that restored registers would tend to hang around
because they had a much higher spill weight that unspilled registers.
This improves the runtime of a few tests by up to 10%, and there are no
significant regressions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96613 91177308-0d34-0410-b5e6-96231b3b80d8
checking whether AnalyzeBranch disagrees with the CFG
directly, rather than looking for EH_LABEL instructions.
EH_LABEL instructions aren't always at the end of the
block, due to FP_REG_KILL and other things. This fixes
an infinite loop compiling MultiSource/Benchmarks/Bullet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96611 91177308-0d34-0410-b5e6-96231b3b80d8
--enable-shared configure flag to have the tools linked shared. (2.7svn is just
$(LLVMVersion) so it'll change to "2.7" in the release.) Always link the
example programs shared to test that the shared library keeps working.
On my mac laptop, Debug libLLVM2.7svn.dylib is 39MB, and opt (for example) is
16M static vs 440K shared.
Two things are less than ideal here:
1) The library doesn't include any version information. Since we expect to break
the ABI with every release, this shouldn't be much of a problem. If we do
release a compatible 2.7.1, we may be able to hack its library to work with
binaries compiled against 2.7.0, or we can just ask them to recompile. I'm
hoping to get a real packaging expert to look at this for the 2.8 release.
2) llvm-config doesn't yet have an option to print link options for the shared
library. I'll add this as a subsequent patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96559 91177308-0d34-0410-b5e6-96231b3b80d8
r95605 | dpatel | 2010-02-08 15:27:46 -0800 (Mon, 08 Feb 2010) | 2 lines
test case for r95604.
Which was the testcase for the patch reverted from llvm-gcc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96474 91177308-0d34-0410-b5e6-96231b3b80d8
into a roundss intrinsic, producing a cyclic dag. The root cause
of this is badness handling ComplexPattern nodes in the old dagisel
that I noticed through inspection. Eliminate a copy of the of the
code that handled ComplexPatterns by making EmitChildMatchCode call
into EmitMatchCode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96408 91177308-0d34-0410-b5e6-96231b3b80d8
If there exists a use of a build_vector that's the bitwise complement of the mask,
then transform the node to
(and (xor x, (build_vector -1,-1,-1,-1)), (build_vector ~c1,~c2,~c3,~c4)).
Since this transformation is only useful when 1) the given build_vector will
become a load from constpool, and 2) (and (xor x -1), y) matches to a single
instruction, I decided this is appropriate as a x86 specific transformation.
rdar://7323335
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96389 91177308-0d34-0410-b5e6-96231b3b80d8
non-temporal. Fix from r96241 for botched encoding of MOVNTDQ.
Add documentation for !nontemporal metadata.
Add a simpler movnt testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96386 91177308-0d34-0410-b5e6-96231b3b80d8
branch in ARM v4 code, since it gets clobbered by the return address before
it is used. Instead of adding a new register class containing all the GPRs
except LR, just use the existing tGPR class.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96360 91177308-0d34-0410-b5e6-96231b3b80d8
as it also peeks at which registers are being used by other uses. This
makes LSR less sensitive to use-list order.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96308 91177308-0d34-0410-b5e6-96231b3b80d8
A virtual register can be used before it is defined in the same MBB if the MBB
is part of a loop. Teach the implicit-def pass about this case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96279 91177308-0d34-0410-b5e6-96231b3b80d8
odd offsets since the bitcasted pointer size and the offset pointer
size are going to be different types for the GEP vs base object.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96134 91177308-0d34-0410-b5e6-96231b3b80d8
When coalescing with a physreg, remember to add imp-def and imp-kill when
dealing with sub-registers.
Also fix a related bug in VirtRegRewriter where substitutePhysReg may
reallocate the operand list on an instruction and invalidate the reg_iterator.
This can happen when a register is mentioned twice on the same instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96072 91177308-0d34-0410-b5e6-96231b3b80d8
We still have the templated X86 JIT emitter, *and* the
almost-copy in X86InstrInfo for getting instruction sizes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96059 91177308-0d34-0410-b5e6-96231b3b80d8
phi cycles. Adjust a few tests to keep dead instructions from being optimized
away. This (together with my previous change for phi cycles) fixes Apple
radar 7627077.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96057 91177308-0d34-0410-b5e6-96231b3b80d8
stack frame, the prolog/epilog code was using the same
register for the copy of CR and the address of the save slot. Oops.
This is fixed here for Darwin, sort of, by reserving R2 for this case.
A better way would be to do the store before the decrement of SP,
which is safe on Darwin due to the red zone.
SVR4 probably has the same problem, but I don't know how to fix it;
there is no red zone and R2 is already used for something else.
I'm going to leave it to someone interested in that target.
Better still would be to rewrite the CR-saving code completely;
spilling each CR subregister individually is horrible code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96015 91177308-0d34-0410-b5e6-96231b3b80d8
2. don't bother trying to merge globals in non-default sections,
doing so is quite dubious at best anyway.
3. fix a bug reported by Arnaud de Grandmaison where we'd try to
merge two globals in different address spaces.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95995 91177308-0d34-0410-b5e6-96231b3b80d8
bug fixes, and with improved heuristics for analyzing foreign-loop
addrecs.
This change also flattens IVUsers, eliminating the stride-oriented
groupings, which makes it easier to work with.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95975 91177308-0d34-0410-b5e6-96231b3b80d8
reduce down to a single value. InstCombine already does this transformation
but DAG legalization may introduce new opportunities. This has turned out to
be important for ARM where 64-bit values are split up during type legalization:
InstCombine is not able to remove the PHI cycles on the 64-bit values but
the separate 32-bit values can be optimized. I measured the compile time
impact of this (running llc on 176.gcc) and it was not significant.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95951 91177308-0d34-0410-b5e6-96231b3b80d8
implement support for it) that the stack should be forcibly realigned in the
prologue (and the process reversed in the epilogue).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95945 91177308-0d34-0410-b5e6-96231b3b80d8
what it does. Enhance it to return false to optimizing vector
sign extensions from vector comparisions, which is the idiom used
to get a splatted vector for a vector comparison.
Doing this breaks vector-casts.ll, add some compensating
transformations to handle the important case they cover without
depending on this canonicalization.
This fixes rdar://7434900 a serious pessimization of vector compares.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95855 91177308-0d34-0410-b5e6-96231b3b80d8
block. Other blocks may have pointer cycles that will crash
basicaa and other alias analyses. In any case, there is no
point wasting cycles optimizing dead blocks. This fixes
rdar://7635088
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95852 91177308-0d34-0410-b5e6-96231b3b80d8