This assumption is not satisfied due to global mergeing.
Workaround the issue by temporary disablinge mergeing of const globals.
Also, ignore LLVM "special" globals. This fixes PR7716
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109423 91177308-0d34-0410-b5e6-96231b3b80d8
it's too late to start backing off aggressive latency scheduling when most
of the registers are in use so the threshold should be a bit tighter.
- Correctly handle live out's and extract_subreg etc.
- Enable register pressure aware scheduling by default for hybrid scheduler.
For ARM, this is almost always a win on # of instructions. It's runtime
neutral for most of the tests. But for some kernels with high register
pressure it can be a huge win. e.g. 464.h264ref reduced number of spills by
54 and sped up by 20%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109279 91177308-0d34-0410-b5e6-96231b3b80d8
for lowering without sse2. Add a couple of new testcases.
Fixes a few libgomp tests and latent bugs. Remove a few todos.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109078 91177308-0d34-0410-b5e6-96231b3b80d8
- Fix a typo for PIC check during jmp table lowering
- Also fix the "first jump table basic block is not
considered only reachable by fall through" problem, use this
ad-hoc solution until I come up with something better.
Patch by stetorvs@gmail.com
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108820 91177308-0d34-0410-b5e6-96231b3b80d8
update the current basic block in addition to the current insert
position, so that they remain consistent. This fixes rdar://8204072.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108765 91177308-0d34-0410-b5e6-96231b3b80d8
instruction for non-constant operands. This includes the case referenced
in the README.txt regarding a bitfield copy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108608 91177308-0d34-0410-b5e6-96231b3b80d8
and a combine pattern to use it for setting a bit-field to a constant
value. More to come for non-constant stores.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108570 91177308-0d34-0410-b5e6-96231b3b80d8
void foo() { __builtin_unreachable(); }
It will output the following on Darwin X86:
_func1:
Leh_func_begin0:
pushq %rbp
Ltmp0:
movq %rsp, %rbp
Ltmp1:
Leh_func_end0:
This prolog adds a new Call Frame Information (CFI) row to the FDE with an
address that is not within the address range of the code it describes -- part is
equal to the end of the function -- and therefore results in an invalid EH
frame. If we emit a nop in this situation, then the CFI row is now within the
address range.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108568 91177308-0d34-0410-b5e6-96231b3b80d8
pass that inserted it.
It is no longer necessary to limit the live ranges of FP registers to a single
basic block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108536 91177308-0d34-0410-b5e6-96231b3b80d8
occasions, caused code to be generated in a different order.
All cases I've seen involved float softening in the type
legalizer, and this could be perhaps be fixed there, but
it's better not to generate things differently in the first
place. 7797940 (6/29/2010..7/15/2010).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108484 91177308-0d34-0410-b5e6-96231b3b80d8
the function. We'll just turn it into a "trap" instruction instead.
The problem with not handling this is that it might generate a prologue without
the equivalent epilogue to go with it:
$ cat t.ll
define void @foo() {
entry:
unreachable
}
$ llc -o - t.ll -relocation-model=pic -disable-fp-elim -unwind-tables
.section __TEXT,__text,regular,pure_instructions
.globl _foo
.align 4, 0x90
_foo: ## @foo
Leh_func_begin0:
## BB#0: ## %entry
pushq %rbp
Ltmp0:
movq %rsp, %rbp
Ltmp1:
Leh_func_end0:
...
The unwind tables then have bad data in them causing all sorts of problems.
Fixes <rdar://problem/8096481>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108473 91177308-0d34-0410-b5e6-96231b3b80d8
-enable-no-nans-fp-math and -enable-no-infs-fp-math. All of the current codegen fp math optimizations only care whether the fp arithmetics arguments and results can never be NaN.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108465 91177308-0d34-0410-b5e6-96231b3b80d8
to keep "Text" in sync with the "pure instructions" section attribute.
Lack of this attribute was preventing the assembler from emitting
multibyte noops instructions for templates (and inlines, and other
coalesced stuff) and was causing the assembler to mismatch .o files.
This fixes rdar://8018335
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108461 91177308-0d34-0410-b5e6-96231b3b80d8
a zero. This situation arrises in Fortran code with induction variables
that start at 1 instead of 0. This fixes PR7651.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108424 91177308-0d34-0410-b5e6-96231b3b80d8
in the literal field of an instruction. E.g.,
long long foo(long long a) {
return a - 734439407618LL;
}
rdar://7038284
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108339 91177308-0d34-0410-b5e6-96231b3b80d8
address cannot be allocated a register is in 32-bit mode where the first
three arguments are marked inreg. In that case EAX, EDX, and ECX will be
used for argument passing.
This fixes PR7610.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108327 91177308-0d34-0410-b5e6-96231b3b80d8
Targets must now implement TargetInstrInfo::copyPhysReg instead. There is no
longer a default implementation forwarding to copyRegToReg.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108095 91177308-0d34-0410-b5e6-96231b3b80d8
correct alignment information, which simplifies ExpandRes_VAARG a bit.
The patch introduces a new alignment information to TargetLoweringInfo. This is
needed since the two natural candidates cannot be used:
* The 's' in target data: If this is set to the minimal alignment of any
argument, getCallFrameTypeAlignment would return 4 for doubles on ARM for
example.
* The getTransientStackAlignment method. It is possible for an architecture to
have argument less aligned than what we maintain the stack pointer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108072 91177308-0d34-0410-b5e6-96231b3b80d8
- Check getBytesToPopOnReturn().
- Eschew ST0 and ST1 for return values.
- Fix the PIC base register initialization so that it doesn't ever
fail to end up the top of the entry block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108039 91177308-0d34-0410-b5e6-96231b3b80d8
U utils/TableGen/FastISelEmitter.cpp
--- Reverse-merging r107943 into '.':
U test/CodeGen/X86/fast-isel.ll
U test/CodeGen/X86/fast-isel-loads.ll
U include/llvm/Target/TargetLowering.h
U include/llvm/Support/PassNameParser.h
U include/llvm/CodeGen/FunctionLoweringInfo.h
U include/llvm/CodeGen/CallingConvLower.h
U include/llvm/CodeGen/FastISel.h
U include/llvm/CodeGen/SelectionDAGISel.h
U lib/CodeGen/LLVMTargetMachine.cpp
U lib/CodeGen/CallingConvLower.cpp
U lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
U lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp
U lib/CodeGen/SelectionDAG/FastISel.cpp
U lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
U lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
U lib/CodeGen/SelectionDAG/InstrEmitter.cpp
U lib/CodeGen/SelectionDAG/TargetLowering.cpp
U lib/Target/XCore/XCoreISelLowering.cpp
U lib/Target/XCore/XCoreISelLowering.h
U lib/Target/X86/X86ISelLowering.cpp
U lib/Target/X86/X86FastISel.cpp
U lib/Target/X86/X86ISelLowering.h
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107987 91177308-0d34-0410-b5e6-96231b3b80d8
disabled and then never turned back on again. Adjust some tests, one because
this change avoids an unnecessary instruction, and the other to make it
continue testing what it was intended to test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107941 91177308-0d34-0410-b5e6-96231b3b80d8
1. The arguments are f32.
2. The arguments are loads and they have no uses other than the comparison.
3. The comparison code is EQ or NE.
e.g.
vldr.32 s0, [r1]
vldr.32 s1, [r0]
vcmpe.f32 s1, s0
vmrs apsr_nzcv, fpscr
beq LBB0_2
=>
ldr r1, [r1]
ldr r0, [r0]
cmp r0, r1
beq LBB0_2
More complicated cases will be implemented in subsequent patches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107852 91177308-0d34-0410-b5e6-96231b3b80d8
Add explicit testcases for tail calls within the same module.
Duplicate some code to humor those who think .w doesn't apply on ARM.
Leave this disabled on Thumb1, and add some comments explaining why it's hard
and won't gain much.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107851 91177308-0d34-0410-b5e6-96231b3b80d8
It is OK for an alias live range to overlap if there is a copy to or from the
physical register. CoalescerPair can work out if the copy is coalescable
independently of the alias.
This means that we can join with the actual destination interval instead of
using the getOrigDstReg() hack. It is no longer necessary to merge clobber
ranges into subregisters.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107695 91177308-0d34-0410-b5e6-96231b3b80d8
the example in the testcase, we now generate:
_test1: ## @test1
movss 4(%esp), %xmm0
addss 8(%esp), %xmm0
movl 12(%esp), %eax
movss %xmm0, (%eax)
ret
instead of:
_test1: ## @test1
subl $20, %esp
movl 24(%esp), %eax
movq %mm0, (%esp)
movq %mm0, 8(%esp)
movss (%esp), %xmm0
addss 12(%esp), %xmm0
movss %xmm0, (%eax)
addl $20, %esp
ret
v2f32 support did not work reliably because most of the X86
backend didn't know it was legal. It was apparently only added
to support returning source-level v2f32 values in MMX registers
in x86-32 mode. If ABI compatibility is important on this
GCC-extended-vector type for some reason, then the frontend
should generate IR that returns v2i32 instead of v2f32. However,
we generally don't try very hard to be abi compatible on gcc
extended vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107601 91177308-0d34-0410-b5e6-96231b3b80d8
v2f32 as legal in 32-bit mode. It is just as terrible there,
but I just care about x86-64 and noone claims it is valuable
in 64-bit mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107600 91177308-0d34-0410-b5e6-96231b3b80d8
- X86 unfolding should check if the instructions being unfolded has memoperands.
If there is no memoperands, then it must assume conservative alignment. If this
would introduce an expensive sse unaligned load / store, then unfoldMemoryOperand
etc. should not unfold the instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107509 91177308-0d34-0410-b5e6-96231b3b80d8
PrologEpilog code, and use it to determine whether
the asm forces stack alignment or not. gcc consistently
does not do this for GCC-style asms; Apple gcc inconsistently
sometimes does it for asm blocks. There is no
convenient place to put a bit in either the SDNode or
the MachineInstr form, so I've added an extra operand
to each; unlovely, but it does allow for expansion for
more bits, should we need it. PR 5125. Some
existing testcases are affected.
The operand lists of the SDNode and MachineInstr forms
are indexed with awesome mnemonics, like "2"; I may
fix this someday, but not now. I'm not making it any
worse. If anyone is inspired I think you can find all
the right places from this patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107506 91177308-0d34-0410-b5e6-96231b3b80d8
getFunctionAlignment and the corresponding use of that value in the ARM
asm printer, but now we're using the standard asm printer. The result of
this was that function alignments were dropped completely for Thumb functions.
Radar 8143571.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107435 91177308-0d34-0410-b5e6-96231b3b80d8
Objective-C metadata types which should be marked as "weak", but which the
linker will remove upon final linkage. However, this linkage isn't specific to
Objective-C.
For example, the "objc_msgSend_fixup_alloc" symbol is defined like this:
.globl l_objc_msgSend_fixup_alloc
.weak_definition l_objc_msgSend_fixup_alloc
.section __DATA, __objc_msgrefs, coalesced
.align 3
l_objc_msgSend_fixup_alloc:
.quad _objc_msgSend_fixup
.quad L_OBJC_METH_VAR_NAME_1
This is different from the "linker_private" linkage type, because it can't have
the metadata defined with ".weak_definition".
Currently only supported on Darwin platforms.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107433 91177308-0d34-0410-b5e6-96231b3b80d8
available in a register. This is pretty primitive, but it reduces the
number of instructions in common testcases by 4%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107380 91177308-0d34-0410-b5e6-96231b3b80d8
A partial redefine needs to be treated like a tied operand, and the register
must be reloaded while processing use operands.
This fixes a bug where partially redefined registers were processed as normal
defs with a reload added. The reload could clobber another use operand if it was
a kill that allowed register reuse.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107193 91177308-0d34-0410-b5e6-96231b3b80d8
The LowerSubregs pass needs to preserve implicit def operands attached to
EXTRACT_SUBREG instructions when it replaces those instructions with copies.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107189 91177308-0d34-0410-b5e6-96231b3b80d8
of getPhysicalRegisterRegClass with it.
If we want to make a copy (or estimate its cost), it is better to use the
smallest class as more efficient operations might be possible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107140 91177308-0d34-0410-b5e6-96231b3b80d8
There are 2 changes relative to the previous version of the patch:
1) For the "simple" if-conversion case, there's no need to worry about
RemoveExtraEdges not handling an unanalyzable branch. Predicated terminators
are ignored in this context, so RemoveExtraEdges does the right thing.
This might break someday if we ever treat indirect branches (BRIND) as
predicable, but for now, I just removed this part of the patch, because
in the case where we do not add an unconditional branch, we rely on keeping
the fall-through edge to CvtBBI (which is empty after this transformation).
The change relative to the previous patch is:
@@ -1036,10 +1036,6 @@
IterIfcvt = false;
}
- // RemoveExtraEdges won't work if the block has an unanalyzable branch,
- // which is typically the case for IfConvertSimple, so explicitly remove
- // CvtBBI as a successor.
- BBI.BB->removeSuccessor(CvtBBI->BB);
RemoveExtraEdges(BBI);
// Update block info. BB can be iteratively if-converted.
2) My patch exposed a bug in the code for merging the tail of a "diamond",
which had previously never been exercised. The code was simply checking that
the tail had a single predecessor, but there was a case in
MultiSource/Benchmarks/VersaBench/dbms where that single predecessor was
neither edge of the diamond. I added the following change to check for
that:
@@ -1276,7 +1276,18 @@
// tail, add a unconditional branch to it.
if (TailBB) {
BBInfo TailBBI = BBAnalysis[TailBB->getNumber()];
- if (TailBB->pred_size() == 1 && !TailBBI.HasFallThrough) {
+ bool CanMergeTail = !TailBBI.HasFallThrough;
+ // There may still be a fall-through edge from BBI1 or BBI2 to TailBB;
+ // check if there are any other predecessors besides those.
+ unsigned NumPreds = TailBB->pred_size();
+ if (NumPreds > 1)
+ CanMergeTail = false;
+ else if (NumPreds == 1 && CanMergeTail) {
+ MachineBasicBlock::pred_iterator PI = TailBB->pred_begin();
+ if (*PI != BBI1->BB && *PI != BBI2->BB)
+ CanMergeTail = false;
+ }
+ if (CanMergeTail) {
MergeBlocks(BBI, TailBBI);
TailBBI.IsDone = true;
} else {
With these fixes, I was able to run all the SingleSource and MultiSource
tests successfully.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107110 91177308-0d34-0410-b5e6-96231b3b80d8
have to be registers, per gcc documentation. This affects
the logic for determining what "g" should lower to. PR 7393.
A couple of existing testcases are affected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107079 91177308-0d34-0410-b5e6-96231b3b80d8
When an instruction has tied operands and physreg defines, we must take extra
care that the tied operands conflict with neither physreg defs nor uses.
The special treatment is given to inline asm and instructions with tied operands
/ early clobbers and physreg defines.
This fixes PR7509.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107043 91177308-0d34-0410-b5e6-96231b3b80d8
regressions.
--- Reverse-merging r106939 into '.':
U test/CodeGen/Thumb2/thumb2-ifcvt3.ll
U lib/CodeGen/IfConversion.cpp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106951 91177308-0d34-0410-b5e6-96231b3b80d8
if-conversion. The RemoveExtraEdges function doesn't work for blocks that
end with unanalyzable branches, so in those cases, the "extra" edges must
be explicitly removed. The CopyAndPredicateBlock and MergeBlocks methods
can also avoid copying successor edges due to branches that have already
been removed. The latter case is especially helpful when MergeBlocks is
called for handling "diamond" if-conversions, where otherwise you can end
up with some weird intermediate states in the CFG. Unfortunately I've
been unable to find cases where this cleanup actually makes a significant
difference in the code. There is one test where we manage to remove an
empty block at the end of a function. Radar 6911268.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106939 91177308-0d34-0410-b5e6-96231b3b80d8
CopyFromReg nodes for aliasing registers (AX and AL). This confuses the fast
register allocator.
Instead of CopyFromReg(AL), use ExtractSubReg(CopyFromReg(AX), sub_8bit).
This fixes PR7312.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106934 91177308-0d34-0410-b5e6-96231b3b80d8
introduced in r106343, but only showed up recently (with a particular compiler &
linker combination) because of the particular check, and because we have no
builtin checking for dereferencing the end of an array, which is truly
unfortunate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106908 91177308-0d34-0410-b5e6-96231b3b80d8
for an "i" constraint should get lowered; PR 6309. While
this argument was passed around a lot, this is the only
place it was used, so it goes away from a lot of other
places.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106893 91177308-0d34-0410-b5e6-96231b3b80d8
original SDNode. This is badness. Also, this function allows one SDNode to point
multiple flags to another SDNode. Badness as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106793 91177308-0d34-0410-b5e6-96231b3b80d8
address requires a register or secondary load to compute
(most PIC modes). This improves "g" constraint handling. 8015842.
The test from 2007 is attempting to test the fix for PR1761,
but since -relocation-model=static doesn't work on Darwin
x86-64, it was not testing what it was supposed to be testing
and was passing erroneously. Fixed to use Linux x86-64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106779 91177308-0d34-0410-b5e6-96231b3b80d8
CoalescerPair can determine if a copy can be coalesced, and which register gets
merged away. The old logic in SimpleRegisterCoalescing had evolved into
something a bit too convoluted.
This second attempt fixes some crashes that only occurred Linux.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106769 91177308-0d34-0410-b5e6-96231b3b80d8
when the condition is constant. This optimization shouldn't be
necessary, because codegen shouldn't be able to find dead control
paths that the IR-level optimizer can't find. And it's undesirable,
because it encourages bugpoint to leave "br i1 false" branches
in its output. And it wasn't updating the CFG.
I updated all the tests I could, but some tests are too reduced
and I wasn't able to meaningfully preserve them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106748 91177308-0d34-0410-b5e6-96231b3b80d8
CoalescerPair can determine if a copy can be coalesced, and which register gets
merged away. The old logic in SimpleRegisterCoalescing had evolved into
something a bit too convoluted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106701 91177308-0d34-0410-b5e6-96231b3b80d8
void t(int *cp0, int *cp1, int *dp, int fmd) {
int c0, c1, d0, d1, d2, d3;
c0 = (*cp0++ & 0xffff) | ((*cp1++ << 16) & 0xffff0000);
c1 = (*cp0++ & 0xffff) | ((*cp1++ << 16) & 0xffff0000);
/* ... */
}
It code gens into something pretty bad. But with this change (analogous to the
X86 back-end), it will use ldm and generate few instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106693 91177308-0d34-0410-b5e6-96231b3b80d8
into the same node, but with different non-memory operands, we need to replace
the memory operands after it's finished morphing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106643 91177308-0d34-0410-b5e6-96231b3b80d8
Measurements show that it does not speed up coalescing, so there is no reason
the keep the added complexity around.
Also clean out some unused methods and static functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106548 91177308-0d34-0410-b5e6-96231b3b80d8
opportunities. For example, this lets it emit this:
movq (%rax), %rcx
addq %rdx, %rcx
instead of this:
movq %rdx, %rcx
addq (%rax), %rcx
in the case where %rdx has subsequent uses. It's the same number
of instructions, and usually the same encoding size on x86, but
it appears faster, and in general, it may allow better scheduling
for the load.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106493 91177308-0d34-0410-b5e6-96231b3b80d8
This allows the fast regiser allocator to remove redundant
register moves.
Update a set of tests that depend on the register allocator
to be linear scan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106420 91177308-0d34-0410-b5e6-96231b3b80d8
use sharing map. The reconcileNewOffset logic already forces a
separate use if the kinds differ, so incorporating the kind in the
key means we can track more sharing opportunities.
More sharing means fewer total uses to track, which means smaller
problem sizes, which means the conservative throttles don't kick
in as often.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106396 91177308-0d34-0410-b5e6-96231b3b80d8
- This fixed a number of bugs in if-converter, tail merging, and post-allocation
scheduler. If-converter now runs branch folding / tail merging first to
maximize if-conversion opportunities.
- Also changed the t2IT instruction slightly. It now defines the ITSTATE
register which is read by instructions in the IT block.
- Added Thumb2 specific hazard recognizer to ensure the scheduler doesn't
change the instruction ordering in the IT block (since IT mask has been
finalized). It also ensures no other instructions can be scheduled between
instructions in the IT block.
This is not yet enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106344 91177308-0d34-0410-b5e6-96231b3b80d8
instructions, but it doesn't really understand live ranges, so the first
INSERT_SUBREG uses an implicitly defined register.
Fix it in LiveVariableAnalysis by adding the <undef> flag.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106333 91177308-0d34-0410-b5e6-96231b3b80d8
basic tests.
This has been well tested on Darwin but not elsewhere.
It should work provided the linker correctly resolves
B.W <label in other function>
which it has not seen before, at least from llvm-based
compilers. I'm leaving the arm-tail-calls switch in
until I see if there's any problems because of that;
it might need to be disabled for some environments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106299 91177308-0d34-0410-b5e6-96231b3b80d8
does for {flags}. If we create virtual registers of the CCR class, RegAllocFast
may try to spill them, and we can't do that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106289 91177308-0d34-0410-b5e6-96231b3b80d8
LiveVariableAnalysis was a bit picky about a register only being redefined once,
but that really isn't necessary.
Here is an example of chained INSERT_SUBREGs that we can handle now:
68 %reg1040<def> = INSERT_SUBREG %reg1040, %reg1028<kill>, 14
register: %reg1040 +[70,134:0)
76 %reg1040<def> = INSERT_SUBREG %reg1040, %reg1029<kill>, 13
register: %reg1040 replace range with [70,78:1) RESULT: %reg1040,0.000000e+00 = [70,78:1)[78,134:0) 0@78-(134) 1@70-(78)
84 %reg1040<def> = INSERT_SUBREG %reg1040, %reg1030<kill>, 12
register: %reg1040 replace range with [78,86:2) RESULT: %reg1040,0.000000e+00 = [70,78:1)[78,86:2)[86,134:0) 0@86-(134) 1@70-(78) 2@78-(86)
92 %reg1040<def> = INSERT_SUBREG %reg1040, %reg1031<kill>, 11
register: %reg1040 replace range with [86,94:3) RESULT: %reg1040,0.000000e+00 = [70,78:1)[78,86:2)[86,94:3)[94,134:0) 0@94-(134) 1@70-(78) 2@78-(86) 3@86-(94)
rdar://problem/8096390
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106152 91177308-0d34-0410-b5e6-96231b3b80d8
will conflict with another live range. The place which creates this scenerio is
the code in X86 that lowers a select instruction by splitting the MBBs. This
eliminates the need to check from the bottom up in an MBB for live pregs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106066 91177308-0d34-0410-b5e6-96231b3b80d8
Early clobbers defining a virtual register were first alocated to a physreg and
then processed as a physreg EC, spilling the virtreg.
This fixes PR7382.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105998 91177308-0d34-0410-b5e6-96231b3b80d8
Given a copy instruction, CoalescerPair can determine which registers to
coalesce in order to eliminate the copy. It deals with all the subreg fun to
determine a tuple (DstReg, SrcReg, SubIdx) such that:
- SrcReg is a virtual register that will disappear after coalescing.
- DstReg is a virtual or physical register whose live range will be extended.
- SubIdx is 0 when DstReg is a physical register.
- SrcReg can be joined with DstReg:SubIdx.
CoalescerPair::isCoalescable() determines if another copy instruction is
compatible with the same tuple. This fixes some NEON miscompilations where
shuffles are getting coalesced as if they were copies.
The CoalescerPair class will replace a lot of the spaghetti logic in JoinCopy
later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105997 91177308-0d34-0410-b5e6-96231b3b80d8
replacing the overly conservative checks that I had introduced recently to
deal with correctness issues. This makes a pretty noticable difference
in our testcases where reg_sequences are used. I've updated one test to
check that we no longer emit the unnecessary subreg moves.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105991 91177308-0d34-0410-b5e6-96231b3b80d8
symbols as declarations in the X86 backend. This would manifest
on darwin x86-32 as errors like this with -fvisibility=hidden:
symbol '__ZNSbIcED1Ev' can not be undefined in a subtraction expression
This fixes PR7353.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105954 91177308-0d34-0410-b5e6-96231b3b80d8
i64 and f64 types, but now it also handle Neon vector types, so the f64 result
of VMOVDRR may need to be converted to a Neon type. Radar 8084742.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105845 91177308-0d34-0410-b5e6-96231b3b80d8
This is a bit of a hack to make inline asm look more like call instructions.
It would be better to produce correct dead flags during isel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105749 91177308-0d34-0410-b5e6-96231b3b80d8
there could be multiple subexpressions within a single expansion which
require insert point adjustment. This fixes PR7306.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105510 91177308-0d34-0410-b5e6-96231b3b80d8
replace an OpA with a widened OpB, it is possible to get new uses of OpA due to CSE
when recursively updating nodes. Since OpA has been processed, the new uses are
not examined again. The patch checks if this occurred and it it did, updates the
new uses of OpA to use OpB.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105453 91177308-0d34-0410-b5e6-96231b3b80d8
registers it defines then interfere with an existing preg live range.
For instance, if we had something like these machine instructions:
BB#0
... = imul ... EFLAGS<imp-def,dead>
test ..., EFLAGS<imp-def>
jcc BB#2 EFLAGS<imp-use>
BB#1
... ; fallthrough to BB#2
BB#2
... ; No code that defines EFLAGS
jcc ... EFLAGS<imp-use>
Machine sink will come along, see that imul implicitly defines EFLAGS, but
because it's "dead", it assumes that it can move imul into BB#2. But when it
does, imul's "dead" imp-def of EFLAGS is raised from the dead (a zombie) and
messes up the condition code for the jump (and pretty much anything else which
relies upon it being correct).
The solution is to know which pregs are live going into a basic block. However,
that information isn't calculated at this point. Nor does the LiveVariables pass
take into account non-allocatable physical registers. In lieu of this, we do a
*very* conservative pass through the basic block to determine if a preg is live
coming out of it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105387 91177308-0d34-0410-b5e6-96231b3b80d8
expansion is the same as that used by LegalizeDAG.
The resulting code sucks in terms of performance/codesize on x86-32 for a
64-bit operation; I haven't looked into whether different expansions might be
better in general.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105378 91177308-0d34-0410-b5e6-96231b3b80d8
that are too large. This causes the freebsd bootloader to be too
large apparently.
It's unclear if this should be an -Os or -Oz thing. Thoughts welcome.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105228 91177308-0d34-0410-b5e6-96231b3b80d8
optimization level.
This only really affects llc for now because both the llvm-gcc and clang front
ends override the default register allocator. I intend to remove that code later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104904 91177308-0d34-0410-b5e6-96231b3b80d8
Mon Ping provided; unfortunately bugpoint failed to
reduce it, but I think it's important to have a test for
this in the suite. 8023512.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104624 91177308-0d34-0410-b5e6-96231b3b80d8
Fix it by changing the T2I_rbin_s_is multiclass to handle the CPSR
output and 'S' suffix in the same way as T2I_bin_s_irs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104531 91177308-0d34-0410-b5e6-96231b3b80d8
copying VFP subregs. This exposed a bunch of dead code in the *spill-q.ll
tests, so I tweaked those tests to keep that code from being optimized away.
Radar 7872877.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104415 91177308-0d34-0410-b5e6-96231b3b80d8
so that it will continue to test what it was meant to test when I commit a
separate change for better support of BUILD_VECTOR and VECTOR_SHUFFLE for Neon.
Fix a DAG combiner crash exposed by this test change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104380 91177308-0d34-0410-b5e6-96231b3b80d8
pass after isel instead of being interlaced with it, we can
trust that all the code for a function has been isel'd before
it is run.
The practical impact of this is that we can scan for machine
instr phis instead of doing a fuzzy match on the LLVM BB for
phi nodes. Doing the fuzzy match required knowing when isel
would produce an fp reg stack phi which was gross. It was
also wrong in cases where select got lowered to a branch
tree because cmovs aren't available (PR6828).
Just do the scan on machine phis which is simpler, faster
and more correct. This fixes PR6828.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104333 91177308-0d34-0410-b5e6-96231b3b80d8
definitions of the virtual register.
This happens when spilling the registers produced by REG_SEQUENCE:
%reg1047:5<def>, %reg1047:6<def>, %reg1047:7<def> = VLD3d8 %reg1033, 0, pred:14, pred:%reg0
The rewriter would spill the register multiple times, dead store elimination
tried to keep up, but ended up cutting the branch it was sitting on.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104321 91177308-0d34-0410-b5e6-96231b3b80d8
operand on the left, the interesting operand is on the right. This
fixes a bug where LSR was failing to recognize ICmpZero uses,
which led it to be unable to reverse the induction variable in the
attached testcase.
Delete test/CodeGen/X86/stack-color-with-reg-2.ll, because its test
is extremely fragile and hard to meaningfully update.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104262 91177308-0d34-0410-b5e6-96231b3b80d8
the addressing modes don't make this trivially easy. This allows
it to avoid falling into the less precise heuristics in more
cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104186 91177308-0d34-0410-b5e6-96231b3b80d8
lowering REG_SEQUENCE instructions.
Insert copies for REG_SEQUENCE sources not killed to avoid breaking later passes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104146 91177308-0d34-0410-b5e6-96231b3b80d8
The trouble arises when the result of a vector cmp + sext is then and'ed with all ones. Instcombine will turn it into a vector cmp + zext, dag combiner will miss turning it into a vsetcc and hell breaks loose after that.
Teach dag combine to turn a vector cpm + zest into a vsetcc + and 1. This fixes rdar://7923010.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@104094 91177308-0d34-0410-b5e6-96231b3b80d8
While that approach works wonders for register pressure, it tends to break
everything.
This should unbreak the arm-linux builder and fix a number of miscompilations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103946 91177308-0d34-0410-b5e6-96231b3b80d8
<1xi64> -> i64 to work in MMX registers on hosts where -no-sse
is the default (not mine). The right thing is
to accept this and make i64->f64 conversions go through memory,
but I don't have time right now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103914 91177308-0d34-0410-b5e6-96231b3b80d8
(This worked as of about 6 months ago and I didn't track down
exactly what broke it; I think this fix is appropriate.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103911 91177308-0d34-0410-b5e6-96231b3b80d8
allow target to override it in order to map register classes to illegal
but synthesizable types. e.g. v4i64, v8i64 for ARM / NEON.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103854 91177308-0d34-0410-b5e6-96231b3b80d8
replace the check with the appropriate predicate. Modify the testcase to reflect
the correct code. (It should be saving callee-saved registers on the stack
allocated by the calling fuction.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103829 91177308-0d34-0410-b5e6-96231b3b80d8
This allows us to add accurate kill markers, something the scavenger likes.
Add some more tests from ARM that needed this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103521 91177308-0d34-0410-b5e6-96231b3b80d8
Sorry for the big change. The path leading up to this patch had some TableGen
changes that I didn't want to commit before I knew they were useful. They
weren't, and this version does not need them.
The fast register allocator now does no liveness calculations. Instead it relies
on kill flags provided by isel. (Currently those kill flags are also ignored due
to isel bugs). The allocation algorithm is supposed to work with any subset of
valid kill flags. More kill flags simply means fewer spills inserted.
Registers are allocated from a working set that contains no aliases. That means
most allocations can be done directly without expensive alias checks. When the
working set runs out of registers we do the full alias check to find new free
registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103488 91177308-0d34-0410-b5e6-96231b3b80d8
LSRUse's Regs set after all pruning is done, rather than trying
to do it on the fly, which can produce an incomplete result.
This fixes a case where heuristic pruning was stripping all
formulae from a use, which led the solver to enter an infinite
loop.
Also, add a few asserts to diagnose this kind of situation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103328 91177308-0d34-0410-b5e6-96231b3b80d8
getConstantFP to accept the two supported long double
target types. This was not the original intent, but
there are other places that assume this works and it's
easy enough to do.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103299 91177308-0d34-0410-b5e6-96231b3b80d8
Users can write broken code that emits the same label twice with asm renaming,
detect this and emit a fatal backend error instead of aborting.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103140 91177308-0d34-0410-b5e6-96231b3b80d8
beneficial cases. See the changes in test/CodeGen/X86/tail-opts.ll and
test/CodeGen/ARM/ifcvt2.ll for details.
The fix is to change HashEndOfMBB to hash at most one instruction,
instead of trying to apply heuristics about when it will be profitable to
consider more than one instruction. The regular tail-merging heuristics
are already prepared to handle the same cases, and they're more precise.
Also, make test/CodeGen/ARM/ifcvt5.ll and
test/CodeGen/Thumb2/thumb2-branch.ll slightly more complex so that they
continue to test what they're intended to test.
And, this eliminates the problem in
test/CodeGen/Thumb2/2009-10-15-ITBlockBranch.ll, the testcase from
PR5204. Update it accordingly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102907 91177308-0d34-0410-b5e6-96231b3b80d8
indexes could be of a different value type. Or not even using the same SDNode
for the constant (weird, I know). Compare the actual values instead of the
pointers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102791 91177308-0d34-0410-b5e6-96231b3b80d8
call that might throw. The landing pad assumes that all registers are in stack
slots.
We used to spill those dirty CSRs after the call, and the stack slots would be
wrong when arriving at the landing pad.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102770 91177308-0d34-0410-b5e6-96231b3b80d8
of different register classes. e.g.
%reg1048:3<def> = EXTRACT_SUBREG %RAX<kill>, 3
Where %reg1048 is a GR32 register. This is not impossible to handle, but it is
pretty hard and very rare.
This should unbreak the dragonegg builder.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102672 91177308-0d34-0410-b5e6-96231b3b80d8
alignment of globals to the preferred alignment, but only when
there is no section specified on the global (by far the common
case).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102515 91177308-0d34-0410-b5e6-96231b3b80d8
otherwise labels get incorrectly merged. We handled this by emitting a
".byte 0", but this isn't correct on thumb/arm targets where the text segment
needs to be a multiple of 2/4 bytes. Handle this by emitting a noop. This
is more gross than it should be because arm/ppc are not fully mc'ized yet.
This fixes rdar://7908505
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102400 91177308-0d34-0410-b5e6-96231b3b80d8
doesn't dominate the header is needed, don't check whether the increment
expression has computable loop evolution. While the operands of an
addrec are required to be loop-invariant, they're not required to
dominate any part of the loop. This fixes PR6914.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102389 91177308-0d34-0410-b5e6-96231b3b80d8
alignment of globals with a specified alignment, we fix
common variables to obey their alignment. Add a comment
explaining why this behavior is important.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102365 91177308-0d34-0410-b5e6-96231b3b80d8
alignment to match what's used in clang and GCC for __alignof, rather
than trying to guess what Legalize is going to be doing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102206 91177308-0d34-0410-b5e6-96231b3b80d8
misses an opportunity to fold add operands, but folds them
after LSR has separated them out. This fixes rdar://7886751.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102157 91177308-0d34-0410-b5e6-96231b3b80d8
optimization for non-leaf functions. This will be hooked up to gcc's
-momit-leaf-frame-pointer option. rdar://7886181
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101984 91177308-0d34-0410-b5e6-96231b3b80d8
in other types. fix this by only bumping zero-byte globals
up to a single byte if the *entire global* is zero size,
fixing PR6340.
This also fixes empty arrays etc to be handled correctly,
and only does this on subsection-via-symbols targets (aka
darwin) which is the only place where this matters.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101879 91177308-0d34-0410-b5e6-96231b3b80d8
the intrinsics. The reason for those i8* types is that the intrinsics are
overloaded on the vector type and we don't have a way to declare an intrinsic
where one argument is an overloaded vector type and another argument is a
pointer to the vector element type. The bitcasts added here will match what
the frontend will typically generate when these intrinsics are used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101840 91177308-0d34-0410-b5e6-96231b3b80d8
This doesn't occur much at all, it only seems to formed in the case
when the trunc optimization kicks in due to phase ordering. In that
case it is saves a few bytes on x86-32.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101350 91177308-0d34-0410-b5e6-96231b3b80d8
a load/or/and/store sequence into a narrower store when it is
safe. Daniel tells me that clang will start producing this sort
of thing with bitfields, and this does trigger a few dozen times
on 176.gcc produced by llvm-gcc even now.
This compiles code like CodeGen/X86/2009-05-28-DAGCombineCrash.ll
into:
movl %eax, 36(%rdi)
instead of:
movl $4294967295, %eax ## imm = 0xFFFFFFFF
andq 32(%rdi), %rax
shlq $32, %rcx
addq %rax, %rcx
movq %rcx, 32(%rdi)
and each of the testcases into a single store. Each of them used
to compile into craziness like this:
_test4:
movl $65535, %eax ## imm = 0xFFFF
andl (%rdi), %eax
shll $16, %esi
addl %eax, %esi
movl %esi, (%rdi)
ret
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101343 91177308-0d34-0410-b5e6-96231b3b80d8