Apparently we didn't keep an association of Compile Unit metadata nodes
to DIEs so looking up that parental context failed & thus caused no
DW_TAG_imported_modules to be emitted at the CU scope. Fix this by
adding the mapping & sure up the test case to verify this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181339 91177308-0d34-0410-b5e6-96231b3b80d8
Now even the small structures could be passed within byval (small enough
to be stored in GPRs).
In regression tests next function prototypes are checked:
PR15293:
%artz = type { i32 }
define void @foo(%artz* byval %s)
define void @foo2(%artz* byval %s, i32 %p, %artz* byval %s2)
foo: "s" stored in R0
foo2: "s" stored in R0, "s2" stored in R2.
Next AAPCS rules are checked:
5.5 Parameters Passing, C.4 and C.5,
"ParamSize" is parameter size in 32bit words:
-- NSAA != 0, NCRN < R4 and NCRN+ParamSize > R4.
Parameter should be sent to the stack; NCRN := R4.
-- NSAA != 0, and NCRN < R4, NCRN+ParamSize < R4.
Parameter stored in GPRs; NCRN += ParamSize.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181148 91177308-0d34-0410-b5e6-96231b3b80d8
at all of the operands. Previously it was skipping over implicit operands which
cause infinite looping when the two-address pass try to reschedule a
two-address instruction below the kill of tied operand.
I'm unable to come up with a reasonably sized test case.
rdar://13747577
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180906 91177308-0d34-0410-b5e6-96231b3b80d8
the things, and renames it to CBindingWrapping.h. I also moved
CBindingWrapping.h into Support/.
This new file just contains the macros for defining different wrap/unwrap
methods.
The calls to those macros, as well as any custom wrap/unwrap definitions
(like for array of Values for example), are put into corresponding C++
headers.
Doing this required some #include surgery, since some .cpp files relied
on the fact that including Wrap.h implicitly caused the inclusion of a
bunch of other things.
This also now means that the C++ headers will include their corresponding
C API headers; for example Value.h must include llvm-c/Core.h. I think
this is harmless, since the C API headers contain just external function
declarations and some C types, so I don't believe there should be any
nasty dependency issues here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180881 91177308-0d34-0410-b5e6-96231b3b80d8
report a fatal error. This allows us to continue processing the translation
unit. Test case to come on the clang side because we need an inline asm
diagnostics handler in place.
rdar://13446483
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180873 91177308-0d34-0410-b5e6-96231b3b80d8
Optimize CONCAT_VECTOR nodes that merge EXTRACT_SUBVECTOR values that extract from the same vector.
rdar://13402653
PR15866
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180871 91177308-0d34-0410-b5e6-96231b3b80d8
register-indirect address with an offset of 0.
It used to be that a DBG_VALUE is a register-indirect value if the offset
(operand 1) is nonzero. The new convention is that a DBG_VALUE is
register-indirect if the first operand is a register and the second
operand is an immediate. For plain registers use the combination reg, reg.
rdar://problem/13658587
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180816 91177308-0d34-0410-b5e6-96231b3b80d8
First, taking advantage of the fact that the virtual base registers are allocated in order of the local frame offsets, remove the quadratic register-searching behavior. Because of the ordering, we only need to check the last virtual base register created.
Second, store the frame index in the FrameRef structure, and get the frame index and the local offset from this structure at the top of the loop iteration. This allows us to de-nest the loops in insertFrameReferenceRegisters (and I think makes the code cleaner). I also moved the needsFrameBaseReg check into the first loop over instructions so that we don't bother pushing FrameRefs for instructions that don't want a virtual base register anyway.
Lastly, and this is the only functionality change, avoid the creation of single-use virtual base registers. These are currently not useful because, in general, they end up replacing what would be one r+r instruction with an add and a r+i instruction. Committing this removes the XFAIL in CodeGen/PowerPC/2007-09-07-LoadStoreIdxForms.ll
Jim has okayed this off-list.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180799 91177308-0d34-0410-b5e6-96231b3b80d8
The `llvm.tls_init_funcs' (created by the front-end) holds pointers to the TLS
initialization functions. These need to be placed into the correct section so
that they are run before `main()'.
<rdar://problem/13733006>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180737 91177308-0d34-0410-b5e6-96231b3b80d8
to determine whether or not we're on a darwin platform for debug code
emitting.
Solves the problem of a module with no triple on the command line
and no triple in the module using non-gdb ok features on darwin. Fix
up the member-pointers test to check the correct things for cross
platform (DW_FORM_flag is a good prefix).
Unfortunately no testcase because I have no ideas how to test something
without a triple and without a triple in the module yet check
precisely on two platforms. Ideas welcome.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180660 91177308-0d34-0410-b5e6-96231b3b80d8
Clarify documentation and API to make the difference between register and
register-indirect addressed locations more explicit. Put in a comment
to point out that with the current implementation we cannot specify
a register-indirect location with offset 0 (a breg 0 in DWARF).
No functionality change intended.
rdar://problem/13658587
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180641 91177308-0d34-0410-b5e6-96231b3b80d8
TLVs probably won't be as common as the other types of variables. Check for them
last before defaulting to "DATA".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180631 91177308-0d34-0410-b5e6-96231b3b80d8
This already helps SSE2 x86 a lot because it lacks an efficient way to
represent a vector select. The long term goal is to enable the backend to match
a canonicalized pattern into a single instruction (e.g. vabs or pabs).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180597 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes PR15838. Need to check for blocks with nothing but dbg.value.
I'm not sure how to force this situation with a unit test. I tried to
reduce the test case in PR15838 (1k lines of metadata) but gave up.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180227 91177308-0d34-0410-b5e6-96231b3b80d8
For now, we just reschedule instructions that use the copied vregs and
let regalloc elliminate it. I would really like to eliminate the
copies on-the-fly during scheduling, but we need a complete
implementation of repairIntervalsInRange() first.
The general strategy is for the register coalescer to eliminate as
many global copies as possible and shrink live ranges to be
extended-basic-block local. The coalescer should not have to worry
about resolving local copies (e.g. it shouldn't attemp to reorder
instructions). The scheduler is a much better place to deal with local
interference. The coalescer side of this equation needs work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180193 91177308-0d34-0410-b5e6-96231b3b80d8
When MachineScheduler is enabled, this functionality can be
removed. Until then, provide a way to disable it for test cases and
designing MachineScheduler heuristics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180192 91177308-0d34-0410-b5e6-96231b3b80d8
This exposed an issue with PowerPC AltiVec where it appears it was setting the wrong vector boolean contents. The included change
fixes the PowerPC tests, and was OK'd by Hal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180129 91177308-0d34-0410-b5e6-96231b3b80d8
1) Disallow 'returned' on parameter that is also 'sret' (no sensible semantics, as far as I can tell).
2) Conservatively disallow tail calls through 'returned' parameters that also are 'zext' or 'sext' (for consistency with treatment of other zero-extending and sign-extending operations in tail call position detection...can be revised later to handle situations that can be determined to be safe).
This is a new attribute that is not yet used, so there is no impact.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180118 91177308-0d34-0410-b5e6-96231b3b80d8
Also add a check for llvm.used in the verifier and simplify clients now that
they can assume they have a ConstantArray.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180019 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r179840 with a fix to test/DebugInfo/two-cus-from-same-file.ll
I'm not sure why that test only failed on ARM & MIPS and not X86 Linux, even
though the debug info was clearly invalid on all of them, but this ought to fix
it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179996 91177308-0d34-0410-b5e6-96231b3b80d8
Rather than just splitting the input type and hoping for the best, apply
a bit more cleverness. Just splitting the types until the source is
legal often leads to an illegal result time, which is then widened and a
scalarization step is introduced which leads to truly horrible code
generation. With the loop vectorizer, these sorts of operations are much
more common, and so it's worth extra effort to do them well.
Add a legalization hook for the operands of a TRUNCATE node, which will
be encountered after the result type has been legalized, but if the
operand type is still illegal. If simple splitting of both types
ends up with the result type of each half still being legal, just
do that (v16i16 -> v16i8 on ARM, for example). If, however, that would
result in an illegal result type (v8i32 -> v8i8 on ARM, for example),
we can get more clever with power-two vectors. Specifically,
split the input type, but also widen the result element size, then
concatenate the halves and truncate again. For example on ARM,
To perform a "%res = v8i8 trunc v8i32 %in" we transform to:
%inlo = v4i32 extract_subvector %in, 0
%inhi = v4i32 extract_subvector %in, 4
%lo16 = v4i16 trunc v4i32 %inlo
%hi16 = v4i16 trunc v4i32 %inhi
%in16 = v8i16 concat_vectors v4i16 %lo16, v4i16 %hi16
%res = v8i8 trunc v8i16 %in16
This allows instruction selection to generate three VMOVN instructions
instead of a sequences of moves, stores and loads.
Update the ARMTargetTransformInfo to take this improved legalization
into account.
Consider the simplified IR:
define <16 x i8> @test1(<16 x i32>* %ap) {
%a = load <16 x i32>* %ap
%tmp = trunc <16 x i32> %a to <16 x i8>
ret <16 x i8> %tmp
}
define <8 x i8> @test2(<8 x i32>* %ap) {
%a = load <8 x i32>* %ap
%tmp = trunc <8 x i32> %a to <8 x i8>
ret <8 x i8> %tmp
}
Previously, we would generate the truly hideous:
.syntax unified
.section __TEXT,__text,regular,pure_instructions
.globl _test1
.align 2
_test1: @ @test1
@ BB#0:
push {r7}
mov r7, sp
sub sp, sp, #20
bic sp, sp, #7
add r1, r0, #48
add r2, r0, #32
vld1.64 {d24, d25}, [r0:128]
vld1.64 {d16, d17}, [r1:128]
vld1.64 {d18, d19}, [r2:128]
add r1, r0, #16
vmovn.i32 d22, q8
vld1.64 {d16, d17}, [r1:128]
vmovn.i32 d20, q9
vmovn.i32 d18, q12
vmov.u16 r0, d22[3]
strb r0, [sp, #15]
vmov.u16 r0, d22[2]
strb r0, [sp, #14]
vmov.u16 r0, d22[1]
strb r0, [sp, #13]
vmov.u16 r0, d22[0]
vmovn.i32 d16, q8
strb r0, [sp, #12]
vmov.u16 r0, d20[3]
strb r0, [sp, #11]
vmov.u16 r0, d20[2]
strb r0, [sp, #10]
vmov.u16 r0, d20[1]
strb r0, [sp, #9]
vmov.u16 r0, d20[0]
strb r0, [sp, #8]
vmov.u16 r0, d18[3]
strb r0, [sp, #3]
vmov.u16 r0, d18[2]
strb r0, [sp, #2]
vmov.u16 r0, d18[1]
strb r0, [sp, #1]
vmov.u16 r0, d18[0]
strb r0, [sp]
vmov.u16 r0, d16[3]
strb r0, [sp, #7]
vmov.u16 r0, d16[2]
strb r0, [sp, #6]
vmov.u16 r0, d16[1]
strb r0, [sp, #5]
vmov.u16 r0, d16[0]
strb r0, [sp, #4]
vldmia sp, {d16, d17}
vmov r0, r1, d16
vmov r2, r3, d17
mov sp, r7
pop {r7}
bx lr
.globl _test2
.align 2
_test2: @ @test2
@ BB#0:
push {r7}
mov r7, sp
sub sp, sp, #12
bic sp, sp, #7
vld1.64 {d16, d17}, [r0:128]
add r0, r0, #16
vld1.64 {d20, d21}, [r0:128]
vmovn.i32 d18, q8
vmov.u16 r0, d18[3]
vmovn.i32 d16, q10
strb r0, [sp, #3]
vmov.u16 r0, d18[2]
strb r0, [sp, #2]
vmov.u16 r0, d18[1]
strb r0, [sp, #1]
vmov.u16 r0, d18[0]
strb r0, [sp]
vmov.u16 r0, d16[3]
strb r0, [sp, #7]
vmov.u16 r0, d16[2]
strb r0, [sp, #6]
vmov.u16 r0, d16[1]
strb r0, [sp, #5]
vmov.u16 r0, d16[0]
strb r0, [sp, #4]
ldm sp, {r0, r1}
mov sp, r7
pop {r7}
bx lr
Now, however, we generate the much more straightforward:
.syntax unified
.section __TEXT,__text,regular,pure_instructions
.globl _test1
.align 2
_test1: @ @test1
@ BB#0:
add r1, r0, #48
add r2, r0, #32
vld1.64 {d20, d21}, [r0:128]
vld1.64 {d16, d17}, [r1:128]
add r1, r0, #16
vld1.64 {d18, d19}, [r2:128]
vld1.64 {d22, d23}, [r1:128]
vmovn.i32 d17, q8
vmovn.i32 d16, q9
vmovn.i32 d18, q10
vmovn.i32 d19, q11
vmovn.i16 d17, q8
vmovn.i16 d16, q9
vmov r0, r1, d16
vmov r2, r3, d17
bx lr
.globl _test2
.align 2
_test2: @ @test2
@ BB#0:
vld1.64 {d16, d17}, [r0:128]
add r0, r0, #16
vld1.64 {d18, d19}, [r0:128]
vmovn.i32 d16, q8
vmovn.i32 d17, q9
vmovn.i16 d16, q8
vmov r0, r1, d16
bx lr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179989 91177308-0d34-0410-b5e6-96231b3b80d8
I think it's almost impossible to fold atomic fences profitably under
LLVM/C++11 semantics. As a result, this is now unused and just
cluttering up the target interface.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179940 91177308-0d34-0410-b5e6-96231b3b80d8
trying to move as much FastISel logic as possible out of the main path in
SelectionDAGISel - intermixing them just adds confusion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179902 91177308-0d34-0410-b5e6-96231b3b80d8
Adding another CU-wide list, in this case of imported_modules (since they
should be relatively rare, it seemed better to add a list where each element
had a "context" value, rather than add a (usually empty) list to every scope).
This takes care of DW_TAG_imported_module, but to fully address PR14606 we'll
need to expand this to cover DW_TAG_imported_declaration too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179836 91177308-0d34-0410-b5e6-96231b3b80d8
This is a rework of the broken parts in r179373 which were subsequently reverted in r179374 due to incompatibility with C++98 compilers. This version should be ok under C++98.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179520 91177308-0d34-0410-b5e6-96231b3b80d8
The register allocator expects minimal physreg live ranges. Schedule
physreg copies accordingly. This is slightly tricky when they occur in
the middle of the scheduling region. For now, this is handled by
rescheduling the copy when its associated instruction is
scheduled. Eventually we may instead bundle them, but only if we can
preserve the bundles as parallel copies during regalloc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179449 91177308-0d34-0410-b5e6-96231b3b80d8
When debugging performance regressions we often ask ourselves if the regression
that we see is due to poor isel/sched/ra or due to some micro-architetural
problem. When comparing two code sequences one good way to rule out front-end
bottlenecks (and other the issues) is to force code alignment. This pass adds
a flag that forces the alignment of all of the basic blocks in the program.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179353 91177308-0d34-0410-b5e6-96231b3b80d8
In the simple and triangle if-conversion cases, when CopyAndPredicateBlock is
used because the to-be-predicated block has other predecessors, we need to
explicitly remove the old copied block from the successors list. Normally if
conversion relies on TII->AnalyzeBranch combined with BB->CorrectExtraCFGEdges
to cleanup the successors list, but if the predicated block contained an
un-analyzable branch (such as a now-predicated return), then this will fail.
These extra successors were causing a problem on PPC because it was causing
later passes (such as PPCEarlyReturm) to leave dead return-only basic blocks in
the code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179227 91177308-0d34-0410-b5e6-96231b3b80d8
The target hooks are getting out of hand. What does it mean to run
before or after regalloc anyway? Allowing either Pass* or AnalysisID
pass identification should make it much easier for targets to use the
substitutePass and insertPass APIs, and create less need for badly
named target hooks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179140 91177308-0d34-0410-b5e6-96231b3b80d8
therefore not at all) of the pc or statement list. We also don't
need to emit the compilation dir so save so space and time
and don't bother.
Fix up the testcase accordingly and verify that we don't emit
the attributes or the items that they use.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179114 91177308-0d34-0410-b5e6-96231b3b80d8
This pattern occurs in SROA output due to the way vector arguments are lowered
on ARM.
The testcase from PR15525 now compiles into this, which is better than the code
we got with the old scalarrepl:
_Store:
ldr.w r9, [sp]
vmov d17, r3, r9
vmov d16, r1, r2
vst1.8 {d16, d17}, [r0]
bx lr
Differential Revision: http://llvm-reviews.chandlerc.com/D647
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179106 91177308-0d34-0410-b5e6-96231b3b80d8
a relocation across sections. Do this for DW_AT_stmt list in the
skeleton CU and check the relocations in the debug_info section.
Add a FIXME for multiple CUs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178969 91177308-0d34-0410-b5e6-96231b3b80d8
We used to do "SmallString += CUID", which is incorrect, since CUID will
be truncated to a char.
rdar://problem/13573833
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178941 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes PEI as previously described, but correctly handles the case where
the instruction defining the virtual register to be scavenged is the first in
the block. Arnold provided me with a bugpoint-reduced test case, but even that
seems too large to use as a regression test. If I'm successful in cleaning it
up then I'll commit that as well.
Original commit message:
This change fixes a bug that I introduced in r178058. After a register is
scavenged using one of the available spills slots the instruction defining the
virtual register needs to be moved to after the spill code. The scavenger has
already processed the defining instruction so that registers killed by that
instruction are available for definition in that same instruction. Unfortunately,
after this, the scavenger needs to iterate through the spill code and then
visit, again, the instruction that defines the now-scavenged register. In order
to avoid confusion, the register scavenger needs the ability to 'back up'
through the spill code so that it can again process the instructions in the
appropriate order. Prior to this fix, once the scavenger reached the
just-moved instruction, it would assert if it killed any registers because,
having already processed the instruction, it believed they were undefined.
Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar
for diagnosing the problem and testing this fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178919 91177308-0d34-0410-b5e6-96231b3b80d8
During LTO, the target options on functions within the same Module may
change. This would necessitate resetting some of the back-end. Do this for X86,
because it's a Friday afternoon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178917 91177308-0d34-0410-b5e6-96231b3b80d8
Reverting because this breaks one of the LTO builders. Original commit message:
This change fixes a bug that I introduced in r178058. After a register is
scavenged using one of the available spills slots the instruction defining the
virtual register needs to be moved to after the spill code. The scavenger has
already processed the defining instruction so that registers killed by that
instruction are available for definition in that same instruction. Unfortunately,
after this, the scavenger needs to iterate through the spill code and then
visit, again, the instruction that defines the now-scavenged register. In order
to avoid confusion, the register scavenger needs the ability to 'back up'
through the spill code so that it can again process the instructions in the
appropriate order. Prior to this fix, once the scavenger reached the
just-moved instruction, it would assert if it killed any registers because,
having already processed the instruction, it believed they were undefined.
Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar
for diagnosing the problem and testing this fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178916 91177308-0d34-0410-b5e6-96231b3b80d8
This change fixes a bug that I introduced in r178058. After a register is
scavenged using one of the available spills slots the instruction defining the
virtual register needs to be moved to after the spill code. The scavenger has
already processed the defining instruction so that registers killed by that
instruction are available for definition in that same instruction. Unfortunately,
after this, the scavenger needs to iterate through the spill code and then
visit, again, the instruction that defines the now-scavenged register. In order
to avoid confusion, the register scavenger needs the ability to 'back up'
through the spill code so that it can again process the instructions in the
appropriate order. Prior to this fix, once the scavenger reached the
just-moved instruction, it would assert if it killed any registers because,
having already processed the instruction, it believed they were undefined.
Unfortunately, I don't yet have a small test case. Thanks to Pranav Bhandarkar
for diagnosing the problem and testing this fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178845 91177308-0d34-0410-b5e6-96231b3b80d8
For now, just save the compile time since the ConvergingScheduler
heuristics don't use this analysis. We'll probably enable it later
after compile-time investigation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178822 91177308-0d34-0410-b5e6-96231b3b80d8
On certain architectures we can support efficient vectorized version of
instructions if the operand value is uniform (splat) or a constant scalar.
An example of this is a vector shift on x86.
We can efficiently support
for (i = 0 ; i < ; i += 4)
w[0:3] = v[0:3] << <2, 2, 2, 2>
but not
for (i = 0; i < ; i += 4)
w[0:3] = v[0:3] << x[0:3]
This patch adds a parameter to getArithmeticInstrCost to further qualify operand
values as uniform or uniform constant.
Targets can then choose to return a different cost for instructions with such
operand values.
A follow-up commit will test this feature on x86.
radar://13576547
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178807 91177308-0d34-0410-b5e6-96231b3b80d8
There is a difference for FORM_ref_addr between DWARF 2 and DWARF 3+.
Since Eric is against guarding DWARF 2 ref_addr with DarwinGDBCompat, we are
still in discussion on how to handle this.
The correct solution is to update our header to say version 4 instead of version
2 and update tool chains as well.
rdar://problem/13559431
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178806 91177308-0d34-0410-b5e6-96231b3b80d8
the target system.
It was hard-coded to 4 bytes before. I can't get llvm to generate a
ref_addr on a reasonably sized testing case.
rdar://problem/13559431
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178722 91177308-0d34-0410-b5e6-96231b3b80d8
For this we need to use a libcall. Previously LLVM didn't implement
libcall support for frem, so I've added it in the usual
straightforward manner. A test case from the bug report is included.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178639 91177308-0d34-0410-b5e6-96231b3b80d8
The new instruction scheduling models provide information about the
number of cycles consumed on each processor resource. This makes it
possible to estimate ILP more accurately than simply counting
instructions / issue width.
The functions getResourceDepth() and getResourceLength() now identify
the limiting processor resource, and return a cycle count based on that.
This gives more precise resource information, particularly in traces
that use one resource a lot more than others.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178553 91177308-0d34-0410-b5e6-96231b3b80d8
This is helps on architectures where i8,i16 are not legal but we have byte, and
short loads/stores. Allowing us to merge copies like the one below on ARM.
copy(char *a, char *b, int n) {
do {
int t0 = a[0];
int t1 = a[1];
b[0] = t0;
b[1] = t1;
radar://13536387
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178546 91177308-0d34-0410-b5e6-96231b3b80d8
We would also like to merge sequences that involve a variable index like in the
example below.
int index = *idx++
int i0 = c[index+0];
int i1 = c[index+1];
b[0] = i0;
b[1] = i1;
By extending the parsing of the base pointer to handle dags that contain a
base, index, and offset we can handle examples like the one above.
The dag for the code above will look something like:
(load (i64 add (i64 copyfromreg %c)
(i64 signextend (i8 load %index))))
(load (i64 add (i64 copyfromreg %c)
(i64 signextend (i32 add (i32 signextend (i8 load %index))
(i32 1)))))
The code that parses the tree ignores the intermediate sign extensions. However,
if there is a sign extension it needs to be on all indexes.
(load (i64 add (i64 copyfromreg %c)
(i64 signextend (add (i8 load %index)
(i8 1))))
vs
(load (i64 add (i64 copyfromreg %c)
(i64 signextend (i32 add (i32 signextend (i8 load %index))
(i32 1)))))
radar://13536387
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178483 91177308-0d34-0410-b5e6-96231b3b80d8
die values. A lot of DIEs have 10 attributes in C++ code (example
clang), none had more than 12. Seems like a good default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178366 91177308-0d34-0410-b5e6-96231b3b80d8
immediate in a register. I don't believe this should ever fail, but I see no
harm in trying to make this code bullet proof.
I've added an assert to ensure my assumtion is correct. If the assertion fires
something is wrong and we should fix it, rather then just silently fall back to
SelectionDAG isel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178305 91177308-0d34-0410-b5e6-96231b3b80d8
This is a follow-up to r178073 (which should actually make target-customized
spilling work again).
I still don't have a regression test for this (but it would be good to have
one; Thumb 1 and Mips16 use this callback as well).
Patch by Richard Sandiford.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178137 91177308-0d34-0410-b5e6-96231b3b80d8
As pointed out by Richard Sandiford, my recent updates to the register
scavenger broke targets that use custom spilling (because the new code assumed
that if there were no valid spill slots, than spilling would be impossible).
I don't have a test case, but it should be possible to create one for Thumb 1,
Mips 16, etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178073 91177308-0d34-0410-b5e6-96231b3b80d8
The previous algorithm could not deal properly with scavenging multiple virtual
registers because it kept only one live virtual -> physical mapping (and
iterated through operands in order). Now we don't maintain a current mapping,
but rather use replaceRegWith to completely remove the virtual register as
soon as the mapping is established.
In order to allow the register scavenger to return a physical register killed
by an instruction for definition by that same instruction, we now call
RS->forward(I) prior to eliminating virtual registers defined in I. This
requires a minor update to forward to ignore virtual registers.
These new features will be tested in forthcoming commits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178058 91177308-0d34-0410-b5e6-96231b3b80d8
- Handle the case where the result of 'insert_subvect' is bitcasted
before 'extract_subvec'. This removes the redundant insertf128/extractf128
pair on unaligned 256-bit vector load/store on vectors of non 64-bit integer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177945 91177308-0d34-0410-b5e6-96231b3b80d8
For instance, following transformation will be disabled:
x + x + x => 3.0f * x;
The problem of these transformations is that it introduces a FP constant, which
following Instruction-Selection pass cannot handle.
Reviewed by Nadav, thanks a lot!
rdar://13445387
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177933 91177308-0d34-0410-b5e6-96231b3b80d8
Performing this check unilaterally prevented us from generating FMAs when the incoming IR contained illegal vector types which would eventually be legalized to underlying types that *did* support FMA.
For example, an @llvm.fmuladd on an OpenCL float16 should become a sequence of float4 FMAs, not float4 fmul+fadd's.
NOTE: Because we still call the target-specific profitability hook, individual targets can reinstate the old behavior, if desired, by simply performing the legality check inside their callback hook. They can also perform more sophisticated legality checks, if, for example, some illegal vector types can be productively implemented as FMAs, but not others.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177820 91177308-0d34-0410-b5e6-96231b3b80d8
177774 broke the lld-x86_64-darwin11 builder; error:
error: comparison of integers of different signs: 'int' and 'size_type' (aka 'unsigned long')
for (SI = 0; SI < Scavenged.size(); ++SI)
~~ ^ ~~~~~~~~~~~~~~~~
Fix this by making SI also unsigned.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177780 91177308-0d34-0410-b5e6-96231b3b80d8
This patch lets the register scavenger make use of multiple spill slots in
order to guarantee that it will be able to provide multiple registers
simultaneously.
To support this, the RS's API has changed slightly: setScavengingFrameIndex /
getScavengingFrameIndex have been replaced by addScavengingFrameIndex /
isScavengingFrameIndex / getScavengingFrameIndices.
In forthcoming commits, the PowerPC backend will use this capability in order
to implement the spilling of condition registers, and some special-purpose
registers, without relying on r0 being reserved. In some cases, spilling these
registers requires two GPRs: one for addressing and one to hold the value being
transferred.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177774 91177308-0d34-0410-b5e6-96231b3b80d8
ScavengedRC was a dead private variable (set, but not otherwise used). No
functionality change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177708 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit 06091513c2.
The code is obviously wrong, but the trivial fix causes
inefficient code generation on X86. Somebody with more
knowledge of the code needs to take a look here.
Signed-off-by: Christian König <christian.koenig@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177529 91177308-0d34-0410-b5e6-96231b3b80d8
TargetOpcodes need to be treaded as Machine- and not ISD-Opcodes.
Signed-off-by: Christian König <christian.koenig@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177518 91177308-0d34-0410-b5e6-96231b3b80d8
A node's ordering is only propagated during legalization if (a) the new node does
not have an ordering (is not a CSE'd node), or (b) the new node has an ordering
that is higher than the node being legalized.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177465 91177308-0d34-0410-b5e6-96231b3b80d8
The always-true "(int)Int == (signed)Int" comparison was found
while experimenting with a potential new Clang warning.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177290 91177308-0d34-0410-b5e6-96231b3b80d8
The linker sorts the .tls$<xyz> sections by name, and we need
to make sure any extra sections we produce (e.g. for weak globals)
always end up between .tls$AAA and .tls$ZZZ, even if the name
starts with e.g. an underscore.
Patch by David Nadlinger!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177256 91177308-0d34-0410-b5e6-96231b3b80d8
Implicit defs are not currently positional and not modeled by the
per-operand machine model. Unfortunately, we treat defs that are part
of the architectural instruction description, like flags, the same as
other implicit defs. Really, they should have a fixed MachineInstr
layout and probably shouldn't be "implicit" at all.
For now, we'll change the default latency to be the max operand
latency. That will give flag setting operands full latency for x86
folded loads. Other kinds of "fake" implicit defs don't occur prior to
regalloc anyway, and we would like them to go away postRegAlloc as
well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177227 91177308-0d34-0410-b5e6-96231b3b80d8
This is a generic function (derived from PEI); moving it into
MachineFrameInfo eliminates a current redundancy between the ARM and AArch64
backends, and will allow it to be used by the PowerPC target code.
No functionality change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177111 91177308-0d34-0410-b5e6-96231b3b80d8
Add the current PEI register scavenger as a parameter to the
processFunctionBeforeFrameFinalized callback.
This change is necessary in order to allow the PowerPC target code to
set the register scavenger frame index after the save-area offset
adjustments performed by processFunctionBeforeFrameFinalized. Only
after these adjustments have been made is it possible to estimate
the size of the stack frame.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177108 91177308-0d34-0410-b5e6-96231b3b80d8
This doesn't reset all of the target options within the TargetOptions
object. This is because some of those are ABI-specific and must be determined if
it's okay to change those on the fly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176986 91177308-0d34-0410-b5e6-96231b3b80d8
belongs to a different compile unit.
DW_FORM_ref_addr should be used for cross compile-unit reference.
When compiling a large application, we got a dwarfdump verification error where
abstract_origin points to nowhere.
This error can't be reproduced on any testing case in MultiSource.
We may have other cases where we use DW_FORM_ref4 unconditionally.
rdar://problem/13370501
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176882 91177308-0d34-0410-b5e6-96231b3b80d8
Versioned debug info support has been a burden to maintain & also compromised
current debug info verification by causing test cases testing old debug info to
remain rather than being updated to the latest. It also makes it hard to add or
change the metadata schema by requiring various backwards-compatibility in the
DI* hierarchy.
So it's being removed in preparation for new changes to the schema to tidy up
old/unnecessary fields and add new fields needed for new debug info (well, new
to LLVM at least).
The more surprising part of this is the changes to DI*::Verify - this became
necessary due to the changes to AsmWriter. AsmWriter was relying on the version
test to decide which bits of metadata were actually debug info when printing
the comment annotations. Without the version information the tag numbers were
too common & it would print debug info on random metadata that happened to
start with an integer that matched a tag number. Instead this change makes the
Verify functions more precise (just adding "number of operands" checks - not
type checking those operands yet) & relies on that to decide which metadata is
debug info metadata.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176838 91177308-0d34-0410-b5e6-96231b3b80d8
PHIs are allowed to have multiple operand pairs per predecessor, and
this code works just fine when it happens.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176734 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Statistics are still available in Release+Asserts (any +Asserts builds),
and stats can also be turned on with LLVM_ENABLE_STATS.
Move some of the FastISel stats that were moved under DEBUG()
back out of DEBUG(), since stats are disabled across the board now.
Many tests depend on grepping "-stats" output. Move those into
a orig_dir/Stats/. so that they can be marked as unsupported
when building without statistics.
Differential Revision: http://llvm-reviews.chandlerc.com/D486
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176733 91177308-0d34-0410-b5e6-96231b3b80d8
To find the last use of a register unit, start from the bottom and scan
upwards until a user is found.
<rdar://problem/13353090>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176706 91177308-0d34-0410-b5e6-96231b3b80d8
LegalizeDAG.cpp uses the value of the comparison operands when checking
the legality of BR_CC, so DAGCombiner should do the same.
v2:
- Expand more BR_CC value types for NVPTX
v3:
- Expand correct BR_CC value types for Hexagon, Mips, and XCore.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176694 91177308-0d34-0410-b5e6-96231b3b80d8
This verifies live intervals both before and after scheduling. It's
useful for anyone hacking on live interval update.
Note that we don't yet pass verification all the time. We don't yet
handle updating nonallocatable live intervals perfectly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176685 91177308-0d34-0410-b5e6-96231b3b80d8
Code generation makes some basic assumptions about the IR it's been given. In
particular, if there is only one 'invoke' in the function, then that invoke
won't be going away. However, with the advent of the `llvm.donothing' intrinsic,
those invokes may go away. If all of them go away, the landing pad no longer has
any users. This confuses the back-end, which asserts.
This happens with SjLj exceptions, because that's the model that modifies the IR
based on there being invokes, etc. in the function.
Remove any invokes of `llvm.donothing' during SjLj EH preparation. This will
give us a CFG that the back-end won't be confused about. If all of the invokes
in a function are removed, then the SjLj EH prepare pass won't insert the bogus
code the relies upon the invokes being there.
<rdar://problem/13228754&13316637>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176677 91177308-0d34-0410-b5e6-96231b3b80d8
In very rare cases caused by irreducible control flow, the dominating
block can have the same trace head without actually being part of the
trace.
As long as such a dominator still has valid instruction depths, it is OK
to use it for computing instruction depths.
Rename the function to avoid lying, and add a check that instruction
depths are computed for the dominator.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176668 91177308-0d34-0410-b5e6-96231b3b80d8
rdar:13370002 [pre-RA-sched] assertion: released too many times
I tracked this down to an earlier hack that is no longer applicable
and interfered with normal scheduler logic. With the changes in
r176037, it was causing an instruction to be scheduled multiple times.
I have an external test case that I tried hard to reduce and
failed. I can't even reproduce with llc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176636 91177308-0d34-0410-b5e6-96231b3b80d8
We now emit a line table for each compile unit. To reduce the prologue size
of each line table, the files and directories used by each compile unit are
stored in std::map<unsigned, std::vector< > > instead of std::vector< >.
The prologue for a lto'ed image can be as big as 93K. Duplicating 93K for each
compile unit causes a huge increase of debug info. With this patch, each
prologue will only emit the files required by the compile unit.
rdar://problem/13342023
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176605 91177308-0d34-0410-b5e6-96231b3b80d8
- ISD::SHL/SRL/SRA must have either both scalar or both vector operands
but TLI.getShiftAmountTy() so far only return scalar type. As a
result, backend logic assuming that breaks.
- Rename the original TLI.getShiftAmountTy() to
TLI.getScalarShiftAmountTy() and re-define TLI.getShiftAmountTy() to
return target-specificed scalar type or the same vector type as the
1st operand.
- Fix most TICG logic assuming TLI.getShiftAmountTy() a simple scalar
type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176364 91177308-0d34-0410-b5e6-96231b3b80d8
We avoided computing DAG height/depth during Node printing because it
shouldn't depend on an otherwise valid DAG. But this has become far
too annoying for the common case of a valid DAG where we want to see
valid values. If doing the computation on-the-fly turns out to be a
problem in practice, then I'll add a mode to the diagnostics to only
force it when we're likely to have a valid DAG, otherwise explicitly
print INVALID instead of bogus numbers. For now, just go for it all
the time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176314 91177308-0d34-0410-b5e6-96231b3b80d8
SelectionDAGIsel::LowerArguments needs a function, not a basic block. So it
makes sense to pass it the function instead of extracting a basic-block from
the function and then tossing it. This is also more self-documenting (functions
have arguments, BBs don't).
In addition, added comments to a couple of Select* methods.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176305 91177308-0d34-0410-b5e6-96231b3b80d8
We make the cost for calling libm functions extremely high as emitting the
calls is expensive and causes spills (on x86) so performance suffers. We still
vectorize important calls like ceilf and friends on SSE4.1. and fabs.
Differential Revision: http://llvm-reviews.chandlerc.com/D466
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176287 91177308-0d34-0410-b5e6-96231b3b80d8
definition DIE (TAG_variable), and put AT_MIPS_linkage_name to TAG_member when
DarwinGDBCompat is true.
Darwin GDB needs AT_MIPS_linkage_name at both places to work.
Follow-up patch to r176143.
rdar://problem/13291234
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176220 91177308-0d34-0410-b5e6-96231b3b80d8
This patch disables the counters on non-debug builds. This reduces the runtime of SelectionDAGISel::SelectCodeCommon by ~5%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176214 91177308-0d34-0410-b5e6-96231b3b80d8
definition DIE, to make old GDB happy.
We have a regression for old GDB when Clang uses DW_TAG_member to declare
static members inside a class, instead of DW_TAG_variable. This patch will fix
this regression.
rdar://problem/13291234
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176143 91177308-0d34-0410-b5e6-96231b3b80d8
TAG_member inside a class to the specification DIE.
Having AT_MIPS_linkage_name on TAG_member caused old gdb (GNU 6.3.50) to
error out. Also gcc 4.7 has AT_MIPS_linkage_name on the specification DIE.
rdar://problem/13291234
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176120 91177308-0d34-0410-b5e6-96231b3b80d8
fewer scalar integer (i32 or i64) arguments. It completely eliminates the need
for SDISel for trivial functions.
Also, add the new llc -fast-isel-abort-args option, which is similar to
-fast-isel-abort option, but for formal argument lowering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176052 91177308-0d34-0410-b5e6-96231b3b80d8
memory intrinsics in the SDAG builder.
When alignment is zero, the lang ref says that *no* alignment
assumptions can be made. This is the exact opposite of the internal API
contracts of the DAG where alignment 0 indicates that the alignment can
be made to be anything desired.
There is another, more explicit alignment that is better suited for the
role of "no alignment at all": an alignment of 1. Map the intrinsic
alignment to this early so that we don't end up generating aligned DAGs.
It is really terrifying that we've never seen this before, but we
suddenly started generating a large number of alignment 0 memcpys due to
the new code to do memcpy-based copying of POD class members. That patch
contains a bug that rounds bitfield alignments down when they are the
first field. This can in turn produce zero alignments.
This fixes weird crashes I've seen in library users of LLVM on 32-bit
hosts, etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176022 91177308-0d34-0410-b5e6-96231b3b80d8
itself recursively with a new instruction that has not been finalized, in order
to determine whether to keep the instruction. On 'make check' and test-suite the
only cases where the recursive invocation made any transformations were simple
instruction commutations, so I am restricting the recursive invocation to do
only this.
The other cases wouldn't work correctly when updating LiveIntervals, since the
new instructions don't have slot indices and LiveIntervals hasn't yet been
updated. If the other transformations were actually triggering in any test case
it would be possible to support it with a lot of effort, but since they don't
it's not worth it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175979 91177308-0d34-0410-b5e6-96231b3b80d8
unless it was requested to with an optional parameter that defaults to false, so
we don't need to handle that case in TwoAddressInstructionPass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175974 91177308-0d34-0410-b5e6-96231b3b80d8
TwoAddressInstructionPass. The code in rescheduleMIBelowKill() is a bit tricky,
since multiple instructions need to be moved down, one-at-a-time, in reverse
order.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175955 91177308-0d34-0410-b5e6-96231b3b80d8
One of the phases of SelectionDAG is LegalizeVectors. We don't need to sort the DAG and copy nodes around if there are no vector ops.
Speeds up the compilation time of SelectionDAG on a big scalar workload by ~8%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175929 91177308-0d34-0410-b5e6-96231b3b80d8
It was incorrectly checking a Function* being an IntrinsicInst* which
isn't possible. It should always have been checking the CallInst* instead.
Added test case for x86 which ensures we only get one constant load.
It was 2 before this change.
rdar://problem/13267920
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175853 91177308-0d34-0410-b5e6-96231b3b80d8
pass. One of the callers of isKilled() can cope with overapproximation of kills
and the other can't, so I added a flag to indicate this.
In theory this could pessimize code slightly, but in practice most physical
register uses are kills, and most important kills of physical registers are the
only uses of that register prior to register allocation, so we can recognize
them as kills even without kill flags.
This is relevant because LiveIntervals gets rid of all kill flags.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175821 91177308-0d34-0410-b5e6-96231b3b80d8
to TargetFrameLowering, where it belongs. Incidentally, this allows us
to delete some duplicated (and slightly different!) code in TRI.
There are potentially other layering problems that can be cleaned up
as a result, or in a similar manner.
The refactoring was OK'd by Anton Korobeynikov on llvmdev.
Note: this touches the target interfaces, so out-of-tree targets may
be affected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175788 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes some problems with too conservative checking where we were
marking all aliases of a register as used, and then also checking all
aliases when allocating a register.
<rdar://problem/13249625>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175782 91177308-0d34-0410-b5e6-96231b3b80d8
A legal BUILD_VECTOR goes in and gets constant folded into another legal
BUILD_VECTOR so we don't lose any legality here. The problematic PPC
optimization that made this check necessary was fixed recently.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175759 91177308-0d34-0410-b5e6-96231b3b80d8
This brings the number of remaining failures in 'make check' without
LiveVariables down to 39, with 1 unexpectedly passing test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175727 91177308-0d34-0410-b5e6-96231b3b80d8
available.
With this commit there are no longer any assertion or verifier failures when
running 'make check' without LiveVariables. There are still 56 failing tests
with codegen differences and 1 unexpectedly passing test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175719 91177308-0d34-0410-b5e6-96231b3b80d8
Rewrite value numbers directly in the 'Other' LiveInterval which is
moribund anyway. This avoids allocating the OtherAssignments vector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175690 91177308-0d34-0410-b5e6-96231b3b80d8
When findReachingDefs() finds that only one value can reach the basic
block, just copy the work list of visited blocks directly into the live
interval.
Sort the block list and use a LiveRangeUpdater to make the bulk add
fast.
When multiple reaching defs are found, transfer the work list to the
updateSSA() work list as before. Also use LiveRangeUpdater in
updateLiveIns() following updateSSA().
This makes live interval analysis more than 3x faster on one huge test
case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175685 91177308-0d34-0410-b5e6-96231b3b80d8
related failures when running 'make check' without LiveVariables with the
verifier enabled. Some of the remaining failures elsewhere may still be fallout
from incorrect updating of LiveIntervals or the few missing cases left in the
two-address pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175672 91177308-0d34-0410-b5e6-96231b3b80d8
(2xi32) (truncate ((2xi64) bitcast (buildvector i32 a, i32 x, i32 b, i32 y)))
can be folded into a (2xi32) (buildvector i32 a, i32 b).
Such a DAG would cause uneccessary vdup instructions followed by vmovn
instructions.
We generate this code on ARM NEON for a setcc olt, 2xf64, 2xf64. For example, in
the vectorized version of the code below.
double A[N];
double B[N];
void test_double_compare_to_double() {
int i;
for(i=0;i<N;i++)
A[i] = (double)(A[i] < B[i]);
}
radar://13191881
Fixes bug 15283.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175670 91177308-0d34-0410-b5e6-96231b3b80d8
Adding new segments to large LiveIntervals can be expensive because the
LiveRange objects after the insertion point may need to be moved left or
right. This can cause quadratic behavior when adding a large number of
segments to a live range.
The LiveRangeUpdater class allows the LIveInterval to be in a temporary
invalid state while segments are being added. It maintains an internal
gap in the LiveInterval when it is shrinking, and it has a spill area
for new segments when the LiveInterval is growing.
The behavior is similar to the existing mergeIntervalRanges() function,
except it allocates less memory for the spill area, and the algorithm is
turned inside out so the loop is driven by the clients.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175644 91177308-0d34-0410-b5e6-96231b3b80d8
- When extloading from a vector with non-byte-addressable element, e.g.
<4 x i1>, the current logic breaks. Extend the current logic to
fix the case where the element type is not byte-addressable by loading
all bytes, bit-extracting/packing each element.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175642 91177308-0d34-0410-b5e6-96231b3b80d8
and removing instructions. The implementation seems more complicated than it
needs to be, but I couldn't find something simpler that dealt with all of the
corner cases.
Also add a call to repairIndexesInRange() from repairIntervalsInRange().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175601 91177308-0d34-0410-b5e6-96231b3b80d8
Target implementations of getRegAllocationHints() should use the
provided allocation order, and they can never return hints outside the
order. This is already documented in TargetRegisterInfo.h.
<rdar://problem/13240556>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175540 91177308-0d34-0410-b5e6-96231b3b80d8
Due to the execution order of doFinalization functions, the GC information were
deleted before AsmPrinter::doFinalization was executed. Thus, the
GCMetadataPrinter::finishAssembly was never called.
The patch fixes that by moving the code of the GCInfoDeleter::doFinalization to
Printer::doFinalization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175528 91177308-0d34-0410-b5e6-96231b3b80d8
arguably better than forward iterators for this use case, they are confusing and
there are some implementation problems with reverse iterators and MI bundles.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175393 91177308-0d34-0410-b5e6-96231b3b80d8
MachineBasicBlock::SplitCriticalEdge. Since this is an iterator rather than
an instr_iterator, the isBundled() check only passes if getFirstTerminator()
returned end() and the garbage memory happens to lean that way.
Multiple successors can be present without any terminator instructions in the
case of exception handling with a fallthrough.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175383 91177308-0d34-0410-b5e6-96231b3b80d8
terminators that actually have register uses when splitting critical edges.
This commit also introduces a method repairIntervalsInRange() on LiveIntervals,
which allows for repairing LiveIntervals in a small range after an arbitrary
target hook modifies, inserts, and removes instructions. It's pretty limited
right now, but I hope to extend it to support all of the things that are done
by the convertToThreeAddress() target hooks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175382 91177308-0d34-0410-b5e6-96231b3b80d8
If the frame pointer is omitted, and any stack changes occur in the inline
assembly, e.g.: "pusha", then any C local variable or C argument references
will be incorrect.
I pass no judgement on anyone who would do such a thing. ;)
rdar://13218191
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175334 91177308-0d34-0410-b5e6-96231b3b80d8
If two functions require different features (e.g., `-mno-sse' vs. `-msse') then
we want to honor that, especially during LTO. We can do that by resetting the
subtarget's features depending upon the 'target-feature' attribute.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175314 91177308-0d34-0410-b5e6-96231b3b80d8
- add sincos to runtime library if target triple environment is GNU
- added canCombineSinCosLibcall() which checks that sincos is in the RTL and
if the environment is GNU then unsafe fpmath is enabled (required to
preserve errno)
- extended sincos-opt lit test
Reviewed by: Hal Finkel
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175283 91177308-0d34-0410-b5e6-96231b3b80d8