This happens e.g. with <2 x i64> -1 on x86_32. It cannot be generated directly
because i64 is illegal. It would be nice if getNOT would handle this
transparently, but I don't see a way to generate a legal constant there right
now. Fixes PR17487.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192795 91177308-0d34-0410-b5e6-96231b3b80d8
This is really an extension of the current (shl (shr ...)) -> shl optimization.
The main difference is that certain upper bits must also not be demanded.
The motivating examples are the first two in the testcase, which occur
in llvmpipe output.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192783 91177308-0d34-0410-b5e6-96231b3b80d8
1) Make sure we emit static member variables by checking
at the end of createGlobalVariableDIE rather than piecemeal
in the function.
(As a note, createGlobalVariableDIE needs rewriting.)
2) Make sure we use the definition rather than declaration DIE
for two things: a) determining linkage for gnu pubnames, and b)
as the address of the DIE for global variables.
(As a note, createGlobalVariableDIE really needs rewriting.)
Adjust the testcase to make sure we're checking the correct DIEs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192761 91177308-0d34-0410-b5e6-96231b3b80d8
twice and just look up the value. Fix the one case where
we were trying to create a subprogram DIE and we should already
have had one. Reflow formatting in collectDeadVariables while fixing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192749 91177308-0d34-0410-b5e6-96231b3b80d8
rdar:15221834 False AVX register dependencies cause 5x slowdown on
flops-5/6 and significant slowdown on several others.
This was blocking the switch to MI-Sched.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192669 91177308-0d34-0410-b5e6-96231b3b80d8
This pass is needed to break false dependencies. Without it, unlucky
register assignment can result in wild (5x) swings in
performance. This pass was trying to handle AVX but not getting it
right. AVX doesn't have partial register defs, it has unused register
reads in which the high bits of a source operand are copied into the
unused bits of the dest.
Fixing this requires conservative liveness analysis. This is awkard
because the pass already has its own pseudo-liveness. However, proper
liveness is expensive, and we would like to use a generic utility to
compute it. The fix only invokes liveness on-demand. It is rare to
detect a case that needs undef-read dependence breaking, but when it
happens, it can be needed many times within a very large block.
I think the existing heuristic which uses a register window of 16 is
too conservative for loop-carried false dependencies. If the loop is a
reduction. The out-of-order engine may be able to execute several loop
iterations in parallel. However, I'll leave this tuning exercise for
next time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192635 91177308-0d34-0410-b5e6-96231b3b80d8
Clobbering is exclusive not inclusive on register units.
For liveness, we need to consider all the preserved registers.
e.g. A regmask that clobbers YMM0 may preserve XMM0.
Units are only clobbered when all super-registers are clobbered.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192623 91177308-0d34-0410-b5e6-96231b3b80d8
Some clients may add block live ins and may track liveness over a
large scope. This guarantees an efficient implementation in all cases
with no memory allocation/deallocation, independent of the number of
target registers. It could be slightly less convenient but is fine in
the expected case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192622 91177308-0d34-0410-b5e6-96231b3b80d8
Clean up creation of static member DIEs. We can create static member DIEs from
two places, so we call getOrCreateStaticMemberDIE from the two places.
getOrCreateStaticMemberDIE will get or create the context DIE first, then it
will check if the DIE already exists, if not, we create the static member DIE
and add it to the context.
Creation of static member DIEs are handled in a similar way as subprogram DIEs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192618 91177308-0d34-0410-b5e6-96231b3b80d8
Per original comment, the intention of this loop
is to go ahead and break the critical edge
(in order to sink this instruction) if there's
reason to believe doing so might "unblock" the
sinking of additional instructions that define
registers used by this one. The idea is that if
we have a few instructions to sink "together"
breaking the edge might be worthwhile.
This commit makes a few small changes
to help better realize this goal:
First, modify the loop to ignore registers
defined by this instruction. We don't
sink definitions of physical registers,
and sinking an SSA definition isn't
going to unblock an upstream instruction.
Second, ignore uses of physical registers.
Instructions that define physical registers are
rejected for sinking, and so moving this one
won't enable moving any defining instructions.
As an added bonus, while virtual register
use-def chains are generally small due
to SSA goodness, iteration over the uses
and definitions (used by hasOneNonDBGUse)
for physical registers like EFLAGS
can be rather expensive in practice.
(This is the original reason for looking at this)
Finally, to keep things simple continue
to only consider this trick for registers that
have a single use (via hasOneNonDBGUse),
but to avoid spuriously breaking critical edges
only do so if the definition resides
in the same MBB and therefore this one directly
blocks it from being sunk as well.
If sinking them together is meant to be,
let the iterative nature of this pass
sink the definition into this block first.
Update tests to accomodate this change,
add new testcase where sinking avoids pipeline stalls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192608 91177308-0d34-0410-b5e6-96231b3b80d8
The alignment of allocated space was wrong, see Bugzila 17345.
Done by Zvi Rackover <zvi.rackover@intel.com>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192573 91177308-0d34-0410-b5e6-96231b3b80d8
The form must be a reference form in addDIEEntry. Which reference form to
use will be decided by the callee.
No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192517 91177308-0d34-0410-b5e6-96231b3b80d8
When if converting something like:
true:
... = R0<kill>
false:
... = R0<kill>
then the instructions of the true block must not have a <kill> flag
anymore, as the instruction of the false block follow and do still read
the R0 value.
Specifically this patch determines the set of register live-in in the
false block (possibly after simulating the liveness changes of the
duplicated instructions). Each of these live-in registers mustn't be
killed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192482 91177308-0d34-0410-b5e6-96231b3b80d8
This should fix the buildbots.
Original commit message:
[DAGCombiner] Slice a big load in two loads when the element are next to each
other in memory and the target has paired load and performs post-isel loads
combining.
E.g., this optimization will transform something like this:
a = load i64* addr
b = trunc i64 a to i32
c = lshr i64 a, 32
d = trunc i64 c to i32
into:
b = load i32* addr1
d = load i32* addr2
Where addr1 = addr2 +/- sizeof(i32), if the target supports paired load and
performs post-isel loads combining.
One should overload TargetLowering::hasPairedLoad to provide this information.
The default is false.
<rdar://problem/14477220>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192476 91177308-0d34-0410-b5e6-96231b3b80d8
other in memory and the target has paired load and performs post-isel loads
combining.
E.g., this optimization will transform something like this:
a = load i64* addr
b = trunc i64 a to i32
c = lshr i64 a, 32
d = trunc i64 c to i32
into:
b = load i32* addr1
d = load i32* addr2
Where addr1 = addr2 +/- sizeof(i32), if the target supports paired load and
performs post-isel loads combining.
One should overload TargetLowering::hasPairedLoad to provide this information.
The default is false.
<rdar://problem/14477220>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192471 91177308-0d34-0410-b5e6-96231b3b80d8
For NVPTX, this fixes a crash where the emitImplicitDef implementation was expecting physical registers,
while NVPTX uses virtual registers (with a couple of exceptions). Now, the implicit def comment will be
emitted as a true PTX register name. Other targets can use this to customize the output of implicit def
comments.
Fixes PR17519
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192444 91177308-0d34-0410-b5e6-96231b3b80d8
Previously LiveInterval has been used, but having a spill weight and
register number is unnecessary for a register unit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192397 91177308-0d34-0410-b5e6-96231b3b80d8
This makes the API a bit more natural to use and makes it easier to make
LiveRanges implementation details private.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192394 91177308-0d34-0410-b5e6-96231b3b80d8
LiveRange just manages a list of segments and a list of value numbers
now as LiveInterval did previously, but without having details like spill
weight or a fixed register number.
LiveInterval is now a subclass of LiveRange and simply adds the spill weight
and the register number.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192393 91177308-0d34-0410-b5e6-96231b3b80d8
The Segment struct contains a single interval; multiple instances of this struct
are used to construct a live range, but the struct is not a live range by
itself.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192392 91177308-0d34-0410-b5e6-96231b3b80d8
template_value are updated to use DIRef.
A paired commit at clang is required due to changes to DIBuilder.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192320 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes repeated -Wmicrosoft warnings when self-hosting clang on
Windows, and gets us real unsigned enum types with MSVC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192227 91177308-0d34-0410-b5e6-96231b3b80d8
This patch fixes an old FIXME by creating a MCTargetStreamer interface
and moving the target specific functions for ARM, Mips and PPC to it.
The ARM streamer is still declared in a common place because it is
used from lib/CodeGen/ARMException.cpp, but the Mips and PPC are
completely hidden in the corresponding Target directories.
I will send an email to llvmdev with instructions on how to use this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192181 91177308-0d34-0410-b5e6-96231b3b80d8