to support replacing a node with another that has a superset of
the result types. Use this instead of calling
ReplaceAllUsesOfValueWith for each value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@69209 91177308-0d34-0410-b5e6-96231b3b80d8
operator is used by a CopyToReg to export the value to a different
block, don't reuse the CopyToReg's register for the subreg operation
result if the register isn't precisely the right class for the
subreg operation.
Also, rename the h-registers.ll test, now that there are more
than one.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@69087 91177308-0d34-0410-b5e6-96231b3b80d8
promoted to legal types without changing the type of the vector. This is
following a suggestion from Duncan
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-February/019923.html).
The transformation that used to be done during type legalization is now
postponed to DAG legalization. This allows the BUILD_VECTORs to be optimized
and potentially handled specially by target-specific code.
It turns out that this is also consistent with an optimization done by the
DAG combiner: a BUILD_VECTOR and INSERT_VECTOR_ELT may be combined by
replacing one of the BUILD_VECTOR operands with the newly inserted element;
but INSERT_VECTOR_ELT allows its scalar operand to be larger than the
element type, with any extra high bits being implicitly truncated. The
result is a BUILD_VECTOR where one of the operands has a type larger the
the vector element type.
Any code that operates on BUILD_VECTORs may now need to be aware of the
potential type discrepancy between the vector element type and the
BUILD_VECTOR operands. This patch updates all of the places that I could
find to handle that case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@68996 91177308-0d34-0410-b5e6-96231b3b80d8
Now debug_inlined section is covered by TAI->doesDwarfUsesInlineInfoSection(), which is false by default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@68964 91177308-0d34-0410-b5e6-96231b3b80d8
This will be used to replace things like X86's MOV32to32_.
Enhance ScheduleDAGSDNodesEmit to be more flexible and robust
in the presense of subregister superclasses and subclasses. It
can now cope with the definition of a virtual register being in
a subclass of a use.
Re-introduce the code for recording register superreg classes and
subreg classes. This is needed because when subreg extracts and
inserts get coalesced away, the virtual registers are left in
the correct subclass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@68961 91177308-0d34-0410-b5e6-96231b3b80d8
Create debug_inlined dwarf section using these information. This info is used by gdb, at least on Darwin, to enable better experience debugging inlined functions. See DwarfWriter.cpp for more information on structure of debug_inlined section.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@68847 91177308-0d34-0410-b5e6-96231b3b80d8
in addition to ZERO_EXTEND and SIGN_EXTEND. Fix a bug in the
way it checked for live-out values, and simplify the way it
find users by using SDNode::use_iterator's (relatively) new
features. Also, make it slightly more permissive on targets
with free truncates.
In SelectionDAGBuild, avoid creating ANY_EXTEND nodes that are
larger than necessary. If the target's SwitchAmountTy has
enough bits, use it. This exposes the truncate to optimization
early, enabling more optimizations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@68670 91177308-0d34-0410-b5e6-96231b3b80d8
eagerly. This helps avoid CopyToReg nodes in some cases where they
aren't needed, and also helps subsequent optimizer heuristics
in cases where the extra nodes would cause the node to appear
to have multiple results. This doesn't have a significant impact
currently; it'll help an upcoming change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@68667 91177308-0d34-0410-b5e6-96231b3b80d8
with SUBREG_TO_REG, teach SimpleRegisterCoalescing to coalesce
SUBREG_TO_REG instructions (which are similar to INSERT_SUBREG
instructions), and teach the DAGCombiner to take advantage of this on
targets which support it. This eliminates many redundant
zero-extension operations on x86-64.
This adds a new TargetLowering hook, isZExtFree. It's similar to
isTruncateFree, except it only applies to actual definitions, and not
no-op truncates which may not zero the high bits.
Also, this adds a new optimization to SimplifyDemandedBits: transform
operations like x+y into (zext (add (trunc x), (trunc y))) on targets
where all the casts are no-ops. In contexts where the high part of the
add is explicitly masked off, this allows the mask operation to be
eliminated. Fix the DAGCombiner to avoid undoing these transformations
to eliminate casts on targets where the casts are no-ops.
Also, this adds a new two-address lowering heuristic. Since
two-address lowering runs before coalescing, it helps to be able to
look through copies when deciding whether commuting and/or
three-address conversion are profitable.
Also, fix a bug in LiveInterval::MergeInClobberRanges. It didn't handle
the case that a clobber range extended both before and beyond an
existing live range. In that case, multiple live ranges need to be
added. This was exposed by the new subreg coalescing code.
Remove 2008-05-06-SpillerBug.ll. It was bugpoint-reduced, and the
spiller behavior it was looking for no longer occurrs with the new
instruction selection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@68576 91177308-0d34-0410-b5e6-96231b3b80d8
Note that these are distinct from TargetInstrInfo::INSERT_SUBREG
and TargetInstrInfo::EXTRACT_SUBREG, which are used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@68355 91177308-0d34-0410-b5e6-96231b3b80d8
x * 40
=>
shlq $3, %rdi
leaq (%rdi,%rdi,4), %rax
This has the added benefit of allowing more multiply to be folded into addressing mode. e.g.
a * 24 + b
=>
leaq (%rdi,%rdi,2), %rax
leaq (%rsi,%rax,8), %rax
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67917 91177308-0d34-0410-b5e6-96231b3b80d8
Also fixes SDISel so it *does not* force promote return value if the function is not marked signext / zeroext.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67701 91177308-0d34-0410-b5e6-96231b3b80d8
stoppoint nodes around until Legalize; doing this
imposed an ordering on a sequence of loads that
came from different lines, interfering with scheduling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67692 91177308-0d34-0410-b5e6-96231b3b80d8
help out the register pressure reduction heuristics in the case of
nodes with multiple uses. Currently this uses very conservative
heuristics, so it doesn't have a broad impact, but in cases where it
does help it can make a big difference.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67586 91177308-0d34-0410-b5e6-96231b3b80d8
a data dependency on the load node, so it really needs a
data-dependence edge to the load node, even if the load previously
existed.
And add a few comments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67554 91177308-0d34-0410-b5e6-96231b3b80d8
and expanding a bit convert (PR3711). In both cases, we extract the
valid part of the widen vector and then do the conversion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67175 91177308-0d34-0410-b5e6-96231b3b80d8
size by the array amount as an i32 value instead of promoting from
i32 to i64 then doing the multiply. Not doing this broke wrap-around
assumptions that the optimizers (validly) made. The ultimate real
fix for this is to introduce i64 version of alloca and remove mallocinst.
This fixes PR3829
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67093 91177308-0d34-0410-b5e6-96231b3b80d8
vector shuffle mask. Forced the mask to be built using i32. Note: this will
be irrelevant once vector_shuffle no longer takes a build vector for the
shuffle mask.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67076 91177308-0d34-0410-b5e6-96231b3b80d8
if FPConstant is legal because if the FPConstant doesn't need to be stored
in a constant pool, the transformation is unlikely to be profitable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66994 91177308-0d34-0410-b5e6-96231b3b80d8
ptrtoint and inttoptr in X86FastISel. These casts aren't always
handled in the generic FastISel code because X86 sometimes needs
custom code to do truncation and zero-extension.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66988 91177308-0d34-0410-b5e6-96231b3b80d8
by inserting explicit zero extensions where necessary. Included
is a testcase where SelectionDAG produces a virtual register
holding an i1 value which FastISel previously mistakenly assumed
to be zero-extended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66941 91177308-0d34-0410-b5e6-96231b3b80d8
1. ConstantPoolSDNode alignment field is log2 value of the alignment requirement. This is not consistent with other SDNode variants.
2. MachineConstantPool alignment field is also a log2 value.
3. However, some places are creating ConstantPoolSDNode with alignment value rather than log2 values. This creates entries with artificially large alignments, e.g. 256 for SSE vector values.
4. Constant pool entry offsets are computed when they are created. However, asm printer group them by sections. That means the offsets are no longer valid. However, asm printer uses them to determine size of padding between entries.
5. Asm printer uses expensive data structure multimap to track constant pool entries by sections.
6. Asm printer iterate over SmallPtrSet when it's emitting constant pool entries. This is non-deterministic.
Solutions:
1. ConstantPoolSDNode alignment field is changed to keep non-log2 value.
2. MachineConstantPool alignment field is also changed to keep non-log2 value.
3. Functions that create ConstantPool nodes are passing in non-log2 alignments.
4. MachineConstantPoolEntry no longer keeps an offset field. It's replaced with an alignment field. Offsets are not computed when constant pool entries are created. They are computed on the fly in asm printer and JIT.
5. Asm printer uses cheaper data structure to group constant pool entries.
6. Asm printer compute entry offsets after grouping is done.
7. Change JIT code to compute entry offsets on the fly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66875 91177308-0d34-0410-b5e6-96231b3b80d8
related transformations out of target-specific dag combine into the
ARM backend. These were added by Evan in r37685 with no testcases
and only seems to help ARM (e.g. test/CodeGen/ARM/select_xform.ll).
Add some simple X86-specific (for now) DAG combines that turn things
like cond ? 8 : 0 -> (zext(cond) << 3). This happens frequently
with the recently added cp constant select optimization, but is a
very general xform. For example, we now compile the second example
in const-select.ll to:
_test:
movsd LCPI2_0, %xmm0
ucomisd 8(%esp), %xmm0
seta %al
movzbl %al, %eax
movl 4(%esp), %ecx
movsbl (%ecx,%eax,4), %eax
ret
instead of:
_test:
movl 4(%esp), %eax
leal 4(%eax), %ecx
movsd LCPI2_0, %xmm0
ucomisd 8(%esp), %xmm0
cmovbe %eax, %ecx
movsbl (%ecx), %eax
ret
This passes multisource and dejagnu.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66779 91177308-0d34-0410-b5e6-96231b3b80d8
alignment of the generated constant pool entry to the
desired alignment of a type. If we don't do this, we end up
trying to do movsd from 4-byte alignment memory. This fixes
450.soplex and 456.hmmer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66641 91177308-0d34-0410-b5e6-96231b3b80d8
and extern_weak_odr. These are the same as the non-odr versions,
except that they indicate that the global will only be overridden
by an *equivalent* global. In C, a function with weak linkage can
be overridden by a function which behaves completely differently.
This means that IP passes have to skip weak functions, since any
deductions made from the function definition might be wrong, since
the definition could be replaced by something completely different
at link time. This is not allowed in C++, thanks to the ODR
(One-Definition-Rule): if a function is replaced by another at
link-time, then the new function must be the same as the original
function. If a language knows that a function or other global can
only be overridden by an equivalent global, it can give it the
weak_odr linkage type, and the optimizers will understand that it
is alright to make deductions based on the function body. The
code generators on the other hand map weak and weak_odr linkage
to the same thing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66339 91177308-0d34-0410-b5e6-96231b3b80d8
with multiple chain operands. This can occur when the scheduler
has added chain operands to a node that already has a chain
operand, in order to handle physical register dependencies.
This fixes an llvm-gcc bootstrap failure on x86-64 introduced
in r66058.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66240 91177308-0d34-0410-b5e6-96231b3b80d8
It is an error to call APInt::zext with a size that is equal to the value's
current size, so use zextOrTrunc instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66039 91177308-0d34-0410-b5e6-96231b3b80d8
so it changed it into a 31 via the TLO.ShrinkDemandedConstant() call. Then it
would go through the DAG combiner again. This time it had a value of 31, which
was turned into a -1 by TLI.SimplifyDemandedBits(). This would ping pong
forever.
Teach the TLO.ShrinkDemandedConstant() call not to lower a value if the demanded
value is an XOR of all ones.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65985 91177308-0d34-0410-b5e6-96231b3b80d8
arbitrary vector sizes. Add an optional MinSplatBits parameter to specify
a minimum for the splat element size. Update the PPC target to use the
revised interface.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65899 91177308-0d34-0410-b5e6-96231b3b80d8
overly long ints, e.g. i96, into pieces at PHIs
and the nodes that feed into them; however big-endian
reverses the order of the pieces (for some reason), and
wasn't doing it the same way on both sides, so
the pieces didn't match and runtime failures ensued.
Fixes 188.ammp and sqlite3 on ppc32.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65481 91177308-0d34-0410-b5e6-96231b3b80d8
results via reference parameters.
This patch also appears to fix Evan's reported problem supplied as a
reduced bugpoint test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65426 91177308-0d34-0410-b5e6-96231b3b80d8
them are generic changes.
- Use the "fast" flag that's already being passed into the asm printers instead
of shoving it into the DwarfWriter.
- Instead of calling "MI->getParent()->getParent()" for every MI, set the
machine function when calling "runOnMachineFunction" in the asm printers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65379 91177308-0d34-0410-b5e6-96231b3b80d8
a DBG_LABEL or not. We want to fall back to the original way of emitting debug
info when we're in -O0/-fast mode.
- Add plumbing in to pass the "Fast" flag to places that need it.
- XFAIL DebugInfo/deaddebuglabel.ll. This is finding 11 labels instead of 8. I
need to investigate still.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65367 91177308-0d34-0410-b5e6-96231b3b80d8
ashr instcombine to help expose this code. And apply the fix to
SelectionDAG's copy of this code too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65364 91177308-0d34-0410-b5e6-96231b3b80d8
instruction. The class also consolidates the code for detecting constant
splats that's shared across PowerPC and the CellSPU backends (and might be
useful for other backends.) Also introduces SelectionDAG::getBUID_VECTOR() for
generating new BUILD_VECTOR nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65296 91177308-0d34-0410-b5e6-96231b3b80d8
that checks whether it's safe to transform a store of a bitcast
value into a store of the original value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65201 91177308-0d34-0410-b5e6-96231b3b80d8
(Note: Eventually, commits like this will be handled via a pre-commit hook that
does this automagically, as well as expand tabs to spaces and look for 80-col
violations.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64827 91177308-0d34-0410-b5e6-96231b3b80d8
U include/llvm/CodeGen/DebugLoc.h
U lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
U lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp
U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.cpp
Enable debug location generation at -Os. This goes with the reapplication of the
r63639 patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64715 91177308-0d34-0410-b5e6-96231b3b80d8
one bit set, because the bit may be shifted off the end. Instead,
just check for a constant 1 being shifted. This is still sufficient
to handle all the cases in test/CodeGen/X86/bt.ll. This fixes PR3583.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64622 91177308-0d34-0410-b5e6-96231b3b80d8
Cleanup some warning.
Remark: when struct/class are declared differently than they are defined, this make problem for VC++ since it seems to mangle class differently that struct. These error are very hard to understand and find. So please, try to keep your definition/declaration in sync.
Only tested with VS2008. hope it does not break anything. feel free to revert.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64554 91177308-0d34-0410-b5e6-96231b3b80d8
in inline asm as signed (what gcc does). Add partial support
for x86-specific "e" and "Z" constraints, with appropriate
signedness for printing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64400 91177308-0d34-0410-b5e6-96231b3b80d8
unless they actually have data successors, and likewise for nodes
with no data successors unless they actually have data precessors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64327 91177308-0d34-0410-b5e6-96231b3b80d8
is determined by whether the node has a Flag operand. However, if the
node does have a Flag operand, it will be glued to its register's
def, so the heuristic would end up spuriously applying to whatever
node is the def.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64319 91177308-0d34-0410-b5e6-96231b3b80d8
It was transforming (x&y)==y to (x&y)!=0 in the case where
y is variable and known to have at most one bit set (e.g. z&1).
This is not correct; the expressions are not equivalent when y==0.
I believe this patch salvages what can be salvaged, including
all the cases in bt.ll. Dan, please review.
Fixes gcc.c-torture/execute/20040709-[12].c
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64314 91177308-0d34-0410-b5e6-96231b3b80d8
instruction index across each part. Instruction indices are used
to make live range queries, and live ranges can extend beyond
scheduling region boundaries.
Refactor the ScheduleDAGSDNodes class some more so that it
doesn't have to worry about this additional information.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64288 91177308-0d34-0410-b5e6-96231b3b80d8
scheduling, and generalize is so that preserves state across
scheduling regions. This fixes incorrect live-range information around
terminators and labels, which are effective region boundaries.
In place of looking for terminators to anchor inter-block dependencies,
introduce special entry and exit scheduling units for this purpose.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64254 91177308-0d34-0410-b5e6-96231b3b80d8
Adjust derived classes to pass UnknownLoc where
a DebugLoc does not make sense. Pick one of
DebugLoc and non-DebugLoc variants to survive
for all such classes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64000 91177308-0d34-0410-b5e6-96231b3b80d8
Many targets build placeholder nodes for special operands, e.g.
GlobalBaseReg on X86 and PPC for the PIC base. There's no
sensible way to associate debug info with these. I've left
them built with getNode calls with explicit DebugLoc::getUnknownLoc operands.
I'm not too happy about this but don't see a good improvement;
I considered adding a getPseudoOperand or something, but it
seems to me that'll just make it harder to read.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63992 91177308-0d34-0410-b5e6-96231b3b80d8
getCALLSEQ_{END,START} to permit passing no DebugLoc
there. UNDEF doesn't logically have DebugLoc; add
getUNDEF to encapsulate this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63978 91177308-0d34-0410-b5e6-96231b3b80d8
SelectionDAGISel::CreateScheduler, and make it just create the
scheduler. Leave running the scheduler to the higher-level code.
This makes the higher-level code a little more explicit and
easier to follow, and will help enable some future refactoring.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63944 91177308-0d34-0410-b5e6-96231b3b80d8
target directories themselves. This also means that VMCore no longer
needs to know about every target's list of intrinsics. Future work
will include converting the PowerPC target to this interface as an
example implementation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63765 91177308-0d34-0410-b5e6-96231b3b80d8
but when legalizing the operation, we split the vector type and generate a library
call whose type needs to be promoted. For example, X86 with SSE on but MMX off,
a divide v2i64 will be scalarized to 2 calls to a library using i64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63760 91177308-0d34-0410-b5e6-96231b3b80d8
support GraphViz, I've been using the foo->dump() facility. This
patch is a minor rewrite to the SelectionDAG dump() stuff to make it a
little more helpful. The existing foo->dump() functionality does not
change; this patch adds foo->dumpr(). All of this is only useful when
running LLVM under a debugger.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63736 91177308-0d34-0410-b5e6-96231b3b80d8
in any old order. Since analyzing a node analyzes its
operands also, this can mean that when we pop a node
off the list of nodes to be analyzed, it may already
have been analyzed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63632 91177308-0d34-0410-b5e6-96231b3b80d8
information. This eliminates the need for the Flags field in MemSDNode,
so this makes LoadSDNode and StoreSDNode smaller. Also, it makes
FoldingSetNodeIDs for loads and stores two AddIntegers smaller.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63577 91177308-0d34-0410-b5e6-96231b3b80d8
crashes or wrong code with codegen of large integers:
eliminate the legacy getIntegerVTBitMask and
getIntegerVTSignBit methods, which returned their
value as a uint64_t, so couldn't handle huge types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63494 91177308-0d34-0410-b5e6-96231b3b80d8
returned by getShiftAmountTy may be too small
to hold shift values (it is an i8 on x86-32).
Before and during type legalization, use a large
but legal type for shift amounts: getPointerTy;
afterwards use getShiftAmountTy, fixing up any
shift amounts with a big type during operation
legalization. Thanks to Dan for writing the
original patch (which I shamelessly pillaged).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63482 91177308-0d34-0410-b5e6-96231b3b80d8