type as the vector element type: allow them to be of
a wider integer type than the element type all the way
through the system, and not just as far as LegalizeDAG.
This should be safe because it used to be this way
(the old type legalizer would produce such nodes), so
backends should be able to handle it. In fact only
targets which have legal vector types with an illegal
promoted element type will ever see this (eg: <4 x i16>
on ppc). This fixes a regression with the new type
legalizer (vec_splat.ll). Also, treat SCALAR_TO_VECTOR
the same as BUILD_VECTOR. After all, it is just a
special case of BUILD_VECTOR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@69467 91177308-0d34-0410-b5e6-96231b3b80d8
promoted to legal types without changing the type of the vector. This is
following a suggestion from Duncan
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-February/019923.html).
The transformation that used to be done during type legalization is now
postponed to DAG legalization. This allows the BUILD_VECTORs to be optimized
and potentially handled specially by target-specific code.
It turns out that this is also consistent with an optimization done by the
DAG combiner: a BUILD_VECTOR and INSERT_VECTOR_ELT may be combined by
replacing one of the BUILD_VECTOR operands with the newly inserted element;
but INSERT_VECTOR_ELT allows its scalar operand to be larger than the
element type, with any extra high bits being implicitly truncated. The
result is a BUILD_VECTOR where one of the operands has a type larger the
the vector element type.
Any code that operates on BUILD_VECTORs may now need to be aware of the
potential type discrepancy between the vector element type and the
BUILD_VECTOR operands. This patch updates all of the places that I could
find to handle that case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@68996 91177308-0d34-0410-b5e6-96231b3b80d8
in addition to ZERO_EXTEND and SIGN_EXTEND. Fix a bug in the
way it checked for live-out values, and simplify the way it
find users by using SDNode::use_iterator's (relatively) new
features. Also, make it slightly more permissive on targets
with free truncates.
In SelectionDAGBuild, avoid creating ANY_EXTEND nodes that are
larger than necessary. If the target's SwitchAmountTy has
enough bits, use it. This exposes the truncate to optimization
early, enabling more optimizations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@68670 91177308-0d34-0410-b5e6-96231b3b80d8
with SUBREG_TO_REG, teach SimpleRegisterCoalescing to coalesce
SUBREG_TO_REG instructions (which are similar to INSERT_SUBREG
instructions), and teach the DAGCombiner to take advantage of this on
targets which support it. This eliminates many redundant
zero-extension operations on x86-64.
This adds a new TargetLowering hook, isZExtFree. It's similar to
isTruncateFree, except it only applies to actual definitions, and not
no-op truncates which may not zero the high bits.
Also, this adds a new optimization to SimplifyDemandedBits: transform
operations like x+y into (zext (add (trunc x), (trunc y))) on targets
where all the casts are no-ops. In contexts where the high part of the
add is explicitly masked off, this allows the mask operation to be
eliminated. Fix the DAGCombiner to avoid undoing these transformations
to eliminate casts on targets where the casts are no-ops.
Also, this adds a new two-address lowering heuristic. Since
two-address lowering runs before coalescing, it helps to be able to
look through copies when deciding whether commuting and/or
three-address conversion are profitable.
Also, fix a bug in LiveInterval::MergeInClobberRanges. It didn't handle
the case that a clobber range extended both before and beyond an
existing live range. In that case, multiple live ranges need to be
added. This was exposed by the new subreg coalescing code.
Remove 2008-05-06-SpillerBug.ll. It was bugpoint-reduced, and the
spiller behavior it was looking for no longer occurrs with the new
instruction selection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@68576 91177308-0d34-0410-b5e6-96231b3b80d8
x * 40
=>
shlq $3, %rdi
leaq (%rdi,%rdi,4), %rax
This has the added benefit of allowing more multiply to be folded into addressing mode. e.g.
a * 24 + b
=>
leaq (%rdi,%rdi,2), %rax
leaq (%rsi,%rax,8), %rax
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67917 91177308-0d34-0410-b5e6-96231b3b80d8
vector shuffle mask. Forced the mask to be built using i32. Note: this will
be irrelevant once vector_shuffle no longer takes a build vector for the
shuffle mask.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67076 91177308-0d34-0410-b5e6-96231b3b80d8
if FPConstant is legal because if the FPConstant doesn't need to be stored
in a constant pool, the transformation is unlikely to be profitable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66994 91177308-0d34-0410-b5e6-96231b3b80d8
1. ConstantPoolSDNode alignment field is log2 value of the alignment requirement. This is not consistent with other SDNode variants.
2. MachineConstantPool alignment field is also a log2 value.
3. However, some places are creating ConstantPoolSDNode with alignment value rather than log2 values. This creates entries with artificially large alignments, e.g. 256 for SSE vector values.
4. Constant pool entry offsets are computed when they are created. However, asm printer group them by sections. That means the offsets are no longer valid. However, asm printer uses them to determine size of padding between entries.
5. Asm printer uses expensive data structure multimap to track constant pool entries by sections.
6. Asm printer iterate over SmallPtrSet when it's emitting constant pool entries. This is non-deterministic.
Solutions:
1. ConstantPoolSDNode alignment field is changed to keep non-log2 value.
2. MachineConstantPool alignment field is also changed to keep non-log2 value.
3. Functions that create ConstantPool nodes are passing in non-log2 alignments.
4. MachineConstantPoolEntry no longer keeps an offset field. It's replaced with an alignment field. Offsets are not computed when constant pool entries are created. They are computed on the fly in asm printer and JIT.
5. Asm printer uses cheaper data structure to group constant pool entries.
6. Asm printer compute entry offsets after grouping is done.
7. Change JIT code to compute entry offsets on the fly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66875 91177308-0d34-0410-b5e6-96231b3b80d8
related transformations out of target-specific dag combine into the
ARM backend. These were added by Evan in r37685 with no testcases
and only seems to help ARM (e.g. test/CodeGen/ARM/select_xform.ll).
Add some simple X86-specific (for now) DAG combines that turn things
like cond ? 8 : 0 -> (zext(cond) << 3). This happens frequently
with the recently added cp constant select optimization, but is a
very general xform. For example, we now compile the second example
in const-select.ll to:
_test:
movsd LCPI2_0, %xmm0
ucomisd 8(%esp), %xmm0
seta %al
movzbl %al, %eax
movl 4(%esp), %ecx
movsbl (%ecx,%eax,4), %eax
ret
instead of:
_test:
movl 4(%esp), %eax
leal 4(%eax), %ecx
movsd LCPI2_0, %xmm0
ucomisd 8(%esp), %xmm0
cmovbe %eax, %ecx
movsbl (%ecx), %eax
ret
This passes multisource and dejagnu.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66779 91177308-0d34-0410-b5e6-96231b3b80d8
alignment of the generated constant pool entry to the
desired alignment of a type. If we don't do this, we end up
trying to do movsd from 4-byte alignment memory. This fixes
450.soplex and 456.hmmer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66641 91177308-0d34-0410-b5e6-96231b3b80d8
instruction. The class also consolidates the code for detecting constant
splats that's shared across PowerPC and the CellSPU backends (and might be
useful for other backends.) Also introduces SelectionDAG::getBUID_VECTOR() for
generating new BUILD_VECTOR nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65296 91177308-0d34-0410-b5e6-96231b3b80d8
that checks whether it's safe to transform a store of a bitcast
value into a store of the original value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65201 91177308-0d34-0410-b5e6-96231b3b80d8
(Note: Eventually, commits like this will be handled via a pre-commit hook that
does this automagically, as well as expand tabs to spaces and look for 80-col
violations.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64827 91177308-0d34-0410-b5e6-96231b3b80d8
getCALLSEQ_{END,START} to permit passing no DebugLoc
there. UNDEF doesn't logically have DebugLoc; add
getUNDEF to encapsulate this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63978 91177308-0d34-0410-b5e6-96231b3b80d8
crashes or wrong code with codegen of large integers:
eliminate the legacy getIntegerVTBitMask and
getIntegerVTSignBit methods, which returned their
value as a uint64_t, so couldn't handle huge types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63494 91177308-0d34-0410-b5e6-96231b3b80d8
returned by getShiftAmountTy may be too small
to hold shift values (it is an i8 on x86-32).
Before and during type legalization, use a large
but legal type for shift amounts: getPointerTy;
afterwards use getShiftAmountTy, fixing up any
shift amounts with a big type during operation
legalization. Thanks to Dan for writing the
original patch (which I shamelessly pillaged).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63482 91177308-0d34-0410-b5e6-96231b3b80d8