x * 40
=>
shlq $3, %rdi
leaq (%rdi,%rdi,4), %rax
This has the added benefit of allowing more multiply to be folded into addressing mode. e.g.
a * 24 + b
=>
leaq (%rdi,%rdi,2), %rax
leaq (%rsi,%rax,8), %rax
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67917 91177308-0d34-0410-b5e6-96231b3b80d8
Also fixes SDISel so it *does not* force promote return value if the function is not marked signext / zeroext.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67701 91177308-0d34-0410-b5e6-96231b3b80d8
stoppoint nodes around until Legalize; doing this
imposed an ordering on a sequence of loads that
came from different lines, interfering with scheduling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67692 91177308-0d34-0410-b5e6-96231b3b80d8
help out the register pressure reduction heuristics in the case of
nodes with multiple uses. Currently this uses very conservative
heuristics, so it doesn't have a broad impact, but in cases where it
does help it can make a big difference.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67586 91177308-0d34-0410-b5e6-96231b3b80d8
a data dependency on the load node, so it really needs a
data-dependence edge to the load node, even if the load previously
existed.
And add a few comments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67554 91177308-0d34-0410-b5e6-96231b3b80d8
and expanding a bit convert (PR3711). In both cases, we extract the
valid part of the widen vector and then do the conversion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67175 91177308-0d34-0410-b5e6-96231b3b80d8
size by the array amount as an i32 value instead of promoting from
i32 to i64 then doing the multiply. Not doing this broke wrap-around
assumptions that the optimizers (validly) made. The ultimate real
fix for this is to introduce i64 version of alloca and remove mallocinst.
This fixes PR3829
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67093 91177308-0d34-0410-b5e6-96231b3b80d8
vector shuffle mask. Forced the mask to be built using i32. Note: this will
be irrelevant once vector_shuffle no longer takes a build vector for the
shuffle mask.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@67076 91177308-0d34-0410-b5e6-96231b3b80d8
if FPConstant is legal because if the FPConstant doesn't need to be stored
in a constant pool, the transformation is unlikely to be profitable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66994 91177308-0d34-0410-b5e6-96231b3b80d8
ptrtoint and inttoptr in X86FastISel. These casts aren't always
handled in the generic FastISel code because X86 sometimes needs
custom code to do truncation and zero-extension.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66988 91177308-0d34-0410-b5e6-96231b3b80d8
by inserting explicit zero extensions where necessary. Included
is a testcase where SelectionDAG produces a virtual register
holding an i1 value which FastISel previously mistakenly assumed
to be zero-extended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66941 91177308-0d34-0410-b5e6-96231b3b80d8
1. ConstantPoolSDNode alignment field is log2 value of the alignment requirement. This is not consistent with other SDNode variants.
2. MachineConstantPool alignment field is also a log2 value.
3. However, some places are creating ConstantPoolSDNode with alignment value rather than log2 values. This creates entries with artificially large alignments, e.g. 256 for SSE vector values.
4. Constant pool entry offsets are computed when they are created. However, asm printer group them by sections. That means the offsets are no longer valid. However, asm printer uses them to determine size of padding between entries.
5. Asm printer uses expensive data structure multimap to track constant pool entries by sections.
6. Asm printer iterate over SmallPtrSet when it's emitting constant pool entries. This is non-deterministic.
Solutions:
1. ConstantPoolSDNode alignment field is changed to keep non-log2 value.
2. MachineConstantPool alignment field is also changed to keep non-log2 value.
3. Functions that create ConstantPool nodes are passing in non-log2 alignments.
4. MachineConstantPoolEntry no longer keeps an offset field. It's replaced with an alignment field. Offsets are not computed when constant pool entries are created. They are computed on the fly in asm printer and JIT.
5. Asm printer uses cheaper data structure to group constant pool entries.
6. Asm printer compute entry offsets after grouping is done.
7. Change JIT code to compute entry offsets on the fly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66875 91177308-0d34-0410-b5e6-96231b3b80d8
related transformations out of target-specific dag combine into the
ARM backend. These were added by Evan in r37685 with no testcases
and only seems to help ARM (e.g. test/CodeGen/ARM/select_xform.ll).
Add some simple X86-specific (for now) DAG combines that turn things
like cond ? 8 : 0 -> (zext(cond) << 3). This happens frequently
with the recently added cp constant select optimization, but is a
very general xform. For example, we now compile the second example
in const-select.ll to:
_test:
movsd LCPI2_0, %xmm0
ucomisd 8(%esp), %xmm0
seta %al
movzbl %al, %eax
movl 4(%esp), %ecx
movsbl (%ecx,%eax,4), %eax
ret
instead of:
_test:
movl 4(%esp), %eax
leal 4(%eax), %ecx
movsd LCPI2_0, %xmm0
ucomisd 8(%esp), %xmm0
cmovbe %eax, %ecx
movsbl (%ecx), %eax
ret
This passes multisource and dejagnu.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66779 91177308-0d34-0410-b5e6-96231b3b80d8
alignment of the generated constant pool entry to the
desired alignment of a type. If we don't do this, we end up
trying to do movsd from 4-byte alignment memory. This fixes
450.soplex and 456.hmmer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66641 91177308-0d34-0410-b5e6-96231b3b80d8
and extern_weak_odr. These are the same as the non-odr versions,
except that they indicate that the global will only be overridden
by an *equivalent* global. In C, a function with weak linkage can
be overridden by a function which behaves completely differently.
This means that IP passes have to skip weak functions, since any
deductions made from the function definition might be wrong, since
the definition could be replaced by something completely different
at link time. This is not allowed in C++, thanks to the ODR
(One-Definition-Rule): if a function is replaced by another at
link-time, then the new function must be the same as the original
function. If a language knows that a function or other global can
only be overridden by an equivalent global, it can give it the
weak_odr linkage type, and the optimizers will understand that it
is alright to make deductions based on the function body. The
code generators on the other hand map weak and weak_odr linkage
to the same thing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66339 91177308-0d34-0410-b5e6-96231b3b80d8
with multiple chain operands. This can occur when the scheduler
has added chain operands to a node that already has a chain
operand, in order to handle physical register dependencies.
This fixes an llvm-gcc bootstrap failure on x86-64 introduced
in r66058.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66240 91177308-0d34-0410-b5e6-96231b3b80d8
It is an error to call APInt::zext with a size that is equal to the value's
current size, so use zextOrTrunc instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66039 91177308-0d34-0410-b5e6-96231b3b80d8
so it changed it into a 31 via the TLO.ShrinkDemandedConstant() call. Then it
would go through the DAG combiner again. This time it had a value of 31, which
was turned into a -1 by TLI.SimplifyDemandedBits(). This would ping pong
forever.
Teach the TLO.ShrinkDemandedConstant() call not to lower a value if the demanded
value is an XOR of all ones.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65985 91177308-0d34-0410-b5e6-96231b3b80d8
arbitrary vector sizes. Add an optional MinSplatBits parameter to specify
a minimum for the splat element size. Update the PPC target to use the
revised interface.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@65899 91177308-0d34-0410-b5e6-96231b3b80d8