1. ConstantPoolSDNode alignment field is log2 value of the alignment requirement. This is not consistent with other SDNode variants.
2. MachineConstantPool alignment field is also a log2 value.
3. However, some places are creating ConstantPoolSDNode with alignment value rather than log2 values. This creates entries with artificially large alignments, e.g. 256 for SSE vector values.
4. Constant pool entry offsets are computed when they are created. However, asm printer group them by sections. That means the offsets are no longer valid. However, asm printer uses them to determine size of padding between entries.
5. Asm printer uses expensive data structure multimap to track constant pool entries by sections.
6. Asm printer iterate over SmallPtrSet when it's emitting constant pool entries. This is non-deterministic.
Solutions:
1. ConstantPoolSDNode alignment field is changed to keep non-log2 value.
2. MachineConstantPool alignment field is also changed to keep non-log2 value.
3. Functions that create ConstantPool nodes are passing in non-log2 alignments.
4. MachineConstantPoolEntry no longer keeps an offset field. It's replaced with an alignment field. Offsets are not computed when constant pool entries are created. They are computed on the fly in asm printer and JIT.
5. Asm printer uses cheaper data structure to group constant pool entries.
6. Asm printer compute entry offsets after grouping is done.
7. Change JIT code to compute entry offsets on the fly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66875 91177308-0d34-0410-b5e6-96231b3b80d8
of MachineInstr def operands must be subtracted out. This bug
was uncovered by the recent x86 EFLAGS optimization. Before
that, the only instructions that ever needed unfolding were
things like CMP32rm, where NumDefs is zero.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66056 91177308-0d34-0410-b5e6-96231b3b80d8
suprise to some callers, e.g. register coalescer. For now, add an parameter
that tells AnalyzeBranch whether it's safe to modify the mbb. A better
solution is out there, but I don't have time to deal with it right now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64124 91177308-0d34-0410-b5e6-96231b3b80d8
converted to LEA64_32r in x86's convertToThreeAddress. This
replaces code like this:
movl %esi, %edi
inc %edi
with this:
lea 1(%rsi), %edi
which appears to be beneficial.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@61830 91177308-0d34-0410-b5e6-96231b3b80d8
1. GlobalBaseReg may have been spilled.
2. It may not be live at the use.
3. Spiller doesn't know this is happening so it won't prevent GlobalBaseReg from being spilled later (That by itself is a nasty hack. It's needed because we don't insert the reload until later).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60595 91177308-0d34-0410-b5e6-96231b3b80d8
foldMemoryOperand how to "fold" them, by converting them into constant-pool
loads. When they aren't folded, they use xorps/cmpeqd, but for example when
register pressure is high, they may now be folded as memory operands, which
reduces register pressure.
Also, mark V_SET0 isAsCheapAsAMove so that two-address-elimination will
remat it instead of copying zeros around (V_SETALLONES was already marked).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60461 91177308-0d34-0410-b5e6-96231b3b80d8
the conditional for the BRCOND statement. For instance, it will generate:
addl %eax, %ecx
jo LOF
instead of
addl %eax, %ecx
; About 10 instructions to compare the signs of LHS, RHS, and sum.
jl LOF
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@60123 91177308-0d34-0410-b5e6-96231b3b80d8
Where previously LLVM might emit code like this:
ucomisd %xmm1, %xmm0
setne %al
setp %cl
orb %al, %cl
jne .LBB4_2
it now emits this:
ucomisd %xmm1, %xmm0
jne .LBB4_2
jp .LBB4_2
It has fewer instructions and uses fewer registers, but it does
have more branches. And in the case that this code is followed by
a non-fallthrough edge, it may be followed by a jmp instruction,
resulting in three branch instructions in sequence. Some effort
is made to avoid this situation.
To achieve this, X86ISelLowering.cpp now recognizes FCMP_OEQ and
FCMP_UNE in lowered form, and replace them with code that emits
two branches, except in the case where it would require converting
a fall-through edge to an explicit branch.
Also, X86InstrInfo.cpp's branch analysis and transform code now
knows now to handle blocks with multiple conditional branches. It
uses loops instead of having fixed checks for up to two
instructions. It can now analyze and transform code generated
from FCMP_OEQ and FCMP_UNE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57873 91177308-0d34-0410-b5e6-96231b3b80d8
the copy instruction from the instruction list before asking the
target to create the new instruction. This gets the old instruction
out of the way so that it doesn't interfere with the target's
rematerialization code. In the case of x86, this helps it find
more cases where EFLAGS is not live.
Also, in the X86InstrInfo.cpp, teach isSafeToClobberEFLAGS to check
to see if it reached the end of the block after scanning each
instruction, instead of just before. This lets it notice when the
end of the block is only two instructions away, without doing any
additional scanning.
These changes allow rematerialization to clobber EFLAGS in more
cases, for example using xor instead of mov to set the return value
to zero in the included testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57872 91177308-0d34-0410-b5e6-96231b3b80d8
shift counts, and patterns that match dynamic shift counts
when the subtract is obscured by a truncate node.
Add DAGCombiner support for recognizing rotate patterns
when the shift counts are defined by truncate nodes.
Fix and simplify the code for commuting shld and shrd
instructions to work even when the given instruction doesn't
have a parent, and when the caller needs a new instruction.
These changes allow LLVM to use the shld, shrd, rol, and ror
instructions on x86 to replace equivalent code using two
shifts and an or in many more cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@57662 91177308-0d34-0410-b5e6-96231b3b80d8
and X86FastISel.cpp into X86MachineFunction.h, so that it
can be shared, instead of having each selector keep track
of its own.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@56825 91177308-0d34-0410-b5e6-96231b3b80d8