EXTRACT_SUBREG is emitted as %dst = COPY %src:sub, so there is no need to
constrain the %dst register class. RegisterCoalescer will apply the
necessary constraints if it decides to eliminate the COPY.
The %src register class does need to be constrained to something with
the right sub-registers, though. This is currently done manually with
COPY_TO_REGCLASS nodes. They can possibly be removed after this patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141207 91177308-0d34-0410-b5e6-96231b3b80d8
The register class created by INSERT_SUBREG and SUBREG_TO_REG must be
legal and support the SubIdx sub-registers.
The new getSubClassWithSubReg() hook can compute that.
This may create INSERT_SUBREG instructions defining a larger register
class than the sub-register being inserted. That is OK,
RegisterCoalescer will constrain the register class as needed when it
eliminates the INSERT_SUBREG instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141198 91177308-0d34-0410-b5e6-96231b3b80d8
and the alignment is 0 (i.e., it's defined globally in one file and declared in
another file) it could get an alignment which is larger than the ABI allows for
that type, resulting in aligned moves being used for unaligned loads.
For instance, in file A.c:
struct S s;
In file B.c:
struct {
// something long
};
extern S s;
void foo() {
struct S p = s;
// ...
}
this copy is a 'memcpy' which is turned into a series of 'movaps' instructions
on X86. But this is wrong, because 'struct S' has alignment of 4, not 16.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140902 91177308-0d34-0410-b5e6-96231b3b80d8
This intrinsic is used to pass the index of the function context to the back-end
for further processing. The back-end is in charge of filling in the rest of the
entries.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140676 91177308-0d34-0410-b5e6-96231b3b80d8
SDNodes may return values which are wider than the incoming element types. In
this patch we fix the integer promotion of these nodes.
Fixes spill-q.ll when running -promote-elements.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140471 91177308-0d34-0410-b5e6-96231b3b80d8
(this is always the case for scalars), otherwise use the promoted result type.
Fix test/CodeGen/X86/vsplit-and.ll when promote-elements is enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140464 91177308-0d34-0410-b5e6-96231b3b80d8
When generating the trunc-store of i1's, we need to use the vector type and not
the scalar type.
This patch fixes the assertion in CodeGen/Generic/bool-vector.ll when
running with -promote-elements.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140463 91177308-0d34-0410-b5e6-96231b3b80d8
DecomposeMERGE_VALUES to "know" that results are legalized in
a particular order, by passing it the number of the result
being legalized (the type legalization core provides this, it
just needs to be passed on).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140373 91177308-0d34-0410-b5e6-96231b3b80d8
integer-promotion of CONCAT_VECTORS.
Test: test/CodeGen/X86/widen_shuffle-1.ll
This patch fixes the above tests (when running in with -promote-elements).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140372 91177308-0d34-0410-b5e6-96231b3b80d8
Sometimes register class constraints are trivial, like GR32->GR32_NOSP,
or GPR->rGPR. Teach InstrEmitter to simply constrain the virtual
register instead of emitting a copy in these cases.
Normally, these copies are handled by the coalescer. This saves some
coalescer work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140340 91177308-0d34-0410-b5e6-96231b3b80d8
Vector SetCC result types need to be type-legalized.
This code worked before because scalar result types are known to be legal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140249 91177308-0d34-0410-b5e6-96231b3b80d8
This is still a hack until we can teach tblgen to generate the
optional CPSR operand rather than an implicit CPSR def. But the
strangeness is now limited to the selection DAG. ADD/SUB MI's no
longer have implicit CPSR defs, nor do we allow flag setting variants
of these opcodes in machine code. There are several corner cases to
consider, and getting one wrong would previously lead to nasty
miscompilation. It's not the first time I've debugged one, so this
time I added enough verification to ensure it won't happen again.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140228 91177308-0d34-0410-b5e6-96231b3b80d8
No functionality change. The hook makes it explicit which patterns
require "special" handling. i.e. it self-documents tblgen
deficiencies. I plan to add verification in ExpandISelPseudos and
Thumb2SizeReduce to catch any missing hasPostISelHooks. Otherwise it's
too fragile.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140160 91177308-0d34-0410-b5e6-96231b3b80d8
Modified ARMISelLowering::AdjustInstrPostInstrSelection to handle the
full gamut of CPSR defs/uses including instructins whose "optional"
cc_out operand is not really optional. This allowed removal of the
hasPostISelHook to simplify the .td files and make the implementation
more robust.
Fixes rdar://10137436: sqlite3 miscompile
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140134 91177308-0d34-0410-b5e6-96231b3b80d8
(The fix for the related failures on x86 is going to be nastier because we actually need Acquire memoperands attached to the atomic load instrs, etc.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139221 91177308-0d34-0410-b5e6-96231b3b80d8
with a vector condition); such selects become VSELECT codegen nodes.
This patch also removes VSETCC codegen nodes, unifying them with SETCC
nodes (codegen was actually often using SETCC for vector SETCC already).
This ensures that various DAG combiner optimizations kick in for vector
comparisons. Passes dragonegg bootstrap with no testsuite regressions
(nightly testsuite as well as "make check-all"). Patch mostly by
Nadav Rotem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139159 91177308-0d34-0410-b5e6-96231b3b80d8
init.trampoline and adjust.trampoline intrinsics, into two intrinsics
like in GCC. While having one combined intrinsic is tempting, it is
not natural because typically the trampoline initialization needs to
be done in one function, and the result of adjust trampoline is needed
in a different (nested) function. To get around this llvm-gcc hacks the
nested function lowering code to insert an additional parent variable
holding the adjust.trampoline result that can be accessed from the child
function. Dragonegg doesn't have the luxury of tweaking GCC code, so it
stored the result of adjust.trampoline in the memory GCC set aside for
the trampoline itself (this is always available in the child function),
and set up some new memory (using an alloca) to hold the trampoline.
Unfortunately this breaks Go which allocates trampoline memory on the
heap and wants to use it even after the parent has exited (!). Rather
than doing even more hacks to get Go working, it seemed best to just use
two intrinsics like in GCC. Patch mostly by Sanjoy Das.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139140 91177308-0d34-0410-b5e6-96231b3b80d8
If we have a chain of zext -> assert_zext -> zext -> use, the first zext would get simplified away because of the later zext, and then the later zext would get simplified away because of the assert. The solution is to teach SimplifyDemandedBits that assert_zext demands all of the high bits of its input, rather than only those demanded by its users. No testcase because the only example I have manifests as llvm-gcc miscompiling LLVM, and I haven't found a smaller case that reproduces this problem.
Fixes <rdar://problem/10063365>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139059 91177308-0d34-0410-b5e6-96231b3b80d8
to be unreliable on platforms which require memcpy calls, and it is
complicating broader legalize cleanups. It is hoped that these cleanups
will make memcpy byval easier to implement in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138977 91177308-0d34-0410-b5e6-96231b3b80d8
Added canClobberReachingPhysRegUse() to handle a particular pattern in
which a two-address instruction could be forced to interfere with
EFLAGS, causing a compare to be unnecessarilly cloned.
Fixes rdar://problem/5875261
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138924 91177308-0d34-0410-b5e6-96231b3b80d8
Add a instruction flag: hasPostISelHook which tells the pre-RA scheduler to
call a target hook to adjust the instruction. For ARM, this is used to
adjust instructions which may be setting the 's' flag. ADC, SBC, RSB, and RSC
instructions have implicit def of CPSR (required since it now uses CPSR physical
register dependency rather than "glue"). If the carry flag is used, then the
target hook will *fill in* the optional operand with CPSR. Otherwise, the hook
will remove the CPSR implicit def from the MachineInstr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138810 91177308-0d34-0410-b5e6-96231b3b80d8
I don't really like the patterns, but I'm having trouble coming up with a
better way to handle them.
I plan on making other targets use the same legalization
ARM-without-memory-barriers is using... it's not especially efficient, but
if anyone cares, it's not that hard to fix for a given target if there's
some better lowering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138621 91177308-0d34-0410-b5e6-96231b3b80d8