with the VST4 instructions. Until after register allocation, we want to
represent sets of adjacent registers by a single super-register. These
VST4 pseudo instructions have a single QQ or QQQQ source register operand.
They get expanded to the real VST4 instructions with 4 separate D register
operands. Once this conversion is complete, we'll be able to remove the
NEONPreAllocPass and avoid some fragile and hacky code elsewhere.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112108 91177308-0d34-0410-b5e6-96231b3b80d8
comparison that would overflow.
- The other under/overflow cases can't actually happen because the immediates
which would trigger them are legal (so we don't enter this code), but
adjusted the style to make it clear the transform is always valid.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112053 91177308-0d34-0410-b5e6-96231b3b80d8
Mark _alloca call as clobberring EFLAGS, otherwise some DCE might remove
other flags-clobberring stuff (e.g. cmp instructions) occuring after
_alloca call.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112034 91177308-0d34-0410-b5e6-96231b3b80d8
For now it's still a command line option, but the interface to the generic
code doesn't need to know that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111942 91177308-0d34-0410-b5e6-96231b3b80d8
Intended to help ease reproducing problems by increasing base register usage
after heuristics for only using the when needed are in place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111930 91177308-0d34-0410-b5e6-96231b3b80d8
that it needs to go away. Use loadRegFromStackSlot where possible.
Also, remember to update the value map.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111883 91177308-0d34-0410-b5e6-96231b3b80d8
it's COFF emitter which does not support differences of two symbols
(and needs to be fixed). GAS is pretty fine with code produced.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111801 91177308-0d34-0410-b5e6-96231b3b80d8
comparison is in a different basic block from the branch. In such
cases, the comparison's operands may not have initialized virtual
registers available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111709 91177308-0d34-0410-b5e6-96231b3b80d8
general idea here is to have a group of x86 target specific nodes which are
going to be selected during lowering and then directly matched in isel.
The commit includes the addition of those specific nodes and a *bunch* of
patterns, and incrementally we're going to switch between them and what we
have right now. Both the patterns and target specific nodes can change as
we move forward with this work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111691 91177308-0d34-0410-b5e6-96231b3b80d8
It's similar to "linker_private_weak", but it's known that the address of the
object is not taken. For instance, functions that had an inline definition, but
the compiler decided not to inline it. Note, unlike linker_private and
linker_private_weak, linker_private_weak_def_auto may have only default
visibility. The symbols are removed by the linker from the final linked image
(executable or dynamic library).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111684 91177308-0d34-0410-b5e6-96231b3b80d8
frame index reference to an object in the local block is seen, check if
it's near enough to any previously allocaated base register to re-use.
rdar://8277890
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111443 91177308-0d34-0410-b5e6-96231b3b80d8
Nothing fancy, just ask the target if any currently available base reg
is in range for the instruction under consideration and use the first one
that is. Placeholder ARM implementation simply returns false for now.
ongoing saga of rdar://8277890
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111374 91177308-0d34-0410-b5e6-96231b3b80d8
The "half vectors" are now widened to full size by the legalizer.
The only exception is in parameter passing, where half vectors are
expanded. This causes changes to some dejagnu tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111360 91177308-0d34-0410-b5e6-96231b3b80d8
"SPU Application Binary Interface Specification, v1.9" by
IBM.
Specifically: use r3-r74 to pass parameters and the return value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111358 91177308-0d34-0410-b5e6-96231b3b80d8
the local block. Resolve references to those indices to a new base register.
For simplification and testing purposes, a new virtual base register is
allocated for each frame index being resolved. The result is truly horrible,
but correct, code that's good for exercising the new code paths.
Next up is adding thumb1 support, which should be very simple. Following that
will be adding base register re-use and implementing a reasonable ARM
heuristic for when a virtual base register should be generated at all.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111315 91177308-0d34-0410-b5e6-96231b3b80d8
- Do not clobber al during variadic calls, this is AMD64 ABI-only feature
- Emit wincall64, where necessary
Patch by Cameron Esfahani!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111289 91177308-0d34-0410-b5e6-96231b3b80d8
Modernize predicates a bit.
The Predicate_* methods are not used by TableGen any longer. They are only
emitted for the sake of legacy code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111263 91177308-0d34-0410-b5e6-96231b3b80d8
whether to allocate a virtual frame base register to resolve the frame
index reference in it. Implement a simple version for ARM to aid debugging.
In LocalStackSlotAllocation, scan the function for frame index references
to local frame indices and ask the target whether to allocate virtual
frame base registers for any it encounters. Purely infrastructural for
debug output. Next step is to actually allocate base registers, then add
intelligent re-use of them.
rdar://8277890
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111262 91177308-0d34-0410-b5e6-96231b3b80d8
printing "lsl #0". This fixes the remaining parts of pr7792. Make
corresponding changes for encoding/decoding these instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111251 91177308-0d34-0410-b5e6-96231b3b80d8
the memory barrier variants (other than 'SY' full system domain read and write)
are treated as one instruction with option operand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110951 91177308-0d34-0410-b5e6-96231b3b80d8
- Make foldMemoryOperandImpl aware of 256-bit zero vectors folding and support the 128-bit counterparts of AVX too.
- Make sure MOV[AU]PS instructions are only selected when SSE1 is enabled, and duplicate the patterns to match AVX.
- Add a testcase for a simple 128-bit zero vector creation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110946 91177308-0d34-0410-b5e6-96231b3b80d8
term goal here is to be able to match enough of vector_shuffle and build_vector
so all avx intrinsics which aren't mapped to their own built-ins but to
shufflevector calls can be codegen'd. This is the first (baby) step, support
building zeroed vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110897 91177308-0d34-0410-b5e6-96231b3b80d8
entry for ARM STRBT is actually a super-instruction for A8.6.199 STRBT A1 & A2.
Recover by looking for ARM:USAT encoding pattern before delegating to the auto-
gened decoder.
Added a "usat" test case to arm-tests.txt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110894 91177308-0d34-0410-b5e6-96231b3b80d8
When a register is defined by a partial load:
%reg1234:sub_32 = MOV32mr <fi#-1>; GR64:%reg1234
That load cannot be folded into an instruction using the full 64-bit register.
It would become a 64-bit load.
This is related to the recent change to have isLoadFromStackSlot return false on
a sub-register load.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110874 91177308-0d34-0410-b5e6-96231b3b80d8
that many of these things, so the memory savings isn't significant,
and there are now situations where there can be alignments greater
than 128.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110836 91177308-0d34-0410-b5e6-96231b3b80d8
avoids trouble if the return type of TD->getPointerSize() is
changed to something which doesn't promote to a signed type,
and is simpler anyway.
Also, use getCopyFromReg instead of getRegister to read a
physical register's value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110835 91177308-0d34-0410-b5e6-96231b3b80d8
float t1(int argc) {
return (argc == 1123) ? 1.234f : 2.38213f;
}
We would generate truly awful code on ARM (those with a weak stomach should look
away):
_t1:
movw r1, #1123
movs r2, #1
movs r3, #0
cmp r0, r1
mov.w r0, #0
it eq
moveq r0, r2
movs r1, #4
cmp r0, #0
it ne
movne r3, r1
adr r0, #LCPI1_0
ldr r0, [r0, r3]
bx lr
The problem was that legalization was creating a cascade of SELECT_CC nodes, for
for the comparison of "argc == 1123" which was fed into a SELECT node for the ?:
statement which was itself converted to a SELECT_CC node. This is because the
ARM back-end doesn't have custom lowering for SELECT nodes, so it used the
default "Expand".
I added a fairly simple "LowerSELECT" to the ARM back-end. It takes care of this
testcase, but can obviously be expanded to include more cases.
Now we generate this, which looks optimal to me:
_t1:
movw r1, #1123
movs r2, #0
cmp r0, r1
adr r0, #LCPI0_0
it eq
moveq r2, #4
ldr r0, [r0, r2]
bx lr
.align 2
LCPI0_0:
.long 1075344593 @ float 2.382130e+00
.long 1067316150 @ float 1.234000e+00
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110799 91177308-0d34-0410-b5e6-96231b3b80d8
memory and synchronization barrier dmb and dsb instructions.
- Change instruction names to something more sensible (matching name of actual
instructions).
- Added tests for memory barrier codegen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110785 91177308-0d34-0410-b5e6-96231b3b80d8
for some reason they have a very odd MCInst form where the operands overlap, but
I haven't dug in to find out why yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110781 91177308-0d34-0410-b5e6-96231b3b80d8
(I discovered 2 more copies of the ARM instruction format list, bringing the
total to 4!! Two of them were already out of sync. I haven't yet gotten into
the disassembler enough to know the best way to fix this, but something needs
to be done.) Add support for encoding these instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110754 91177308-0d34-0410-b5e6-96231b3b80d8
Apply the same approach of SSE4.1 ptest intrinsics but
create a new x86 node "testp" since AVX introduces
vtest{ps}{pd} instructions which set ZF and CF depending
on sign bit AND and ANDN of packed floating-point sources.
This is slightly different from what the "ptest" does.
Tests comming with the other 256 intrinsics tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110744 91177308-0d34-0410-b5e6-96231b3b80d8