the memory barrier variants (other than 'SY' full system domain read and write)
are treated as one instruction with option operand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110951 91177308-0d34-0410-b5e6-96231b3b80d8
- Make foldMemoryOperandImpl aware of 256-bit zero vectors folding and support the 128-bit counterparts of AVX too.
- Make sure MOV[AU]PS instructions are only selected when SSE1 is enabled, and duplicate the patterns to match AVX.
- Add a testcase for a simple 128-bit zero vector creation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110946 91177308-0d34-0410-b5e6-96231b3b80d8
term goal here is to be able to match enough of vector_shuffle and build_vector
so all avx intrinsics which aren't mapped to their own built-ins but to
shufflevector calls can be codegen'd. This is the first (baby) step, support
building zeroed vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110897 91177308-0d34-0410-b5e6-96231b3b80d8
entry for ARM STRBT is actually a super-instruction for A8.6.199 STRBT A1 & A2.
Recover by looking for ARM:USAT encoding pattern before delegating to the auto-
gened decoder.
Added a "usat" test case to arm-tests.txt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110894 91177308-0d34-0410-b5e6-96231b3b80d8
When a register is defined by a partial load:
%reg1234:sub_32 = MOV32rm <fi#-1>; GR64:%reg1234
That load cannot be folded into an instruction using the full 64-bit register.
It would become a 64-bit load.
This is related to the recent change to have isLoadFromStackSlot return false on
a sub-register load.
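A minimal sketch of why the fold is unsafe, assuming a hypothetical 8-byte frame in which the first 4 bytes are the spill slot and the next 4 belong to something else. The names Frame, loadBytes and makeFrame are illustrative only and are not LLVM APIs:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical frame: bytes 0..3 are the 32-bit slot, bytes 4..7 hold
// unrelated data that the partial load never touches.
struct Frame {
    unsigned char bytes[8];
};

// Reads sizeof(T) bytes starting at the given offset.
template <typename T>
T loadBytes(const Frame &f, std::size_t offset) {
    assert(offset + sizeof(T) <= sizeof f.bytes && "load runs past the frame");
    T value;
    std::memcpy(&value, f.bytes + offset, sizeof value);
    return value;
}

inline Frame makeFrame() {
    Frame f;
    std::memset(f.bytes, 0x11, 4);      // the spill slot
    std::memset(f.bytes + 4, 0xFF, 4);  // neighbouring data
    return f;
}

// The original partial load reads 4 bytes; a load folded into a user of
// the full 64-bit register would read 8 and pick up the neighbour.
inline std::uint32_t partialLoad(const Frame &f) { return loadBytes<std::uint32_t>(f, 0); }
inline std::uint64_t foldedLoad(const Frame &f) { return loadBytes<std::uint64_t>(f, 0); }
```

In this model the two loads disagree as soon as the bytes after the slot are non-zero, which is the situation the fold has to avoid.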
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110874 91177308-0d34-0410-b5e6-96231b3b80d8
that many of these things, so the memory savings aren't significant,
and there are now situations where there can be alignments greater
than 128.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110836 91177308-0d34-0410-b5e6-96231b3b80d8
avoids trouble if the return type of TD->getPointerSize() is
changed to something which doesn't promote to a signed type,
and is simpler anyway.
Also, use getCopyFromReg instead of getRegister to read a
physical register's value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110835 91177308-0d34-0410-b5e6-96231b3b80d8
float t1(int argc) {
return (argc == 1123) ? 1.234f : 2.38213f;
}
We would generate truly awful code on ARM (those with a weak stomach should look
away):
_t1:
movw r1, #1123
movs r2, #1
movs r3, #0
cmp r0, r1
mov.w r0, #0
it eq
moveq r0, r2
movs r1, #4
cmp r0, #0
it ne
movne r3, r1
adr r0, #LCPI1_0
ldr r0, [r0, r3]
bx lr
The problem was that legalization was creating a cascade of SELECT_CC nodes, one
for the comparison of "argc == 1123" which was fed into a SELECT node for the ?:
statement which was itself converted to a SELECT_CC node. This is because the
ARM back-end doesn't have custom lowering for SELECT nodes, so it used the
default "Expand".
I added a fairly simple "LowerSELECT" to the ARM back-end. It takes care of this
testcase, but can obviously be expanded to include more cases.
Now we generate this, which looks optimal to me:
_t1:
movw r1, #1123
movs r2, #0
cmp r0, r1
adr r0, #LCPI0_0
it eq
moveq r2, #4
ldr r0, [r0, r2]
bx lr
.align 2
LCPI0_0:
.long 1075344593 @ float 2.382130e+00
.long 1067316150 @ float 1.234000e+00
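For readers who prefer source to assembly, the improved sequence can be modelled roughly as below. This is only a sketch of what the generated code computes, not of LowerSELECT itself; kPool and t1Indexed are made-up names, and the pool order is copied from LCPI0_0 above:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes 32-bit float");

// Constant pool in the same order as LCPI0_0.
static const float kPool[2] = {2.38213f, 1.234f};

// Mirrors the new code: pick a byte offset with one conditional move,
// then do a single indexed load from the constant pool.
inline float t1Indexed(int argc) {
    std::size_t offset = 0;   // movs r2, #0
    if (argc == 1123)         // cmp r0, r1 ; it eq
        offset = 4;           // moveq r2, #4
    assert(offset + sizeof(float) <= sizeof kPool);
    float result;
    std::memcpy(&result,
                reinterpret_cast<const unsigned char *>(kPool) + offset,
                sizeof result);  // ldr r0, [r0, r2]
    return result;
}

// The original function, for comparison.
inline float t1Reference(int argc) {
    return (argc == 1123) ? 1.234f : 2.38213f;
}
```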
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110799 91177308-0d34-0410-b5e6-96231b3b80d8
memory and synchronization barrier dmb and dsb instructions.
- Change instruction names to something more sensible (matching name of actual
instructions).
- Added tests for memory barrier codegen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110785 91177308-0d34-0410-b5e6-96231b3b80d8
for some reason they have a very odd MCInst form where the operands overlap, but
I haven't dug in to find out why yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110781 91177308-0d34-0410-b5e6-96231b3b80d8
(I discovered 2 more copies of the ARM instruction format list, bringing the
total to 4!! Two of them were already out of sync. I haven't yet gotten into
the disassembler enough to know the best way to fix this, but something needs
to be done.) Add support for encoding these instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110754 91177308-0d34-0410-b5e6-96231b3b80d8
Apply the same approach of SSE4.1 ptest intrinsics but
create a new x86 node "testp" since AVX introduces
vtest{ps}{pd} instructions which set ZF and CF depending
on sign bit AND and ANDN of packed floating-point sources.
This is slightly different from what the "ptest" does.
Tests coming with the other 256 intrinsics tests.
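A rough model of the flag computation for the 4 x float case, to show how it differs from "ptest", which looks at all bits rather than only the sign bits. This is a sketch: Lanes, Flags and testpModel are hypothetical names, and which operand gets complemented in the ANDN step follows my reading of the instruction reference, so treat that detail as an assumption:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Lanes = std::array<std::uint32_t, 4>;  // bit patterns of 4 packed floats

struct Flags {
    bool zf;  // set when no lane of (src AND dest) has its sign bit set
    bool cf;  // set when no lane of (src AND NOT dest) has its sign bit set
};

constexpr std::uint32_t kSign = 0x80000000u;

inline Flags testpModel(const Lanes &dest, const Lanes &src) {
    std::uint32_t andSigns = 0;
    std::uint32_t andnSigns = 0;
    for (std::size_t i = 0; i < dest.size(); ++i) {
        andSigns |= (src[i] & dest[i]) & kSign;
        andnSigns |= (src[i] & ~dest[i]) & kSign;
    }
    return Flags{andSigns == 0, andnSigns == 0};
}

// Usage: all-positive dest against all-negative src.
inline void exampleUsage() {
    const Lanes positive = {0, 0, 0, 0};
    const Lanes negative = {kSign, kSign, kSign, kSign};
    Flags f = testpModel(positive, negative);
    assert(f.zf && !f.cf);
}
```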
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110744 91177308-0d34-0410-b5e6-96231b3b80d8
Also added a test case to check for the added benefit of this patch: it's optimizing away the unnecessary restore of sp from fp for some non-leaf functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110707 91177308-0d34-0410-b5e6-96231b3b80d8
reserved, not available for general allocation. This eliminates all the
extra checks for Darwin.
This change also fixes the use of FP to access frame indices in leaf
functions and cleans up some confusing code in epilogue emission.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110655 91177308-0d34-0410-b5e6-96231b3b80d8
This will always be false before PEI:
(DisableFramePointerElim(MF) && MFI->adjustsStack())
Which means it's going to make r11 available as a general purpose register even
if -disable-fp-elim is specified. It's working on Darwin only because r7 is
always reserved. But it's obviously broken for other targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110614 91177308-0d34-0410-b5e6-96231b3b80d8
Next time the build is broken due to wrong library dependencies, just
try building again (if you are on some Unix and are building all LLVM
targets) or ask someone to commit the regenerated LLVMLibDeps.cmake.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110593 91177308-0d34-0410-b5e6-96231b3b80d8
relatively expensive comparison analyzer on each instruction. Also rename the
comparison analyzer method to something more in line with what it actually does.
This pass will eventually be folded into the Machine CSE pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110539 91177308-0d34-0410-b5e6-96231b3b80d8
form of CMPSD (etc.) Matching a 128-bit memory
operand is wrong; the instruction uses only 64 bits
(same as ADDSD etc.) 8193553.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110491 91177308-0d34-0410-b5e6-96231b3b80d8
implementation of the function is equivalent, so no need to provide
the target-specific version until/unless it needs to do something.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110465 91177308-0d34-0410-b5e6-96231b3b80d8
Without this what was happening was:
* R3 is not marked as "used"
* ARM backend thinks it has to save it to the stack because of vaarg
* Offset computation correctly ignores it
* Offsets are wrong
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110446 91177308-0d34-0410-b5e6-96231b3b80d8
This pass tries to remove comparison instructions when possible. For instance,
if you have this code:
sub r1, 1
cmp r1, 0
bz L1
and "sub" either sets the same flag as the "cmp" instruction or could be
converted to set the same flag, then we can eliminate the "cmp" instruction
altogether. This is important for ARM where the ALU instructions could set the
CPSR flag, but need a special suffix ('s') to do so.
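The idea can be sketched on a toy instruction list. This is not the pass's actual implementation: Inst and removeRedundantCompares are hypothetical, only an immediately preceding "sub" or "add" is considered, and the sketch assumes the branch only needs the zero flag (as with "bz"):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy instruction: opcode, the register it defines or tests, an
// immediate, and whether it updates the flags (the ARM 's' suffix).
struct Inst {
    std::string op;
    std::string reg;
    int imm;
    bool setsFlags;
};

// Drops "cmp r, 0" when the previous instruction is an ALU op defining
// r, and marks that ALU op as flag-setting instead.
inline std::vector<Inst> removeRedundantCompares(const std::vector<Inst> &in) {
    std::vector<Inst> out;
    for (const Inst &inst : in) {
        const bool isCmpZero = inst.op == "cmp" && inst.imm == 0;
        if (isCmpZero) {
            assert(!inst.reg.empty() && "cmp needs a register operand");
            if (!out.empty()) {
                Inst &prev = out.back();
                const bool canSetFlags = prev.op == "sub" || prev.op == "add";
                if (canSetFlags && prev.reg == inst.reg) {
                    prev.setsFlags = true;  // "sub" becomes "subs"
                    continue;               // the cmp is dropped
                }
            }
        }
        out.push_back(inst);
    }
    return out;
}
```

On the three-instruction example above, this leaves a flag-setting "sub" followed directly by the branch.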
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110423 91177308-0d34-0410-b5e6-96231b3b80d8
register for local access when it's closer to the stack slot being referenced
than the stack pointer. Make sure to take into account any argument frame
SP adjustments that are in effect at the time.
rdar://8256090
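One way to picture the choice, as a sketch only: compare the distance from each candidate base register to the slot, after applying whatever SP adjustment is currently active. BaseReg, FrameRef and chooseBase are hypothetical names, and plain integers stand in for addresses; the real code works on frame indices and target offsets.

```cpp
#include <cassert>
#include <cstdlib>

enum class BaseReg { SP, FP };

struct FrameRef {
    BaseReg base;
    long offset;  // slot address minus base register value
};

// spAdjustment is the number of bytes SP has been moved down for
// outgoing call arguments at the point of the access.
inline FrameRef chooseBase(long slotAddr, long fpAddr, long spAddr,
                           long spAdjustment) {
    assert(spAdjustment >= 0 && "adjustment is a byte count");
    const long effectiveSP = spAddr - spAdjustment;
    const long fromSP = slotAddr - effectiveSP;
    const long fromFP = slotAddr - fpAddr;
    if (std::labs(fromFP) < std::labs(fromSP))
        return FrameRef{BaseReg::FP, fromFP};
    return FrameRef{BaseReg::SP, fromSP};
}
```

With a slot at 50, FP at 104 and SP at 0, the sketch picks SP. Once a 16-byte argument adjustment is in effect, the same slot is closer to FP.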
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110366 91177308-0d34-0410-b5e6-96231b3b80d8