IR add/sub operations with one or both operands sign- or zero-extended.
Auto-upgrade the old intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112416 91177308-0d34-0410-b5e6-96231b3b80d8
the special values that for ARM would be used with IB or DA modes. Fall
through and consider materializing a new base address is it would be
profitable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112329 91177308-0d34-0410-b5e6-96231b3b80d8
all the other LDM/STM instructions. This fixes asm printer crashes when
compiling with -O0. I've changed one of the NEON tests (vst3.ll) to run
with -O0 to check this in the future.
Prior to this change VLDM/VSTM used addressing mode #5, but not really.
The offset field was used to hold a count of the number of registers being
loaded or stored, and the AM5 opcode field was expanded to specify the IA
or DB mode, instead of the standard ADD/SUB specifier. Much of the backend
was not aware of these special cases. The crashes occured when rewriting
a frameindex caused the AM5 offset field to be changed so that it did not
have a valid submode. I don't know exactly what changed to expose this now.
Maybe we've never done much with -O0 and NEON. Regardless, there's no longer
any reason to keep a count of the VLDM/VSTM registers, so we can use
addressing mode #4 and clean things up in a lot of places.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112322 91177308-0d34-0410-b5e6-96231b3b80d8
still having a significant effect. It shouldn't be now that the pre-RA
virtual base reg stuff is in. Assuming that's valididated by the nightly
testers, we can simplify a lot of the PEI frame index code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112220 91177308-0d34-0410-b5e6-96231b3b80d8
comparison with 0. These two pieces of code should give identical results:
rsbs r1, r1, 0
cmp r0, r1
mov r0, #0
it ls
mov r0, #1
and:
cmn r0, r1
mov r0, #0
it ls
mov r0, #1
However, the CMN gives the *opposite* result when r1 is 0. This is because the
carry flag is set in the CMP case but not in the CMN case. In short, the CMP
instruction doesn't perform a truncate of the (logical) NOT of 0 plus the value
of r0 and the carry bit (because the "carry bit" parameter to AddWithCarry is
defined as 1 in this case, the carry flag will always be set when r0 >= 0). The
CMN instruction doesn't perform a NOT of 0 so there is never a "carry" when this
AddWithCarry is performed (because the "carry bit" parameter to AddWithCarry is
defined as 0).
The AddWithCarry in the CMP case seems to be relying upon the identity:
~x + 1 = -x
However when x is 0 and unsigned, this doesn't hold:
x = 0
~x = 0xFFFF FFFF
~x + 1 = 0x1 0000 0000
(-x = 0) != (0x1 0000 0000 = ~x + 1)
Therefore, we should disable *all* versions of CMN, especially when comparing
against zero, until we can limit when the CMN instruction is used (when we know
that the RHS is not 0) or when we have a hardware fix for this.
(See the ARM docs for the "AddWithCarry" pseudo-code.)
This is related to <rdar://problem/7569620>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112176 91177308-0d34-0410-b5e6-96231b3b80d8
with the VST4 instructions. Until after register allocation, we want to
represent sets of adjacent registers by a single super-register. These
VST4 pseudo instructions have a single QQ or QQQQ source register operand.
They get expanded to the real VST4 instructions with 4 separate D register
operands. Once this conversion is complete, we'll be able to remove the
NEONPreAllocPass and avoid some fragile and hacky code elsewhere.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112108 91177308-0d34-0410-b5e6-96231b3b80d8
comparison that would overflow.
- The other under/overflow cases can't actually happen because the immediates
which would trigger them are legal (so we don't enter this code), but
adjusted the style to make it clear the transform is always valid.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112053 91177308-0d34-0410-b5e6-96231b3b80d8
For now it's still a command line option, but the interface to the generic
code doesn't need to know that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111942 91177308-0d34-0410-b5e6-96231b3b80d8
Intended to help ease reproducing problems by increasing base register usage
after heuristics for only using the when needed are in place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111930 91177308-0d34-0410-b5e6-96231b3b80d8
that it needs to go away. Use loadRegFromStackSlot where possible.
Also, remember to update the value map.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111883 91177308-0d34-0410-b5e6-96231b3b80d8
frame index reference to an object in the local block is seen, check if
it's near enough to any previously allocaated base register to re-use.
rdar://8277890
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111443 91177308-0d34-0410-b5e6-96231b3b80d8
Nothing fancy, just ask the target if any currently available base reg
is in range for the instruction under consideration and use the first one
that is. Placeholder ARM implementation simply returns false for now.
ongoing saga of rdar://8277890
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111374 91177308-0d34-0410-b5e6-96231b3b80d8
the local block. Resolve references to those indices to a new base register.
For simplification and testing purposes, a new virtual base register is
allocated for each frame index being resolved. The result is truly horrible,
but correct, code that's good for exercising the new code paths.
Next up is adding thumb1 support, which should be very simple. Following that
will be adding base register re-use and implementing a reasonable ARM
heuristic for when a virtual base register should be generated at all.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111315 91177308-0d34-0410-b5e6-96231b3b80d8
whether to allocate a virtual frame base register to resolve the frame
index reference in it. Implement a simple version for ARM to aid debugging.
In LocalStackSlotAllocation, scan the function for frame index references
to local frame indices and ask the target whether to allocate virtual
frame base registers for any it encounters. Purely infrastructural for
debug output. Next step is to actually allocate base registers, then add
intelligent re-use of them.
rdar://8277890
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111262 91177308-0d34-0410-b5e6-96231b3b80d8
printing "lsl #0". This fixes the remaining parts of pr7792. Make
corresponding changes for encoding/decoding these instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111251 91177308-0d34-0410-b5e6-96231b3b80d8
the memory barrier variants (other than 'SY' full system domain read and write)
are treated as one instruction with option operand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110951 91177308-0d34-0410-b5e6-96231b3b80d8
entry for ARM STRBT is actually a super-instruction for A8.6.199 STRBT A1 & A2.
Recover by looking for ARM:USAT encoding pattern before delegating to the auto-
gened decoder.
Added a "usat" test case to arm-tests.txt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110894 91177308-0d34-0410-b5e6-96231b3b80d8
float t1(int argc) {
return (argc == 1123) ? 1.234f : 2.38213f;
}
We would generate truly awful code on ARM (those with a weak stomach should look
away):
_t1:
movw r1, #1123
movs r2, #1
movs r3, #0
cmp r0, r1
mov.w r0, #0
it eq
moveq r0, r2
movs r1, #4
cmp r0, #0
it ne
movne r3, r1
adr r0, #LCPI1_0
ldr r0, [r0, r3]
bx lr
The problem was that legalization was creating a cascade of SELECT_CC nodes, for
for the comparison of "argc == 1123" which was fed into a SELECT node for the ?:
statement which was itself converted to a SELECT_CC node. This is because the
ARM back-end doesn't have custom lowering for SELECT nodes, so it used the
default "Expand".
I added a fairly simple "LowerSELECT" to the ARM back-end. It takes care of this
testcase, but can obviously be expanded to include more cases.
Now we generate this, which looks optimal to me:
_t1:
movw r1, #1123
movs r2, #0
cmp r0, r1
adr r0, #LCPI0_0
it eq
moveq r2, #4
ldr r0, [r0, r2]
bx lr
.align 2
LCPI0_0:
.long 1075344593 @ float 2.382130e+00
.long 1067316150 @ float 1.234000e+00
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110799 91177308-0d34-0410-b5e6-96231b3b80d8
memory and synchronization barrier dmb and dsb instructions.
- Change instruction names to something more sensible (matching name of actual
instructions).
- Added tests for memory barrier codegen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110785 91177308-0d34-0410-b5e6-96231b3b80d8
for some reason they have a very odd MCInst form where the operands overlap, but
I haven't dug in to find out why yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110781 91177308-0d34-0410-b5e6-96231b3b80d8