Commit Graph

402 Commits

Author SHA1 Message Date
Jim Grosbach
bb3a2e4d0d ARM NEON refactor VST2 w/ writeback instructions.
In addition to improving the representation, this adds support for assembly
parsing of these instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146588 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-14 21:32:11 +00:00
Jim Grosbach
a4e3c7fc4b ARM assembly parsing and encoding for VLD2 with writeback.
Refactor the instructions into fixed writeback and register-stride
writeback variants to simplify the offset operand (no more optional
register operand using reg0). This is a simpler representation and allows
the assembly parser to more easily handle these instructions.

Add tests for the instruction variants now supported.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146278 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-09 21:28:25 +00:00
Jim Grosbach
4c7edb3ad8 ARM assembly parsing and encoding for four-register VST1.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145450 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-29 22:58:48 +00:00
Jim Grosbach
d5ca201891 ARM assembly parsing and encoding for three-register VST1.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@145442 91177308-0d34-0410-b5e6-96231b3b80d8
2011-11-29 22:38:04 +00:00
Jim Grosbach
4334e03252 ARM VST1 w/ writeback assembly parsing and encoding.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143369 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-31 21:50:31 +00:00
Jakob Stoklund Olesen
b0117eed84 Also set addrmode6 alignment when align==size.
Previously, we were only setting the alignment bits on over-aligned
loads and stores.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143160 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-27 22:39:16 +00:00
Jim Grosbach
55dabaa73a ARM isel for vld1, opcode selection for register stride post-index pseudos.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@143158 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-27 22:25:42 +00:00
Jim Grosbach
10b90a9bbf ARM refactor am6offset usage for VLD1.
Split am6offset into fixed and register offset variants so the instruction
encodings are explicit rather than relying an a magic reg0 marker.
Needed to being able to parse these.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142853 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-24 21:45:13 +00:00
Eli Friedman
0851a29b6d Fix misc warnings. Patch by Joe Abbey.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@142332 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-18 03:17:34 +00:00
Bill Wendling
ef2c86f876 Reapply r141365 now that PR11107 is fixed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141591 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-10 22:59:55 +00:00
Bill Wendling
eba564ceac Revert r141365. It was causing MultiSource/Benchmarks/MiBench/consumer-lame to
hang, and possibly SPEC/CINT2006/464_h264ref.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141560 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-10 18:27:30 +00:00
Anton Korobeynikov
2d4b60f3a4 Disable ABS optimization for Thumb1 target, we don't have necessary instructions there.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141481 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-08 08:38:45 +00:00
Anton Korobeynikov
244455e6d6 Peephole optimization for ABS on ARM.
Patch by Ana Pazos!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141365 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-07 16:15:08 +00:00
Cameron Zwarich
8f8aa815b4 Always merge profitable shifts on A9, not just when they have a single use.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141248 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-05 23:39:02 +00:00
Cameron Zwarich
d78ebe1e12 Remove a check from ARM shifted operand isel helper methods, which were blocking
merging an lsl #2 that has multiple uses on A9. This shift is free, so there is
no problem merging it in multiple places. Other unprofitable shifts will not be
merged.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141247 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-05 23:38:50 +00:00
Cameron Zwarich
fb77752253 Add braces around something that throws me for a loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141173 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-05 08:59:10 +00:00
Cameron Zwarich
5ee0262014 There is no point in setting out-parameters for a ComplexPattern function when
it returns false, at least as far as I could tell by reading the code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141172 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-05 08:59:05 +00:00
Jakob Stoklund Olesen
11ebe3d7c1 Also match negative offsets for addrmode3 and addrmode5.
Math is hard, and isScaledConstantInRange() always returned false for
negative constants.  It was doing unsigned division of negative numbers
before casting back to signed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140425 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-23 22:10:33 +00:00
Jim Grosbach
b04546ff5b Tidy up a few 80 column violations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139636 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-13 20:30:37 +00:00
Owen Anderson
d84192fe4f When performing instruction selection for LDR_PRE_IMM/LDRB_PRE_IMM, we still need to preserve the sign of the index. This fixes miscompilations of Quicksort in the nightly testsuite, and hopefully others as well.
<rdar://problem/10046188>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138885 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-31 20:00:11 +00:00
Eli Friedman
4d3f329453 64-bit atomic cmpxchg for ARM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138868 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-31 17:52:22 +00:00
Eli Friedman
2bdffe4882 Some 64-bit atomic operations on ARM. 64-bit cmpxchg coming next.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138845 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-31 00:31:29 +00:00
Owen Anderson
c4e16de765 addrmode_imm12 and addrmode2_offset encode their immediate values differently. Update the manual instruction selection code that was encoding them the addrmode2 way even though LDR_PRE_IMM/LDRB_PRE_IMM had switched to addrmode_imm12. Should fix a number of nightly test failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138758 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-29 20:16:50 +00:00
Owen Anderson
2b568fb3ce Fix ARM codegen breakage caused by r138653.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138657 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-26 21:12:37 +00:00
Owen Anderson
9ab0f25fc1 invalid-LDR_PRE-arm.txt was already passing, but for the wrong reasons. We were failing to specify enough fixed bits of LDR_PRE/LDRB_PRE, resulting in decoding conflicts. Separate them into immediate vs. register versions, allowing us to specify the necessary fixed bits. This in turn results in the test being decoded properly, and being rejected as UNPREDICTABLE rather than a hard failure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138653 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-26 20:43:14 +00:00
Jim Grosbach
5b81584f74 Thumb1 ADD/SUB SP instructions are predicable in Thumb2 mode.
Add the predicate operand to the instructions. Update the back end
accordingly where the instructions are used. Restrict the SP operands
to actually only be SP, as otherwise these break assembly parsing for the
normal instruction variants.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138445 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-24 17:46:13 +00:00
Jim Grosbach
19dec207fc ARM refactor indexed store instructions.
Refactor STR[B] pre and post indexed instructions to use addressing modes for
memory operands, which is necessary for assembly parsing and is more consistent
with the rest of the memory instruction definitions. Make some incremental
progress on refactoring away the mega-operand addrmode2 along the way, which
is nice.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136978 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-05 20:35:44 +00:00
Jim Grosbach
fb8989e640 ARM parsing and encoding of SBFX and UBFX.
Encode the width operand as it encodes in the instruction, which simplifies
the disassembler and the encoder, by using the imm1_32 operand def. Add a
diagnostic for the context-sensitive constraint that the width must be in
the range [1,32-lsb].


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136264 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-27 21:09:25 +00:00
Owen Anderson
793e79601f Split am2offset into register addend and immediate addend forms, necessary for allowing the fixed-length disassembler to distinguish between SBFX and STR_PRE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136141 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-26 20:54:26 +00:00
Owen Anderson
e0a03143df Fix test failures caused by my so_reg refactoring.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135785 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-22 18:30:30 +00:00
Owen Anderson
152d4a4bb6 Get rid of the extraneous GPR operand on so_reg_imm operands, which in turn necessitates a lot of changes to related bits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135722 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-21 23:38:37 +00:00
Owen Anderson
92a202213b Split up the ARM so_reg ComplexPattern into so_reg_reg and so_reg_imm, allowing us to distinguish the encodings that use shifted registers from those that use shifted immediates. This is necessary to allow the fixed-length decoder to distinguish things like BICS vs LDRH.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135693 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-21 18:54:16 +00:00
Evan Cheng
ee04a6d3a4 Sink ARMMCExpr and ARMAddressingModes into MC layer. First step to separate ARM MC code from target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135636 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-20 23:34:39 +00:00
Evan Cheng
e837dead3c - Rename TargetInstrDesc, TargetOperandInfo to MCInstrDesc and MCOperandInfo and
sink them into MC layer.
- Added MCInstrInfo, which captures the tablegen generated static data. Chang
TargetInstrInfo so it's based off MCInstrInfo.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134021 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-28 19:10:37 +00:00
Owen Anderson
1300f3019e Change the REG_SEQUENCE SDNode to take an explict register class ID as its first operand. This operand is lowered away by the time we reach MachineInstrs, so the actual register-allocation handling of them doesn't need to change.
This is intended to support using REG_SEQUENCE SDNode's with type MVT::untyped, and is part of the long road to eliminating some of the hacks we currently use to support register pairs and other strange constraints, particularly on ARM NEON.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133178 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-16 18:17:13 +00:00
Bruno Cardoso Lopes
a0112d0c39 Add support for ARM ldrexd/strexd intrinsics. They both use i32 register pairs
to load/store i64 values. Since there's no current support to explicitly
declare such restrictions, implement it by using specific hardcoded register
pairs during isel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132248 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-28 04:07:29 +00:00
Eli Friedman
b451770b26 Zap a couple now-unused functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130557 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-29 22:56:48 +00:00
Bob Wilson
84c5eed15b This patch combines several changes from Evan Cheng for rdar://8659675.
Making use of VFP / NEON floating point multiply-accumulate / subtraction is
difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
   of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause
   additional pipeline stall. So it's frequently better to single codegen
   vmul + vadd.
2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to
   stall for 4 cycles. We need to schedule them apart.
3. A vmla followed vmla is a special case. Obvious issuing back to back RAW
   vmla + vmla is very bad. But this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
   faster.

Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough
but it isn't the optimial solution. This patch attempts to make it possible to
use vmla / vmls in cases where it is profitable.

A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
   compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoid cases where vmla is feeding into
   fp instructions (except for the #3 exceptional case).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
   vmla / vmls will trigger one of the special hazards.

Enable these fp vmlx codegen changes for Cortex-A9.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129775 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-19 18:11:57 +00:00
Evan Cheng
b58a340fa2 Do not lose mem_operands while lowering VLD / VST intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129738 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-19 00:04:03 +00:00
Owen Anderson
099e5553eb Reduce code duplication.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127899 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-18 19:46:58 +00:00
Bill Wendling
69a05a7b92 Generate a VTBL instruction instead of a series of loads and stores when we
can. As Nate pointed out, VTBL isn't super performant, but it *has* to be better
than this:

_shuf:
@ BB#0:       @ %entry
  push        {r4, r7, lr}
  add         r7, sp, #4
  sub         sp, #12
  mov         r4, sp
  bic         r4, r4, #7
  mov         sp, r4
  mov         r2, sp
  vmov        d16, r0, r1
  orr         r0, r2, #6
  orr         r3, r2, #7
  vst1.8      {d16[0]}, [r3]
  vst1.8      {d16[5]}, [r0]
  subs        r4, r7, #4
  orr         r0, r2, #5
  vst1.8      {d16[4]}, [r0]
  orr         r0, r2, #4
  vst1.8      {d16[4]}, [r0]
  orr         r0, r2, #3
  vst1.8      {d16[0]}, [r0]
  orr         r0, r2, #2
  vst1.8      {d16[2]}, [r0]
  orr         r0, r2, #1
  vst1.8      {d16[1]}, [r0]
  vst1.8      {d16[3]}, [r2]
  vldr.64     d16, [sp]
  vmov        r0, r1, d16
  mov         sp, r4
  pop         {r4, r7, pc}

The "illegal" testcase in vext.ll is no longer illegal.
<rdar://problem/9078775>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127630 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-14 23:02:38 +00:00
Jim Grosbach
3c5edaaf59 Remove dead code. These ARM instruction definitions no longer exist.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127509 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-11 23:15:02 +00:00
Bob Wilson
4faa0e1952 Remove unused conditional negate operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127090 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-05 16:54:31 +00:00
Bob Wilson
da52506792 Add patterns to use post-increment addressing for Neon VST1-lane instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126477 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-25 06:42:42 +00:00
Chris Lattner
0a9481f44f Enhance ComputeMaskedBits to know that aligned frameindexes
have their low bits set to zero.  This allows us to optimize
out explicit stack alignment code like in stack-align.ll:test4 when
it is redundant.

Doing this causes the code generator to start turning FI+cst into
FI|cst all over the place, which is general goodness (that is the
canonical form) except that various pieces of the code generator
don't handle OR aggressively.  Fix this by introducing a new
SelectionDAG::isBaseWithConstantOffset predicate, and using it
in places that are looking for ADD(X,CST).  The ARM backend in
particular was missing a lot of addressing mode folding opportunities
around OR.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125470 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-13 22:25:43 +00:00
Bob Wilson
1c3ef90cab Add codegen support for using post-increment NEON load/store instructions.
The vld1-lane, vld1-dup and vst1-lane instructions do not yet support using
post-increment versions, but all the rest of the NEON load/store instructions
should be handled now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125014 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-07 17:43:21 +00:00
Bob Wilson
7de6814405 Change VLD3/4 and VST3/4 for quad registers to not update the address register.
These operations are expanded to pairs of loads or stores, and the first one
uses the address register update to produce the address for the second one.
So far, the second load/store has also updated the address register, just
for convenience, since that output has never been used.  In anticipation of
actually supporting post-increment updates for these operations, this changes
the non-updating operations to use a non-updating load/store for the second
instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125013 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-07 17:43:15 +00:00
Evan Cheng
9fe2009956 Sorry, several patches in one.
TargetInstrInfo:
Change produceSameValue() to take MachineRegisterInfo as an optional argument.
When in SSA form, targets can use it to make more aggressive equality analysis.

Machine LICM:
1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead.
2. Fix a bug which prevent CSE of instructions which are not re-materializable.
3. Use improved form of produceSameValue.

ARM:
1. Teach ARM produceSameValue to look pass some PIC labels.
2. Look for operands from different loads of different constant pool entries
   which have same values.
3. Re-implement PIC GA materialization using movw + movt. Combine the pair with
   a "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible
   to re-materialize the instruction, allow machine LICM to hoist the set of
   instructions out of the loop and make it possible to CSE them. It's a bit
   hacky, but it significantly improve code quality.
4. Some minor bug fixes as well.

With the fixes, using movw + movt to materialize GAs significantly outperform the
load from constantpool method. 186.crafty and 255.vortex improved > 20%, 254.gap
and 176.gcc ~10%.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123905 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-20 08:34:58 +00:00
Daniel Dunbar
ec91d52a77 ARM/ISel: Factor out isScaledConstantInRange() helper.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123823 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-19 15:12:16 +00:00
Evan Cheng
5de5d4b6d0 Materialize GA addresses with movw + movt pairs for Darwin in PIC mode. e.g.
movw    r0, :lower16:(L_foo$non_lazy_ptr-(LPC0_0+4))
        movt    r0, :upper16:(L_foo$non_lazy_ptr-(LPC0_0+4))
LPC0_0:
        add     r0, pc, r0

It's not yet enabled by default as some tests are failing. I suspect bugs in
down stream tools.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123619 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-17 08:03:18 +00:00