64 Commits

Author SHA1 Message Date
Evan Cheng
48575f6ea7 Making use of VFP / NEON floating point multiply-accumulate / subtraction is
difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
   of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause
   additional pipeline stall. So it's frequently better to single codegen
   vmul + vadd.
2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to
   stall for 4 cycles. We need to schedule them apart.
3. A vmla followed vmla is a special case. Obvious issuing back to back RAW
   vmla + vmla is very bad. But this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
   faster.

Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough
but it isn't the optimial solution. This patch attempts to make it possible to
use vmla / vmls in cases where it is profitable.

A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
   compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoid cases where vmla is feeding into
   fp instructions (except for the #3 exceptional case).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
   vmla / vmls will trigger one of the special hazards.

Work in progress, only A+B are enabled.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120960 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-05 22:04:16 +00:00
Evan Cheng
8239daf7c8 Two sets of changes. Sorry they are intermingled.
1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to
   "optimize for latency". Call instructions don't have the right latency and
   this is more likely to use introduce spills.
2. Fix if-converter cost function. For ARM, it should use instruction latencies,
   not # of micro-ops since multi-latency instructions is completely executed
   even when the predicate is false. Also, some instruction will be "slower"
   when they are predicated due to the register def becoming implicit input.
   rdar://8598427


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118135 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-03 00:45:17 +00:00
Owen Anderson
e3cc84a43d Thread the determination of branch prediction hit rates back through the if-conversion heuristic APIs. For now,
stick with a constant estimate of 90% (branch predictors are good!), but we might find that we want to provide
more nuanced estimates in the future.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115364 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-01 22:45:50 +00:00
Owen Anderson
aa9f0a57d0 Provide an option to restore old-style if-conversion heuristics for Thumb2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115339 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-01 20:28:06 +00:00
Owen Anderson
b20b85168c Part one of switching to using a more sane heuristic for determining if-conversion profitability.
Rather than having arbitrary cutoffs, actually try to cost model the conversion.

For now, the constants are tuned to more or less match our existing behavior, but these will be
changed to reflect realistic values as this work proceeds.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114973 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-28 18:32:13 +00:00
Chris Lattner
59db5496f4 convert targets to the new MF.getMachineMemOperand interface.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114391 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-21 04:39:43 +00:00
Evan Cheng
3ef1c8759a Teach if-converter to be more careful with predicating instructions that would
take multiple cycles to decode.
For the current if-converter clients (actually only ARM), the instructions that
are predicated on false are not nops. They would still take machine cycles to
decode. Micro-coded instructions such as LDM / STM can potentially take multiple
cycles to decode. If-converter should take treat them as non-micro-coded
simple instructions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113570 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-10 01:29:16 +00:00
Jim Grosbach
6ccfc507dc Many Thumb2 instructions can reference the full ARM register set (i.e.,
have 4 bits per register in the operand encoding), but have undefined
behavior when the operand value is 13 or 15 (SP and PC, respectively).
The trivial coalescer in linear scan sometimes will merge a copy from
SP into a subsequent instruction which uses the copy, and if that
instruction cannot legally reference SP, we get bad code such as:
  mls r0,r9,r0,sp
instead of:
  mov r2, sp
  mls r0, r9, r0, r2

This patch adds a new register class for use by Thumb2 that excludes
the problematic registers (SP and PC) and is used instead of GPR
for those operands which cannot legally reference PC or SP. The
trivial coalescer explicitly requires that the register class
of the destination for the COPY instruction contain the source
register for the COPY to be considered for coalescing. This prevents
errant instructions like that above.

PR7499




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109842 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-30 02:41:01 +00:00
Jakob Stoklund Olesen
ac27366700 Replace copyRegToReg with copyPhysReg for ARM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108078 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-11 06:33:54 +00:00
Bob Wilson
1ab38469df The t2MOVi16 and t2MOVTi16 instructions do not set CPSR. Trying to add
a CPSR operand to them causes an assertion failure, so apparently these
instructions haven't been getting a lot of use.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107147 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-29 16:25:11 +00:00
Evan Cheng
c170f66742 Change if-cvt options to something that actually as useable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107121 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-29 05:37:59 +00:00
Evan Cheng
13151432ed Change if-conversion block size limit checks to add some flexibility.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106901 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-25 22:42:03 +00:00
Evan Cheng
4d54e5b2dd Tail merging pass shall not break up IT blocks. rdar://8115404
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106517 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-22 01:18:16 +00:00
Evan Cheng
86050dc8cc Allow ARM if-converter to be run after post allocation scheduling.
- This fixed a number of bugs in if-converter, tail merging, and post-allocation
  scheduler. If-converter now runs branch folding / tail merging first to
  maximize if-conversion opportunities.
- Also changed the t2IT instruction slightly. It now defines the ITSTATE
  register which is read by instructions in the IT block.
- Added Thumb2 specific hazard recognizer to ensure the scheduler doesn't
  change the instruction ordering in the IT block (since IT mask has been
  finalized). It also ensures no other instructions can be scheduled between
  instructions in the IT block.

This is not yet enabled.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106344 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-18 23:09:54 +00:00
Dale Johannesen
6470a116f1 Next round of tail call changes. Register used in a tail
call must not be callee-saved; following x86, add a new
regclass to represent this.  Also fixes a couple of bugs.
Still disabled by default; Thumb doesn't work yet.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106053 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-15 22:08:33 +00:00
Evan Cheng
68fc2daf8f Allow target to place 2-address pass inserted copies in better spots. Thumb2 will use this to try to avoid breaking up IT blocks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105745 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-09 19:26:01 +00:00
Jim Grosbach
18f30e6f5e Clean up 80 column violations. No functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105350 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-02 21:53:11 +00:00
Dan Gohman
34dcc6fadc Add a DebugLoc argument to TargetInstrInfo::copyRegToReg, so that it
doesn't have to guess.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103194 91177308-0d34-0410-b5e6-96231b3b80d8
2010-05-06 20:33:48 +00:00
Evan Cheng
746ad69e08 Add argument TargetRegisterInfo to loadRegFromStackSlot and storeRegToStackSlot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103193 91177308-0d34-0410-b5e6-96231b3b80d8
2010-05-06 19:06:44 +00:00
Bob Wilson
5dfa87ecc6 Handle register-to-register copies within the tGPR class.
Radar 7896289


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@102396 91177308-0d34-0410-b5e6-96231b3b80d8
2010-04-26 23:20:08 +00:00
Chris Lattner
c7f3ace20c use DebugLoc default ctor instead of DebugLoc::getUnknownLoc()
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100214 91177308-0d34-0410-b5e6-96231b3b80d8
2010-04-02 20:16:16 +00:00
Jim Grosbach
9ab0427015 Thumb2 storeFrom/LoadToStackSlot() need to handle tGPR regs directly, not pass
through to the generic version. The generic functions use STR/LDR, but T2
needs the t2STR/t2LDR instead so we get the addressing mode correct.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99678 91177308-0d34-0410-b5e6-96231b3b80d8
2010-03-27 00:09:12 +00:00
Bob Wilson
f5fd499791 Fix a crash compiling 254.gap for Thumb2. The Thumb2 add/sub with 12-bit
immediate instructions cannot set the condition codes, so they do not have
the extra cc_out operand.  We hit an assertion during tail duplication
because the instruction being duplicated had more operands that expected.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@98001 91177308-0d34-0410-b5e6-96231b3b80d8
2010-03-08 22:56:15 +00:00
Bob Wilson
e6373eb826 Handle AddrMode6 (for NEON load/stores) in Thumb2's rewriteT2FrameIndex.
Radar 7614112.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95456 91177308-0d34-0410-b5e6-96231b3b80d8
2010-02-06 00:24:38 +00:00
Jakob Stoklund Olesen
35f0febcb6 Remove predicates when changing an add into an unpredicable mov.
Since the mov is executed unconditionally, make sure that the add didn't have
any predicate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93909 91177308-0d34-0410-b5e6-96231b3b80d8
2010-01-19 21:08:28 +00:00
Dan Gohman
864e2efce2 Remove the target hook TargetInstrInfo::BlockHasNoFallThrough in favor of
MachineBasicBlock::canFallThrough(), which is target-independent and more
thorough.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@90634 91177308-0d34-0410-b5e6-96231b3b80d8
2009-12-05 00:44:40 +00:00
Evan Cheng
fdc834046e Refactor code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@86423 91177308-0d34-0410-b5e6-96231b3b80d8
2009-11-08 00:15:23 +00:00
Jim Grosbach
31c24bf5b3 80-column cleanup of file header comments
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@86408 91177308-0d34-0410-b5e6-96231b3b80d8
2009-11-07 22:00:39 +00:00
Evan Cheng
bf992817f2 t2ldrpci_pic can be used for blockaddress as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@86400 91177308-0d34-0410-b5e6-96231b3b80d8
2009-11-07 19:40:04 +00:00
Evan Cheng
d457e6e9a5 Refactor code. Fix a potential missing check. Teach isIdentical() about tLDRpci_pic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@86330 91177308-0d34-0410-b5e6-96231b3b80d8
2009-11-07 04:04:34 +00:00
Evan Cheng
78e5c1140a - Add TargetInstrInfo::isIdentical(). It's similar to MachineInstr::isIdentical
except it doesn't care if the definitions' virtual registers differ. This is
  used by machine LICM and other MI passes to perform CSE.
- Teach Thumb2InstrInfo::isIdentical() to check two t2LDRpci_pic are identical.
  Since pc relative constantpool entries are always different, this requires it
  it check if the values can actually the same.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@86328 91177308-0d34-0410-b5e6-96231b3b80d8
2009-11-07 03:52:02 +00:00
Evan Cheng
b9803a8fa6 - Add pseudo instructions tLDRpci_pic and t2LDRpci_pic which does a pc-relative
load of a GV from constantpool and then add pc. It allows the code sequence to
  be rematerializable so it would be hoisted by machine licm.
- Add a late pass to break these pseudo instructions into a number of real
  instructions. Also move the code in Thumb2 IT pass that breaks up t2MOVi32imm
  to this pass. This is done before post regalloc scheduling to allow the
  scheduler to proper schedule these instructions. It also allow them to be
  if-converted and shrunk by later passes.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@86304 91177308-0d34-0410-b5e6-96231b3b80d8
2009-11-06 23:52:48 +00:00
Anton Korobeynikov
f95215f551 Use NEON reg-reg moves, where profitable. This reduces "domain-cross" stalls, when we used to mix vfp and neon code (the former were used for reg-reg moves)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@85764 91177308-0d34-0410-b5e6-96231b3b80d8
2009-11-02 00:10:38 +00:00
Evan Cheng
e3ce8aab0a Fix a couple more places where we are creating ld / st instructions without memoperands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@85746 91177308-0d34-0410-b5e6-96231b3b80d8
2009-11-01 22:04:35 +00:00
Bob Wilson
8d4de5abfa Add a Thumb BRIND pattern. Change the ARM BRIND assembly to separate the
opcode and operand with a tab.  Check for these instructions in the usual
places.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@85411 91177308-0d34-0410-b5e6-96231b3b80d8
2009-10-28 18:26:41 +00:00
Bob Wilson
e4863f4759 Handle AddrMode4 for Thumb2 in rewriteT2FrameIndex. This occurs for
VLDM/VSTM instructions, and without this check, the code assumes that an
offset is allowed, as it would be with VLDR/VSTR.  The asm printer,
however, silently drops the offset, producing incorrect code.  Since the
address register in this case is either the stack or frame pointer, the
spill location ends up conflicting with some other stack slot or with
outgoing arguments on the stack.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@81879 91177308-0d34-0410-b5e6-96231b3b80d8
2009-09-15 17:56:18 +00:00
Evan Cheng
cdbb3f5d33 Fix PR4789. Teach eliminateFrameIndex how to handle VLDRQ and VSTRQ which cannot fold any immediate offset.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@80191 91177308-0d34-0410-b5e6-96231b3b80d8
2009-08-27 01:23:50 +00:00
Jim Grosbach
764ab52dd8 Whitespace cleanup. Remove trailing whitespace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@78666 91177308-0d34-0410-b5e6-96231b3b80d8
2009-08-11 15:33:49 +00:00
Evan Cheng
09d97354ed Always use the 16-bit tMOVgpr2gpr instead of the 32-bit t2MOVr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@78549 91177308-0d34-0410-b5e6-96231b3b80d8
2009-08-10 02:06:53 +00:00
Evan Cheng
e118cb6146 Use 16-bit tMOVgpr2gpr instead of tMOVr to copy GPR registers in Thumb2 mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@78398 91177308-0d34-0410-b5e6-96231b3b80d8
2009-08-07 19:34:35 +00:00
Evan Cheng
861986401e It turns out most of the thumb2 instructions are not allowed to touch SP. The semantics of such instructions are unpredictable. We have just been lucky that tests have been passing.
This patch takes pain to ensure all the PEI lowering code does the right thing when lowering frame indices, insert code to manipulate stack pointers, etc. It's also custom lowering dynamic stack alloc into pseudo instructions so we can insert the right instructions at scheduling time.

This fixes PR4659 and PR4682.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@78361 91177308-0d34-0410-b5e6-96231b3b80d8
2009-08-07 00:34:42 +00:00
Evan Cheng
a8e8984ee4 Use the i12 variant of load / store opcodes if offset is zero. Now we pass all of multisource as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@77939 91177308-0d34-0410-b5e6-96231b3b80d8
2009-08-03 02:38:06 +00:00
Chris Lattner
d90183d25d Move the getInlineAsmLength virtual method from TAI to TII, where
the only real caller (GetFunctionSizeInBytes) uses it.

The custom ARM implementation of this is basically reimplementing
an assembler poorly for negligible gain.  It should be removed 
IMNSHO, but I'll leave that to ARMish folks to decide.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@77877 91177308-0d34-0410-b5e6-96231b3b80d8
2009-08-02 05:20:37 +00:00
Evan Cheng
5657c01949 Optimize Thumb2 jumptable to use tbb / tbh when all the offsets fit in byte / halfword.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@77422 91177308-0d34-0410-b5e6-96231b3b80d8
2009-07-29 02:18:14 +00:00
David Goodwin
d9453784fb Thumb-2: fix typo that caused incorrect stack elimination for VFP operations and very large stack frames.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@77401 91177308-0d34-0410-b5e6-96231b3b80d8
2009-07-28 23:52:33 +00:00
Evan Cheng
6495f63945 - More refactoring. This gets rid of all of the getOpcode calls.
- This change also makes it possible to switch between ARM / Thumb on a
  per-function basis.
- Fixed thumb2 routine which expand reg + arbitrary immediate. It was using
  using ARM so_imm logic.
- Use movw and movt to do reg + imm when profitable.
- Other code clean ups and minor optimizations.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@77300 91177308-0d34-0410-b5e6-96231b3b80d8
2009-07-28 05:48:47 +00:00
Evan Cheng
e0f21bd47f More DCE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@77231 91177308-0d34-0410-b5e6-96231b3b80d8
2009-07-27 18:48:45 +00:00
Evan Cheng
fc17fb0aee Get rid of more dead code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@77227 91177308-0d34-0410-b5e6-96231b3b80d8
2009-07-27 18:38:54 +00:00
Evan Cheng
5ca53a7ad8 Get rid of some more getOpcode calls.
This also fixes potential problems in ARMBaseInstrInfo routines not recognizing thumb1 instructions when 32-bit and 16-bit instructions mix.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@77218 91177308-0d34-0410-b5e6-96231b3b80d8
2009-07-27 18:20:05 +00:00
Evan Cheng
5732ca084a Use t2LDRi12 and t2STRi12 to load / store to / from stack frames. Eliminate more getOpcode calls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@77181 91177308-0d34-0410-b5e6-96231b3b80d8
2009-07-27 03:14:20 +00:00