Commit Graph

227 Commits

Author SHA1 Message Date
Evan Cheng
53519f015e Last round of fixes for movw + movt global address codegen.
1. Fixed ARM pc adjustment.
2. Fixed dynamic-no-pic codegen
3. CSE of pc-relative load of global addresses.

It's now enabled by default for Darwin.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123991 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-21 18:55:51 +00:00
Andrew Trick
c8bfd1d78f Convert -enable-sched-cycles and -enable-sched-hazard to -disable
flags. They are still not enable in this revision.

Added TargetInstrInfo::isZeroCost() to fix a fundamental problem with
the scheduler's model of operand latency in the selection DAG.

Generalized unit tests to work with sched-cycles.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123969 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-21 05:51:33 +00:00
Evan Cheng
d7e3cc840b Don't be overly aggressive with CSE of "ldr constantpool". If it's a pc-relative
value, the "add pc" must be CSE'ed at the same time. We could follow the same
approach as T2 by adding pseudo instructions that combine the ldr + "add pc".
But the better approach is to use movw + movt (which I will enable soon), so
I'll leave this as a TODO.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123949 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-20 23:55:07 +00:00
Evan Cheng
9fe2009956 Sorry, several patches in one.
TargetInstrInfo:
Change produceSameValue() to take MachineRegisterInfo as an optional argument.
When in SSA form, targets can use it to make more aggressive equality analysis.

Machine LICM:
1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead.
2. Fix a bug which prevent CSE of instructions which are not re-materializable.
3. Use improved form of produceSameValue.

ARM:
1. Teach ARM produceSameValue to look pass some PIC labels.
2. Look for operands from different loads of different constant pool entries
   which have same values.
3. Re-implement PIC GA materialization using movw + movt. Combine the pair with
   a "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible
   to re-materialize the instruction, allow machine LICM to hoist the set of
   instructions out of the loop and make it possible to CSE them. It's a bit
   hacky, but it significantly improve code quality.
4. Some minor bug fixes as well.

With the fixes, using movw + movt to materialize GAs significantly outperform the
load from constantpool method. 186.crafty and 255.vortex improved > 20%, 254.gap
and 176.gcc ~10%.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123905 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-20 08:34:58 +00:00
Evan Cheng
5de5d4b6d0 Materialize GA addresses with movw + movt pairs for Darwin in PIC mode. e.g.
movw    r0, :lower16:(L_foo$non_lazy_ptr-(LPC0_0+4))
        movt    r0, :upper16:(L_foo$non_lazy_ptr-(LPC0_0+4))
LPC0_0:
        add     r0, pc, r0

It's not yet enabled by default as some tests are failing. I suspect bugs in
down stream tools.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123619 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-17 08:03:18 +00:00
Jakob Stoklund Olesen
c9df025e33 Simplify a bunch of isVirtualRegister() and isPhysicalRegister() logic.
These functions not longer assert when passed 0, but simply return false instead.

No functional change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123155 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-10 02:58:51 +00:00
Evan Cheng
55d4200336 Recognize inline asm 'rev /bin/bash, ' as a bswap intrinsic call.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123048 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-08 01:24:27 +00:00
Andrew Trick
2da8bc8a5f Various bits of framework needed for precise machine-level selection
DAG scheduling during isel. Most new functionality is currently
guarded by -enable-sched-cycles and -enable-sched-hazard.

Added InstrItineraryData::IssueWidth field, currently derived from
ARM itineraries, but could be initialized differently on other targets.

Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is
active, and if so how many cycles of state it holds.

Added SchedulingPriorityQueue::HasReadyFilter to allowing gating entry
into the scheduler's available queue.

ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to
get information about it's SUnits, provides RecedeCycle for bottom-up
scheduling, correctly computes scoreboard depth, tracks IssueCount, and
considers potential stall cycles when checking for hazards.

ScheduleDAGRRList now models machine cycles and hazards (under
flags). It tracks MinAvailableCycle, drives the hazard recognizer and
priority queue's ready filter, manages a new PendingQueue, properly
accounts for stall cycles, etc.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122541 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-24 05:03:26 +00:00
Andrew Trick
6e8f4c4048 whitespace
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122539 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-24 04:28:06 +00:00
Bob Wilson
4711d5cda3 Remove the rest of the *_sfp Neon instruction patterns.
Use the same COPY_TO_REGCLASS approach as for the 2-register *_sfp instructions.
This change made a big difference in the code generated for the
CodeGen/Thumb2/cross-rc-coalescing-2.ll test: The coalescer is still doing
a fine job, but some instructions that were previously moved outside the loop
are not moved now.  It's using fewer VFP registers now, which is generally
a good thing, so I think the estimates for register pressure changed and that
affected the LICM behavior.  Since that isn't obviously wrong, I've just
changed the test file.  This completes the work for Radar 8711675.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121730 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-13 23:02:37 +00:00
Jim Grosbach
97a884d602 Refactor the ARM CMPz* patterns to just use the normal CMP instructions when
possible. They were duplicates for everything exception the source pattern
before.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121179 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-07 20:41:06 +00:00
Evan Cheng
48575f6ea7 Making use of VFP / NEON floating point multiply-accumulate / subtraction is
difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
   of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause
   additional pipeline stall. So it's frequently better to single codegen
   vmul + vadd.
2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to
   stall for 4 cycles. We need to schedule them apart.
3. A vmla followed vmla is a special case. Obvious issuing back to back RAW
   vmla + vmla is very bad. But this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
   faster.

Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough
but it isn't the optimial solution. This patch attempts to make it possible to
use vmla / vmls in cases where it is profitable.

A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
   compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoid cases where vmla is feeding into
   fp instructions (except for the #3 exceptional case).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
   vmla / vmls will trigger one of the special hazards.

Work in progress, only A+B are enabled.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120960 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-05 22:04:16 +00:00
Jim Grosbach
d092a87ba3 Rename t2 TBB and TBH instructions to reference that they encode the jump table
data. Next up, pseudo-izing them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120320 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-29 21:28:32 +00:00
Anton Korobeynikov
cd775ceff0 Move callee-saved regs spills / reloads to TFI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120228 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-27 23:05:03 +00:00
Eric Christopher
8b3ca6216d Rewrite stack callee saved spills and restores to use push/pop instructions.
Remove movePastCSLoadStoreOps and associated code for simple pointer
increments. Update routines that depended upon other opcodes for save/restore.

Adjust all testcases accordingly.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119725 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-18 19:40:05 +00:00
Evan Cheng
5c71c7a137 Silence compiler warnings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119610 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-18 01:43:23 +00:00
Evan Cheng
c4af4638df Remove ARM isel hacks that fold large immediates into a pair of add, sub, and,
and xor. The 32-bit move immediates can be hoisted out of loops by machine
LICM but the isel hacks were preventing them.

Instead, let peephole optimization pass recognize registers that are defined by
immediates and the ARM target hook will fold the immediates in.

Other changes include 1) do not fold and / xor into cmp to isel TST / TEQ
instructions if there are multiple uses. This happens when the 'and' is live
out, machine sink would have sinked the computation and that ends up pessimizing
code. The peephole pass would recognize situations where the 'and' can be
toggled to define CPSR and eliminate the comparison anyway.

2) Move peephole pass to after machine LICM, sink, and CSE to avoid blocking
important optimizations.

rdar://8663787, rdar://8241368


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119548 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-17 20:13:28 +00:00
Evan Cheng
3642e64c11 Simplify code that toggle optional operand to ARM::CPSR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119484 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-17 08:06:50 +00:00
Bill Wendling
73fe34a3ee Encode the multi-load/store instructions with their respective modes ('ia',
'db', 'ib', 'da') instead of having that mode as a separate field in the
instruction. It's more convenient for the asm parser and much more readable for
humans.
<rdar://problem/8654088>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119310 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-16 01:16:36 +00:00
Evan Cheng
eb96a2f6c0 Code clean up. The peephole pass should be the one updating the instruction
iterator, not TII->OptimizeCompareInstr.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119186 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-15 21:20:45 +00:00
Eric Christopher
6c50119ba3 Revert this temporarily.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118827 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-11 19:47:02 +00:00
Eric Christopher
391f228e7e Change the prologue and epilogue to use push/pop for the low ARM registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118823 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-11 19:26:03 +00:00
Evan Cheng
8239daf7c8 Two sets of changes. Sorry they are intermingled.
1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to
   "optimize for latency". Call instructions don't have the right latency and
   this is more likely to use introduce spills.
2. Fix if-converter cost function. For ARM, it should use instruction latencies,
   not # of micro-ops since multi-latency instructions is completely executed
   even when the predicate is false. Also, some instruction will be "slower"
   when they are predicated due to the register def becoming implicit input.
   rdar://8598427


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118135 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-03 00:45:17 +00:00
Bill Wendling
40a5eb18b0 When we look at instructions to convert to setting the 's' flag, we need to look
at more than those which define CPSR. You can have this situation:

(1)  subs  ...
(2)  sub   r6, r5, r4
(3)  movge ...
(4)  cmp   r6, 0
(5)  movge ...

We cannot convert (2) to "subs" because (3) is using the CPSR set by
(1). There's an analogous situation here:

(1)  sub   r1, r2, r3
(2)  sub   r4, r5, r6
(3)  cmp   r4, ...
(5)  movge ...
(6)  cmp   r1, ...
(7)  movge ...

We cannot convert (1) to "subs" because of the intervening use of CPSR.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117950 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-01 20:41:43 +00:00
Evan Cheng
e09206d4d7 Fix fpscr <-> GPR latency info.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117737 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-29 23:16:55 +00:00
Evan Cheng
089751535d Avoiding overly aggressive latency scheduling. If the two nodes share an
operand and one of them has a single use that is a live out copy, favor the
one that is live out. Otherwise it will be difficult to eliminate the copy
if the instruction is a loop induction variable update. e.g.

BB:
sub r1, r3, #1
str r0, [r2, r3]
mov r3, r1
cmp
bne BB

=>

BB:
str r0, [r2, r3]
sub r3, r3, #1
cmp
bne BB

This fixed the recent 256.bzip2 regression.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117675 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-29 18:09:28 +00:00
Evan Cheng
7e2fe9150f Re-commit 117518 and 117519 now that ARM MC test failures are out of the way.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117531 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-28 06:47:08 +00:00
Evan Cheng
9e08ee5d16 Revert 117518 and 117519 for now. They changed scheduling and cause MC tests to fail. Ugh.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117520 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-28 02:00:25 +00:00
Evan Cheng
0104d9de04 - Assign load / store with shifter op address modes the right itinerary classes.
- For now, loads of [r, r] addressing mode is the same as the
  [r, r lsl/lsr/asr #] variants. ARMBaseInstrInfo::getOperandLatency() should
  identify the former case and reduce the output latency by 1.
- Also identify [r, r << 2] case. This special form of shifter addressing mode
  is "free".


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117519 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-28 01:49:06 +00:00
Jim Grosbach
7e3383c007 Refactor ARM STR/STRB instruction patterns into STR{B}i12 and STR{B}rs, like
the LDR instructions have. This makes the literal/register forms of the
instructions explicit and allows us to assign scheduling itineraries
appropriately. rdar://8477752

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117505 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-27 23:12:14 +00:00
Jim Grosbach
063efbf569 The immediate operands of an LDRi12 instruction doesn't need the addrmode2
encoding tricks. Handle the 'imm doesn't fit in the insn' case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117454 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-27 16:50:31 +00:00
Jim Grosbach
77aee8e22c LDRi12 machine instructions handle negative offset operands normally (simple
integer values), not with the addrmode2 encoding.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117429 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-27 01:19:41 +00:00
Jim Grosbach
c1d30212e9 Split ARM::LDRB into LDRBi12 and LDRBrs. Adjust accordingly. Continuing on
rdar://8477752.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117419 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-27 00:19:44 +00:00
Jim Grosbach
3e55612472 First part of refactoring ARM addrmode2 (load/store) instructions to be more
explicit about the operands. Split out the different variants into separate
instructions. This gives us the ability to, among other things, assign
different scheduling itineraries to the variants. rdar://8477752.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117409 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-26 22:37:02 +00:00
Evan Cheng
c8141dfc7f Use instruction itinerary to determine what instructions are 'cheap'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117348 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-26 02:08:50 +00:00
Chandler Carruth
19e57025d4 Move the remaining attribute macros to systematic names based on the attribute
name and prefixed with 'LLVM_'.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117203 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-23 08:40:19 +00:00
Evan Cheng
dd9dd6f857 Latency between CPSR def and branch is zero.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117192 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-23 02:04:38 +00:00
Evan Cheng
2312842de0 Re-enable register pressure aware machine licm with fixes. Hoist() may have
erased the instruction during LICM so UpdateRegPressureAfter() should not
reference it afterwards.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116845 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-19 18:58:51 +00:00
Daniel Dunbar
9869413802 Revert r116781 "- Add a hook for target to determine whether an instruction def
is", which breaks some nightly tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116816 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-19 17:14:24 +00:00
Evan Cheng
11e8b74a7a - Add a hook for target to determine whether an instruction def is
"long latency" enough to hoist even if it may increase spilling. Reloading
  a value from spill slot is often cheaper than performing an expensive
  computation in the loop. For X86, that means machine LICM will hoist
  SQRT, DIV, etc. ARM will be somewhat aggressive with VFP and NEON
  instructions.
- Enable register pressure aware machine LICM by default.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116781 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-19 00:55:07 +00:00
Bill Wendling
b41ee96d76 Don't recompute MachineRegisterInfo in the Optimize* method.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116750 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-18 21:22:31 +00:00
Bill Wendling
0aa38b9381 Check to make sure that the iterator isn't at the beginning of the basic block
before decrementing. <rdar://problem/8529919>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116126 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-09 00:03:48 +00:00
Evan Cheng
344d9db970 Code refactoring.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116002 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-07 23:12:15 +00:00
Evan Cheng
5a50ceeaea Model operand cycles of vldm / vstm; also fixes scheduling itineraries of vldr / vstr, etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115898 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-07 01:50:48 +00:00
Jim Grosbach
3c38f96af2 Clean up MOVi32imm and t2MOVi32imm pseudo instruction definitions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115853 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-06 22:01:26 +00:00
Evan Cheng
a0792de66c - Add TargetInstrInfo::getOperandLatency() to compute operand latencies. This
allow target to correctly compute latency for cases where static scheduling
  itineraries isn't sufficient. e.g. variable_ops instructions such as
  ARM::ldm.
  This also allows target without scheduling itineraries to compute operand
  latencies. e.g. X86 can return (approximated) latencies for high latency
  instructions such as division.
- Compute operand latencies for those defined by load multiple instructions,
  e.g. ldm and those used by store multiple instructions, e.g. stm.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115755 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-06 06:27:31 +00:00
Michael J. Spencer
f000a7a212 fix MSVC 2010 build.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115594 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-05 06:00:43 +00:00
Michael J. Spencer
2bbb769091 Cleanup Whitespace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115593 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-05 06:00:33 +00:00
Owen Anderson
e3cc84a43d Thread the determination of branch prediction hit rates back through the if-conversion heuristic APIs. For now,
stick with a constant estimate of 90% (branch predictors are good!), but we might find that we want to provide
more nuanced estimates in the future.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115364 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-01 22:45:50 +00:00
Owen Anderson
00d4f48168 Make the spelling of the flags for old-style if-conversion heuristics consistent between ARM and Thumb2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115341 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-01 20:33:47 +00:00