Commit Graph

99 Commits

Author SHA1 Message Date
Evan Cheng
9fe2009956 Sorry, several patches in one.
TargetInstrInfo:
Change produceSameValue() to take MachineRegisterInfo as an optional argument.
When in SSA form, targets can use it to make more aggressive equality analysis.

Machine LICM:
1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead.
2. Fix a bug which prevent CSE of instructions which are not re-materializable.
3. Use improved form of produceSameValue.

ARM:
1. Teach ARM produceSameValue to look pass some PIC labels.
2. Look for operands from different loads of different constant pool entries
   which have same values.
3. Re-implement PIC GA materialization using movw + movt. Combine the pair with
   a "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible
   to re-materialize the instruction, allow machine LICM to hoist the set of
   instructions out of the loop and make it possible to CSE them. It's a bit
   hacky, but it significantly improve code quality.
4. Some minor bug fixes as well.

With the fixes, using movw + movt to materialize GAs significantly outperform the
load from constantpool method. 186.crafty and 255.vortex improved > 20%, 254.gap
and 176.gcc ~10%.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123905 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-20 08:34:58 +00:00
Andrew Trick
2da8bc8a5f Various bits of framework needed for precise machine-level selection
DAG scheduling during isel. Most new functionality is currently
guarded by -enable-sched-cycles and -enable-sched-hazard.

Added InstrItineraryData::IssueWidth field, currently derived from
ARM itineraries, but could be initialized differently on other targets.

Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is
active, and if so how many cycles of state it holds.

Added SchedulingPriorityQueue::HasReadyFilter to allowing gating entry
into the scheduler's available queue.

ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to
get information about it's SUnits, provides RecedeCycle for bottom-up
scheduling, correctly computes scoreboard depth, tracks IssueCount, and
considers potential stall cycles when checking for hazards.

ScheduleDAGRRList now models machine cycles and hazards (under
flags). It tracks MinAvailableCycle, drives the hazard recognizer and
priority queue's ready filter, manages a new PendingQueue, properly
accounts for stall cycles, etc.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122541 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-24 05:03:26 +00:00
Evan Cheng
48575f6ea7 Making use of VFP / NEON floating point multiply-accumulate / subtraction is
difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
   of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause
   additional pipeline stall. So it's frequently better to single codegen
   vmul + vadd.
2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to
   stall for 4 cycles. We need to schedule them apart.
3. A vmla followed vmla is a special case. Obvious issuing back to back RAW
   vmla + vmla is very bad. But this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
   faster.

Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough
but it isn't the optimial solution. This patch attempts to make it possible to
use vmla / vmls in cases where it is profitable.

A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
   compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoid cases where vmla is feeding into
   fp instructions (except for the #3 exceptional case).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
   vmla / vmls will trigger one of the special hazards.

Work in progress, only A+B are enabled.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120960 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-05 22:04:16 +00:00
Bill Wendling
6e46d84eea s/ARM::BRIND/ARM::BX/g to coincide with r120366.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120371 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-30 00:48:15 +00:00
Anton Korobeynikov
cd775ceff0 Move callee-saved regs spills / reloads to TFI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120228 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-27 23:05:03 +00:00
Eric Christopher
8b3ca6216d Rewrite stack callee saved spills and restores to use push/pop instructions.
Remove movePastCSLoadStoreOps and associated code for simple pointer
increments. Update routines that depended upon other opcodes for save/restore.

Adjust all testcases accordingly.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119725 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-18 19:40:05 +00:00
Evan Cheng
c4af4638df Remove ARM isel hacks that fold large immediates into a pair of add, sub, and,
and xor. The 32-bit move immediates can be hoisted out of loops by machine
LICM but the isel hacks were preventing them.

Instead, let peephole optimization pass recognize registers that are defined by
immediates and the ARM target hook will fold the immediates in.

Other changes include 1) do not fold and / xor into cmp to isel TST / TEQ
instructions if there are multiple uses. This happens when the 'and' is live
out, machine sink would have sinked the computation and that ends up pessimizing
code. The peephole pass would recognize situations where the 'and' can be
toggled to define CPSR and eliminate the comparison anyway.

2) Move peephole pass to after machine LICM, sink, and CSE to avoid blocking
important optimizations.

rdar://8663787, rdar://8241368


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119548 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-17 20:13:28 +00:00
Evan Cheng
eb96a2f6c0 Code clean up. The peephole pass should be the one updating the instruction
iterator, not TII->OptimizeCompareInstr.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119186 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-15 21:20:45 +00:00
Eric Christopher
6c50119ba3 Revert this temporarily.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118827 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-11 19:47:02 +00:00
Eric Christopher
391f228e7e Change the prologue and epilogue to use push/pop for the low ARM registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118823 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-11 19:26:03 +00:00
Evan Cheng
8239daf7c8 Two sets of changes. Sorry they are intermingled.
1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to
   "optimize for latency". Call instructions don't have the right latency and
   this is more likely to use introduce spills.
2. Fix if-converter cost function. For ARM, it should use instruction latencies,
   not # of micro-ops since multi-latency instructions is completely executed
   even when the predicate is false. Also, some instruction will be "slower"
   when they are predicated due to the register def becoming implicit input.
   rdar://8598427


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118135 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-03 00:45:17 +00:00
Jim Grosbach
3e55612472 First part of refactoring ARM addrmode2 (load/store) instructions to be more
explicit about the operands. Split out the different variants into separate
instructions. This gives us the ability to, among other things, assign
different scheduling itineraries to the variants. rdar://8477752.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117409 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-26 22:37:02 +00:00
Evan Cheng
c8141dfc7f Use instruction itinerary to determine what instructions are 'cheap'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117348 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-26 02:08:50 +00:00
Bob Wilson
b3a6817d06 Tidy up redundant check.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117331 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-26 00:02:19 +00:00
Jim Grosbach
e4ad387a5a Add a pre-dispatch SjLj EH hook on the unwind edge for targets to do any
setup they require. Use this for ARM/Darwin to rematerialize the base
pointer from the frame pointer when required. rdar://8564268

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116879 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-19 23:27:08 +00:00
Evan Cheng
2312842de0 Re-enable register pressure aware machine licm with fixes. Hoist() may have
erased the instruction during LICM so UpdateRegPressureAfter() should not
reference it afterwards.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116845 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-19 18:58:51 +00:00
Daniel Dunbar
9869413802 Revert r116781 "- Add a hook for target to determine whether an instruction def
is", which breaks some nightly tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116816 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-19 17:14:24 +00:00
Evan Cheng
11e8b74a7a - Add a hook for target to determine whether an instruction def is
"long latency" enough to hoist even if it may increase spilling. Reloading
  a value from spill slot is often cheaper than performing an expensive
  computation in the loop. For X86, that means machine LICM will hoist
  SQRT, DIV, etc. ARM will be somewhat aggressive with VFP and NEON
  instructions.
- Enable register pressure aware machine LICM by default.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116781 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-19 00:55:07 +00:00
Bill Wendling
b41ee96d76 Don't recompute MachineRegisterInfo in the Optimize* method.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116750 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-18 21:22:31 +00:00
Jim Grosbach
42fac8ee3b MC machine encoding for simple aritmetic instructions that use a shifted
register operand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116259 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-11 23:16:21 +00:00
Evan Cheng
344d9db970 Code refactoring.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116002 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-07 23:12:15 +00:00
Evan Cheng
a0792de66c - Add TargetInstrInfo::getOperandLatency() to compute operand latencies. This
allow target to correctly compute latency for cases where static scheduling
  itineraries isn't sufficient. e.g. variable_ops instructions such as
  ARM::ldm.
  This also allows target without scheduling itineraries to compute operand
  latencies. e.g. X86 can return (approximated) latencies for high latency
  instructions such as division.
- Compute operand latencies for those defined by load multiple instructions,
  e.g. ldm and those used by store multiple instructions, e.g. stm.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115755 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-06 06:27:31 +00:00
Jim Grosbach
d86609fca4 Increase the number of bits used internally by the ARM target to represent the
addressing mode from four to five.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115645 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-05 18:14:55 +00:00
Owen Anderson
e3cc84a43d Thread the determination of branch prediction hit rates back through the if-conversion heuristic APIs. For now,
stick with a constant estimate of 90% (branch predictors are good!), but we might find that we want to provide
more nuanced estimates in the future.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115364 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-01 22:45:50 +00:00
Owen Anderson
b20b85168c Part one of switching to using a more sane heuristic for determining if-conversion profitability.
Rather than having arbitrary cutoffs, actually try to cost model the conversion.

For now, the constants are tuned to more or less match our existing behavior, but these will be
changed to reflect realistic values as this work proceeds.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114973 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-28 18:32:13 +00:00
Gabor Greif
04ac81d5db Move the search for the appropriate AND instruction
into OptimizeCompareInstr.
This necessitates the passing of CmpValue around,
so widen the virtual functions to accomodate.

No functionality changes.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114428 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-21 12:01:15 +00:00
Jim Grosbach
c686e33d12 handle the upper16/lower16 target operand flags on symbol references for MC
instruction lowering.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114191 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-17 18:25:25 +00:00
Bill Wendling
a65568676d Rename ConvertToSetZeroFlag to something more general.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113670 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-11 00:13:50 +00:00
Bill Wendling
3665661a57 No need to recompute the SrcReg and CmpValue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113666 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-10 23:46:12 +00:00
Bill Wendling
92ad57f066 Move some of the decision logic for converting an instruction into one that sets
the 'zero' bit down into the back-end. There are other cases where this logic
isn't sufficient, so they should be handled separately.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113665 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-10 23:34:19 +00:00
Bill Wendling
220e240bdf Modify the comparison optimizations in the peephole optimizer to update the
iterator when an optimization took place. This allows us to do more insane
things with the code than just remove an instruction or two.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113640 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-10 21:55:43 +00:00
Evan Cheng
3ef1c8759a Teach if-converter to be more careful with predicating instructions that would
take multiple cycles to decode.
For the current if-converter clients (actually only ARM), the instructions that
are predicated on false are not nops. They would still take machine cycles to
decode. Micro-coded instructions such as LDM / STM can potentially take multiple
cycles to decode. If-converter should take treat them as non-micro-coded
simple instructions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113570 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-10 01:29:16 +00:00
Evan Cheng
5f54ce3473 For each instruction itinerary class, specify the number of micro-ops each
instruction in the class would be decoded to. Or zero if the number of
uOPs must be determined dynamically.

This will be used to determine the cost-effectiveness of predicating a
micro-coded instruction.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113513 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-09 18:18:55 +00:00
Bob Wilson
9a1c189d9e Add a separate ARM instruction format for Saturate instructions.
(I discovered 2 more copies of the ARM instruction format list, bringing the
total to 4!!  Two of them were already out of sync.  I haven't yet gotten into
the disassembler enough to know the best way to fix this, but something needs
to be done.)  Add support for encoding these instructions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110754 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-11 00:01:18 +00:00
Bill Wendling
c98af3370f Use the "isCompare" machine instruction attribute instead of calling the
relatively expensive comparison analyzer on each instruction. Also rename the
comparison analyzer method to something more in line with what it actually does.

This pass is will eventually be folded into the Machine CSE pass.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110539 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-08 05:04:59 +00:00
Bill Wendling
e4ddbdfd3c Add the Optimize Compares pass (disabled by default).
This pass tries to remove comparison instructions when possible. For instance,
if you have this code:

   sub r1, 1
   cmp r1, 0
   bz  L1

and "sub" either sets the same flag as the "cmp" instruction or could be
converted to set the same flag, then we can eliminate the "cmp" instruction all
together. This is a important for ARM where the ALU instructions could set the
CPSR flag, but need a special suffix ('s') to do so.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110423 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-06 01:32:48 +00:00
Chris Lattner
2062875a7d eliminate the TargetInstrInfo::GetInstSizeInBytes hook.
ARM/PPC/MSP430-specific code (which are the only targets that
implement the hook) can directly reference their target-specific
instrinfo classes.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109171 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-22 21:27:00 +00:00
Chris Lattner
4dbbe3433f prune #includes a little.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108929 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-20 21:17:29 +00:00
Jakob Stoklund Olesen
78e6e00922 Remove the isMoveInstr() hook.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108567 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-16 22:35:46 +00:00
Jakob Stoklund Olesen
600f171486 RISC architectures get their memory operand folding for free.
The only folding these load/store architectures can do is converting COPY into a
load or store, and the target independent part of foldMemoryOperand already
knows how to do that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108099 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-11 19:19:13 +00:00
Jakob Stoklund Olesen
ac27366700 Replace copyRegToReg with copyPhysReg for ARM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108078 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-11 06:33:54 +00:00
Bob Wilson
80d9bc0336 Renumber NEON instruction formats to be consecutive.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106927 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-26 00:05:09 +00:00
Bob Wilson
184723d9be Rename ARM instruction formats NEONGetLnFrm, NEONSetLnFrm and NEONDupFrm to
"N..." instead of "NEON..." for consistency with the other NEON format names.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106921 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-25 23:56:05 +00:00
Bob Wilson
2653263165 Remove unused NEONFrm and ThumbMiscFrm ARM instruction formats.
Renumber MiscFrm to 25.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106916 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-25 23:45:37 +00:00
Evan Cheng
13151432ed Change if-conversion block size limit checks to add some flexibility.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106901 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-25 22:42:03 +00:00
Bill Wendling
4b722108e2 We are missing opportunites to use ldm. Take code like this:
void t(int *cp0, int *cp1, int *dp, int fmd) {
  int c0, c1, d0, d1, d2, d3;
  c0 = (*cp0++ & 0xffff) | ((*cp1++ << 16) & 0xffff0000);
  c1 = (*cp0++ & 0xffff) | ((*cp1++ << 16) & 0xffff0000);
  /* ... */
}

It code gens into something pretty bad. But with this change (analogous to the
X86 back-end), it will use ldm and generate few instructions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106693 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-23 23:00:16 +00:00
Evan Cheng
86050dc8cc Allow ARM if-converter to be run after post allocation scheduling.
- This fixed a number of bugs in if-converter, tail merging, and post-allocation
  scheduler. If-converter now runs branch folding / tail merging first to
  maximize if-conversion opportunities.
- Also changed the t2IT instruction slightly. It now defines the ITSTATE
  register which is read by instructions in the IT block.
- Added Thumb2 specific hazard recognizer to ensure the scheduler doesn't
  change the instruction ordering in the IT block (since IT mask has been
  finalized). It also ensures no other instructions can be scheduled between
  instructions in the IT block.

This is not yet enabled.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106344 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-18 23:09:54 +00:00
Stuart Hastings
3bf9125933 Add a DebugLoc parameter to TargetInstrInfo::InsertBranch(). This
addresses a longstanding deficiency noted in many FIXMEs scattered
across all the targets.

This effectively moves the problem up one level, replacing eleven
FIXMEs in the targets with eight FIXMEs in CodeGen, plus one path
through FastISel where we actually supply a DebugLoc, fixing Radar
7421831.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@106243 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-17 22:43:56 +00:00
Bob Wilson
1a913ed178 Add instruction encoding for the Neon VMOV immediate instruction. This changes
the machine instruction representation of the immediate value to be encoded
into an integer with similar fields as the actual VMOV instruction.  This makes
things easier for the disassembler, since it can just stuff the bits into the
immediate operand, but harder for the asm printer since it has to decode the
value to be printed.  Testcase for the encoding will follow later when MC has
more support for ARM.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105836 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-11 21:34:50 +00:00
Jakob Stoklund Olesen
9edf7deb37 Slightly change the meaning of the reMaterialize target hook when the original
instruction defines subregisters.

Any existing subreg indices on the original instruction are preserved or
composed with the new subreg index.

Also substitute multiple operands mentioning the original register by using the
new MachineInstr::substituteRegister() function. This is necessary because there
will soon be <imp-def> operands added to non read-modify-write partial
definitions. This instruction:

  %reg1234:foo = FLAP %reg1234<imp-def>

will reMaterialize(%reg3333, bar) like this:

  %reg3333:bar-foo = FLAP %reg333:bar<imp-def>

Finally, replace the TargetRegisterInfo pointer argument with a reference to
indicate that it cannot be NULL.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@105358 91177308-0d34-0410-b5e6-96231b3b80d8
2010-06-02 22:47:25 +00:00