Commit Graph

139 Commits

Author SHA1 Message Date
Tim Northover
ad249171e4 ARM: decide whether to use movw/movt based on "minsize" attribute.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196102 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-02 14:46:26 +00:00
Tim Northover
323ac85d6a ARM: fold prologue/epilogue sp updates into push/pop for code size
ARM prologues usually look like:
    push {r7, lr}
    sub sp, sp, #4

If code size is extremely important, this can be optimised to the single
instruction:
    push {r6, r7, lr}

where we don't actually care about the contents of r6, but pushing it subtracts
4 from sp as a side effect.

This should implement such a conversion, predicated on the "minsize" function
attribute (-Oz) since I've yet to find any code it actually makes faster.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194264 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-08 17:18:07 +00:00
Arnold Schwaighofer
d42730dc71 IfConverter: Use TargetSchedule for instruction latencies
For targets that have instruction itineraries this means no change. Targets
that move over to the new schedule model will use be able the new schedule
module for instruction latencies in the if-converter (the logic is such that if
there is no itineary we will use the new sched model for the latencies).

Before, we queried "TTI->getInstructionLatency()" for the instruction latency
and the extra prediction cost. Now, we query the TargetSchedule abstraction for
the instruction latency and TargetInstrInfo for the extra predictation cost. The
TargetSchedule abstraction will internally call "TTI->getInstructionLatency" if
an itinerary exists, otherwise it will use the new schedule model.

ATTENTION: Out of tree targets!

(I will also send out an email later to LLVMDev)

This means, if your target implements

 unsigned getInstrLatency(const InstrItineraryData *ItinData,
                          const MachineInstr *MI,
                          unsigned *PredCost);

and returns a value for "PredCost", you now also need to implement

 unsigned getPredictationCost(const MachineInstr *MI);

(if your target uses the IfConversion.cpp pass)

radar://15077010

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191671 91177308-0d34-0410-b5e6-96231b3b80d8
2013-09-30 15:28:56 +00:00
David Blaikie
0187e7a9ba DebugInfo: remove target-specific Frame Index handling for DBG_VALUE MachineInstrs
Frame index handling is now target-agnostic, so delete the target hooks
for creation & asm printing of target-specific addressing in DBG_VALUEs
and any related functions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184067 91177308-0d34-0410-b5e6-96231b3b80d8
2013-06-16 20:34:27 +00:00
Bill Wendling
57148c166a Don't cache the instruction and register info from the TargetMachine, because
the internals of TargetMachine could change.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183488 91177308-0d34-0410-b5e6-96231b3b80d8
2013-06-07 05:54:19 +00:00
Tim Northover
4cc1407b84 ARM: Use ldrd/strd to spill 64-bit pairs when available.
This allows common sp-offsets to be part of the instruction and is
probably faster on modern CPUs too.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179977 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-21 11:57:07 +00:00
Arnold Schwaighofer
08da486557 ARM scheduler model: Swift has varying latencies, uops for simple ALU ops
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178842 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 04:42:00 +00:00
Chandler Carruth
a1514e24cc Sort includes for all of the .h files under the 'lib' tree. These were
missed in the first pass because the script didn't yet handle include
guards.

Note that the script is now able to handle all of these headers without
manual edits. =]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169224 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04 07:12:27 +00:00
Andrew Trick
412cd2f813 misched: Use the TargetSchedModel interface wherever possible.
Allows the new machine model to be used for NumMicroOps and OutputLatency.

Allows the HazardRecognizer to be disabled along with itineraries.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165603 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 05:43:09 +00:00
Bob Wilson
eb1641d54a Add LLVM support for Swift.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164899 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-29 21:43:49 +00:00
Bob Wilson
154418cdd8 Whitespace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164898 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-29 21:27:31 +00:00
Andrew Trick
9eed53379f Implement getNumLDMAddresses and expose through ARMBaseInstrInfo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163922 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-14 18:48:46 +00:00
Jakob Stoklund Olesen
053b5b0b3c Handle ARM MOVCC optimization in PeepholeOptimizer.
Use the target independent select analysis hooks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162060 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-16 23:14:20 +00:00
Jakob Stoklund Olesen
2860b7ea3a Fold predicable instructions into MOVCC / t2MOVCC.
The ARM select instructions are just predicated moves. If the select is
the only use of an operand, the instruction defining the operand can be
predicated instead, saving one instruction and decreasing register
pressure.

This implementation can turn AND/ORR/EOR instructions into their
corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to
predicate any instruction, but we don't yet support predicated
instructions in SSA form.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161994 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-15 22:16:39 +00:00
Manman Ren
de7266c611 Add SrcReg2 to analyzeCompare and optimizeCompareInstr to handle Compare
instructions with two register operands.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159465 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-29 21:33:59 +00:00
Andrew Trick
b7e0289fb3 misched: API for minimum vs. expected latency.
Minimum latency determines per-cycle scheduling groups.
Expected latency determines critical path and cost.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158021 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-05 21:11:27 +00:00
Jakob Stoklund Olesen
c5041cac7d Implement ARMBaseInstrInfo::commuteInstruction() for MOVCCr.
A MOVCCr instruction can be commuted by inverting the condition. This
can help reduce register pressure and remove unnecessary copies in some
cases.

<rdar://problem/11182914>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154033 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-04 18:23:42 +00:00
Jim Grosbach
c01810eeb7 ARM implement TargetInstrInfo::getNoopForMachoTarget()
Without this hook, functions w/ a completely empty body (including no
epilogue) will cause an MCEmitter assertion failure.

For example,
define internal fastcc void @empty_function() {
  unreachable
}

rdar://10947471

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@151673 91177308-0d34-0410-b5e6-96231b3b80d8
2012-02-28 23:53:30 +00:00
Jia Liu
31d157ae1a Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@150878 91177308-0d34-0410-b5e6-96231b3b80d8
2012-02-18 12:03:15 +00:00
Evan Cheng
020f4106f8 Model ARM predicated write as read-mod-write. e.g.
r0 = mov #0
r0 = moveq #1

Then the second instruction has an implicit data dependency on the first
instruction. Sadly I have yet to come up with a small test case that
demonstrate the post-ra scheduler taking advantage of this.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146583 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-14 20:00:08 +00:00
Evan Cheng
ddfd1377d2 - Add MachineInstrBundle.h and MachineInstrBundle.cpp. This includes a function
to finalize MI bundles (i.e. add BUNDLE instruction and computing register def
  and use lists of the BUNDLE instruction) and a pass to unpack bundles.
- Teach more of MachineBasic and MachineInstr methods to be bundle aware.
- Switch Thumb2 IT block to MI bundles and delete the hazard recognizer hack to
  prevent IT blocks from being broken apart.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@146542 91177308-0d34-0410-b5e6-96231b3b80d8
2011-12-14 02:11:42 +00:00
Jakob Stoklund Olesen
142bd1a54e Move -widen-vmovs to ARMBaseInstrInfo::expandPostRAPseudo().
The VMOVS widening needs to look at the implicit COPY operands.  Trying
to dig out the COPY instruction from an iterator in copyPhysReg() is the
wrong approach.

The expandPostRAPseudo() hook gets to look at COPY instructions before
they are converted to copyPhysReg() calls.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141619 91177308-0d34-0410-b5e6-96231b3b80d8
2011-10-11 00:59:06 +00:00
Jakob Stoklund Olesen
13fd601e0f Implement TII::get/setExecutionDomain() for ARM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140653 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-27 22:57:21 +00:00
Andrew Trick
3be654f808 Lower ARM adds/subs to add/sub after adding optional CPSR operand.
This is still a hack until we can teach tblgen to generate the
optional CPSR operand rather than an implicit CPSR def. But the
strangeness is now limited to the selection DAG. ADD/SUB MI's no
longer have implicit CPSR defs, nor do we allow flag setting variants
of these opcodes in machine code. There are several corner cases to
consider, and getting one wrong would previously lead to nasty
miscompilation. It's not the first time I've debugged one, so this
time I added enough verification to ensure it won't happen again.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140228 91177308-0d34-0410-b5e6-96231b3b80d8
2011-09-21 02:20:46 +00:00
Jakob Stoklund Olesen
36ee0e6405 Implement isLoadFromStackSlotPostFE and isStoreToStackSlotPostFE for ARM.
They improve the verbose assembly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137069 91177308-0d34-0410-b5e6-96231b3b80d8
2011-08-08 21:45:32 +00:00
Evan Cheng
ee04a6d3a4 Sink ARMMCExpr and ARMAddressingModes into MC layer. First step to separate ARM MC code from target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135636 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-20 23:34:39 +00:00
Owen Anderson
16884415db Add a target-indepedent entry to MCInstrDesc to describe the encoded size of an opcode. Switch ARM over to using that rather than its own special MCInstrDesc bits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135106 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-13 23:22:26 +00:00
Jakub Staszak
f81b7f6069 Use BranchProbability instead of floating points in IfConverter.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134858 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-10 02:58:07 +00:00
Evan Cheng
4db3cffe94 Hide the call to InitMCInstrInfo into tblgen generated ctor.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134244 91177308-0d34-0410-b5e6-96231b3b80d8
2011-07-01 17:57:27 +00:00
Evan Cheng
e837dead3c - Rename TargetInstrDesc, TargetOperandInfo to MCInstrDesc and MCOperandInfo and
sink them into MC layer.
- Added MCInstrInfo, which captures the tablegen generated static data. Chang
TargetInstrInfo so it's based off MCInstrInfo.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134021 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-28 19:10:37 +00:00
Jim Grosbach
f921c0fe34 Clean up a few 80 column violations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132946 91177308-0d34-0410-b5e6-96231b3b80d8
2011-06-13 22:54:22 +00:00
Chris Lattner
7a2bdde0a0 Fix a ton of comment typos found by codespell. Patch by
Luis Felipe Strano Moraes!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129558 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-15 05:18:47 +00:00
Cameron Zwarich
5876db7a66 Fix a typo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129429 91177308-0d34-0410-b5e6-96231b3b80d8
2011-04-13 06:39:16 +00:00
Bruno Cardoso Lopes
ae0855401b Apply again changes to support ARM memory asm parsing. I removed
all LDR/STR changes and left them to a future patch. Passing all
checks now.

- Implement asm parsing support for LDRT, LDRBT, STRT, STRBT and
  fix the encoding wherever is possible.
- Add a new encoding bit to describe the index mode used and teach
  printAddrMode2Operand to check by the addressing mode which index
  mode to print.
- Testcases

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128689 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-31 23:26:08 +00:00
Bruno Cardoso Lopes
b41aaab5a1 Revert r128632 again, until I figure out what break the tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128635 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-31 15:54:36 +00:00
Bruno Cardoso Lopes
bcd3a9cd84 Reapply r128585 without generating a lib depedency cycle. An updated log:
- Implement asm parsing support for LDRT, LDRBT, STRT, STRBT and
  {STR,LDC}{2}_{PRE,POST} fixing the encoding wherever is possible.
- Move all instructions which use am2offset without a pattern to use
  addrmode2.
- Add a new encoding bit to describe the index mode used and teach
  printAddrMode2Operand to check by the addressing mode which index
  mode to print.
- Testcases

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128632 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-31 14:52:28 +00:00
Matt Beaumont-Gay
e4345c9977 Revert "- Implement asm parsing support for LDRT, LDRBT, STRT, STRBT and"
This revision introduced a dependency cycle, as nlewycky mentioned by email.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128597 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-31 00:39:16 +00:00
Bruno Cardoso Lopes
40829ed6f5 - Implement asm parsing support for LDRT, LDRBT, STRT, STRBT and
{STR,LDC}{2}_PRE.
- Fixed the encoding in some places.
- Some of those instructions were using am2offset and now use addrmode2.
Codegen isn't affected, instructions which use SelectAddrMode2Offset were not
touched.
- Teach printAddrMode2Operand to check by the addressing mode which index
mode to print.
- This is a work in progress, more work to come. The idea is to change places
which use am2offset to use addrmode2 instead, as to unify assembly parser.
- Add testcases for assembly parser

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128585 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-30 23:32:32 +00:00
Anton Korobeynikov
57caad7a33 Preliminary support for ARM frame save directives emission via MI flags.
This is just very first approximation how the stuff should be done
(e.g. ARM-only for now). More to follow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127101 91177308-0d34-0410-b5e6-96231b3b80d8
2011-03-05 18:43:32 +00:00
Evan Cheng
6557bce3ec VFP single precision arith instructions can go down to NEON pipeline, but on Cortex-A8 only.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126238 91177308-0d34-0410-b5e6-96231b3b80d8
2011-02-22 19:53:14 +00:00
Evan Cheng
9fe2009956 Sorry, several patches in one.
TargetInstrInfo:
Change produceSameValue() to take MachineRegisterInfo as an optional argument.
When in SSA form, targets can use it to make more aggressive equality analysis.

Machine LICM:
1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead.
2. Fix a bug which prevent CSE of instructions which are not re-materializable.
3. Use improved form of produceSameValue.

ARM:
1. Teach ARM produceSameValue to look pass some PIC labels.
2. Look for operands from different loads of different constant pool entries
   which have same values.
3. Re-implement PIC GA materialization using movw + movt. Combine the pair with
   a "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible
   to re-materialize the instruction, allow machine LICM to hoist the set of
   instructions out of the loop and make it possible to CSE them. It's a bit
   hacky, but it significantly improve code quality.
4. Some minor bug fixes as well.

With the fixes, using movw + movt to materialize GAs significantly outperform the
load from constantpool method. 186.crafty and 255.vortex improved > 20%, 254.gap
and 176.gcc ~10%.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123905 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-20 08:34:58 +00:00
Andrew Trick
2da8bc8a5f Various bits of framework needed for precise machine-level selection
DAG scheduling during isel. Most new functionality is currently
guarded by -enable-sched-cycles and -enable-sched-hazard.

Added InstrItineraryData::IssueWidth field, currently derived from
ARM itineraries, but could be initialized differently on other targets.

Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is
active, and if so how many cycles of state it holds.

Added SchedulingPriorityQueue::HasReadyFilter to allowing gating entry
into the scheduler's available queue.

ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to
get information about it's SUnits, provides RecedeCycle for bottom-up
scheduling, correctly computes scoreboard depth, tracks IssueCount, and
considers potential stall cycles when checking for hazards.

ScheduleDAGRRList now models machine cycles and hazards (under
flags). It tracks MinAvailableCycle, drives the hazard recognizer and
priority queue's ready filter, manages a new PendingQueue, properly
accounts for stall cycles, etc.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122541 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-24 05:03:26 +00:00
Evan Cheng
48575f6ea7 Making use of VFP / NEON floating point multiply-accumulate / subtraction is
difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
   of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause
   additional pipeline stall. So it's frequently better to single codegen
   vmul + vadd.
2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to
   stall for 4 cycles. We need to schedule them apart.
3. A vmla followed vmla is a special case. Obvious issuing back to back RAW
   vmla + vmla is very bad. But this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
   faster.

Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough
but it isn't the optimial solution. This patch attempts to make it possible to
use vmla / vmls in cases where it is profitable.

A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
   compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoid cases where vmla is feeding into
   fp instructions (except for the #3 exceptional case).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
   vmla / vmls will trigger one of the special hazards.

Work in progress, only A+B are enabled.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120960 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-05 22:04:16 +00:00
Bill Wendling
6e46d84eea s/ARM::BRIND/ARM::BX/g to coincide with r120366.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120371 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-30 00:48:15 +00:00
Anton Korobeynikov
cd775ceff0 Move callee-saved regs spills / reloads to TFI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120228 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-27 23:05:03 +00:00
Eric Christopher
8b3ca6216d Rewrite stack callee saved spills and restores to use push/pop instructions.
Remove movePastCSLoadStoreOps and associated code for simple pointer
increments. Update routines that depended upon other opcodes for save/restore.

Adjust all testcases accordingly.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119725 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-18 19:40:05 +00:00
Evan Cheng
c4af4638df Remove ARM isel hacks that fold large immediates into a pair of add, sub, and,
and xor. The 32-bit move immediates can be hoisted out of loops by machine
LICM but the isel hacks were preventing them.

Instead, let peephole optimization pass recognize registers that are defined by
immediates and the ARM target hook will fold the immediates in.

Other changes include 1) do not fold and / xor into cmp to isel TST / TEQ
instructions if there are multiple uses. This happens when the 'and' is live
out, machine sink would have sinked the computation and that ends up pessimizing
code. The peephole pass would recognize situations where the 'and' can be
toggled to define CPSR and eliminate the comparison anyway.

2) Move peephole pass to after machine LICM, sink, and CSE to avoid blocking
important optimizations.

rdar://8663787, rdar://8241368


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119548 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-17 20:13:28 +00:00
Evan Cheng
eb96a2f6c0 Code clean up. The peephole pass should be the one updating the instruction
iterator, not TII->OptimizeCompareInstr.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119186 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-15 21:20:45 +00:00
Eric Christopher
6c50119ba3 Revert this temporarily.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118827 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-11 19:47:02 +00:00
Eric Christopher
391f228e7e Change the prologue and epilogue to use push/pop for the low ARM registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118823 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-11 19:26:03 +00:00