Commit Graph

62 Commits

Author SHA1 Message Date
Evan Cheng
48575f6ea7 Making use of VFP / NEON floating point multiply-accumulate / subtraction is
difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
   of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause
   additional pipeline stall. So it's frequently better to single codegen
   vmul + vadd.
2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to
   stall for 4 cycles. We need to schedule them apart.
3. A vmla followed vmla is a special case. Obvious issuing back to back RAW
   vmla + vmla is very bad. But this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
   faster.

Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough
but it isn't the optimial solution. This patch attempts to make it possible to
use vmla / vmls in cases where it is profitable.

A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
   compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoid cases where vmla is feeding into
   fp instructions (except for the #3 exceptional case).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
   vmla / vmls will trigger one of the special hazards.

Work in progress, only A+B are enabled.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120960 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-05 22:04:16 +00:00
Evan Cheng
dfed19fe2c Fix preload instruction isel. Only v7 supports pli, and only v7 with mp extension supports pldw. Add subtarget attribute to denote mp extension support and legalize illegal ones to nothing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118160 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-03 06:34:55 +00:00
Bob Wilson
77f42b5278 PR8359: The ARM backend may end up allocating registers D16 to D31 when
"-mattr=+vfp3" is specified. However, this will not work for hardware that
only supports 16 registers.  Add a new flag to support -"mattr=+vfp3,+d16".
Patch by Jan Voung!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116310 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-12 16:22:47 +00:00
Rafael Espindola
0febc4657b Jim Asked us to move DataLayout on ARM back to the most specialized classes. Do
so and also change X86 for consistency.

Investigating if this can be improved a bit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115469 91177308-0d34-0410-b5e6-96231b3b80d8
2010-10-03 18:59:45 +00:00
Bob Wilson
7122ba7efb Increase ARM APCS preferred alignment for i64 and f64 from 32 bits to 64 bits.
LDM/STM instructions can run one cycle faster on some ARM processors if the
memory address is 64-bit aligned.  Radar 8489376.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115047 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-29 17:54:10 +00:00
Owen Anderson
654d5440a4 Add a subtarget hook for reporting the misprediction penalty. Use this to provide more precise
cost modeling for if-conversion.  Now if only we had a way to estimate the misprediction probability.

Adjsut CodeGen/ARM/ifcvt10.ll.  The pipeline on Cortex-A8 is long enough that it is still profitable
to predicate an ldm, but the shorter pipeline on Cortex-A9 makes it unprofitable.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114995 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-28 21:57:50 +00:00
Bob Wilson
02aba73a9e Add a command line option "-arm-strict-align" to disallow unaligned memory
accesses for ARM targets that would otherwise allow it.  Radar 8465431.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114941 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-28 04:09:35 +00:00
Daniel Dunbar
3b9569e70d Hard to imagine there are still people using inferior compilers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114862 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-27 20:12:58 +00:00
Rafael Espindola
fd9493d74e Odd additional stub framework for the ARM MC ELF emission.
llc now recognizes the "intent" to support MC/obj emission for ARM, but
given that they are all stubs, it asserts on --filetype=obj --march=arm

Patch by Jason Kim.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114856 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-27 18:31:37 +00:00
Evan Cheng
3ef1c8759a Teach if-converter to be more careful with predicating instructions that would
take multiple cycles to decode.
For the current if-converter clients (actually only ARM), the instructions that
are predicated on false are not nops. They would still take machine cycles to
decode. Micro-coded instructions such as LDM / STM can potentially take multiple
cycles to decode. If-converter should take treat them as non-micro-coded
simple instructions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113570 91177308-0d34-0410-b5e6-96231b3b80d8
2010-09-10 01:29:16 +00:00
Jim Grosbach
fcba5e6b64 cortex m4 has floating point support, but only single precision.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110810 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-11 15:44:15 +00:00
Evan Cheng
7b4d31176e Report error if codegen tries to instantiate a ARM target when the cpu does support it. e.g. cortex-m* processors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110798 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-11 07:17:46 +00:00
Evan Cheng
d6b4632256 Add ARM Archv6M and let it implies FeatureDB (having dmb, etc.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110795 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-11 06:51:54 +00:00
Evan Cheng
11db068721 - Add subtarget feature -mattr=+db which determine whether an ARM cpu has the
memory and synchronization barrier dmb and dsb instructions.
- Change instruction names to something more sensible (matching name of actual
  instructions).
- Added tests for memory barrier codegen.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110785 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-11 06:22:01 +00:00
Evan Cheng
e44be63816 Change -prefer-32bit-thumb to attribute -mattr=+32bit instead to disable more 32-bit to 16-bit optimizations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110584 91177308-0d34-0410-b5e6-96231b3b80d8
2010-08-09 18:35:19 +00:00
Evan Cheng
7a41599962 Add an ARM "feature". Cortex-a8 fp comparison is very slow (> 20 cycles).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108256 91177308-0d34-0410-b5e6-96231b3b80d8
2010-07-13 19:21:50 +00:00
Shantonu Sen
eae216c6d3 Fix "warning: extra ';' inside a struct or union" when building llvm with clang
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103179 91177308-0d34-0410-b5e6-96231b3b80d8
2010-05-06 14:57:47 +00:00
Jim Grosbach
29402132f3 Cleanup of ARMv7M support. Move hardware divide and Thumb2 extract/pack
instructions to subtarget features and update tests to reflect.
PR5717.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103136 91177308-0d34-0410-b5e6-96231b3b80d8
2010-05-05 23:44:43 +00:00
Jim Grosbach
b1dc393bd5 Add initial support for ARMv7M subtarget and cortex-m3 cpu. Patch by
Jordy <snhjordy@gmail.com>.

Followup patches will add some tests and adjust to use Subtarget features
for the instructions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103119 91177308-0d34-0410-b5e6-96231b3b80d8
2010-05-05 20:44:35 +00:00
Dan Gohman
46510a73e9 Add const qualifiers to CodeGen's use of LLVM IR constructs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101334 91177308-0d34-0410-b5e6-96231b3b80d8
2010-04-15 01:51:59 +00:00
Jim Grosbach
6b2e8dc9a0 switch the use-vml[as] instructions flag to a subtarget 'feature'
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99565 91177308-0d34-0410-b5e6-96231b3b80d8
2010-03-25 23:11:16 +00:00
Jim Grosbach
2676737e5e Make the use of the vmla and vmls VFP instructions controllable via cmd line.
Preliminary testing shows significant performance wins by not using these
instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99436 91177308-0d34-0410-b5e6-96231b3b80d8
2010-03-24 22:31:46 +00:00
Anton Korobeynikov
631379e79c Add substarget feature for FP16
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@98503 91177308-0d34-0410-b5e6-96231b3b80d8
2010-03-14 18:42:38 +00:00
Bob Wilson
4d6113ee06 Lower small memcpys to load/stores on Thumb2.
Radar 7686922.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@98210 91177308-0d34-0410-b5e6-96231b3b80d8
2010-03-11 00:20:49 +00:00
Anton Korobeynikov
ce7bf1c55f Initial bits of ARMv4-only support.
Patch by John Tytgat!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@97886 91177308-0d34-0410-b5e6-96231b3b80d8
2010-03-06 19:39:36 +00:00
Bob Wilson
15217e63bc Remove isProfitableToDuplicateIndirectBranch target hook. It is profitable
for all the processors where I have tried it, and even when it might not help
performance, the cost is quite low.  The opportunities for duplicating
indirect branches are limited by other factors so code size does not change
much due to tail duplicating indirect branches aggressively.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@90144 91177308-0d34-0410-b5e6-96231b3b80d8
2009-11-30 18:35:03 +00:00
Anton Korobeynikov
5cdc3a949a Materialize global addresses via movt/movw pair, this is always better
than doing the same via constpool:
1. Load from constpool costs 3 cycles on A9, movt/movw pair - just 2.
2. Load from constpool might stall up to 300 cycles due to cache miss.
3. Movt/movw does not use load/store unit.
4. Less constpool entries => better compiler performance.

This is only enabled on ELF systems, since darwin does not have needed
relocations (yet).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@89720 91177308-0d34-0410-b5e6-96231b3b80d8
2009-11-24 00:44:37 +00:00
Bob Wilson
834b08af8d Add a target hook to allow changing the tail duplication limit based on the
contents of the block to be duplicated.  Use this for ARM Cortex A8/9 to
be more aggressive tail duplicating indirect branches, since it makes it
much more likely that they will be predicted in the branch target buffer.
Testcase coming soon.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@89187 91177308-0d34-0410-b5e6-96231b3b80d8
2009-11-18 03:34:27 +00:00
David Goodwin
87d21b92fc Allow target to specify regclass for which antideps will only be broken along the critical path.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@88682 91177308-0d34-0410-b5e6-96231b3b80d8
2009-11-13 19:52:48 +00:00
David Goodwin
c2e8a7e8d2 Fixed to address code review. No functional changes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@86634 91177308-0d34-0410-b5e6-96231b3b80d8
2009-11-10 00:48:55 +00:00
David Goodwin
0855dee564 Allow targets to specify register classes whose member registers should not be renamed to break anti-dependencies.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@86628 91177308-0d34-0410-b5e6-96231b3b80d8
2009-11-10 00:15:47 +00:00
David Goodwin
2e7be612d5 Break anti-dependence breaking out into its own class.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@85127 91177308-0d34-0410-b5e6-96231b3b80d8
2009-10-26 16:59:04 +00:00
David Goodwin
4c3715c2e5 Allow the target to select the level of anti-dependence breaking that should be performed by the post-RA scheduler. The default is none.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@84911 91177308-0d34-0410-b5e6-96231b3b80d8
2009-10-22 23:19:17 +00:00
Evan Cheng
fa16354e03 Change createPostRAScheduler so it can be turned off at llc -O1.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@84273 91177308-0d34-0410-b5e6-96231b3b80d8
2009-10-16 21:06:15 +00:00
David Goodwin
0dad89fa94 Remove -post-RA-schedule flag and add a TargetSubtarget method to enable post-register-allocation scheduling. By default it is off. For ARM, enable/disable with -mattr=+/-postrasched. Enable by default for cortex-a8.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83122 91177308-0d34-0410-b5e6-96231b3b80d8
2009-09-30 00:10:16 +00:00
Evan Cheng
63476a8040 Reference to hidden symbols do not have to go through non-lazy pointer in non-pic mode. rdar://7187172.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@80904 91177308-0d34-0410-b5e6-96231b3b80d8
2009-09-03 07:04:02 +00:00
Evan Cheng
e4e4ed3b56 Let Darwin linker auto-synthesize stubs and lazy-pointers. This deletes a bunch of nasty code in ARM asm printer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@80404 91177308-0d34-0410-b5e6-96231b3b80d8
2009-08-28 23:18:09 +00:00
Jim Grosbach
764ab52dd8 Whitespace cleanup. Remove trailing whitespace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@78666 91177308-0d34-0410-b5e6-96231b3b80d8
2009-08-11 15:33:49 +00:00
David Goodwin
1f0e404c87 By default, for cortex-a8 use NEON for single-precision FP.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@78200 91177308-0d34-0410-b5e6-96231b3b80d8
2009-08-05 16:01:19 +00:00
David Goodwin
42a83f2d15 Initial support for single-precision FP using NEON. Added "neonfp" attribute to enable. Added patterns for some binary FP operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@78081 91177308-0d34-0410-b5e6-96231b3b80d8
2009-08-04 17:53:06 +00:00
Daniel Dunbar
3be03406c9 Normalize Subtarget constructors to take a target triple string instead of
Module*.

Also, dropped uses of TargetMachine where unnecessary. The only target which
still takes a TargetMachine& is Mips, I would appreciate it if someone would
normalize this to match other targets.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@77918 91177308-0d34-0410-b5e6-96231b3b80d8
2009-08-02 22:11:08 +00:00
Evan Cheng
3147fb2cff isThumb2 really should mean thumb2 only, not thumb2+.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74871 91177308-0d34-0410-b5e6-96231b3b80d8
2009-07-06 22:29:14 +00:00
Evan Cheng
d770d9e7d1 Change the meaning of predicate hasThumb2 to mean thumb2 ISA is available, not that it's in thumb mode and thumb2 is available. Added isThumb2 predicate to replace the old predicate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74692 91177308-0d34-0410-b5e6-96231b3b80d8
2009-07-02 06:38:40 +00:00
Bob Wilson
e481f12749 Revert 74164. We'll want to use this method later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74176 91177308-0d34-0410-b5e6-96231b3b80d8
2009-06-25 16:03:07 +00:00
Bob Wilson
c9028e69f1 Remove unused hasV6T2Ops method. We already have a separate feature to
identify Thumb2.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74164 91177308-0d34-0410-b5e6-96231b3b80d8
2009-06-25 05:20:31 +00:00
Evan Cheng
8557c2bcb8 Latency information for ARM v6. It's rough and not yet hooked up. Right now we are only using branch latency to determine if-conversion limits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73747 91177308-0d34-0410-b5e6-96231b3b80d8
2009-06-19 01:51:50 +00:00
Evan Cheng
cd828618b8 Remove UseThumbBacktraces. Just check if subtarget is darwin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73734 91177308-0d34-0410-b5e6-96231b3b80d8
2009-06-18 23:14:30 +00:00
Anton Korobeynikov
bb62962342 Rename methods for the sake of consistency.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73428 91177308-0d34-0410-b5e6-96231b3b80d8
2009-06-15 21:46:20 +00:00
Anton Korobeynikov
fbbf1eeccf Separate V6 from V6T2 since the latter has some extra nice instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73085 91177308-0d34-0410-b5e6-96231b3b80d8
2009-06-08 21:20:36 +00:00
Anton Korobeynikov
a7b0ded2a2 Add helper for checking of Thumb1 mode
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73080 91177308-0d34-0410-b5e6-96231b3b80d8
2009-06-08 20:31:02 +00:00