llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-12-15 20:29:48 +00:00

Author	SHA1	Message	Date
Bob Wilson	84c5eed15b	This patch combines several changes from Evan Cheng for rdar://8659675. Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons. 1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause additional pipeline stall. So it's frequently better to single codegen vmul + vadd. 2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart. 3. A vmla followed vmla is a special case. Obvious issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either: vmul vadd vmla Instead, we want to expand the second vmla: vmla vmul vadd Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster. Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough but it isn't the optimial solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable. A. Add missing isel predicates which cause vmla to be codegen'ed. B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla. C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case). D. Add ARM hazard recognizer to model the vmla / vmls hazards. E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards. Enable these fp vmlx codegen changes for Cortex-A9. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129775 91177308-0d34-0410-b5e6-96231b3b80d8	2011-04-19 18:11:57 +00:00
Bob Wilson	cd70496ad1	Add -mcpu=cortex-a9-mp. It's cortex-a9 with MP extension. rdar://8648637. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129774 91177308-0d34-0410-b5e6-96231b3b80d8	2011-04-19 18:11:52 +00:00
Bob Wilson	5dde893c2b	Avoid some 's' 16-bit instruction which partially update CPSR (and add false dependency) when it isn't dependent on last CPSR defining instruction. rdar://8928208 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129773 91177308-0d34-0410-b5e6-96231b3b80d8	2011-04-19 18:11:49 +00:00
Evan Cheng	463d358f1d	Distribute (A + B) * C to (A * C) + (B * C) to make use of NEON multiplier accumulator forwarding: vadd d3, d0, d1 vmul d3, d3, d2 => vmul d3, d0, d2 vmla d3, d1, d2 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128665 91177308-0d34-0410-b5e6-96231b3b80d8	2011-03-31 19:38:48 +00:00
Bob Wilson	0406356cd4	Add Neon VCVT instructions for f32 <-> f16 conversions. Clang is now providing intrinsics for these and so we need to support them in the backend. Radar 8068427. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121902 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-15 22:14:12 +00:00
Evan Cheng	167be80ee7	Code clean up. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120965 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-05 23:03:45 +00:00
Evan Cheng	48575f6ea7	Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons. 1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause additional pipeline stall. So it's frequently better to single codegen vmul + vadd. 2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart. 3. A vmla followed vmla is a special case. Obvious issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either: vmul vadd vmla Instead, we want to expand the second vmla: vmla vmul vadd Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster. Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough but it isn't the optimial solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable. A. Add missing isel predicates which cause vmla to be codegen'ed. B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla. C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case). D. Add ARM hazard recognizer to model the vmla / vmls hazards. E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards. Work in progress, only A+B are enabled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120960 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-05 22:04:16 +00:00
Evan Cheng	529916ca4a	Add some missing isel predicates on def : pat patterns to avoid generating VFP vmla / vmls (they cause stalls). Disabling them in isel is properly not a right solution, I'll look into a proper solution next. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118922 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-12 20:32:20 +00:00
Evan Cheng	dfed19fe2c	Fix preload instruction isel. Only v7 supports pli, and only v7 with mp extension supports pldw. Add subtarget attribute to denote mp extension support and legalize illegal ones to nothing. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118160 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-03 06:34:55 +00:00
Bob Wilson	77f42b5278	PR8359: The ARM backend may end up allocating registers D16 to D31 when "-mattr=+vfp3" is specified. However, this will not work for hardware that only supports 16 registers. Add a new flag to support -"mattr=+vfp3,+d16". Patch by Jan Voung! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116310 91177308-0d34-0410-b5e6-96231b3b80d8	2010-10-12 16:22:47 +00:00
Jim Grosbach	2317e40539	Nuke it from orbit. It's the only way to be sure. (Kill the dead non-MC asm printer for the ARM target.) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115127 91177308-0d34-0410-b5e6-96231b3b80d8	2010-09-30 01:57:53 +00:00
Evan Cheng	3ef1c8759a	Teach if-converter to be more careful with predicating instructions that would take multiple cycles to decode. For the current if-converter clients (actually only ARM), the instructions that are predicated on false are not nops. They would still take machine cycles to decode. Micro-coded instructions such as LDM / STM can potentially take multiple cycles to decode. If-converter should take treat them as non-micro-coded simple instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113570 91177308-0d34-0410-b5e6-96231b3b80d8	2010-09-10 01:29:16 +00:00
Jim Grosbach	c5ed0134a7	80 column cleanup. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111266 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-17 18:39:16 +00:00
Chris Lattner	23e70ebf35	fix emacs language spec's, patch by Edmund Grimley-Evans! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111241 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-17 16:20:04 +00:00
Jim Grosbach	fcba5e6b64	cortex m4 has floating point support, but only single precision. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110810 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-11 15:44:15 +00:00
Evan Cheng	7b4d31176e	Report error if codegen tries to instantiate a ARM target when the cpu does support it. e.g. cortex-m* processors. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110798 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-11 07:17:46 +00:00
Evan Cheng	8d62e713ea	ArchV7M implies HW division instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110797 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-11 07:00:16 +00:00
Evan Cheng	cb5ce6e62b	ArchV6T2, V7A, and V7M implies Thumb2; Archv7A implies NEON. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110796 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-11 06:57:53 +00:00
Evan Cheng	d6b4632256	Add ARM Archv6M and let it implies FeatureDB (having dmb, etc.) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110795 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-11 06:51:54 +00:00
Evan Cheng	c7569ed4e4	Add Cortex-M0 support. It's a ARMv6m device (no ARM mode) with some 32-bit instructions: dmb, dsb, isb, msr, and mrs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110786 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-11 06:30:38 +00:00
Evan Cheng	11db068721	- Add subtarget feature -mattr=+db which determine whether an ARM cpu has the memory and synchronization barrier dmb and dsb instructions. - Change instruction names to something more sensible (matching name of actual instructions). - Added tests for memory barrier codegen. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110785 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-11 06:22:01 +00:00
Evan Cheng	e44be63816	Change -prefer-32bit-thumb to attribute -mattr=+32bit instead to disable more 32-bit to 16-bit optimizations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110584 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-09 18:35:19 +00:00
Evan Cheng	7a41599962	Add an ARM "feature". Cortex-a8 fp comparison is very slow (> 20 cycles). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108256 91177308-0d34-0410-b5e6-96231b3b80d8	2010-07-13 19:21:50 +00:00
Jim Grosbach	29402132f3	Cleanup of ARMv7M support. Move hardware divide and Thumb2 extract/pack instructions to subtarget features and update tests to reflect. PR5717. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103136 91177308-0d34-0410-b5e6-96231b3b80d8	2010-05-05 23:44:43 +00:00
Jim Grosbach	b1dc393bd5	Add initial support for ARMv7M subtarget and cortex-m3 cpu. Patch by Jordy <snhjordy@gmail.com>. Followup patches will add some tests and adjust to use Subtarget features for the instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103119 91177308-0d34-0410-b5e6-96231b3b80d8	2010-05-05 20:44:35 +00:00
Anton Korobeynikov	2eeeff8371	Some bits of A9 scheduling: VFP git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100643 91177308-0d34-0410-b5e6-96231b3b80d8	2010-04-07 18:19:18 +00:00
Jakob Stoklund Olesen	fddb7667ca	Replace TSFlagsFields and TSFlagsShifts with a simpler TSFlags field. When a target instruction wants to set target-specific flags, it should simply set bits in the TSFlags bit vector defined in the Instruction TableGen class. This works well because TableGen resolves member references late: class I : Instruction { AddrMode AM = AddrModeNone; let TSFlags{3-0} = AM.Value; } let AM = AddrMode4 in def ADD : I; TSFlags gets the expected bits from AddrMode4 in this example. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100384 91177308-0d34-0410-b5e6-96231b3b80d8	2010-04-05 03:10:20 +00:00
Jim Grosbach	1118b5e490	vml[as] are slow on 1136jf-s also. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100066 91177308-0d34-0410-b5e6-96231b3b80d8	2010-04-01 00:13:43 +00:00
Jim Grosbach	7ec7a0e96b	switch the flag for using NEON for SP floating point to a subtarget 'feature'. Re-commit. This time complete with testsuite updates. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99570 91177308-0d34-0410-b5e6-96231b3b80d8	2010-03-25 23:47:34 +00:00
Jim Grosbach	78e496e165	need to fix 'make check' tests first. revert for a moment. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99569 91177308-0d34-0410-b5e6-96231b3b80d8	2010-03-25 23:34:05 +00:00
Jim Grosbach	bd17bc96bf	switch the flag for using NEON for SP floating point to a subtarget 'feature' git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99568 91177308-0d34-0410-b5e6-96231b3b80d8	2010-03-25 23:32:19 +00:00
Jim Grosbach	6b2e8dc9a0	switch the use-vml[as] instructions flag to a subtarget 'feature' git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99565 91177308-0d34-0410-b5e6-96231b3b80d8	2010-03-25 23:11:16 +00:00
Anton Korobeynikov	631379e79c	Add substarget feature for FP16 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@98503 91177308-0d34-0410-b5e6-96231b3b80d8	2010-03-14 18:42:38 +00:00
David Goodwin	ebb5cb9216	Add ARMv6 itineraries. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@89218 91177308-0d34-0410-b5e6-96231b3b80d8	2009-11-18 18:39:57 +00:00
Anton Korobeynikov	f95215f551	Use NEON reg-reg moves, where profitable. This reduces "domain-cross" stalls, when we used to mix vfp and neon code (the former were used for reg-reg moves) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@85764 91177308-0d34-0410-b5e6-96231b3b80d8	2009-11-02 00:10:38 +00:00
David Goodwin	9843a93e83	Remove neonfp attribute and instead set default based on CPU string. Add -arm-use-neon-fp to override the default. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83218 91177308-0d34-0410-b5e6-96231b3b80d8	2009-10-01 22:19:57 +00:00
David Goodwin	471850ab84	Restore the -post-RA-scheduler flag as an override for the target specification. Remove -mattr for setting PostRAScheduler enable and instead use CPU string. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83215 91177308-0d34-0410-b5e6-96231b3b80d8	2009-10-01 21:46:35 +00:00
David Goodwin	0dad89fa94	Remove -post-RA-schedule flag and add a TargetSubtarget method to enable post-register-allocation scheduling. By default it is off. For ARM, enable/disable with -mattr=+/-postrasched. Enable by default for cortex-a8. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@83122 91177308-0d34-0410-b5e6-96231b3b80d8	2009-09-30 00:10:16 +00:00
David Goodwin	127221fbdc	Checkpoint NEON scheduling itineraries. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@82657 91177308-0d34-0410-b5e6-96231b3b80d8	2009-09-23 21:38:08 +00:00
David Goodwin	546952fd60	Allow a zero cycle stage to reserve/require a FU without advancing the cycle counter. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@78736 91177308-0d34-0410-b5e6-96231b3b80d8	2009-08-11 22:38:43 +00:00
David Goodwin	767a952a6f	Make NEON single-precision FP support the default for cortex-a8 (again). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@78430 91177308-0d34-0410-b5e6-96231b3b80d8	2009-08-07 23:32:33 +00:00
David Goodwin	ce3c1f2a0e	Disable NEON single-precision FP support for Cortex-A8, for now... git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@78209 91177308-0d34-0410-b5e6-96231b3b80d8	2009-08-05 16:40:57 +00:00
David Goodwin	1f0e404c87	By default, for cortex-a8 use NEON for single-precision FP. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@78200 91177308-0d34-0410-b5e6-96231b3b80d8	2009-08-05 16:01:19 +00:00
David Goodwin	42a83f2d15	Initial support for single-precision FP using NEON. Added "neonfp" attribute to enable. Added patterns for some binary FP operations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@78081 91177308-0d34-0410-b5e6-96231b3b80d8	2009-08-04 17:53:06 +00:00
Evan Cheng	6762d91c05	Add fake v7 itineraries for now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@76612 91177308-0d34-0410-b5e6-96231b3b80d8	2009-07-21 18:54:14 +00:00
Evan Cheng	34a0fa362d	Add a Thumb2 instruction flag to that indicates whether the instruction can be transformed to 16-bit variant. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@74988 91177308-0d34-0410-b5e6-96231b3b80d8	2009-07-08 01:46:35 +00:00
Evan Cheng	8557c2bcb8	Latency information for ARM v6. It's rough and not yet hooked up. Right now we are only using branch latency to determine if-conversion limits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73747 91177308-0d34-0410-b5e6-96231b3b80d8	2009-06-19 01:51:50 +00:00
Anton Korobeynikov	fbbf1eeccf	Separate V6 from V6T2 since the latter has some extra nice instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@73085 91177308-0d34-0410-b5e6-96231b3b80d8	2009-06-08 21:20:36 +00:00
Anton Korobeynikov	d4022c3fbb	Add placeholder for thumb2 stuff git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72593 91177308-0d34-0410-b5e6-96231b3b80d8	2009-05-29 23:41:08 +00:00
Anton Korobeynikov	6d7d2aa38a	Add ARMv7 architecture, Cortex processors and different FPU modes handling. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@72337 91177308-0d34-0410-b5e6-96231b3b80d8	2009-05-23 19:51:43 +00:00

1 2

61 Commits