llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-11-11 23:05:31 +00:00

Author	SHA1	Message	Date
Anton Korobeynikov	244455e6d6	Peephole optimization for ABS on ARM. Patch by Ana Pazos! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141365 91177308-0d34-0410-b5e6-96231b3b80d8	2011-10-07 16:15:08 +00:00
Cameron Zwarich	8f8aa815b4	Always merge profitable shifts on A9, not just when they have a single use. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141248 91177308-0d34-0410-b5e6-96231b3b80d8	2011-10-05 23:39:02 +00:00
Cameron Zwarich	d78ebe1e12	Remove a check from ARM shifted operand isel helper methods, which were blocking merging an lsl #2 that has multiple uses on A9. This shift is free, so there is no problem merging it in multiple places. Other unprofitable shifts will not be merged. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141247 91177308-0d34-0410-b5e6-96231b3b80d8	2011-10-05 23:38:50 +00:00
Cameron Zwarich	fb77752253	Add braces around something that throws me for a loop. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141173 91177308-0d34-0410-b5e6-96231b3b80d8	2011-10-05 08:59:10 +00:00
Cameron Zwarich	5ee0262014	There is no point in setting out-parameters for a ComplexPattern function when it returns false, at least as far as I could tell by reading the code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@141172 91177308-0d34-0410-b5e6-96231b3b80d8	2011-10-05 08:59:05 +00:00
Jakob Stoklund Olesen	11ebe3d7c1	Also match negative offsets for addrmode3 and addrmode5. Math is hard, and isScaledConstantInRange() always returned false for negative constants. It was doing unsigned division of negative numbers before casting back to signed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@140425 91177308-0d34-0410-b5e6-96231b3b80d8	2011-09-23 22:10:33 +00:00
Jim Grosbach	b04546ff5b	Tidy up a few 80 column violations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139636 91177308-0d34-0410-b5e6-96231b3b80d8	2011-09-13 20:30:37 +00:00
Owen Anderson	d84192fe4f	When performing instruction selection for LDR_PRE_IMM/LDRB_PRE_IMM, we still need to preserve the sign of the index. This fixes miscompilations of Quicksort in the nightly testsuite, and hopefully others as well. <rdar://problem/10046188> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138885 91177308-0d34-0410-b5e6-96231b3b80d8	2011-08-31 20:00:11 +00:00
Eli Friedman	4d3f329453	64-bit atomic cmpxchg for ARM. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138868 91177308-0d34-0410-b5e6-96231b3b80d8	2011-08-31 17:52:22 +00:00
Eli Friedman	2bdffe4882	Some 64-bit atomic operations on ARM. 64-bit cmpxchg coming next. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138845 91177308-0d34-0410-b5e6-96231b3b80d8	2011-08-31 00:31:29 +00:00
Owen Anderson	c4e16de765	addrmode_imm12 and addrmode2_offset encode their immediate values differently. Update the manual instruction selection code that was encoding them the addrmode2 way even though LDR_PRE_IMM/LDRB_PRE_IMM had switched to addrmode_imm12. Should fix a number of nightly test failures. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138758 91177308-0d34-0410-b5e6-96231b3b80d8	2011-08-29 20:16:50 +00:00
Owen Anderson	2b568fb3ce	Fix ARM codegen breakage caused by r138653. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138657 91177308-0d34-0410-b5e6-96231b3b80d8	2011-08-26 21:12:37 +00:00
Owen Anderson	9ab0f25fc1	invalid-LDR_PRE-arm.txt was already passing, but for the wrong reasons. We were failing to specify enough fixed bits of LDR_PRE/LDRB_PRE, resulting in decoding conflicts. Separate them into immediate vs. register versions, allowing us to specify the necessary fixed bits. This in turn results in the test being decoded properly, and being rejected as UNPREDICTABLE rather than a hard failure. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138653 91177308-0d34-0410-b5e6-96231b3b80d8	2011-08-26 20:43:14 +00:00
Jim Grosbach	5b81584f74	Thumb1 ADD/SUB SP instructions are predicable in Thumb2 mode. Add the predicate operand to the instructions. Update the back end accordingly where the instructions are used. Restrict the SP operands to actually only be SP, as otherwise these break assembly parsing for the normal instruction variants. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@138445 91177308-0d34-0410-b5e6-96231b3b80d8	2011-08-24 17:46:13 +00:00
Jim Grosbach	19dec207fc	ARM refactor indexed store instructions. Refactor STR[B] pre and post indexed instructions to use addressing modes for memory operands, which is necessary for assembly parsing and is more consistent with the rest of the memory instruction definitions. Make some incremental progress on refactoring away the mega-operand addrmode2 along the way, which is nice. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136978 91177308-0d34-0410-b5e6-96231b3b80d8	2011-08-05 20:35:44 +00:00
Jim Grosbach	fb8989e640	ARM parsing and encoding of SBFX and UBFX. Encode the width operand as it encodes in the instruction, which simplifies the disassembler and the encoder, by using the imm1_32 operand def. Add a diagnostic for the context-sensitive constraint that the width must be in the range [1,32-lsb]. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136264 91177308-0d34-0410-b5e6-96231b3b80d8	2011-07-27 21:09:25 +00:00
Owen Anderson	793e79601f	Split am2offset into register addend and immediate addend forms, necessary for allowing the fixed-length disassembler to distinguish between SBFX and STR_PRE. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@136141 91177308-0d34-0410-b5e6-96231b3b80d8	2011-07-26 20:54:26 +00:00
Owen Anderson	e0a03143df	Fix test failures caused by my so_reg refactoring. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135785 91177308-0d34-0410-b5e6-96231b3b80d8	2011-07-22 18:30:30 +00:00
Owen Anderson	152d4a4bb6	Get rid of the extraneous GPR operand on so_reg_imm operands, which in turn necessitates a lot of changes to related bits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135722 91177308-0d34-0410-b5e6-96231b3b80d8	2011-07-21 23:38:37 +00:00
Owen Anderson	92a202213b	Split up the ARM so_reg ComplexPattern into so_reg_reg and so_reg_imm, allowing us to distinguish the encodings that use shifted registers from those that use shifted immediates. This is necessary to allow the fixed-length decoder to distinguish things like BICS vs LDRH. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135693 91177308-0d34-0410-b5e6-96231b3b80d8	2011-07-21 18:54:16 +00:00
Evan Cheng	ee04a6d3a4	Sink ARMMCExpr and ARMAddressingModes into MC layer. First step to separate ARM MC code from target. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@135636 91177308-0d34-0410-b5e6-96231b3b80d8	2011-07-20 23:34:39 +00:00
Evan Cheng	e837dead3c	- Rename TargetInstrDesc, TargetOperandInfo to MCInstrDesc and MCOperandInfo and sink them into MC layer. - Added MCInstrInfo, which captures the tablegen generated static data. Change TargetInstrInfo so it's based off MCInstrInfo. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134021 91177308-0d34-0410-b5e6-96231b3b80d8	2011-06-28 19:10:37 +00:00
Owen Anderson	1300f3019e	Change the REG_SEQUENCE SDNode to take an explicit register class ID as its first operand. This operand is lowered away by the time we reach MachineInstrs, so the actual register-allocation handling of them doesn't need to change. This is intended to support using REG_SEQUENCE SDNodes with type MVT::untyped, and is part of the long road to eliminating some of the hacks we currently use to support register pairs and other strange constraints, particularly on ARM NEON. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133178 91177308-0d34-0410-b5e6-96231b3b80d8	2011-06-16 18:17:13 +00:00
Bruno Cardoso Lopes	a0112d0c39	Add support for ARM ldrexd/strexd intrinsics. They both use i32 register pairs to load/store i64 values. Since there's no current support to explicitly declare such restrictions, implement it by using specific hardcoded register pairs during isel. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132248 91177308-0d34-0410-b5e6-96231b3b80d8	2011-05-28 04:07:29 +00:00
Eli Friedman	b451770b26	Zap a couple now-unused functions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@130557 91177308-0d34-0410-b5e6-96231b3b80d8	2011-04-29 22:56:48 +00:00
Bob Wilson	84c5eed15b	This patch combines several changes from Evan Cheng for rdar://8659675. Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons. 1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause additional pipeline stall. So it's frequently better to single codegen vmul + vadd. 2. A vmla followed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart. 3. A vmla followed by a vmla is a special case. Obviously, issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either: vmul vadd vmla Instead, we want to expand the second vmla: vmla vmul vadd Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster. Up to now, isel simply avoided codegen'ing fp vmla / vmls. This works well enough but it isn't the optimal solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable. A. Add missing isel predicates which cause vmla to be codegen'ed. B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla. C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case). D. Add ARM hazard recognizer to model the vmla / vmls hazards. E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards. Enable these fp vmlx codegen changes for Cortex-A9. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129775 91177308-0d34-0410-b5e6-96231b3b80d8	2011-04-19 18:11:57 +00:00
Evan Cheng	b58a340fa2	Do not lose mem_operands while lowering VLD / VST intrinsics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129738 91177308-0d34-0410-b5e6-96231b3b80d8	2011-04-19 00:04:03 +00:00
Owen Anderson	099e5553eb	Reduce code duplication. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127899 91177308-0d34-0410-b5e6-96231b3b80d8	2011-03-18 19:46:58 +00:00
Bill Wendling	69a05a7b92	Generate a VTBL instruction instead of a series of loads and stores when we can. As Nate pointed out, VTBL isn't super performant, but it has to be better than this: _shuf: @ BB#0: @ %entry push {r4, r7, lr} add r7, sp, #4 sub sp, #12 mov r4, sp bic r4, r4, #7 mov sp, r4 mov r2, sp vmov d16, r0, r1 orr r0, r2, #6 orr r3, r2, #7 vst1.8 {d16[0]}, [r3] vst1.8 {d16[5]}, [r0] subs r4, r7, #4 orr r0, r2, #5 vst1.8 {d16[4]}, [r0] orr r0, r2, #4 vst1.8 {d16[4]}, [r0] orr r0, r2, #3 vst1.8 {d16[0]}, [r0] orr r0, r2, #2 vst1.8 {d16[2]}, [r0] orr r0, r2, #1 vst1.8 {d16[1]}, [r0] vst1.8 {d16[3]}, [r2] vldr.64 d16, [sp] vmov r0, r1, d16 mov sp, r4 pop {r4, r7, pc} The "illegal" testcase in vext.ll is no longer illegal. <rdar://problem/9078775> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127630 91177308-0d34-0410-b5e6-96231b3b80d8	2011-03-14 23:02:38 +00:00
Jim Grosbach	3c5edaaf59	Remove dead code. These ARM instruction definitions no longer exist. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127509 91177308-0d34-0410-b5e6-96231b3b80d8	2011-03-11 23:15:02 +00:00
Bob Wilson	4faa0e1952	Remove unused conditional negate operations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127090 91177308-0d34-0410-b5e6-96231b3b80d8	2011-03-05 16:54:31 +00:00
Bob Wilson	da52506792	Add patterns to use post-increment addressing for Neon VST1-lane instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126477 91177308-0d34-0410-b5e6-96231b3b80d8	2011-02-25 06:42:42 +00:00
Chris Lattner	0a9481f44f	Enhance ComputeMaskedBits to know that aligned frameindexes have their low bits set to zero. This allows us to optimize out explicit stack alignment code like in stack-align.ll:test4 when it is redundant. Doing this causes the code generator to start turning FI+cst into FI\|cst all over the place, which is general goodness (that is the canonical form) except that various pieces of the code generator don't handle OR aggressively. Fix this by introducing a new SelectionDAG::isBaseWithConstantOffset predicate, and using it in places that are looking for ADD(X,CST). The ARM backend in particular was missing a lot of addressing mode folding opportunities around OR. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125470 91177308-0d34-0410-b5e6-96231b3b80d8	2011-02-13 22:25:43 +00:00
Bob Wilson	1c3ef90cab	Add codegen support for using post-increment NEON load/store instructions. The vld1-lane, vld1-dup and vst1-lane instructions do not yet support using post-increment versions, but all the rest of the NEON load/store instructions should be handled now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125014 91177308-0d34-0410-b5e6-96231b3b80d8	2011-02-07 17:43:21 +00:00
Bob Wilson	7de6814405	Change VLD3/4 and VST3/4 for quad registers to not update the address register. These operations are expanded to pairs of loads or stores, and the first one uses the address register update to produce the address for the second one. So far, the second load/store has also updated the address register, just for convenience, since that output has never been used. In anticipation of actually supporting post-increment updates for these operations, this changes the non-updating operations to use a non-updating load/store for the second instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125013 91177308-0d34-0410-b5e6-96231b3b80d8	2011-02-07 17:43:15 +00:00
Evan Cheng	9fe2009956	Sorry, several patches in one. TargetInstrInfo: Change produceSameValue() to take MachineRegisterInfo as an optional argument. When in SSA form, targets can use it to make more aggressive equality analysis. Machine LICM: 1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead. 2. Fix a bug which prevented CSE of instructions which are not re-materializable. 3. Use improved form of produceSameValue. ARM: 1. Teach ARM produceSameValue to look past some PIC labels. 2. Look for operands from different loads of different constant pool entries which have same values. 3. Re-implement PIC GA materialization using movw + movt. Combine the pair with a "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible to re-materialize the instruction, allows machine LICM to hoist the set of instructions out of the loop and makes it possible to CSE them. It's a bit hacky, but it significantly improves code quality. 4. Some minor bug fixes as well. With the fixes, using movw + movt to materialize GAs significantly outperforms the load from constantpool method. 186.crafty and 255.vortex improved > 20%, 254.gap and 176.gcc ~10%. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123905 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-20 08:34:58 +00:00
Daniel Dunbar	ec91d52a77	ARM/ISel: Factor out isScaledConstantInRange() helper. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123823 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-19 15:12:16 +00:00
Evan Cheng	5de5d4b6d0	Materialize GA addresses with movw + movt pairs for Darwin in PIC mode. e.g. movw r0, :lower16:(L_foo$non_lazy_ptr-(LPC0_0+4)) movt r0, :upper16:(L_foo$non_lazy_ptr-(LPC0_0+4)) LPC0_0: add r0, pc, r0 It's not yet enabled by default as some tests are failing. I suspect bugs in down stream tools. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123619 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-17 08:03:18 +00:00
Anton Korobeynikov	4d72860835	Model operand restrictions of mul-like instructions on ARMv5 via earlyclobber stuff. This should fix PRs 2313 and 8157. Unfortunately, no testcase, since it'd be dependent on register assignments. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122663 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-01 20:38:38 +00:00
Andrew Trick	6e8f4c4048	whitespace git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122539 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-24 04:28:06 +00:00
Chris Lattner	f1b4eafbfe	rename MVT::Flag to MVT::Glue. "Flag" is a terrible name for something that just glues two nodes together, even if it is sometimes used for flags. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122310 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-21 02:38:05 +00:00
Bob Wilson	a1f544b62e	Use PairDRegs to implement ConcatVectors. No functionality change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122017 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-17 01:21:08 +00:00
Jim Grosbach	3e333637f1	Thumb1 had two patterns for the same load-from-constant-pool instruction. Canonicalize on tLDRpci and remove tLDRcp. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121920 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-15 23:52:36 +00:00
Bill Wendling	bc4224bc6b	Reapply r121808 now that the missing patterns have been supplied. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121820 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-15 01:03:19 +00:00
Bill Wendling	7d1d8db54a	Revert r121808 until I can fix the build. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121815 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-15 00:04:00 +00:00
Bill Wendling	2af0fd3fee	Make the ISel selections for LDR/STR the same as before the LDRr/LDRi split. In particular, we want ldr r2, [r3] to be equivalent to ldr r2, [r3, #0] and not ldr r2, [r3, r0] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121808 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-14 23:40:49 +00:00
Bill Wendling	f4caf69720	The tLDR et al instructions were emitting either a reg/reg or reg/imm instruction based on the t_addrmode_s# mode and what it returned. There is some obvious badness to this. In particular, it's hard to do MC-encoding when the instruction may change out from underneath you after the t_addrmode_s# variable is finally resolved. The solution is to revert a long-ago change that merged the reg/reg and reg/imm versions. There is the addition of several new addressing modes. They no longer have extraneous operands associated with them. I.e., if it's reg/reg we don't have to have a dummy zero immediate tacked on to the SDNode. There are some obvious cleanups here, which will happen shortly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121747 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-14 03:36:38 +00:00
Bob Wilson	a92bac64cb	Fix some invalid alignments for Neon vld-dup and vld/st-lane instructions. Alignments smaller than the total size of the memory being loaded or stored, unless the alignment is 8 bytes, are not allowed. Add tests for this, too. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121506 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-10 19:37:42 +00:00
Evan Cheng	48575f6ea7	Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons. 1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause additional pipeline stall. So it's frequently better to single codegen vmul + vadd. 2. A vmla followed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart. 3. A vmla followed by a vmla is a special case. Obviously, issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either: vmul vadd vmla Instead, we want to expand the second vmla: vmla vmul vadd Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster. Up to now, isel simply avoided codegen'ing fp vmla / vmls. This works well enough but it isn't the optimal solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable. A. Add missing isel predicates which cause vmla to be codegen'ed. B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla. C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case). D. Add ARM hazard recognizer to model the vmla / vmls hazards. E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards. Work in progress, only A+B are enabled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120960 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-05 22:04:16 +00:00
Bob Wilson	6c4c982f83	Add support for NEON VLD3-dup instructions. The encoding for alignment in VLD4-dup instructions is still a work in progress. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120356 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-30 00:00:35 +00:00

390 Commits
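
Commit 11ebe3d7c1 above says that isScaledConstantInRange() rejected negative constants because it performed unsigned division on them before casting back to signed. The sketch below illustrates that class of bug in isolation. It is not the LLVM source: the function names, the plain int parameters, and the [-256, 256) range used in the usage note are assumptions made for illustration, and it assumes a 32-bit int and a scale greater than 1.

```cpp
#include <cassert>

// Hypothetical stand-in for the pre-fix behaviour: because the scale is
// unsigned, the constant is treated as unsigned for both the remainder
// check and the division.
static bool scaledInRangeUnsignedScale(int constant, unsigned scale,
                                       int rangeMin, int rangeMax,
                                       int &scaled) {
  assert(scale != 0 && "invalid scale");
  unsigned asUnsigned = static_cast<unsigned>(constant);
  if (asUnsigned % scale != 0)
    return false;
  // Unsigned division: a negative input turns into a large positive
  // quotient, which then fails the range check.
  scaled = static_cast<int>(asUnsigned / scale);
  return scaled >= rangeMin && scaled < rangeMax;
}

// Hypothetical stand-in for the corrected behaviour: the scale is signed,
// so the remainder and the division keep the sign of the constant.
static bool scaledInRangeSignedScale(int constant, int scale,
                                     int rangeMin, int rangeMax,
                                     int &scaled) {
  assert(scale > 0 && "invalid scale");
  if (constant % scale != 0)
    return false;
  scaled = constant / scale;
  return scaled >= rangeMin && scaled < rangeMax;
}
```

With a constant of -8, a scale of 4, and the assumed range [-256, 256), the unsigned variant computes 4294967288 / 4 = 1073741822 and rejects the offset, while the signed variant computes -2 and accepts it. Positive constants behave the same in both variants, which is consistent with the bug going unnoticed until negative offsets were tested.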