llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-12-14 11:32:34 +00:00

Author	SHA1	Message	Date
Bob Wilson	4711d5cda3	Remove the rest of the _sfp Neon instruction patterns. Use the same COPY_TO_REGCLASS approach as for the 2-register _sfp instructions. This change made a big difference in the code generated for the CodeGen/Thumb2/cross-rc-coalescing-2.ll test: The coalescer is still doing a fine job, but some instructions that were previously moved outside the loop are not moved now. It's using fewer VFP registers now, which is generally a good thing, so I think the estimates for register pressure changed and that affected the LICM behavior. Since that isn't obviously wrong, I've just changed the test file. This completes the work for Radar 8711675. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121730 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-13 23:02:37 +00:00
Evan Cheng	a9688c4b57	(or (and (shl A, #shamt), mask), B) => ARMbfi B, A, ~mask where lsb(mask) == #shamt. rdar://8752056 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121606 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-11 04:11:38 +00:00
Jim Grosbach	c6f9261711	ARM stm/ldm instructions require more than one register in the register list. Otherwise, a plain str/ldr should be used instead. Make sure we account for that in prologue/epilogue code generation. rdar://8745460 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121391 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-09 18:31:13 +00:00
Bob Wilson	c24130bade	The Thumb tADDrSPi instruction is not valid when the destination is SP. Check for that and try narrowing it to tADDspi instead. Radar 8724703. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120892 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-04 04:40:19 +00:00
Jim Grosbach	41ad0c4c73	When using the 'push' mnemonic for Thumb2 stmdb, be explicit when it's the 32-bit wide version by adding the .w suffix. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120838 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-03 20:33:01 +00:00
Owen Anderson	9d63d90de5	Add correct encodings for STRD and LDRD, including fixup support. Additionally, update these to unified syntax. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120589 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-01 19:18:46 +00:00
Evan Cheng	ab5c703fdb	Fix epilogue codegen to avoid leaving the stack pointer in an invalid state. Previously Thumb2 would restore sp from fp like this: mov sp, r7 sub, sp, #4 If an interrupt is taken after the 'mov' but before the 'sub', callee-saved registers might be clobbered by the interrupt handler. Instead, try restoring directly from sp: add sp, #4 Or, if necessary (with VLA, etc.) use a scratch register to compute sp and then restore it: sub.w r4, r7, #8 mov sp, r7 rdar://8465407 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119977 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-22 18:12:04 +00:00
Eric Christopher	8b3ca6216d	Rewrite stack callee saved spills and restores to use push/pop instructions. Remove movePastCSLoadStoreOps and associated code for simple pointer increments. Update routines that depended upon other opcodes for save/restore. Adjust all testcases accordingly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119725 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-18 19:40:05 +00:00
Dale Johannesen	8abe08d7f9	These tests are looking for library function names that appear to differ on Linux. Try to make them pass on Linux. Would be good for a Linux person to review this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119572 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-17 21:57:32 +00:00
Evan Cheng	c4af4638df	Remove ARM isel hacks that fold large immediates into a pair of add, sub, and, and xor. The 32-bit move immediates can be hoisted out of loops by machine LICM but the isel hacks were preventing them. Instead, let peephole optimization pass recognize registers that are defined by immediates and the ARM target hook will fold the immediates in. Other changes include 1) do not fold and / xor into cmp to isel TST / TEQ instructions if there are multiple uses. This happens when the 'and' is live out, machine sink would have sinked the computation and that ends up pessimizing code. The peephole pass would recognize situations where the 'and' can be toggled to define CPSR and eliminate the comparison anyway. 2) Move peephole pass to after machine LICM, sink, and CSE to avoid blocking important optimizations. rdar://8663787, rdar://8241368 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119548 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-17 20:13:28 +00:00
Evan Cheng	8239daf7c8	Two sets of changes. Sorry they are intermingled. 1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to "optimize for latency". Call instructions don't have the right latency and this is more likely to use introduce spills. 2. Fix if-converter cost function. For ARM, it should use instruction latencies, not # of micro-ops since multi-latency instructions is completely executed even when the predicate is false. Also, some instruction will be "slower" when they are predicated due to the register def becoming implicit input. rdar://8598427 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118135 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-03 00:45:17 +00:00
Jim Grosbach	ab3d00e535	Revert r114340 (improvements in Darwin function prologue/epilogue), as it broke assumptions about stack layout. Specifically, LR must be saved next to FP. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118026 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-02 17:35:25 +00:00
Bob Wilson	f74a429816	Overhaul memory barriers in the ARM backend. Radar 8601999. There were a number of issues to fix up here: * The "device" argument of the llvm.memory.barrier intrinsic should be used to distinguish the "Full System" domain from the "Inner Shareable" domain. It has nothing to do with using DMB vs. DSB instructions. * The compiler should never need to emit DSB instructions. Remove the ARMISD::SYNCBARRIER node and also remove the instruction patterns for DSB. * Merge the separate DMB/DSB instructions for options only used for the disassembler with the default DMB/DSB instructions. Add the default "full system" option ARM_MB::SY to the ARM_MB::MemBOpt enum. * Add a separate ARMISD::MEMBARRIER_MCR node for subtargets that implement a data memory barrier using the MCR instruction. * Fix up encodings for these instructions (except MCR). I also updated the tests and added a few new ones to check for DMB options that were not currently being exercised. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117756 91177308-0d34-0410-b5e6-96231b3b80d8	2010-10-30 00:54:37 +00:00
Evan Cheng	089751535d	Avoiding overly aggressive latency scheduling. If the two nodes share an operand and one of them has a single use that is a live out copy, favor the one that is live out. Otherwise it will be difficult to eliminate the copy if the instruction is a loop induction variable update. e.g. BB: sub r1, r3, #1 str r0, [r2, r3] mov r3, r1 cmp bne BB => BB: str r0, [r2, r3] sub r3, r3, #1 cmp bne BB This fixed the recent 256.bzip2 regression. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117675 91177308-0d34-0410-b5e6-96231b3b80d8	2010-10-29 18:09:28 +00:00
Evan Cheng	134982daa9	More accurate estimate / tracking of register pressure. - Initial register pressure in the loop should be all the live defs into the loop. Not just those from loop preheader which is often empty. - When an instruction is hoisted, update register pressure from loop preheader to the original BB. - Treat only use of a virtual register as kill since the code is still SSA. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116956 91177308-0d34-0410-b5e6-96231b3b80d8	2010-10-20 22:03:58 +00:00
Dale Johannesen	e4d31593c5	Fix crash introduced in 116852. 8573915. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116955 91177308-0d34-0410-b5e6-96231b3b80d8	2010-10-20 22:03:37 +00:00
Dale Johannesen	575cd148ce	Enable using vdup for vector constants which are splat of integers by default, and remove the controlling flag, now that LICM will hoist such vdup's. 8003375. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116852 91177308-0d34-0410-b5e6-96231b3b80d8	2010-10-19 20:00:17 +00:00
Evan Cheng	2312842de0	Re-enable register pressure aware machine licm with fixes. Hoist() may have erased the instruction during LICM so UpdateRegPressureAfter() should not reference it afterwards. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116845 91177308-0d34-0410-b5e6-96231b3b80d8	2010-10-19 18:58:51 +00:00
Daniel Dunbar	9869413802	Revert r116781 "- Add a hook for target to determine whether an instruction def is", which breaks some nightly tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116816 91177308-0d34-0410-b5e6-96231b3b80d8	2010-10-19 17:14:24 +00:00
Evan Cheng	11e8b74a7a	- Add a hook for target to determine whether an instruction def is "long latency" enough to hoist even if it may increase spilling. Reloading a value from spill slot is often cheaper than performing an expensive computation in the loop. For X86, that means machine LICM will hoist SQRT, DIV, etc. ARM will be somewhat aggressive with VFP and NEON instructions. - Enable register pressure aware machine LICM by default. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116781 91177308-0d34-0410-b5e6-96231b3b80d8	2010-10-19 00:55:07 +00:00
Bob Wilson	7d24705f65	Change register allocation order for ARM VFP and NEON registers to put the callee-saved registers at the end of the lists. Also prefer to avoid using the low registers that are in register subclasses required by certain instructions, so that those registers will more likely be available when needed. This change makes a huge improvement in spilling in some cases. Thanks to Jakob for helping me realize the problem. Most of this patch is fixing the testsuite. There are quite a few places where we're checking for specific registers. I changed those to wildcards in places where that doesn't weaken the tests. The spill-q.ll and thumb2-spill-q.ll tests stopped spilling with this change, so I added a bunch of live values to force spills on those tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116055 91177308-0d34-0410-b5e6-96231b3b80d8	2010-10-08 06:15:13 +00:00
Owen Anderson	8614167572	Enable target-specific mul-lowering on ARM, even at -Os. Remove a test that this makes irrelevant, but add a new test for the new, improved functionality. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114494 91177308-0d34-0410-b5e6-96231b3b80d8	2010-09-21 22:51:46 +00:00
Jim Grosbach	1dc335a79f	Simplify ARM callee-saved register handling by removing the distinction between the high and low registers for prologue/epilogue code. This was a Darwin-only thing that wasn't providing a realistic benefit anymore. Combining the save areas simplifies the compiler code and results in better ARM/Thumb2 codegen. For example, previously we would generate code like: push {r4, r5, r6, r7, lr} add r7, sp, #12 stmdb sp!, {r8, r10, r11} With this change, we combine the register saves and generate: push {r4, r5, r6, r7, r8, r10, r11, lr} add r7, sp, #12 rdar://8445635 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114340 91177308-0d34-0410-b5e6-96231b3b80d8	2010-09-20 19:32:20 +00:00
Jim Grosbach	e6be85e9ff	Teach the (non-MC) instruction printer to use the cannonical names for push/pop, and shift instructions on ARM. Update the tests to match. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114230 91177308-0d34-0410-b5e6-96231b3b80d8	2010-09-17 22:36:38 +00:00
Jim Grosbach	1aaf4cb393	Move thumb2 tests to the thumb2 directory git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@114206 91177308-0d34-0410-b5e6-96231b3b80d8	2010-09-17 20:34:09 +00:00
Evan Cheng	3ef1c8759a	Teach if-converter to be more careful with predicating instructions that would take multiple cycles to decode. For the current if-converter clients (actually only ARM), the instructions that are predicated on false are not nops. They would still take machine cycles to decode. Micro-coded instructions such as LDM / STM can potentially take multiple cycles to decode. If-converter should take treat them as non-micro-coded simple instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113570 91177308-0d34-0410-b5e6-96231b3b80d8	2010-09-10 01:29:16 +00:00
Bob Wilson	0f1e9457a5	Fix NEON VLD pseudo instruction itineraries that were incorrectly copied from the VST pseudos. The VLD/VST scheduling still needs work (see pr6722), but at least we shouldn't confuse the loads with the stores. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113473 91177308-0d34-0410-b5e6-96231b3b80d8	2010-09-09 05:40:26 +00:00
Jim Grosbach	65482b1bb8	Re-apply r112883: "For ARM stack frames that utilize variable sized objects and have either large local stack areas or require dynamic stack realignment, allocate a base register via which to access the local frame. This allows efficient access to frame indices not accessible via the FP (either due to being out of range or due to dynamic realignment) or the SP (due to variable sized object allocation). In particular, this greatly improves efficiency of access to spill slots in Thumb functions which contain VLAs." r112986 fixed a latent bug exposed by the above. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112989 91177308-0d34-0410-b5e6-96231b3b80d8	2010-09-03 18:37:12 +00:00
Daniel Dunbar	6a8700301c	Revert "For ARM stack frames that utilize variable sized objects and have either", it is breaking oggenc with Clang for ARMv6. This reverts commit 8d6e29cfda270be483abf638850311670829ee65. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112962 91177308-0d34-0410-b5e6-96231b3b80d8	2010-09-03 15:26:42 +00:00
Jim Grosbach	1755b3964f	For ARM stack frames that utilize variable sized objects and have either large local stack areas or require dynamic stack realignment, allocate a base register via which to access the local frame. This allows efficient access to frame indices not accessible via the FP (either due to being out of range or due to dynamic realignment) or the SP (due to variable sized object allocation). In particular, this greatly improves efficiency of access to spill slots in Thumb functions which contain VLAs. rdar://7352504 rdar://8374540 rdar://8355680 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112883 91177308-0d34-0410-b5e6-96231b3b80d8	2010-09-02 22:29:01 +00:00
Jim Grosbach	e7c1416263	Now that register allocation properly considers reserved regs, simplify the ARM register class allocation order functions to take advantage of that. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112841 91177308-0d34-0410-b5e6-96231b3b80d8	2010-09-02 18:14:29 +00:00
Chris Lattner	5bcb8a6112	temporarily revert r112664, it is causing a decoding conflict, and the testcases should be merged. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112711 91177308-0d34-0410-b5e6-96231b3b80d8	2010-09-01 16:00:50 +00:00
Bill Wendling	43a6c5e2fc	We have a chance for an optimization. Consider this code: int x(int t) { if (t & 256) return -26; return 0; } We generate this: tst.w r0, #256 mvn r0, #25 it eq moveq r0, #0 while gcc generates this: ands r0, r0, #256 it ne mvnne r0, #25 bx lr Scandalous really! During ISel time, we can look for this particular pattern. One where we have a "MOVCC" that uses the flag off of a CMPZ that itself is comparing an AND instruction to 0. Something like this (greatly simplified): %r0 = ISD::AND ... ARMISD::CMPZ %r0, 0 @ sets [CPSR] %r0 = ARMISD::MOVCC 0, -26 @ reads [CPSR] All we have to do is convert the "ISD::AND" into an "ARM::ANDS" that sets [CPSR] when it's zero. The zero value will all ready be in the %r0 register and we only need to change it if the AND wasn't zero. Easy! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112664 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-31 22:41:22 +00:00
Bob Wilson	7a9ef44b3b	Add alignment arguments to all the NEON load/store intrinsics. Update all the tests using those intrinsics and add support for auto-upgrading bitcode files with the old versions of the intrinsics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112271 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-27 17:13:24 +00:00
Daniel Dunbar	3cc3283fcb	ARM/Thumb2: Fix a misselect in getARMCmp, when attempting to adjust a signed comparison that would overflow. - The other under/overflow cases can't actually happen because the immediates which would trigger them are legal (so we don't enter this code), but adjusted the style to make it clear the transform is always valid. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112053 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-25 16:58:05 +00:00
Bob Wilson	f955f290c9	Change ARM PKHTB and PKHBT instructions to use a shift_imm operand to avoid printing "lsl #0". This fixes the remaining parts of pr7792. Make corresponding changes for encoding/decoding these instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111251 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-17 17:23:19 +00:00
Bob Wilson	dc66edaced	Generalize a pattern for PKHTB: an SRL of 16-31 bits will guarantee that the high halfword is zero. The shift need not be exactly 16 bits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111196 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-16 22:26:55 +00:00
Bob Wilson	b05b80160a	Convert test to FileCheck. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111195 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-16 22:21:13 +00:00
Bob Wilson	703af3ab12	Temporarily disable tail calls on ARM to work around some linker problems. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111050 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-13 22:43:33 +00:00
Jim Grosbach	b5aa11f2d6	fix silly typo git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110831 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-11 17:32:46 +00:00
Jim Grosbach	7166e622d7	Add a target triple, as the runtime library invocation varies a bit by platform. It's apparently "bl __muldf3" on linux, for example. Since that's not what we're checking here, it's more robust to just force a triple. We just wwant to check that the inline FP instructions are only generated on cpus that have them." git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110830 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-11 17:31:12 +00:00
Dan Gohman	3cc5d13f58	Temporarily disable some failing tests, until they can be properly investigated. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110825 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-11 16:36:07 +00:00
Jim Grosbach	fcba5e6b64	cortex m4 has floating point support, but only single precision. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110810 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-11 15:44:15 +00:00
Evan Cheng	7b4d31176e	Report error if codegen tries to instantiate a ARM target when the cpu does support it. e.g. cortex-m* processors. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110798 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-11 07:17:46 +00:00
Evan Cheng	11db068721	- Add subtarget feature -mattr=+db which determine whether an ARM cpu has the memory and synchronization barrier dmb and dsb instructions. - Change instruction names to something more sensible (matching name of actual instructions). - Added tests for memory barrier codegen. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110785 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-11 06:22:01 +00:00
Evan Cheng	ac096808a3	Re-apply r110655 with fixes. Epilogue must restore sp from fp if the function stack frame has a var-sized object. Also added a test case to check for the added benefit of this patch: it's optimizing away the unnecessary restore of sp from fp for some non-leaf functions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@110707 91177308-0d34-0410-b5e6-96231b3b80d8	2010-08-10 19:30:19 +00:00
Jim Grosbach	6ccfc507dc	Many Thumb2 instructions can reference the full ARM register set (i.e., have 4 bits per register in the operand encoding), but have undefined behavior when the operand value is 13 or 15 (SP and PC, respectively). The trivial coalescer in linear scan sometimes will merge a copy from SP into a subsequent instruction which uses the copy, and if that instruction cannot legally reference SP, we get bad code such as: mls r0,r9,r0,sp instead of: mov r2, sp mls r0, r9, r0, r2 This patch adds a new register class for use by Thumb2 that excludes the problematic registers (SP and PC) and is used instead of GPR for those operands which cannot legally reference PC or SP. The trivial coalescer explicitly requires that the register class of the destination for the COPY instruction contain the source register for the COPY to be considered for coalescing. This prevents errant instructions like that above. PR7499 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109842 91177308-0d34-0410-b5e6-96231b3b80d8	2010-07-30 02:41:01 +00:00
Dale Johannesen	f630c712b1	Implement vector constants which are splat of integers with mov + vdup. 8003375. This is currently disabled by default because LICM will not hoist a VDUP, so it pessimizes the code if the construct occurs inside a loop (8248029). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@109799 91177308-0d34-0410-b5e6-96231b3b80d8	2010-07-29 20:10:08 +00:00
Jim Grosbach	f27ca42552	update tests for smarter BIC usage git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108846 91177308-0d34-0410-b5e6-96231b3b80d8	2010-07-20 16:16:48 +00:00
Jim Grosbach	5423856e44	Add combiner patterns to more effectively utilize the BFI (bitfield insert) instruction for non-constant operands. This includes the case referenced in the README.txt regarding a bitfield copy. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108608 91177308-0d34-0410-b5e6-96231b3b80d8	2010-07-17 03:30:54 +00:00

1 2 3 4 5 ...

298 Commits