llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-11-16 11:05:54 +00:00

Author	SHA1	Message	Date
Evan Cheng	48575f6ea7	Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons. 1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) cycles can cause additional pipeline stall. So it's frequently better to simply codegen vmul + vadd. 2. A vmla followed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart. 3. A vmla followed by a vmla is a special case. Obviously issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either: vmul vadd vmla Instead, we want to expand the second vmla: vmla vmul vadd Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster. Up to now, isel simply avoids codegen'ing fp vmla / vmls. This works well enough but it isn't the optimal solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable. A. Add missing isel predicates which cause vmla to be codegen'ed. B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla. C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case). D. Add ARM hazard recognizer to model the vmla / vmls hazards. E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards. Work in progress, only A+B are enabled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120960 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-05 22:04:16 +00:00
Chris Lattner	9637d5b22e	Teach X86ISelLowering that the second result of X86ISD::UMUL is a flags result. This allows us to compile: void *test12(long count) { return new int[count]; } into: test12: movl $4, %ecx movq %rdi, %rax mulq %rcx movq $-1, %rdi cmovnoq %rax, %rdi jmp __Znam ## TAILCALL instead of: test12: movl $4, %ecx movq %rdi, %rax mulq %rcx seto %cl testb %cl, %cl movq $-1, %rdi cmoveq %rax, %rdi jmp __Znam Of course it would be even better if the regalloc inverted the cmov to 'cmovoq', which would eliminate the need for the 'movq %rdi, %rax'. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120936 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-05 07:49:54 +00:00
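The overflow check that the `new int[count]` commits describe can be modelled in a few lines. This is only a sketch of the arithmetic, not LLVM code: the function name `array_new_size` and the 64-bit mask are assumptions made for illustration, and Python's unbounded integers stand in for the 128-bit `rdx:rax` product that `mulq` produces.

```python
MASK64 = (1 << 64) - 1  # emulate a 64-bit register with Python's unbounded ints


def array_new_size(count, elem_size=4):
    """Byte count passed to operator new[] for `new int[count]` (hypothetical helper).

    Mirrors the sequence in the commit message: multiply count by the element
    size, and if the unsigned multiply overflows 64 bits, pass all-ones (-1)
    instead so that the allocation fails rather than under-allocating.
    """
    product = (count & MASK64) * elem_size  # full product, like rdx:rax after mulq
    overflow = (product >> 64) != 0         # mulq sets OF when the high half is nonzero
    return MASK64 if overflow else product & MASK64
```

With these definitions, `array_new_size(10)` is 40, while a count of `1 << 62` overflows and yields the all-ones value.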
Chris Lattner	b20e0b1fdd	It turns out that when the ".with.overflow" intrinsics were added to the X86 backend, they were all implemented except umul. This one fell back to the default implementation that did a hi/lo multiply and compared the top. Fix this to check the overflow flag that the 'mul' instruction sets, so we can avoid an explicit test. Now we compile: void *func(long count) { return new int[count]; } into: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] seto %cl ## encoding: [0x0f,0x90,0xc1] testb %cl, %cl ## encoding: [0x84,0xc9] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL instead of: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] testq %rdx, %rdx ## encoding: [0x48,0x85,0xd2] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL Other than the silly seto+test, this is using the o bit directly, so it's going in the right direction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120935 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-05 07:30:36 +00:00
Chris Lattner	777dd07394	fix the rest of the linux miscompares :) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120933 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-05 02:08:07 +00:00
Chris Lattner	96908b17ae	generalize the previous check to handle -1 on either side of the select, inserting a not to compensate. Add a missing isZero check that I lost somehow. This improves codegen of: void *func(long count) { return new int[count]; } from: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] testq %rdx, %rdx ## encoding: [0x48,0x85,0xd2] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL ## encoding: [0xeb,A] to: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] cmpq $1, %rdx ## encoding: [0x48,0x83,0xfa,0x01] sbbq %rdi, %rdi ## encoding: [0x48,0x19,0xff] notq %rdi ## encoding: [0x48,0xf7,0xd7] orq %rax, %rdi ## encoding: [0x48,0x09,0xc7] jmp __Znam ## TAILCALL ## encoding: [0xeb,A] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120932 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-05 02:00:51 +00:00
Chris Lattner	c8c20d1486	relax this to handle linux defaulting to -static. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120930 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-05 01:31:13 +00:00
Chris Lattner	a2b5600e61	Improve an integer select optimization in two ways: 1. generalize (select (x == 0), -1, 0) -> (sign_bit (x - 1)) to: (select (x == 0), -1, y) -> (sign_bit (x - 1)) | y 2. Handle the identical pattern that happens with !=: (select (x != 0), y, -1) -> (sign_bit (x - 1)) | y cmov is often high latency and can't fold immediates or memory operands. For example for (x == 0) ? -1 : 1, before we got: < testb %sil, %sil < movl $-1, %ecx < movl $1, %eax < cmovel %ecx, %eax now we get: > cmpb $1, %sil > sbbl %eax, %eax > orl $1, %eax git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120929 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-05 01:23:24 +00:00
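The select rewrite in this commit can be checked with a small model of the `cmp` / `sbb` / `or` sequence. This is a sketch under assumptions, not the LLVM implementation: the helper names are hypothetical, and 32-bit registers are emulated by masking Python integers.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit registers


def to_signed32(v):
    """Reinterpret the low 32 bits of v as a signed two's-complement value."""
    v &= MASK32
    return v - (1 << 32) if v & 0x80000000 else v


def select_reference(x, y):
    """(x == 0) ? -1 : y, the form that would otherwise be lowered to a cmov."""
    return -1 if (x & MASK32) == 0 else to_signed32(y)


def select_branchless(x, y):
    """The same value computed as in the cmp $1 / sbb / or sequence."""
    x &= MASK32
    borrow = 1 if x < 1 else 0     # cmp $1, x: carry is set only when x == 0 (unsigned x < 1)
    mask = (-borrow) & MASK32      # sbb reg, reg: all ones if carry was set, else zero
    return to_signed32(mask | (y & MASK32))  # or y: -1 stays -1, zero becomes y
```

Both functions agree for every input: when `x` is zero the mask is all ones and the `or` leaves -1, otherwise the mask is zero and the result is `y`.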
Chris Lattner	bced6a1b8f	merge some tests into select.ll and make them more specific. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120928 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-05 01:13:58 +00:00
Chris Lattner	bbdabf411b	rename test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120927 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-05 01:02:23 +00:00
Chris Lattner	63d7c17ff1	remove two tests that aren't really testing anything. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120926 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-05 01:02:13 +00:00
Benjamin Kramer	1292c22645	Add patterns for the x86 popcnt instruction. - Also adds a new POPCNT subtarget feature that is currently enabled if the target supports SSE4.2 (nehalem) or SSE4A (barcelona). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120917 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-04 20:32:23 +00:00
Bob Wilson	c24130bade	The Thumb tADDrSPi instruction is not valid when the destination is SP. Check for that and try narrowing it to tADDspi instead. Radar 8724703. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120892 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-04 04:40:19 +00:00
Jim Grosbach	41ad0c4c73	When using the 'push' mnemonic for Thumb2 stmdb, be explicit when it's the 32-bit wide version by adding the .w suffix. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120838 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-03 20:33:01 +00:00
Devang Patel	3fda44f276	Hide tests, that check .loc, .file in output assembly, from darwin9 buildbot. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120750 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-02 23:29:58 +00:00
Devang Patel	ee4854faf3	Use set directive for StartMinusEndExpr. This is a fix for llvm-gcc-i386-darwin9 buildbot failure. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120742 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-02 21:32:30 +00:00
Evan Cheng	fabdafbacb	Fix test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120730 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-02 20:17:34 +00:00
Evan Cheng	1bf891ae6e	Fix and re-enable tail call optimization of expanded libcalls. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120622 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-01 22:59:46 +00:00
Owen Anderson	9d63d90de5	Add correct encodings for STRD and LDRD, including fixup support. Additionally, update these to unified syntax. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120589 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-01 19:18:46 +00:00
Evan Cheng	28cd48fffb	Speculatively disable x86 portion of r120501 to appease the x86_64 buildbot. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120549 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-01 03:27:20 +00:00
Jason W Kim	85fed5e0c5	ARM/MC/ELF relocation "hello world" for movw/movt. Lifted adjustFixupValue() from Darwin for sharing with ELF. Test added. TODO: refactor ELFObjectWriter::RecordRelocation more. Possibly share more code with Darwin? Lots more relocations... git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120534 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-01 02:40:06 +00:00
Evan Cheng	3d2125c9db	Enable sibling call optimization of libcalls which are expanded during legalization time. Since at legalization time there is no mapping from SDNode back to the corresponding LLVM instruction and the return SDNode is target specific, this requires a target hook to check for eligibility. Only x86 and ARM support this form of sibcall optimization right now. rdar://8707777 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120501 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-30 23:55:39 +00:00
Che-Liang Chiou	21d8b9bcad	ptx: add command-line options for gpu target and ptx version git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120423 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-30 10:14:14 +00:00
Eric Christopher	c459d06ae6	Not all platforms use _<func>. Duh. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120418 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-30 09:23:54 +00:00
Eric Christopher	228232b282	Rewrite mwait and monitor support and custom lower arguments. Fixes PR8573. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120404 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-30 07:20:12 +00:00
Bob Wilson	6c4c982f83	Add support for NEON VLD4-dup instructions. The encoding for alignment in VLD4-dup instructions is still a work in progress. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120356 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-30 00:00:35 +00:00
Evan Cheng	1e0eab122b	Mark Darwin call instructions as using "r7" to prevent the frame-register assignment instructions from being moved below / above calls. rdar://8690640 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120339 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-29 22:43:27 +00:00
Benjamin Kramer	59127b2a4e	Add missing colon. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120336 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-29 22:39:38 +00:00
Benjamin Kramer	8ad87ab166	Fix some broken CHECK lines. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120332 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-29 22:34:55 +00:00
Bob Wilson	86c6d80a7a	Add support for NEON VLD3-dup instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120312 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-29 19:35:29 +00:00
Kalle Raiskila	9363f739cd	Handle lshr for i128 correctly on SPU also when shiftamount > 7. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120288 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-29 14:44:28 +00:00
Kalle Raiskila	c2ebfd454c	Enable PostRA scheduling for SPU. This speeds up selected test cases by up to 5% - no slowdowns observed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120286 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-29 10:30:25 +00:00
Bob Wilson	b1dfa7a8e0	Add support for NEON VLD2-dup instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120236 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-28 06:51:26 +00:00
Rafael Espindola	5bf7c534cf	Lower TLS_addr32 and TLS_addr64. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120225 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-27 20:43:02 +00:00
Bob Wilson	2a0e97431e	Add NEON VLD1-dup instructions (load 1 element to all lanes). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120194 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-27 06:35:16 +00:00
Kalle Raiskila	702a4046a9	Allow for 'fcmp ogt' in SPU. Fix by Visa Putkinen! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120090 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-24 11:42:17 +00:00
Bob Wilson	626613d5e8	Recognize sign/zero-extended constant BUILD_VECTORs for VMULL operations. We need to check if the individual vector elements are sign/zero-extended values. For now this only handles constant values. Radar 8687140. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120034 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-23 19:38:38 +00:00
Kalle Raiskila	0cc5b1f60e	Division by pow-of-2 is not cheap on SPU, do it with shifts. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120022 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-23 13:27:59 +00:00
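The commit does not say which shift sequence the SPU backend emits, so the following is only the textbook way to divide a signed 32-bit value by a power of two with shifts, rounding toward zero as C division does. The function name `sdiv_pow2` is hypothetical, and Python integers are masked to stand in for 32-bit registers.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit registers


def sdiv_pow2(x, k):
    """Signed x / 2**k rounding toward zero, for 32-bit x and 1 <= k <= 31.

    A plain arithmetic right shift rounds toward negative infinity, so
    negative inputs first get a bias of 2**k - 1 added.
    """
    sign = x >> 31                      # arithmetic shift: -1 for negative x, else 0
    bias = (sign & MASK32) >> (32 - k)  # logical shift: 2**k - 1 for negative x, else 0
    return (x + bias) >> k              # arithmetic shift now truncates toward zero
```

For example, `sdiv_pow2(-7, 1)` is -3, where an unbiased shift (`-7 >> 1`) would give -4.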
Chris Lattner	2e1a75d6f4	filecheckize git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119987 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-23 02:26:52 +00:00
Evan Cheng	ab5c703fdb	Fix epilogue codegen to avoid leaving the stack pointer in an invalid state. Previously Thumb2 would restore sp from fp like this: mov sp, r7 sub sp, #4 If an interrupt is taken after the 'mov' but before the 'sub', callee-saved registers might be clobbered by the interrupt handler. Instead, try restoring directly from sp: add sp, #4 Or, if necessary (with VLA, etc.) use a scratch register to compute sp and then restore it: sub.w r4, r7, #8 mov sp, r7 rdar://8465407 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119977 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-22 18:12:04 +00:00
Kalle Raiskila	d87e571e62	Fix a bug with extractelement on SPU. In the attached testcase, the element was never extracted (missing rotate). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119973 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-22 16:28:26 +00:00
Benjamin Kramer	ce750f0332	Implement the "if (X == 6 || X == 4)" -> "if ((X|2) == 6)" optimization. This currently only catches the most basic case, a two-case switch, but can be extended later. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119964 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-22 09:45:38 +00:00
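The fold in this commit works because 6 and 4 differ in exactly one bit. A minimal sketch of the general condition follows; it is not the LLVM code, and the helper names are hypothetical.

```python
def differ_in_one_bit(a, b):
    """True when a and b differ in exactly one bit position."""
    d = a ^ b
    return d != 0 and (d & (d - 1)) == 0  # a power of two has a single set bit


def folded_compare(x, a, b):
    """Equivalent to (x == a or x == b) when differ_in_one_bit(a, b) holds.

    Forcing the differing bit to 1 on both sides makes the comparison
    ignore that bit, so x matches exactly when it equals a or b.
    """
    bit = a ^ b
    return (x | bit) == (a | bit)
```

With `a = 6` and `b = 4` the differing bit is 2, so `folded_compare(x, 6, 4)` evaluates `(x | 2) == 6`, the form quoted in the commit message.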
Wesley Peck	46a928b864	Implement branch analysis in the MBlaze backend. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119951 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-21 21:53:36 +00:00
Andrew Trick	b9e6fe1e3a	Removing the useless test that I added recently. It was meant as an example, but not complicated enough to merit another test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119898 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-20 07:26:51 +00:00
Dale Johannesen	76eb5f2401	Prefetch has a MemOperand now. FileCheckize a test. This finishes up 8460971. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119848 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-19 21:49:38 +00:00
Mon P Wang	cab98e3168	Make isScalarToVector return false if the node is a scalar. This will prevent DAGCombine from making an illegal transformation of bitcast of a scalar to a vector into a scalar_to_vector. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119819 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-19 19:08:12 +00:00
Tanya Lattner	9684a7c128	Fix bug in DAGCombiner for ARM that was trying to do a ShiftCombine on illegal types (vector should be split first). Added test case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119749 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-18 22:06:46 +00:00
Duncan Sands	dcfd3a798f	The DAGCombiner was threading select over pairs of extending loads even if the extension types were not the same. The result was that if you fed a select with sext and zext loads, as in the testcase, then it would get turned into a zext (or sext) of the select, which is wrong in the cases when it should have been an sext (resp. zext). Reported and diagnosed by Sebastien Deldon. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119728 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-18 20:05:18 +00:00
Eric Christopher	8b3ca6216d	Rewrite stack callee saved spills and restores to use push/pop instructions. Remove movePastCSLoadStoreOps and associated code for simple pointer increments. Update routines that depended upon other opcodes for save/restore. Adjust all testcases accordingly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119725 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-18 19:40:05 +00:00
Rafael Espindola	5c0556341e	Change CodeGen to use .loc directives. This produces a lot more readable output and testing is easier. A good example is the unknown-location.ll test that now can just look for ".loc 1 0 0". We also don't use a DW_LNE_set_address for every address change anymore. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119613 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-18 02:04:25 +00:00
Dale Johannesen	b4ac2858da	Do not throw away alignment when generating the DAG for memset; we may need it to decide between MOVAPS and MOVUPS later. Adjust a test that was looking for wrong code. PR 3866 / 8675131. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119605 91177308-0d34-0410-b5e6-96231b3b80d8	2010-11-18 01:35:23 +00:00


4668 Commits