llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-09-03 02:30:00 +00:00

Author	SHA1	Message	Date
Bob Wilson	103b4a571e	Revert "Adding support for llvm.arm.neon.vaddl[su].* and" This reverts r170694. The operations can be represented in IR without adding any new intrinsics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170765 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-20 21:09:38 +00:00
Renato Golin	332bd79951	Adding support for llvm.arm.neon.vaddl[su].* and llvm.arm.neon.vsub[su].* intrinsics. Patch by Pete Couperus <pjcoup@gmail.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170694 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-20 13:52:11 +00:00
Evan Cheng	733c6b1db1	LLVM sdisel normalize bit extraction of the form: ((x & 0xff00) >> 8) << 2 to (x >> 6) & 0x3fc This is general goodness since it folds a left shift into the mask. However, the trailing zeros in the mask prevents the ARM backend from using the bit extraction instructions. And worse since the mask materialization may require an addition instruction. This comes up fairly frequently when the result of the bit twiddling is used as memory address. e.g. = ptr[(x & 0xFF0000) >> 16] We want to generate: ubfx r3, r1, #16, #8 ldr.w r3, [r0, r3, lsl #2] vs. mov.w r9, #1020 and.w r2, r9, r1, lsr #14 ldr r2, [r0, r2] Add a late ARM specific isel optimization to ARMDAGToDAGISel::PreprocessISelDAG(). It folds the left shift to the 'base + offset' address computation; change the mask to one which doesn't have trailing zeros and enable the use of ubfx. Note the optimization has to be done late since it's target specific and we don't want to change the DAG normalization. It's also fairly restrictive as shifter operands are not always free. It's only done for lsh 1 / 2. It's known to be free on some cpus and they are most common for address computation. This is a slight win for blowfish, rijndael, etc. rdar://12870177 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170581 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-19 20:16:09 +00:00
Quentin Colombet	b519351b87	Disable ARM partial flag dependency optimization at -Oz To not over constrain the scheduler for ARM in thumb mode, some optimizations for code size reduction, specific to ARM thumb, are blocked when they add a dependency (like write after read dependency). Disables this check when code size is the priority, i.e., code is compiled with -Oz. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170462 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-18 22:47:16 +00:00
Andrew Trick	04f52e1300	MISched: add dependence to ExitSU to model live-out latency. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170454 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-18 20:53:01 +00:00
Evan Cheng	376642ed62	Some enhancements for memcpy / memset inline expansion. 1. Teach it to use overlapping unaligned load / store to copy / set the trailing bytes. e.g. On 86, use two pairs of movups / movaps for 17 - 31 byte copies. 2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g. x86 and ARM. 3. When memcpy from a constant string, do not replace the load with a constant if it's not possible to materialize an integer immediate with a single instruction (required a new target hook: TLI.isIntImmLegal()). 4. Use unaligned load / stores more aggressively if target hooks indicates they are "fast". 5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8. Also increase the threshold to something reasonable (8 for memset, 4 pairs for memcpy). This significantly improves Dhrystone, up to 50% on ARM iOS devices. rdar://12760078 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169791 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-10 23:21:26 +00:00
Tim Northover	6eb3e87df0	Added Mapping Symbols for ARM ELF Before this patch, when you objdump an LLVM-compiled file, objdump tried to decode data-in-code sections as if they were code. This patch adds the missing Mapping Symbols, as defined by "ELF for the ARM Architecture" (ARM IHI 0044D). Patch based on work by Greg Fitzgerald. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169609 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-07 16:50:23 +00:00
Dmitri Gribenko	00e97c2126	Fix typos in CHECK lines. Patch by Alexander Zinenko. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169547 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-06 21:24:47 +00:00
Evan Cheng	824ec7d01a	Properly fix the tes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169464 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-06 02:29:29 +00:00
NAKAMURA Takumi	e1ab8e3e73	llvm/test/CodeGen/ARM/extload-knownzero.ll: Try to unbreak, to add -O0. I guess Chad expects fastisel here. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169463 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-06 02:22:58 +00:00
Chad Rosier	c9758b1366	[arm fast-isel] Make the fast-isel implementation of memcpy respect alignment. rdar://12821569 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169460 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-06 01:34:31 +00:00
Evan Cheng	8a7186dbc2	Let targets provide hooks that compute known zero and ones for any_extend and extload's. If they are implemented as zero-extend, or implicitly zero-extend, then this can enable more demanded bits optimizations. e.g. define void @foo(i16* %ptr, i32 %a) nounwind { entry: %tmp1 = icmp ult i32 %a, 100 br i1 %tmp1, label %bb1, label %bb2 bb1: %tmp2 = load i16* %ptr, align 2 br label %bb2 bb2: %tmp3 = phi i16 [ 0, %entry ], [ %tmp2, %bb1 ] %cmp = icmp ult i16 %tmp3, 24 br i1 %cmp, label %bb3, label %exit bb3: call void @bar() nounwind br label %exit exit: ret void } This compiles to the followings before: push {lr} mov r2, #0 cmp r1, #99 bhi LBB0_2 @ BB#1: @ %bb1 ldrh r2, [r0] LBB0_2: @ %bb2 uxth r0, r2 cmp r0, #23 bhi LBB0_4 @ BB#3: @ %bb3 bl _bar LBB0_4: @ %exit pop {lr} bx lr The uxth is not needed since ldrh implicitly zero-extend the high bits. With this change it's eliminated. rdar://12771555 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169459 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-06 01:28:01 +00:00
Evan Cheng	c8e7045c8a	ARM custom lower ctpop for vector types. Patch by Pete Couperus. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169325 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-04 22:41:50 +00:00
Bill Wendling	9493dae613	Use the 'count' attribute to calculate the upper bound of an array. The count attribute is more accurate with regards to the size of an array. It also obviates the upper bound attribute in the subrange. We can also better handle an unbound array by setting the count to -1 instead of the lower bound to 1 and upper bound to 0. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169312 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-04 21:34:03 +00:00
Bill Wendling	a7645a3c66	Add a 'count' field to the DWARF subrange. The count field is necessary because there isn't a difference between the 'lo' and 'hi' attributes for a one-element array and a zero-element array. When the count is '0', we know that this is a zero-element array. When it's >=1, then it's a normal constant sized array. When it's -1, then the array is unbounded. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169218 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-04 06:20:49 +00:00
Manman Ren	69261a6442	Stack Alignment: when creating stack objects in MachineFrameInfo, make sure the alignment is clamped to TargetFrameLowering.getStackAlignment if the target does not support stack realignment or the option "realign-stack" is off. This will cause miscompile if the address is treated as aligned and add is replaced with or in DAGCombine. Added a bool StackRealignable to TargetFrameLowering to check whether stack realignment is implemented for the target. Also added a bool RealignOption to MachineFrameInfo to check whether the option "realign-stack" is on. rdar://12713765 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169197 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-04 00:52:33 +00:00
Jakob Stoklund Olesen	8c3dccde92	Simplify REG_SEQUENCE lowering. The TwoAddressInstructionPass takes the machine code out of SSA form by expanding REG_SEQUENCE instructions into copies. It is no longer necessary to rewrite the registers used by a REG_SEQUENCE instruction because the new coalescer algorithm can do it now. REG_SEQUENCE is just converted to a sequence of sub-register copies now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169067 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-01 01:06:44 +00:00
Sebastian Pop	cb4953089b	Codegen failure for vmull with small vectors Codegen was failing with an assertion because of unexpected vector operands when legalizing the selection DAG for a MUL instruction. The asserting code was legalizing multiplies for vectors of size 128 bits. It uses a custom lowering to try and detect cases where it can use a VMULL instruction instead of a VMOVL + VMUL. The code was looking for input operands to the MUL that had been sign or zero extended. If it found the extended operands it would drop the sign/zero extension and use the original vector size as input to a VMULL instruction. The code assumed that the original input vector was 64 bits so that after dropping the extension it would fit directly into a D register and could be used as an operand of a VMULL instruction. The input code that trigger the failure used a vector of <4 x i8> that was sign extended to <4 x i32>. It was not safe to drop the sign extension in this case because the original vector is only 32 bits wide. The fix is to insert a sign extension for the vector to reach the required 64 bit size. In this particular example, the vector would need to be sign extented to a <4 x i16>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169024 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-30 19:08:04 +00:00
Bill Wendling	7360116048	Handle the situation where CodeGenPrepare removes a reference to a BB that has the last invoke instruction in the function. This also removes the last landing pad in an function. This is fine, but with SjLj EH code, we've already placed a bunch of code in the 'entry' block, which expects the landing pad to stick around. When we get to the situation where CGP has removed the last landing pad, go ahead and nuke the SjLj instructions from the 'entry' block. <rdar://problem/12721258> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168930 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-29 19:38:06 +00:00
Silviu Baranga	35b3df6e31	Added atomic 64 min/max/umin/umax instrinsics support in the ARM backend. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168886 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-29 14:41:25 +00:00
Jakob Stoklund Olesen	89bea17af2	Avoid rewriting instructions twice. This could cause miscompilations in targets where sub-register composition is not always idempotent (ARM). <rdar://problem/12758887> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168837 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-29 00:26:11 +00:00
Benjamin Kramer	350c00843b	ARM: Implement CanLowerReturn so large vectors get expanded into sret. Fixes 14337. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168809 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-28 20:55:10 +00:00
Chad Rosier	92a6e532b8	Add -verify-machineinstrs to these fast-isel test cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168723 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-27 20:49:56 +00:00
Manman Ren	39834da697	CSE: allow PerformTrivialCoalescing to check copies across basic block boundaries. Given the following case: BB0 %vreg1<def> = SUBrr %vreg0, %vreg7 %vreg2<def> = COPY %vreg7 BB1 %vreg10<def> = SUBrr %vreg0, %vreg2 We should be able to CSE between SUBrr in BB0 and SUBrr in BB1. rdar://12462006 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168717 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-27 18:58:41 +00:00
Ulrich Weigand	dba37a3c43	Never use .lcomm on platforms where it does not accept an alignment argument. Instead, use a pair of .local and .comm directives. This avoids spurious differences between binaries built by the integrated assembler vs. those built by the external assembler, since the external assembler may impose alignment requirements on .lcomm symbols where the integrated assembler does not. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168704 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-27 16:11:16 +00:00
Chad Rosier	277068fe40	Extend test case for r168657. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168658 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-27 01:10:48 +00:00
Tim Northover	310f248c22	Fix physical register liveness calculations: + Take account of clobbers + Give outputs priority over inputs since they happen later. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168360 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-20 09:56:11 +00:00
Anton Korobeynikov	2386fc8daa	Factor out type info emission into separate routine. It turned out that ARM wants different layout of type infos. This is yet another patch in attempt to fix PR7187 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168325 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-19 21:06:26 +00:00
Eli Friedman	43147afd71	Mark FP_EXTEND form v2f32 to v2f64 as "expand" for ARM NEON. Patch by Pete Couperus. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168240 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-17 01:52:46 +00:00
Chad Rosier	0a63b6ac79	[fast-isel] Add the -verify-machineinstrs to these test cases. The remaining test cases require fixes to fast-isel before the verifier can be enabled. Part of rdar://12594152 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168233 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-17 00:42:06 +00:00
Weiming Zhao	e56764bad1	Remove hard coded registers in ARM ldrexd and strexd instructions This patch replaces the hard coded GPR pair [R0, R1] of Intrinsic:arm_ldrexd and [R2, R3] of Intrinsic:arm_strexd with even/odd GPRPair reg class. Similar to the lowering of atomic_64 operation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168207 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-16 21:55:34 +00:00
Anton Korobeynikov	b1a392e7c5	Make sure FABS on v2f32 and v4f32 is legal on ARM NEON This fixes PR14359 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168200 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-16 21:15:20 +00:00
Eli Friedman	846ce8ea67	Mark FP_ROUND for converting NEON v2f64 to v2f32 as expand. Add a missing case to vector legalization so this actually works. Patch by Pete Couperus. Fixes PR12540. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168107 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-15 22:44:27 +00:00
Nadav Rotem	50b66387e3	The code pattern "imm0_255_neg" is used for checking if an immediate value is a small negative number. This patch changes the definition of negative from -0..-255 to -1..-255. I am changing this because of a bug that we had in some of the patterns that assumed that "subs" of zero does not set the carry flag. rdar://12028498 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167963 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-14 19:39:15 +00:00
Anton Korobeynikov	062a6c8380	Fix really stupid ARM EHABI info generation bug: we should not emit eh table and handler data if there are no landing pads in the function. Patch by Logan Chien with some cleanups from me. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167945 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-14 19:13:30 +00:00
Anton Korobeynikov	25efd6d556	Use TARGET2 relocation for TType references on ARM. Do some cleanup of the code while here. Inspired by patch by Logan Chien! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167904 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-14 01:47:00 +00:00
Andrew Trick	f546ac5f9b	Cleanup the main RegisterCoalescer loop. Block priorities still apply outside loops. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167793 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-13 00:34:44 +00:00
Andrew Trick	ae692f2bae	misched: Infrastructure for weak DAG edges. This adds support for weak DAG edges to the general scheduling infrastructure in preparation for MachineScheduler support for heuristics based on weak edges. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167738 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-12 19:28:57 +00:00
Evan Cheng	b341fac05a	Disable the Thumb no-return call optimization: mov lr, pc b.w _foo The "mov" instruction doesn't set bit zero to one, it's putting incorrect value in lr. It messes up backtraces. rdar://12663632 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167657 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-10 02:09:05 +00:00
Amara Emerson	214fd3d244	Recommit modified r167540. Improve ARM build attribute emission for architectures types. This also changes the default architecture emitted for a generic CPU to "v7". git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167574 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-08 09:51:45 +00:00
Quentin Colombet	43934aee71	Vext Lowering was missing opportunities git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167318 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-02 21:32:17 +00:00
Quentin Colombet	9a419f656e	Change ForceSizeOpt attribute into MinSize attribute git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167020 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-30 16:32:52 +00:00
Jakob Stoklund Olesen	573303e62a	Completely disallow partial copies in adjustCopiesBackFrom(). Partial copies can show up even when CoalescerPair.isPartial() returns false. For example: %vreg24:dsub_0<def> = COPY %vreg31:dsub_0; QPR:%vreg24,%vreg31 Such a partial-partial copy is not good enough for the transformation adjustCopiesBackFrom() needs to do. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166944 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-29 17:51:52 +00:00
Quentin Colombet	80acd97266	[code size][ARM] Emit regular call instructions instead of the move, branch sequence git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166854 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-27 01:10:17 +00:00
Jakob Stoklund Olesen	17f42e02a1	Revert r163298 "Optimize codegen for VSETLNi{8,16,32} operating on Q registers." Keep the integer_insertelement test case, the new coalescer can handle this kind of lane insertion without help from pseudo-instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166835 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-26 23:39:46 +00:00
Evan Cheng	d258eb3ec5	Fix a miscompilation caused by a typo. When turning a adde with negative value into a sbc with a positive number, the immediate should be complemented, not negated. Also added a missing pattern for ARM codegen. rdar://12559385 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166613 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-24 19:53:01 +00:00
Bill Wendling	8f47fc8f00	When a block ends in an indirect branch, add its successors to the machine basic block. The CFG of the machine function needs to know that the targets of the indirect branch are successors to the indirect branch. <rdar://problem/12529625> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166448 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-22 23:30:04 +00:00
Shuxin Yang	970755e519	This patch is to fix radar://8426430. It is about llvm support of __builtin_debugtrap() which is supposed to consistently raise SIGTRAP across all systems. In contrast, __builtin_trap() behave differently on different systems. e.g. it raises SIGTRAP on ARM, and SIGILL on X86. The purpose of __builtin_debugtrap() is to consistently provide "trap" functionality, in the mean time preserve the compatibility with on gcc on __builtin_trap(). The X86 backend is already able to handle debugtrap(). This patch is to: 1) make front-end recognize "__builtin_debugtrap()" (emboddied in the one-line change to Clang). 2) In DAG legalization phase, by default, "debugtrap" will be replaced with "trap", which make the __builtin_debugtrap() "available" to all existing ports without the hassle of changing their code. 3) If trap-function is specified (via -trap-func=xyz to llc), both __builtin_debugtrap() and __builtin_trap() will be expanded into the function call of the specified trap function. This behavior may need change in the future. The provided testing-case is to make sure 2) and 3) are working for ARM port, and we already have a testing case for x86. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166300 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-19 20:11:16 +00:00
Stepan Dyatkovskiy	0d3c8d5d16	ARM: Removed extra stack frame object for fixed byval arguments, VarArgsStyleRegisters invocation was reworked due to some improper usage in past. PR14099 also demonstrates it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166273 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-19 08:23:06 +00:00
Jakob Stoklund Olesen	cdcdfd2cab	Revert r166046 "Switch back to the old coalescer for now to fix the 32 bit bit" A fix for PR14098, including the test case is in the next commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166067 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-16 22:51:55 +00:00
Rafael Espindola	6f7cccd2e2	Switch back to the old coalescer for now to fix the 32 bit bit llvm+clang+compiler-rt bootstrap. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166046 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-16 19:34:06 +00:00
Stepan Dyatkovskiy	b52ba9f8a8	Issue: Stack is formed improperly for long structures passed as byval arguments for EABI mode. If we took AAPCS reference, we can found the next statements: A: "If the argument requires double-word alignment (8-byte), the NCRN (Next Core Register Number) is rounded up to the next even register number." (5.5 Parameter Passing, Stage C, C.3). B: "The alignment of an aggregate shall be the alignment of its most-aligned component." (4.3 Composite Types, 4.3.1 Aggregates). So if we have structure with doubles (9 double fields) and 3 Core unused registers (r1, r2, r3): caller should use r2 and r3 registers only. Currently r1,r2,r3 set is used, but it is invalid. Callee VA routine should also use r2 and r3 regs only. All is ok here. This behaviour is guessed by rounding up SP address with ADD+BFC operations. Fix: Main fix is in ARMTargetLowering::HandleByVal. If we detected AAPCS mode and 8 byte alignment, we waste odd registers then. P.S.: I also improved LDRB_POST_IMM regression test. Since ldrb instruction will not generated by current regression test after this patch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166018 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-16 07:16:47 +00:00
Jim Grosbach	64ba635209	ARM: v1i64 and v2i64 VBSL intrinsic support. rdar://12502028 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165981 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-15 21:23:40 +00:00
Silviu Baranga	bb1078ea13	Fixed PR13938: the ARM backend was crashing because it couldn't select a VDUPLANE node with the vector input size different from the output size. This was bacause the BUILD_VECTOR lowering code didn't check that the size of the input vector was correct for using VDUPLANE. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165929 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-15 09:41:32 +00:00
Jakob Stoklund Olesen	d86296a4ae	Drop <def,dead> flags when merging into an unused lane. The new coalescer can merge a dead def into an unused lane of an otherwise live vector register. Clear the <dead> flag when that happens since the flag refers to the full virtual register which is still live after the partial dead def. This fixes PR14079. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165877 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-13 17:26:47 +00:00
Jakob Stoklund Olesen	af89690760	Allow for loops in LiveIntervals::pruneValue(). It is possible that the live range of the value being pruned loops back into the kill MBB where the search started. When that happens, make sure that the beginning of KillMBB is also pruned. Instead of starting a DFS at KillMBB and skipping the root of the search, start a DFS at each KillMBB successor, and allow the search to loop back to KillMBB. This fixes PR14078. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165872 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-13 16:15:31 +00:00
Manman Ren	e6c3cc8dc5	ARM: tail-call inside a function where part of a byval argument is on caller's local frame causes problem. For example: void f(StructToPass s) { g(&s, sizeof(s)); } will cause problem with tail-call since part of s is passed via registers and saved in f's local frame. When g tries to access s, part of s may be corrupted since f's local frame is popped out before the tail-call. The current fix is to disable tail-call if getVarArgsRegSaveSize is not 0 for the caller. This is a conservative approach, if we can prove the address of s or part of s is not taken and passed to g, it should be okay to perform tail-call. rdar://12442472 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165853 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-12 23:39:43 +00:00
Jim Grosbach	4346fa9437	ARM: Mark VSELECT as 'expand'. The backend already pattern matches to form VBSL when it can. We may want to teach it to use the vbsl intrinsics at some point to prevent machine licm from mucking with this, but using the Expand is completely correct. http://llvm.org/bugs/show_bug.cgi?id=13831 http://llvm.org/bugs/show_bug.cgi?id=13961 Patch by Peter Couperus <peter.couperus@st.com>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165845 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-12 22:59:21 +00:00
Evan Cheng	d36696c4e0	Legalizer optimize a pair of div / mod to a call to divrem libcall if they are not legal. However, it should use a div instruction + mul + sub if divide is legal. The rem legalization code was missing a check and incorrectly uses a divrem libcall even when div is legal. rdar://12481395 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165778 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-12 01:15:47 +00:00
Evan Cheng	6b61491de3	Add isel patterns for v2f32 / v4f32 neon.vbsl intrinsics. rdar://12471808 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165673 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-10 23:06:34 +00:00
Stepan Dyatkovskiy	2c2cb3c09f	Fix for LDRB instruction: SDNode for LDRB_POST_IMM is invalid: number of registers added to SDNode fewer that described in .td. 7 ops is needed, but SDNode with only 6 is created. In more details: In ARMInstrInfo.td, in multiclass AI2_ldridx, in definition _POST_IMM, offset operand is defined as am2offset_imm. am2offset_imm is complex parameter type, and actually it consists from dummy register and imm itself. As I understood trick with dummy reg was made for AsmParser. In ARMISelLowering.cpp, this dummy register was not added to SDNode, and it cause crash in Peephole Optimizer pass. The problem fixed by setting up additional dummy reg when emitting LDRB_POST_IMM instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165617 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-10 11:43:40 +00:00
Stepan Dyatkovskiy	661afe75e8	Issue description: SchedulerDAGInstrs::buildSchedGraph ignores dependencies between FixedStack objects and byval parameters. So loading byval parameters from stack may be inserted before it will be stored, since these operations are treated as independent. Fix: Currently ARMTargetLowering::LowerFormalArguments saves byval registers with FixedStack MachinePointerInfo. To fix the problem we need to store byval registers with MachinePointerInfo referenced to first the "byval" parameter. Also commit adds two new fields to the InputArg structure: Function's argument index and InputArg's part offset in bytes relative to the start position of Function's argument. E.g.: If function's argument is 128 bit width and it was splitted onto 32 bit regs, then we got 4 InputArg structs with same arg index, but different offset values. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165616 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-10 11:37:36 +00:00
Jim Grosbach	837c28a840	ARM: locate user-defined text sections next to default text. Make sure functions located in user specified text sections (via the section attribute) are located together with the default text sections. Otherwise, for large object files, the relocations for call instructions are more likely to be out of range. This becomes even more likely in the presence of LTO. rdar://12402636 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165254 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-04 21:33:24 +00:00
Silviu Baranga	541a858f1a	Fixed a bug in the ExecutionDependencyFix pass that caused dependencies to not propagate through implicit defs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165102 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-03 08:29:36 +00:00
Jakob Stoklund Olesen	27cb347d0e	Make sure the whole live range is covered when values are pruned twice. JoinVals::pruneValues() calls LIS->pruneValue() to avoid conflicts when overlapping two different values. This produces a set of live range end points that are used to reconstruct the live range (with SSA update) after joining the two registers. When a value is pruned twice, the set of end points was insufficient: v1 = DEF v1 = REPLACE1 v1 = REPLACE2 KILL v1 The end point at KILL would only reconstruct the live range from REPLACE2 to KILL, leaving the range REPLACE1-REPLACE2 dead. Add REPLACE2 as an end point in this case so the full live range is reconstructed. This fixes PR13999. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165056 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-02 21:46:39 +00:00
Bob Wilson	eb1641d54a	Add LLVM support for Swift. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164899 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-29 21:43:49 +00:00
Bob Wilson	154418cdd8	Whitespace. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164898 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-29 21:27:31 +00:00
Jakob Stoklund Olesen	5cf178f281	Enable the new coalescer algorithm by default. The new coalescer is better at merging values into unused vector lanes, improving NEON code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164794 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-27 21:06:02 +00:00
Jush Lu	8f50647662	[arm-fast-isel] Add support for ELF PIC. This is a preliminary step towards ELF support; currently ARMFastISel hasn't been used for ELF object files yet. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164759 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-27 05:21:41 +00:00
NAKAMURA Takumi	9f4df20fe0	ARM/atomicrmw_minmax.ll: Fix RUN line. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164687 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-26 10:12:20 +00:00
James Molloy	d6d10ae151	Fix ordering of operands on lowering of atomicrmw min/max nodes on ARM. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164685 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-26 09:48:32 +00:00
Bill Wendling	f18eb5887f	Generate an error message instead of asserting or segfaulting when we have a scalar-to-vector conversion that we cannot handle. For instance, when an invalid constraint is used in an inline asm statement. <rdar://problem/12284092> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164662 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-26 06:16:18 +00:00
Bill Wendling	1293130f4f	Generate an error message instead of asserting or segfaulting when we have a scalar-to-vector conversion that we cannot handle. For instance, when an invalid constraint is used in an inline asm statement. <rdar://problem/12284092> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164657 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-26 04:04:19 +00:00
Chad Rosier	e5e674ba11	[fast-isel] Fallback to SelectionDAG isel if we require strict alignment for non-aligned i32 loads/stores. rdar://12304911 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164381 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-21 16:58:35 +00:00
NAKAMURA Takumi	df03a28990	llvm/test/CodeGen/ARM/fast-isel.ll: Fix possible typos, s/@unaligned_i16_store/@unaligned_i16_load/g. I guess this had apparently passed in +Asserts possibly due to verborsity. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164350 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-21 01:15:05 +00:00
Chad Rosier	3ca380d8d7	Testcase does not need to be this strict. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164347 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-21 00:47:08 +00:00
Chad Rosier	ba6aec2cf3	Add newline. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164346 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-21 00:43:18 +00:00
Chad Rosier	d70c98e884	[fast-isel] Fallback to SelectionDAG isel if we require strict alignment for non-halfword-aligned i16 loads/stores. rdar://12304911 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164345 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-21 00:41:42 +00:00
Jim Grosbach	ced674e470	ARM: Use a dedicated intrinsic for vector bitwise select. The expression based expansion too often results in IR level optimizations splitting the intermediate values into separate basic blocks, preventing the formation of the VBSL instruction as the code author intended. In particular, LICM would often hoist part of the computation out of a loop. rdar://11011471 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164340 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-21 00:18:20 +00:00
Jakob Stoklund Olesen	e6e2d8cd90	Ignore PHI-defs for -new-coalescer interference checks. A PHI can't create interference on its own. If two live ranges interfere at a PHI, they must also interfere when leaving one of the PHI predecessors. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164330 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-20 23:08:42 +00:00
Evan Cheng	2dad6b501b	Try to make these tests more portable. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164320 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-20 21:35:21 +00:00
Jakob Stoklund Olesen	d40d4c34f7	Resolve conflicts involving dead vector lanes for -new-coalescer. A common coalescing conflict in vector code is lane insertion: %dst = FOO %src = BAR %dst:ssub0 = COPY %src The live range of %src interferes with the ssub0 lane of %dst, but that lane is never read after %src would have clobbered it. That makes it safe to merge the live ranges and eliminate the COPY: %dst = FOO %dst:ssub0 = BAR This patch teaches the new coalescer to resolve conflicts where dead vector lanes would be clobbered, at least as long as the clobbered vector lanes don't escape the basic block. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164250 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-19 21:29:18 +00:00
Evan Cheng	b37b6ca4bb	MOVi16 (movw) is only legal on cpus with V6T2 support. rdar://12300648 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164169 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-18 21:24:16 +00:00
James Molloy	97ecb83dff	More domain conversion; convert VFP VMOVS to NEON instructions in more cases - when we may clobber the other S-lane by converting an S to a D instruction, make an effort to work out if the S lane is clobberable or not. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164114 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-18 08:31:15 +00:00
Evan Cheng	d10eab0a95	Use vld1 / vst2 for unaligned v2f64 load / store. e.g. Use vld1.16 for 2-byte aligned address. Based on patch by David Peixotto. Also use vld1.64 / vst1.64 with 128-bit alignment to take advantage of alignment hints. rdar://12090772, rdar://12238782 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164089 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-18 01:42:45 +00:00
Jakob Stoklund Olesen	87f7864c6d	Merge into undefined lanes under -new-coalescer. Add LIS::pruneValue() and extendToIndices(). These two functions are used by the register coalescer when merging two live ranges requires more than a trivial value mapping as supported by LiveInterval::join(). The pruneValue() function can remove the part of a value number that is going to conflict in join(). Afterwards, extendToIndices can restore the live range, using any new dominating value numbers and updating the SSA form. Use this complex value mapping to support merging a register into a vector lane that has a conflicting value, but the clobbered lane is undef. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164074 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-17 23:03:25 +00:00
Silviu Baranga	c8bf0f8662	Removed the VMLxForwarding feature for the Cortex-A15 target. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164030 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-17 14:10:54 +00:00
Silviu Baranga	616471d4bf	This patch introduces A15 as a target in LLVM. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163803 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-13 15:05:10 +00:00
Kristof Beyls	789efbad2a	Fix constant folding through bitcasts by no longer relying on undefined behaviour (converting NaN values between float and double). SelectionDAG::getConstantFP(double Val, EVT VT, bool isTarget); should not be used when Val is not a simple constant (as the comment in SelectionDAG.h indicates). This patch avoids using this function when folding an unknown constant through a bitcast, where it cannot be guaranteed that Val will be a simple constant. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163703 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-12 11:25:02 +00:00
Jakob Stoklund Olesen	519daf5d2d	Don't attempt to use flags from predicated instructions. The ARM backend can eliminate cmp instructions by reusing flags from a nearby sub instruction with similar arguments. Don't do that if the sub is predicated - the flags are not written unconditionally. <rdar://problem/12263428> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163535 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-10 19:17:25 +00:00
James Molloy	8cd08bf4ac	Fix an assertion failure when optimising a shufflevector incorrectly into concat_vectors, and a followup bug with SelectionDAG::getNode() creating nodes with invalid types. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163511 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-10 14:01:21 +00:00
Craig Topper	a1fb1d2ed7	Set operation action for FFLOOR to Expand for all vector types for X86. Set FFLOOR of v4f32 to Expand for ARM. v2f64 was already correct. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163458 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-08 04:58:43 +00:00
James Molloy	ba8562af44	Improve codegen for BUILD_VECTORs on ARM. If we have a BUILD_VECTOR that is mostly a constant splat, it is often better to splat that constant then insertelement the non-constant lanes instead of insertelementing every lane from an undef base. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163304 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-06 09:55:02 +00:00
James Molloy	6c822eea47	Optimize codegen for VSETLNi{8,16,32} operating on Q registers. Degenerate to a VSETLN on D registers, instead of an (INSERT_SUBREG (VSETLN (EXTRACT_SUBREG ))) sequence to help the register coalescer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163298 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-06 09:16:01 +00:00
Jakob Stoklund Olesen	098c6a547f	Use predication instead of pseudo-opcodes when folding into MOVCC. Now that it is possible to dynamically tie MachineInstr operands, predicated instructions are possible in SSA form: %vreg3<def> = SUBri %vreg1, -2147483647, pred:14, pred:%noreg, %opt:%noreg %vreg4<def,tied1> = MOVCCr %vreg3<tied0>, %vreg1, %pred:12, pred:%CPSR Becomes a predicated SUBri with a tied imp-use: SUBri %vreg1, -2147483647, pred:13, pred:%CPSR, opt:%noreg, %vreg1<imp-use,tied0> This means that any instruction that is safe to move can be folded into a MOVCC, and the *CC pseudo-instructions are no longer needed. The test case changes reflect that Thumb2SizeReduce recognizes the predicated instructions. It didn't understand the pseudos. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163274 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-05 23:58:02 +00:00
Tim Northover	7bebddf55e	Strip old MachineInstrs after we know we can put them back. Previous patch accidentally decided it couldn't convert a VFP to a NEON instruction after it had already destroyed the old one. Not a good move. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163230 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-05 18:37:53 +00:00
Silviu Baranga	3d5e161fe4	Fixed the DAG combiner to better handle the folding of AND nodes for vector types. The previous code was making the assumption that the length of the bitmask returned by isConstantSplat was equal to the size of the vector type. Now we first make sure that the splat value has at least the length of the vector lane type, then we only use as many fields as we have available in the splat value. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163203 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-05 08:57:21 +00:00
Arnold Schwaighofer	67514e9066	Patch to implement UMLAL/SMLAL instructions for the ARM architecture This patch corrects the definition of umlal/smlal instructions and adds support for matching them to the ARM dag combiner. Bug 12213 Patch by Yin Ma! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163136 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-04 14:37:49 +00:00
Nadav Rotem	9f40cb32ac	Not all targets have efficient ISel code generation for select instructions. For example, the ARM target does not have efficient ISel handling for vector selects with scalar conditions. This patch adds a TLI hook which allows the different targets to report which selects are supported well and which selects should be converted to CF duting codegen prepare. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163093 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-02 12:10:19 +00:00
Nadav Rotem	f55ef64544	Generate better select code by allowing the target to use scalar select, and not sign-extend. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163086 91177308-0d34-0410-b5e6-96231b3b80d8	2012-09-02 08:20:07 +00:00

1 2 3 4 5 ...

1604 Commits