llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2025-02-25 19:29:53 +00:00

Author	SHA1	Message	Date
Nadav Rotem	a05f7cbbde	Reformat the docs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171091 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-26 04:59:20 +00:00
Benjamin Kramer	99f78061e0	X86: Shave off one shuffle from the pcmpeqq sequence for SSE2 by making use of and commutativity. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171064 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-25 13:09:08 +00:00
Benjamin Kramer	382ed78d3f	X86: Custom lower <2 x i64> eq and ne when SSE41 is not available. pcmpeqd, pshufd, pshufd, pand is cheaper than unpack + cmpq, sbbq, cmpq, sbbq + pack. Small speedup on loop-vectorized viterbi (-march=core2). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171063 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-25 12:54:19 +00:00
Nick Lewycky	71f30bffcf	Quiet gcc's -Wparenthesis warning. No functionality change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171044 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-24 19:58:45 +00:00
Nadav Rotem	40b04a481d	whitespace git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170997 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-23 07:33:44 +00:00
Nadav Rotem	d54fed2786	Loop Vectorizer: Update the cost model of scatter/gather operations and make them more expensive. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170995 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-23 07:23:55 +00:00
Benjamin Kramer	2f8a6cdfa3	X86: Turn mul of <4 x i32> into pmuludq when no SSE4.1 is available. pmuludq is slow, but it turns out that all the unpacking and packing of the scalarized mul is even slower. 10% speedup on loop-vectorized paq8p. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170985 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-22 16:07:56 +00:00
Benjamin Kramer	17347912b4	X86: Emit vector sext as shuffle + sra if vpmovsx is not available. Also loosen the SSSE3 dependency a bit, expanded pshufb + psra is still better than scalarized loads. Fixes PR14590. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170984 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-22 11:34:28 +00:00
Benjamin Kramer	2556c6b4b6	X86: Match pmin/pmax as a target specific dag combine. This occurs during vectorization. Part of PR14667. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170908 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-21 17:46:58 +00:00
Benjamin Kramer	739c7a83e1	X86: Match the SSE/AVX min/max vector ops using a custom node instead of intrinsics This is very mechanical, no functionality change. Preparation for PR14667. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170898 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-21 14:04:55 +00:00
Nadav Rotem	f5637c3997	Improve the X86 cost model for loads and stores. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170830 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-21 01:33:59 +00:00
Patrik Hagglund	e5c65911a6	Change TargetLowering::getTypeForExtArgOrReturn to take and return MVTs, instead of EVTs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170537 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-19 12:02:25 +00:00
Patrik Hagglund	0340557fb8	Change TargetLowering::findRepresentativeClass to take an MVT, instead of EVT. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170532 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-19 11:30:36 +00:00
NAKAMURA Takumi	16537418f4	X86ISelLowering.cpp: Fix warnings. [-Wlogical-op-parentheses] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170523 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-19 10:12:48 +00:00
Elena Demikhovsky	4b977312c7	Optimized load + SIGN_EXTEND patterns in the X86 backend. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170506 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-19 07:50:20 +00:00
Bill Wendling	034b94b170	Rename the 'Attributes' class to 'Attribute'. It's going to represent a single attribute in the future. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170502 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-19 07:18:57 +00:00
Jakub Staszak	270bfbd3d1	Reverse order of checking SSE level when calculating compare cost, so we check AVX2 before AVX. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170464 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-18 22:57:56 +00:00
Craig Topper	b926afcc5b	Simplify BMI ANDN matching to use patterns instead of a DAG combine. Also add ANDN to isDefConvertible. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170305 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-17 05:12:30 +00:00
Benjamin Kramer	388fc6a988	X86: Add a couple of target-specific dag combines that turn VSELECTS into psubus if possible. We match the pattern "x >= y ? x-y : 0" into "subus x, y" and two special cases if y is a constant. DAGCombiner canonicalizes those so we first have to undo the canonicalization for those cases. The pattern occurs in gzip when the loop vectorizer is enabled. Part of PR14613. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170273 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-15 16:47:44 +00:00
Nadav Rotem	0a1e914f8f	TypeLegalizer: Do not generate target specific nodes with illegal types, because we cant type-legalize them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170245 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-14 21:20:37 +00:00
Evan Cheng	946a3a9f22	Sorry about the churn. One more change to getOptimalMemOpType() hook. Did I mention the inline memcpy / memset expansion code is a mess? This patch split the ZeroOrLdSrc argument into two: IsMemset and ZeroMemset. The first indicates whether it is expanding a memset or a memcpy / memmove. The later is whether the memset is a memset of zero. It's totally possible (likely even) that targets may want to do different things for memcpy and memset of zero. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169959 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-12 02:34:41 +00:00
Evan Cheng	7d34267df6	- Rename isLegalMemOpType to isSafeMemOpType. "Legal" is a very overloade term. Also added more comments to explain why it is generally ok to return true. - Rename getOptimalMemOpType argument IsZeroVal to ZeroOrLdSrc. It's meant to be true for loaded source (memcpy) or zero constants (memset). The poor name choice is probably some kind of legacy issue. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169954 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-12 01:32:07 +00:00
Evan Cheng	61f4dfe369	Avoid using lossy load / stores for memcpy / memset expansion. e.g. f64 load / store on non-SSE2 x86 targets. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169944 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-12 00:42:09 +00:00
Patrik Hagglund	34525f9ac0	Revert EVT->MVT changes, r169836-169851, due to buildbot failures. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169854 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-11 11:14:33 +00:00
Patrik Hagglund	47fd10f2fc	Change TargetLowering::getTypeForExtArgOrReturn to take and return MVTs, instead of EVTs. Accordingly, add bitsLT (and similar) to MVT. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169850 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-11 10:20:51 +00:00
Patrik Hagglund	bade0345d1	Change TargetLowering::findRepresentativeClass to take an MVT, instead of EVT. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169845 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-11 09:57:18 +00:00
Evan Cheng	376642ed62	Some enhancements for memcpy / memset inline expansion. 1. Teach it to use overlapping unaligned load / store to copy / set the trailing bytes. e.g. On 86, use two pairs of movups / movaps for 17 - 31 byte copies. 2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g. x86 and ARM. 3. When memcpy from a constant string, do not replace the load with a constant if it's not possible to materialize an integer immediate with a single instruction (required a new target hook: TLI.isIntImmLegal()). 4. Use unaligned load / stores more aggressively if target hooks indicates they are "fast". 5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8. Also increase the threshold to something reasonable (8 for memset, 4 pairs for memcpy). This significantly improves Dhrystone, up to 50% on ARM iOS devices. rdar://12760078 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169791 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-10 23:21:26 +00:00
Shuxin Yang	5518a1355b	- Re-enable population count loop idiom recognization - fix a bug which cause sigfault. - add two testing cases which was causing crash git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169687 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-09 03:12:46 +00:00
Chandler Carruth	7065a2bcec	Revert the patches adding a popcount loop idiom recognition pass. There are still bugs in this pass, as well as other issues that are being worked on, but the bugs are crashers that occur pretty easily in the wild. Test cases have been sent to the original commit's review thread. This reverts the commits: r169671: Fix a logic error. r169604: Move the popcnt tests to an X86 subdirectory. r168931: Initial commit adding the pass. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169683 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-08 22:18:29 +00:00
Bill Wendling	99faa3b4ec	s/AttrListPtr/AttributeSet/g to better label what this class is going to be in the near future. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169651 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-07 23:16:57 +00:00
Nadav Rotem	af59e9adbd	When we use the BLEND instruction that uses the MSB as a mask, we can remove the VSRI instruction before it since it does not affect the MSB. Thanks Craig Topper for suggesting this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169638 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-07 21:43:11 +00:00
Nadav Rotem	e4ccfef809	X86: Prefer using VPSHUFD over VPERMIL because it has better throughput. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169624 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-07 19:01:13 +00:00
Evan Cheng	2766a47310	Replace r169459 with something safer. Rather than having computeMaskedBits to understand target implementation of any_extend / extload, just generate zero_extend in place of any_extend for liveouts when the target knows the zero_extend will be implicit (e.g. ARM ldrb / ldrh) or folded (e.g. x86 movz). rdar://12771555 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169536 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-06 19:13:27 +00:00
Jakub Staszak	d3a056392b	Remove unneeded function, since PR8156 was fixed over a year ago. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169534 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-06 19:05:46 +00:00
Jakub Staszak	b2af3a095b	Simplify code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169521 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-06 18:22:59 +00:00
Evan Cheng	8a7186dbc2	Let targets provide hooks that compute known zero and ones for any_extend and extload's. If they are implemented as zero-extend, or implicitly zero-extend, then this can enable more demanded bits optimizations. e.g. define void @foo(i16* %ptr, i32 %a) nounwind { entry: %tmp1 = icmp ult i32 %a, 100 br i1 %tmp1, label %bb1, label %bb2 bb1: %tmp2 = load i16* %ptr, align 2 br label %bb2 bb2: %tmp3 = phi i16 [ 0, %entry ], [ %tmp2, %bb1 ] %cmp = icmp ult i16 %tmp3, 24 br i1 %cmp, label %bb3, label %exit bb3: call void @bar() nounwind br label %exit exit: ret void } This compiles to the followings before: push {lr} mov r2, #0 cmp r1, #99 bhi LBB0_2 @ BB#1: @ %bb1 ldrh r2, [r0] LBB0_2: @ %bb2 uxth r0, r2 cmp r0, #23 bhi LBB0_4 @ BB#3: @ %bb3 bl _bar LBB0_4: @ %exit pop {lr} bx lr The uxth is not needed since ldrh implicitly zero-extend the high bits. With this change it's eliminated. rdar://12771555 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169459 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-06 01:28:01 +00:00
Elena Demikhovsky	226e0e6264	Simplified BLEND pattern matching for shuffles. Generate VPBLENDD for AVX2 and VPBLENDW for v16i16 type on AVX2. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169366 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-05 09:24:57 +00:00
Evan Cheng	4e54480531	Add x86 isel lowering logic to form bit test with inverted condition. e.g. x ^ -1. Patch by David Majnemer. rdar://12755626 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169339 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-05 00:10:38 +00:00
Chandler Carruth	d04a8d4b33	Use the new script to sort the includes of every file under lib. Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to a header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169131 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-03 16:50:05 +00:00
Shuxin Yang	84fca61ca5	rdar://12100355 (part 1) This revision attempts to recognize following population-count pattern: while(a) { c++; ... ; a &= a - 1; ... }, where <c> and <a>could be used multiple times in the loop body. TODO: On X8664 and ARM, __buildin_ctpop() are not expanded to a efficent instruction sequence, which need to be improved in the following commits. Reviewed by Nadav, really appreciate! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168931 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-29 19:38:54 +00:00
Elena Demikhovsky	8564dc67b5	I changed hasAVX() to hasFp256() and hasAVX2() to hasInt256() in X86IselLowering.cpp. The logic was not changed, only names. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168875 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-29 12:44:59 +00:00
Jakub Staszak	d642baf4be	Normalize splat 256bit vectors with 8 elements. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168600 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-26 19:24:31 +00:00
Craig Topper	3dcefc864e	Mark ISD::FMA as Legal instead of custom for x86 with FMA3/FMA4. Needed so that llvm.muladd can be converted to ISD::FMA for fp_contract. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168413 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-21 05:36:24 +00:00
Duncan Sands	dc7f174b5e	Add the Erlang/HiPE calling convention, patch by Yiannis Tsiouris. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168166 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-16 12:36:39 +00:00
Craig Topper	d577552c66	Use roundps/pd for llvm.ceil, llvm.trunc, llvm.rint, and llvm.nearbyint of vector types. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168141 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-16 06:37:56 +00:00
Craig Topper	490104720d	Add llvm.ceil, llvm.trunc, llvm.rint, llvm.nearbyint intrinsics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168025 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-15 06:51:10 +00:00
Benjamin Kramer	2dbe929685	X86: Enable SSE memory intrinsics even when stack alignment is less than 16 bytes. The stack realignment code was fixed to work when there is stack realignment and a dynamic alloca is present so this shouldn't cause correctness issues anymore. Note that this also enables generation of AVX instructions for memset under the assumptions: - Unaligned loads/stores are always fast on CPUs supporting AVX - AVX is not slower than SSE We may need some tweaked heuristics if one of those assumptions turns out not to be true. Effectively reverts r58317. Part of PR2962. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167967 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-14 20:08:40 +00:00
Craig Topper	55de339dad	Factor out an overly replicated typecast. No functional change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167916 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-14 06:41:09 +00:00
Manman Ren	2adc503f29	X86: when constructing VZEXT_LOAD from other loads, makes sure its output chain is correctly setup. As an example, if the original load must happen before later stores, we need to make sure the constructed VZEXT_LOAD is constrained to be before the stores. rdar://12684358 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167859 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-13 19:13:05 +00:00
Michael Liao	dd3383fd09	Fix PR14314 - Fix operand order for atomic sub, where the minuend is the value loaded from memory and the subtrahend is the parameter specified. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167718 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-12 06:49:17 +00:00

... 2 3 4 5 6 ...

2368 Commits