llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2025-02-20 14:29:27 +00:00

Author	SHA1	Message	Date
Robert Lougher	5a2ae98407	Test commit - added a new line to vec_shuf-insert.ll. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201083 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-10 12:42:13 +00:00
Elena Demikhovsky	27ef6eec41	AVX-512: Fixed extract_vector_elt for v16i1 and v8i1 vectors. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201066 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-10 07:02:39 +00:00
Rafael Espindola	26baaa1efb	Always create a temporary symbol to use with the cfi frame. This is a small simplification and a small step in fixing pr18743 since private functions on MachO should be using a 'l' prefix. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200994 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-07 21:23:18 +00:00
Rafael Espindola	d413950a93	Use FileCheck variables to simplify this test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200992 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-07 21:11:33 +00:00
Rafael Espindola	0732e94378	Fix a bug with .weak_def_can_be_hidden: Mutable variables cannot use it. Thanks to John McCall for noticing it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200977 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-07 16:21:30 +00:00
Jim Grosbach	1f65cfad96	X86: Resolve a long standing FIXME and properly isel pextr[bw]. Generalize the AArch64 .td nodes for AssertZext and AssertSext. Use them to match the relevant pextr store instructions. The test widen_load-2.ll requires a slight change because with the stores gone, the remaining instructions are scheduled in a different order. Add test cases for SSE4 and AVX variants. Resolves rdar://13414672. Patch by Adam Nemet <anemet@apple.com>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200957 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-07 00:16:33 +00:00
Quentin Colombet	30c0f72237	[CodeGenPrepare] Move away sign extensions that get in the way of addressing mode. Basically the idea is to transform code like this: %idx = add nsw i32 %a, 1 %sextidx = sext i32 %idx to i64 %gep = gep i8* %myArray, i64 %sextidx load i8* %gep Into: %sexta = sext i32 %a to i64 %idx = add nsw i64 %sexta, 1 %gep = gep i8* %myArray, i64 %idx load i8* %gep That way the computation can be folded into the addressing mode. This transformation is done as part of the addressing mode matcher. If the matching fails (not profitable, addressing mode not legal, etc.), the matcher will revert the related promotions. <rdar://problem/15519855> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200947 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-06 21:44:56 +00:00
Quentin Colombet	1a10a51431	[RegAlloc] Add a last chance recoloring mechanism when everything else failed to find a register. The idea is to choose a color for the variable that cannot be allocated and recolor its interferences around. Unlike the current register allocation scheme, it is allowed to change the color of an already assigned (but maybe not splittable or spillable) live interval while propagating this change to its neighbors. In other word, there are two things that may help finding an available color: - Already assigned variables (RS_Done) can be recolored to different color. - The recoloring allows to catch solutions that needs to touch more that just the neighbors of the current allocated variable. E.g., vA can use {R1, R2 } vB can use { R2, R3} vC can use {R1 } Where vA, vB, and vC cannot be split anymore (they are reloads for instance) and they all interfere. vA is assigned R1 vB is assigned R2 vC tries to evict vA but vA is already done. => Regular register allocation heuristic fails. Last chance recoloring kicks in: vC does as if vA was evicted => vC uses R1. vC is marked as fixed. vA needs to find a color. None are available. vA cannot evict vC: vC is a fixed virtual register now. vA does as if vB was evicted => vA uses R2. vB needs to find a color. R3 is available. Recoloring => vC = R1, vA = R2, vB = R3. <rdar://problem/15947839> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200883 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-05 22:13:59 +00:00
Rafael Espindola	f39297678b	Remove support for not using .loc directives. Clang itself was not using this. The only way to access it was via llc. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200862 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-05 18:00:21 +00:00
Elena Demikhovsky	c341b7c0ef	AVX-512: optimized icmp -> sext -> icmp pattern git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200849 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-05 16:17:36 +00:00
Elena Demikhovsky	002683abc7	AVX-512: Added intrinsic for cvtph2ps. Added VPTESTNM instruction. Added a pattern to vselect (lit tests will follow). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200823 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-05 07:05:03 +00:00
David Blaikie	cec7ce78d7	DebugInfo: Remove some unneeded conditionals now that DIBuilder no longer emits zero-length arrays as {i32 0} A bunch of test cases needed to be cleaned up for this, many my fault - when implementid imported modules I updated test cases by simply duplicating the prior metadata field - which wasn't always the empty metadata entry. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200731 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-04 01:23:52 +00:00
Hal Finkel	e6c04bff3c	Expand vector bswap in LegalizeVectorOps ISD::BSWAP was missing from the list of node types that should be expanded element-wise. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200705 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-03 17:27:25 +00:00
Josh Magee	cde5c26c46	[stackprotector] Implement the sspstrong rules for stack layout. This changes the PrologueEpilogInserter and LocalStackSlotAllocation passes to follow the extended stack layout rules for sspstrong and sspreq. The sspstrong layout rules are: 1. Large arrays and structures containing large arrays (>= ssp-buffer-size) are closest to the stack protector. 2. Small arrays and structures containing small arrays (< ssp-buffer-size) are 2nd closest to the protector. 3. Variables that have had their address taken are 3rd closest to the protector. Differential Revision: http://llvm-reviews.chandlerc.com/D2546 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200601 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-01 01:36:16 +00:00
Reid Kleckner	8a24e83550	Implement inalloca codegen for x86 with the new inalloca design Calls with inalloca are lowered by skipping all stores for arguments passed in memory and the initial stack adjustment to allocate argument memory. Now the frontend is responsible for the memory layout, and the backend doesn't have to do any work. As a result these changes are pretty minimal. Reviewers: echristo Differential Revision: http://llvm-reviews.chandlerc.com/D2637 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200596 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-31 23:50:57 +00:00
Reid Kleckner	f10743d765	Don't put non-static allocas in the static alloca map Allocas marked inalloca are never static, but we were trying to put them into the static alloca map if they were in the entry block. Also add an assertion in x86 fastisel. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200593 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-31 23:45:12 +00:00
Reid Kleckner	9934ff2b7f	Set -mcpu to make this test pass on atom bots git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200588 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-31 22:58:10 +00:00
Lang Hames	f96f832a3c	Replace X86 FMA intrinsic pseduo-instructions with def pats. It looks like these pseudos were only used for pattern matching. Def pats are the appropriate way to do that. As a bonus, these intrinsics will now have memory operands folded properly, and better FMA3 variants selected where appropriate (see r199933). <rdar://problem/15611947> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200577 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-31 21:29:19 +00:00
Reid Kleckner	65c98b9da4	[ms-cxxabi] Add a new calling convention that swaps 'this' and 'sret' MSVC always places the 'this' parameter for a method first. The implicit 'sret' pointer for methods always comes second. We already implement this for __thiscall by putting sret parameters on the stack, but __cdecl methods require putting both parameters on the stack in opposite order. Using a special calling convention allows frontends to keep the sret parameter first, which avoids breaking lots of assumptions in LLVM and Clang. Fixes PR15768 with the corresponding change in Clang. Reviewers: ributzka, majnemer Differential Revision: http://llvm-reviews.chandlerc.com/D2663 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200561 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-31 17:41:22 +00:00
Manman Ren	05324ab015	This patch teaches the DAGCombiner how to fold insert_subvector nodes when the input is a concat_vectors and the insert replaces one of the concat halves: Lower half: fold (insert_subvector (concat_vectors X, Y), Z) -> (concat_vectors Z, Y) Upper half: fold (insert_subvector (concat_vectors X, Y), Z) -> (concat_vectors X, Z) This can be seen with the following IR: define <8 x float> @lower_half(<4 x float> %v1, <4 x float> %v2, <4 x float> %v3) { %1 = shufflevector <4 x float> %v1, <4 x float> %v2, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7> %2 = tail call <8 x float> @llvm.x86.avx.vinsertf128.ps.256(<8 x float> %1, <4 x float> %v3, i8 0) The vinsertf128 intrinsic is converted into an insert_subvector node in SelectionDAGBuilder.cpp. Using AVX, without the patch this generates two vinsertf128 instructions: vinsertf128 $1, %xmm1, %ymm0, %ymm0 vinsertf128 $0, %xmm2, %ymm0, %ymm0 With the patch this is optimized into: vinsertf128 $1, %xmm1, %ymm2, %ymm0 Patch by Robert Lougher. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200506 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-31 01:10:35 +00:00
Manman Ren	21f09088d3	PGO branch weight: update edge weights in SelectionDAGBuilder. When converting from "or + br" to two branches, or converting from "and + br" to two branches, we correctly update the edge weights of the two branches. The previous attempt at r200431 was reverted at r200434 because of two testing case failures. I modified my patch a little, but forgot to re-run "make check-all". Testing case CodeGen/ARM/lsr-unfolded-offset.ll is updated because of the patch's impact on branch probability which causes changes in spill placement. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200502 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-31 00:42:44 +00:00
Juergen Ributzka	014fdcdaf0	[Stackmaps] Record the stack size of each function that contains a stackmap/patchpoint intrinsic. Re-applying the patch, but this time without using AsmPrinter methods. Reviewed by Andy git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200481 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-30 18:58:27 +00:00
Juergen Ributzka	d26c0e731c	Revert "[Stackmaps] Record the stack size of each function that contains a stackmap/patchpoint intrinsic." This reverts commit r200444 to unbreak buildbots. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200445 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-30 03:34:02 +00:00
Juergen Ributzka	2baaf25bf5	[Stackmaps] Record the stack size of each function that contains a stackmap/patchpoint intrinsic. Reviewed by Andy git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200444 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-30 03:06:14 +00:00
Manman Ren	2227f98e32	Revert r200431 due to bot failures. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200434 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-30 00:53:27 +00:00
Manman Ren	0bdaca5058	PGO branch weight: update edge weights in SelectionDAGBuilder. When converting from "or + br" to two branches, or converting from "and + br" to two branches, we correctly update the edge weights of the two branches. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200431 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-30 00:24:37 +00:00
Rafael Espindola	916d3120b3	Use a raw_stream to implement the mangler. This is a bit more convenient for some callers, but more importantly, it is easier to implement correctly. Doing this removes the patching of already printed data that was used for fastcall, fixing a crash with private fastcall symbols. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200367 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-29 02:30:38 +00:00
Andrea Di Biagio	106b79744b	[X86] Add extra rules for combining vselect dag nodes into movsd. This improves the fix committed at revision 199683 adding the following new target specific combine rules: 1) fold (v4i32: vselect <0,0,-1,-1>, A, B) -> (v4i32 (bitcast (movsd (v2i64 (bitcast A)), (v2i64 (bitcast B))) )) 2) fold (v4f32: vselect <0,0,-1,-1>, A, B) -> (v4f32 (bitcast (movsd (v2f64 (bitcast A)), (v2f64 (bitcast B))) )) 3) fold (v4i32: vselect <-1,-1,0,0>, A, B) -> (v4i32 (bitcast (movsd (v2i64 (bitcast B)), (v2i64 (bitcast A))) )) 4) fold (v4f32: vselect <-1,-1,0,0>, A, B) -> (v4f32 (bitcast (movsd (v2i64 (bitcast B)), (v2i64 (bitcast A))) )) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200324 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-28 18:14:21 +00:00
Andrea Di Biagio	5144469bb4	[DAGCombiner] Avoid introducing an illegal build_vector when folding a sign_extend. Make sure that we don't introduce illegal build_vector dag nodes when trying to fold a sign_extend of a build_vector. This fixes a regression introduced by r200234. Added test CodeGen/X86/fold-vector-sext-crash.ll to verify that llc no longer crashes with an assertion failure due to an illegal build_vector of type MVT::v4i64. Thanks to Ilia Filippov for spotting this regression and for providing a reproducible test case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200313 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-28 12:53:56 +00:00
Andrea Di Biagio	e9c0b5aba6	[DAGCombiner] Teach how to fold sext/aext/zext of constant build vectors. This patch teaches the DAGCombiner how to fold a sext/aext/zext dag node when the operand in input is a build vector of constants (or UNDEFs). The inability to fold a sext/zext of a constant build_vector was the root cause of some pcg bugs affecting vselect expansion on x86-64 with AVX support. Before this change, the DAGCombiner only knew how to fold a sext/zext/aext of a ConstantSDNode. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200234 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-27 18:45:30 +00:00
Stepan Dyatkovskiy	9019340120	Additional fix for 200201: due to dependence on bitwidth test was moved to X86 directory. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200202 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-27 09:43:10 +00:00
Juergen Ributzka	943ce55f39	Revert "Revert "Add Constant Hoisting Pass" (r200034)" This reverts commit r200058 and adds the using directive for ARMTargetTransformInfo to silence two g++ overload warnings. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200062 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-25 02:02:55 +00:00
Hans Wennborg	503793e834	Revert "Add Constant Hoisting Pass" (r200034) This commit caused -Woverloaded-virtual warnings. The two new TargetTransformInfo::getIntImmCost functions were only added to the superclass, and to the X86 subclass. The other targets were not updated, and the warning highlighted this by pointing out that e.g. ARMTTI::getIntImmCost was hiding the two new getIntImmCost variants. We could pacify the warning by adding "using TargetTransformInfo::getIntImmCost" to the various subclasses, or turning it off, but I suspect that it's wrong to leave the functions unimplemnted in those targets. The default implementations return TCC_Free, which I don't think is right e.g. for ARM. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200058 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-25 01:18:18 +00:00
Juergen Ributzka	96172cb4a4	Add Constant Hoisting Pass Retry commit r200022 with a fix for the build bot errors. Constant expressions have (unlike instructions) module scope use lists and therefore may have users in different functions. The fix is to simply ignore these out-of-function uses. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200034 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-24 20:18:00 +00:00
Lang Hames	a9699d475f	Add a testcase for the changes in r199938. <rdar://problem/15611947> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200027 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-24 19:00:19 +00:00
Juergen Ributzka	dc6f9b9a4f	Revert "Add Constant Hoisting Pass" This reverts commit r200022 to unbreak the build bots. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200024 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-24 18:40:30 +00:00
Juergen Ributzka	fb282c68b7	Add Constant Hoisting Pass This pass identifies expensive constants to hoist and coalesces them to better prepare it for SelectionDAG-based code generation. This works around the limitations of the basic-block-at-a-time approach. First it scans all instructions for integer constants and calculates its cost. If the constant can be folded into the instruction (the cost is TCC_Free) or the cost is just a simple operation (TCC_BASIC), then we don't consider it expensive and leave it alone. This is the default behavior and the default implementation of getIntImmCost will always return TCC_Free. If the cost is more than TCC_BASIC, then the integer constant can't be folded into the instruction and it might be beneficial to hoist the constant. Similar constants are coalesced to reduce register pressure and materialization code. When a constant is hoisted, it is also hidden behind a bitcast to force it to be live-out of the basic block. Otherwise the constant would be just duplicated and each basic block would have its own copy in the SelectionDAG. The SelectionDAG recognizes such constants as opaque and doesn't perform certain transformations on them, which would create a new expensive constant. This optimization is only applied to integer constants in instructions and simple (this means not nested) constant cast experessions. For example: %0 = load i64* inttoptr (i64 big_constant to i64*) Reviewed by Eric git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200022 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-24 18:23:08 +00:00
Alp Toker	ae43cab6ba	Fix known typos Sweep the codebase for common typos. Includes some changes to visible function names that were misspelt. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200018 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-24 17:20:08 +00:00
Juergen Ributzka	fe08a38a2c	[X86] Prevent the creation of redundant ops for sadd and ssub with overflow. This commit teaches the X86 backend to create the same X86 instructions when it lowers an sadd/ssub with overflow intrinsic and a conditional branch that uses that overflow result. This allows SelectionDAG to recognize and remove one of the redundant operations. This fixes <rdar://problem/15874016> and <rdar://problem/15661073>. Reviewed by Nadav git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199976 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-24 06:47:57 +00:00
Lang Hames	d8f4348cab	Replace vfmaddxx213 instructions with their 231-type equivalents in accumulator loops. Writing back to the accumulator (231-type) allows the coalescer to eliminate an extra copy. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199933 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-23 20:23:36 +00:00
Elena Demikhovsky	e1a621d84f	AVX-512: added VPERM2D VPERM2Q VPERM2PS VPERM2PD instructions, they give better sequences than VPERMI git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199893 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-23 14:27:26 +00:00
Owen Anderson	9920bd341a	Revert r162101 and replace it with a solution that works for targets where the pointer type is illegal. This is a horrible bit of code. We're calling a simplification routine in the middle of type legalization. We tell the simplification routine that it's running after legalization, but some of the types it will encounter will be illegal! The fix is only to invoke the simplification if the types in question were legal, so that none of its invariants will be violated. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199847 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-22 22:34:17 +00:00
Quentin Colombet	8b594ba851	Add a testcase for r199430. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199831 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-22 20:11:50 +00:00
Elena Demikhovsky	c75a44cda7	AVX512: combining setcc and zext is wrong on AVX512 because vector compare instruction puts result in mask register. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199798 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-22 12:26:19 +00:00
James Molloy	bb96dfcf4f	MachineCopyPropagation has special logic for removing COPY instructions. It will remove plain COPYs using eraseFromParent(), but if the COPY has imp-defs/imp-uses it will convert it to a KILL, to keep the imp-def around. This actually totally breaks and causes the machine verifier to cry in several cases, one of which being: %RAX<def> = COPY %RCX<kill> %ECX<def> = COPY %EAX<kill>, %RAX<imp-use,kill> These subregister copies are together identified as noops, so are both removed. However, the second one as it has an imp-use gets converted into a kill: %ECX<def> = KILL %EAX<kill>, %RAX<imp-use,kill> As the original COPY has been removed, the verifier goes into tears at the use of undefined EAX and RAX. There are several hacky solutions to this hacky problem (which is all to do with imp-use/def weirdnesses), but the least hacky I've come up with is to always remove COPYs by converting to KILLs. KILLs are no-ops to the code generator so the generated code doesn't change (which is why they were partially used in the first place), but using them also keeps the def/use and imp-def/imp-use chains alive: %RAX<def> = KILL %RCX<kill> %ECX<def> = KILL %EAX<kill>, %RAX<imp-use,kill> The patch passes all test cases including the ones that check the removal of MOVs in this circumstance, along with an extra test I added to check subregister behaviour (which made the machine verifier fall over before my patch). The patch also adds some DEBUG() statements because the file hadn't got any. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199797 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-22 09:12:27 +00:00
Chandler Carruth	182a02a8dc	Tweak the spelling of the asserts requirement a bit more. This makes it match the (reasonably prevelant) usage in Clang's test suite and so seems more "canonical". git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199767 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-21 22:39:19 +00:00
Andrea Di Biagio	825b93b2df	[X86] Teach how to combine a vselect into a movss/movsd Add target specific rules for combining vselect dag nodes into movss/movsd when possible. If the vector type of the vselect dag node in input is either MVT::v4i13 or MVT::v4f32, then try to fold according to rules: 1) fold (vselect (build_vector (0, -1, -1, -1)), A, B) -> (movss A, B) 2) fold (vselect (build_vector (-1, 0, 0, 0)), A, B) -> (movss B, A) If the vector type of the vselect dag node in input is either MVT::v2i64 or MVT::v2f64 (and we have SSE2), then try to fold according to rules: 3) fold (vselect (build_vector (0, -1)), A, B) -> (movsd A, B) 4) fold (vselect (build_vector (-1, 0)), A, B) -> (movsd B, A) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199683 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-20 19:35:22 +00:00
Hal Finkel	54c251c77f	Fix misched-aa-colored.ll to require asserts (trying again) Perhaps it needs to be in caps. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199661 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-20 14:15:28 +00:00
Hal Finkel	42691ad862	Fix misched-aa-colored.ll to require asserts. -misched=shuffle is NDEBUG only. Maybe we should change that. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199659 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-20 14:09:34 +00:00
Hal Finkel	48f7a2389e	Update IR when merging slots in stack coloring The way that stack coloring updated MMOs when merging stack slots, while correct, is suboptimal, and is incompatible with the use of AA during instruction scheduling. The solution, which involves the use of const_cast (and more importantly, updating the IR from within an MI-level pass), obviously requires some explanation: When the stack coloring pass was originally committed, the code in ScheduleDAGInstrs::buildSchedGraph tracked possible alias sets by using GetUnderlyingObject, and all load/store and store/store memory control dependencies where added between SUs at the object level (where only one object, that returned by GetUnderlyingObject, was used to identify the object associated with each MMO). When stack coloring merged stack slots, it would replace MMOs derived from the remapped alloca with the alloca with which the remapped alloca was being replaced. Because ScheduleDAGInstrs only used single objects, and tracked alias sets at the object level, this was a fine solution. In r169744, (Andy and) I updated the code in ScheduleDAGInstrs to use GetUnderlyingObjects, and track alias sets using, potentially, multiple underlying objects for each MMO. This was done, primarily, to provide the ability to look through PHIs, and provide better scheduling for induction-variable-dependent loads and stores inside loops. At this point, the MMO-updating code in stack coloring became suboptimal, because it would clear the MMOs for (i.e. completely pessimize) all instructions for which r169744 might help in scheduling. Updating the IR directly is the simplest fix for this (and the one with, by far, the least compile-time impact), but others are possible (we could give each MMO a small vector of potential values, or make use of a remapping table, constructed from MFI, inside ScheduleDAGInstrs). Unfortunately, replacing all MMO values derived from the remapped alloca with the base replacement alloca fundamentally breaks our ability to use AA during instruction scheduling (which is critical to performance on some targets). The reason is that the original MMO might have had an offset (either constant or dynamic) from the base remapped alloca, and that offset is not present in the updated MMO. One possible way around this would be to use GetPointerBaseWithConstantOffset, and update not only the MMO's value, but also its offset based on the original offset. Unfortunately, this solution would only handle constant offsets, and for safety (because AA is not completely restricted to deducing relationships with constant offsets), we would need to clear all MMOs without constant offsets over the entire function. This would be an even worse pessimization than the current single-object restriction. Any other solution would involve passing around a vector of remapped allocas, and teaching AA to use it, introducing additional complexity and overhead into AA. Instead, when remapping an alloca, we replace all IR uses of that alloca as well (optionally inserting a bitcast as necessary). This is even more efficient that the old MMO-updating code in the stack coloring pass (because it removes the need to call GetUnderlyingObject on all MMO values), removes the single-object pessimization in the default configuration, and enables the correct use of AA during instruction scheduling (all without any additional overhead). LLVM now no longer miscompiles itself on x86_64 when using -enable-misched -enable-aa-sched-mi -misched-bottomup=0 -misched-topdown=0 -misched=shuffle! Fixed PR18497. Because the alloca replacement is now done at the IR level, unless the MMO directly refers to the remapped alloca, the change cannot be seen at the MI level. As a result, there is no good way to fix test/CodeGen/X86/pr14090.ll. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199658 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-20 14:03:16 +00:00

1 2 3 4 5 ...

4768 Commits