llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-08-23 17:29:19 +00:00

Author	SHA1	Message	Date
Bill Wendling	95dd442041	Strip the pointer casts off of allocas so that the selection DAG can find them. PR10799 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155954 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-01 22:50:45 +00:00
Manman Ren	769ea2f93f	X86: optimization for max-like struct This patch will optimize the following cases on X86 (a > b) ? (a-b) : 0 (a >= b) ? (a-b) : 0 (b < a) ? (a-b) : 0 (b <= a) ? (a-b) : 0 FROM movl %edi, %ecx subl %esi, %ecx cmpl %edi, %esi movl $0, %eax cmovll %ecx, %eax TO xorl %eax, %eax subl %esi, %edi cmovll %eax, %edi movl %edi, %eax rdar: 10734411 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155919 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-01 17:16:15 +00:00
Jay Foad	4ab6fcaadc	Regression test for PR2960. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155912 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-01 11:11:34 +00:00
Manman Ren	16a76519a5	X86: optimization for -(x != 0) This patch will optimize -(x != 0) on X86 FROM cmpl $0x01,%edi sbbl %eax,%eax notl %eax TO negl %edi sbbl %eax %eax git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155853 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-30 22:51:25 +00:00
Manman Ren	1701105022	test/CodeGen/X86/select.ll: remove spaces git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155840 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-30 18:54:27 +00:00
Derek Schuff	ddc693bd22	Fix fastcc structure return with fast-isel on x86-32 On x86-32, structure return via sret lets the callee pop the hidden pointer argument off the stack, which the caller then re-pushes. However if the calling convention is fastcc, then a register is used instead, and the caller should not adjust the stack. This is implemented with a check of IsTailCallConvention X86TargetLowering::LowerCall but is now checked properly in X86FastISel::DoSelectCall. (this time, actually commit what was reviewed!) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155825 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-30 16:57:15 +00:00
Bob Wilson	ff73d8fef9	Don't introduce illegal types when creating vmull operations. <rdar://11324364> ARM BUILD_VECTORs created after type legalization cannot use i8 or i16 operands, since those types are not legal. Instead use i32 operands, which will be implicitly truncated by the BUILD_VECTOR to match the element type. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155824 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-30 16:53:34 +00:00
Andrew Trick	2674a4acdb	Reapply 155668: Fix the SD scheduler to avoid gluing the same node twice. This time, also fix the caller of AddGlue to properly handle incomplete chains. AddGlue had failure modes, but shamefully hid them from its caller. It's luck ran out. Fixes rdar://11314175: BuildSchedUnits assert. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155749 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-28 01:03:23 +00:00
Derek Schuff	f3db6b855e	Revert r155745 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155746 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-27 23:37:41 +00:00
Derek Schuff	9dc28b0722	Fix fastcc structure return with fast-isel on x86-32 On x86-32, structure return via sret lets the callee pop the hidden pointer argument off the stack, which the caller then re-pushes. However if the calling convention is fastcc, then a register is used instead, and the caller should not adjust the stack. This is implemented with a check of IsTailCallConvention X86TargetLowering::LowerCall but is now checked properly in X86FastISel::DoSelectCall. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155745 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-27 23:27:17 +00:00
Andrew Trick	0e47cfd5b6	Temporarily revert r155668: Fix the SD scheduler to avoid gluing. This definitely caused regression with ARM -mno-thumb. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155743 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-27 22:55:59 +00:00
Chad Rosier	a73b6fc511	Add x86-specific DAG combine to simplify: x == -y --> x+y == 0 x != -y --> x+y != 0 On x86, the generated code goes from negl %esi cmpl %esi, %edi je .LBB0_2 to addl %esi, %edi je .L4 This case is correctly handled for ARM with "cmn". Patch by Manman Ren. rdar://11245199 PR12545 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155739 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-27 22:33:25 +00:00
Evan Cheng	da71cca458	Make test less fragile. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155732 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-27 20:48:18 +00:00
Lang Hames	7787800481	Fix the order of the operands in the llvm.fma intrinsic patterns for ARM, <rdar://problem/11325085>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155724 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-27 18:51:24 +00:00
Benjamin Kramer	17c836c4b5	X86: Don't emit conditional floating point moves on when targeting pre-pentiumpro architectures. * Model FPSW (the FPU status word) as a register. * Add ISel patterns for the FUCOM, FNSTSW and SAHF instructions. During Legalize/Lowering, build a node sequence to transfer the comparison result from FPSW into EFLAGS. If you're wondering about the right-shift: That's an implicit sub-register extraction (%ax -> %ah) which is handled later on by the instruction selector. Fixes PR6679. Patch by Christoph Erhardt! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155704 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-27 12:07:43 +00:00
Craig Topper	76c5897eae	Add mcpu to tests to prevent them from using AVX instructions on Sandy Bridge after r155618. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155696 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-27 07:11:58 +00:00
Evan Cheng	afb3b5ebe6	Implement a bastardized ABI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155686 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-27 02:11:10 +00:00
Evan Cheng	97a454317a	- thumbv6 shouldn't imply +thumb2. Cortex-M0 doesn't suppport 32-bit Thumb2 instructions. - However, it does support dmb, dsb, isb, mrs, and msr. rdar://11331541 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155685 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-27 01:27:19 +00:00
Andrew Trick	aec9240be2	Fix the SD scheduler to avoid gluing the same node twice. DAGCombine strangeness may result in multiple loads from the same offset. They both may try to glue themselves to another load. We could insist that the redundant loads glue themselves to each other, but the beter fix is to bail out from bad gluing at the time we detect it. Fixes rdar://11314175: BuildSchedUnits assert. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155668 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-26 21:48:25 +00:00
Tim Northover	37abe8df4a	Use VLD1 in NEON extenting-load patterns instead of VLDR. On some cores it's a bad idea for performance to mix VFP and NEON instructions and since these patterns are NEON anyway, the NEON load should be used. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155630 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-26 08:46:29 +00:00
Evan Cheng	e67a4163f5	If triple is armv7 / thumbv7 and a CPU is specified, do not automatically assume the feature set of v7a. This comes about if the user specifies something like -arch armv7 -mcpu=cortex-m3. We shouldn't be generating instructions such as uxtab in this case. rdar://11318438 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155601 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-26 01:13:36 +00:00
Jakob Stoklund Olesen	6962106496	Try to fix llvm-arm-linux builder with -mcpu. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155589 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-25 21:22:33 +00:00
Preston Gurd	01c0dd1be3	Trivial change to make the test use -mcpu=generic so as to avoid a failure if run on an Intel Atom with post RA instruction scheduling. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155587 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-25 21:04:54 +00:00
Akira Hatanaka	25052f4077	Do not use $gp as a dedicated global register if the target ABI is not O32. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155522 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-25 01:24:52 +00:00
Nadav Rotem	2003e03045	Fix the testcase. We do expect two vblendw on XMMs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155477 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-24 19:57:38 +00:00
Nadav Rotem	34a13bb412	Add a testcase for 155440 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155475 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-24 19:45:28 +00:00
Evan Cheng	ddb1420e17	MachineBasicBlock::SplitCriticalEdge() should follow LLVM IR variant and refuse to break edge to EH landing pad. rdar://11300144 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155470 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-24 19:06:55 +00:00
Nadav Rotem	d1a79136e3	AVX: We lower VECTOR_SHUFFLE and BUILD_VECTOR nodes into vbroadcast instructions using the pattern (vbroadcast (i32load src)). In some cases, after we generate this pattern new users are added to the load node, which prevent the selection of the blend pattern. This commit provides fallback patterns which perform in-vector broadcast (using in-vector vbroadcast in AVX2 and pshufd on AVX1). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155437 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-24 11:07:03 +00:00
Nadav Rotem	a35407705d	Optimize the vector UINT_TO_FP, SINT_TO_FP and FP_TO_SINT operations where the integer type is i8 (commonly used in graphics). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155397 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-23 21:53:37 +00:00
Preston Gurd	6a8c7bf8e7	This patch fixes a problem which arose when using the Post-RA scheduler on X86 Atom. Some of our tests failed because the tail merging part of the BranchFolding pass was creating new basic blocks which did not contain live-in information. When the anti-dependency code in the Post-RA scheduler ran, it would sometimes rename the register containing the function return value because the fact that the return value was live-in to the subsequent block had been lost. To fix this, it is necessary to run the RegisterScavenging code in the BranchFolding pass. This patch makes sure that the register scavenging code is invoked in the X86 subtarget only when post-RA scheduling is being done. Post RA scheduling in the X86 subtarget is only done for Atom. This patch adds a new function to the TargetRegisterClass to control whether or not live-ins should be preserved during branch folding. This is necessary in order for the anti-dependency optimizations done during the PostRASchedulerList pass to work properly when doing Post-RA scheduling for the X86 in general and for the Intel Atom in particular. The patch adds and invokes the new function trackLivenessAfterRegAlloc() instead of using the existing requiresRegisterScavenging(). It changes BranchFolding.cpp to call trackLivenessAfterRegAlloc() instead of requiresRegisterScavenging(). It changes the all the targets that implemented requiresRegisterScavenging() to also implement trackLivenessAfterRegAlloc(). It adds an assertion in the Post RA scheduler to make sure that post RA liveness information is available when it is needed. It changes the X86 break-anti-dependencies test to use –mcpu=atom, in order to avoid running into the added assertion. Finally, this patch restores the use of anti-dependency checking (which was turned off temporarily for the 3.1 release) for Intel Atom in the Post RA scheduler. Patch by Andy Zhang! Thanks to Jakob and Anton for their reviews. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155395 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-23 21:39:35 +00:00
Chandler Carruth	d410eaba04	Revert r155365, r155366, and r155367. All three of these have regression test suite failures. The failures occur at each stage, and only get worse, so I'm reverting all of them. Please resubmit these patches, one at a time, after verifying that the regression test suite passes. Never submit a patch without running the regression test suite. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155372 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-23 18:25:57 +00:00
Sirish Pande	15e56ad885	Hexagon V5 (floating point) support. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155367 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-23 17:49:40 +00:00
Sirish Pande	1bfd24851e	Support for Hexagon architectural feature, new value jump. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155366 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-23 17:49:28 +00:00
Sirish Pande	0dac3919e5	Support for Hexagon VLIW Packetizer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155365 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-23 17:49:20 +00:00
Elena Demikhovsky	dd9047815c	cleaned line endings in the newly added test file git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155315 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-22 13:22:48 +00:00
Elena Demikhovsky	1da5867236	ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155309 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-22 09:39:03 +00:00
Nadav Rotem	db3461662e	Teach getVectorTypeBreakdown about promotion of vectors in addition to widening of vectors. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155296 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-21 20:08:32 +00:00
Jakob Stoklund Olesen	0b35c35efc	Fix PR12599. The X86 target is editing the selection DAG while isel is selecting nodes following a topological ordering. When the DAG hacking triggers CSE, nodes can be deleted and bad things happen. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155257 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-20 23:36:09 +00:00
Joel Jones	c8969fd291	Test for the the problem with xors being changed into ands when the set bits aren't the same for both args of the xor. This transformation is in the function TargetLowering::SimplifyDemandedBits in the file lib/CodeGen/SelectionDAG/TargetLowering.cpp. I have tested this test using a previous version of llc which the defect and the a version of llc which does not. I got the expected fail and pass, respectively. This test goes with rdar://11195364 and the check in with the fix: svn r154955 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155156 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-19 20:54:44 +00:00
Joe Groff	d15c581100	Move win32 SimplifyLibcall test under Transforms git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154967 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-18 00:07:45 +00:00
Joe Groff	d5bda5ec66	fix pr12559: mark unavailable win32 math libcalls also fix SimplifyLibCalls to use TLI rather than compile-time conditionals to enable optimizations on floor, ceil, round, rint, and nearbyint git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154960 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-17 23:05:54 +00:00
Benjamin Kramer	93751c8c99	Force cmov on test so block placement doesn't shuffle the code around. This made the test fail with -mcpu=generic (when building on a non-x86 host). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154926 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-17 13:55:23 +00:00
James Molloy	72aadc057c	Fix bad EXTRACT_SUBREG in instruction selection for extending-loads on NEON. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154915 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-17 08:18:00 +00:00
Andrew Trick	8ca441aad3	Test cases that assume layout should use -disable-code-place. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154908 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-17 06:20:42 +00:00
Preston Gurd	8975f510c0	temporarily XFAIL this test until post RA live-ins is properly enabled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154882 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-17 00:21:35 +00:00
Chandler Carruth	fd2e4e65f7	Disable the atom scheduling test after r154874 broke it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154877 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-16 23:11:39 +00:00
Chandler Carruth	177bea5330	Relax this test a touch to cope with different assembly variants. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154870 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-16 22:20:48 +00:00
Chandler Carruth	f1a60c734c	Fix updateTerminator to be resiliant to degenerate terminators where both fallthrough and a conditional branch target the same successor. Gracefully delete the conditional branch and introduce any unconditional branch needed to reach the actual successor. This fixes memory corruption in 2009-06-15-RegScavengerAssert.ll and possibly other tests. Also, while I'm here fix a latent bug I spotted by inspection. I never applied the same fundamental fix to this fallthrough successor finding logic that I did to the logic used when there are no conditional branches. As a consequence it would have selected landing pads had they be aligned in just the right way here. I don't have a test case as I spotted this by inspection, and the previous time I found this required have of TableGen's source code to produce it. =/ I hate backend bugs. ;] Thanks to Jim Grosbach for helping me reason through this and reviewing the fix. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154867 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-16 22:03:00 +00:00
Jakob Stoklund Olesen	39ac3252e8	FileCheckize these tests. Add an extra test to ldr_post with an immediate increment. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154859 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-16 20:56:42 +00:00
Jakob Stoklund Olesen	fbefc9125d	Disable code placement for this test. It makes it less sensitive to small changes in heuristics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154857 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-16 20:49:06 +00:00
Richard Smith	2c651fe6f4	Fix incorrect atomics codegen introduced in r154705, and extend test to catch it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154845 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-16 18:43:53 +00:00
Bill Wendling	57ca13ecc4	Move to X86 directory because this fails on non-X86 platforms. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154825 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-16 16:38:48 +00:00
Chandler Carruth	9e67db4af1	Flip the new block-placement pass to be on by default. This is mostly to test the waters. I'd like to get results from FNT build bots and other bots running on non-x86 platforms. This feature has been pretty heavily tested over the last few months by me, and it fixes several of the execution time regressions caused by the inlining work by preventing inlining decisions from radically impacting block layout. I've seen very large improvements in yacr2 and ackermann benchmarks, along with the expected noise across all of the benchmark suite whenever code layout changes. I've analyzed all of the regressions and fixed them, or found them to be impossible to fix. See my email to llvmdev for more details. I'd like for this to be in 3.1 as it complements the inliner changes, but if any failures are showing up or anyone has concerns, it is just a flag flip and so can be easily turned off. I'm switching it on tonight to try and get at least one run through various folks' performance suites in case SPEC or something else has serious issues with it. I'll watch bots and revert if anything shows up. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154816 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-16 13:49:17 +00:00
Chandler Carruth	0de089a4d0	Remove an overly brittle test. This test will no longer be interesting once we start changing the block layout, so just nuke it. If anyone has ideas about how to craft a code layout agnostic form of the test please let me know. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154815 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-16 13:49:09 +00:00
Chandler Carruth	e773e8c3e5	Add a somewhat hacky heuristic to do something different from whole-loop rotation. When there is a loop backedge which is an unconditional branch, we will end up with a branch somewhere no matter what. Try placing this backedge in a fallthrough position above the loop header as that will definitely remove at least one branch from the loop iteration, where whole loop rotation may not. I haven't seen any benchmarks where this is important but loop-blocks.ll tests for it, and so this will be covered when I flip the default. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154812 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-16 13:33:36 +00:00
Chandler Carruth	16295fc20b	Tweak the loop rotation logic to check whether the loop is naturally laid out in a form with a fallthrough into the header and a fallthrough out of the bottom. In that case, leave the loop alone because any rotation will introduce unnecessary branches. If either side looks like it will require an explicit branch, then the rotation won't add any, do it to ensure the branch occurs outside of the loop (if possible) and maximize the benefit of the fallthrough in the bottom. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154806 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-16 09:31:23 +00:00
Hal Finkel	31490baf38	Remove dead SD nodes after the combining pass. Fixes PR12201. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154786 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-16 03:33:22 +00:00
Chandler Carruth	70daea90af	Rewrite how machine block placement handles loop rotation. This is a complex change that resulted from a great deal of experimentation with several different benchmarks. The one which proved the most useful is included as a test case, but I don't know that it captures all of the relevant changes, as I didn't have specific regression tests for each, they were more the result of reasoning about what the old algorithm would possibly do wrong. I'm also failing at the moment to craft more targeted regression tests for these changes, if anyone has ideas, it would be welcome. The first big thing broken with the old algorithm is the idea that we can take a basic block which has a loop-exiting successor and a looping successor and use the looping successor as the layout top in order to get that particular block to be the bottom of the loop after layout. This happens to work in many cases, but not in all. The second big thing broken was that we didn't try to select the exit which fell into the nearest enclosing loop (to which we exit at all). As a consequence, even if the rotation worked perfectly, it would result in one of two bad layouts. Either the bottom of the loop would get fallthrough, skipping across a nearer enclosing loop and thereby making it discontiguous, or it would be forced to take an explicit jump over the nearest enclosing loop to earch its successor. The point of the rotation is to get fallthrough, so we need it to fallthrough to the nearest loop it can. The fix to the first issue is to actually layout the loop from the loop header, and then rotate the loop such that the correct exiting edge can be a fallthrough edge. This is actually much easier than I anticipated because we can handle all the hard parts of finding a viable rotation before we do the layout. We just store that, and then rotate after layout is finished. No inner loops get split across the post-rotation backedge because we check for them when selecting the rotation. That fix exposed a latent problem with our exitting block selection -- we should allow the backedge to point into the middle of some inner-loop chain as there is no real penalty to it, the whole point is that it won't be a fallthrough edge. This may have blocked the rotation at all in some cases, I have no idea and no test case as I've never seen it in practice, it was just noticed by inspection. Finally, all of these fixes, and studying the loops they produce, highlighted another problem: in rotating loops like this, we sometimes fail to align the destination of these backwards jumping edges. Fix this by actually walking the backwards edges rather than relying on loopinfo. This fixes regressions on heapsort if block placement is enabled as well as lots of other cases where the previous logic would introduce an abundance of unnecessary branches into the execution. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154783 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-16 01:12:56 +00:00
Craig Topper	2cb1e9dc7d	Remove AVX2 vpermq and vpermpd intrinsics. These can now be handled with normal shuffle vectors. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154778 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-15 22:43:31 +00:00
Nadav Rotem	f16af0a053	Fix PR12529. The Vxx family of instructions are only supported by AVX. Use non-vex instructions for SSE4. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154770 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-15 19:36:44 +00:00
Nadav Rotem	3ab32ea49e	When emulating vselect using OR/AND/XOR make sure to bitcast the result back to the original type. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154764 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-15 15:08:09 +00:00
Elena Demikhovsky	73c504af9d	Added VPERM optimization for AVX2 shuffles git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154761 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-15 11:18:59 +00:00
Richard Smith	42fc29e717	Fix X86 codegen for 'atomicrmw nand' to generate x = ~(x & y), not x = ~x & y. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154705 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-13 22:47:00 +00:00
Evan Cheng	7ece9539c2	On Darwin targets, only use vfma etc. if the source use fma() intrinsic explicitly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154689 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-13 18:59:28 +00:00
Sirish Pande	2f69e4cf32	Disable Hexagon test temporarily. There is an assert at line 558 in ScheduleDAGInstrs::buildSchedGraph(AliasAnalysis *AA). This assert needs to addressed for post RA scheduler. Until that assert is addressed, any passes that uses post ra scheduler will fail. So, I am temporarily disabling the hexagon tests until that fix is in. The assert is as follows: assert(!MI->isTerminator() && !MI->isLabel() && "Cannot schedule terminators or labels!"); git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154617 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-12 21:06:54 +00:00
Craig Topper	bf596c9c61	Fix 128-bit ptest intrinsics to take v2i64 instead of v4f32 since these are integer instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154580 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-12 07:23:00 +00:00
Akira Hatanaka	ed08489a71	Revert changes that were accidentally committed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154563 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-11 23:19:55 +00:00
Akira Hatanaka	55e0e43e4f	Fix string that is being checked. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154547 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-11 23:11:33 +00:00
Akira Hatanaka	1cc6333161	Emit neg.s or neg.d only if -enable-no-nans-fp-math is supplied by user, otherwise expand FNEG during legalization. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154546 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-11 22:59:08 +00:00
Akira Hatanaka	c12a6e6b53	Emit abs.s or abs.d only if -enable-no-nans-fp-math is supplied by user. Invalid operation is signaled if the operand of these instructions is NaN. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154545 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-11 22:49:04 +00:00
Akira Hatanaka	056c51e598	Fix bugs in lowering of FCOPYSIGN nodes. - FCOPYSIGN nodes that have operands of different types were not handled. - Different code was generated depending on the endianness of the target. Additionally, code is added that emits INS and EXT instructions, if they are supported by target (they are R2 instructions). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154540 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-11 22:13:04 +00:00
Evan Cheng	14b4c03580	Add more fused mul+add/sub patterns. rdar://10139676 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154484 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-11 06:59:47 +00:00
Nadav Rotem	e611378a6e	Reapply 154396 after fixing a test. Original message: Modify the code that lowers shuffles to blends from using blendvXX to vblendXX. blendV uses a register for the selection while Vblend uses an immediate. On sandybridge they still have the same latency and execute on the same execution ports. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154483 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-11 06:40:27 +00:00
Evan Cheng	92c904539a	Match (fneg (fma) to vfnma. rdar://10139676 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154469 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-11 01:21:25 +00:00
Evan Cheng	a0908d0a44	Merge fma.ll into fusedMAC.ll git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154466 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-11 01:03:11 +00:00
Jakob Stoklund Olesen	89cdaf46ec	Fix test to be register assignment invariant. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154453 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-11 00:00:24 +00:00
Owen Anderson	06886aaaeb	Move the constant-folding support for FP_ROUND in SelectionDAG from the one-operand version of getNode() to the two-operand version, since it became a two-operand node at sound point. Zap a testcase that this allows us to completely fold away. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154447 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-10 22:46:53 +00:00
Evan Cheng	3aef2ff514	Handle llvm.fma.* intrinsics. rdar://10914096 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154439 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-10 21:40:28 +00:00
Duncan Sands	507bb7a42f	Add a comment noting that the fdiv -> fmul conversion won't generate multiplication by a denormal, and some tests checking that. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154431 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-10 20:35:27 +00:00
Eric Christopher	a139051654	Temporarily revert this patch to see if it brings the buildbots back. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154425 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-10 19:33:16 +00:00
Eric Christopher	18112d83e7	To ensure that we have more accurate line information for a block don't elide the branch instruction if it's the only one in the block, otherwise it's ok. PR9796 and rdar://11215207 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154417 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-10 18:18:10 +00:00
Nadav Rotem	50e64cfe6e	Modify the code that lowers shuffles to blends from using blendvXX to vblendXX. blendv uses a register for the selection while vblend uses an immediate. On sandybridge they still have the same latency and execute on the same execution ports. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154396 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-10 14:33:13 +00:00
Anton Korobeynikov	999821cddf	Transform div to mul with reciprocal only when fp imm is legal. This fixes PR12516 and uncovers one weird problem in legalize (workarounded) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154394 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-10 13:22:49 +00:00
Evan Cheng	fa12d0df5a	Add proper checks. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154379 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-10 03:15:42 +00:00
Evan Cheng	bf010eb911	Fix a long standing tail call optimization bug. When a libcall is emitted legalizer always use the DAG entry node. This is wrong when the libcall is emitted as a tail call since it effectively folds the return node. If the return node's input chain is not the entry (i.e. call, load, or store) use that as the tail call input chain. PR12419 rdar://9770785 rdar://11195178 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154370 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-10 01:51:00 +00:00
Rafael Espindola	fdb230a154	Don't try to zExt just to check if an integer constant is zero, it might not fit in a i64. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154364 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-10 00:16:22 +00:00
Lang Hames	23f369d1fe	Test case for PR12495. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154359 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-09 23:58:59 +00:00
Akira Hatanaka	787c3fd385	Have TargetLowering::getPICJumpTableRelocBase return a node that points to the GOT if jump table uses 64-bit gp-relative relocation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154341 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-09 20:32:12 +00:00
Chad Rosier	7f35455708	When performing a truncating store, it's possible to rearrange the data in-register, such that we can use a single vector store rather then a series of scalar stores. For func_4_8 the generated code vldr d16, LCPI0_0 vmov d17, r0, r1 vadd.i16 d16, d17, d16 vmov.u16 r0, d16[3] strb r0, [r2, #3] vmov.u16 r0, d16[2] strb r0, [r2, #2] vmov.u16 r0, d16[1] strb r0, [r2, #1] vmov.u16 r0, d16[0] strb r0, [r2] bx lr becomes vldr d16, LCPI0_0 vmov d17, r0, r1 vadd.i16 d16, d17, d16 vuzp.8 d16, d17 vst1.32 {d16[0]}, [r2, :32] bx lr I'm not fond of how this combine pessimizes 2012-03-13-DAGCombineBug.ll, but I couldn't think of a way to judiciously apply this combine. This ldrh r0, [r0, #4] strh r0, [r1] becomes vldr d16, [r0] vmov.u16 r0, d16[2] vmov.32 d16[0], r0 vuzp.16 d16, d17 vst1.32 {d16[0]}, [r1, :32] PR11158 rdar://10703339 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154340 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-09 20:32:02 +00:00
Rafael Espindola	decbc43f72	Pattern match a setcc of boolean value with 0 as a truncate. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154322 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-09 16:06:03 +00:00
Nadav Rotem	e80aa7c783	Lower some x86 shuffle sequences to the vblend family of instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154313 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-09 08:33:21 +00:00
Nadav Rotem	154819dd6f	Fix a bug in the lowering of broadcasts: ConstantPools need to use the target pointer type. Move NormalizeVectorShuffle and LowerVectorBroadcast into X86TargetLowering. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154310 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-09 07:45:58 +00:00
Chandler Carruth	ab5a55e118	Cleanup and relax a restriction on the matching of global offsets into x86 addressing modes. This allows PIE-based TLS offsets to fit directly into an addressing mode immediate offset, which is the last remaining code quality issue from PR12380. With this patch, that PR is completely fixed. To understand why this patch is correct to match these offsets into addressing mode immediates, break it down by cases: 1) 32-bit is trivially correct, and unmodified here. 2) 64-bit non-small mode is unchanged and never matches. 3) 64-bit small PIC code which is RIP-relative is handled specially in the match to try to fit RIP into the base register. If it fails, it now early exits. This behavior is unchanged by the patch. 4) 64-bit small non-PIC code which is not RIP-relative continues to work as it did before. The reason these immediates are safe is because the ABI ensures they fit in small mode. This behavior is unchanged. 5) 64-bit small PIC code which is not using RIP-relative addressing. This is the only case changed by the patch, and the primary place you see it is in TLS, either the win64 section offset TLS or Linux local-exec TLS model in a PIC compilation. Here the ABI again ensures that the immediates fit because we are in small mode, and any other operations required due to the PIC relocation model have been handled externally to the Wrapper node (extra loads etc are made around the wrapper node in ISelLowering). I've tested this as much as I can comparing it with GCC's output, and everything appears safe. I discussed this with Anton and it made sense to him at least at face value. That said, if there are issues with PIC code after this patch, yell and we can revert it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154304 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-09 02:13:06 +00:00
Chandler Carruth	6916a2375a	Fold 15 tiny test cases into a single file that implements the comprehensive testing of TLS codegen for x86. Convert all of the ones that were still using grep to use FileCheck. Remove some redundancies between them. Perhaps most interestingly expand the test cases so that they actually fully list the instruction snippet being tested. TLS operations are very narrowly defined, and so these seem reasonably stable. More importantly, the existing test cases already were crazy fine grained, expecting specific registers to be allocated. This just clarifies that no other instructions are expected, and fills in some crucial gaps that weren't being tested at all. This will make any subsequent changes to TLS much more clear during review. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154303 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-09 01:43:17 +00:00
Duncan Sands	3ef3fcfc04	Only have codegen turn fdiv by a constant into fmul by the reciprocal when -ffast-math, i.e. don't just always do it if the reciprocal can be formed exactly. There is already an IR level transform that does that, and it does it more carefully. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154296 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-08 18:08:12 +00:00
Chandler Carruth	253933ee9e	Teach LLVM about a PIE option which, when enabled on top of PIC, makes optimizations which are valid for position independent code being linked into a single executable, but not for such code being linked into a shared library. I discussed the design of this with Eric Christopher, and the decision was to support an optional bit rather than a completely separate relocation model. Fundamentally, this is still PIC relocation, its just that certain optimizations are only valid under a PIC relocation model when the resulting code won't be in a shared library. The simplest path to here is to expose a single bit option in the TargetOptions. If folks have different/better designs, I'm all ears. =] I've included the first optimization based upon this: changing TLS models to the *Exec models when PIE is enabled. This is the LLVM component of PR12380 and is all of the hard work. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154294 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-08 17:51:45 +00:00
Nadav Rotem	9d68b06bc5	AVX2: Build splat vectors by broadcasting a scalar from the constant pool. Previously we used three instructions to broadcast an immediate value into a vector register. On Sandybridge we continue to load the broadcasted value from the constant pool. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154284 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-08 12:54:54 +00:00
Nadav Rotem	d16c8d0d33	1. Remove the part of r153848 which optimizes shuffle-of-shuffle into a new shuffle node because it could introduce new shuffle nodes that were not supported efficiently by the target. 2. Add a more restrictive shuffle-of-shuffle optimization for cases where the second shuffle reverses the transformation of the first shuffle. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154266 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-07 21:19:08 +00:00
Duncan Sands	961d666be4	Convert floating point division by a constant into multiplication by the reciprocal if converting to the reciprocal is exact. Do it even if inexact if -ffast-math. This substantially speeds up ac.f90 from the polyhedron benchmarks. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154265 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-07 20:04:00 +00:00
Sean Hunt	0fdfaafb70	Make the test for r154235 more platform-independent with a shorter string. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154243 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-07 01:33:14 +00:00

1 2 3 4 5 ...

6712 Commits