llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2025-11-01 15:17:25 +00:00

Author	SHA1	Message	Date
Juergen Ributzka	4f0d671b97	[FastISel][AArch64] Add support for fabs intrinsic. Lower the llvm.fabs intrinsic to the 'fabs' MI instruction. This fixes rdar://problem/18946552. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221729 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 23:10:44 +00:00
Tom Roeder	63dea2c952	Add Forward Control-Flow Integrity. This commit adds a new pass that can inject checks before indirect calls to make sure that these calls target known locations. It supports three types of checks and, at compile time, it can take the name of a custom function to call when an indirect call check fails. The default failure function ignores the error and continues. This pass incidentally moves the function JumpInstrTables::transformType from private to public and makes it static (with a new argument that specifies the table type to use); this is so that the CFI code can transform function types at call sites to determine which jump-instruction table to use for the check at that site. Also, this removes support for jumptables in ARM, pending further performance analysis and discussion. Review: http://reviews.llvm.org/D4167 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221708 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 21:08:02 +00:00
Sanjay Patel	e7c966f067	Use rcpss/rcpps (X86) to speed up reciprocal calcs (PR21385). This is a first step for generating SSE rcp instructions for reciprocal calcs when fast-math allows it. This is very similar to the rsqrt optimization enabled in D5658 ( http://reviews.llvm.org/rL220570 ). For now, be conservative and only enable this for AMD btver2 where performance improves significantly both in terms of latency and throughput. We may never enable this codegen for Intel Core* chips because the divider circuits are just too fast. On SandyBridge, divss can be as fast as 10 cycles versus the 21 cycle critical path for the rcp + mul + sub + mul + add estimate. Follow-on patches may allow configuration of the number of Newton-Raphson refinement steps, add AVX512 support, and enable the optimization for more chips. More background here: http://llvm.org/bugs/show_bug.cgi?id=21385 Differential Revision: http://reviews.llvm.org/D6175 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221706 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 20:51:00 +00:00
Rafael Espindola	612f7d7e00	Simplify testcase. NFC. Thanks to Filipe Cabecinhas for the tip. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221705 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 20:49:16 +00:00
Bill Schmidt	10161a0cce	[PowerPC] Replace foul hackery with real calls to __tls_get_addr My original support for the general dynamic and local dynamic TLS models contained some fairly obtuse hacks to generate calls to __tls_get_addr when lowering a TargetGlobalAddress. Rather than generating real calls, special GET_TLS_ADDR nodes were used to wrap the calls and only reveal them at assembly time. I attempted to provide correct parameter and return values by chaining CopyToReg and CopyFromReg nodes onto the GET_TLS_ADDR nodes, but this was also not fully correct. Problems were seen with two back-to-back stores to TLS variables, where the call sequences ended up overlapping with unhappy results. Additionally, since these weren't real calls, the proper register side effects of a call were not recorded, so clobbered values were kept live across the calls. The proper thing to do is to lower these into calls in the first place. This is relatively straightforward; see the changes to PPCTargetLowering::LowerGlobalTLSAddress() in PPCISelLowering.cpp. The changes here are standard call lowering, except that we need to track the fact that these calls will require a relocation. This is done by adding a machine operand flag of MO_TLSLD or MO_TLSGD to the TargetGlobalAddress operand that appears earlier in the sequence. The calls to LowerCallTo() eventually find their way to LowerCall_64SVR4() or LowerCall_32SVR4(), which call FinishCall(), which calls PrepareCall(). In PrepareCall(), we detect the calls to __tls_get_addr and immediately snag the TargetGlobalTLSAddress with the annotated relocation information. This becomes an extra operand on the call following the callee, which is expected for nodes of type tlscall. We change the call opcode to CALL_TLS for this case. Back in FinishCall(), we change it again to CALL_NOP_TLS for 64-bit only, since we require a TOC-restore nop following the call for the 64-bit ABIs. During selection, patterns in PPCInstrInfo.td and PPCInstr64Bit.td convert the CALL_TLS nodes into BL_TLS nodes, and convert the CALL_NOP_TLS nodes into BL8_NOP_TLS nodes. This replaces the code removed from PPCAsmPrinter.cpp, as the BL_TLS or BL8_NOP_TLS nodes can now be emitted normally using their patterns and the associated printTLSCall print method. Finally, as a result of these changes, all references to get-tls-addr in its various guises are no longer used, so they have been removed. There are existing TLS tests to verify the changes haven't messed anything up). I've added one new test that verifies that the problem with the original code has been fixed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221703 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 20:44:09 +00:00
Rafael Espindola	71c70733b7	Use a 8 bit immediate when possible. This fixes pr21529. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221700 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 19:46:36 +00:00
Dario Domizioli	949d328bee	[X86][ELF] Fix PR20243 - leaf frame pointer bug with TLS access The ISel lowering for global TLS access in PIC mode was creating a pseudo instruction that is later expanded to a call, but the code was not setting the hasCalls flag in the MachineFrameInfo alongside the adjustsStack flag. This caused some functions to be mistakenly recognized as leaf functions, and this in turn affected the decision to eliminate the frame pointer. With the fix, hasCalls is properly set and the leaf frame pointer is correctly preserved. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221695 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 18:44:49 +00:00
Oliver Stannard	659b1491b8	LLVM incorrectly folds xor into select LLVM replaces the SelectionDAG pattern (xor (set_cc cc x y) 1) with (set_cc !cc x y), which is only correct when the xor has type i1. Instead, we should check that the constant operand to the xor is all ones. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221693 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 17:36:01 +00:00
Vasileios Kalintiris	328bc2f89e	[mips] Add preliminary support for the MIPS II target. Summary: This patch enables code generation for the MIPS II target. Pre-Mips32 targets don't have the MUL instruction, so we add the correspondent pattern that uses the MULT/MFLO combination in order to retrieve the product. This is WIP as we don't support code generation for select nodes due to the lack of conditional-move instructions. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6150 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221686 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 11:43:55 +00:00
Andrea Di Biagio	d6548ad013	[X86] Add missing check for 'isINSERTPSMask' in method 'isShuffleMaskLegal'. This helps the DAGCombiner to identify more opportunities to fold shuffles. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221684 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 11:20:31 +00:00
Michael Kuperstein	f2fe3b72a9	[X86] Fix pattern match for 32-to-64-bit zext in the presence of AssertSext This fixes an issue with matching trunc -> assertsext -> zext on x86-64, which would not zero the high 32-bits. See PR20494 for details. Recommitting - This time, with a hopefully working test. Differential Revision: http://reviews.llvm.org/D6128 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221672 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 07:07:40 +00:00
Quentin Colombet	8201185d61	[X86] Custom lower UINT_TO_FP from v4f32 to v4i32, and for v8f32 to v8i32 if AVX2 is available. According to IACA, the new lowering has a throughput of 8 cycles instead of 13 with the previous one. Althought this lowering kicks in some SPECs benchmarks, the performance improvement was within the noise. Correctness testing has been done for the whole range of uint32_t with the following program: uint4 v = (uint4) {0,1,2,3}; uint32_t i; //Check correctness over entire range for uint4 -> float4 conversion for( i = 0; i < 1U << (32-2); i++ ) { float4 t = test(v); float4 c = correct(v); if( 0xf != _mm_movemask_ps( t == c )) { printf( "Error @ %vx: %vf vs. %vf\n", v, c, t); return -1; } v += 4; } Where "correct" is the old lowering and "test" the new one. The patch adds a test case for the two custom lowering instruction. It also modifies the vector cost model, which is why cast.ll and uitofp.ll are modified. 2009-02-26-MachineLICMBug.ll is also modified because we now hoist 7 instructions instead of 4 (3 more constant loads). rdar://problem/18153096> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221657 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 02:23:47 +00:00
Michael Kuperstein	dee48e7ad4	Reverting r221626 due to a too-strict test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221629 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-10 21:07:41 +00:00
Juergen Ributzka	1b9706b8c6	[AArch64][FastISel] Fix kill flags for integer extends. In the case we optimize an integer extend away and replace it directly with the source register, we also have to clear all kill flags at all its uses. This is necessary, because the orignal IR instruction might be trivially dead, but we replaced it with a nop at MI level. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221628 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-10 21:05:31 +00:00
Michael Kuperstein	1a66dc7468	[X86] Fix pattern match for 32-to-64-bit zext in the presence of AssertSext This fixes an issue with matching trunc -> assertsext -> zext on x86-64, which would not zero the high 32-bits. See PR20494 for details. Differential Revision: http://reviews.llvm.org/D6128 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221626 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-10 20:40:21 +00:00
Zoran Jovanovic	c63c935a80	[mips][microMIPS] Fix issue with delay slot filler and microMIPS Differential Revision: http://reviews.llvm.org/D6193 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221612 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-10 17:27:56 +00:00
Daniel Sanders	62c2faa216	[mips] Fix sret arguments for N32/N64 which were accidentally broken in r221534. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221604 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-10 15:57:53 +00:00
Matt Arsenault	79da624ab2	R600/SI: Fix broken check prefixes in test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221565 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-08 00:02:57 +00:00
Daniel Sanders	fe2b8b1960	[mips] Promote i32 arguments to i64 for the N32/N64 ABI and fix <64-bit structs... Summary: ... and after all that refactoring, it's possible to distinguish softfloat floating point values from integers so this patch no longer breaks softfloat to do it. Remove direct handling of i32's in the N32/N64 ABI by promoting them to i64. This more closely reflects the ABI documentation and also fixes problems with stack arguments on big-endian targets. We now rely on signext/zeroext annotations (already generated by clang) and the Assert[SZ]ext nodes to avoid the introduction of unnecessary sign/zero extends. It was not possible to convert three tests to use signext/zeroext. These tests are bswap.ll, ctlz-v.ll, ctlz-v.ll. It's not possible to put signext on a vector type so we just accept the sign extends here for now. These tests don't pass the vectors the same way clang does (clang puts multiple elements in the same argument, these map 1 element to 1 argument) so we don't need to worry too much about it. With this patch, all known N32/N64 bugs should be fixed and we now pass the first 10,000 tests generated by ABITest.py. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6117 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221534 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-07 16:54:21 +00:00
Ahmed Bougacha	75da31ff15	[AArch64] Keep flags on condition vreg when instantiating a CB branch. Reversing a CB* instruction used to drop the flags on the condition. On the included testcase, this lead to a read from an undefined vreg. Using addOperand keeps the flags, here <undef>. Differential Revision: http://reviews.llvm.org/D6159 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221507 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-07 02:50:00 +00:00
Simon Pilgrim	de3d50643c	[X86][SSE] Vector integer/float conversion memory folding (cvttps2dq / cvttpd2dq) Fixed an issue with the (v)cvttps2dq and (v)cvttpd2dq instructions being incorrectly put in the 2 source operand folding tables instead of the 1 source operand and added the missing SSE/AVX versions. Also added missing (v)cvtps2dq and (v)cvtpd2dq instructions to the folding tables. Differential Revision: http://reviews.llvm.org/D6001 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221489 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 22:15:41 +00:00
Ahmed Bougacha	112aabeeeb	[X86] Add VFMADDSUB cases for the 213->231 custom inserter. Also add tests for vfmadd/vfmsub. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221488 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 22:04:15 +00:00
Ahmed Bougacha	f44d4cd925	[X86] Add missing FMA3 VFMADDSUB in the emitter. Also reuse the fma4 intrinsic test to cover fma3 instructions too. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221487 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 21:58:11 +00:00
Ahmed Bougacha	8b6319bfed	[X86] Split FMA4 RM tests into a separate file. NFC. While there, remove useless comments. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221484 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 21:46:23 +00:00
Lang Hames	0ea3b243b0	[RegAlloc] Remove reference to the trivial spiller in test case. This test case was never actually testing the trivial spiller: the -spiller option has not been hooked up for a while now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221475 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 19:24:18 +00:00
Rafael Espindola	4396b44b9a	Use FileCheck in a few tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221459 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 15:05:51 +00:00
Rafael Espindola	eed959015b	Compute the correct jump table entries on 32 bit windows. On 32 bit windows we use label differences and .set does not suppress rolocations, a combination that was not used before r220256. This fixes PR21497. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221456 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 14:39:49 +00:00
Andrea Di Biagio	f0f66a254d	[X86] When commuting SSE immediate blend, make sure that the new blend mask is a valid imm8. Example: define <4 x i32> @test(<4 x i32> %a, <4 x i32> %b) { %shuffle = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 4, i32 5, i32 6, i32 3> ret <4 x i32> %shuffle } Before llc (-mattr=+sse4.1), produced the following assembly instruction: pblendw $4294967103, %xmm1, %xmm0 After pblendw $63, %xmm1, %xmm0 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221455 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 14:36:45 +00:00
Toma Tabacu	7f22a20351	[mips] Tolerate the use of the %z inline asm operand modifier with non-immediates. Summary: Currently, we give an error if %z is used with non-immediates, instead of continuing as if the %z isn't there. For example, you use the %z operand modifier along with the "Jr" constraints ("r" makes the operand a register, and "J" makes it an immediate, but only if its value is 0). In this case, you want the compiler to print "$0" if the inline asm input operand turns out to be an immediate zero and you want it to print the register containing the operand, if it's not. We give an error in the latter case, and we shouldn't (GCC also doesn't). Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6023 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221453 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 14:25:42 +00:00
Sasa Stankovic	2b8f96996b	[mips] Add the following MIPS options that control gp-relative addressing of small data items: -mgpopt, -mlocal-sdata, -mextern-sdata. Implement gp-relative addressing for constants. Differential Revision: http://reviews.llvm.org/D4903 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221450 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 13:20:12 +00:00
Rafael Espindola	58de7099c7	Add three other sections when L symbols are allowed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221436 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 05:01:21 +00:00
Rafael Espindola	e1f22e397d	Allow L symbols in no_dead_strip sections. If a section cannot be dead stripped, it is safe to use L symbols, since the linker will keep all of it in the end. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221431 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 02:42:03 +00:00
Quentin Colombet	d465eced34	[X86] Lower VSELECT into SHRUNKBLEND when we shrink the bits used into the condition to match a blend. This prevents optimizations that work on VSELECT to perform invalid transformations. Indeed, the optimized condition does not match the vector boolean content that is expected and bad things may happen. This patch yields the exact same code on the whole test-suite + specs (-O3 and -O3 -march=core-avx2), it improves one test case (vector-blend.ll) and fixes a bug reduced in vselect-avx.ll. <rdar://problem/18819506> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221429 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 02:25:03 +00:00
Petar Jovanovic	5940390ece	[mips64] Fix MIPS64 exception personality encoding Remove dynamic relocations of __gxx_personality_v0 from the .eh_frame. The MIPS64 follow-up of the MIPS32 fix (rL209907). Patch by Vladimir Stefanovic. Differential Revision: http://reviews.llvm.org/D6141 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221408 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 22:42:31 +00:00
Simon Pilgrim	3f1d66fe93	[X86][SSE] Vector integer to float conversion memory folding Added missing memory folding for the (V)CVTDQ2PS instructions - we can safely fold these (but not the (V)CVTDQ2PD versions which have a register/memory size discrepancy in the source operand). I've added a test case demonstrating that stack folding now works. Differential Revision: http://reviews.llvm.org/D5981 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221407 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 22:28:25 +00:00
Derek Schuff	65c7979555	Fix test breakage from r221386 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221389 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 20:02:05 +00:00
Derek Schuff	07ffcb1007	[x86 fast-isel] Materialize allocas with the correct-sized lea for ILP32 Summary: X86FastISel::fastMaterializeAlloca was incorrectly conditioning its opcode selection on subtarget bitness rather than pointer size. Differential Revision: http://reviews.llvm.org/D6136 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221386 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 19:27:21 +00:00
Matt Arsenault	e7a8e298fb	R600/SI: Add testcase I forgot to commit from months ago git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221384 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 19:01:22 +00:00
Justin Holewinski	e459c0bf65	[NVPTX] Add NVPTXLowerStructArgs pass This works around the limitation that PTX does not allow .param space loads/stores with arbitrary pointers. If a function has a by-val struct ptr arg, say foo(%struct.x byval %d), then add the following instructions to the first basic block : %temp = alloca %struct.x, align 8 %tt1 = bitcast %struct.x %d to i8 * %tt2 = llvm.nvvm.cvt.gen.to.param %tt2 %tempd = bitcast i8 addrspace(101) * to %struct.x addrspace(101) * %tv = load %struct.x addrspace(101) * %tempd store %struct.x %tv, %struct.x * %temp, align 8 The above code allocates some space in the stack and copies the incoming struct from param space to local space. Then replace all occurences of %d by %temp. Fixes PR21465. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221377 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 18:19:30 +00:00
Zoran Jovanovic	cd2d40cef6	ps][microMIPS] Implement CodeGen support for ANDI16 instruction git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221371 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 17:43:00 +00:00
Zoran Jovanovic	a1925e6d5d	ps][microMIPS] Implement CodeGen support for SLL16 and SRL16 instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221369 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 17:38:31 +00:00
Zoran Jovanovic	e9b9ca452f	Reverted revisions 221351, 221352 and 221353. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221354 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 16:19:59 +00:00
Zoran Jovanovic	e7ec22de06	[mips][microMIPS] Implement CodeGen support for ANDI16 instruction Differential Revision: http://reviews.llvm.org/D5797 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221353 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 15:54:05 +00:00
Zoran Jovanovic	8cfd4909f0	[mips][microMIPS] Implement CodeGen support for SLL16 and SRL16 instructions Differential Revision: http://reviews.llvm.org/D5933 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221352 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 15:46:53 +00:00
Tom Stellard	8eaed0f63d	R600/SI: Change all instruction assembly names to lowercase. This matches the format produced by the AMD proprietary driver. //==================================================================// // Shell script for converting .ll test cases: (Pass the .ll files you want to convert to this script as arguments). //==================================================================// ; This was necessary on my system so that A-Z in sed would match only ; upper case. I'm not sure why. export LC_ALL='C' TEST_FILES="$" MATCHES=`grep -v Patterns SIInstructions.td \| grep -o '"[A-Z0-9_]\+["e]' \| grep -o '[A-Z0-9_]\+' \| sort -r` for f in $TEST_FILES; do # Check that there are SI tests: grep -q -e 'verde' -e 'bonaire' -e 'SI' -e 'tahiti' $f if [ $? -eq 0 ]; then for match in $MATCHES; do sed -i -e "s/$[ :]$match$/\L\1/" $f done # Try to get check lines with partial instruction names sed -i 's/$;[ ]SI[A-Z\\-]: $$[A-Z_0-9]\+$/\1\L\2/' $f fi done sed -i -e 's/bb0_1/BB0_1/g' ../../../test/CodeGen/R600/infinite-loop.ll sed -i -e 's/SI-NOT: bfe/SI-NOT: {{[^@]}}bfe/g'../../../test/CodeGen/R600/llvm.AMDGPU.bfe.32.ll ../../../test/CodeGen/R600/sext-in-reg.ll sed -i -e 's/exp_IEEE/EXP_IEEE/g' ../../../test/CodeGen/R600/llvm.exp2.ll sed -i -e 's/numVgprs/NumVgprs/g' ../../../test/CodeGen/R600/register-count-comments.ll sed -i 's/$; CHECK[-NOT]*: $$[A-Z_0-9]\+$/\1\L\2/' ../../../test/CodeGen/R600/select64.ll ../../../test/CodeGen/R600/sgpr-copy.ll //==================================================================// // Shell script for converting .td files (run this last) //==================================================================// export LC_ALL='C' sed -i -e '/Patterns/!s/$"[A-Z0-9_]\+[ "e]$/\L\1/g' SIInstructions.td sed -i -e 's/"EXP/"exp/g' SIInstrInfo.td git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221350 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 14:50:53 +00:00
Tom Stellard	1f06060994	R600/SI: Add an extra check line to make test more strict git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221349 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 14:50:34 +00:00
Andrea Di Biagio	042bee88f3	[X86] Teach method 'isVectorClearMaskLegal' how to check for legal blend masks. This patch improves the folding of vector AND nodes into blend operations for targets that feature SSE4.1. A vector AND node where one of the operands is a constant build_vector with elements that are either zero or all-ones can be converted into a blend. This allows for example to simplify the following code: define <4 x i32> @test(<4 x i32> %A, <4 x i32> %B) { %1 = and <4 x i32> %A, <i32 0, i32 0, i32 0, i32 -1> %2 = and <4 x i32> %B, <i32 -1, i32 -1, i32 -1, i32 0> %3 = or <4 x i32> %1, %2 ret <4 x i32> %3 } Before this patch llc (-mcpu=corei7) generated: andps LCPI1_0(%rip), %xmm0, %xmm0 andps LCPI1_1(%rip), %xmm1, %xmm1 orps %xmm1, %xmm0, %xmm0 retq With this patch we generate a single 'vpblendw'. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221343 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 13:04:14 +00:00
Craig Topper	04a45948a0	Improve logic that decides if its profitable to commute when some of the virtual registers involved have uses/defs chains connecting them to physical register. Fix up the tests that this change improves. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221336 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 06:43:02 +00:00
Tim Northover	cafa378fe0	ARM: try to add extra CS-register whenever stack alignment >= 8. We currently try to push an even number of registers to preserve 8-byte alignment during a function's prologue, but only when the stack alignment is prcisely 8. Many of the reasons for doing it apply also when that alignment > 8 (the extra store is often free, and can save another stack adjustment, though less frequently for 16-byte stack alignment). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221321 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 00:27:20 +00:00
Tim Northover	1f771b80c0	ARM/Dwarf: correctly align stack before callee-saved VPRs We were making an attempt to do this by adding an extra callee-saved GPR (so that there was an even number in the list), but when that failed we went ahead and pushed anyway. This had a couple of potential issues: + The .cfi directives we emit misplaced dN because they were based on PrologEpilogInserter's calculation. + Unaligned stores can be less efficient. + Unaligned stores can actually fault (likely only an issue in niche cases, but possible). This adds a final explicit stack adjustment if all other options fail, so that the actual locations of the registers match up with where they should be. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221320 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 00:27:13 +00:00

1 2 3 4 5 ...

12064 Commits