llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-12-26 21:32:10 +00:00

Author	SHA1	Message	Date
Chris Lattner	7e9b427c87	if an alloca is only ever accessed as a unit, and is accessed with load/store instructions, then don't try to decimate it into its individual pieces. This will just make a mess of the IR and is pointless if none of the elements are individually accessed. This was generating really terrible code for std::bitset (PR8980) because it happens to be lowered by clang as an {[8 x i8]} structure instead of {i64}. The testcase now is optimized to: define i64 @test2(i64 %X) { br label %L2 L2: ; preds = %0 ret i64 %X } before we generated: define i64 @test2(i64 %X) { %sroa.store.elt = lshr i64 %X, 56 %1 = trunc i64 %sroa.store.elt to i8 %sroa.store.elt8 = lshr i64 %X, 48 %2 = trunc i64 %sroa.store.elt8 to i8 %sroa.store.elt9 = lshr i64 %X, 40 %3 = trunc i64 %sroa.store.elt9 to i8 %sroa.store.elt10 = lshr i64 %X, 32 %4 = trunc i64 %sroa.store.elt10 to i8 %sroa.store.elt11 = lshr i64 %X, 24 %5 = trunc i64 %sroa.store.elt11 to i8 %sroa.store.elt12 = lshr i64 %X, 16 %6 = trunc i64 %sroa.store.elt12 to i8 %sroa.store.elt13 = lshr i64 %X, 8 %7 = trunc i64 %sroa.store.elt13 to i8 %8 = trunc i64 %X to i8 br label %L2 L2: ; preds = %0 %9 = zext i8 %1 to i64 %10 = shl i64 %9, 56 %11 = zext i8 %2 to i64 %12 = shl i64 %11, 48 %13 = or i64 %12, %10 %14 = zext i8 %3 to i64 %15 = shl i64 %14, 40 %16 = or i64 %15, %13 %17 = zext i8 %4 to i64 %18 = shl i64 %17, 32 %19 = or i64 %18, %16 %20 = zext i8 %5 to i64 %21 = shl i64 %20, 24 %22 = or i64 %21, %19 %23 = zext i8 %6 to i64 %24 = shl i64 %23, 16 %25 = or i64 %24, %22 %26 = zext i8 %7 to i64 %27 = shl i64 %26, 8 %28 = or i64 %27, %25 %29 = zext i8 %8 to i64 %30 = or i64 %29, %28 ret i64 %30 } In this case, instcombine was able to eliminate the nonsense, but in PR8980 enough PHIs are in play that instcombine backs off. It's better to not generate this stuff in the first place. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123571 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-16 06:18:28 +00:00
Chris Lattner	192228edb1	enhance FoldOpIntoPhi in instcombine to try harder when a phi has multiple uses. In some cases, all the uses are the same operation, so instcombine can go ahead and promote the phi. In the testcase this pushes an add out of the loop. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123568 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-16 05:28:59 +00:00
Owen Anderson	66f708f7e5	Improve the safety of my globalopt enhancement by ensuring that the bitcast of the stored value to the new store type is always. Also, add a testcase. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123563 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-16 04:33:33 +00:00
Chris Lattner	156eb0a569	fix PR8983, a broken assertion. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123562 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-16 03:43:53 +00:00
Chris Lattner	0092b1142f	remove the partial specialization pass. It is unmaintained and has bugs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123554 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-16 00:27:10 +00:00
Nick Lewycky	2820c25e84	Make constmerge a two-pass algorithm so that it won't miss merging opporuntities. Fixes PR8978. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123541 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-15 18:14:21 +00:00
Chris Lattner	6ccb5ef1b5	temporarily revert r123526. While working on a follow-on patch I realize that ConstantFoldTerminator doesn't preserve dominfo. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123527 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-15 07:51:19 +00:00
Chris Lattner	eeba3f5695	fix rdar://8785296 - -fcatch-undefined-behavior generates inefficient code The basic issue is that isel (very reasonably!) expects conditional branches to be folded, so CGP leaving around a bunch dead computation feeding conditional branches isn't such a good idea. Just fold branches on constants into unconditional branches. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123526 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-15 07:36:13 +00:00
Chris Lattner	94e8e0cfbe	Now that instruction optzns can update the iterator as they go, we can have objectsize folding recursively simplify away their result when it folds. It is important to catch this here, because otherwise we won't eliminate the cross-block values at isel and other times. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123524 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-15 07:25:29 +00:00
Chris Lattner	62fe406dc2	implement an instcombine xform that canonicalizes casts outside of and-with-constant operations. This fixes rdar://8808586 which observed that we used to compile: union xy { struct x { _Bool b[15]; } x; __attribute__((packed)) struct y { __attribute__((packed)) unsigned long b0to7; __attribute__((packed)) unsigned int b8to11; __attribute__((packed)) unsigned short b12to13; __attribute__((packed)) unsigned char b14; } y; }; struct x foo(union xy *xy) { return xy->x; } into: _foo: ## @foo movq (%rdi), %rax movabsq $1095216660480, %rcx ## imm = 0xFF00000000 andq %rax, %rcx movabsq $-72057594037927936, %rdx ## imm = 0xFF00000000000000 andq %rax, %rdx movzbl %al, %esi orq %rdx, %rsi movq %rax, %rdx andq $65280, %rdx ## imm = 0xFF00 orq %rsi, %rdx movq %rax, %rsi andq $16711680, %rsi ## imm = 0xFF0000 orq %rdx, %rsi movl %eax, %edx andl $-16777216, %edx ## imm = 0xFFFFFFFFFF000000 orq %rsi, %rdx orq %rcx, %rdx movabsq $280375465082880, %rcx ## imm = 0xFF0000000000 movq %rax, %rsi andq %rcx, %rsi orq %rdx, %rsi movabsq $71776119061217280, %r8 ## imm = 0xFF000000000000 andq %r8, %rax orq %rsi, %rax movzwl 12(%rdi), %edx movzbl 14(%rdi), %esi shlq $16, %rsi orl %edx, %esi movq %rsi, %r9 shlq $32, %r9 movl 8(%rdi), %edx orq %r9, %rdx andq %rdx, %rcx movzbl %sil, %esi shlq $32, %rsi orq %rcx, %rsi movl %edx, %ecx andl $-16777216, %ecx ## imm = 0xFFFFFFFFFF000000 orq %rsi, %rcx movq %rdx, %rsi andq $16711680, %rsi ## imm = 0xFF0000 orq %rcx, %rsi movq %rdx, %rcx andq $65280, %rcx ## imm = 0xFF00 orq %rsi, %rcx movzbl %dl, %esi orq %rcx, %rsi andq %r8, %rdx orq %rsi, %rdx ret We now compile this into: _foo: ## @foo ## BB#0: ## %entry movzwl 12(%rdi), %eax movzbl 14(%rdi), %ecx shlq $16, %rcx orl %eax, %ecx shlq $32, %rcx movl 8(%rdi), %edx orq %rcx, %rdx movq (%rdi), %rax ret A small improvement :-) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123520 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-15 06:32:33 +00:00
Duncan Sands	c087e20331	Turn X-(X-Y) into Y. According to my auto-simplifier this is the most common simplification present in fully optimized code (I think instcombine fails to transform some of these when "X-Y" has more than one use). Fires here and there all over the test-suite, for example it eliminates 8 subtractions in the final IR for 445.gobmk, 2 subs in 447.dealII, 2 in paq8p etc. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123442 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-14 15:26:10 +00:00
Duncan Sands	cf80bc1d4a	Factorize common code out of the InstructionSimplify shift logic. Add in threading of shifts over selects and phis while there. This fires here and there in the testsuite, to not much effect. For example when compiling spirit it fires 5 times, during early-cse, resulting in 6 more cse simplifications, and 3 more terminators being folded by jump threading, but the final bitcode doesn't change in any interesting way: other optimizations would have caught the opportunity anyway, only later. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123441 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-14 14:44:12 +00:00
Duncan Sands	62becca681	Rename this test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123440 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-14 14:16:33 +00:00
Chris Lattner	bdb6b7f9c7	relax testcase a bit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123433 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-14 07:46:33 +00:00
Duncan Sands	c43cee3fbb	Move some shift transforms out of instcombine and into InstructionSimplify. While there, I noticed that the transform "undef >>a X -> undef" was wrong. For example if X is 2 then the top two bits must be equal, so the result can not be anything. I fixed this in the constant folder as well. Also, I made the transform for "X << undef" stronger: it now folds to undef always, even though X might be zero. This is in accordance with the LangRef, but I must admit that it is fairly aggressive. Also, I added "i32 X << 32 -> undef" following the LangRef and the constant folder, likewise fairly aggressive. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123417 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-14 00:37:45 +00:00
Bob Wilson	704d1347c5	Extend SROA to handle arrays accessed as homogeneous structs and vice versa. This is a minor extension of SROA to handle a special case that is important for some ARM NEON operations. Some of the NEON intrinsics return multiple values, which are handled as struct types containing multiple elements of the same vector type. The corresponding return types declared in the arm_neon.h header have equivalent arrays. We need SROA to recognize that it can split up those arrays and structs into separate vectors, even though they are not always accessed with the same type. SROA already handles loads and stores of an entire alloca by using insertvalue/extractvalue to access the individual pieces, and that code works the same regardless of whether the type is a struct or an array. So, all that needs to be done is to check for compatible arrays and homogeneous structs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123381 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-13 17:45:11 +00:00
Bob Wilson	694a10e7d8	Make SROA more aggressive with allocas containing padding. SROA only split up structs and arrays one level at a time, so padding can only cause trouble if it is located in between the struct or array elements. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123380 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-13 17:45:08 +00:00
Duncan Sands	6dc91253ab	The most common simplification missed by instsimplify in unoptimized bitcode is "X != 0 -> X" when X is a boolean. This occurs a lot because of the way llvm-gcc converts gcc's conditional expressions. Add this, and a few other similar transforms for completeness. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123372 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-13 08:56:29 +00:00
Chris Lattner	d318fc2ceb	revert 123144, reenabling the rest of memset formation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123302 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-12 03:25:15 +00:00
Chris Lattner	d2e905027b	revert r123146 which disabled code that wasn't the root cause of the bootstrap miscompare issue. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123299 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-12 01:52:23 +00:00
Chris Lattner	86099ba2b5	merge tests into one crash.ll test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123220 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-11 07:50:07 +00:00
Chris Lattner	93767fdb61	remove a bogus assertion: the latch block of a loop is not neccesarily an uncond branch to the header. This fixes PR8955 (the assertion tripping). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123219 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-11 07:47:59 +00:00
Chandler Carruth	15ed90c859	Teach constant folding to perform conversions from constant floating point values to their integer representation through the SSE intrinsic calls. This is the last part of a README.txt entry for which I have real world examples. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123206 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-11 01:07:24 +00:00
Chandler Carruth	f7b0047f5f	FileCheck-ize a test, and move a no-longer calling test case to another file and make it actually test something... git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123205 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-11 01:07:20 +00:00
Owen Anderson	da1c122da5	Fix a random missed optimization by making InstCombine more aggressive when determining which bits are demanded by a comparison against a constant. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123203 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-11 00:36:45 +00:00
Chandler Carruth	9cc9f50abc	Teach instcombine about the rest of the SSE and SSE2 conversion intrinsics element dependencies. Reviewed by Nick. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123161 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-10 07:19:37 +00:00
Chandler Carruth	fdc8f2d260	Fold two related tests into the newly FileCheck-ized test, migrating them to FileCheck as well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123154 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-10 02:53:58 +00:00
Chandler Carruth	548e581dcb	Clean up and FileCheck-ize a test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123153 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-10 02:53:54 +00:00
Chris Lattner	f86c75da4d	fix typo git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123148 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-10 02:33:34 +00:00
Chris Lattner	a806be66c1	another (more) aggressive attempt to bring llvm-gcc-i386-linux-selfhost back to life. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123146 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-10 00:47:34 +00:00
Chris Lattner	d8408270f3	temporarily disable memset formation from memsets in an effort to restore buildbot stability. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123144 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-09 23:52:48 +00:00
Tobias Grosser	aa2be84356	Instcombine: Fix pattern where the sext did not dominate the icmp using it git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123121 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-09 16:00:11 +00:00
Chris Lattner	d90a192279	Merge memsets followed by neighboring memsets and other stores into larger memsets. Among other things, this fixes rdar://8760394 and allows us to handle "Example 2" from http://blog.regehr.org/archives/320, compiling it into a single 4096-byte memset: _mad_synth_mute: ## @mad_synth_mute ## BB#0: ## %entry pushq %rax movl $4096, %esi ## imm = 0x1000 callq ___bzero popq %rax ret git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123089 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-08 21:19:19 +00:00
Chris Lattner	9fa11e94b5	fix an issue in IsPointerOffset that prevented us from recognizing that P and P+1 are relative to the same base pointer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123087 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-08 21:07:56 +00:00
Chris Lattner	06511264f8	enhance memcpyopt to merge a store and a subsequent memset into a single larger memset. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123086 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-08 20:54:51 +00:00
Chris Lattner	355f5778aa	merge two tests and filecheckify git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123082 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-08 20:27:22 +00:00
Chris Lattner	5d37370a6f	When loop rotation happens, it is very common for the duplicated condbr to be foldable into an uncond branch. When this happens, we can make a much simpler CFG for the loop, which is important for nested loop cases where we want the outer loop to be aggressively optimized. Handle this case more aggressively. For example, previously on phi-duplicate.ll we would get this: define void @test(i32 %N, double* %G) nounwind ssp { entry: %cmp1 = icmp slt i64 1, 1000 br i1 %cmp1, label %bb.nph, label %for.end bb.nph: ; preds = %entry br label %for.body for.body: ; preds = %bb.nph, %for.cond %j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ] %arrayidx = getelementptr inbounds double* %G, i64 %j.02 %tmp3 = load double* %arrayidx %sub = sub i64 %j.02, 1 %arrayidx6 = getelementptr inbounds double* %G, i64 %sub %tmp7 = load double* %arrayidx6 %add = fadd double %tmp3, %tmp7 %arrayidx10 = getelementptr inbounds double* %G, i64 %j.02 store double %add, double* %arrayidx10 %inc = add nsw i64 %j.02, 1 br label %for.cond for.cond: ; preds = %for.body %cmp = icmp slt i64 %inc, 1000 br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge for.cond.for.end_crit_edge: ; preds = %for.cond br label %for.end for.end: ; preds = %for.cond.for.end_crit_edge, %entry ret void } Now we get the much nicer: define void @test(i32 %N, double* %G) nounwind ssp { entry: br label %for.body for.body: ; preds = %entry, %for.body %j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ] %arrayidx = getelementptr inbounds double* %G, i64 %j.01 %tmp3 = load double* %arrayidx %sub = sub i64 %j.01, 1 %arrayidx6 = getelementptr inbounds double* %G, i64 %sub %tmp7 = load double* %arrayidx6 %add = fadd double %tmp3, %tmp7 %arrayidx10 = getelementptr inbounds double* %G, i64 %j.01 store double %add, double* %arrayidx10 %inc = add nsw i64 %j.01, 1 %cmp = icmp slt i64 %inc, 1000 br i1 %cmp, label %for.body, label %for.end for.end: ; preds = %for.body ret void } With all of these recent changes, we are now able to compile: void foo(char X) { for (int i = 0; i != 100; ++i) for (int j = 0; j != 100; ++j) X[j+i100] = 0; } into a single memset of 10000 bytes. This series of changes should also be helpful for other nested loop scenarios as well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123079 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-08 19:59:06 +00:00
Chris Lattner	0e4a1543ab	Three major changes: 1. Rip out LoopRotate's domfrontier updating code. It isn't needed now that LICM doesn't use DF and it is super complex and gross. 2. Make DomTree updating code a lot simpler and faster. The old loop over all the blocks was just to find a block?? 3. Change the code that inserts the new preheader to just use SplitCriticalEdge instead of doing an overcomplex reimplementation of it. No behavior change, except for the name of the inserted preheader. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123072 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-08 18:52:51 +00:00
Frits van Bommel	b686eb9186	Fix a bug in r123034 (trying to sext/zext non-integers) and clean up a little. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123061 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-08 10:51:36 +00:00
Chris Lattner	d9ec3572f3	Have loop-rotate simplify instructions (yay instsimplify!) as it clones them into the loop preheader, eliminating silly instructions like "icmp i32 0, 100" in fixed tripcount loops. This also better exposes the bigger problem with loop rotate that I'd like to fix: once this has been folded, the duplicated conditional branch often turns into an uncond branch. Not aggressively handling this is pessimizing later loop optimizations somethin' fierce by making "dominates all exit blocks" checks fail. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123060 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-08 08:24:46 +00:00
Tobias Grosser	46431d7a93	InstCombine: Match min/max hidden by sext/zext X = sext x; x >s c ? X : C+1 --> X = sext x; X <s C+1 ? C+1 : X X = sext x; x <s c ? X : C-1 --> X = sext x; X >s C-1 ? C-1 : X X = zext x; x >u c ? X : C+1 --> X = zext x; X <u C+1 ? C+1 : X X = zext x; x <u c ? X : C-1 --> X = zext x; X >u C-1 ? C-1 : X X = sext x; x >u c ? X : C+1 --> X = sext x; X <u C+1 ? C+1 : X X = sext x; x <u c ? X : C-1 --> X = sext x; X >u C-1 ? C-1 : X Instead of calculating this with mixed types promote all to the larger type. This enables scalar evolution to analyze this expression. PR8866 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123034 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-07 21:33:14 +00:00
Benjamin Kramer	eaff66a895	Revert 122959, it needs more thought. Add it back to README.txt with additional notes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123030 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-07 20:42:20 +00:00
Benjamin Kramer	8143a84c46	InstCombine: Turn _chk functions into the "unsafe" variant if length and max langth are equal. This happens when we take the (non-constant) length from a malloc. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122961 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-06 14:22:52 +00:00
Benjamin Kramer	240d42d185	InstCombine: If we call llvm.objectsize on a malloc call we can replace it with the size passed to malloc. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122959 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-06 13:11:05 +00:00
Benjamin Kramer	783a5c2b69	InstCombine: Teach llvm.objectsize folding to look through GEPs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122958 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-06 13:07:49 +00:00
Chris Lattner	8cd4efb6a5	implement constant folding support for an exotic constant expr: ret i64 ptrtoint (i8* getelementptr ([1000 x i8]* @X, i64 1, i64 sub (i64 0, i64 ptrtoint ([1000 x i8]* @X to i64))) to i64) to "ret i64 1000". This allows us to correctly compute the trip count on a loop in PR8883, which occurs with std::fill on a char array. This allows us to transform it into a memset with a constant size. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122950 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-06 06:19:46 +00:00
Chris Lattner	43b40a4620	fix an off-by-one bug that caused a crash analyzing ashr's with huge shift amounts, PR8896 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122814 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-04 18:19:15 +00:00
Chris Lattner	e41d3c015c	Teach loop-idiom to turn a loop containing a memset into a larger memset when safe. The testcase is basically this nested loop: void foo(char X) { for (int i = 0; i != 100; ++i) for (int j = 0; j != 100; ++j) X[j+i100] = 0; } which gets turned into a single memset now. clang -O3 doesn't optimize this yet though due to a phase ordering issue I haven't analyzed yet. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122806 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-04 07:46:33 +00:00
Chris Lattner	e508dd4c75	Duncan deftly points out that readnone functions aren't invalidated by stores, so they can be handled as 'simple' operations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122785 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-03 23:38:13 +00:00
Chris Lattner	75637154c3	earlycse can do trivial with-a-block dead store elimination as well. This deletes 60 stores in 176.gcc that largely come from bitfield code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122736 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-03 04:17:24 +00:00
Chris Lattner	ef87fc2e0a	now that loads are in their own table, we can implement store->load forwarding. This allows EarlyCSE to zap 600 more loads from 176.gcc. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122732 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-03 03:46:34 +00:00
Chris Lattner	03d49e955e	add a testcase for readonly call CSE git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122730 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-03 03:33:47 +00:00
Chris Lattner	8e7f0d70c7	Teach EarlyCSE to do trivial CSE of loads and read-only calls. On 176.gcc, this catches 13090 loads and calls, and increases the number of simple instructions CSE'd from 29658 to 36208. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122727 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-03 03:18:43 +00:00
Chris Lattner	91139ccd99	add DEBUG and -stats output to earlycse. Teach it to CSE the rest of the non-side-effecting instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122716 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-02 23:19:45 +00:00
Chris Lattner	cc9eab26b3	Enhance earlycse to do CSE of casts, instsimplify and die. Add a testcase. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122715 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-02 23:04:14 +00:00
Chris Lattner	63f9c3c49a	fix a miscompilation of tramp3d-v4: when forming a memcpy, we have to make sure that the loop we're promoting into a memcpy doesn't mutate the input of the memcpy. Before we were just checking that the dest of the memcpy wasn't mod/ref'd by the loop. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122712 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-02 21:14:18 +00:00
Chris Lattner	8e08e73f0e	If a loop iterates exactly once (has backedge count = 0) then don't mess with it. We'd rather peel/unroll it than convert all of its stores into memsets. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122711 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-02 20:24:21 +00:00
Chris Lattner	62c50fdf69	enhance loop idiom recognition to scan all unconditionally executed blocks in a loop, instead of just the header block. This makes it more aggressive, able to handle Duncan's Ada examples. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122704 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-02 19:01:03 +00:00
Duncan Sands	67fb341f8b	Fix PR8702 by not having LoopSimplify claim to preserve LCSSA form. As described in the PR, the pass could break LCSSA form when inserting preheaders. It probably would be easy enough to fix this, but since currently we always go into LCSSA form after running this pass, doing so is not urgent. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122695 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-02 13:38:21 +00:00
Chris Lattner	cf078f2b20	Allow loop-idiom to run on multiple BB loops, but still only scan the loop header for now for memset/memcpy opportunities. It turns out that loop-rotate is successfully rotating loops, but DOESN'T MERGE THE BLOCKS, turning "for loops" into 2 basic block loops that loop-idiom was ignoring. With this fix, we form many many more memcpy and memsets than before, including on the "history" loops in the viterbi benchmark, which look like this: for (j=0; j<MAX_history; ++j) { history_new[i][j+1] = history[2*i][j]; } Transforming these loops into memcpy's speeds up the viterbi benchmark from 11.98s to 3.55s on my machine. Woo. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122685 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-02 07:58:36 +00:00
Chris Lattner	e2c4392091	teach loop idiom recognition to form memcpy's from simple loops. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122678 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-02 03:37:56 +00:00
Chris Lattner	d91ed10b70	fix a globalopt crash on two Adobe-C++ testcases that the recent loop idiom pass exposed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122674 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-01 22:31:46 +00:00
Chris Lattner	bafa117e8f	add a validity check that was missed, fixing a crash on the new testcase. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122662 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-01 20:12:04 +00:00
Duncan Sands	124708d9b4	Revert commit 122654 at the request of Chris, who reckons that instsimplify is the wrong hammer for this nail, and is probably right. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122661 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-01 20:08:02 +00:00
Chris Lattner	a64cbf067d	improve validity check to handle constant-trip-count loops more aggressively. In practice, this doesn't help anything though, see the todo. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122660 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-01 19:54:22 +00:00
Chris Lattner	30980b6815	implement the "no aliasing accesses in loop" safety check. This pass should be correct now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122659 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-01 19:39:01 +00:00
Duncan Sands	7cf85e74e3	Fix a README item by having InstructionSimplify do a mild form of value numbering, in which it considers (for example) "%a = add i32 %x, %y" and "%b = add i32 %x, %y" to be equal because the operands are equal and the result of the instructions only depends on the values of the operands. This has almost no effect (it removes 4 instructions from gcc-as-one-file), and perhaps slows down compilation: I measured a 0.4% slowdown on the large gcc-as-one-file testcase, but it wasn't statistically significant. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122654 91177308-0d34-0410-b5e6-96231b3b80d8	2011-01-01 16:12:09 +00:00
NAKAMURA Takumi	5c083b174e	test/Transforms/ConstProp/logicaltest.ll: FileCheck-ize. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122620 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-29 03:58:56 +00:00
Chris Lattner	a92ff91a96	implement enough of the memset inference algorithm to recognize and insert memsets. This is still missing one important validity check, but this is enough to compile stuff like this: void test0(std::vector<char> &X) { for (std::vector<char>::iterator I = X.begin(), E = X.end(); I != E; ++I) *I = 0; } void test1(std::vector<int> &X) { for (long i = 0, e = X.size(); i != e; ++i) X[i] = 0x01010101; } With: $ clang t.cpp -S -o - -O2 -emit-llvm \| opt -loop-idiom \| opt -O3 \| llc to: __Z5test0RSt6vectorIcSaIcEE: ## @_Z5test0RSt6vectorIcSaIcEE ## BB#0: ## %entry subq $8, %rsp movq (%rdi), %rax movq 8(%rdi), %rsi cmpq %rsi, %rax je LBB0_2 ## BB#1: ## %bb.nph subq %rax, %rsi movq %rax, %rdi callq ___bzero LBB0_2: ## %for.end addq $8, %rsp ret ... __Z5test1RSt6vectorIiSaIiEE: ## @_Z5test1RSt6vectorIiSaIiEE ## BB#0: ## %entry subq $8, %rsp movq (%rdi), %rax movq 8(%rdi), %rdx subq %rax, %rdx cmpq $4, %rdx jb LBB1_2 ## BB#1: ## %for.body.preheader andq $-4, %rdx movl $1, %esi movq %rax, %rdi callq _memset LBB1_2: ## %for.end addq $8, %rsp ret git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122573 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-26 23:42:51 +00:00
Chris Lattner	61db1f56d0	start using irbuilder to make mem intrinsics in a few passes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122572 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-26 22:57:41 +00:00
Benjamin Kramer	a112087e42	MemCpyOpt: Turn memcpys from a constant into a memset if possible. This allows us to compile "int cst[] = {-1, -1, -1};" into movl $-1, 16(%rsp) movq $-1, 8(%rsp) instead of movl _cst+8(%rip), %eax movl %eax, 16(%rsp) movq _cst(%rip), %rax movq %rax, 8(%rsp) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122548 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-24 21:17:12 +00:00
Owen Anderson	ec3953ff95	When determining if we can fold (x >> C1) << C2, the bits that we need to verify are zero are not the low bits of x, but the bits that WILL be the low bits after the operation completes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122529 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-23 23:56:24 +00:00
Benjamin Kramer	4ac19470dc	InstCombine: creating selects from -1 and 0 is fine, they combine into a sext from i1. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122453 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-22 23:12:15 +00:00
Duncan Sands	1cd05bb605	When determining whether the new instruction was already present in the original instruction, half the cases were missed (making it not wrong but suboptimal). Also correct a typo (A <-> B) in the second chunk. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122414 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-22 17:15:25 +00:00
Duncan Sands	b3898af89f	Make this test not depend on how the variable is named. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122413 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-22 17:08:04 +00:00
Duncan Sands	37bf92b523	Add a generic expansion transform: A op (B op' C) -> (A op B) op' (A op C) if both A op B and A op C simplify. This fires fairly often but doesn't make that much difference. On gcc-as-one-file it removes two "and"s and turns one branch into a select. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122399 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-22 13:36:08 +00:00
Owen Anderson	f056838956	Give GVN back the ability to perform simple conditional propagation on conditional branch values. I still think that LVI should be handling this, but that capability is some ways off in the future, and this matters for some significant benchmarks. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122378 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-21 23:54:34 +00:00
Duncan Sands	025c98bdbd	Add an additional InstructionSimplify factorization test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122333 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-21 15:12:22 +00:00
Duncan Sands	07f30fbd73	While I don't think any later transforms can fire, it seems cleaner to not assume this (for example in case more transforms get added below it). Suggested by Frits van Bommel. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122332 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-21 15:03:43 +00:00
Duncan Sands	9bd2c2e63c	Fix typo in comment, spotted by Deewiant. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122329 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-21 13:39:20 +00:00
Duncan Sands	3421d90853	Teach InstructionSimplify about distributive laws. These transforms fire quite often, but don't make much difference in practice presumably because instcombine also knows them and more. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122328 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-21 13:32:22 +00:00
Duncan Sands	566edb04b8	Add generic simplification of associative operations, generalizing a couple of existing transforms. This fires surprisingly often, for example when compiling gcc "(X+(-1))+1->X" fires quite a lot as well as various "and" simplifications (usually with a phi node operand). Most of the time this doesn't make a real difference since the same thing would have been done elsewhere anyway, eg: by instcombine, but there are a few places where this results in simplifications that we were not doing before. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122326 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-21 08:49:00 +00:00
Benjamin Kramer	5337f20c15	Teach InstCombine to merge (icmp ult (X + CA), C1) \| (icmp eq X, C2) into (icmp ult (X + CA), C1 + 1) if C2 + CA == C1. InstCombine creates these so now we compile x == 23 \|\| x == 24 \|\| x == 25 to %x.off = add i32 %x, -23 %1 = icmp ult i32 %x.off, 3 instead of %x.off = add i32 %x, -23 %1 = icmp ult i32 %x.off, 2 %cmp3 = icmp eq i32 %x, 25 %ret2 = or i1 %1, %cmp3 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122248 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-20 16:18:51 +00:00
Duncan Sands	ee9a2e322a	Have SimplifyBinOp dispatch Xor, Add and Sub to the corresponding methods (they had just been forgotten before). Adding Xor causes "main" in the existing testcase 2010-11-01-lshr-mask.ll to be hugely more simplified. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122245 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-20 14:47:04 +00:00
Chris Lattner	2b9375e44b	fix PR8807 by making transformConstExprCastCall aware of byval arguments. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122238 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-20 08:36:38 +00:00
Chris Lattner	0b66f63a26	when eliding a byval copy due to inlining a readonly function, we have to make sure that the reused alloca has sufficient alignment. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122236 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-20 08:10:40 +00:00
Chris Lattner	e7ae705c32	pull byval processing out to its own helper function. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122235 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-20 07:57:41 +00:00
Chris Lattner	018fb767b9	fix PR8769, a miscompilation by inliner when inlining a function with a byval argument. The generated alloca has to have at least the alignment of the byval, if not, the client may be making assumptions that the new alloca won't satisfy. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122234 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-20 07:45:28 +00:00
Chris Lattner	572335915f	merge two tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122233 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-20 07:39:57 +00:00
Chris Lattner	b0af8ce1d9	filecheckize git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122232 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-20 07:38:24 +00:00
Mon P Wang	b085181258	Test case for r122215 when InstCombine optimizes memset git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122216 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-20 01:06:23 +00:00
Chris Lattner	a34b3cf953	X86 supports i8/i16 overflow ops (except i8 multiplies), we should generate them. Now we compile: define zeroext i8 @X(i8 signext %a, i8 signext %b) nounwind ssp { entry: %0 = tail call %0 @llvm.sadd.with.overflow.i8(i8 %a, i8 %b) %cmp = extractvalue %0 %0, 1 br i1 %cmp, label %if.then, label %if.end into: _X: ## @X ## BB#0: ## %entry subl $12, %esp movb 16(%esp), %al addb 20(%esp), %al jo LBB0_2 Before we were generating: _X: ## @X ## BB#0: ## %entry pushl %ebp movl %esp, %ebp subl $8, %esp movb 12(%ebp), %al testb %al, %al setge %cl movb 8(%ebp), %dl testb %dl, %dl setge %ah cmpb %cl, %ah sete %cl addb %al, %dl testb %dl, %dl setge %al cmpb %al, %ah setne %al andb %cl, %al testb %al, %al jne LBB0_2 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122186 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-19 20:03:11 +00:00
Chris Lattner	e5cbdca3bd	recognize an unsigned add with overflow idiom into uadd. This resolves a README entry and technically resolves PR4916, but we still get poor code for the testcase in that PR because GVN isn't CSE'ing uadd with add, filed as PR8817. Previously we got: _test7: ## @test7 addq %rsi, %rdi cmpq %rdi, %rsi movl $42, %eax cmovaq %rsi, %rax ret Now we get: _test7: ## @test7 addq %rsi, %rdi movl $42, %eax cmovbq %rsi, %rax ret git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122182 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-19 19:37:52 +00:00
Chris Lattner	26b482d7a7	optimize uadd(x, cst) into a comparison when the normal result is dead. This is required for my next patch to not regress the testsuite. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122181 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-19 19:35:32 +00:00
Chris Lattner	0a62474830	generalize the sadd creation code to not require that the sadd formed is half the size of the original type. We can now compile this into a sadd.i8: unsigned char X(char a, char b) { int res = a+b; if ((unsigned )(res+128) > 255U) abort(); return res; } git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122178 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-19 18:35:09 +00:00
Chris Lattner	dd7e837374	fix another miscompile in the llvm.sadd formation logic: it wasn't checking to see if the high bits of the original add result were dead. Inserting a smaller add and zexting back to that size is not good enough. This is likely to be the fix for 8816. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122177 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-19 18:22:06 +00:00
Chris Lattner	368397bb7d	fix a bug (possibly 8816) in the sadd forming xform: it isn't profitable (or safe) to promote code when the add-with-constant has other uses. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122175 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-19 17:59:02 +00:00
Chris Lattner	1dec0d2704	Enhance LICM to promote alias sets whose pointers themselves are stored, which doesn't affect the memory address being promoted. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122172 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-19 05:57:25 +00:00
Chris Lattner	1c0af0ed25	fix PR8602, a bug in an assertion: a volatile store of a pointer does not make the alias set for that pointer volatile, just stores to the pointer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122171 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-19 05:51:54 +00:00
Chris Lattner	a0d172f7fe	revert r122164, I'm going to go with a different approach. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122168 91177308-0d34-0410-b5e6-96231b3b80d8	2010-12-19 04:23:03 +00:00

1 2 3 4 5 ...

3349 Commits