Commit Graph

3399 Commits

Author SHA1 Message Date
Chris Lattner
7e9b427c87 if an alloca is only ever accessed as a unit, and is accessed with load/store instructions,
then don't try to decimate it into its individual pieces.  This will just make a mess of the
IR and is pointless if none of the elements are individually accessed.  This was generating
really terrible code for std::bitset (PR8980) because it happens to be lowered by clang
as an {[8 x i8]} structure instead of {i64}.

The testcase now is optimized to:

define i64 @test2(i64 %X) {
  br label %L2

L2:                                               ; preds = %0
  ret i64 %X
}

before we generated:

define i64 @test2(i64 %X) {
  %sroa.store.elt = lshr i64 %X, 56
  %1 = trunc i64 %sroa.store.elt to i8
  %sroa.store.elt8 = lshr i64 %X, 48
  %2 = trunc i64 %sroa.store.elt8 to i8
  %sroa.store.elt9 = lshr i64 %X, 40
  %3 = trunc i64 %sroa.store.elt9 to i8
  %sroa.store.elt10 = lshr i64 %X, 32
  %4 = trunc i64 %sroa.store.elt10 to i8
  %sroa.store.elt11 = lshr i64 %X, 24
  %5 = trunc i64 %sroa.store.elt11 to i8
  %sroa.store.elt12 = lshr i64 %X, 16
  %6 = trunc i64 %sroa.store.elt12 to i8
  %sroa.store.elt13 = lshr i64 %X, 8
  %7 = trunc i64 %sroa.store.elt13 to i8
  %8 = trunc i64 %X to i8
  br label %L2

L2:                                               ; preds = %0
  %9 = zext i8 %1 to i64
  %10 = shl i64 %9, 56
  %11 = zext i8 %2 to i64
  %12 = shl i64 %11, 48
  %13 = or i64 %12, %10
  %14 = zext i8 %3 to i64
  %15 = shl i64 %14, 40
  %16 = or i64 %15, %13
  %17 = zext i8 %4 to i64
  %18 = shl i64 %17, 32
  %19 = or i64 %18, %16
  %20 = zext i8 %5 to i64
  %21 = shl i64 %20, 24
  %22 = or i64 %21, %19
  %23 = zext i8 %6 to i64
  %24 = shl i64 %23, 16
  %25 = or i64 %24, %22
  %26 = zext i8 %7 to i64
  %27 = shl i64 %26, 8
  %28 = or i64 %27, %25
  %29 = zext i8 %8 to i64
  %30 = or i64 %29, %28
  ret i64 %30
}

In this case, instcombine was able to eliminate the nonsense, but in PR8980 enough
PHIs are in play that instcombine backs off.  It's better to not generate this stuff
in the first place.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123571 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-16 06:18:28 +00:00
Chris Lattner
192228edb1 enhance FoldOpIntoPhi in instcombine to try harder when a phi has
multiple uses.  In some cases, all the uses are the same operation,
so instcombine can go ahead and promote the phi.  In the testcase
this pushes an add out of the loop.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123568 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-16 05:28:59 +00:00
Owen Anderson
66f708f7e5 Improve the safety of my globalopt enhancement by ensuring that the bitcast
of the stored value to the new store type is always.  Also, add a testcase.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123563 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-16 04:33:33 +00:00
Chris Lattner
156eb0a569 fix PR8983, a broken assertion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123562 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-16 03:43:53 +00:00
Chris Lattner
0092b1142f remove the partial specialization pass. It is unmaintained and has bugs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123554 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-16 00:27:10 +00:00
Nick Lewycky
2820c25e84 Make constmerge a two-pass algorithm so that it won't miss merging
opporuntities. Fixes PR8978.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123541 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-15 18:14:21 +00:00
Chris Lattner
6ccb5ef1b5 temporarily revert r123526. While working on a follow-on patch I
realize that ConstantFoldTerminator doesn't preserve dominfo.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123527 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-15 07:51:19 +00:00
Chris Lattner
eeba3f5695 fix rdar://8785296 - -fcatch-undefined-behavior generates inefficient code
The basic issue is that isel (very reasonably!) expects conditional branches
to be folded, so CGP leaving around a bunch dead computation feeding
conditional branches isn't such a good idea.  Just fold branches on constants
into unconditional branches.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123526 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-15 07:36:13 +00:00
Chris Lattner
94e8e0cfbe Now that instruction optzns can update the iterator as they go, we can
have objectsize folding recursively simplify away their result when it
folds.  It is important to catch this here, because otherwise we won't
eliminate the cross-block values at isel and other times.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123524 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-15 07:25:29 +00:00
Chris Lattner
62fe406dc2 implement an instcombine xform that canonicalizes casts outside of and-with-constant operations.
This fixes rdar://8808586 which observed that we used to compile:


union xy {
        struct x { _Bool b[15]; } x;
        __attribute__((packed))
        struct y {
                __attribute__((packed)) unsigned long b0to7;
                __attribute__((packed)) unsigned int b8to11;
                __attribute__((packed)) unsigned short b12to13;
                __attribute__((packed)) unsigned char b14;
        } y;
};

struct x
foo(union xy *xy)
{
        return xy->x;
}

into:

_foo:                                   ## @foo
	movq	(%rdi), %rax
	movabsq	$1095216660480, %rcx    ## imm = 0xFF00000000
	andq	%rax, %rcx
	movabsq	$-72057594037927936, %rdx ## imm = 0xFF00000000000000
	andq	%rax, %rdx
	movzbl	%al, %esi
	orq	%rdx, %rsi
	movq	%rax, %rdx
	andq	$65280, %rdx            ## imm = 0xFF00
	orq	%rsi, %rdx
	movq	%rax, %rsi
	andq	$16711680, %rsi         ## imm = 0xFF0000
	orq	%rdx, %rsi
	movl	%eax, %edx
	andl	$-16777216, %edx        ## imm = 0xFFFFFFFFFF000000
	orq	%rsi, %rdx
	orq	%rcx, %rdx
	movabsq	$280375465082880, %rcx  ## imm = 0xFF0000000000
	movq	%rax, %rsi
	andq	%rcx, %rsi
	orq	%rdx, %rsi
	movabsq	$71776119061217280, %r8 ## imm = 0xFF000000000000
	andq	%r8, %rax
	orq	%rsi, %rax
	movzwl	12(%rdi), %edx
	movzbl	14(%rdi), %esi
	shlq	$16, %rsi
	orl	%edx, %esi
	movq	%rsi, %r9
	shlq	$32, %r9
	movl	8(%rdi), %edx
	orq	%r9, %rdx
	andq	%rdx, %rcx
	movzbl	%sil, %esi
	shlq	$32, %rsi
	orq	%rcx, %rsi
	movl	%edx, %ecx
	andl	$-16777216, %ecx        ## imm = 0xFFFFFFFFFF000000
	orq	%rsi, %rcx
	movq	%rdx, %rsi
	andq	$16711680, %rsi         ## imm = 0xFF0000
	orq	%rcx, %rsi
	movq	%rdx, %rcx
	andq	$65280, %rcx            ## imm = 0xFF00
	orq	%rsi, %rcx
	movzbl	%dl, %esi
	orq	%rcx, %rsi
	andq	%r8, %rdx
	orq	%rsi, %rdx
	ret

We now compile this into:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	movzwl	12(%rdi), %eax
	movzbl	14(%rdi), %ecx
	shlq	$16, %rcx
	orl	%eax, %ecx
	shlq	$32, %rcx
	movl	8(%rdi), %edx
	orq	%rcx, %rdx
	movq	(%rdi), %rax
	ret

A small improvement :-)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123520 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-15 06:32:33 +00:00
Duncan Sands
c087e20331 Turn X-(X-Y) into Y. According to my auto-simplifier this is the most common
simplification present in fully optimized code (I think instcombine fails to
transform some of these when "X-Y" has more than one use).  Fires here and
there all over the test-suite, for example it eliminates 8 subtractions in
the final IR for 445.gobmk, 2 subs in 447.dealII, 2 in paq8p etc.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123442 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-14 15:26:10 +00:00
Duncan Sands
cf80bc1d4a Factorize common code out of the InstructionSimplify shift logic. Add in
threading of shifts over selects and phis while there.  This fires here and
there in the testsuite, to not much effect.  For example when compiling spirit
it fires 5 times, during early-cse, resulting in 6 more cse simplifications,
and 3 more terminators being folded by jump threading, but the final bitcode
doesn't change in any interesting way: other optimizations would have caught
the opportunity anyway, only later.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123441 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-14 14:44:12 +00:00
Duncan Sands
62becca681 Rename this test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123440 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-14 14:16:33 +00:00
Chris Lattner
bdb6b7f9c7 relax testcase a bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123433 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-14 07:46:33 +00:00
Duncan Sands
c43cee3fbb Move some shift transforms out of instcombine and into InstructionSimplify.
While there, I noticed that the transform "undef >>a X -> undef" was wrong.
For example if X is 2 then the top two bits must be equal, so the result can
not be anything.  I fixed this in the constant folder as well.  Also, I made
the transform for "X << undef" stronger: it now folds to undef always, even
though X might be zero.  This is in accordance with the LangRef, but I must
admit that it is fairly aggressive.  Also, I added "i32 X << 32 -> undef"
following the LangRef and the constant folder, likewise fairly aggressive.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123417 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-14 00:37:45 +00:00
Bob Wilson
704d1347c5 Extend SROA to handle arrays accessed as homogeneous structs and vice versa.
This is a minor extension of SROA to handle a special case that is
important for some ARM NEON operations.  Some of the NEON intrinsics
return multiple values, which are handled as struct types containing
multiple elements of the same vector type.  The corresponding return
types declared in the arm_neon.h header have equivalent arrays.  We
need SROA to recognize that it can split up those arrays and structs
into separate vectors, even though they are not always accessed with
the same type.  SROA already handles loads and stores of an entire
alloca by using insertvalue/extractvalue to access the individual
pieces, and that code works the same regardless of whether the type
is a struct or an array.  So, all that needs to be done is to check
for compatible arrays and homogeneous structs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123381 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-13 17:45:11 +00:00
Bob Wilson
694a10e7d8 Make SROA more aggressive with allocas containing padding.
SROA only split up structs and arrays one level at a time, so padding can
only cause trouble if it is located in between the struct or array elements.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123380 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-13 17:45:08 +00:00
Duncan Sands
6dc91253ab The most common simplification missed by instsimplify in unoptimized bitcode
is "X != 0 -> X" when X is a boolean.  This occurs a lot because of the way
llvm-gcc converts gcc's conditional expressions.  Add this, and a few other
similar transforms for completeness.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123372 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-13 08:56:29 +00:00
Chris Lattner
d318fc2ceb revert 123144, reenabling the rest of memset formation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123302 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-12 03:25:15 +00:00
Chris Lattner
d2e905027b revert r123146 which disabled code that wasn't the root cause
of the bootstrap miscompare issue.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123299 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-12 01:52:23 +00:00
Chris Lattner
86099ba2b5 merge tests into one crash.ll test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123220 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-11 07:50:07 +00:00
Chris Lattner
93767fdb61 remove a bogus assertion: the latch block of a loop is not
neccesarily an uncond branch to the header.  This fixes 
PR8955 (the assertion tripping).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123219 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-11 07:47:59 +00:00
Chandler Carruth
15ed90c859 Teach constant folding to perform conversions from constant floating
point values to their integer representation through the SSE intrinsic
calls. This is the last part of a README.txt entry for which I have real
world examples.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123206 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-11 01:07:24 +00:00
Chandler Carruth
f7b0047f5f FileCheck-ize a test, and move a no-longer calling test case to another
file and make it actually test something...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123205 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-11 01:07:20 +00:00
Owen Anderson
da1c122da5 Fix a random missed optimization by making InstCombine more aggressive when determining which bits are demanded by
a comparison against a constant.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123203 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-11 00:36:45 +00:00
Chandler Carruth
9cc9f50abc Teach instcombine about the rest of the SSE and SSE2 conversion
intrinsics element dependencies. Reviewed by Nick.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123161 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-10 07:19:37 +00:00
Chandler Carruth
fdc8f2d260 Fold two related tests into the newly FileCheck-ized test, migrating
them to FileCheck as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123154 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-10 02:53:58 +00:00
Chandler Carruth
548e581dcb Clean up and FileCheck-ize a test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123153 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-10 02:53:54 +00:00
Chris Lattner
f86c75da4d fix typo
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123148 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-10 02:33:34 +00:00
Chris Lattner
a806be66c1 another (more) aggressive attempt to bring llvm-gcc-i386-linux-selfhost
back to life.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123146 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-10 00:47:34 +00:00
Chris Lattner
d8408270f3 temporarily disable memset formation from memsets in an effort to restore buildbot stability.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123144 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-09 23:52:48 +00:00
Tobias Grosser
aa2be84356 Instcombine: Fix pattern where the sext did not dominate the icmp using it
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123121 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-09 16:00:11 +00:00
Chris Lattner
d90a192279 Merge memsets followed by neighboring memsets and other stores into
larger memsets.  Among other things, this fixes rdar://8760394 and
allows us to handle "Example 2" from http://blog.regehr.org/archives/320,
compiling it into a single 4096-byte memset:

_mad_synth_mute:                        ## @mad_synth_mute
## BB#0:                                ## %entry
	pushq	%rax
	movl	$4096, %esi             ## imm = 0x1000
	callq	___bzero
	popq	%rax
	ret



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123089 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-08 21:19:19 +00:00
Chris Lattner
9fa11e94b5 fix an issue in IsPointerOffset that prevented us from recognizing that
P and P+1 are relative to the same base pointer.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123087 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-08 21:07:56 +00:00
Chris Lattner
06511264f8 enhance memcpyopt to merge a store and a subsequent
memset into a single larger memset.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123086 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-08 20:54:51 +00:00
Chris Lattner
355f5778aa merge two tests and filecheckify
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123082 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-08 20:27:22 +00:00
Chris Lattner
5d37370a6f When loop rotation happens, it is *very* common for the duplicated condbr
to be foldable into an uncond branch.  When this happens, we can make a
much simpler CFG for the loop, which is important for nested loop cases
where we want the outer loop to be aggressively optimized.

Handle this case more aggressively.  For example, previously on
phi-duplicate.ll we would get this:


define void @test(i32 %N, double* %G) nounwind ssp {
entry:
  %cmp1 = icmp slt i64 1, 1000
  br i1 %cmp1, label %bb.nph, label %for.end

bb.nph:                                           ; preds = %entry
  br label %for.body

for.body:                                         ; preds = %bb.nph, %for.cond
  %j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ]
  %arrayidx = getelementptr inbounds double* %G, i64 %j.02
  %tmp3 = load double* %arrayidx
  %sub = sub i64 %j.02, 1
  %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
  %tmp7 = load double* %arrayidx6
  %add = fadd double %tmp3, %tmp7
  %arrayidx10 = getelementptr inbounds double* %G, i64 %j.02
  store double %add, double* %arrayidx10
  %inc = add nsw i64 %j.02, 1
  br label %for.cond

for.cond:                                         ; preds = %for.body
  %cmp = icmp slt i64 %inc, 1000
  br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge

for.cond.for.end_crit_edge:                       ; preds = %for.cond
  br label %for.end

for.end:                                          ; preds = %for.cond.for.end_crit_edge, %entry
  ret void
}

Now we get the much nicer:

define void @test(i32 %N, double* %G) nounwind ssp {
entry:
  br label %for.body

for.body:                                         ; preds = %entry, %for.body
  %j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ]
  %arrayidx = getelementptr inbounds double* %G, i64 %j.01
  %tmp3 = load double* %arrayidx
  %sub = sub i64 %j.01, 1
  %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
  %tmp7 = load double* %arrayidx6
  %add = fadd double %tmp3, %tmp7
  %arrayidx10 = getelementptr inbounds double* %G, i64 %j.01
  store double %add, double* %arrayidx10
  %inc = add nsw i64 %j.01, 1
  %cmp = icmp slt i64 %inc, 1000
  br i1 %cmp, label %for.body, label %for.end

for.end:                                          ; preds = %for.body
  ret void
}

With all of these recent changes, we are now able to compile:

void foo(char *X) {
 for (int i = 0; i != 100; ++i) 
   for (int j = 0; j != 100; ++j)
     X[j+i*100] = 0;
}

into a single memset of 10000 bytes.  This series of changes
should also be helpful for other nested loop scenarios as well.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123079 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-08 19:59:06 +00:00
Chris Lattner
0e4a1543ab Three major changes:
1. Rip out LoopRotate's domfrontier updating code.  It isn't
   needed now that LICM doesn't use DF and it is super complex
   and gross.
2. Make DomTree updating code a lot simpler and faster.  The 
   old loop over all the blocks was just to find a block??
3. Change the code that inserts the new preheader to just use
   SplitCriticalEdge instead of doing an overcomplex 
   reimplementation of it.

No behavior change, except for the name of the inserted preheader.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123072 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-08 18:52:51 +00:00
Frits van Bommel
b686eb9186 Fix a bug in r123034 (trying to sext/zext non-integers) and clean up a little.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123061 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-08 10:51:36 +00:00
Chris Lattner
d9ec3572f3 Have loop-rotate simplify instructions (yay instsimplify!) as it clones
them into the loop preheader, eliminating silly instructions like
"icmp i32 0, 100" in fixed tripcount loops.  This also better exposes the 
bigger problem with loop rotate that I'd like to fix: once this has been
folded, the duplicated conditional branch *often* turns into an uncond branch.

Not aggressively handling this is pessimizing later loop optimizations 
somethin' fierce by making "dominates all exit blocks" checks fail.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123060 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-08 08:24:46 +00:00
Tobias Grosser
46431d7a93 InstCombine: Match min/max hidden by sext/zext
X = sext x; x >s c ? X : C+1 --> X = sext x; X <s C+1 ? C+1 : X
X = sext x; x <s c ? X : C-1 --> X = sext x; X >s C-1 ? C-1 : X
X = zext x; x >u c ? X : C+1 --> X = zext x; X <u C+1 ? C+1 : X
X = zext x; x <u c ? X : C-1 --> X = zext x; X >u C-1 ? C-1 : X
X = sext x; x >u c ? X : C+1 --> X = sext x; X <u C+1 ? C+1 : X
X = sext x; x <u c ? X : C-1 --> X = sext x; X >u C-1 ? C-1 : X

Instead of calculating this with mixed types promote all to the
larger type. This enables scalar evolution to analyze this
expression. PR8866

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123034 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-07 21:33:14 +00:00
Benjamin Kramer
eaff66a895 Revert 122959, it needs more thought. Add it back to README.txt with additional notes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123030 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-07 20:42:20 +00:00
Benjamin Kramer
8143a84c46 InstCombine: Turn _chk functions into the "unsafe" variant if length and max langth are equal.
This happens when we take the (non-constant) length from a malloc.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122961 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-06 14:22:52 +00:00
Benjamin Kramer
240d42d185 InstCombine: If we call llvm.objectsize on a malloc call we can replace it with the size passed to malloc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122959 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-06 13:11:05 +00:00
Benjamin Kramer
783a5c2b69 InstCombine: Teach llvm.objectsize folding to look through GEPs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122958 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-06 13:07:49 +00:00
Chris Lattner
8cd4efb6a5 implement constant folding support for an exotic constant expr:
ret i64 ptrtoint (i8* getelementptr ([1000 x i8]* @X, i64 1, i64 sub (i64 0, i64 ptrtoint ([1000 x i8]* @X to i64))) to i64)

to "ret i64 1000".  This allows us to correctly compute the trip count
on a loop in PR8883, which occurs with std::fill on a char array.  This
allows us to transform it into a memset with a constant size.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122950 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-06 06:19:46 +00:00
Chris Lattner
43b40a4620 fix an off-by-one bug that caused a crash analyzing
ashr's with huge shift amounts, PR8896


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122814 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-04 18:19:15 +00:00
Chris Lattner
e41d3c015c Teach loop-idiom to turn a loop containing a memset into a larger memset
when safe.

The testcase is basically this nested loop:
void foo(char *X) {
  for (int i = 0; i != 100; ++i) 
    for (int j = 0; j != 100; ++j)
      X[j+i*100] = 0;
}

which gets turned into a single memset now.  clang -O3 doesn't optimize
this yet though due to a phase ordering issue I haven't analyzed yet.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122806 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-04 07:46:33 +00:00
Chris Lattner
e508dd4c75 Duncan deftly points out that readnone functions aren't
invalidated by stores, so they can be handled as 'simple'
operations.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122785 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-03 23:38:13 +00:00
Chris Lattner
75637154c3 earlycse can do trivial with-a-block dead store
elimination as well.  This deletes 60 stores in 176.gcc
that largely come from bitfield code.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122736 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-03 04:17:24 +00:00
Chris Lattner
ef87fc2e0a now that loads are in their own table, we can implement
store->load forwarding.  This allows EarlyCSE to zap 600 more
loads from 176.gcc.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122732 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-03 03:46:34 +00:00
Chris Lattner
03d49e955e add a testcase for readonly call CSE
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122730 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-03 03:33:47 +00:00
Chris Lattner
8e7f0d70c7 Teach EarlyCSE to do trivial CSE of loads and read-only calls.
On 176.gcc, this catches 13090 loads and calls, and increases the
number of simple instructions CSE'd from 29658 to 36208.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122727 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-03 03:18:43 +00:00
Chris Lattner
91139ccd99 add DEBUG and -stats output to earlycse.
Teach it to CSE the rest of the non-side-effecting instructions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122716 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-02 23:19:45 +00:00
Chris Lattner
cc9eab26b3 Enhance earlycse to do CSE of casts, instsimplify and die.
Add a testcase.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122715 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-02 23:04:14 +00:00
Chris Lattner
63f9c3c49a fix a miscompilation of tramp3d-v4: when forming a memcpy, we have to make
sure that the loop we're promoting into a memcpy doesn't mutate the input
of the memcpy.  Before we were just checking that the dest of the memcpy
wasn't mod/ref'd by the loop.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122712 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-02 21:14:18 +00:00
Chris Lattner
8e08e73f0e If a loop iterates exactly once (has backedge count = 0) then don't
mess with it.  We'd rather peel/unroll it than convert all of its 
stores into memsets.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122711 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-02 20:24:21 +00:00
Chris Lattner
62c50fdf69 enhance loop idiom recognition to scan *all* unconditionally executed
blocks in a loop, instead of just the header block.  This makes it more
aggressive, able to handle Duncan's Ada examples.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122704 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-02 19:01:03 +00:00
Duncan Sands
67fb341f8b Fix PR8702 by not having LoopSimplify claim to preserve LCSSA form. As described
in the PR, the pass could break LCSSA form when inserting preheaders.  It probably
would be easy enough to fix this, but since currently we always go into LCSSA form
after running this pass, doing so is not urgent.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122695 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-02 13:38:21 +00:00
Chris Lattner
cf078f2b20 Allow loop-idiom to run on multiple BB loops, but still only scan the loop
header for now for memset/memcpy opportunities.  It turns out that loop-rotate
is successfully rotating loops, but *DOESN'T MERGE THE BLOCKS*, turning "for 
loops" into 2 basic block loops that loop-idiom was ignoring.

With this fix, we form many *many* more memcpy and memsets than before, including
on the "history" loops in the viterbi benchmark, which look like this:

        for (j=0; j<MAX_history; ++j) {
          history_new[i][j+1] = history[2*i][j];
        }

Transforming these loops into memcpy's speeds up the viterbi benchmark from
11.98s to 3.55s on my machine.  Woo.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122685 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-02 07:58:36 +00:00
Chris Lattner
e2c4392091 teach loop idiom recognition to form memcpy's from simple loops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122678 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-02 03:37:56 +00:00
Chris Lattner
d91ed10b70 fix a globalopt crash on two Adobe-C++ testcases that the recent
loop idiom pass exposed.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122674 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-01 22:31:46 +00:00
Chris Lattner
bafa117e8f add a validity check that was missed, fixing a crash on the
new testcase.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122662 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-01 20:12:04 +00:00
Duncan Sands
124708d9b4 Revert commit 122654 at the request of Chris, who reckons that instsimplify
is the wrong hammer for this nail, and is probably right.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122661 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-01 20:08:02 +00:00
Chris Lattner
a64cbf067d improve validity check to handle constant-trip-count loops more
aggressively.  In practice, this doesn't help anything though,
see the todo.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122660 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-01 19:54:22 +00:00
Chris Lattner
30980b6815 implement the "no aliasing accesses in loop" safety check. This pass
should be correct now.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122659 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-01 19:39:01 +00:00
Duncan Sands
7cf85e74e3 Fix a README item by having InstructionSimplify do a mild form of value
numbering, in which it considers (for example) "%a = add i32 %x, %y" and
"%b = add i32 %x, %y" to be equal because the operands are equal and the
result of the instructions only depends on the values of the operands.
This has almost no effect (it removes 4 instructions from gcc-as-one-file),
and perhaps slows down compilation: I measured a 0.4% slowdown on the large
gcc-as-one-file testcase, but it wasn't statistically significant.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122654 91177308-0d34-0410-b5e6-96231b3b80d8
2011-01-01 16:12:09 +00:00
NAKAMURA Takumi
5c083b174e test/Transforms/ConstProp/logicaltest.ll: FileCheck-ize.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122620 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-29 03:58:56 +00:00
Chris Lattner
a92ff91a96 implement enough of the memset inference algorithm to recognize and insert
memsets.  This is still missing one important validity check, but this is enough
to compile stuff like this:

void test0(std::vector<char> &X) {
  for (std::vector<char>::iterator I = X.begin(), E = X.end(); I != E; ++I)
    *I = 0;
}

void test1(std::vector<int> &X) {
  for (long i = 0, e = X.size(); i != e; ++i)
    X[i] = 0x01010101;
}

With:
 $ clang t.cpp -S -o - -O2 -emit-llvm | opt -loop-idiom | opt -O3 | llc 

to:

__Z5test0RSt6vectorIcSaIcEE:            ## @_Z5test0RSt6vectorIcSaIcEE
## BB#0:                                ## %entry
	subq	$8, %rsp
	movq	(%rdi), %rax
	movq	8(%rdi), %rsi
	cmpq	%rsi, %rax
	je	LBB0_2
## BB#1:                                ## %bb.nph
	subq	%rax, %rsi
	movq	%rax, %rdi
	callq	___bzero
LBB0_2:                                 ## %for.end
	addq	$8, %rsp
	ret
...
__Z5test1RSt6vectorIiSaIiEE:            ## @_Z5test1RSt6vectorIiSaIiEE
## BB#0:                                ## %entry
	subq	$8, %rsp
	movq	(%rdi), %rax
	movq	8(%rdi), %rdx
	subq	%rax, %rdx
	cmpq	$4, %rdx
	jb	LBB1_2
## BB#1:                                ## %for.body.preheader
	andq	$-4, %rdx
	movl	$1, %esi
	movq	%rax, %rdi
	callq	_memset
LBB1_2:                                 ## %for.end
	addq	$8, %rsp
	ret



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122573 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-26 23:42:51 +00:00
Chris Lattner
61db1f56d0 start using irbuilder to make mem intrinsics in a few passes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122572 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-26 22:57:41 +00:00
Benjamin Kramer
a112087e42 MemCpyOpt: Turn memcpys from a constant into a memset if possible.
This allows us to compile "int cst[] = {-1, -1, -1};" into
  movl  $-1, 16(%rsp)
  movq  $-1, 8(%rsp)
instead of
  movl  _cst+8(%rip), %eax
  movl  %eax, 16(%rsp)
  movq  _cst(%rip), %rax
  movq  %rax, 8(%rsp)


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122548 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-24 21:17:12 +00:00
Owen Anderson
ec3953ff95 When determining if we can fold (x >> C1) << C2, the bits that we need to verify are zero
are not the low bits of x, but the bits that WILL be the low bits after the operation completes.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122529 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-23 23:56:24 +00:00
Benjamin Kramer
4ac19470dc InstCombine: creating selects from -1 and 0 is fine, they combine into a sext from i1.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122453 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-22 23:12:15 +00:00
Duncan Sands
1cd05bb605 When determining whether the new instruction was already present in
the original instruction, half the cases were missed (making it not
wrong but suboptimal).  Also correct a typo (A <-> B) in the second
chunk. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122414 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-22 17:15:25 +00:00
Duncan Sands
b3898af89f Make this test not depend on how the variable is named.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122413 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-22 17:08:04 +00:00
Duncan Sands
37bf92b523 Add a generic expansion transform: A op (B op' C) -> (A op B) op' (A op C)
if both A op B and A op C simplify.  This fires fairly often but doesn't
make that much difference.  On gcc-as-one-file it removes two "and"s and
turns one branch into a select.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122399 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-22 13:36:08 +00:00
Owen Anderson
f056838956 Give GVN back the ability to perform simple conditional propagation on conditional branch values.
I still think that LVI should be handling this, but that capability is some ways off in the future,
and this matters for some significant benchmarks.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122378 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-21 23:54:34 +00:00
Duncan Sands
025c98bdbd Add an additional InstructionSimplify factorization test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122333 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-21 15:12:22 +00:00
Duncan Sands
07f30fbd73 While I don't think any later transforms can fire, it seems cleaner to
not assume this (for example in case more transforms get added below
it).  Suggested by Frits van Bommel.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122332 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-21 15:03:43 +00:00
Duncan Sands
9bd2c2e63c Fix typo in comment, spotted by Deewiant.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122329 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-21 13:39:20 +00:00
Duncan Sands
3421d90853 Teach InstructionSimplify about distributive laws. These transforms fire
quite often, but don't make much difference in practice presumably because
instcombine also knows them and more.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122328 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-21 13:32:22 +00:00
Duncan Sands
566edb04b8 Add generic simplification of associative operations, generalizing
a couple of existing transforms.  This fires surprisingly often, for
example when compiling gcc "(X+(-1))+1->X" fires quite a lot as well
as various "and" simplifications (usually with a phi node operand).
Most of the time this doesn't make a real difference since the same
thing would have been done elsewhere anyway, eg: by instcombine, but
there are a few places where this results in simplifications that we
were not doing before.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122326 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-21 08:49:00 +00:00
Benjamin Kramer
5337f20c15 Teach InstCombine to merge (icmp ult (X + CA), C1) | (icmp eq X, C2) into (icmp ult (X + CA), C1 + 1) if C2 + CA == C1.
InstCombine creates these so now we compile x == 23 || x == 24 || x == 25 to
  %x.off = add i32 %x, -23
  %1 = icmp ult i32 %x.off, 3
instead of
  %x.off = add i32 %x, -23
  %1 = icmp ult i32 %x.off, 2
  %cmp3 = icmp eq i32 %x, 25
  %ret2 = or i1 %1, %cmp3


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122248 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-20 16:18:51 +00:00
Duncan Sands
ee9a2e322a Have SimplifyBinOp dispatch Xor, Add and Sub to the corresponding methods
(they had just been forgotten before).  Adding Xor causes "main" in the
existing testcase 2010-11-01-lshr-mask.ll to be hugely more simplified.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122245 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-20 14:47:04 +00:00
Chris Lattner
2b9375e44b fix PR8807 by making transformConstExprCastCall aware of byval arguments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122238 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-20 08:36:38 +00:00
Chris Lattner
0b66f63a26 when eliding a byval copy due to inlining a readonly function, we have
to make sure that the reused alloca has sufficient alignment.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122236 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-20 08:10:40 +00:00
Chris Lattner
e7ae705c32 pull byval processing out to its own helper function.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122235 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-20 07:57:41 +00:00
Chris Lattner
018fb767b9 fix PR8769, a miscompilation by inliner when inlining a function with a byval
argument.  The generated alloca has to have at least the alignment of the
byval, if not, the client may be making assumptions that the new alloca won't
satisfy.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122234 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-20 07:45:28 +00:00
Chris Lattner
572335915f merge two tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122233 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-20 07:39:57 +00:00
Chris Lattner
b0af8ce1d9 filecheckize
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122232 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-20 07:38:24 +00:00
Mon P Wang
b085181258 Test case for r122215 when InstCombine optimizes memset
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122216 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-20 01:06:23 +00:00
Chris Lattner
a34b3cf953 X86 supports i8/i16 overflow ops (except i8 multiplies), we should
generate them.  

Now we compile:

define zeroext i8 @X(i8 signext %a, i8 signext %b) nounwind ssp {
entry:
  %0 = tail call %0 @llvm.sadd.with.overflow.i8(i8 %a, i8 %b)
  %cmp = extractvalue %0 %0, 1
  br i1 %cmp, label %if.then, label %if.end

into:

_X:                                     ## @X
## BB#0:                                ## %entry
	subl	$12, %esp
	movb	16(%esp), %al
	addb	20(%esp), %al
	jo	LBB0_2

Before we were generating:

_X:                                     ## @X
## BB#0:                                ## %entry
	pushl	%ebp
	movl	%esp, %ebp
	subl	$8, %esp
	movb	12(%ebp), %al
	testb	%al, %al
	setge	%cl
	movb	8(%ebp), %dl
	testb	%dl, %dl
	setge	%ah
	cmpb	%cl, %ah
	sete	%cl
	addb	%al, %dl
	testb	%dl, %dl
	setge	%al
	cmpb	%al, %ah
	setne	%al
	andb	%cl, %al
	testb	%al, %al
	jne	LBB0_2




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122186 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-19 20:03:11 +00:00
Chris Lattner
e5cbdca3bd recognize an unsigned add with overflow idiom into uadd.
This resolves a README entry and technically resolves PR4916,
but we still get poor code for the testcase in that PR because
GVN isn't CSE'ing uadd with add, filed as PR8817.

Previously we got:

_test7:                                 ## @test7
	addq	%rsi, %rdi
	cmpq	%rdi, %rsi
	movl	$42, %eax
	cmovaq	%rsi, %rax
	ret

Now we get:

_test7:                                 ## @test7
	addq	%rsi, %rdi
	movl	$42, %eax
	cmovbq	%rsi, %rax
	ret



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122182 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-19 19:37:52 +00:00
Chris Lattner
26b482d7a7 optimize uadd(x, cst) into a comparison when the normal
result is dead.  This is required for my next patch to not
regress the testsuite.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122181 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-19 19:35:32 +00:00
Chris Lattner
0a62474830 generalize the sadd creation code to not require that the
sadd formed is half the size of the original type. We can
now compile this into a sadd.i8:

unsigned char X(char a, char b) {
  int res = a+b;
  if ((unsigned )(res+128) > 255U)
    abort();
  return res;
}




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122178 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-19 18:35:09 +00:00
Chris Lattner
dd7e837374 fix another miscompile in the llvm.sadd formation logic: it wasn't
checking to see if the high bits of the original add result were dead.
Inserting a smaller add and zexting back to that size is not good enough.

This is likely to be the fix for 8816.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122177 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-19 18:22:06 +00:00
Chris Lattner
368397bb7d fix a bug (possibly 8816) in the sadd forming xform: it isn't
profitable (or safe) to promote code when the add-with-constant
has other uses.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122175 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-19 17:59:02 +00:00
Chris Lattner
1dec0d2704 Enhance LICM to promote alias sets whose pointers themselves are stored,
which doesn't affect the memory address being promoted.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122172 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-19 05:57:25 +00:00
Chris Lattner
1c0af0ed25 fix PR8602, a bug in an assertion: a volatile store *of* a pointer
does not make the alias set for that pointer volatile, just stores
*to* the pointer.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122171 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-19 05:51:54 +00:00
Chris Lattner
a0d172f7fe revert r122164, I'm going to go with a different approach.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122168 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-19 04:23:03 +00:00
Chris Lattner
140f4a315b first step to fixing PR8642: don't fold away empty basic blocks
which have trapping constant exprs in them due to PHI nodes.
Eliminating them can cause the constant expr to be evalutated
on new paths if the input edges are critical.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122164 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-19 03:02:34 +00:00
Chris Lattner
78d0094e4c move this test into the ARM test so that it is only run when the arm backend
is enabled.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122163 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-19 02:58:14 +00:00
Nate Begeman
9a3dc55202 Add vector versions of some existing scalar transforms to aid codegen in matching psign & pblend operations to the IR produced by clang/gcc for their C idioms.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122105 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-17 23:12:19 +00:00
Owen Anderson
e63dda51c2 Reapply r121905 (automatic synthesis of @llvm.sadd.with.overflow) with a fix for a bug that manifested itself
on the DragonEgg self-host bot.  Unfortunately, the testcase is pretty messy and doesn't reduce well due to
interactions with other parts of InstCombine.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122072 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-17 18:08:00 +00:00
Benjamin Kramer
14c0987bd9 SimplifyCFG: Ranges can be larger than 64 bits. Fixes Release-selfhost build.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122054 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-17 10:48:14 +00:00
Chris Lattner
e27db74a60 improve switch formation to handle small range
comparisons formed by comparisons.  For example,
this:

void foo(unsigned x) {
  if (x == 0 || x == 1 || x == 3 || x == 4 || x == 6) 
    bar();
}

compiles into:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	cmpl	$6, %edi
	ja	LBB0_2
## BB#1:                                ## %entry
	movl	%edi, %eax
	movl	$91, %ecx
	btq	%rax, %rcx
	jb	LBB0_3

instead of:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	cmpl	$2, %edi
	jb	LBB0_4
## BB#1:                                ## %switch.early.test
	cmpl	$6, %edi
	ja	LBB0_3
## BB#2:                                ## %switch.early.test
	movl	%edi, %eax
	movl	$88, %ecx
	btq	%rax, %rcx
	jb	LBB0_4

This catches a bunch of cases in GCC, which look like this:

 %804 = load i32* @which_alternative, align 4, !tbaa !0
 %805 = icmp ult i32 %804, 2
 %806 = icmp eq i32 %804, 3
 %or.cond121 = or i1 %805, %806
 %807 = icmp eq i32 %804, 4
 %or.cond124 = or i1 %or.cond121, %807
 br i1 %or.cond124, label %.thread, label %808

turning this into a range comparison.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122045 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-17 06:20:15 +00:00
Dan Gohman
c32046e6ea Revert r64460. strtol and friends cannot be marked readonly, even with
a null endptr argument, because they may write to errno.

This fixes a seflhost miscompile observed on Linux targets when TBAA
was enabled.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122014 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-17 01:09:43 +00:00
Duncan Sands
ebef48ea4b Speculatively revert commit 121905 since it looks like it might have broken the
dragonegg self-host buildbot.  Original commit message:

Add an InstCombine transform to recognize instances of manual overflow-safe addition
(performing the addition in a wider type and explicitly checking for overflow), and
fold them down to intrinsics.  This currently only supports signed-addition, but could
be generalized if someone works out the magic constant formulas for other operations.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121965 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-16 09:40:54 +00:00
Dan Gohman
f4177aa019 Preserve TBAA tags when doing load PRE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121921 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-15 23:53:55 +00:00
Owen Anderson
12984de314 Add an InstCombine transform to recognize instances of manual overflow-safe addition
(performing the addition in a wider type and explicitly checking for overflow), and
fold them down to intrinsics.  This currently only supports signed-addition, but could
be generalized if someone works out the magic constant formulas for other operations.

Fixes <rdar://problem/8558713>.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121905 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-15 22:32:38 +00:00
Frits van Bommel
26e097ca4b Teach jump threading to "look through" a select when the branch direction of a terminator depends on it.
When it sees a promising select it now tries to figure out whether the condition of the select is known in any of the predecessors and if so it maps the operands appropriately.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121859 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-15 09:51:20 +00:00
Owen Anderson
86e8a700f5 Fix PR8790, another instance where unreachable code can cause instruction simplification to fail,
this case involve a select that simplifies to itself.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121817 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-15 00:55:35 +00:00
Chris Lattner
3aff13b82a - Insert new instructions before DomBlock's terminator,
which is simpler than finding a place to insert in BB.
 - Don't perform the 'if condition hoisting' xform on certain
   i1 PHIs, as it interferes with switch formation.

This re-fixes "example 7", without breaking the world hopefully.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121764 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-14 08:46:09 +00:00
Chris Lattner
60d410d7bb fix two significant issues with FoldTwoEntryPHINode:
first, it can kick in on blocks whose conditions have been
folded to a constant, even though one of the edges will be
trivially folded.

second, it doesn't clean up the "if diamond" that it just 
eliminated away.  This is a problem because other simplifycfg
xforms kick in depending on the order of block visitation,
causing pointless work.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121762 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-14 08:01:53 +00:00
Chris Lattner
8168efffa7 fix yet anohter broken line
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121750 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-14 06:09:07 +00:00
Chris Lattner
117f8cffc5 reapply my recent change that disables a piece of the switch formation
work, but fixes 400.perlbmk.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121749 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-14 05:57:30 +00:00
Owen Anderson
2d9220e8f5 Fix recent buildbot breakage by pulling SimplifyCFG back to its state as of r121694, the most recent state
where I'm confident there were no crashes or miscompilations.  XFAIL the test added since then for now.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121733 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-13 23:49:28 +00:00
Chris Lattner
f9a1b2a4cf temporarily disable part of my previous patch, which causes an iterator invalidation issue, causing a crash on some versions of perlbmk.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121728 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-13 23:02:19 +00:00
Benjamin Kramer
cf8b3257c0 Fix sort predicate. qsort(3)'s predicate semantics differ from std::sort's. Fixes PR 8780.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121705 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-13 18:20:38 +00:00
Chris Lattner
a9f6bbea62 reinstate my patch: the miscompile was caused by an inverted branch in the
'and' case.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121695 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-13 08:12:19 +00:00
Chris Lattner
92407e5895 Completely disable the optimization I added in r121680 until
I can track down a miscompile.  This should bring the buildbots
back to life


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121693 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-13 07:41:29 +00:00
Chris Lattner
daa02ab70c Make simplifycfg reprocess newly formed "br (cond1 | cond2)" conditions
when simplifying, allowing them to be eagerly turned into switches.  This
is the last step required to get "Example 7" from this blog post:
http://blog.regehr.org/archives/320

On X86, we now generate this machine code, which (to my eye) seems better
than the ICC generated code:

_crud:                                  ## @crud
## BB#0:                                ## %entry
	cmpb	$33, %dil
	jb	LBB0_4
## BB#1:                                ## %switch.early.test
	addb	$-34, %dil
	cmpb	$58, %dil
	ja	LBB0_3
## BB#2:                                ## %switch.early.test
	movzbl	%dil, %eax
	movabsq	$288230376537592865, %rcx ## imm = 0x400000017001421
	btq	%rax, %rcx
	jb	LBB0_4
LBB0_3:                                 ## %lor.rhs
	xorl	%eax, %eax
	ret
LBB0_4:                                 ## %lor.end
	movl	$1, %eax
	ret



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121690 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-13 07:00:06 +00:00
Chris Lattner
97bd89ece3 fix a bug in r121680 that upset the various buildbots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121687 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-13 05:34:18 +00:00
Chris Lattner
c232a2e559 make these tests a bit less fragile
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121682 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-13 05:10:30 +00:00
Chris Lattner
7312a22ed6 enhance the "change or icmp's into switch" xform to handle one value in an
'or sequence' that it doesn't understand.  This allows us to optimize
something insane like this:

int crud (unsigned char c, unsigned x)
 {
   if(((((((((( (int) c <= 32 ||
                    (int) c == 46) || (int) c == 44)
                  || (int) c == 58) || (int) c == 59) || (int) c == 60)
               || (int) c == 62) || (int) c == 34) || (int) c == 92)
            || (int) c == 39) != 0)
     foo();
 }

into:

define i32 @crud(i8 zeroext %c, i32 %x) nounwind ssp noredzone {
entry:
  %cmp = icmp ult i8 %c, 33
  br i1 %cmp, label %if.then, label %switch.early.test

switch.early.test:                                ; preds = %entry
  switch i8 %c, label %if.end [
    i8 39, label %if.then
    i8 44, label %if.then
    i8 58, label %if.then
    i8 59, label %if.then
    i8 60, label %if.then
    i8 62, label %if.then
    i8 46, label %if.then
    i8 92, label %if.then
    i8 34, label %if.then
  ]

by pulling the < comparison out ahead of the newly formed switch.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121680 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-13 04:50:38 +00:00
Chris Lattner
f5198e7fe3 merge two tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121679 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-13 04:45:56 +00:00
Chris Lattner
abf706703f Fix my previous patch to handle a degenerate case that the llvm-gcc
bootstrap buildbot tripped over.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121674 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-13 03:43:57 +00:00
Chris Lattner
61c77449c7 fix a fairly serious oversight with switch formation from
or'd conditions.  Previously we'd compile something like this:

int crud (unsigned char c) {
   return c == 62 || c == 34 || c == 92;
}

into:

  switch i8 %c, label %lor.rhs [
    i8 62, label %lor.end
    i8 34, label %lor.end
  ]

lor.rhs:                                          ; preds = %entry
  %cmp8 = icmp eq i8 %c, 92
  br label %lor.end

lor.end:                                          ; preds = %entry, %entry, %lor.rhs
  %0 = phi i1 [ true, %entry ], [ %cmp8, %lor.rhs ], [ true, %entry ]
  %lor.ext = zext i1 %0 to i32
  ret i32 %lor.ext

which failed to merge the compare-with-92 into the switch.  With this patch
we simplify this all the way to:

  switch i8 %c, label %lor.rhs [
    i8 62, label %lor.end
    i8 34, label %lor.end
    i8 92, label %lor.end
  ]

lor.rhs:                                          ; preds = %entry
  br label %lor.end

lor.end:                                          ; preds = %entry, %entry, %entry, %lor.rhs
  %0 = phi i1 [ true, %entry ], [ false, %lor.rhs ], [ true, %entry ], [ true, %entry ]
  %lor.ext = zext i1 %0 to i32
  ret i32 %lor.ext

which is much better for codegen's switch lowering stuff.  This kicks in 33 times
on 176.gcc (for example) cutting 103 instructions off the generated code.





git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121671 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-13 03:18:54 +00:00
Benjamin Kramer
2f7228b80c Generalize the and-icmp-select instcombine further by allowing selects of the form
(x & 2^n) ? 2^m+C : C

we can offset both arms by C to get the "(x & 2^n) ? 2^m : 0" form, optimize the
select to a shift and apply the offset afterwards.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121609 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-11 10:49:22 +00:00
Benjamin Kramer
20e3b4b380 Factor the (x & 2^n) ? 2^m : 0 instcombine into its own method and generalize it
to catch cases where n != m with a shift.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121608 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-11 09:42:59 +00:00
Chris Lattner
8fdca6a873 enhance memcpyopt to zap memcpy's that have the same src/dst.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121362 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-09 07:45:45 +00:00
Chris Lattner
f7f35467a9 fix PR8753, eliminating a case where we'd infinitely make a
substitution because it doesn't actually change the IR.  Patch by
Jakub Staszak!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121361 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-09 07:39:50 +00:00
Dan Gohman
d8e0c0438a Really check that the bits that will become zero are actually already zero
before eliminating the operation that zeros them. This fixes rdar://8739316.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121353 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-09 02:52:17 +00:00
Chris Lattner
1945d58025 reapply r121100 with a tweak to constant fold ConstExprs with TargetData
(if available) as we go so that we get simple constantexprs not insane ones.
This fixes the failure of clang/test/CodeGenCXX/virtual-base-ctor.cpp
that the previous iteration of this patch had.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121111 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-07 04:33:29 +00:00
Eric Christopher
6a3e305326 Temporarily revert r121100 as it's causing clang to fail
CodeGenCXX/virtual-base-ctor.cpp.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121102 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-07 02:41:11 +00:00
Chris Lattner
fb431099c5 fix PR8710 - teach global opt that some constantexprs are too complex to
put in a global variable's initializer.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121100 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-07 01:59:32 +00:00
Frits van Bommel
6033b346e2 Implement jump threading of 'indirectbr' by keeping track of whether we're looking for ConstantInt*s or BlockAddress*s.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121066 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-06 23:36:56 +00:00
Chris Lattner
cc10244d77 Fix PR8728, a miscompilation I recently introduced. When optimizing
memcpy's like:
  memcpy(A, B)
  memcpy(A, C)

we cannot delete the first memcpy as dead if A and C might be aliases.
If so, we actually get:

  memcpy(A, B)
  memcpy(A, A)

which is not correct to transform into:

  memcpy(A, A)

This patch was heavily influenced by Jakub Staszak's patch in PR8728, thanks
Jakub!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120974 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-06 01:48:06 +00:00
Frits van Bommel
7ac40c3ffa Teach SimplifyCFG to turn
(indirectbr (select cond, blockaddress(@fn, BlockA),
                            blockaddress(@fn, BlockB)))
into
  (br cond, BlockA, BlockB).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120943 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-05 18:29:03 +00:00
Chris Lattner
b5a3196f80 fix a bozo bug I introduced in r119930, causing a miscompile of
20040709-1.c from the gcc testsuite.  I was using the size of a
pointer instead of the pointee.  This fixes rdar://8713376


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120519 91177308-0d34-0410-b5e6-96231b3b80d8
2010-12-01 01:24:55 +00:00
Chris Lattner
3161ae1867 Enhance DSE to handle the variable index case in PR8657.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120498 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-30 23:43:23 +00:00
Chris Lattner
a04096580a teach DSE to use GetPointerBaseWithConstantOffset to analyze
may-aliasing stores that partially overlap with different base
pointers.  This implements PR6043 and the non-variable part of
PR8657


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120485 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-30 23:05:20 +00:00
Chris Lattner
55ee75d571 enhance isRemovable to refuse to delete volatile mem transfers
now that DSE hacks on them.  This fixes a regression I introduced,
by generalizing DSE to hack on transfers.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120445 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-30 19:12:10 +00:00
Chris Lattner
cf82dc376a Rewrite the main DSE loop to be written in terms of reasoning
about pairs of AA::Location's instead of looking for MemDep's
"Def" predicate.  This is more powerful and general, handling
memset/memcpy/store all uniformly, and implementing PR8701 and
probably obsoleting parts of memcpyoptimizer.

This also fixes an obscure bug with init.trampoline and i8
stores, but I'm not surprised it hasn't been hit yet.  Enhancing
init.trampoline to carry the size that it stores would allow
DSE to be much more aggressive about optimizing them.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120406 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-30 07:23:21 +00:00
Anders Carlsson
303023d9ff Add a puts optimization that converts puts() to putchar('\n').
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120398 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-30 06:19:18 +00:00
Anders Carlsson
1d06295583 Fix a typo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120394 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-30 06:03:55 +00:00
Anders Carlsson
2703399041 Rename this test to FPuts.ll since it actually tests fputs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120393 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-30 05:59:26 +00:00
Chris Lattner
d7b6a989a7 remove a use of llvm-dis
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120383 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-30 02:04:15 +00:00
Chris Lattner
feea8fb6b4 merge one more away
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120375 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-30 01:06:43 +00:00
Chris Lattner
8a2dc0233f I already merged partial-overwrite.ll -> PartialStore.ll
Merge context-sensitive.ll -> simple.ll and upgrade it.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@120374 91177308-0d34-0410-b5e6-96231b3b80d8
2010-11-30 01:05:07 +00:00