Enhance loop idiom recognition to look beyond the loop header for
memset/memcpy opportunities. It turns out that loop-rotate is successfully
rotating loops, but *DOESN'T MERGE THE BLOCKS*, turning "for loops" into
2 basic block loops that loop-idiom was ignoring.
With this fix, we form many *many* more memcpys and memsets than before, including
on the "history" loops in the viterbi benchmark, which look like this:
  for (j=0; j<MAX_history; ++j) {
    history_new[i][j+1] = history[2*i][j];
  }
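Conceptually each such loop collapses to one memcpy. A sketch, assuming the
rows are contiguous arrays of some element type T (the benchmark's actual
types aren't shown here):

  /* what loop-idiom forms from the j-loop above (sketch) */
  memcpy(&history_new[i][1], &history[2*i][0], MAX_history * sizeof(T));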
Transforming these loops into memcpy's speeds up the viterbi benchmark from
11.98s to 3.55s on my machine. Woo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122685 91177308-0d34-0410-b5e6-96231b3b80d8
This implements a simple form of value numbering, in which it considers
(for example) "%a = add i32 %x, %y" and "%b = add i32 %x, %y" to be equal
because the operands are equal and the result of the instructions depends
only on the values of the operands.
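A minimal sketch of the idea in C (hypothetical representation; the real
pass keys on the LLVM instructions themselves): two instructions get the
same number when their opcode and operand numbers match.

  /* hypothetical expression key: opcode plus value numbers of operands */
  struct Expr { int opcode; int lhs, rhs; };

  static struct Expr table[256];
  static int num_exprs = 0;

  /* return the number of an existing equal expression, else assign one */
  int value_number(int opcode, int lhs, int rhs) {
    for (int i = 0; i < num_exprs; ++i)
      if (table[i].opcode == opcode &&
          table[i].lhs == lhs && table[i].rhs == rhs)
        return i;          /* %b gets the same number as %a */
    table[num_exprs] = (struct Expr){opcode, lhs, rhs};
    return num_exprs++;
  }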
This has almost no effect (it removes 4 instructions from gcc-as-one-file),
and perhaps slows down compilation: I measured a 0.4% slowdown on the large
gcc-as-one-file testcase, but it wasn't statistically significant.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122654 91177308-0d34-0410-b5e6-96231b3b80d8
This resolves a README entry and technically resolves PR4916,
but we still get poor code for the testcase in that PR because
GVN isn't CSE'ing uadd with add, filed as PR8817.
Previously we got:
_test7: ## @test7
        addq %rsi, %rdi
        cmpq %rdi, %rsi
        movl $42, %eax
        cmovaq %rsi, %rax
        ret
Now we get:
_test7: ## @test7
        addq %rsi, %rdi
        movl $42, %eax
        cmovbq %rsi, %rax
        ret
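For context, the source pattern is roughly the classic unsigned overflow
check (a hypothetical reconstruction; the actual test7 lives in the LLVM
test suite and may differ):

  unsigned long test7_like(unsigned long a, unsigned long b) {
    unsigned long s = a + b;
    return s < a ? b : 42;   /* s < a iff the add overflowed */
  }

The compare now folds into the carry flag already produced by the addq, so
the separate cmpq disappears.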
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122182 91177308-0d34-0410-b5e6-96231b3b80d8
When we see
  (x & 2^n) ? 2^m+C : C
we can offset both arms by C to get the "(x & 2^n) ? 2^m : 0" form, optimize
the select to a shift, and apply the offset afterwards.
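A concrete sketch with hypothetical constants n=1, m=3, C=5:

  /* (x & 2^1) ? 2^3+5 : 5 */
  int f(int x)     { return (x & 2) ? 13 : 5; }

  /* after the transform: shift bit 1 up to bit 3, then add C back */
  int f_opt(int x) { return ((x & 2) << 2) + 5; }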
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@121609 91177308-0d34-0410-b5e6-96231b3b80d8
This folds the test in
  void a(int x) { if (((1<<x)&8)==0) b(); }
into "x != 3", which occurs over 100 times in 403.gcc but in no
other program in llvm-test.
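The equivalence is easy to spot-check (assuming x stays in [0,31] so the
shift is well defined):

  #include <assert.h>
  int main(void) {
    for (int x = 0; x < 32; ++x)
      assert((((1 << x) & 8) == 0) == (x != 3));
    return 0;
  }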
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119922 91177308-0d34-0410-b5e6-96231b3b80d8
When one of the operands of a comparison is a select instruction, see if
doing the compare with the true and false values of the select gives the
same result. If so, that can be used as the value of the comparison.
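A sketch of the kind of compare this simplifies: both arms of the select
give the same answer, so the compare folds to a constant:

  /* v is 4 or 6; either way v > 0 holds, so f always returns 1 */
  int f(int c) {
    int v = c ? 4 : 6;
    return v > 0;
  }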
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118378 91177308-0d34-0410-b5e6-96231b3b80d8
Transform
  (X >s -1) ? C1 : C2 and (X <s 0) ? C2 : C1
into ((X >>s 31) & (C2 - C1)) + C1, avoiding the conditional.
This optimization could be extended to take non-constant C1 and C2, but we'd
better stay conservative for now to avoid code size bloat.
for
  int sel(int n) {
    return n >= 0 ? 60 : 100;
  }
we now generate
        sarl $31, %edi
        andl $40, %edi
        leal 60(%rdi), %eax
instead of
        testl %edi, %edi
        movl $60, %ecx
        movl $100, %eax
        cmovnsl %ecx, %eax
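The arithmetic behind sel() written out in C (a sketch; right-shifting a
negative int is implementation-defined in C, but the backend emits an
arithmetic shift on x86):

  int sel_branchless(int n) {
    /* n >> 31 is 0 when n >= 0 and all-ones when n < 0 */
    return ((n >> 31) & (100 - 60)) + 60;   /* 60 if n >= 0, else 100 */
  }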
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@107866 91177308-0d34-0410-b5e6-96231b3b80d8
Teach the DAG combiner to turn a load/or/and/store sequence into a narrower
store when it is safe. Daniel tells me that clang will start producing this
sort of thing with bitfields, and this does trigger a few dozen times on
176.gcc produced by llvm-gcc even now.
This compiles code like CodeGen/X86/2009-05-28-DAGCombineCrash.ll
into:
        movl %eax, 36(%rdi)
instead of:
        movl $4294967295, %eax ## imm = 0xFFFFFFFF
        andq 32(%rdi), %rax
        shlq $32, %rcx
        addq %rax, %rcx
        movq %rcx, 32(%rdi)
and each of the testcases into a single store. Each of them used
to compile into craziness like this:
_test4:
        movl $65535, %eax ## imm = 0xFFFF
        andl (%rdi), %eax
        shll $16, %esi
        addl %eax, %esi
        movl %esi, (%rdi)
        ret
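The source pattern behind _test4 is a bitfield store; a sketch with a
hypothetical layout:

  struct S { unsigned lo : 16, hi : 16; };

  /* storing the high field used to read-modify-write the whole word;
     the combine shrinks it to a single 16-bit store */
  void test4_like(struct S *s, unsigned v) {
    s->hi = v;
  }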
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@101343 91177308-0d34-0410-b5e6-96231b3b80d8
Look through selects and phi nodes when deciding which pointers point to
local memory. I actually checked long ago how useful this is, and it isn't
very: it hardly ever fires in the testsuite, but since Chris wants it,
here it is!
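A sketch of the pattern this covers: a pointer formed by selecting between
two locals still points to local memory:

  int f(int c) {
    int a = 0, b = 0;
    int *p = c ? &a : &b;   /* select of two allocas: still local */
    *p = 1;
    return a + b;
  }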
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92836 91177308-0d34-0410-b5e6-96231b3b80d8