turning (fptrunc (sqrt (fpext x))) -> (sqrtf x) is great, but we have
to delete the original sqrt as well. Not doing so causes us to do
two sqrt's when building with -fmath-errno (the default on linux).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@113260 91177308-0d34-0410-b5e6-96231b3b80d8
framework, which is good at ripping through bitfield
operations. This generalize a bunch of the existing
xforms that instcombine does, such as
(x << c) >> c -> and
to handle intermediate logical nodes. This is useful for
ripping up the "promote to large integer" code produced by
SRoA.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112304 91177308-0d34-0410-b5e6-96231b3b80d8
computation can be truncated if it is fed by a sext/zext that doesn't
have to be exactly equal to the truncation result type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112285 91177308-0d34-0410-b5e6-96231b3b80d8
by the SRoA "promote to large integer" code, eliminating
some type conversions like this:
%94 = zext i16 %93 to i32 ; <i32> [#uses=2]
%96 = lshr i32 %94, 8 ; <i32> [#uses=1]
%101 = trunc i32 %96 to i8 ; <i8> [#uses=1]
This also unblocks other xforms from happening, now clang is able to compile:
struct S { float A, B, C, D; };
float foo(struct S A) { return A.A + A.B+A.C+A.D; }
into:
_foo: ## @foo
## BB#0: ## %entry
pshufd $1, %xmm0, %xmm2
addss %xmm0, %xmm2
movdqa %xmm1, %xmm3
addss %xmm2, %xmm3
pshufd $1, %xmm1, %xmm0
addss %xmm3, %xmm0
ret
on x86-64, instead of:
_foo: ## @foo
## BB#0: ## %entry
movd %xmm0, %rax
shrq $32, %rax
movd %eax, %xmm2
addss %xmm0, %xmm2
movapd %xmm1, %xmm3
addss %xmm2, %xmm3
movd %xmm1, %rax
shrq $32, %rax
movd %eax, %xmm0
addss %xmm3, %xmm0
ret
This seems pretty close to optimal to me, at least without
using horizontal adds. This also triggers in lots of other
code, including SPEC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@112278 91177308-0d34-0410-b5e6-96231b3b80d8
with a vector input and output into a shuffle vector. This sort of
sequence happens when the input code stores with one type and reloads
with another type and then SROA promotes to i96 integers, which make
everyone sad.
This fixes rdar://7896024
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@103354 91177308-0d34-0410-b5e6-96231b3b80d8
and T->isPointerTy(). Convert most instances of the first form to the second form.
Requested by Chris.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@96344 91177308-0d34-0410-b5e6-96231b3b80d8
what it does. Enhance it to return false to optimizing vector
sign extensions from vector comparisions, which is the idiom used
to get a splatted vector for a vector comparison.
Doing this breaks vector-casts.ll, add some compensating
transformations to handle the important case they cover without
depending on this canonicalization.
This fixes rdar://7434900 a serious pessimization of vector compares.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@95855 91177308-0d34-0410-b5e6-96231b3b80d8
"sext cond" instead of a select. This simplifies some instcombine
code, matches the policy for zext (cond ? 1 : 0 -> zext), and allows
us to generate better code for a testcase on ppc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@94339 91177308-0d34-0410-b5e6-96231b3b80d8
aggressive changed the canonical form from sext(trunc(x)) to ashr(lshr(x)),
make sure to transform a couple more things into that canonical form,
and catch a case where we missed turning zext/shl/ashr into a single sext.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93787 91177308-0d34-0410-b5e6-96231b3b80d8
codegen has no apparent problem with the trunc version of this, because it turns
into a simple subreg idiom
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93202 91177308-0d34-0410-b5e6-96231b3b80d8
trunc has multiple uses. Codegen is not able to coalesce the subreg case
correctly and so this leads to higher register pressure and spilling (see PR5997).
This speeds up 256.bzip2 from 8.60 -> 8.04s on my machine, ~7%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93200 91177308-0d34-0410-b5e6-96231b3b80d8
BitsToClear case. This allows it to promote expressions which have an
and/or/xor after the lshr, promoting cases like test2 (from PR4216)
and test3 (random extample extracted from a spec benchmark).
clang now compiles the code in PR4216 into:
_test_bitfield: ## @test_bitfield
movl %edi, %eax
orl $194, %eax
movl $4294902010, %ecx
andq %rax, %rcx
orl $32768, %edi
andq $39936, %rdi
movq %rdi, %rax
orq %rcx, %rax
ret
instead of:
_test_bitfield: ## @test_bitfield
movl %edi, %eax
orl $194, %eax
movl $4294902010, %ecx
andq %rax, %rcx
shrl $8, %edi
orl $128, %edi
shlq $8, %rdi
andq $39936, %rdi
movq %rdi, %rax
orq %rcx, %rax
ret
which is still not great, but is progress.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93145 91177308-0d34-0410-b5e6-96231b3b80d8
new BitsToClear result which allows us to start promoting
expressions that end with a lshr-by-constant. This is
conservatively correct and better than what we had before
(see testcases) but still needs to be extended further.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93144 91177308-0d34-0410-b5e6-96231b3b80d8
the zext dest type. This allows us to handle test52/53 in cast.ll,
and allows llvm-gcc to generate much better code for PR4216 in -m64
mode:
_test_bitfield: ## @test_bitfield
orl $32962, %edi
movl %edi, %eax
andl $-25350, %eax
ret
This also fixes a bug handling vector extends, ensuring that the
mask produced is a vector constant, not an integer constant.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93127 91177308-0d34-0410-b5e6-96231b3b80d8
elimination of a sign extend to be a win, which simplifies
the client of CanEvaluateSExtd, and allows us to eliminate
more casts (examples taken from real code).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93109 91177308-0d34-0410-b5e6-96231b3b80d8
lshr+ashr instead of trunc+sext. We want to avoid type
conversions whenever possible, it is easier to codegen expressions
without truncates and extensions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93107 91177308-0d34-0410-b5e6-96231b3b80d8
bits known clear in the result and don't care about the # casts
eliminated. TD is also dead but keeping it for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93098 91177308-0d34-0410-b5e6-96231b3b80d8
1) don't try to optimize a sext or zext that is only used by a trunc, let
the trunc get optimized first. This avoids some pointless effort in
some common cases since instcombine scans down a block in the first pass.
2) Change the cost model for zext elimination to consider an 'and' cheaper
than a zext. This allows us to do it more aggressively, and for the next
patch to simplify the code quite a bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@93097 91177308-0d34-0410-b5e6-96231b3b80d8