addrec in a different loop to check the value being added to
the accumulated Start value, not the Start value before it has
the new value added to it. This prevents LSR from going crazy
on the included testcase. Dale, please review.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64440 91177308-0d34-0410-b5e6-96231b3b80d8
after sorting by stride value. This prevents it from missing
IV reuse opportunities in a host-sensitive manner.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64415 91177308-0d34-0410-b5e6-96231b3b80d8
loop induction on LP64 targets. When the induction variable is
used in addressing, IndVars now is usually able to inserst a
64-bit induction variable and eliminates the sign-extending cast.
This is also useful for code using C "short" types for
induction variables on targets with 32-bit addressing.
Inserting a wider induction variable is easy; the tricky part is
determining when trunc(sext(i)) expressions are no-ops. This
requires range analysis of the loop trip count. A common case is
when the original loop iteration starts at 0 and exits when the
induction variable is signed-less-than a fixed value; this case
is now handled.
This replaces IndVarSimplify's OptimizeCanonicalIVType. It was
doing the same optimization, but it was limited to loops with
constant trip counts, because it was running after the loop
rewrite, and the information about the original induction
variable is lost by that point.
Rename ScalarEvolution's executesAtLeastOnce to
isLoopGuardedByCond, generalize it to be able to test for
ICMP_NE conditions, and move it to be a public function so that
IndVars can use it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@64407 91177308-0d34-0410-b5e6-96231b3b80d8
accessed at least once as a vector. This prevents it from
compiling the example in not-a-vector into:
define double @test(double %A, double %B) {
%tmp4 = insertelement <7 x double> undef, double %A, i32 0
%tmp = insertelement <7 x double> %tmp4, double %B, i32 4
%tmp2 = extractelement <7 x double> %tmp, i32 4
ret double %tmp2
}
instead, producing the integer code. Producing vectors when they
aren't otherwise in the program is dangerous because a lot of other
code treats them carefully and doesn't want to break them down.
OTOH, many things want to break down tasty i448's.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63638 91177308-0d34-0410-b5e6-96231b3b80d8
With the new world order, it can handle cases where the first
store into the alloca is an element of the vector, instead of
requiring the first analyzed store to have the vector type
itself. This allows us to un-xfail
test/CodeGen/X86/vec_ins_extract.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63590 91177308-0d34-0410-b5e6-96231b3b80d8
turn icmp eq a+x, b+x into icmp eq a, b if a+x or b+x has other uses. This
may have been increasing register pressure leading to the bzip2 slowdown.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63487 91177308-0d34-0410-b5e6-96231b3b80d8
improvements to the EvaluateInDifferentType code. This code works
by just inserted a bunch of new code and then seeing if it is
useful. Instcombine is not allowed to do this: it can only insert
new code if it is useful, and only when it is converging to a more
canonical fixed point. Now that we iterate when DCE makes progress,
this causes an infinite loop when the code ends up not being used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63483 91177308-0d34-0410-b5e6-96231b3b80d8
simplifydemandedbits to simplify instructions with *multiple
uses* in contexts where it can get away with it. This allows
it to simplify the code in multi-use-or.ll into a single 'add
double'.
This change is particularly interesting because it will cover
up for some common codegen bugs with large integers created due
to the recent SROA patch. When working on fixing those bugs,
this should be disabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63481 91177308-0d34-0410-b5e6-96231b3b80d8
Now, if it detects that "V" is the same as some other value,
SimplifyDemandedBits returns the new value instead of RAUW'ing it immediately.
This has two benefits:
1) simpler code in the recursive SimplifyDemandedBits routine.
2) it allows future fun stuff in instcombine where an operation has multiple
uses and can be simplified in one context, but not all.
#2 isn't implemented yet, this patch should have no functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@63479 91177308-0d34-0410-b5e6-96231b3b80d8