Andrew Trick 616011418e Fix ScalarEvolution::ComputeExitLimitFromCond for 'or' conditions.
Fixes PR16130 - clang produces incorrect code with loop/expression at -O2.

This is a 2+ year old bug that's now holding up the release. It's a
case where we knowingly made aggressive assumptions about undefined
behavior. These assumptions are wrong when SCEV is computing a
subexpression that does not directly control the branch. With this
fix, we avoid making assumptions in those cases but still optimize the
common case. SCEV's trip count computation for exits controlled by
'or' expressions is now analagous to the trip count computation for
loops with multiple exits. I had already fixed the multiple exit case
to be conservative.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182989 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-31 06:43:25 +00:00
..

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n), however
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,

ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))

//===---------------------------------------------------------------------===//