mirror of
https://github.com/c64scene-ar/llvm-6502.git
synced 2025-03-17 05:31:32 +00:00
Summary: Often filter-like loops will do memory accesses that are separated by constant offsets. In these cases it is common that we will exceed the threshold for the allowable number of checks. However, it should be possible to merge such checks, sice a check of any interval againt two other intervals separated by a constant offset (a,b), (a+c, b+c) will be equivalent with a check againt (a, b+c), as long as (a,b) and (a+c, b+c) overlap. Assuming the loop will be executed for a sufficient number of iterations, this will be true. If not true, checking against (a, b+c) is still safe (although not equivalent). As long as there are no dependencies between two accesses, we can merge their checks into a single one. We use this technique to construct groups of accesses, and then check the intervals associated with the groups instead of checking the accesses directly. Reviewers: anemet Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10386 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241673 91177308-0d34-0410-b5e6-96231b3b80d8
Analysis Opportunities: //===---------------------------------------------------------------------===// In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the ScalarEvolution expression for %r is this: {1,+,3,+,2}<loop> Outside the loop, this could be evaluated simply as (%n * %n), however ScalarEvolution currently evaluates it as (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n)) In addition to being much more complicated, it involves i65 arithmetic, which is very inefficient when expanded into code. //===---------------------------------------------------------------------===// In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll, ScalarEvolution is forming this expression: ((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32))) This could be folded to (-1 * (trunc i64 undef to i32)) //===---------------------------------------------------------------------===//