mirror of
https://github.com/c64scene-ar/llvm-6502.git
synced 2025-01-18 13:34:04 +00:00
3cda2d3885
Replace std::map with SmallVector to memorize the cached result since SCEV usually belongs to little Loop/BB Linear scan on SmallVector is faster than std::map. Code reviewer : Andrew Trick. Test result : Pass Unit Test & LLVM Test Suite 401.bzip2 0.425721 0.419981 101.37% 403.gcc 24.53855 24.2667 101.12% 429.mcf 0.060847 0.059944 101.51% 433.milc 0.646009 0.636119 101.55% 444.namd 1.383928 1.370614 100.97% 445.gobmk 5.836575 5.800225 100.63% 450.soplex 1.911257 1.895963 100.81% 456.hmmer 1.039565 1.032534 100.68% 458.sjeng 0.897401 0.885567 101.34% 464.h264ref 3.645908 3.577991 101.90% 470.lbm 0.049456 0.048398 102.19% 471.omnetpp 5.638575 5.60435 100.61% bitmnp01 0.045738 0.045291 100.99% cjpegv2data 0.304359 0.302833 100.50% idctrn01 0.046433 0.045763 101.46% quake2 4.534416 4.4952 100.87% quake 2.688566 2.659208 101.10% xcsoar 12.42545 12.30385 100.99% linpack 0.038739 0.03803 101.86% matrix01 0.053564 0.0528 101.45% nbench 0.402867 0.395803 101.78% tblook01 0.021265 0.021015 101.19% ttsprk01 0.066384 0.065566 101.25% git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194459 91177308-0d34-0410-b5e6-96231b3b80d8
Analysis Opportunities: //===---------------------------------------------------------------------===// In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the ScalarEvolution expression for %r is this: {1,+,3,+,2}<loop> Outside the loop, this could be evaluated simply as (%n * %n), however ScalarEvolution currently evaluates it as (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n)) In addition to being much more complicated, it involves i65 arithmetic, which is very inefficient when expanded into code. //===---------------------------------------------------------------------===// In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll, ScalarEvolution is forming this expression: ((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32))) This could be folded to (-1 * (trunc i64 undef to i32)) //===---------------------------------------------------------------------===//