llvm-6502/test/Transforms/LoopVectorize/X86
Hal Finkel 6bbb01bbf8 Move partial/runtime unrolling late in the pipeline
The generic (concatenation) loop unroller is currently placed early in the
standard optimization pipeline. This is a good place to perform full unrolling,
but not the right place to perform partial/runtime unrolling. Most targets
don't enable partial/runtime unrolling, though, so this never mattered.

However, even some x86 cores benefit from partial/runtime unrolling of very
small loops, and follow-up commits will enable this. First, we need to move
partial/runtime unrolling late in the optimization pipeline (importantly, this
is after SLP and loop vectorization, as vectorization can drastically change
the size of a loop), while keeping the full unrolling where it is now. This
change does just that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205264 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-31 23:23:51 +00:00
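
For readers unfamiliar with the terminology in the commit message above, here is a hand-written sketch of what runtime unrolling does to a very small loop. It is an illustration only, not output from the LLVM unroller: the function names `sum_rolled` and `sum_unrolled4` and the unroll factor of 4 are assumptions chosen for the example. Because the trip count is only known at run time, the unrolled form needs a remainder loop for the leftover iterations.

```c
#include <stddef.h>

/* Original small loop: one add per iteration plus loop overhead
   (compare, increment, branch). */
int sum_rolled(const int *a, size_t n) {
    int s = 0;
    for (size_t i = 0; i < n; ++i)
        s += a[i];
    return s;
}

/* Hand-written equivalent of runtime unrolling by 4 (factor is an
   assumption for illustration). The main loop does four adds per
   branch; the remainder loop handles the n % 4 leftover iterations. */
int sum_unrolled4(const int *a, size_t n) {
    int s = 0;
    size_t i = 0;
    size_t main_end = n - (n % 4); /* largest multiple of 4 that is <= n */
    for (; i < main_end; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    for (; i < n; ++i) /* remainder loop */
        s += a[i];
    return s;
}
```

The sketch also hints at why ordering matters: if the vectorizer first rewrites the loop body to process several elements per iteration, the body the unroller sees is a different size, so an unroll decision made beforehand would have been based on the wrong loop.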
..
already-vectorized.ll Move partial/runtime unrolling late in the pipeline 2014-03-31 23:23:51 +00:00
avx1.ll
constant-vector-operand.ll
conversion-cost.ll
cost-model.ll
fp32_to_uint32-cost-model.ll [X86] Adjust cost of FP_TO_UINT v8f32->v8i32 2014-03-30 18:07:13 +00:00
fp64_to_uint32-cost-model.ll [X86] Adjust cost of FP_TO_UINT v4f64->v4i32 as well 2014-03-31 21:54:48 +00:00
fp_to_sint8-cost-model.ll
gather-cost.ll
gcc-examples.ll
illegal-parallel-loop-uniform-write.ll
lit.local.cfg
metadata-enable.ll
min-trip-count-switch.ll
no-vector.ll
parallel-loops-after-reg2mem.ll
parallel-loops.ll
rauw-bug.ll
reduction-crash.ll
small-size.ll
struct-store.ll
tripcount.ll
uint64_to_fp64-cost-model.ll [X86][Vectorizer Cost Model] Correct vectorization cost model for v2i64->v2f64 2014-03-27 00:52:16 +00:00
unroll_selection.ll
unroll-pm.ll
unroll-small-loops.ll
vector_ptr_load_store.ll
vector-scalar-select-cost.ll
x86_fp80-vector-store.ll