llvm-6502/test/CodeGen/NVPTX
Jingyue Wu 75d77cd179 [MachineSink] Use the real post dominator tree
Summary:
Fixes a FIXME in MachineSinking. Instead of using the simple heuristics in
isPostDominatedBy, use the real MachinePostDominatorTree and MachineLoopInfo.
The old heuristics caused instructions to sink unnecessarily, and might create
register pressure.

This is the second try of the fix. The first one (D4814) caused a performance
regression due to failing to sink instructions out of loops (PR21115). This
patch fixes PR21115 by sinking an instruction from a deeper loop to a shallower
one regardless of whether the target block post-dominates the source.

Thanks Alexey Volkov for reporting PR21115! 

Test Plan:
Added a NVPTX codegen test to verify that our change prevents the backend from
over-sinking. It also shows the unnecessary register pressure caused by
over-sinking.

Added an X86 test to verify we can sink instructions out of loops regardless of
the dominance relationship. This test is reduced from Alexey's test in PR21115.

Updated an affected test in X86.

Also ran SPEC CINT2006 and llvm-test-suite for compilation time and runtime
performance. Results are attached separately in the review thread.

Reviewers: Jiangning, resistor, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, bruno, volkalexey, llvm-commits, meheff, eliben, jholewinski

Differential Revision: http://reviews.llvm.org/D5633

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219773 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 03:27:43 +00:00
..
access-non-generic.ll Canonicalize addrspacecast ConstExpr between different pointer types 2014-06-15 21:40:57 +00:00
add-128bit.ll [NVPTX] 64-bit ADDC/ADDE are not legal 2013-07-01 12:59:04 +00:00
addrspacecast-gvar.ll [NVPTX] Add support for addrspacecast in global variable initializers, including emitting generic() when casting to address space 0. 2014-04-09 15:39:11 +00:00
addrspacecast.ll Optimize away unnecessary address casts. 2014-04-03 21:18:25 +00:00
aggr-param.ll [NVPTX] Fix emitting aggregate parameters 2014-01-28 18:35:29 +00:00
annotations.ll
arg-lowering.ll [NVPTX] Clean up argument lowering code and properly handle alignment for structs and vectors 2014-06-27 18:35:44 +00:00
arithmetic-fp-sm20.ll [NVPTX] Improve handling of FP fusion 2014-07-17 18:10:09 +00:00
arithmetic-int.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
atomics.ll Add some tests for NVPTX lowering of cmpxchg 2014-07-21 22:54:44 +00:00
bfe.ll [NVPTX] Add isel patterns for bit-field extract (bfe) 2014-06-27 18:35:27 +00:00
bug17709.ll Remove the linker_private and linker_private_weak linkages. 2014-03-13 23:18:37 +00:00
call-with-alloca-buffer.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
callchain.ll [NVPTX] Fix handling of indirect calls 2013-11-15 12:30:04 +00:00
calling-conv.ll
compare-int.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
constant-vectors.ll [NVPTX] Make constant vector test case endian-independent 2013-09-19 13:14:44 +00:00
convert-fp.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
convert-int-sm20.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
ctlz.ll [NVPTX] Add support for cttz/ctlz/ctpop 2013-06-28 17:58:07 +00:00
ctpop.ll Convert CodeGen/*/*.ll tests to use the new CHECK-LABEL for easier debugging. No functionality change and all tests pass after conversion. 2013-07-13 20:38:47 +00:00
cttz.ll [NVPTX] Add support for cttz/ctlz/ctpop 2013-06-28 17:58:07 +00:00
div-ri.ll [NVPTX] Add missing patterns for div.approx with immediate denominator 2014-01-21 14:40:05 +00:00
envreg.ll [NVPTX] Add support for envreg reads 2014-06-27 18:35:21 +00:00
fast-math.ll [NVPTX] Use approximate FP ops when unsafe-fp-math is used, and append 2013-07-22 12:18:04 +00:00
fma-disable.ll
fma.ll [NVPTX] Improve handling of FP fusion 2014-07-17 18:10:09 +00:00
fp16.ll NVPTX: support direct f16 <-> f64 conversions via intrinsics. 2014-07-18 08:30:10 +00:00
fp-contract.ll [NVPTX] Improve handling of FP fusion 2014-07-17 18:10:09 +00:00
fp-literals.ll [NVPTX] Improve handling of FP fusion 2014-07-17 18:10:09 +00:00
generic-to-nvvm.ll [NVPTX] Add support for selecting CUDA vs OCL mode based on triple 2013-06-21 18:51:49 +00:00
global-ordering.ll
gvar-init.ll [NVPTX] Error out if initializer is given for variable in an address space that does not support initialization 2014-06-27 18:36:01 +00:00
half.ll NVPTX: support fpext/fptrunc to and from f16. 2014-07-18 13:01:43 +00:00
i1-global.ll [NVPTX] Add support for selecting CUDA vs OCL mode based on triple 2013-06-21 18:51:49 +00:00
i1-int-to-fp.ll [NVPTX] Add missing patterns for i1 [s,u]int_to_fp 2013-08-06 14:13:34 +00:00
i1-param.ll [NVPTX] Add support for selecting CUDA vs OCL mode based on triple 2013-06-21 18:51:49 +00:00
i8-param.ll Convert CodeGen/*/*.ll tests to use the new CHECK-LABEL for easier debugging. No functionality change and all tests pass after conversion. 2013-07-13 20:38:47 +00:00
imad.ll [NVPTX] Implement fma and imad contraction as target DAGCombiner patterns 2014-06-27 18:35:37 +00:00
implicit-def.ll [NVPTX] Improve handling of FP fusion 2014-07-17 18:10:09 +00:00
inline-asm.ll [NVPTX] Add 'b' asm constraint 2014-06-27 18:36:06 +00:00
intrin-nocapture.ll
intrinsic-old.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
intrinsics.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
isspacep.ll [NVPTX] Add support for isspacep instruction 2014-06-27 18:35:24 +00:00
ld-addrspace.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
ld-generic.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
ldparam-v4.ll [NVPTX] Fix off-by-one error when creating the VT list for an SDNode 2013-12-05 12:58:00 +00:00
ldu-i8.ll [NVPTX] Make the alignment an explicit argument to ldu/ldg 2014-08-29 15:30:20 +00:00
ldu-ldg.ll [NVPTX] Make the alignment an explicit argument to ldu/ldg 2014-08-29 15:30:20 +00:00
ldu-reg-plus-offset.ll [NVPTX] Make the alignment an explicit argument to ldu/ldg 2014-08-29 15:30:20 +00:00
lit.local.cfg Reduce verbiage of lit.local.cfg files 2014-06-09 22:42:55 +00:00
load-sext-i1.ll [NVPTX] Add support for selecting CUDA vs OCL mode based on triple 2013-06-21 18:51:49 +00:00
local-stack-frame.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
machine-sink.ll [MachineSink] Use the real post dominator tree 2014-10-15 03:27:43 +00:00
managed.ll [NVPTX] Add support for .managed variables for UVM 2014-06-27 18:35:58 +00:00
misaligned-vector-ldst.ll [NVPTX] Honor alignment on vector loads/stores 2014-07-16 19:45:35 +00:00
module-inline-asm.ll [NVPTX] Add support for module-scope inline asm 2013-07-01 13:00:14 +00:00
mulwide.ll [NVPTX] Add some extra tests for mul.wide to test non-power-of-two source types 2014-07-23 20:23:49 +00:00
noduplicate-syncthreads.ll Expose "noduplicate" attribute as a property for intrinsics. 2014-03-18 23:51:07 +00:00
nvvm-reflect.ll [NVPTX] Add reflect intrinsic (better than matching by function name) 2014-06-27 18:36:11 +00:00
param-align.ll
pr13291-i1-store.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
pr16278.ll [NVPTX] Remove old CONST_NOT_GEN address space that is not being used anymore and causes constants to be emitted in the global address space 2013-06-10 13:29:47 +00:00
pr17529.ll [NVPTX] Switch from StrongPHIElimination to PHIElimination in NVPTXTargetMachine, and add some missing optimization passes to addOptimizedRegAlloc 2013-10-11 12:39:39 +00:00
ptx-version-30.ll
ptx-version-31.ll
refl1.ll [NVPTX] Add support for selecting CUDA vs OCL mode based on triple 2013-06-21 18:51:49 +00:00
rotate.ll [NVPTX] Add support for efficient rotate instructions on SM 3.2+ 2014-06-27 18:35:33 +00:00
rsqrt.ll Convert CodeGen/*/*.ll tests to use the new CHECK-LABEL for easier debugging. No functionality change and all tests pass after conversion. 2013-07-13 20:38:47 +00:00
sched1.ll
sched2.ll
sext-in-reg.ll Convert CodeGen/*/*.ll tests to use the new CHECK-LABEL for easier debugging. No functionality change and all tests pass after conversion. 2013-07-13 20:38:47 +00:00
sext-params.ll [NVPTX] Handle signext/zeroext attributes properly 2013-07-01 12:58:58 +00:00
shift-parts.ll [NVPTX] Add support for [SHL,SRA,SRL]_PARTS 2014-06-27 18:35:40 +00:00
simple-call.ll
sm-version-20.ll
sm-version-21.ll
sm-version-30.ll
sm-version-35.ll
st-addrspace.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
st-generic.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
surf-read-cuda.ll [NVPTX] Add more surface/texture intrinsics, including CUDA unified texture fetch 2014-07-17 11:59:04 +00:00
surf-read.ll [NVPTX] Add preliminary intrinsics and codegen support for textures/surfaces 2014-04-09 15:39:15 +00:00
surf-write-cuda.ll [NVPTX] Add more surface/texture intrinsics, including CUDA unified texture fetch 2014-07-17 11:59:04 +00:00
surf-write.ll [NVPTX] Add preliminary intrinsics and codegen support for textures/surfaces 2014-04-09 15:39:15 +00:00
symbol-naming.ll Fix for PR19099 - NVPTX produces invalid symbol names. 2014-03-31 15:56:26 +00:00
tex-read-cuda.ll [NVPTX] Add more surface/texture intrinsics, including CUDA unified texture fetch 2014-07-17 11:59:04 +00:00
tex-read.ll [NVPTX] Add more surface/texture intrinsics, including CUDA unified texture fetch 2014-07-17 11:59:04 +00:00
texsurf-queries.ll [NVPTX] Flag surface/texture query instructions with IsTexSurfQuery 2014-07-17 14:51:33 +00:00
tuple-literal.ll
vec8.ll [NVPTX] Fix logic error in loading vector parameters of more than 4 components 2013-11-11 19:28:16 +00:00
vec-param-load.ll Fix non-deterministic SDNodeOrder-dependent codegen 2014-01-12 14:09:17 +00:00
vector-args.ll [NVPTX] Add support for vectorized function return values 2013-06-28 17:57:55 +00:00
vector-call.ll [NVPTX] Add missing .v4 qualifier on vector store instruction 2014-07-17 16:58:56 +00:00
vector-compare.ll
vector-loads.ll
vector-select.ll
vector-stores.ll Add a target legalize hook for SplitVectorOperand (again) 2013-07-26 13:28:29 +00:00
weak-global.ll [NVPTX] Emit .weak linkage for link_once, weak, available_externally, and common linkage 2014-06-27 18:35:56 +00:00
weak-linkage.ll [NVPTX] Emit .weak when linkage is not external, internal, or private 2014-06-27 18:35:10 +00:00