llvm-6502/lib
Bruno Cardoso Lopes ba059464c3 [x86] Add vector @llvm.ctpop intrinsic custom lowering
Currently, when ctpop is supported for scalar types, the expansion of
@llvm.ctpop.vXiY uses vector element extractions, insertions and individual
calls to @llvm.ctpop.iY. When scalar ctpop is not supported, those scalar
calls are in turn expanded with bit-math operations.

Local Haswell measurements show that we can improve vector @llvm.ctpop.vXiY
expansion in some cases by using a vector parallel bit twiddling
approach, based on:

v = v - ((v >> 1) & 0x55555555);
v = (v & 0x33333333) + ((v >> 2) & 0x33333333);
v = (v + (v >> 4)) & 0x0F0F0F0F;
v = v + (v >> 8);
v = v + (v >> 16);
v = v & 0x0000003F;
(from http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel)
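
As an illustration only, the steps above can be written as a minimal scalar C
sketch for a single 32-bit lane. This is not the x86 lowering code from this
commit; the function names popcount32_parallel, popcount32_naive and
check_popcount32 are hypothetical and exist only for this example. The custom
lowering applies the same sequence of shifts, masks and adds to every lane of
the vector at once.

```c
#include <assert.h>
#include <stdint.h>

/* Parallel bit count of one 32-bit value (hypothetical helper name). */
static uint32_t popcount32_parallel(uint32_t v)
{
    v = v - ((v >> 1) & 0x55555555u);                 /* counts per 2-bit field */
    v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u); /* counts per 4-bit field */
    v = (v + (v >> 4)) & 0x0F0F0F0Fu;                 /* counts per byte */
    v = v + (v >> 8);                                 /* fold adjacent bytes */
    v = v + (v >> 16);                                /* fold the two halves */
    return v & 0x0000003Fu;                           /* total is at most 32 */
}

/* Bit-at-a-time reference, used only to cross-check the version above. */
static uint32_t popcount32_naive(uint32_t v)
{
    uint32_t n = 0;
    while (v != 0) {
        n += v & 1u;
        v >>= 1;
    }
    return n;
}

/* Aborts via assert if the two implementations disagree for v. */
static void check_popcount32(uint32_t v)
{
    assert(popcount32_parallel(v) == popcount32_naive(v));
}
```

Calling check_popcount32 on a handful of inputs (zero, all ones, single bits)
is a quick way to confirm that the parenthesization of the third step is
right: the mask has to be applied after the addition, not to (v >> 4) alone.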

When scalar ctpop isn't supported, the approach above performs better for
v2i64, v4i32, v4i64 and v8i32 (see numbers below). And even when scalar ctpop
is supported, this approach performs ~2x better for v8i32.

Here, x86_64 implies -march=corei7-avx without ctpop and x86_64h includes ctpop
support with -march=core-avx2.

== [x86_64h - new]
v8i32: 0.661685
v4i32: 0.514678
v4i64: 0.652009
v2i64: 0.324289
== [x86_64h - old]
v8i32: 1.29578
v4i32: 0.528807
v4i64: 0.65981
v2i64: 0.330707

== [x86_64 - new]
v8i32: 1.003
v4i32: 0.656273
v4i64: 1.11711
v2i64: 0.754064
== [x86_64 - old]
v8i32: 2.34886
v4i32: 1.72053
v4i64: 1.41086
v2i64: 1.0244

More work for other vector types will come next.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224725 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-22 19:45:43 +00:00
..
Analysis InstSimplify: Don't bother if getScalarSizeInBits returns zero 2014-12-20 04:45:33 +00:00
AsmParser IR: Make metadata typeless in assembly 2014-12-15 19:07:53 +00:00
Bitcode Fix Visual C++ error "'llvm::make_unique' : ambiguous call to overloaded function". 2014-12-18 10:03:35 +00:00
CodeGen [CodeGenPrepare] Handle properly the promotion of operands when this does not 2014-12-22 18:11:52 +00:00
DebugInfo [DebugInfo] Move all DWARF headers to the public include directory. 2014-12-19 18:26:33 +00:00
ExecutionEngine [C API] Expose LLVMGetGlobalValueAddress and LLVMGetFunctionAddress. 2014-12-22 18:53:11 +00:00
IR The leak detector is dead, long live asan and valgrind. 2014-12-22 13:00:36 +00:00
IRReader
LineEditor
Linker Rename MapValue(Metadata*) to MapMetadata() 2014-12-19 06:06:18 +00:00
LTO LTO: Lazy-load LTOModule in local contexts 2014-12-17 22:05:42 +00:00
MC Remove unused header. NFC. 2014-12-22 19:09:15 +00:00
Object Add printing the LC_ROUTINES load commands with llvm-objdump’s -private-headers. 2014-12-19 22:25:22 +00:00
Option
ProfileData
Support Add missing implementation of 'sys::path::is_other' to the support library. 2014-12-18 18:19:47 +00:00
TableGen Clean up static analyzer warnings. 2014-12-12 21:48:03 +00:00
Target [x86] Add vector @llvm.ctpop intrinsic custom lowering 2014-12-22 19:45:43 +00:00
Transforms InstCombine: Squash an icmp+select into bitwise arithmetic 2014-12-20 04:45:35 +00:00
CMakeLists.txt
LLVMBuild.txt
Makefile