llvm-6502/test
Sanjay Patel e7c966f067 Use rcpss/rcpps (X86) to speed up reciprocal calcs (PR21385).
This is a first step for generating SSE rcp instructions for reciprocal
calcs when fast-math allows it. This is very similar to the rsqrt optimization
enabled in D5658 ( http://reviews.llvm.org/rL220570 ).

For now, be conservative and only enable this for AMD btver2 where performance
improves significantly both in terms of latency and throughput.

We may never enable this codegen for Intel Core* chips because the divider circuits
are just too fast. On SandyBridge, divss can be as fast as 10 cycles versus the 21
cycle critical path for the rcp + mul + sub + mul + add estimate.

Follow-on patches may allow configuration of the number of Newton-Raphson refinement
steps, add AVX512 support, and enable the optimization for more chips.

More background here: http://llvm.org/bugs/show_bug.cgi?id=21385

Differential Revision: http://reviews.llvm.org/D6175



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221706 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-11 20:51:00 +00:00
..
Analysis [X86] Custom lower UINT_TO_FP from v4f32 to v4i32, and for v8f32 to v8i32 if 2014-11-11 02:23:47 +00:00
Assembler Use FileCheck in a few tests. 2014-11-06 15:05:51 +00:00
Bindings [OCaml] Run tests twice, with ocamlc and ocamlopt (if available) 2014-11-03 09:50:53 +00:00
Bitcode Revert "Revert "DI: Fold constant arguments into a single MDString"" 2014-10-03 20:01:09 +00:00
BugPoint Revert "Revert "DI: Fold constant arguments into a single MDString"" 2014-10-03 20:01:09 +00:00
CodeGen Use rcpss/rcpps (X86) to speed up reciprocal calcs (PR21385). 2014-11-11 20:51:00 +00:00
DebugInfo Provide gmlt-like inline scope information in the skeleton CU to facilitate symbolication without needing the .dwo files 2014-11-04 22:12:25 +00:00
ExecutionEngine [MCJIT] Defer application of AArch64 MachO GOT relocations until resolve time. 2014-10-21 23:41:15 +00:00
Feature Delete -std-compile-opts. 2014-10-16 20:00:02 +00:00
FileCheck
Instrumentation Base check on the section name, not the variable name. 2014-11-06 20:01:34 +00:00
Integer
JitListener Revert "Revert "DI: Fold constant arguments into a single MDString"" 2014-10-03 20:01:09 +00:00
Linker Copy externally_initialized in GlobalVariable::copyAttributesFrom. 2014-11-10 18:41:59 +00:00
LTO LTO: Add missing target triple from r218784 2014-10-01 18:49:58 +00:00
MC [mips] Add hardware register name "hwr_ulr" ($29) 2014-11-11 11:22:39 +00:00
Object [yaml2obj] Support AArch64 relocations. 2014-11-10 23:02:03 +00:00
Other
SymbolRewriter Transform: add SymbolRewriter pass 2014-11-07 21:32:08 +00:00
TableGen [AVX512] Added intrinsics for VPCMPEQB and VPCMPEQW. 2014-09-30 11:32:22 +00:00
tools llvm-objdump: Skip empty sections when dumping contents 2014-11-11 09:58:25 +00:00
Transforms Addition to r216371 (SLP and Loop Vectorization) and r218607 where 2014-11-11 07:39:27 +00:00
Unit
Verifier Extend the verifier to validate range metadata on calls and invokes. 2014-10-20 23:52:07 +00:00
YAMLParser
.clang-format
CMakeLists.txt Make llvm-go test dependency optional. 2014-10-23 19:51:40 +00:00
lit.cfg Only run the gold plugin tests if gold supports the targets we test with. 2014-11-11 05:27:12 +00:00
lit.site.cfg.in [OCaml] [autoconf] Migrate to ocamlfind. 2014-10-30 08:29:45 +00:00
Makefile [OCaml] [autoconf] Migrate to ocamlfind. 2014-10-30 08:29:45 +00:00
Makefile.tests
TestRunner.sh