mirror of
https://github.com/c64scene-ar/llvm-6502.git
synced 2025-02-23 20:29:30 +00:00
This is a first step for generating SSE rcp instructions for reciprocal calcs when fast-math allows it. This is very similar to the rsqrt optimization enabled in D5658 ( http://reviews.llvm.org/rL220570 ). For now, be conservative and only enable this for AMD btver2 where performance improves significantly both in terms of latency and throughput. We may never enable this codegen for Intel Core* chips because the divider circuits are just too fast. On SandyBridge, divss can be as fast as 10 cycles versus the 21 cycle critical path for the rcp + mul + sub + mul + add estimate. Follow-on patches may allow configuration of the number of Newton-Raphson refinement steps, add AVX512 support, and enable the optimization for more chips. More background here: http://llvm.org/bugs/show_bug.cgi?id=21385 Differential Revision: http://reviews.llvm.org/D6175 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221706 91177308-0d34-0410-b5e6-96231b3b80d8
Low Level Virtual Machine (LLVM) ================================ This directory and its subdirectories contain source code for the Low Level Virtual Machine, a toolkit for the construction of highly optimized compilers, optimizers, and runtime environments. LLVM is open source software. You may freely distribute it under the terms of the license agreement found in LICENSE.txt. Please see the documentation provided in docs/ for further assistance with LLVM, and in particular docs/GettingStarted.rst for getting started with LLVM and docs/README.txt for an overview of LLVM's documentation setup. If you're writing a package for LLVM, see docs/Packaging.rst for our suggestions.
Description
Languages
C++
48.7%
LLVM
38.5%
Assembly
10.2%
C
0.9%
Python
0.4%
Other
1.2%