llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-11-06 21:05:51 +00:00

Chandler Carruth e78a87b633 Teach instcombine to canonicalize "element extraction" from a load of an integer and "element insertion" into a store of an integer into actual element extraction, element insertion, and vector loads and stores.

Previously various parts of LLVM (including instcombine itself) would introduce integer loads and stores into the code as a way of opaquely loading and storing "bits". In some cases (such as a memcpy of std::complex<float> object) we will eventually end up using those bits in non-integer types. In order for SROA to effectively promote the allocas involved, it splits these "store a bag of bits" integer loads and stores up into the constituent parts. However, for non-alloca loads and stores which remain, it uses integer math to recombine the values into a large integer to load or store.

All of this would be "fine", except that it forces LLVM to go through integer math to combine and split up values. While this makes perfect sense for integers (and in fact is critical for bitfields to end up lowering efficiently) it is terrible for non-integer types, especially floating point types. We have a much more canonical way of representing the act of concatenating the bits of two SSA values in LLVM: a vector and insertelement. This patch teaches InstCombine to use this representation.

With this patch applied, LLVM will no longer introduce integer math into the critical path of every loop over std::complex<float> operations such as those that make up the hot path of ... oh, most HPC code, Eigen, and any other heavy linear algebra library.

For the record, I looked extensively at fixing this in other parts of the compiler, but it just doesn't work:
- We really do want to canonicalize memcpy and other bit-motion to integer loads and stores. SSA values are tremendously more powerful than "copy" intrinsics. Not doing this regresses massive amounts of LLVM's scalar optimizer.
- We really do need to split up integer loads and stores of this form in SROA or every memcpy of a trivially copyable struct will prevent SSA formation of the members of that struct. It essentially turns off SROA.
- The closest alternative is to actually split the loads and stores when partitioning with SROA, but this has all of the downsides historically discussed of splitting up loads and stores -- the wide-store information is fundamentally lost. We would also see performance regressions for bitfield-heavy code and other places where the integers aren't really intended to be split without seemingly arbitrary logic to treat integers totally differently.
- We can effectively fix this in instcombine, so it isn't that hard of a choice to make IMO.

Differential Revision: http://reviews.llvm.org/D6548

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223764 91177308-0d34-0410-b5e6-96231b3b80d8

2014-12-09 08:55:32 +00:00
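The two representations the commit message contrasts can be illustrated at the source level. The following is only a rough C++ analogy, not the patch itself and not LLVM IR: the helper names (`load_as_integer`, `lane_via_integer_math`, `lane_via_element_extract`) are hypothetical, and the shift-based variant assumes a little-endian target. It shows the same 64 bits of a std::complex<float> being taken apart once with integer shifts and truncation, and once as a two-element float "vector" that is simply indexed.

```cpp
#include <cassert>
#include <complex>
#include <cstdint>
#include <cstring>

static_assert(sizeof(std::complex<float>) == sizeof(std::uint64_t),
              "sketch assumes a 64-bit std::complex<float>");

// Copy the object as one opaque 64-bit "bag of bits", the way a
// memcpy may be lowered to a single integer load and store.
inline std::uint64_t load_as_integer(const std::complex<float>& z) {
    std::uint64_t bits;
    std::memcpy(&bits, &z, sizeof bits);
    return bits;
}

// Integer-math path: shift and truncate to isolate one 32-bit part,
// then reinterpret it as a float. ASSUMPTION: little-endian layout,
// so lane 0 (the real part) sits in the low 32 bits.
inline float lane_via_integer_math(std::uint64_t bits, unsigned lane) {
    assert(lane < 2);
    std::uint32_t part = static_cast<std::uint32_t>(bits >> (32u * lane));
    float f;
    std::memcpy(&f, &part, sizeof f);
    return f;
}

// Vector-style path: view the same bytes as two floats and index,
// analogous to a <2 x float> load followed by extractelement.
inline float lane_via_element_extract(std::uint64_t bits, unsigned lane) {
    assert(lane < 2);
    float lanes[2];
    std::memcpy(lanes, &bits, sizeof lanes);
    return lanes[lane];
}
```

Both helpers return the same value on a little-endian machine; the second form never routes the floating point data through integer arithmetic, which is the property the commit message argues for.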
Analysis	InstSimplify: Try to bring back the rest of r223583	2014-12-08 18:30:43 +00:00
AsmParser	Reland r223754	2014-12-09 05:56:09 +00:00
Bitcode	IR: Disallow function-local metadata attachments	2014-12-06 02:29:44 +00:00
CodeGen	Fix a few instances found in SelectionDAG where we were not handling F16 at parity with F32 and F64.	2014-12-09 06:50:39 +00:00
DebugInfo	Make DWARFAcceleratorTable::dump() const.	2014-11-20 16:21:11 +00:00
ExecutionEngine	[MCJIT] Unique-ptrify the RTDyldMemoryManager member of MCJIT. NFC.	2014-12-03 00:51:19 +00:00
IR	ConstantFold: Zero-sized globals might land on top of another global	2014-12-08 19:35:31 +00:00
IRReader
LineEditor
Linker	Skip declarations in the case of functions.	2014-12-09 08:20:06 +00:00
LTO
MC	clang-formatted ranged loops and assignment, NFC.	2014-12-04 08:30:39 +00:00
Object	Add mach-o LC_RPATH support to llvm-objdump	2014-12-04 07:37:02 +00:00
Option
ProfileData
Support	Silence warning: variable 'buffer' set but not used.	2014-12-04 21:36:38 +00:00
TableGen	Revert r222957 "Replace std::map<K, V*> with std::map<K, V> to handle ownership and deletion of the values."	2014-11-30 01:20:17 +00:00
Target	AVX-512: Added some comments to ERI scalar intrinsics.	2014-12-09 07:06:32 +00:00
Transforms	Teach instcombine to canonicalize "element extraction" from a load of an	2014-12-09 08:55:32 +00:00
CMakeLists.txt
LLVMBuild.txt
Makefile