llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-12-15 20:29:48 +00:00

Sanjay Patel 959b276771		2015-04-28 21:03:22 +00:00

transform fadd chains to increase parallelism

This is a compromise: with this simple patch, we should always handle a chain of exactly 3 operations optimally, but we're not generating the optimal balanced binary tree for a longer sequence.

In general, this transform will reduce the dependency chain for a sequence of instructions using N operands from a worst case N-1 dependent operations to N/2 dependent operations. The optimal balanced binary tree would reduce the chain to log2(N).

The trade-off for not dealing with longer sequences is: (1) we have less complexity in the compiler, (2) we avoid unknown compile-time blowup calculating a balanced tree, and (3) we don't need to worry about the increased register pressure required to parallelize longer sequences. It also seems unlikely that we would ever encounter really long strings of dependent ops like that in the wild, but I'm not sure how to verify that speculation. FWIW, I see no perf difference for test-suite running on btver2 (x86-64) with -ffast-math and this patch.

We can extend this patch to cover other associative operations such as fmul, fmax, fmin, integer add, integer mul.

This is a partial fix for:
https://llvm.org/bugs/show_bug.cgi?id=17305

and if extended:
https://llvm.org/bugs/show_bug.cgi?id=21768
https://llvm.org/bugs/show_bug.cgi?id=23116

The issue also came up in:
http://reviews.llvm.org/D8941

Differential Revision: http://reviews.llvm.org/D9232

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236031 91177308-0d34-0410-b5e6-96231b3b80d8
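As a rough illustration of the reassociation the commit message describes, the sketch below compares a serial chain of three additions with the regrouped form for four operands. It is not code from the patch: the function names `serial_sum` and `reassociated_sum` are hypothetical, and Python floats (IEEE-754 doubles) stand in for the fadd instructions the compiler would actually rewrite.

```python
def serial_sum(a, b, c, d):
    # Three dependent additions: each one waits for the previous result.
    return ((a + b) + c) + d


def reassociated_sum(a, b, c, d):
    # (a + b) and (c + d) are independent of each other, so only two
    # additions lie on the critical path.
    return (a + b) + (c + d)


# For inputs where no rounding occurs, both groupings agree.
print(serial_sum(1.0, 2.0, 3.0, 4.0), reassociated_sum(1.0, 2.0, 3.0, 4.0))

# Floating-point addition is not associative, so the groupings can differ
# when intermediate results are rounded.
print(serial_sum(1e16, 1.0, -1e16, 1.0), reassociated_sum(1e16, 1.0, -1e16, 1.0))
```

The second pair of calls gives different results for the two groupings, which is presumably why the commit message measures the patch under -ffast-math: regrouping floating-point additions is only valid when the compiler is allowed to ignore such rounding differences.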
AArch64	[AArch64] Also combine vector selects fed by non-i1 SETCCs.	2015-04-27 21:43:12 +00:00
ARM	Switch lowering: Take branch weight into account when ordering for fall-through	2015-04-27 23:35:22 +00:00
BPF	[opaque pointer type] Add textual IR support for explicit type parameter to the call instruction	2015-04-16 23:24:18 +00:00
CPP	[opaque pointer type] Add textual IR support for explicit type parameter to the call instruction	2015-04-16 23:24:18 +00:00
Generic	Switch lowering: Take branch weight into account when ordering for fall-through	2015-04-27 23:35:22 +00:00
Hexagon	[Hexagon] Use constant extenders to fix up hardware loops	2015-04-27 14:16:43 +00:00
Inputs	DebugInfo: Fix bad debug info for compile units and types	2015-03-27 20:46:33 +00:00
Mips	Reapply "[mips][FastISel] Implement shift ops for Mips fast-isel.""	2015-04-27 13:28:05 +00:00
MSP430	[opaque pointer type] Add textual IR support for explicit type parameter to gep operator	2015-03-13 18:20:45 +00:00
NVPTX	[NVPTX] Handle addrspacecast constant expressions in aggregate initializers	2015-04-28 17:18:30 +00:00
PowerPC	[PPC64LE] Remove unnecessary swaps from lane-insensitive vector computations	2015-04-27 19:57:34 +00:00
R600	R600: Fix up for AsmPrinter's OutStreamer being a unique_ptr	2015-04-28 17:37:03 +00:00
SPARC	[opaque pointer type] Add textual IR support for explicit type parameter to the call instruction	2015-04-16 23:24:18 +00:00
SystemZ	Allow memory intrinsics to be tail calls	2015-04-13 17:16:45 +00:00
Thumb	[opaque pointer type] Add textual IR support for explicit type parameter to the call instruction	2015-04-16 23:24:18 +00:00
Thumb2	Thumb2: When applying branch optimizations, visit branches in reverse order.	2015-04-23 20:31:35 +00:00
WinEH	[SEH] Implement GetExceptionCode in __except blocks	2015-04-24 20:25:05 +00:00
X86	transform fadd chains to increase parallelism	2015-04-28 21:03:22 +00:00
XCore	[opaque pointer type] Add textual IR support for explicit type parameter to the call instruction	2015-04-16 23:24:18 +00:00