Mirror of https://github.com/c64scene-ar/llvm-6502.git (synced 2024-12-15 20:29:48 +00:00)

Commit 8765e82c83
For code like this:

```llvm
define <8 x i32> @load_v8i32() {
  ret <8 x i32> <i32 7, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
}
```

we produce this AVX code:

```asm
_load_v8i32:                            ## @load_v8i32
        movl     $7, %eax
        vmovd    %eax, %xmm0
        vxorps   %ymm1, %ymm1, %ymm1
        vblendps $1, %ymm0, %ymm1, %ymm0 ## ymm0 = ymm0[0],ymm1[1,2,3,4,5,6,7]
        retq
```

There are at least 2 bugs in play here:

1. We're generating a blend when a scalar move does the same job using 2 fewer instruction bytes (see FIXMEs).
2. We're not matching an existing pattern that would eliminate the xor and blend entirely; the zero bytes are free with vmovd.

The 2nd fix involves an adjustment of "AddedComplexity" [1] and mostly masks the 1st problem.

[1] AddedComplexity has close to no documentation in the source. The best we have is this comment: "roughly corresponds to the number of nodes that are covered". It appears that x86 has bastardized this definition by inflating its values for some other undocumented reason. For example, we have a pattern with "AddedComplexity = 400" (!). I searched my way to this page: https://groups.google.com/forum/#!topic/llvm-dev/5UX-Og9M0xQ

Differential Revision: http://reviews.llvm.org/D8794

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233931 91177308-0d34-0410-b5e6-96231b3b80d8
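To make the AddedComplexity mechanism concrete, here is a hedged TableGen sketch modeled on the existing vzmovl patterns in X86InstrSSE.td. The predicate, complexity value, and pattern shown are illustrative assumptions, not the literal diff from this commit:

```tablegen
// Illustrative sketch only; the exact pattern and value touched by
// this commit may differ. Raising AddedComplexity makes the
// instruction selector prefer this pattern over competing
// lower-complexity expansions (such as the xor + blend sequence
// shown above).
let Predicates = [HasAVX], AddedComplexity = 20 in
  // Inserting a GPR into the low element of a zero vector is just
  // VMOVD: the instruction already zeroes the upper elements.
  def : Pat<(v4i32 (X86vzmovl (v4i32 (scalar_to_vector GR32:$src)))),
            (VMOVDI2PDIrr GR32:$src)>;
```

Once a pattern like this wins, the test case above should compile to just `movl $7, %eax; vmovd %eax, %xmm0; retq`, since vmovd implicitly zeroes the remaining lanes (and a VEX-encoded write to xmm0 also zeroes the upper half of ymm0).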
Directories:

- AArch64
- ARM
- BPF
- CPP
- Generic
- Hexagon
- Inputs
- Mips
- MSP430
- NVPTX
- PowerPC
- R600
- SPARC
- SystemZ
- Thumb
- Thumb2
- WinEH
- X86
- XCore