llvm-6502/test/CodeGen
Sanjay Patel ab4ad4f98e Optimize merging of scalar loads for 32-byte vectors [X86, AVX]
Fix the poor codegen seen in PR21710 ( http://llvm.org/bugs/show_bug.cgi?id=21710 ).
Before we crack 32-byte build vectors into smaller chunks (and then subsequently
glue them back together), we should look for the easy case where we can just load
all elements in a single op.

An example of the codegen change is:

From:

vmovss  16(%rdi), %xmm1
vmovups (%rdi), %xmm0
vinsertps       $16, 20(%rdi), %xmm1, %xmm1
vinsertps       $32, 24(%rdi), %xmm1, %xmm1
vinsertps       $48, 28(%rdi), %xmm1, %xmm1
vinsertf128     $1, %xmm1, %ymm0, %ymm0
retq

To:

vmovups (%rdi), %ymm0
retq

Differential Revision: http://reviews.llvm.org/D6536



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223518 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 21:28:14 +00:00
..
AArch64
ARM Add missing FP build attribute tests. 2014-12-05 08:22:47 +00:00
CPP
Generic
Hexagon
Inputs
Mips
MSP430
NVPTX
PowerPC [PowerPC]Update Power VSX test cases to also test fast-isel 2014-12-05 20:32:05 +00:00
R600
SPARC
SystemZ
Thumb Re-add support to llvm-objdump for Mach-O universal files and archives with -macho 2014-12-04 23:56:27 +00:00
Thumb2
X86 Optimize merging of scalar loads for 32-byte vectors [X86, AVX] 2014-12-05 21:28:14 +00:00
XCore