llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2025-03-06 05:33:28 +00:00


Sanjay Patel ab4ad4f98e Optimize merging of scalar loads for 32-byte vectors [X86, AVX]

Fix the poor codegen seen in PR21710 (http://llvm.org/bugs/show_bug.cgi?id=21710).
Before we crack 32-byte build vectors into smaller chunks (and then glue them
back together), we should look for the easy case where we can just load all
elements in a single op.

An example of the codegen change is:

From:

vmovss      16(%rdi), %xmm1
vmovups     (%rdi), %xmm0
vinsertps   $16, 20(%rdi), %xmm1, %xmm1
vinsertps   $32, 24(%rdi), %xmm1, %xmm1
vinsertps   $48, 28(%rdi), %xmm1, %xmm1
vinsertf128 $1, %xmm1, %ymm0, %ymm0
retq

To:

vmovups     (%rdi), %ymm0
retq
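
The "easy case" described above can be illustrated with a small model. This is only a sketch of the idea, not the code from the patch: `ScalarLoad` and `merged_load` are hypothetical names invented for this example, and the real check in the X86 backend operates on SelectionDAG nodes and has to consider details (such as volatility and memory chains) that are ignored here. The model treats each build-vector element as a scalar load from a symbolic base plus a byte offset, and reports a single wide load only when the elements are the same size, share a base, and sit back to back in memory.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ScalarLoad:
    """Hypothetical stand-in for one loaded build-vector element."""
    base: str    # symbolic base pointer, e.g. "rdi"
    offset: int  # byte offset from the base
    size: int    # number of bytes loaded


def merged_load(elts: Sequence[ScalarLoad]) -> Optional[ScalarLoad]:
    """Return one wide load covering all elements, or None if they
    are not consecutive loads from the same base."""
    if not elts:
        return None
    first = elts[0]
    expected = first.offset
    for e in elts:
        # Every element must share the base and element size, and must
        # start exactly where the previous element ended.
        if e.base != first.base or e.size != first.size or e.offset != expected:
            return None
        expected += e.size
    return ScalarLoad(first.base, first.offset, expected - first.offset)


# Eight 4-byte floats at 0(%rdi) .. 28(%rdi): one 32-byte load suffices.
eight_floats = [ScalarLoad("rdi", 4 * i, 4) for i in range(8)]
wide = merged_load(eight_floats)

# A gap in the offsets defeats the merge.
with_gap = eight_floats[:3] + [ScalarLoad("rdi", 16, 4)]
no_merge = merged_load(with_gap)
```

In this model, the eight elements from the example collapse into a single 32-byte load at offset 0, which corresponds to the lone `vmovups (%rdi), %ymm0` in the "To" listing, while the sequence with a gap yields `None` and would fall back to the piecewise lowering.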

Differential Revision: http://reviews.llvm.org/D6536



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223518 91177308-0d34-0410-b5e6-96231b3b80d8

2014-12-05 21:28:14 +00:00

AArch64

[AArch64] Combining Load and IntToFp should check for neon availability

2014-12-04 20:25:50 +00:00

ARM

Add missing FP build attribute tests.

2014-12-05 08:22:47 +00:00

CPP

…

Generic

…

Hexagon