llvm-6502/test/CodeGen
Michael J. Spencer d4b4f2d340 [IR] Make {extract,insert}element accept an index of any integer type.
Given the following C code llvm currently generates suboptimal code for
x86-64:

__m128 bss4( const __m128 *ptr, size_t i, size_t j )
{
    float f = ptr[i][j];
    return (__m128) { f, f, f, f };
}

=================================================

define <4 x float> @_Z4bss4PKDv4_fmm(<4 x float>* nocapture readonly %ptr, i64 %i, i64 %j) #0 {
  %a1 = getelementptr inbounds <4 x float>* %ptr, i64 %i
  %a2 = load <4 x float>* %a1, align 16, !tbaa !1
  %a3 = trunc i64 %j to i32
  %a4 = extractelement <4 x float> %a2, i32 %a3
  %a5 = insertelement <4 x float> undef, float %a4, i32 0
  %a6 = insertelement <4 x float> %a5, float %a4, i32 1
  %a7 = insertelement <4 x float> %a6, float %a4, i32 2
  %a8 = insertelement <4 x float> %a7, float %a4, i32 3
  ret <4 x float> %a8
}

=================================================

        shlq    $4, %rsi
        addq    %rdi, %rsi
        movslq  %edx, %rax
        vbroadcastss    (%rsi,%rax,4), %xmm0
        retq

=================================================

The movslq is uneeded, but is present because of the trunc to i32 and then
sext back to i64 that the backend adds for vbroadcastss.

We can't remove it because it changes the meaning. The IR that clang
generates is already suboptimal. What clang really should emit is:

  %a4 = extractelement <4 x float> %a2, i64 %j

This patch makes that legal. A separate patch will teach clang to do it.

Differential Revision: http://reviews.llvm.org/D3519

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207801 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-01 22:12:39 +00:00
..
AArch64 AArch64/ARM64: print BFM instructions as BFI or BFXIL 2014-05-01 12:29:38 +00:00
ARM ARM: support stack probe emission for Windows on ARM 2014-04-30 07:05:07 +00:00
ARM64 [ARM64] Prefer generation of bzero on Darwin only 2014-05-01 13:11:59 +00:00
CPP
Generic MC: move test from Generic to COFF 2014-04-23 21:41:07 +00:00
Hexagon
Inputs
Mips Add basic functionality for assignment of ints. 2014-05-01 20:39:21 +00:00
MSP430
NVPTX Fix the test: DCE optimized away everything. 2014-04-21 17:23:12 +00:00
PowerPC
R600 R600/SI: Fix verifier error with pseudo store instructions. 2014-05-01 16:37:52 +00:00
SPARC Revert "blockfreq: Temporarily turn on -debug-only=block-freq" 2014-04-19 22:45:44 +00:00
SystemZ
Thumb Make this test not match its own filename, when being run from a path that includes the string 'add'. 2014-04-15 22:29:32 +00:00
Thumb2
X86 [IR] Make {extract,insert}element accept an index of any integer type. 2014-05-01 22:12:39 +00:00
XCore Reapply "blockfreq: Rewrite BlockFrequencyInfoImpl" 2014-04-21 17:57:07 +00:00