llvm-6502/lib
Evan Cheng ba05f728b5 Revamp build_vector lowering to take advantage of movss and movd instructions.
movd always clears the top 96 bits, and movss does so when it loads the
value from memory.
The net result is that codegen for 4-wide shuffles is much improved. It is near
optimal if one or more elements are zero, e.g.

__m128i test(int a, int b) {
  return _mm_set_epi32(0, 0, b, a);
}

compiles to

_test:
	movd 8(%esp), %xmm1
	movd 4(%esp), %xmm0
	punpckldq %xmm1, %xmm0
	ret
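
As a rough illustration of why three instructions are enough here, the following is a small portable C model of the two instructions involved. It is a sketch only, not LLVM code: the vec128 type and the model_* helper names are made up for this example, and the lane semantics are written from the documented behaviour of movd (scalar into lane 0, upper 96 bits cleared) and punpckldq (interleave the low two dwords of destination and source).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit register model: four 32-bit lanes, lane[0] is lowest. */
typedef struct { uint32_t lane[4]; } vec128;

/* movd mem, %xmm: load one dword into lane 0 and clear the top 96 bits. */
static vec128 model_movd(uint32_t x) {
  vec128 v = {{x, 0, 0, 0}};
  return v;
}

/* punpckldq src, dst (AT&T operand order): interleave the low two dwords,
   giving {dst0, src0, dst1, src1}. */
static vec128 model_punpckldq(vec128 dst, vec128 src) {
  vec128 r = {{dst.lane[0], src.lane[0], dst.lane[1], src.lane[1]}};
  return r;
}

/* Mirrors the three-instruction sequence emitted for test(a, b). */
static vec128 model_test(uint32_t a, uint32_t b) {
  vec128 xmm1 = model_movd(b);        /* movd 8(%esp), %xmm1      */
  vec128 xmm0 = model_movd(a);        /* movd 4(%esp), %xmm0      */
  return model_punpckldq(xmm0, xmm1); /* punpckldq %xmm1, %xmm0   */
}

/* Returns 1 if the model yields the lanes {a, b, 0, 0}, which is the layout
   _mm_set_epi32(0, 0, b, a) asks for (last argument is the lowest lane). */
static int matches_set_epi32(uint32_t a, uint32_t b) {
  vec128 r = model_test(a, b);
  return r.lane[0] == a && r.lane[1] == b && r.lane[2] == 0 && r.lane[3] == 0;
}

/* Small self-check that a caller may run once. */
static void model_self_check(void) {
  assert(matches_set_epi32(1u, 2u));
}
```

Because both movd results already carry zeros in lanes 1 to 3, the interleave places zeros in the upper half for free, so no separate zeroing or constant load is needed.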

compare to gcc:

_test:
	subl	$12, %esp
	movd	20(%esp), %xmm0
	movd	16(%esp), %xmm1
	punpckldq	%xmm0, %xmm1
	movq	%xmm1, %xmm0
	movhps	LC0, %xmm0
	addl	$12, %esp
	ret

or icc:

_test:
        movd      4(%esp), %xmm0                                #5.10
        movd      8(%esp), %xmm3                                #5.10
        xorl      %eax, %eax                                    #5.10
        movd      %eax, %xmm1                                   #5.10
        punpckldq %xmm1, %xmm0                                  #5.10
        movd      %eax, %xmm2                                   #5.10
        punpckldq %xmm2, %xmm3                                  #5.10
        punpckldq %xmm3, %xmm0                                  #5.10
        ret                                                     #5.10

There is still room for improvement, for example in the FP variant of the above example:

__m128 test(float a, float b) {
  return _mm_set_ps(0.0, 0.0, b, a);
}

_test:
	movss 8(%esp), %xmm1
	movss 4(%esp), %xmm0
	unpcklps %xmm1, %xmm0
	xorps %xmm1, %xmm1
	movlhps %xmm1, %xmm0
	ret

The xorps and movlhps are unnecessary. Removing them will require a post-legalizer optimization.
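
The redundancy can be checked with a small portable C model of the FP sequence. This is a sketch under stated assumptions, not LLVM code: the fvec128 type and the fp_* helper names are invented for this example, and the lane semantics follow the documented behaviour of movss from memory (upper three lanes cleared), unpcklps (interleave low two floats), xorps of a register with itself (all zero) and movlhps (low half of source copied to high half of destination).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical 128-bit register model: four float lanes, lane[0] is lowest. */
typedef struct { float lane[4]; } fvec128;

/* movss mem, %xmm: load one float into lane 0 and clear the upper lanes. */
static fvec128 fp_movss_load(float x) {
  fvec128 v = {{x, 0.0f, 0.0f, 0.0f}};
  return v;
}

/* unpcklps src, dst (AT&T order): gives {dst0, src0, dst1, src1}. */
static fvec128 fp_unpcklps(fvec128 dst, fvec128 src) {
  fvec128 r = {{dst.lane[0], src.lane[0], dst.lane[1], src.lane[1]}};
  return r;
}

/* movlhps src, dst: copy the low two lanes of src into the high two of dst. */
static fvec128 fp_movlhps(fvec128 dst, fvec128 src) {
  fvec128 r = {{dst.lane[0], dst.lane[1], src.lane[0], src.lane[1]}};
  return r;
}

/* The first three instructions of the emitted sequence. */
static fvec128 fp_test_short(float a, float b) {
  fvec128 xmm1 = fp_movss_load(b);  /* movss 8(%esp), %xmm1 */
  fvec128 xmm0 = fp_movss_load(a);  /* movss 4(%esp), %xmm0 */
  return fp_unpcklps(xmm0, xmm1);   /* unpcklps %xmm1, %xmm0 */
}

/* The full five-instruction sequence, including xorps and movlhps. */
static fvec128 fp_test_full(float a, float b) {
  fvec128 xmm0 = fp_test_short(a, b);
  fvec128 xmm1 = {{0.0f, 0.0f, 0.0f, 0.0f}}; /* xorps %xmm1, %xmm1   */
  return fp_movlhps(xmm0, xmm1);             /* movlhps %xmm1, %xmm0 */
}

/* Returns 1 if both sequences produce bit-identical registers. */
static int fp_tail_is_redundant(float a, float b) {
  fvec128 s = fp_test_short(a, b);
  fvec128 f = fp_test_full(a, b);
  return memcmp(&s, &f, sizeof s) == 0;
}

/* Small self-check that a caller may run once. */
static void fp_self_check(void) {
  assert(fp_tail_is_redundant(1.5f, 2.5f));
}
```

In this model the upper two lanes are already zero after unpcklps, so the trailing xorps and movlhps leave the register unchanged.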


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@27939 91177308-0d34-0410-b5e6-96231b3b80d8
2006-04-21 23:03:30 +00:00
Analysis Another simple case type merge case to try 2006-04-19 15:34:34 +00:00
Archive
AsmParser Make sure CVS versions of yacc and lex files get distributed. 2006-04-12 20:57:05 +00:00
Bytecode use isValidOperands instead of duplicating checks 2006-04-08 04:09:19 +00:00
CodeGen The BFS scheduler is apparently nondeterministic (causes many llvmgcc bootstrap 2006-04-21 17:16:16 +00:00
Debugger Add the README files to the distribution. 2006-04-13 06:39:24 +00:00
ExecutionEngine Get JIT/Interpreter working on Windows again. 2006-03-24 02:53:49 +00:00
Linker Add shufflevector support 2006-04-08 01:19:47 +00:00
Support
System Add checks for __OpenBSD__. 2006-04-17 17:55:41 +00:00
Target Revamp build_vector lowering to take advantage of movss and movd instructions. 2006-04-21 23:03:30 +00:00
Transforms Fix Transforms/ScalarRepl/2006-04-20-PromoteCrash.ll 2006-04-20 20:48:50 +00:00
VMCore Remove a hack required by V9. 2006-04-21 15:33:35 +00:00
Makefile