llvm-6502/lib
Ahmed Bougacha 4a3cd42601 [AArch64] Avoid going through GPRs for across-vector instructions.
This adds new node types for each intrinsic.
For instance, for addv, we have AArch64ISD::UADDV, such that:
  (v4i32 (uaddv ...))
is the same as
  (v4i32 (scalar_to_vector (i32 (int_aarch64_neon_uaddv ...))))
that is,
  (v4i32 (INSERT_SUBREG (v4i32 (IMPLICIT_DEF)),
           (i32 (int_aarch64_neon_uaddv ...)), ssub))

In a combine, we transform all such across-vector-lanes intrinsics to:

  (i32 (extract_vector_elt (uaddv ...), 0))

This has one big advantage: by making the extract_element explicit, we
enable the existing patterns for lane-aware instructions to fire.
This lets us avoid needlessly going through the GPRs.  Consider:

    uint32x4_t test_mul(uint32x4_t a, uint32x4_t b) {
        return vmulq_n_u32(a, vaddvq_u32(b));
    }

We now generate:
    addv.4s  s1, v1
    mul.4s   v0, v0, v1[0]
instead of the previous:
    addv.4s  s1, v1
    fmov     w8, s1
    dup.4s   v1, w8
    mul.4s   v0, v1, v0
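
As a rough illustration of why the two sequences above are equivalent, the
following portable C sketch models both of them on a plain 4-lane struct.
Nothing here is LLVM or NEON API: the type u32x4 and every model_* function
are hypothetical names invented for this sketch, and the model assumes only
that addv leaves the wrapping sum in lane 0 and that a write to an S register
clears the remaining lanes.

```c
#include <stdint.h>

/* Hypothetical stand-in for uint32x4_t; not the real NEON type. */
typedef struct { uint32_t lane[4]; } u32x4;

/* Models addv.4s s1, v1: wrapping sum of all lanes lands in lane 0.
   The other lanes are cleared, as a write to an S register does. */
u32x4 model_uaddv(u32x4 v) {
    u32x4 r = {{0, 0, 0, 0}};
    r.lane[0] = v.lane[0] + v.lane[1] + v.lane[2] + v.lane[3];
    return r;
}

/* Models mul.4s v0, v0, v1[idx]: every lane of a times one lane of b. */
u32x4 model_mul_by_lane(u32x4 a, u32x4 b, int idx) {
    u32x4 r;
    for (int i = 0; i < 4; ++i)
        r.lane[i] = a.lane[i] * b.lane[idx];
    return r;
}

/* Previous sequence: addv, fmov to a GPR, dup back to a vector, mul. */
u32x4 model_test_mul_via_gpr(u32x4 a, u32x4 b) {
    uint32_t w8 = model_uaddv(b).lane[0];   /* fmov   w8, s1      */
    u32x4 dup = {{w8, w8, w8, w8}};         /* dup.4s v1, w8      */
    u32x4 r;
    for (int i = 0; i < 4; ++i)             /* mul.4s v0, v1, v0  */
        r.lane[i] = dup.lane[i] * a.lane[i];
    return r;
}

/* New sequence: addv, then multiply directly by lane 0 of its result. */
u32x4 model_test_mul_by_lane(u32x4 a, u32x4 b) {
    return model_mul_by_lane(a, model_uaddv(b), 0);
}
```

Both model functions return the same lanes for any input, which is the
property the combine relies on: the value stays in the vector register file
and the fmov/dup round trip becomes unnecessary.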

rdar://20044838


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231840 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-10 20:45:38 +00:00
..
Analysis LoopAccessAnalysis: Silence -Wreturn-type diagnostic from GCC 2015-03-10 20:23:29 +00:00
AsmParser Fix a stack overflow in the assembler when checking that GEPs must be over sized types. 2015-03-10 06:34:57 +00:00
Bitcode
CodeGen Don't evaluate rend() on every iteration of the loop. 2015-03-10 20:29:59 +00:00
DebugInfo
ExecutionEngine Temporarily revert r231726 and r231724 as they're breaking the build.: 2015-03-10 00:33:27 +00:00
Fuzzer
IR [X86, AVX] replace vinsertf128 intrinsics with generic shuffles 2015-03-10 16:08:36 +00:00
IRReader
LineEditor
Linker DataLayout is mandatory, update the API to reflect it with references. 2015-03-10 02:37:25 +00:00
LTO
MC
Object Add support for Nuxi CloudABI. 2015-03-09 18:40:45 +00:00
Option
Passes
ProfileData InstrProf: Allow hexadecimal function hashes in proftext format 2015-03-09 18:54:49 +00:00
Support Teach raw_ostream to accept SmallString. 2015-03-10 07:33:23 +00:00
TableGen
Target [AArch64] Avoid going through GPRs for across-vector instructions. 2015-03-10 20:45:38 +00:00
Transforms remove function names from comments; NFC 2015-03-10 19:42:57 +00:00
CMakeLists.txt
LLVMBuild.txt
Makefile