llvm-6502/lib
Quentin Colombet 8201185d61 [X86] Custom lower UINT_TO_FP from v4i32 to v4f32, and from v8i32 to v8f32 if
AVX2 is available.
According to IACA, the new lowering has a throughput of 8 cycles instead of 13
with the previous one.

Although this lowering kicks in for some SPEC benchmarks, the performance
improvement was within the noise.

Correctness testing has been done for the whole range of uint32_t with the
following program:
    uint4 v = (uint4) {0,1,2,3};
    uint32_t i;
    
    //Check correctness over entire range for uint4 -> float4 conversion
    for( i = 0; i < 1U << (32-2); i++ )
    {
        float4 t = test(v);
        float4 c = correct(v);
        
        if( 0xf != _mm_movemask_ps( t == c ))
        {
            printf( "Error @ %vx: %vf vs. %vf\n", v, c, t);
            return -1;
        }
        
        v += 4;
    }
Where "correct" is the old lowering and "test" the new one.
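
For readers unfamiliar with this kind of lowering, the following is a minimal scalar sketch of one well-known branch-free way to convert an unsigned 32-bit integer to float by splitting it into 16-bit halves. It is an assumption that the new lowering follows this shape; the commit message does not spell out the instruction sequence. The names `bits_to_float`, `u32_to_f32_split` and `count_mismatches` are hypothetical and do not appear in the patch. The sketch also assumes IEEE-754 single precision, round-to-nearest, and float expressions evaluated in float precision (as with SSE code generation).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a 32-bit integer as an IEEE-754 single. */
static float bits_to_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Scalar model of a split-halves uint32 -> float conversion.
 * Assumed technique, not taken from the patch itself. */
static float u32_to_f32_split(uint32_t v) {
    /* 0x4b000000 encodes 2^23: OR-ing in the low 16 bits gives 2^23 + lo. */
    float lo = bits_to_float(0x4b000000u | (v & 0xffffu));
    /* 0x53000000 encodes 2^39: OR-ing in the high 16 bits gives 2^39 + hi * 2^16. */
    float hi = bits_to_float(0x53000000u | (v >> 16));
    /* Remove both biases from the high part; this subtraction is exact. */
    float fhi = hi - (0x1.0p39f + 0x1.0p23f);
    /* The only rounding step: lo + fhi equals v, rounded to nearest. */
    float result = lo + fhi;
    return result;
}

/* Count sampled inputs where the sketch disagrees with a plain C cast. */
static uint64_t count_mismatches(uint32_t start, uint64_t count, uint32_t stride) {
    uint64_t bad = 0;
    uint32_t v = start;
    for (uint64_t i = 0; i < count; i++, v += stride) {
        if (u32_to_f32_split(v) != (float)v)
            bad++;
    }
    return bad;
}

/* A few spot checks, including both ends of the uint32_t range. */
static void self_check(void) {
    assert(u32_to_f32_split(0u) == 0.0f);
    assert(u32_to_f32_split(0xffffffffu) == 4294967296.0f);
    assert(count_mismatches(0u, 1u << 16, 65521u) == 0);
}
```

Calling `self_check()` from a `main` function, or `count_mismatches(0, 1ULL << 32, 1)` for an exhaustive scalar sweep, mirrors the spirit of the vector test above, with a plain cast playing the role of "correct".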

The patch adds a test case for the two custom lowerings.
It also modifies the vector cost model, which is why cast.ll and uitofp.ll are
modified.
2009-02-26-MachineLICMBug.ll is also modified because we now hoist 7
instructions instead of 4 (3 more constant loads).

<rdar://problem/18153096>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221657 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-11 02:23:47 +00:00
..
Analysis Indentation fixes 2014-11-06 19:05:57 +00:00
AsmParser X86: Implement the vectorcall calling convention 2014-10-28 01:29:26 +00:00
Bitcode Factor out call to push_back. NFC. 2014-11-06 22:39:16 +00:00
CodeGen Transforms: address some late comments 2014-11-08 00:00:50 +00:00
DebugInfo [dwarfdump] Dump DW_AT_ranges values inline in the debug_info dump. 2014-10-23 04:08:34 +00:00
ExecutionEngine [JIT] Fix more missing endian conversions (opcodes for AArch64, ARM, and Mips stub functions, and ARM target in general) 2014-11-06 09:53:05 +00:00
IR Copy externally_initialized in GlobalVariable::copyAttributesFrom. 2014-11-10 18:41:59 +00:00
IRReader Remove unused variable. NFC. 2014-11-06 23:16:57 +00:00
LineEditor
Linker IR: MDNode => Value: NamedMDNode::getOperator() 2014-11-05 18:16:03 +00:00
LTO Add an option to the LTO code generator to disable vectorization during LTO 2014-10-26 21:50:58 +00:00
MC speling. 2014-11-11 01:13:42 +00:00
Object [yaml2obj] Support AArch64 relocations. 2014-11-10 23:02:03 +00:00
Option Add an overload of getLastArgNoClaim taking two OptSpecifiers. 2014-09-12 19:42:53 +00:00
ProfileData Use ErrorOr for the ::create factory on instrumented and sample profilers. 2014-11-03 00:51:45 +00:00
Support Fix style. 2014-11-07 21:30:36 +00:00
TableGen Eliminate some deep std::vector copies. NFC. 2014-10-03 18:33:16 +00:00
Target [X86] Custom lower UINT_TO_FP from v4i32 to v4f32, and from v8i32 to v8f32 if 2014-11-11 02:23:47 +00:00
Transforms [SwitchLowering] Fix the "fixPhis" function. 2014-11-10 21:05:27 +00:00
CMakeLists.txt
LLVMBuild.txt
Makefile