Teach constant folding to perform conversions from constant floating
point values to their integer representation through the SSE intrinsic
calls. This is the last part of a README.txt entry for which I have real
world examples.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123206 91177308-0d34-0410-b5e6-96231b3b80d8
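
As a rough illustration of the kind of folding the message describes, the C sketch below models the two conversions on a double that is known at compile time. `fold_fp_to_i32` is a hypothetical helper, not LLVM's implementation. It assumes the default round-to-nearest-even mode for the cvtsd2si-style conversion and truncation for the cvttsd2si-style one. Declining to fold NaN or out-of-range inputs is also an assumption of this sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not LLVM code): try to fold a constant double to i32.
 * toward_zero == true  models a cvttsd2si-style conversion (truncation).
 * toward_zero == false models a cvtsd2si-style conversion, assuming the
 * default round-to-nearest-even mode.
 * Returns false when the constant should be left unfolded. */
static bool fold_fp_to_i32(double x, bool toward_zero, int *out) {
    assert(out != NULL);

    /* Reject NaN (all comparisons false) and values far outside i32 so the
     * cast to long long below is always well defined. */
    if (!(x > -4294967296.0 && x < 4294967296.0))
        return false;

    long long t = (long long)x;     /* C casts truncate toward zero */
    double frac = x - (double)t;    /* exact; same sign as x, |frac| < 1 */

    if (!toward_zero) {
        bool odd = (t % 2) != 0;
        /* Round to nearest; ties go to the even neighbour. */
        if (frac > 0.5 || (frac == 0.5 && odd))
            t += 1;
        else if (frac < -0.5 || (frac == -0.5 && odd))
            t -= 1;
    }

    /* Assumption of this sketch: results that do not fit in i32 are not
     * folded, leaving the original conversion in place. */
    if (t < INT_MIN || t > INT_MAX)
        return false;

    *out = (int)t;
    return true;
}
```

With these assumptions, the constant 2.7 folds to 3 in the rounding case and to 2 in the truncating case, while 2.5 folds to 2 in both because the tie goes to the even neighbour.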
2025-07-24 22:24:54 +00:00 · 2011-01-11 01:07:24 +00:00
parent f7b0047f5f
commit 15ed90c859
3 changed files with 89 additions and 55 deletions
--- a/lib/Target/README.txt
+++ b/lib/Target/README.txt
@@ -2259,58 +2259,3 @@ Since we know that x+2.0 doesn't care about the sign of any zeros in X, we can
 transform the fmul to 0.0, and then the fadd to 2.0.

 //===---------------------------------------------------------------------===//
-
-clang -O3 currently compiles this code:
-
-#include <emmintrin.h>
-int f(double x) { return _mm_cvtsd_si32(_mm_set_sd(x)); }
-int g(double x) { return _mm_cvttsd_si32(_mm_set_sd(x)); }
-
-into
-
-define i32 @_Z1fd(double %x) nounwind readnone {
-entry:
-  %vecinit.i = insertelement <2 x double> undef, double %x, i32 0
-  %vecinit1.i = insertelement <2 x double> %vecinit.i, double 0.000000e+00,i32 1
-  %0 = tail call i32 @llvm.x86.sse2.cvtsd2si(<2 x double> %vecinit1.i) nounwind
-  ret i32 %0
-}
-
-define i32 @_Z1gd(double %x) nounwind readnone {
-entry:
-  %conv.i = fptosi double %x to i32
-  ret i32 %conv.i
-}
-
-This difference carries over to the assmebly produced, resulting in:
-
-_Z1fd:                                  # @_Z1fd
-# BB#0:                                 # %entry
-        pushq   %rbp
-        movq    %rsp, %rbp
-        xorps   %xmm1, %xmm1
-        movsd   %xmm0, %xmm1
-        cvtsd2sil       %xmm1, %eax
-        popq    %rbp
-        ret
-
-_Z1gd:                                  # @_Z1gd
-# BB#0:                                 # %entry
-        pushq   %rbp
-        movq    %rsp, %rbp
-        cvttsd2si       %xmm0, %eax
-        popq    %rbp
-        ret
-
-The problem is that we can't see through the intrinsic call used for cvtsd2si,
-and fold away the unnecessary manipulation of the function parameter. When
-these functions are inlined, it forms a barrier preventing many further
-optimizations. LLVM IR doesn't have a good way to model the logic of
-'cvtsd2si', its only FP -> int conversion path forces truncation. We should add
-a rounding flag onto fptosi so that it can represent this type of rounding
-naturally in the IR rather than using intrinsics. We might need to use a
-'system_rounding_mode' flag to encode that the semantics of the rounding mode
-can be changed by the program, but ideally we could just say that isn't
-supported, and hard code the rounding.
-
-//===---------------------------------------------------------------------===//