R600: Fix inconsistency in rsq instructions.

R600 was using a clamped version of rsq, but SI was not. Add a
new rsq_clamped intrinsic and use them consistently.

It's unclear to me from the documentation what behavior
the R600 instructions have, so I assume they have the legacy behavior
described by the SI documents. For R600, use RECIPSQRT_IEEE
for both llvm.AMDGPU.rsq.legacy and llvm.AMDGPU.rsq. R600 also
has RECIPSQRT_FF, which I'm not sure how it fits in here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211637 91177308-0d34-0410-b5e6-96231b3b80d8
This commit is contained in:
Matt Arsenault
2014-06-24 22:13:39 +00:00
parent 0029534141
commit 95eb45c5d9
13 changed files with 109 additions and 14 deletions

View File

@@ -66,4 +66,7 @@ def int_AMDGPU_rcp : GCCBuiltin<"__builtin_amdgpu_rcp">,
def int_AMDGPU_rsq : GCCBuiltin<"__builtin_amdgpu_rsq">,
Intrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>], [IntrNoMem]>;
def int_AMDGPU_rsq_clamped : GCCBuiltin<"__builtin_amdgpu_rsq_clamped">,
Intrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>], [IntrNoMem]>;
} // End TargetPrefix = "AMDGPU"