Remove obsolete README_SSE note.

We are generating movaps for all XMM register copies, including scalar
floating point values. This is known to be at least as good as movss and movsd
for all known architectures up to and including Nehalem because it avoids a
partial register stall.

The SSEDomainFix pass will switch movaps to movdqa when appropriate (i.e., when
operands come from the integer unit). We don't now that switching movaps to
movapd has any benefit.

The same applies to andps -> pand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@108096 91177308-0d34-0410-b5e6-96231b3b80d8
This commit is contained in:
Jakob Stoklund Olesen 2010-07-11 17:13:42 +00:00
parent 744b3a5acd
commit aef48d7b36

View File

@ -89,16 +89,6 @@ Perhaps use pxor / xorp* to clear a XMM register first?
//===---------------------------------------------------------------------===//
X86RegisterInfo::copyRegToReg() returns X86::MOVAPSrr for VR128. Is it possible
to choose between movaps, movapd, and movdqa based on types of source and
destination?
How about andps, andpd, and pand? Do we really care about the type of the packed
elements? If not, why not always use the "ps" variants which are likely to be
shorter.
//===---------------------------------------------------------------------===//
External test Nurbs exposed some problems. Look for
__ZN15Nurbs_SSE_Cubic17TessellateSurfaceE, bb cond_next140. This is what icc
emits: