364 SSE and SSE2 Optimizations Appendix E
25112 Rev. 3.06 September 2005
Software Optimization Guide for AMD64 Processors
E.8 Data Conversion
Optimization
Use care when selecting instructions to convert values from one type to another.
Application
This optimization applies to:
• 32-bit software
• 64-bit software
Rationale
For example, the CVTDQ2PS instruction converts four packed 32-bit signed integer values in an
XMM register or a 128-bit memory location to four packed single-precision floating-point values and
writes the converted values to another XMM register. In some cases, an additional instruction is
recommended to ensure that both halves of a register operand hold data of the same type (as
recommended in “Zeroing Out an XMM Register” on page 357).
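
The packed conversion described above can be sketched in C with SSE2 intrinsics. This is a minimal illustration, not code from this guide: the function name int4_to_float4 is hypothetical, and it assumes a compiler that provides <emmintrin.h> and maps _mm_cvtepi32_ps to CVTDQ2PS (typical, but the compiler chooses the final instructions).

```c
#include <emmintrin.h>  /* SSE2 intrinsics (also pulls in the SSE header) */
#include <stdint.h>

/* Hypothetical helper: convert four packed signed 32-bit integers
   to four packed single-precision values. */
void int4_to_float4(const int32_t in[4], float out[4])
{
    /* Unaligned 128-bit load of the four integers into an XMM register. */
    __m128i v = _mm_loadu_si128((const __m128i *)in);

    /* Packed int32 -> float conversion; normally compiled to CVTDQ2PS. */
    __m128 f = _mm_cvtepi32_ps(v);

    /* Unaligned 128-bit store of the four converted values. */
    _mm_storeu_ps(out, f);
}
```

The unaligned load and store are used here only so that the sketch works with any caller-supplied arrays.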
Table 22 shows the recommendations for register-to-register conversion of scalar values. Table 23 on
page 365 shows the recommendations for register-to-register conversion of vector operands. When
converting values directly from memory, use the preferred instructions provided in Table 24 on
page 365.
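
Two of the scalar recommendations in Table 22 pair the conversion with a helper instruction (UNPCKLPD before CVTPD2DQ, and XORPS before CVTSD2SS). The following C sketch mirrors those two sequences with SSE2 intrinsics. The function names are hypothetical, and the mapping from intrinsics to the exact instructions named in the comments is an assumption about typical compiler output, not a guarantee.

```c
#include <emmintrin.h>  /* SSE2 intrinsics (also pulls in the SSE header) */

/* FPD -> INT XMM: duplicate the low double into the high half, then
   convert both lanes. Rounding follows the current MXCSR mode. */
int double_to_int_xmm(double x)
{
    __m128d d = _mm_set_sd(x);       /* low = x, high = 0.0 */
    d = _mm_unpacklo_pd(d, d);       /* UNPCKLPD: high half now equals low half */
    __m128i i = _mm_cvtpd_epi32(d);  /* CVTPD2DQ: two doubles -> two int32 */
    return _mm_cvtsi128_si32(i);     /* extract the low 32-bit result */
}

/* FPD -> FPS: clear the destination first, then convert into its low lane. */
float double_to_float_xmm(double x)
{
    __m128d d = _mm_set_sd(x);
    __m128 dst = _mm_setzero_ps();   /* XORPS: whole register is valid FPS data */
    dst = _mm_cvtsd_ss(dst, d);      /* CVTSD2SS: low lane converted, upper lanes kept from dst */
    return _mm_cvtss_f32(dst);       /* extract the low single-precision result */
}
```

In this sketch _mm_set_sd already leaves a valid double-precision zero in the high half, so the unpack step is kept only to mirror the recommended sequence; it matters when the high half of the source register may hold data of another type.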
Table 22. Converting Scalar Values
Source   Destination   Preferred instructions      Notes
format   format
FPS      INT XMM       cvtps2dq xmm1, xmm2
FPS      INT GPR       cvtss2si reg32/64, xmm1
FPS      FPD           cvtss2sd xmm1, xmm2
FPD      INT XMM       unpcklpd xmm2, xmm2         UNPCKLPD ensures that the high
                       cvtpd2dq xmm1, xmm2         half of XMM2 is also in FPD
                                                   format.
FPD      INT GPR       cvtsd2si reg32/64, xmm1
FPD      FPS           xorps xmm1, xmm1            XORPS ensures that the high half
                       cvtsd2ss xmm1, xmm2         of XMM1 is in FPS format in case
                                                   a MOVAPS instruction is used
                                                   later.
INT XMM  FPS           cvtdq2ps xmm1, xmm2
INT XMM  FPD           cvtdq2pd xmm1, xmm2