AMD 250 Computer Hardware User Manual

364 SSE and SSE2 Optimizations Appendix E

25112 Rev. 3.06 September 2005

Software Optimization Guide for AMD64 Processors

E.8 Data Conversion

Optimization

Use care when selecting instructions to convert values from one type to another.

Application

This optimization applies to:

• 32-bit software

• 64-bit software

Rationale

For example, the CVTDQ2PS instruction converts four packed 32-bit signed integer values in an XMM register or a 128-bit memory location to four packed single-precision floating-point values and writes the converted values to another XMM register. In some cases, an additional instruction is recommended to ensure that both halves of register operands are of the same type (as recommended in “Zeroing Out an XMM Register” on page 357).
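
The packed conversion described above can be sketched in C using SSE2 intrinsics instead of hand-written assembly. This is a minimal illustration, not code from the guide: the function name convert_i32x4_to_f32x4 is hypothetical, the _mm_cvtepi32_ps intrinsic is assumed to compile to CVTDQ2PS (compilers usually do this, but instruction selection is not guaranteed), and a plain scalar loop is used when SSE2 is not available at compile time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Hypothetical helper: converts four signed 32-bit integers to four
 * single-precision floats, mirroring what CVTDQ2PS does on one XMM
 * register. */
void convert_i32x4_to_f32x4(const int32_t in[4], float out[4])
{
    assert(in != NULL && out != NULL);
#if defined(__SSE2__)
    /* Unaligned load of the four packed integers. */
    __m128i ints = _mm_loadu_si128((const __m128i *)in);
    /* Packed int32 -> float conversion (assumed to map to CVTDQ2PS). */
    __m128 floats = _mm_cvtepi32_ps(ints);
    _mm_storeu_ps(out, floats);
#else
    /* Portable fallback with the same per-element result. */
    for (int i = 0; i < 4; ++i) {
        out[i] = (float)in[i];
    }
#endif
}
```

Calling convert_i32x4_to_f32x4 with the integers 1, -2, 3, and 1000000 fills the output array with the same values as floats, since each of them is exactly representable in single precision.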

Table 22 shows the recommendations for register-to-register conversion of scalar values. Table 23 on page 365 shows the recommendations for register-to-register conversion of vector operands. When converting values directly from memory, use the preferred instructions provided in Table 24 on page 365.

Table 22. Converting Scalar Values

Source    Destination  Preferred instructions     Notes
format    format
--------  -----------  -------------------------  --------------------------------
FPS       INT XMM      cvtps2dq xmm1, xmm2
FPS       INT GPR      cvtss2si reg32/64, xmm1
FPS       FPD          cvtss2sd xmm1, xmm2
FPD       INT XMM      unpcklpd xmm2, xmm2        UNPCKLPD ensures that the high
                       cvtpd2dq xmm1, xmm2        half of XMM2 is also in FPD
                                                  format.
FPD       INT GPR      cvtsd2si reg32/64, xmm1
FPD       FPS          xorps xmm1, xmm1           XORPS ensures that the high half
                       cvtsd2ss xmm1, xmm2        of XMM1 is in FPS format in case
                                                  a MOVAPS instruction is used
                                                  later.
INT XMM   FPS          cvtdq2ps xmm1, xmm2
INT XMM   FPD          cvtdq2pd xmm1, xmm2
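
As an illustration of the FPD-to-FPS row (double precision to single precision, assuming that is what the abbreviations stand for), the two-instruction sequence can be sketched with SSE2 intrinsics. The function name narrow_double_to_float_lanes is hypothetical. The sketch assumes _mm_setzero_ps is emitted as XORPS and _mm_cvtsd_ss as CVTSD2SS, which is typical but depends on the compiler. A scalar fallback produces the same lanes when SSE2 is not enabled.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Hypothetical helper: narrows one double to a float and writes a
 * four-float vector whose upper three lanes are zero, so the whole
 * destination holds single-precision data. */
void narrow_double_to_float_lanes(double value, float out[4])
{
    assert(out != NULL);
#if defined(__SSE2__)
    __m128d src = _mm_set_sd(value);      /* low lane = value, high lane = 0 */
    __m128 zeroed = _mm_setzero_ps();     /* assumed to become XORPS */
    /* Convert the low double of src and place it in the low lane of the
     * zeroed destination (assumed to become CVTSD2SS). */
    __m128 result = _mm_cvtsd_ss(zeroed, src);
    _mm_storeu_ps(out, result);
#else
    /* Portable fallback producing the same four lanes. */
    out[0] = (float)value;
    out[1] = 0.0f;
    out[2] = 0.0f;
    out[3] = 0.0f;
#endif
}
```

Zeroing the destination first matches the note in the table: the lanes above the converted scalar then hold valid single-precision zeros if the full register is later moved with MOVAPS.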
