356 SSE and SSE2 Optimizations Appendix E
25112 Rev. 3.06 September 2005
Software Optimization Guide for AMD64 Processors
E.1 Half-Register Operations
Optimization
❖ Take care when mixing data types of operands within the same register.
Application
This optimization applies to:
• 32-bit software
• 64-bit software
Rationale
Mixing data types in a single register is harmless if only scalar operations are used. However, this
practice can cause performance problems if the register is used as a source for a vector operation.
Example 1
Avoid code like this:
addps xmm1, xmm2 ; Add four packed single-precision (FPS) values in XMM2
; to their corresponding values in XMM1.
cvtss2sd xmm1, xmm2 ; Convert the low-order single-precision value in XMM2
; to 64-bit double-precision (FPD) format and store it in
; the lower 64 bits of XMM1.
In this example, the second instruction leaves the upper half of XMM1 in FPS format and the lower
half in FPD format.
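One possible way to keep XMM1 in a single format is sketched below. It is an illustration only, not a sequence taken from this guide: the choice of XMM3, XMM4, and XMM5 is arbitrary, and it assumes that XMM3 is afterward used only by scalar double-precision instructions, which the rationale above describes as harmless.

```asm
addps    xmm1, xmm2 ; XMM1 holds four packed single-precision (FPS) values.
cvtss2sd xmm3, xmm2 ; Convert into XMM3 (arbitrary choice) instead of XMM1,
                    ; so XMM1 stays entirely in FPS format.
addsd    xmm3, xmm4 ; Assumption: XMM3 is used only as a scalar operand.
mulps    xmm1, xmm5 ; XMM1 is still a uniform FPS source for vector operations.
```

The trade-off is one additional register; whether that is acceptable depends on register pressure in the surrounding code.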
Example 2
Avoid code like this:
addps xmm1,xmm2 ; Add four packed single-precision (FPS) values in XMM2
; to their corresponding values in XMM1.
movlpd xmm1,mem64 ; Move the double-precision value in mem64 to the lower
; half of XMM1.
In this example, the MOVLPD instruction sets the low half of XMM1 to FPD format but leaves the
high half unchanged (in FPS format).
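If the loaded data is meant to feed a packed double-precision operation, one option is to load both halves of a separate register. The sketch below is illustrative only: XMM3 and XMM4 are arbitrary choices, mem64_hi is a hypothetical second memory operand, and it assumes (by analogy with MOVLPD, not from a statement in this guide) that MOVHPD sets the high half to FPD format.

```asm
addps  xmm1,xmm2     ; XMM1 holds four packed single-precision (FPS) values.
movlpd xmm3,mem64    ; Load the low double into XMM3 instead of XMM1.
movhpd xmm3,mem64_hi ; Hypothetical operand: load the high double as well,
                     ; assumed to leave all of XMM3 in FPD format.
addpd  xmm3,xmm4     ; Vector operation on a register loaded entirely as FPD.
```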