Appendix E SSE and SSE2 Optimizations 363
Software Optimization Guide for AMD64 Processors
25112 Rev. 3.06 September 2005
E.7 Explicit Load Instructions
Optimization
Use movlpd xmm1, mem64 when loading a scalar FPD value from memory.
Application
This optimization applies to:
• 32-bit software
• 64-bit software
Rationale
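The movlpd xmm1, mem64 instruction is more efficient than movsd xmm1, mem64. MOVLPD writes only the
lower 64 bits of XMM1 and leaves the upper half unchanged, whereas MOVSD from memory also clears the
upper half. Use MOVSD only if you need to ensure that the upper half of XMM1 is also set to FPD
format, for example because a vector operation is planned on the register.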
When loading a scalar FPS value from memory, use MOVSS.
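The difference between the three loads can be sketched in C with SSE/SSE2 intrinsics. This is an illustration only, not code from this guide: the helper names (load_fpd_low, load_fpd_zero_upper, load_fps, high_half, check_loads) are hypothetical, and the mapping of _mm_loadl_pd to MOVLPD, _mm_load_sd to MOVSD, and _mm_load_ss to MOVSS is the usual one, but a compiler is free to select a different instruction, so inspect the generated assembly if the exact opcode matters.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics; also provides the SSE _mm_load_ss family */

/* Hypothetical helper: MOVLPD-style load. Only the low 64 bits are
   replaced; the upper 64 bits of `prev` are carried through unchanged. */
static inline __m128d load_fpd_low(__m128d prev, const double *p)
{
    return _mm_loadl_pd(prev, p);
}

/* Hypothetical helper: MOVSD-style load. The low 64 bits receive *p and
   the upper 64 bits are cleared to zero. */
static inline __m128d load_fpd_zero_upper(const double *p)
{
    return _mm_load_sd(p);
}

/* Hypothetical helper: MOVSS-style load of a scalar single-precision value.
   The low 32 bits receive *p and the upper 96 bits are cleared. */
static inline __m128 load_fps(const float *p)
{
    return _mm_load_ss(p);
}

/* Extract the upper double of a register so it can be inspected. */
static inline double high_half(__m128d v)
{
    return _mm_cvtsd_f64(_mm_unpackhi_pd(v, v));
}

/* Small self-check of the behavior described in the comments above. */
static inline void check_loads(void)
{
    double x = 3.5;
    float f = 2.25f;
    __m128d prev = _mm_set_pd(9.0, 1.0);  /* high = 9.0, low = 1.0 */

    __m128d a = load_fpd_low(prev, &x);   /* upper half preserved */
    assert(_mm_cvtsd_f64(a) == 3.5);
    assert(high_half(a) == 9.0);

    __m128d b = load_fpd_zero_upper(&x);  /* upper half cleared */
    assert(_mm_cvtsd_f64(b) == 3.5);
    assert(high_half(b) == 0.0);

    assert(_mm_cvtss_f32(load_fps(&f)) == 2.25f);
}
```

Calling check_loads() from a test program exercises all three helpers. When only the low element is consumed afterward, the MOVLPD-style helper matches the recommendation in this section; the MOVSD-style helper is the one to reach for when the whole register will feed a vector operation.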