Chapter 9 Optimizing with SIMD Instructions 207
Software Optimization Guide for AMD64 Processors
25112 Rev. 3.06 September 2005
9.7 Use MMX™ Instructions to Construct Fast Block-Copy Routines in 32-Bit Mode
Optimization
Use MMX instructions when moving integer data in a block-copy routine.
Application
This optimization applies to:
• 32-bit software
Rationale
MMX instructions relieve the high register pressure typical of 32-bit x86 code, which has only eight
general-purpose registers, by making the eight 64-bit MMX registers available for moving data.
In addition, MMX instructions increase the available parallelism on AMD Athlon 64 and
AMD Opteron processors because they use both sides (integer and floating-point) of the execution
pipeline. For an example of how to move a large quadword-aligned block of data using the MMX
MOVQ instruction, see "Optimizing Main Memory Performance for Large Arrays" in the
AMD Athlon™ Processor x86 Code Optimization Guide (order # 22007).
Outside of a block-copy routine, do not move integer data through MMX registers.
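As an illustration only, the following is a minimal sketch of a block-copy loop that moves data with the MMX MOVQ instruction. It is not the routine from the AMD Athlon guide cited above. The function name mmx_block_copy, the 32-byte unrolling factor, and the use of memcpy for the leftover tail are assumptions made for this example, and the code assumes GCC or Clang extended inline assembly (AT&T syntax) on an x86 target with MMX support.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the guide): copy `bytes` bytes from src to
   dst through MMX registers. Assumes non-overlapping buffers; quadword
   (8-byte) alignment of both pointers is preferable but not required. */
static void mmx_block_copy(void *dst, const void *src, size_t bytes)
{
    const unsigned char *s = (const unsigned char *)src;
    unsigned char *d = (unsigned char *)dst;
    size_t blocks = bytes / 32;          /* four quadwords per iteration */

    while (blocks--) {
        __asm__ volatile(
            "movq   (%0), %%mm0\n\t"     /* load four quadwords */
            "movq  8(%0), %%mm1\n\t"
            "movq 16(%0), %%mm2\n\t"
            "movq 24(%0), %%mm3\n\t"
            "movq %%mm0,   (%1)\n\t"     /* store four quadwords */
            "movq %%mm1,  8(%1)\n\t"
            "movq %%mm2, 16(%1)\n\t"
            "movq %%mm3, 24(%1)\n\t"
            :
            : "r"(s), "r"(d)
            : "memory", "mm0", "mm1", "mm2", "mm3");
        s += 32;
        d += 32;
    }

    /* Clear the MMX state so that later x87 floating-point code works. */
    __asm__ volatile("emms" ::: "memory");

    /* Copy any tail shorter than 32 bytes with the C library. */
    memcpy(d, s, bytes % 32);
}
```

The general-purpose registers hold only the two pointers and the loop counter, while the data itself passes through MM0 to MM3. The EMMS instruction at the end is required before any subsequent x87 floating-point code runs, because the MMX registers alias the x87 register stack.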