Software Optimization Guide for AMD64 Processors
25112 Rev. 3.06 September 2005
Revision History
Date Rev. Description
August 2005 3.06 Updated latency tables in Appendix C. Added section 8.9 on optimizing integer
division. Clarified the use of non-temporal PREFETCHNTA instruction in section
5.6. Added explanatory information to section 5.3 on ccNUMA. Added section 4.5
on AMD64 complex addressing modes. Added new section 5.13 on memory copies.
October 2004 3.05 Updated information on write-combining optimizations in Appendix B,
Implementation of Write-Combining. Added latency information for SSE3
instructions.
March 2004 3.04 Incorporated a section on ccNUMA in Chapter 5. Added sections on moving
aligned versus unaligned data. Added to PREFETCHNTA information in Chapter
5. Fixed many minor typos.
September 2003 3.03 Made several minor typographical and formatting corrections.
July 2003 3.02 Added index references. Corrected information pertaining to L1 and L2 data and
instruction caches. Corrected information on alignment in Chapter 5, “Cache and
Memory Optimizations”. Amended latency information in Appendix C.
April 2003 3.01 Clarified section 2.22 'Array Indices'. Corrected factual errors and removed
misleading examples from Cache and Memory chapter.
April 2003 3.00 Initial public release.