Chapter 5 Cache and Memory Optimizations 103
Software Optimization Guide for AMD64 Processors
25112 Rev. 3.06 September 2005
4-Kbyte pages away from the load address (address bits 47–12 do not match). Avoid the type of code
shown in the following example:
mov eax, 10h
mov [eax], bx       ; Word store to address 10h
mov cx, [eax+2]     ; Word load from address 12h
                    ; Load detects a false dependency
                    ; on the store because it is in the
                    ; same doubleword of memory.
mov cx, [eax+4]     ; Word load from address 14h
                    ; Load does not detect a false
                    ; dependency because it is from a
                    ; different doubleword of memory.
Here is another example of the type of code to avoid:
mov eax, 10h
mov [eax], bl       ; First byte store into the doubleword at address 10h
mov [eax+1], cl     ; Second byte store into the same doubleword (address 11h)
mov dl, [eax]       ; Byte load from address 10h
                    ; Load detects a false dependency
                    ; on the second store because it is
                    ; the most recent store to the same
                    ; doubleword of memory as the load.
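The same-doubleword rule that both examples illustrate can be sketched as a small address check. This is a simplified model for reasoning about code, not a description of the processor's actual comparison logic: the function names are hypothetical, and it assumes aligned byte or word accesses that do not cross a doubleword boundary.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: index of the aligned doubleword that contains addr. */
static uint64_t doubleword_index(uint64_t addr)
{
    return addr >> 2;   /* 4 bytes per doubleword */
}

/*
 * Simplified model (an assumption, not the hardware algorithm): a load is
 * treated as at risk of a false dependency on an earlier byte or word store
 * when both fall in the same aligned doubleword but start at different
 * addresses.
 */
static bool false_dependency_risk(uint64_t store_addr, uint64_t load_addr)
{
    return doubleword_index(store_addr) == doubleword_index(load_addr)
        && store_addr != load_addr;
}
```

Applied to the addresses above, the check flags the store at 10h with the load at 12h and the store at 11h with the load at 10h, and it does not flag the store at 10h with the load at 14h.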
Summary of Store-to-Load-Forwarding Pitfalls to Avoid
To avoid store-to-load-forwarding pitfalls, follow these guidelines:
• Maintain consistent use of operand size across all loads and stores. Preferably use doubleword or
quadword operand sizes.
• Avoid misaligned data references.
• Avoid narrow-to-wide and wide-to-narrow forwarding cases.
• When using word or byte stores, avoid loading data from anywhere in the same doubleword of
memory except at the identical start address of the store.
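One way to follow the operand-size guidelines from C is to combine narrow values in a register and touch memory only with doubleword accesses of matching size and address. The sketch below is illustrative only: the function name is hypothetical, `volatile` is used here just to keep the store and load from being optimized away, and a compiler is not required to emit exactly one 32-bit store and one 32-bit load, so the generated code should be inspected.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the guide. It replaces two byte stores
 * followed by a byte load with one doubleword store and one doubleword load
 * at the same address, then extracts the low byte in a register.
 */
static uint8_t store_pair_read_low(volatile uint32_t *slot, uint8_t b0, uint8_t b1)
{
    /* Combine both bytes in a register; b0 becomes the low byte. */
    uint32_t packed = (uint32_t)b0 | ((uint32_t)b1 << 8);

    *slot = packed;              /* single doubleword store */
    uint32_t reloaded = *slot;   /* doubleword load, same address and size */

    return (uint8_t)(reloaded & 0xFFu);  /* narrow in a register, not via memory */
}
```

Unlike the two byte stores it replaces, this version also writes zeros to the upper two bytes of the doubleword, so it applies only when those bytes are not otherwise in use.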
Application
This optimization applies to:
• 32-bit software
• 64-bit software