Chapter 5 Cache and Memory Optimizations 101
Software Optimization Guide for AMD64 Processors
25112 Rev. 3.06 September 2005
Avoid
mov eax, 10h
mov WORD PTR [eax], bx ; Word store
...
mov ecx, DWORD PTR [eax] ; Doubleword load--cannot forward; the upper word
; is not in the store buffer
Avoid
mov eax, 10h
mov BYTE PTR [eax+3], bl ; Byte store
...
mov ecx, DWORD PTR [eax] ; Doubleword load--cannot forward upper byte
; from store buffer
Wide-to-Narrow Store-Buffer Data-Forwarding Restriction
If the following conditions are present, there is a wide-to-narrow store-buffer data-forwarding
restriction:
• The operand size of the store data is greater than the operand size of the load data.
• The start address of the store data does not match the start address of the load data.
Avoid
mov eax, 10h
add DWORD PTR [eax], ebx ; Doubleword store
mov cx, WORD PTR [eax+2] ; Word load--cannot forward high word
; from store buffer
Avoid
movq [foo], mm1 ; Store upper and lower half.
...
add eax, [foo] ; Fine
add edx, [foo+4] ; Not good! Load starts 4 bytes into the quadword store.
Preferred
movd [foo], mm1 ; Store lower half.
punpckhdq mm1, mm1 ; Copy upper half into lower half.
movd [foo+4], mm1 ; Store lower half (now a copy of the upper half).
...
add eax, [foo] ; Fine
add edx, [foo+4] ; Fine
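The two conditions listed for the wide-to-narrow restriction can be modeled as a small predicate. The following is a minimal sketch in C, not code from this guide: the function name wide_to_narrow_restricted and its parameters are hypothetical, and the overlap check is an added assumption (the restriction only matters when the load actually reads bytes written by the store).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the guide): returns true when a
 * store/load pair meets both wide-to-narrow conditions:
 *   1. the store operand is larger than the load operand, and
 *   2. the store and load start addresses differ.
 * Assumption: a pair only counts if the load overlaps the stored bytes. */
static bool wide_to_narrow_restricted(uintptr_t store_addr, size_t store_size,
                                      uintptr_t load_addr, size_t load_size)
{
    assert(store_size > 0 && load_size > 0);

    /* Half-open byte ranges [addr, addr + size) must intersect. */
    bool overlap = load_addr < store_addr + store_size &&
                   store_addr < load_addr + load_size;

    return overlap && store_size > load_size && store_addr != load_addr;
}
```

Applied to the examples above: a doubleword store at 10h followed by a word load at 12h satisfies both conditions, as does a quadword store at foo followed by a doubleword load at foo+4. A doubleword load at foo itself shares the store's start address, so the predicate reports no restriction, which matches the "Fine" annotation.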
Misaligned Store-Buffer Data-Forwarding Restriction
If the following condition is present, there is a misaligned store-buffer data-forwarding restriction:
• The store or load address is misaligned. For example, a quadword store is not aligned to a
quadword boundary.
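The misalignment condition can be checked the same way. This is a minimal sketch in C under stated assumptions: the helper names is_misaligned and misaligned_restricted are hypothetical, "misaligned" is taken to mean that the address is not a multiple of the operand size (as in the quadword example above), and operand sizes are assumed to be powers of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true when an access of `size` bytes at `addr`
 * is not aligned to a `size`-byte boundary.
 * Assumption: size is a nonzero power of two (1, 2, 4, 8, 16). */
static bool is_misaligned(uintptr_t addr, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (addr & (uintptr_t)(size - 1)) != 0;
}

/* Hypothetical helper: the restriction applies when either the store
 * or the load address is misaligned. */
static bool misaligned_restricted(uintptr_t store_addr, size_t store_size,
                                  uintptr_t load_addr, size_t load_size)
{
    return is_misaligned(store_addr, store_size) ||
           is_misaligned(load_addr, load_size);
}
```

For instance, a quadword store at 14h is flagged because 14h is not a multiple of 8, while the same store at 10h is not.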