86 Instruction-Decoding Optimizations Chapter 4
25112 Rev. 3.06 September 2005
Software Optimization Guide for AMD64 Processors
Example 3
Replace this instruction:
shld reg1, reg2, 3
with this code sequence:
shr reg2, 29
lea reg1, [reg1*8+reg2]
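
The equivalence in Example 3 can be checked with a small C model. This is only a sketch, assuming 32-bit registers (the shift count 29 is 32 - 3). The function names shld3, shr_lea, and check_equivalence are hypothetical and are not part of the guide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models `shld reg1, reg2, 3` on 32-bit registers (an assumption):
   reg1 is shifted left by 3 and the vacated low bits are filled
   with the top 3 bits of reg2. */
static uint32_t shld3(uint32_t reg1, uint32_t reg2) {
    return (reg1 << 3) | (reg2 >> 29);
}

/* Models the replacement sequence from Example 3. */
static uint32_t shr_lea(uint32_t reg1, uint32_t reg2) {
    reg2 >>= 29;              /* shr reg2, 29: keep only the top 3 bits */
    return reg1 * 8u + reg2;  /* lea reg1, [reg1*8+reg2] */
}

/* Compares both models over a few boundary and mixed bit patterns. */
static void check_equivalence(void) {
    static const uint32_t samples[] = {
        0x00000000u, 0x00000001u, 0x1FFFFFFFu, 0x20000000u,
        0x7FFFFFFFu, 0x80000000u, 0xE0000000u, 0xFFFFFFFFu,
        0x12345678u, 0x9ABCDEF0u
    };
    const size_t n = sizeof samples / sizeof samples[0];
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            assert(shld3(samples[i], samples[j]) ==
                   shr_lea(samples[i], samples[j]));
        }
    }
}
```

Calling check_equivalence() from a main function runs the comparison. The addition in the lea form matches the OR in the shld form because reg1*8 always has its low three bits clear and the shifted reg2 is less than 8, so no carry can occur. The model compares only the value left in reg1. On real hardware the replacement also overwrites reg2 and does not produce the same flags as shld, so it is presumably meant for cases where neither the original reg2 nor those flags are needed afterward.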