122 Cache and Memory Optimizations Chapter 5
25112 Rev. 3.06 September 2005
Software Optimization Guide for AMD64 Processors
5.14 Stack Considerations
Make sure the stack is suitably aligned for the local variable with the largest base type. Then, using
the technique described in “Sorting and Padding C and C++ Structures” on page 117, all variables can
be properly aligned with no padding.
Application
This optimization applies to:
• 32-bit software
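As an illustration of the sorting technique referred to in the optimization statement above, the following minimal C sketch groups a hypothetical set of local variables in a structure, ordered from the largest base type to the smallest. The structure name, member names, and helper function are assumptions made for this example; they do not come from the guide. The sketch assumes the common sizes of 8, 4, 2, and 1 bytes for double, int, short, and char.

```c
#include <stddef.h>

/* Hypothetical locals, sorted from largest base type to smallest. */
struct sorted_locals {
    double d;   /* 8 bytes (assumed) */
    int    i;   /* 4 bytes (assumed) */
    short  s;   /* 2 bytes (assumed) */
    char   c;   /* 1 byte            */
};

/* Returns 1 when every member starts exactly where the previous one ends,
   that is, when the compiler inserted no padding between members. */
static int sorted_locals_are_packed(void)
{
    return offsetof(struct sorted_locals, d) == 0
        && offsetof(struct sorted_locals, i) == sizeof(double)
        && offsetof(struct sorted_locals, s) == sizeof(double) + sizeof(int)
        && offsetof(struct sorted_locals, c) ==
               sizeof(double) + sizeof(int) + sizeof(short);
}
```

If the start of such a block is aligned for its largest member (here, a quadword boundary), each smaller member also falls on a boundary that suits its own type.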
Extend Arguments to 32 Bits Before Pushing onto Stack
Function arguments smaller than 32 bits should be extended to 32 bits before being pushed onto the
stack, which ensures that the stack is always doubleword aligned on entry to a function.
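The effect of argument size on alignment can be sketched with a small C model of the stack pointer. This is only an arithmetic illustration, not code from the guide: the function names and the starting ESP value used below are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model: value of ESP after pushing an operand of the given
   size. The x86 stack grows toward lower addresses. */
static uint32_t esp_after_push(uint32_t esp, uint32_t operand_bytes)
{
    return esp - operand_bytes;
}

/* Returns 1 when ESP is a multiple of 4 (doubleword aligned). */
static int is_dword_aligned(uint32_t esp)
{
    return (esp & 3u) == 0;
}
```

Starting from an assumed doubleword-aligned value such as 0x0012FF80, `esp_after_push(0x0012FF80u, 2)` models pushing a 16-bit argument as-is and leaves ESP misaligned, whereas `esp_after_push(0x0012FF80u, 4)` models pushing the same argument after extension to 32 bits and keeps ESP doubleword aligned.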
If a function has no local variables with a base type larger than a doubleword, no further work is
necessary. If the function does have local variables whose base type is larger than a doubleword,
insert additional code to ensure proper alignment of the stack. For example, the following code
achieves quadword alignment:
prologue:
    push ebp
    mov  ebp, esp
    sub  esp, SIZE_OF_LOCALS ; Size of local variables
    and  esp, -8             ; Round ESP down to a quadword boundary.
    ...                      ; Push registers that need to be preserved.
epilogue:
    ...                      ; Pop registers that needed to be preserved.
    leave
    ret
With this technique, function arguments can be accessed through EBP, and local variables can be
accessed through ESP. To keep EBP free for general use, save it after the prologue and restore it
before the epilogue; the LEAVE instruction relies on EBP holding the frame pointer set up in the
prologue.
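The arithmetic performed by the prologue above can be modeled in C. The sketch below is an assumption-laden illustration rather than part of the guide: the function names and the sample ESP value are hypothetical, and the model tracks only the stack pointer, not the stack contents.

```c
#include <stdint.h>

/* AND with -8 clears the three low-order bits, which rounds the value
   down to the nearest multiple of 8. In 32-bit two's complement, -8 is
   0xFFFFFFF8. */
static uint32_t align_down8(uint32_t esp)
{
    return esp & (uint32_t)(-8);
}

/* Hypothetical model of ESP at the end of the prologue. */
static uint32_t prologue_esp(uint32_t entry_esp, uint32_t size_of_locals)
{
    uint32_t esp = entry_esp - 4;   /* push ebp                 */
    esp -= size_of_locals;          /* sub esp, SIZE_OF_LOCALS  */
    return align_down8(esp);        /* and esp, -8              */
}
```

For an assumed entry value of 0x0012FF7C and 20 bytes of locals, `prologue_esp(0x0012FF7Cu, 20)` evaluates to 0x0012FF60. Because the AND can only lower ESP, the function still has at least SIZE_OF_LOCALS bytes available, and the lowest address of that region is quadword aligned.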
Optimized Stack Usage
It is sometimes possible to improve performance in frequently executed routines by altering the way
variables and parameters are passed and accessed on the stack. Replacing PUSH and POP instructions
with MOV instructions can reduce stack-pointer dependencies and use fewer execution resources.
This optimization is usually most effective in smaller routines. Excessive use of this optimization can
increase code size, because MOV instructions are considerably larger than PUSH and POP
instructions.