Chapter 7 Scheduling Optimizations 145
Software Optimization Guide for AMD64 Processors
25112 Rev. 3.06 September 2005
7.2 Loop Unrolling
Optimization
Use loop unrolling where appropriate to increase instruction-level parallelism.
Application
This optimization applies to:
• 32-bit software
• 64-bit software
Loop Unrolling
Loop unrolling is a technique that duplicates the body of a loop one or more times in order to increase
the number of useful instructions executed per loop branch and to allow operations from different loop
iterations to execute in parallel.
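As a minimal sketch of this definition, the C fragment below sums an integer array first with a rolled loop and then with the body duplicated once. The function names sum_rolled and sum_unrolled2 and the choice of two separate accumulators are assumptions made for illustration; they are not taken from this guide.

```c
#include <stddef.h>

/* Rolled loop: one add and one loop branch per element. */
long sum_rolled(const int *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Body duplicated once: two adds per loop branch. The two accumulators
   do not depend on each other, so the adds from adjacent iterations
   can execute in parallel. */
long sum_unrolled2(const int *a, size_t n)
{
    long sum0 = 0, sum1 = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        sum0 += a[i];
        sum1 += a[i + 1];
    }
    if (i < n)          /* odd element count: one element is left over */
        sum0 += a[i];
    return sum0 + sum1;
}
```

Both functions return the same value for integer data. With floating-point data, splitting the sum across accumulators changes the order of the additions, so the result can differ slightly from the rolled version.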
There are two types of loop unrolling:
• Complete loop unrolling
• Partial loop unrolling
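The difference between the two types can be sketched in C as follows. The function names add4_complete and add_partial4, the fixed count of 4, and the unroll factor of 4 are hypothetical choices for this example only.

```c
#include <stddef.h>

/* Complete unrolling: the iteration count (4) is known at compile time,
   so the loop counter and the loop branch disappear entirely. */
void add4_complete(double a[4], const double b[4])
{
    a[0] += b[0];
    a[1] += b[1];
    a[2] += b[2];
    a[3] += b[3];
}

/* Partial unrolling by 4: the iteration count n is known only at run
   time, so a loop remains, but it branches once per four elements. */
void add_partial4(double *a, const double *b, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a[i]     += b[i];
        a[i + 1] += b[i + 1];
        a[i + 2] += b[i + 2];
        a[i + 3] += b[i + 3];
    }
    /* Cleanup loop for the 0 to 3 elements left over. */
    for (; i < n; i++)
        a[i] += b[i];
}
```

The cleanup loop is what lets the partially unrolled version accept any element count, at the cost of some extra code.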
If all of these conditions are true, use complete loop unrolling:
• The loop is in a frequently executed piece of code.
• The number of loop iterations is known at compile time.
• The loop body includes fewer than 10 instructions.
If all of these conditions are true, use partial loop unrolling:
• Spare registers are available (for example, when operating in 64-bit mode,
where additional registers are available).
• The loop body is small, so that loop overhead is significant.
• The number of loop iterations is likely greater than 10.
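The two sets of conditions can also be read as a simple decision rule. The C sketch below encodes that reading. The struct loop_info, its field names, and the function choose_unrolling are hypothetical; checking the complete-unrolling conditions first is an assumption, since the guide does not say which choice takes priority when both sets of conditions hold.

```c
#include <stdbool.h>

enum unroll_choice { UNROLL_NONE, UNROLL_COMPLETE, UNROLL_PARTIAL };

/* Hypothetical description of a loop; field names are illustrative. */
struct loop_info {
    bool frequently_executed;   /* loop is in a hot piece of code       */
    bool trip_count_known;      /* iteration count known at compile time */
    int  body_instructions;     /* instructions in the loop body         */
    bool spare_registers;       /* e.g. extra registers in 64-bit mode   */
    bool small_body;            /* loop overhead is significant          */
    int  expected_iterations;   /* likely iteration count at run time    */
};

enum unroll_choice choose_unrolling(const struct loop_info *l)
{
    /* All three conditions must hold for complete unrolling. */
    if (l->frequently_executed && l->trip_count_known &&
        l->body_instructions < 10)
        return UNROLL_COMPLETE;

    /* All three conditions must hold for partial unrolling. */
    if (l->spare_registers && l->small_body &&
        l->expected_iterations > 10)
        return UNROLL_PARTIAL;

    return UNROLL_NONE;
}
```

The point of the sketch is that each recommendation requires all of its conditions at once. A hot loop with a 50-instruction body, for example, does not qualify for complete unrolling even when its iteration count is known.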