2 General Optimization Guidelines
This chapter discusses general optimization techniques that can improve
the performance of applications running on Intel Pentium 4, Intel Xeon,
and Pentium M processors, as well as on dual-core processors. These
techniques take advantage of the microarchitectural features of the
generations of the IA-32 processor family described in Chapter 1.
Optimization guidelines for 64-bit mode applications are discussed in
Chapter 8. Additional optimization guidelines applicable to dual-core
processors and Hyper-Threading Technology are discussed in Chapter 7.
This chapter explains the optimization techniques both for those who
use the Intel® C++ or Fortran Compiler and for those who use other
compilers. The Intel® compiler, which generates code specifically tuned
for the IA-32 processor family, provides most of the optimization. For
those not using the Intel C++ or Fortran Compiler, the assembly code
tuning optimizations may be useful. The explanations are supported by
coding examples.
Tuning to Achieve Optimum Performance
The most important factors in achieving optimum processor
performance are:
• good branch prediction
• avoiding memory access stalls
• good floating-point performance
• instruction selection, including use of SIMD instructions
• instruction scheduling (to maximize trace cache bandwidth)
• vectorization
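As a small illustration of the first factor in the list above, the following C sketch counts the array elements below a threshold in two ways. It is not taken from this manual: the function names count_below_branchy and count_below_branchless, and the data layout, are hypothetical and chosen only for this example. The first version takes a data-dependent conditional branch on every element, which may predict poorly when the input is random. The second adds the 0-or-1 result of the comparison directly, which gives the compiler the option of emitting branch-free code or vectorizing the loop. Whether the two versions actually compile to different code depends on the compiler and its settings.

```c
#include <stddef.h>
#include <stdint.h>

/* Branchy form: one data-dependent conditional branch per element.
 * With unpredictable input, this branch may mispredict frequently. */
size_t count_below_branchy(const int32_t *data, size_t n, int32_t threshold)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (data[i] < threshold)
            count++;
    }
    return count;
}

/* Branch-free form: in C the comparison evaluates to 0 or 1, so it can
 * be added directly. The only branch left in the source is the loop
 * condition, which is highly predictable. */
size_t count_below_branchless(const int32_t *data, size_t n, int32_t threshold)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += (size_t)(data[i] < threshold);
    return count;
}
```

Both functions return the same result for any input. Any performance difference should be confirmed by measurement on the target processor rather than assumed.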