IA-32 Intel® Architecture Optimization
• Minimize use of global variables and pointers.
• Use the const modifier; use the static modifier for global
variables.
• Use new cacheability instructions and memory-ordering behavior.
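As a minimal sketch of the first two items, the C fragment below keeps its lookup table file-local and read-only and accumulates into a local variable rather than a global. The names scale_table and scaled_sum are hypothetical, chosen only for illustration, and the actual benefit depends on the compiler and the surrounding code.

```c
#include <stddef.h>

/* File-local, read-only table: static limits visibility to this
   translation unit, const tells the compiler the data never changes. */
static const int scale_table[4] = {1, 2, 4, 8};

/* Hypothetical helper: sums n values, each scaled by a table entry.
   The accumulator is a local, so the compiler can keep it in a
   register instead of reloading a global on every iteration. */
int scaled_sum(const int *values, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += values[i] * scale_table[i & 3];
    return sum;
}
```

With a global accumulator, the compiler would generally have to assume that any store through a pointer might alias it, which limits how long values can stay in registers.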
Optimize Floating-point Performance
• Avoid exceeding representable ranges during computation, since
handling these cases can have a performance impact. Do not use a
larger precision format (double-extended floating point) unless
required, since this increases memory size and bandwidth
utilization.
• Use FISTTP to avoid changing rounding mode when possible, or use
optimized fldcw; avoid changing floating-point control/status
registers (rounding modes) between more than two values.
• Use efficient conversions, such as those that implicitly include a
rounding mode, in order to avoid changing control/status registers.
• Take advantage of the SIMD capabilities of Streaming SIMD
Extensions (SSE) and of Streaming SIMD Extensions 2 (SSE2)
instructions. Enable flush-to-zero mode and DAZ mode when using
SSE and SSE2 instructions.
• Avoid denormalized input values, denormalized output values, and
explicit constants that could cause denormal exceptions.
• Avoid excessive use of the fxch instruction.
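Two of the items above, conversions with an implicit rounding mode and flush-to-zero/DAZ mode, can be illustrated with a short C sketch. It assumes a compiler that provides the SSE intrinsics header <xmmintrin.h> and a processor that supports the DAZ bit. The helper names are hypothetical, and this is one possible way to express the idea rather than the manual's own code.

```c
#include <xmmintrin.h>  /* SSE intrinsics: _mm_cvttss_si32, _mm_getcsr, ... */

/* MXCSR control bits. */
#define MXCSR_DAZ 0x0040u  /* denormals-are-zero (inputs)  */
#define MXCSR_FTZ 0x8000u  /* flush-to-zero (outputs)      */

/* Truncating float-to-int conversion (cvttss2si). Truncation is part
   of the instruction, so no control register has to be changed. */
int truncate_to_int(float x)
{
    return _mm_cvttss_si32(_mm_set_ss(x));
}

/* Conversion using the current MXCSR rounding mode (cvtss2si);
   the default mode is round-to-nearest-even. */
int round_to_int(float x)
{
    return _mm_cvtss_si32(_mm_set_ss(x));
}

/* Hypothetical helper: set FTZ and DAZ and return the previous MXCSR
   value so the caller can restore it. Assumes DAZ is supported;
   setting an unsupported MXCSR bit raises a fault. */
unsigned int enable_ftz_daz(void)
{
    unsigned int old = _mm_getcsr();
    _mm_setcsr(old | MXCSR_FTZ | MXCSR_DAZ);
    return old;
}

/* Hypothetical helper: restore a previously saved MXCSR value. */
void restore_mxcsr(unsigned int saved)
{
    _mm_setcsr(saved);
}
```

A caller would typically enable the modes once before a block of SSE/SSE2 computation and restore the saved value afterwards, since both modes trade strict IEEE 754 behavior for speed.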
Optimize Instruction Selection
• Focus instruction selection on the path length of a sequence of
instructions rather than on individual instruction choices;
minimize the number of uops and the data/register dependencies
across the whole path, and maximize retirement throughput.