IA-32 Intel® Architecture Optimization

Overview

The current generation of the IA-32 family of processors uses out-of-order execution with dynamic scheduling and buffering to tolerate poor instruction selection and scheduling that may occur in legacy code. The execution core can reorder μops to cover latency delays and to avoid resource conflicts. In some cases, the microarchitecture’s ability to avoid such delays can be enhanced by arranging IA-32 instructions. While reordering IA-32 instructions may help, the execution core determines the final schedule of μops.

This appendix provides information to assembly language programmers and compiler writers to aid in selecting the sequence of instructions that minimizes dependency chain latency, and in arranging instructions in an order that assists the hardware in processing instructions efficiently while avoiding resource conflicts. The performance impact of applying the information presented in this appendix has been shown to be on the order of several percent for applications that are not completely dominated by other performance factors, such as:

• cache miss latencies

• bus bandwidth

• I/O bandwidth
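
As a rough illustration of what "minimizing dependency chain latency" can mean in practice, the sketch below sums an array two ways. It is not taken from this manual: the function names `sum_single_chain` and `sum_four_chains` are hypothetical, and the choice of four accumulators is an assumption made for the example. Any benefit depends on the processor and on what the compiler already does. Because floating-point addition is not associative, the two versions can round differently on general inputs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: one accumulator. Every add needs the result of
   the previous add, so the adds form a single long dependency chain. */
double sum_single_chain(const double *a, size_t n)
{
    assert(a != NULL || n == 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Hypothetical example: four accumulators. The four chains are
   independent of each other, so an out-of-order core is free to
   overlap their adds instead of waiting on one chain. */
double sum_four_chains(const double *a, size_t n)
{
    assert(a != NULL || n == 0);
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;

    /* Main loop: four elements per iteration, one per chain. */
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }

    /* Tail: fold the remaining 0 to 3 elements into the first chain. */
    for (; i < n; i++)
        s0 += a[i];

    /* Combine the partial sums pairwise to keep the final chain short. */
    return (s0 + s1) + (s2 + s3);
}
```

Both functions return the same value whenever every intermediate sum is exactly representable. If the application is limited by cache misses or bus bandwidth, as the list above notes, a rewrite of this kind is unlikely to be measurable.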

Instruction selection and scheduling matter once the compiler or assembly programmer has already addressed the performance issues discussed in Chapter 2:

• observe store forwarding restrictions

• avoid cache line and memory order buffer splits

• do not inhibit branch prediction

• minimize the use of xchg instructions on memory locations
