Intel IA-32 Computer Accessories User Manual


 
Multi-Core and Hyper-Threading Technology 7
User/Source Coding Rule 21. (M impact, H generality) Insert the PAUSE
instruction in fast spin loops and keep the number of loop repetitions to a
minimum to improve overall system performance.
On IA-32 processors that use the Intel NetBurst microarchitecture core,
the penalty of exiting from a spin-wait loop can be avoided by inserting
a PAUSE instruction in the loop. In spite of the name, the PAUSE
instruction improves performance by introducing a slight delay in the
loop and effectively causing the memory read requests to be issued at a
rate that allows immediate detection of any store to the synchronization
variable. This prevents the occurrence of a long delay due to memory
order violation.
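
As a rough C-level illustration of the rule above (a sketch, not the
manual's own example), a spin-wait loop can issue PAUSE on every
iteration through the _mm_pause() compiler intrinsic. The helper name
spin_wait, the cpu_pause macro, and the use of C11 atomics are
assumptions made for this sketch; on non-x86 targets the macro falls
back to a no-op.

```c
#include <stdatomic.h>

#if defined(__i386__) || defined(__x86_64__) || defined(_M_IX86) || defined(_M_X64)
#include <immintrin.h>
#define cpu_pause() _mm_pause()   /* emits the PAUSE instruction */
#else
#define cpu_pause() ((void)0)     /* assumption: no-op fallback on non-x86 targets */
#endif

/* Hypothetical helper: spin until *sync_var becomes nonzero.
 * Returns the number of loop iterations (PAUSEs) spent waiting. */
static unsigned long spin_wait(atomic_int *sync_var)
{
    unsigned long spins = 0;

    /* Re-read the synchronization variable each iteration; PAUSE slows
     * the rate at which these reads are issued. */
    while (atomic_load_explicit(sync_var, memory_order_acquire) == 0) {
        cpu_pause();
        ++spins;
    }
    return spins;
}
```

Another thread would release the waiter by storing a nonzero value to
the same variable, for example with atomic_store_explicit(sync_var, 1,
memory_order_release).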
One example of inserting the PAUSE instruction in a simplified spin-wait
loop is shown in Example 7-4(b). The PAUSE instruction is compatible
with all IA-32 processors. On IA-32 processors prior to Intel NetBurst
microarchitecture, the PAUSE instruction is essentially a NOP instruction.
Additional examples of optimizing spin-wait loops using the PAUSE
instruction are available in Application Note AP-949 “Using
Spin-Loops on Intel Pentium 4 Processor and Intel Xeon Processor.”
Inserting the PAUSE instruction has the added benefit of significantly
reducing the power consumed during the spin-wait because fewer
system resources are used.
Optimization with Spin-Locks
Spin-locks are typically used when several threads need to modify a
synchronization variable and the synchronization variable must be
protected by a lock to prevent unintentional overwrites. When the lock
is released, however, several threads may compete to acquire it at once.
Such thread contention significantly reduces performance scaling with
respect to frequency, number of discrete processors, and
Hyper-Threading Technology.
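
To make the contention point concrete, here is a minimal spin-lock
sketch in C. It is not taken from this manual: it uses the common
"test, then test-and-set" pattern, in which waiters spin on a plain
read with PAUSE and attempt the atomic exchange only when the lock
appears free. The type and function names (demo_spinlock and friends)
and the cpu_pause macro are hypothetical, and C11 atomics are assumed.

```c
#include <stdatomic.h>
#include <stdbool.h>

#if defined(__i386__) || defined(__x86_64__) || defined(_M_IX86) || defined(_M_X64)
#include <immintrin.h>
#define cpu_pause() _mm_pause()   /* emits the PAUSE instruction */
#else
#define cpu_pause() ((void)0)     /* assumption: no-op fallback on non-x86 targets */
#endif

/* Hypothetical lock type for illustration only. */
struct demo_spinlock {
    atomic_bool locked;
};

static void demo_spinlock_init(struct demo_spinlock *l)
{
    atomic_init(&l->locked, false);
}

/* One acquisition attempt. The exchange returns the previous value,
 * so "false" means the lock was free and is now ours. */
static bool demo_spinlock_try_acquire(struct demo_spinlock *l)
{
    return !atomic_exchange_explicit(&l->locked, true, memory_order_acquire);
}

static void demo_spinlock_acquire(struct demo_spinlock *l)
{
    for (;;) {
        if (demo_spinlock_try_acquire(l))
            return;
        /* Wait with read-only accesses and PAUSE until the lock looks
         * free; every waiter that sees the release then retries the
         * exchange above, which is where contention arises. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed))
            cpu_pause();
    }
}

static void demo_spinlock_release(struct demo_spinlock *l)
{
    atomic_store_explicit(&l->locked, false, memory_order_release);
}
```

When the holder calls demo_spinlock_release, all spinning threads may
observe the free lock at about the same time and race on the exchange,
which is the contention scenario described above.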