IA-32 Intel® Architecture Optimization
Optimization Guidelines
This section summarizes optimization guidelines for tuning
multithreaded applications. Five areas are listed (in order of
importance):
• thread synchronization
• bus utilization
• memory optimization
• front end optimization
• execution resource optimization
Practices associated with each area are listed in this section. Guidelines
for each area are discussed in greater depth in sections that follow.
Most of the coding recommendations improve performance scaling both
with the number of processor cores and with Hyper-Threading Technology.
Techniques that apply to only one environment are noted.
Key Practices of Thread Synchronization
Key practices for minimizing the cost of thread synchronization are
summarized below:
• Insert the PAUSE instruction in fast spin loops and keep the number
of loop repetitions to a minimum to improve overall system
performance.
• Replace a spin lock that may be acquired by multiple threads with
pipelined locks such that no more than two threads have write
accesses to one lock. If only one thread needs to write to a variable
shared by two threads, there is no need to acquire a lock.
• Use a thread-blocking API in a long idle loop to free up the
processor.
• Prevent false sharing of per-thread data between two threads.
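Two of the practices above, PAUSE in a bounded spin loop and avoiding false sharing of per-thread data, can be sketched in C as follows. This is a minimal illustration, not code from this guide: the names lock_acquire, lock_release, padded_counter and run_demo are hypothetical, the 64-byte line size and the spin bound of 1000 are assumptions to be checked against the target processor, and sched_yield() merely stands in for a true thread-blocking API such as a condition variable wait.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>      /* sched_yield */
#include <stdalign.h>   /* alignas */
#include <stdatomic.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>  /* _mm_pause emits the PAUSE instruction */
#define cpu_relax() _mm_pause()
#else
#define cpu_relax() ((void)0)  /* no PAUSE equivalent assumed elsewhere */
#endif

#define CACHE_LINE 64    /* assumed cache line size in bytes */
#define SPIN_LIMIT 1000  /* hypothetical bound on consecutive spins */
#define NTHREADS   2
#define ITERS      100000

/* Per-thread data: each counter is aligned to, and fills, its own cache
 * line, so threads updating different counters do not share a line. */
struct padded_counter {
    alignas(CACHE_LINE) long value;
};
_Static_assert(sizeof(struct padded_counter) == CACHE_LINE,
               "each counter should occupy exactly one assumed cache line");

static struct padded_counter counters[NTHREADS];
static atomic_int lock_word;  /* 0 = free, 1 = held */
static long shared_total;     /* protected by lock_word */

static void lock_acquire(atomic_int *l)
{
    for (;;) {
        /* Try to take the lock with a single atomic exchange. */
        if (atomic_exchange_explicit(l, 1, memory_order_acquire) == 0)
            return;
        /* Lock is held: wait with plain reads, issuing PAUSE each time. */
        int spins = 0;
        while (atomic_load_explicit(l, memory_order_relaxed) != 0) {
            if (++spins < SPIN_LIMIT) {
                cpu_relax();
            } else {
                sched_yield();  /* stand-in for a blocking wait */
                spins = 0;
            }
        }
    }
}

static void lock_release(atomic_int *l)
{
    atomic_store_explicit(l, 0, memory_order_release);
}

static void *worker(void *arg)
{
    struct padded_counter *c = arg;
    for (int i = 0; i < ITERS; i++)
        c->value++;               /* touches only this thread's line */
    lock_acquire(&lock_word);     /* one short critical section per thread */
    shared_total += c->value;
    lock_release(&lock_word);
    return NULL;
}

/* Runs NTHREADS workers and returns the combined count. */
long run_demo(void)
{
    pthread_t t[NTHREADS];
    shared_total = 0;
    for (int i = 0; i < NTHREADS; i++) {
        counters[i].value = 0;
        int rc = pthread_create(&t[i], NULL, worker, &counters[i]);
        assert(rc == 0);
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    return shared_total;
}
```

Each worker accumulates into its own padded counter and takes the lock only once, which keeps both the contention on the lock and the number of spin iterations low. Calling run_demo() from a test program and comparing the result with NTHREADS * ITERS gives a quick sanity check.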