Intel IA-32 Computer Accessories User Manual


 
IA-32 Intel® Architecture Optimization
To reduce the performance penalty, one approach is to reduce the
likelihood of many threads competing to acquire the same lock. Apply a
software pipelining technique to handle data that must be shared
between multiple threads.
Instead of allowing multiple threads to compete for a given lock, no
more than two threads should have write access to a given lock. If an
application must use spin-locks, include the PAUSE instruction in the
wait loop. Example 7-4 (c) illustrates the “test, test-and-set”
technique for determining the availability of the lock in a spin-wait
loop.
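
As a rough illustration of the “test, test-and-set” idea with PAUSE in the wait loop, the following C11 sketch may help. It is not the manual's Example 7-4; the names tts_lock_t, tts_init, tts_trylock, tts_lock, tts_unlock and the cpu_relax macro are hypothetical, and the code assumes a compiler that provides <stdatomic.h> and, on x86, the _mm_pause intrinsic.

```c
#include <assert.h>
#include <stdatomic.h>

/* cpu_relax() is a hypothetical helper: PAUSE on x86, a no-op elsewhere. */
#if defined(__i386__) || defined(__x86_64__)
#include <immintrin.h>
#define cpu_relax() _mm_pause()
#else
#define cpu_relax() ((void)0)
#endif

/* Hypothetical lock type: 0 means free, 1 means held. */
typedef struct {
    atomic_int held;
} tts_lock_t;

static void tts_init(tts_lock_t *l) {
    atomic_init(&l->held, 0);
}

/* One attempt: read first ("test"), and only if the lock looks free
 * perform the atomic exchange ("test-and-set"). Returns 1 on success. */
static int tts_trylock(tts_lock_t *l) {
    if (atomic_load_explicit(&l->held, memory_order_relaxed) != 0)
        return 0;
    return atomic_exchange_explicit(&l->held, 1, memory_order_acquire) == 0;
}

static void tts_lock(tts_lock_t *l) {
    for (;;) {
        /* Spin on a plain read so waiting threads do not issue writes
         * to the lock's cache line; PAUSE goes inside this wait loop. */
        while (atomic_load_explicit(&l->held, memory_order_relaxed) != 0)
            cpu_relax();
        /* The lock looked free: now try the atomic exchange. */
        if (atomic_exchange_explicit(&l->held, 1, memory_order_acquire) == 0)
            return;
    }
}

static void tts_unlock(tts_lock_t *l) {
    /* Sanity check: only a held lock may be released. */
    assert(atomic_load_explicit(&l->held, memory_order_relaxed) == 1);
    atomic_store_explicit(&l->held, 0, memory_order_release);
}
```

The read-only inner loop is the point of the sketch: the atomic exchange is attempted only after the lock has been observed free.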
User/Source Coding Rule 22. (M impact, L generality) Replace a spin lock
that may be acquired by multiple threads with pipelined locks such that no
more than two threads have write access to one lock. If only one thread needs
to write to a variable shared by two threads, there is no need to use a lock.
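
One way to picture the single-writer case is a ring buffer passed between one producer thread and one consumer thread, where each index is written by exactly one of them. The sketch below is an assumption-laden illustration rather than code from the manual: spsc_ring_t, ring_init, ring_push, ring_pop and RING_SLOTS are hypothetical names, and it is only valid with exactly one producer and one consumer.

```c
#include <stdatomic.h>
#include <stddef.h>

#define RING_SLOTS 8 /* hypothetical capacity; must be a power of two */

typedef struct {
    int slots[RING_SLOTS];
    atomic_size_t head; /* written only by the consumer */
    atomic_size_t tail; /* written only by the producer */
} spsc_ring_t;

static void ring_init(spsc_ring_t *r) {
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Producer side. Returns 1 if the value was stored, 0 if the ring is full. */
static int ring_push(spsc_ring_t *r, int value) {
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (tail - head == RING_SLOTS)
        return 0;
    r->slots[tail % RING_SLOTS] = value;
    /* Publish the slot; the producer is the only writer of tail. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return 1;
}

/* Consumer side. Returns 1 and fills *out, or 0 if the ring is empty. */
static int ring_pop(spsc_ring_t *r, int *out) {
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head == tail)
        return 0;
    *out = r->slots[head % RING_SLOTS];
    /* Release the slot; the consumer is the only writer of head. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return 1;
}
```

Chaining several such rings, one per pair of adjacent pipeline stages, keeps every shared variable down to a single writer, which is the situation the rule describes as needing no lock.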
Synchronization for Longer Periods
When a spin-wait loop waits on a lock that is not expected to be released
quickly, an application should follow these guidelines:
• Keep the duration of the spin-wait loop to a minimum number of
  repetitions.
• Applications should use an OS service to block the waiting thread;
  this can release the processor so that other runnable threads can
  make use of the processor or available execution resources.
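
The two guidelines above can be combined into a bounded spin followed by a blocking wait. The sketch below assumes a POSIX system and uses a pthread mutex as the OS blocking service; SPIN_LIMIT, cpu_relax and lock_spin_then_block are hypothetical names, and the limit of 100 is an arbitrary placeholder rather than a recommended value.

```c
#include <pthread.h>

/* cpu_relax() is a hypothetical helper: PAUSE on x86, a no-op elsewhere. */
#if defined(__i386__) || defined(__x86_64__)
#include <immintrin.h>
#define cpu_relax() _mm_pause()
#else
#define cpu_relax() ((void)0)
#endif

#define SPIN_LIMIT 100 /* hypothetical bound; tune by measurement */

/* Acquire the mutex. Returns 1 if it was obtained during the bounded
 * spin phase, 0 if the thread fell back to the blocking call. */
static int lock_spin_then_block(pthread_mutex_t *m) {
    int i;
    for (i = 0; i < SPIN_LIMIT; i++) {
        if (pthread_mutex_trylock(m) == 0)
            return 1;
        cpu_relax();
    }
    /* Not released quickly: let the OS block this thread so the
     * processor is available to other runnable threads. */
    pthread_mutex_lock(m);
    return 0;
}
```

A caller releases the lock with pthread_mutex_unlock as usual; only the acquisition path differs from a plain blocking lock.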
On processors supporting Hyper-Threading Technology, operating
systems should use the HLT instruction if one logical processor is active
and the other is not. HLT will allow an idle logical processor to
transition to a halted state; this allows the active logical processor to use
all the hardware resources in the physical package. An operating system
that does not use this technique must still execute instructions on the
idle logical processor that repeatedly check for work. This “idle loop”
consumes execution resources that could otherwise be used to make
progress on the active logical processor.