IA-32 Intel® Architecture Optimization
Avoid Coding Pitfalls in Thread Synchronization
Synchronization between multiple threads must be designed and
implemented with care to achieve good performance scaling with
respect to the number of discrete processors and the number of logical
processors per physical processor. No single technique is a universal
solution for every synchronization situation.
The pseudo-code example in Example 7-5 (a) illustrates a polling loop
implementation of a control thread. If there is only one runnable worker
thread, an attempt to call a timing service API, such as Sleep(0), may be
ineffective in minimizing the cost of thread synchronization. Because
the control thread still behaves like a fast spinning loop, the only
runnable worker thread must share execution resources with the
spin-wait loop if both are running on the same physical processor that
supports Hyper-Threading Technology. If there is more than one
runnable worker thread, then calling a thread-blocking API, such as
Sleep(0), could still release the processor running the spin-wait loop,
allowing the processor to be used by another worker thread instead of
the spinning loop.
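The polling pattern described above can be sketched as follows. This is an
illustrative sketch only, not the listing in Example 7-5 (a): it assumes POSIX
threads and uses sched_yield() as a rough analogue of Sleep(0). The names
poll_task_done, poll_result, polling_worker, and run_polling_control are
hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical completion flag and result, shared with the worker. */
static atomic_int poll_task_done;
static long poll_result;

/* Worker: sums 1..n, then publishes the result and sets the flag. */
static void *polling_worker(void *arg)
{
    long n = *(long *)arg, sum = 0;
    for (long i = 1; i <= n; i++)
        sum += i;
    poll_result = sum;
    atomic_store_explicit(&poll_task_done, 1, memory_order_release);
    return NULL;
}

/* Control thread: polls the flag, yielding on every iteration. */
static long run_polling_control(long n)
{
    pthread_t t;
    atomic_store(&poll_task_done, 0);
    int rc = pthread_create(&t, NULL, polling_worker, &n);
    assert(rc == 0);
    (void)rc;

    while (!atomic_load_explicit(&poll_task_done, memory_order_acquire)) {
        /* Assumed analogue of Sleep(0): if no other thread is ready to
         * run, this returns at once and the loop keeps spinning,
         * competing with the worker for execution resources. */
        sched_yield();
    }

    pthread_join(t, NULL);
    return poll_result;
}
```

With a single runnable worker, the yield in this loop typically returns
immediately, so the control thread still consumes execution resources that the
worker could otherwise use.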
A control thread waiting for the completion of worker threads can
usually implement thread synchronization using a thread-blocking API
or a timing service, if the worker threads require significant time to
complete. Example 7-5 (b) shows an example that reduces the overhead
of the control thread in its thread synchronization.
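One way to block the control thread until the work completes is sketched
below. It is not the listing in Example 7-5 (b): it assumes POSIX threads and
uses a condition variable as the thread-blocking API. The names job_t,
blocking_worker, and run_blocking_control are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical job record shared by the control and worker threads. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  done_cv;
    int             done;   /* set to 1 by the worker when finished */
    long            n;      /* input: sum 1..n */
    long            sum;    /* output */
} job_t;

/* Worker: computes the sum, then signals completion under the lock. */
static void *blocking_worker(void *arg)
{
    job_t *job = arg;
    long sum = 0;
    for (long i = 1; i <= job->n; i++)
        sum += i;

    pthread_mutex_lock(&job->lock);
    job->sum = sum;
    job->done = 1;
    pthread_cond_signal(&job->done_cv);
    pthread_mutex_unlock(&job->lock);
    return NULL;
}

/* Control thread: sleeps in the kernel until the worker signals. */
static long run_blocking_control(long n)
{
    job_t job;
    pthread_t t;

    pthread_mutex_init(&job.lock, NULL);
    pthread_cond_init(&job.done_cv, NULL);
    job.done = 0;
    job.n = n;
    job.sum = 0;

    int rc = pthread_create(&t, NULL, blocking_worker, &job);
    assert(rc == 0);
    (void)rc;

    pthread_mutex_lock(&job.lock);
    while (!job.done)                    /* guards against spurious wakeups */
        pthread_cond_wait(&job.done_cv, &job.lock);
    long sum = job.sum;
    pthread_mutex_unlock(&job.lock);

    pthread_join(t, NULL);
    pthread_cond_destroy(&job.done_cv);
    pthread_mutex_destroy(&job.lock);
    return sum;
}
```

While it waits, the control thread is descheduled rather than spinning, so the
worker does not have to share execution resources with it.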