the two parameters should default to be the same (the size of the monitor triggering
area is the same as the system coherence line size).
Based on the monitor line sizes returned by CPUID, the OS should dynamically
allocate structures with appropriate padding. If an OS must use static data
structures, attempt to adapt the data structure and use a dynamically allocated data
buffer for thread synchronization. When the latter technique is not possible, consider
not using MONITOR/MWAIT with static data structures.
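The allocation strategy described above might be sketched in C as follows. This is a user-space illustration under stated assumptions, not OS code: it assumes a GCC- or Clang-style `<cpuid.h>`, uses C11 `aligned_alloc` as a stand-in for a kernel allocator, and assumes the monitored variable should be aligned on and padded out to the largest monitor line size reported in CPUID leaf 05H. The names `monitor_sizes`, `query_monitor_sizes`, and `alloc_monitor_flag` are hypothetical, and the check for MONITOR/MWAIT support (CPUID.01H:ECX[3]) is omitted for brevity.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#endif

/* Monitor line sizes in bytes, as reported by CPUID leaf 05H. */
struct monitor_sizes {
    uint32_t smallest;  /* CPUID.05H:EAX[15:0] */
    uint32_t largest;   /* CPUID.05H:EBX[15:0] */
};

/* Returns 1 and fills *out if leaf 05H reports a nonzero largest line size;
 * returns 0 otherwise (including on non-x86 builds). */
int query_monitor_sizes(struct monitor_sizes *out)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid(5, &eax, &ebx, &ecx, &edx)) {
        out->smallest = eax & 0xFFFFu;
        out->largest  = ebx & 0xFFFFu;
        return out->largest != 0;
    }
#endif
    (void)out;
    return 0;
}

/* Hypothetical helper: allocate one zeroed flag that occupies a whole
 * monitor line by itself, so no unrelated data shares the line. */
volatile uint32_t *alloc_monitor_flag(uint32_t largest)
{
    /* aligned_alloc needs a power-of-two alignment and a size that is a
     * multiple of that alignment; the line must also hold the flag. */
    assert(largest >= sizeof(uint32_t) && (largest & (largest - 1)) == 0);
    void *p = aligned_alloc(largest, largest);
    if (p != NULL)
        memset(p, 0, largest);
    return p;
}
```

A caller would first call `query_monitor_sizes()` and, if it succeeds, pass the `largest` field to `alloc_monitor_flag()`; the returned address is the one a MONITOR instruction would then be armed on.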
To set up the data structure correctly for MONITOR/MWAIT on multi-clustered
systems, interaction between processors, chipsets, and the BIOS is required (the system
coherence line size may depend on the chipset used in the system; the size could be
different from the processor’s monitor triggering area). The BIOS is responsible for
setting the correct value for system coherence line size using the
IA32_MONITOR_FILTER_LINE_SIZE MSR. Depending on the relative magnitude of
the size of the monitor triggering area versus the value written into the
IA32_MONITOR_FILTER_LINE_SIZE MSR, the smaller of the parameters will be
reported as the Smallest Monitor Line Size. The larger of the parameters will be
reported as the Largest Monitor Line Size.
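As a worked illustration of the reporting rule above, the following C sketch computes the two reported values from the two inputs. The function and type names are hypothetical, and the byte values in the usage note are illustrative only; they are not taken from any particular processor or chipset.

```c
#include <stdint.h>

/* The pair of values software would read back from CPUID. */
struct reported_line_sizes {
    uint32_t smallest;  /* Smallest Monitor Line Size */
    uint32_t largest;   /* Largest Monitor Line Size */
};

/* trigger_area: size of the processor's monitor triggering area, in bytes.
 * filter_line_size: value the BIOS wrote to IA32_MONITOR_FILTER_LINE_SIZE. */
struct reported_line_sizes reported_sizes(uint32_t trigger_area,
                                          uint32_t filter_line_size)
{
    struct reported_line_sizes r;
    r.smallest = trigger_area < filter_line_size ? trigger_area
                                                 : filter_line_size;
    r.largest  = trigger_area > filter_line_size ? trigger_area
                                                 : filter_line_size;
    return r;
}
```

For example, with a 64-byte triggering area and a 128-byte system coherence line size, `reported_sizes(64, 128)` yields a smallest size of 64 and a largest size of 128. When the two inputs are equal, both reported values are the same.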
8.10.6 Required Operating System Support
This section describes changes that must be made to an operating system to run on
processors supporting Intel Hyper-Threading Technology. It also describes
optimizations that can help an operating system make more efficient use of the logical
processors sharing execution resources. The required changes and suggested
optimizations are representative of the types of modifications that appear in Windows*
XP and Linux* kernel 2.4.0 operating systems for Intel processors supporting Intel
Hyper-Threading Technology. Additional optimizations for processors supporting
Intel Hyper-Threading Technology are described in the Intel® 64 and IA-32
Architectures Optimization Reference Manual.
8.10.6.1 Use the PAUSE Instruction in Spin-Wait Loops
Intel recommends that a PAUSE instruction be placed in all spin-wait loops that run
on Intel processors supporting Intel Hyper-Threading Technology and multi-core
processors.
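In C, this recommendation might look like the following minimal sketch. It assumes C11 atomics and the `_mm_pause()` intrinsic from `<immintrin.h>`, which emits a PAUSE instruction on x86 compilers; on other targets the macro falls back to a no-op. The names `demo_spinlock`, `cpu_relax`, `demo_spin_lock`, and `demo_spin_unlock` are illustrative and are not taken from any operating system.

```c
#include <stdatomic.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define cpu_relax() _mm_pause()   /* emits the PAUSE instruction */
#else
#define cpu_relax() ((void)0)     /* no PAUSE equivalent assumed here */
#endif

/* Illustrative lock: 0 = free, 1 = held. */
typedef struct {
    atomic_int locked;
} demo_spinlock;

void demo_spin_lock(demo_spinlock *l)
{
    for (;;) {
        /* Spin on a plain load while the lock is held, executing PAUSE
         * on every iteration of the wait loop. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed) != 0)
            cpu_relax();

        /* The lock looked free: try to take it with an atomic exchange.
         * If another thread won the race, go back to spinning. */
        if (atomic_exchange_explicit(&l->locked, 1,
                                     memory_order_acquire) == 0)
            return;
    }
}

void demo_spin_unlock(demo_spinlock *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

The wait loop reads the lock variable without writing to it and attempts the atomic exchange only once the lock appears free, so the PAUSE sits in the part of the routine that may execute many times.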
Software routines that use spin-wait loops include multiprocessor synchronization
primitives (spin-locks, semaphores, and mutex variables) and idle loops. Such
routines keep the processor core busy executing a load-compare-branch loop while a
thread waits for a resource to become available. Including a PAUSE instruction in such
a loop greatly improves efficiency (see Section 8.10.2, “PAUSE Instruction”). The
following routine gives an example of a spin-wait loop that uses a PAUSE instruction:
Spin_Lock:
CMP lockvar, 0 ;Check if lock is free