SYSTEM ARCHITECTURE OVERVIEW
introduced with the Pentium Pro processor). If any non-wake events are pending
during shutdown, they will be handled after the wake event from shutdown is
processed (for example, A20M# interrupts).
The LOCK prefix invokes a locked (atomic) read-modify-write operation when
modifying a memory operand. This mechanism is used to allow reliable communications
between processors in multiprocessor systems, as described below:
• In the Pentium processor and earlier IA-32 processors, the LOCK prefix causes
the processor to assert the LOCK# signal during the instruction. This always
causes an explicit bus lock to occur.
• In the Pentium 4, Intel Xeon, and P6 family processors, the locking operation is
handled with either a cache lock or bus lock. If a memory access is cacheable and
affects only a single cache line, a cache lock is invoked and the system bus and
the actual memory location in system memory are not locked during the
operation. Here, other Pentium 4, Intel Xeon, or P6 family processors on the bus
write back any modified data and invalidate their caches as necessary to
maintain system memory coherency. If the memory access is not cacheable
and/or it crosses a cache line boundary, the processor’s LOCK# signal is asserted
and the processor does not respond to requests for bus control during the locked
operation.
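
The cache-lock versus bus-lock rule in the second bullet can be sketched as a small decision function. This is an illustration only, not how the processor implements the choice: the names crosses_cache_line, expected_lock_kind, and lock_kind are hypothetical, and the 64-byte line size is an assumption (the real line size is model specific and should be queried on the target processor).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: 64-byte cache lines; the actual size is model specific. */
#define ASSUMED_CACHE_LINE_BYTES 64u

/* Hypothetical labels for the two locking mechanisms described above. */
typedef enum { LOCK_KIND_CACHE, LOCK_KIND_BUS } lock_kind;

/* Returns true when the bytes [addr, addr + size) span more than one line. */
static bool crosses_cache_line(uintptr_t addr, size_t size, size_t line_bytes)
{
    assert(size > 0);
    /* line_bytes must be a nonzero power of two. */
    assert(line_bytes != 0 && (line_bytes & (line_bytes - 1)) == 0);

    uintptr_t first_line = addr / line_bytes;
    uintptr_t last_line = (addr + size - 1) / line_bytes;
    return first_line != last_line;
}

/*
 * Applies the rule from the text: a cacheable access confined to a single
 * cache line uses a cache lock; an access that is not cacheable and/or
 * crosses a line boundary asserts LOCK# (bus lock).
 */
static lock_kind expected_lock_kind(uintptr_t addr, size_t size,
                                    bool cacheable, size_t line_bytes)
{
    if (!cacheable || crosses_cache_line(addr, size, line_bytes))
        return LOCK_KIND_BUS;
    return LOCK_KIND_CACHE;
}
```

For example, under the assumed 64-byte line size, a 4-byte cacheable operand at address 0x103E occupies bytes 0x103E through 0x1041 and so straddles the boundary at 0x1040, which the rule classifies as a bus lock. Keeping locked operands naturally aligned avoids this case.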
The RSM (return from SMM) instruction restores the processor (from a context
dump) to the state it was in prior to a system management mode (SMM) interrupt.
2.7.6 Reading Performance-Monitoring and Time-Stamp Counters
The RDPMC (read performance-monitoring counter) and RDTSC (read time-stamp
counter) instructions allow application programs to read the processor’s
performance-monitoring and time-stamp counters, respectively. Processors based on Intel
NetBurst microarchitecture have eighteen 40-bit performance-monitoring counters;
P6 family processors have two 40-bit counters. Intel Atom processors and most of
the processors based on the Intel Core microarchitecture support two types of
performance monitoring counters: two programmable performance counters similar
to those available in the P6 family, and three fixed-function performance monitoring
counters.
The programmable performance counters can support counting either the occurrence
or duration of events. Events that can be monitored on programmable counters
generally are model specific (except for architectural performance events
enumerated by CPUID leaf 0AH); they may include the number of instructions decoded,
interrupts received, or the number of cache loads. Individual counters can be set up
to monitor different events. Use the system instruction WRMSR to set up values in
IA32_PERFEVTSEL0/1 (for Intel Atom, Intel Core 2, Intel Core Duo, and Intel
Pentium M processors); in one of the 45 ESCRs and one of the 18 CCCR MSRs (for
Pentium 4 and Intel Xeon processors); or in the PerfEvtSel0 or the PerfEvtSel1 MSR
(for the P6 family processors). The RDPMC instruction loads the current count from
the selected counter into the EDX:EAX registers.
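
Because both instructions return their result split across EDX (high half) and EAX (low half), software has to reassemble the count. The following is a minimal sketch of that step in C. The helper names combine_edx_eax, pmc_value_40bit, and read_tsc are hypothetical, the inline-assembly path assumes a GCC-compatible compiler on an x86 target, and the 40-bit mask is a defensive assumption matching the 40-bit counters described above (wider counters would need a different width). RDPMC itself is left out of the executable path because it can fault in application code unless the operating system has enabled it through CR4.PCE.

```c
#include <stdint.h>

/* ASSUMPTION: 40-bit counters, as described for P6 and NetBurst above. */
#define PMC_WIDTH_BITS 40u

/* Reassembles a 64-bit value from the EDX (high) and EAX (low) halves. */
static uint64_t combine_edx_eax(uint32_t edx, uint32_t eax)
{
    return ((uint64_t)edx << 32) | (uint64_t)eax;
}

/* Keeps only the low PMC_WIDTH_BITS bits of an EDX:EAX counter reading. */
static uint64_t pmc_value_40bit(uint32_t edx, uint32_t eax)
{
    const uint64_t mask = (UINT64_C(1) << PMC_WIDTH_BITS) - 1;
    return combine_edx_eax(edx, eax) & mask;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/*
 * Reads the time-stamp counter with RDTSC (GCC-style inline assembly).
 * The operating system can restrict RDTSC to privileged code via CR4.TSD,
 * in which case this faults when run from an application.
 */
static uint64_t read_tsc(void)
{
    uint32_t eax, edx;
    __asm__ volatile("rdtsc" : "=a"(eax), "=d"(edx));
    return combine_edx_eax(edx, eax);
}
#endif
```

An RDPMC wrapper would follow the same pattern as read_tsc, with the counter index placed in ECX before the instruction executes.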