MULTIPLE-PROCESSOR MANAGEMENT
• Non-privileged serializing instructions — CPUID, IRET, and RSM.
When the processor serializes instruction execution, it ensures that all pending
memory transactions are completed (including writes stored in its store buffer)
before it executes the next instruction. Nothing can pass a serializing instruction and
a serializing instruction cannot pass any other instruction (read, write, instruction
fetch, or I/O). For example, CPUID can be executed at any privilege level to serialize
instruction execution with no effect on program flow, except that the EAX, EBX, ECX,
and EDX registers are modified.
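As a rough illustration of the CPUID case described above, the following C sketch wraps the instruction in a helper that user-mode code could call as a serialization point. The helper name serialize_with_cpuid and the use of GCC/Clang extended inline assembly are assumptions made for this example, not something this manual prescribes. The non-x86 branch exists only so the sketch still compiles on other targets.

```c
#include <stdint.h>

/* Hypothetical helper (name chosen for this sketch): executes CPUID leaf 0
 * for its serializing side effect. CPUID overwrites EAX, EBX, ECX, and EDX,
 * so all four registers are listed as outputs. The "memory" clobber also
 * stops the compiler from moving memory accesses across the instruction. */
uint32_t serialize_with_cpuid(void)
{
#if defined(__i386__) || defined(__x86_64__)
    uint32_t eax = 0;   /* input: CPUID leaf 0 */
    uint32_t ebx;
    uint32_t ecx = 0;   /* input: sub-leaf 0 */
    uint32_t edx;

    __asm__ volatile("cpuid"
                     : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
                     :
                     : "memory");

    (void)ebx;
    (void)ecx;
    (void)edx;
    return eax;         /* leaf 0 returns the highest basic leaf in EAX */
#else
    return 0;           /* assumption: no-op stand-in on non-x86 builds */
#endif
}
```

A caller that only needs the serialization point can ignore the return value; it is returned here only so the effect on EAX is observable.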
The following instructions are memory-ordering instructions, not serializing
instructions. They drain the data memory subsystem but do not affect the instruction
execution stream:
• Non-privileged memory-ordering instructions — SFENCE, LFENCE, and
MFENCE.
The SFENCE, LFENCE, and MFENCE instructions provide more granularity in controlling
the serialization of memory loads and stores (see Section 8.2.5, “Strengthening
or Weakening the Memory-Ordering Model”).
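The difference between these fences can be sketched with a small producer/consumer handoff in C. Everything named here (the mailbox structure, the store_fence and load_fence wrappers, and the two mailbox functions) is hypothetical and introduced only for illustration. The sketch assumes a GCC/Clang toolchain with the SSE2 intrinsics _mm_sfence() and _mm_lfence(), and the C11 fallback is an assumption added so it compiles on other targets.

```c
#include <stdint.h>

#if defined(__SSE2__)
#include <immintrin.h>   /* _mm_sfence(), _mm_lfence() */
static inline void store_fence(void) { _mm_sfence(); }   /* emits SFENCE */
static inline void load_fence(void)  { _mm_lfence(); }   /* emits LFENCE */
#else
#include <stdatomic.h>   /* assumption: portable stand-in for non-x86 builds */
static inline void store_fence(void) { atomic_thread_fence(memory_order_release); }
static inline void load_fence(void)  { atomic_thread_fence(memory_order_acquire); }
#endif

/* Hypothetical mailbox shared by one producer and one consumer.
 * volatile keeps the compiler from caching these fields or reordering
 * accesses to them relative to each other. */
struct mailbox {
    volatile uint32_t payload;
    volatile uint32_t ready;
};

/* Producer: write the payload, then raise the flag. */
void mailbox_publish(struct mailbox *m, uint32_t value)
{
    m->payload = value;
    store_fence();   /* stores before the fence are globally visible
                        before any store after it */
    m->ready = 1;
}

/* Consumer: returns 1 and fills *out if a payload was published, else 0. */
int mailbox_try_read(struct mailbox *m, uint32_t *out)
{
    if (!m->ready)
        return 0;
    load_fence();    /* the load of ready completes before the load of
                        payload is issued */
    *out = m->payload;
    return 1;
}
```

For ordinary write-back memory the processor already keeps stores ordered with stores and loads ordered with loads, so fences like these matter mainly for weakly ordered accesses such as non-temporal stores or write-combining memory. MFENCE would be the choice where both loads and stores must be ordered at one point.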
The following additional information is worth noting regarding serializing
instructions:
• The processor does not write back the contents of modified data in its data cache
to external memory when it serializes instruction execution. Software can force
modified data to be written back by executing the WBINVD instruction, which is a
serializing instruction. The time (in cycles) that WBINVD takes to complete
varies with the size of the cache hierarchy and other factors. As a consequence,
the use of the WBINVD instruction can have an impact on interrupt/event
response time.
• When an instruction is executed that enables or disables paging (that is, changes
the PG flag in control register CR0), the instruction should be followed by a jump
instruction. The target instruction of the jump instruction is fetched with the new
setting of the PG flag (that is, paging is enabled or disabled), but the jump
instruction itself is fetched with the previous setting. The Pentium 4, Intel Xeon,
and P6 family processors do not require the jump operation following the move to
register CR0 (because any use of the MOV instruction in a Pentium 4, Intel Xeon,
or P6 family processor to write to CR0 is completely serializing). However, to
maintain backward and forward compatibility with code written to run on other
IA-32 processors, it is recommended that the jump operation be performed.
• Whenever an instruction is executed to change the contents of CR3 while paging
is enabled, the next instruction is fetched using the translation tables that
correspond to the new value of CR3. Therefore the next instruction and the
sequentially following instructions should have a mapping based upon the new
value of CR3. (Global entries in the TLBs are not invalidated; see Section 4.10.3,
“Invalidation of TLBs and Paging-Structure Caches.”)
• The Pentium processor and more recent processor families use branch-prediction
techniques to improve performance by prefetching the destination of a branch
instruction before the branch instruction is executed. Consequently, instruction