MULTIPLE-PROCESSOR MANAGEMENT
have cached the same area of memory from simultaneously modifying data in that
area.
8.2 MEMORY ORDERING
The term memory ordering refers to the order in which the processor issues reads
(loads) and writes (stores) through the system bus to system memory. The Intel 64
and IA-32 architectures support several memory-ordering models depending on the
implementation of the architecture. For example, the Intel386 processor enforces
program ordering (generally referred to as strong ordering), where reads and
writes are issued on the system bus in the order they occur in the instruction stream
under all circumstances.
To allow performance optimization of instruction execution, the IA-32 architecture
allows departures from the strong-ordering model, called processor ordering, in
Pentium 4, Intel Xeon, and P6 family processors. These processor-ordering variations
(called here the memory-ordering model) allow performance-enhancing
operations such as allowing reads to go ahead of buffered writes. The goal of any of
these variations is to increase instruction execution speed while maintaining
memory coherency, even in multiple-processor systems.
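The effect of letting reads go ahead of buffered writes can be illustrated with a small single-threaded model. The sketch below is not processor code. It assumes a hypothetical two-processor system in which each processor has a one-entry store buffer, and all names (`cpu_t`, `store`, `load`, `drain`, `run_litmus`) are invented for illustration. Each processor writes 1 to its own flag and then reads the other processor's flag.

```c
#include <stdbool.h>

/* Hypothetical model: shared memory holds two flags, x and y. */
enum { X = 0, Y = 1, NUM_LOCATIONS = 2 };

typedef struct {
    bool pending;   /* true while a store is waiting in the buffer */
    int  addr;      /* location targeted by the buffered store */
    int  value;     /* value to be written */
} cpu_t;

/* A store is placed in the processor's store buffer, not in memory. */
static void store(cpu_t *cpu, int addr, int value) {
    cpu->pending = true;
    cpu->addr = addr;
    cpu->value = value;
}

/* A load first checks the processor's own buffer (store forwarding),
   then falls back to memory. It does not wait for the buffer to drain. */
static int load(const cpu_t *cpu, const int *memory, int addr) {
    if (cpu->pending && cpu->addr == addr)
        return cpu->value;
    return memory[addr];
}

/* Draining makes the buffered store visible in memory. */
static void drain(cpu_t *cpu, int *memory) {
    if (cpu->pending) {
        memory[cpu->addr] = cpu->value;
        cpu->pending = false;
    }
}

/* Runs:  P0: x = 1; r0 = y.    P1: y = 1; r1 = x.
   If drain_before_load is true, each store reaches memory before the
   following load (modeling strong ordering for this interleaving);
   otherwise the loads are serviced while the stores are still buffered. */
void run_litmus(bool drain_before_load, int *r0, int *r1) {
    int memory[NUM_LOCATIONS] = {0, 0};
    cpu_t p0 = {false, 0, 0};
    cpu_t p1 = {false, 0, 0};

    store(&p0, X, 1);
    store(&p1, Y, 1);
    if (drain_before_load) {
        drain(&p0, memory);
        drain(&p1, memory);
    }
    *r0 = load(&p0, memory, Y);
    *r1 = load(&p1, memory, X);
    drain(&p0, memory);
    drain(&p1, memory);
}
```

In this model, calling `run_litmus(false, &r0, &r1)` leaves both `r0` and `r1` equal to 0, because each load is serviced from memory before the other processor's buffered store drains. Under strong ordering at least one processor would read 1. With `drain_before_load` set to true, this particular interleaving yields 1 for both.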
Section 8.2.1 and Section 8.2.2 describe the memory ordering implemented by
Intel486, Pentium, Intel Core 2 Duo, Intel Atom, Intel Core Duo, Pentium 4, Intel
Xeon, and P6 family processors. Section 8.2.3 gives examples illustrating the
behavior of the memory-ordering model on IA-32 and Intel 64 processors. Section
8.2.4 considers the special treatment of stores for string operations, and Section
8.2.5 discusses how memory-ordering behavior may be modified through the use of
specific instructions.
8.2.1 Memory Ordering in the Intel® Pentium® and Intel486™ Processors
The Pentium and Intel486 processors follow the processor-ordered memory model;
however, they operate as strongly-ordered processors under most circumstances.
Reads and writes always appear in programmed order at the system bus, with one
exception in which processor ordering is exhibited: read misses are
permitted to go ahead of buffered writes on the system bus when all the buffered
writes are cache hits and, therefore, are not directed to the same address being
accessed by the read miss.
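The condition under which a read miss may go ahead of buffered writes can be written as a small predicate. This is an illustrative sketch only, not a description of the hardware implementation: the `buffered_write_t` type and the `read_miss_may_bypass` function are hypothetical names, and each buffered write is assumed to carry a flag recording whether it hit the cache.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for one entry in the write buffer. */
typedef struct {
    unsigned long addr;  /* target address of the buffered write */
    bool cache_hit;      /* true if the write hit in the cache */
} buffered_write_t;

/* Returns true if a read miss may be issued ahead of the buffered writes:
   every buffered write must be a cache hit. A write that hits the cache
   cannot target the line the read missed on, so no buffered write can
   alias the read. An empty buffer trivially allows the read to proceed. */
bool read_miss_may_bypass(const buffered_write_t *writes, size_t count) {
    for (size_t i = 0; i < count; i++) {
        if (!writes[i].cache_hit)
            return false;  /* a write miss keeps the read in program order */
    }
    return true;
}
```

The address field is not consulted: in this model the cache-hit flag alone decides the outcome, mirroring the reasoning that a write hit and a read miss cannot refer to the same address.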
In the case of I/O operations, both reads and writes always appear in programmed
order.
Software intended to operate correctly in processor-ordered processors (such as the
Pentium 4, Intel Xeon, and P6 family processors) should not depend on the relatively
strong ordering of the Pentium or Intel486 processors. Instead, it should ensure that
accesses to shared variables that are intended to control concurrent execution