Vol. 3 19-41
ARCHITECTURE COMPATIBILITY
with the exception of “fast string” store operations (see Section 8.2.4, “Out-of-Order
Stores For String Operations”).
The Pentium processor has two store buffers, one corresponding to each of the pipelines.
Writes in these buffers are always written to memory in the order they were
generated by the processor core.
Note that only memory writes are buffered; I/O writes are not. The
Pentium 4, Intel Xeon, P6 family, Pentium, and Intel486 processors do not synchronize
the completion of memory writes on the bus and instruction execution after a
write. An I/O, locked, or serializing instruction needs to be executed to synchronize
writes with the next instruction (see Section 8.3, “Serializing Instructions”).
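The buffering behavior described above can be pictured with a small software model. This is a minimal sketch, not a description of the hardware: the class `StoreBufferModel`, its method names, and the `bus_log` list are all hypothetical, and the model simply assumes that an I/O write or a serializing instruction drains every buffered memory write before proceeding.

```python
from collections import deque


class StoreBufferModel:
    """Toy model (assumption): memory writes are buffered in FIFO order,
    I/O writes are not buffered and act as a synchronization point."""

    def __init__(self):
        self.pending = deque()  # buffered memory writes, oldest first
        self.bus_log = []       # cycles in the order they reach the bus

    def mem_write(self, addr, value):
        # The write is only queued; execution continues immediately.
        self.pending.append((addr, value))

    def drain(self):
        # Buffered writes leave the buffer in the order they were generated.
        while self.pending:
            addr, value = self.pending.popleft()
            self.bus_log.append(("mem", addr, value))

    def io_write(self, port, value):
        # Not buffered: pending memory writes are drained first.
        self.drain()
        self.bus_log.append(("io", port, value))

    def serialize(self):
        # Stand-in for a serializing instruction.
        self.drain()


cpu = StoreBufferModel()
cpu.mem_write(0x1000, 1)
cpu.mem_write(0x1004, 2)
pending_before_io = len(cpu.pending)  # both writes are still buffered here
cpu.io_write(0x80, 0xAA)              # forces the buffered writes out first
```

In this model the two memory writes are still pending after the instructions that issued them have completed, and they reach the bus, in program order, only when the I/O write forces the buffer to drain.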
The Pentium 4, Intel Xeon, and P6 family processors use processor ordering to maintain
consistency in the order that data is read (loaded) and written (stored) in a
program and the order the processor actually carries out the reads and writes. With
this type of ordering, reads can be carried out speculatively and in any order, reads
can pass buffered writes, and writes to memory are always carried out in program
order. (See Section 8.2, “Memory Ordering,” for more information about processor
ordering.) The Pentium III processor introduced the SFENCE instruction to serialize writes
and make them globally visible. Memory ordering issues can arise between a
producer and a consumer of data. The SFENCE instruction provides a performance-efficient
way of ensuring ordering between routines that produce weakly-ordered
results and routines that consume this data.
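The producer and consumer hazard can be illustrated with a toy model. This is only a sketch under stated assumptions: `WeakStoreModel`, `produce`, and `consume` are hypothetical names, and the model deliberately shows one permitted worst case in which weakly-ordered stores stay invisible until a fence forces them out, while an ordinary store to a flag becomes visible at once.

```python
class WeakStoreModel:
    """Toy model (assumption): weakly-ordered stores sit in a pending set
    until a fence makes them globally visible."""

    def __init__(self):
        self.visible = {"data": 0, "ready": 0}  # globally visible memory
        self.weak_pending = {}                  # weak stores not yet visible

    def weak_store(self, addr, value):
        # Weakly-ordered store: not yet visible to other agents.
        self.weak_pending[addr] = value

    def store(self, addr, value):
        # Ordinary store: modeled as becoming visible immediately.
        self.visible[addr] = value

    def sfence(self):
        # All earlier stores become visible before any later store.
        self.visible.update(self.weak_pending)
        self.weak_pending.clear()


def produce(mem, use_fence):
    mem.weak_store("data", 42)  # weakly-ordered result
    if use_fence:
        mem.sfence()            # order the data ahead of the flag
    mem.store("ready", 1)       # signal the consumer


def consume(mem):
    # Returns the data the consumer sees once the flag is set, else None.
    if mem.visible["ready"] == 1:
        return mem.visible["data"]
    return None


unfenced = WeakStoreModel()
produce(unfenced, use_fence=False)
stale_result = consume(unfenced)   # flag is set, but the data is stale

fenced = WeakStoreModel()
produce(fenced, use_fence=True)
fenced_result = consume(fenced)    # the data is visible before the flag
```

Without the fence, the consumer in this model sees the flag set but still reads the old value. With the fence between the data store and the flag store, the consumer sees the produced value.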
No re-ordering of reads occurs on the Pentium processor, except under the condition
noted in Section 8.2.1, “Memory Ordering in the Intel® Pentium® and Intel486™
Processors,” and in the following paragraph describing the Intel486 processor.
Specifically, the store buffers are flushed before the IN instruction is executed. No
reads (as a result of a cache miss) are reordered around previously generated writes
sitting in the store buffers. This implies that the store buffers will be
flushed or emptied before a subsequent bus cycle is run on the external bus.
On both the Intel486 and Pentium processors, under certain conditions, a memory
read will go onto the external bus before the pending memory writes in the buffer
even though the writes occurred earlier in the program execution. A memory read
will only be reordered in front of all writes pending in the buffers if all writes pending
in the buffers are cache hits and the read is a cache miss. Under these conditions, the
Intel486 and Pentium processors will not read from an external memory location that
needs to be updated by one of the pending writes.
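The reordering condition above can be written as a simple predicate. The sketch below is illustrative only: the function names, the 16-byte `LINE_SIZE`, and the representation of the cache as a set of line numbers are assumptions made for the model, not details taken from the processors.

```python
LINE_SIZE = 16  # bytes per cache line; illustrative value for this model


def line_of(addr):
    """Cache line number that contains the given byte address."""
    return addr // LINE_SIZE


def read_may_bypass(cached_lines, pending_write_addrs, read_addr):
    """True when the read may go onto the bus ahead of the pending writes:
    every pending write hits the cache and the read misses it."""
    if not pending_write_addrs:
        return False  # nothing is pending, so there is nothing to bypass
    writes_all_hit = all(line_of(a) in cached_lines for a in pending_write_addrs)
    read_misses = line_of(read_addr) not in cached_lines
    return writes_all_hit and read_misses


cached = {line_of(0x100), line_of(0x200)}

bypass_ok = read_may_bypass(cached, [0x100, 0x204], 0x300)     # hits, then a miss
blocked_write = read_may_bypass(cached, [0x100, 0x400], 0x300)  # one write misses
blocked_read = read_may_bypass(cached, [0x100], 0x208)          # the read hits
```

In this model, when the predicate holds, the read address cannot share a cache line with any pending write, because every pending write targets a cached line and the read targets an uncached one. That is consistent with the statement that the read never targets a location still waiting to be updated by a pending write.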
During a locked bus cycle, the Intel486 processor always accesses external
memory; it never looks for the location in the on-chip cache. All data pending in
the Intel486 processor's store buffers will be written to memory before a locked cycle
is allowed to proceed to the external bus. Thus, the locked bus cycle can be used for
eliminating the possibility of reordering read cycles on the Intel486 processor. The
Pentium processor does check its cache on a read-modify-write access and, if the
cache line has been modified, writes the contents back to memory before locking the
bus. The P6 family processors write to their cache on a read-modify-write operation
(if the access does not split across a cache line) and do not write back to system