Compaq ECQD2KCTE Laptop User Manual


 
System Architecture and Programming Implications
The MB on the first processor guarantees that the write to CSR_A precedes the write to flag in
memory, as perceived on other processors. (The MB does not guarantee that the write to
CSR_A has completed. See Section 5.6.4.7 for a discussion of how a processor can guarantee
that a write to an I/O device has completed at that device.) The MB on the second processor
guarantees that the write to CSR_B will reach the I/O device after the write to CSR_A.
5.6.5 Implications for Hardware
The coherency point for physical address x is the place in the memory subsystem at which
accesses to x are ordered. It may be at a main memory board, or at a cache containing x
exclusively, or at the point of winning a common bus arbitration.
The coherency point for x may move with time, as exclusive access to x migrates between
main memory and various caches.
MB and CALL_PAL IMB force all preceding writes to at least reach their respective
coherency points. This does not mean that main-memory writes have been done, just that the order
of the eventual writes is committed. For example, on the XMI with retry, this means getting the
writes acknowledged as received with good parity at the inputs to memory board queues; the
actual RAM write happens later.
MB and CALL_PAL IMB also force all queued cache invalidates to be delivered to the local
caches before starting any subsequent reads (that may otherwise cache hit on stale data) or
writes (that may otherwise write the cache, only to have the write effectively overwritten by a
late-delivered invalidate).
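The effect of delivering queued invalidates at an MB can be illustrated with a toy model. This is only a sketch, not a description of any real implementation: the class CachedProcessor, its methods, and the dictionary standing in for main memory are all hypothetical names introduced for this example.

```python
from collections import deque


class CachedProcessor:
    """Toy processor with a local cache and a queue of pending invalidates."""

    def __init__(self, memory):
        self.memory = memory                # shared dict: address -> value
        self.cache = {}                     # local cached copies
        self.pending_invalidates = deque()  # queued, not yet applied

    def read(self, addr):
        if addr not in self.cache:          # miss: fetch from memory
            self.cache[addr] = self.memory[addr]
        return self.cache[addr]

    def queue_invalidate(self, addr):
        # Another processor wrote addr; the invalidate is queued locally.
        self.pending_invalidates.append(addr)

    def mb(self):
        # Deliver every queued invalidate before any later read or write.
        while self.pending_invalidates:
            self.cache.pop(self.pending_invalidates.popleft(), None)


memory = {0x80: 1}
cpu = CachedProcessor(memory)
cpu.read(0x80)              # caches the value 1
memory[0x80] = 2            # write performed by another processor
cpu.queue_invalidate(0x80)  # invalidate is queued, not yet delivered
stale = cpu.read(0x80)      # still hits on the stale cached copy
cpu.mb()                    # queued invalidates are delivered here
fresh = cpu.read(0x80)      # misses and fetches the new value
```

In this model, the read issued before mb() may return the old value, while the read issued after it cannot.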
WMB ensures that the final order of writes to memory-like regions is committed and that the
final order of writes to non-memory-like regions is committed. This does not imply that the
final order of writes to memory-like regions relative to writes to non-memory-like regions is
committed. It also prevents writes that precede the WMB from merging with writes that
follow the WMB. For example, an implementation with a write buffer might implement WMB by
closing all valid write buffer entries from further merging and then draining the write buffer
entries in order.
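The write-buffer approach described above can be sketched as a small simulation. The WriteBuffer class, its method names, and the 32-byte merge granularity are assumptions made for illustration only. The drain is kept as a separate step here so that the effect of closing entries is visible.

```python
class WriteBuffer:
    """Toy write buffer: writes merge by block address until an entry is closed."""

    def __init__(self, block_size=32):
        self.block_size = block_size  # assumed merge granularity in bytes
        self.entries = []             # oldest first; each entry is a dict
        self.memory_log = []          # order in which entries were drained

    def write(self, addr, value):
        block = addr // self.block_size
        # Merge only into an entry for this block that is still open.
        for entry in self.entries:
            if entry["block"] == block and entry["open"]:
                entry["data"][addr] = value
                return
        self.entries.append({"block": block, "open": True, "data": {addr: value}})

    def wmb(self):
        # Close every valid entry so that later writes cannot merge into it.
        for entry in self.entries:
            entry["open"] = False

    def drain(self):
        # Retire entries oldest first, preserving the committed order.
        while self.entries:
            entry = self.entries.pop(0)
            self.memory_log.append((entry["block"], dict(entry["data"])))


wb = WriteBuffer()
wb.write(0x100, 1)
wb.write(0x104, 2)  # merges with the write to 0x100 (same 32-byte block)
wb.wmb()
wb.write(0x108, 3)  # same block, but does not merge across the WMB
wb.drain()
```

After the drain, memory_log holds two separate entries for the same block: the merged pre-WMB writes first, then the post-WMB write.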
Implementations may allow reads of x to hit (by physical address) on pending writes in a write
buffer, even before the writes to x reach the coherency point for x. If this is done, it is still true
that no earlier value of x may subsequently be delivered to the processor that took the hit on the
write buffer value.
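A read that hits on a pending write can be modeled as follows. This is a minimal sketch under stated assumptions: the Processor class, its methods, and the dictionary used as the coherency point are hypothetical, and the buffer drains strictly in issue order.

```python
class Processor:
    """Toy processor whose reads may hit on its own pending writes."""

    def __init__(self, memory):
        self.memory = memory    # shared dict: physical address -> value
        self.write_buffer = []  # pending (addr, value) pairs, oldest first

    def write(self, addr, value):
        self.write_buffer.append((addr, value))

    def read(self, addr):
        # Search newest first so a hit returns the latest pending write.
        for pending_addr, value in reversed(self.write_buffer):
            if pending_addr == addr:
                return value
        return self.memory.get(addr, 0)

    def drain(self):
        # Writes reach the coherency point in the order they were issued.
        for addr, value in self.write_buffer:
            self.memory[addr] = value
        self.write_buffer.clear()


memory = {0x40: 7}
cpu = Processor(memory)
cpu.write(0x40, 8)
cpu.write(0x40, 9)
first = cpu.read(0x40)      # hits on the write buffer
other_view = memory[0x40]   # the coherency point still holds the old value
cpu.drain()
second = cpu.read(0x40)     # now supplied by memory
```

Once the processor has seen the newest pending value, neither the older pending value nor the original memory value is ever returned to it, even though other observers may still see the old value until the buffer drains.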
Virtual data caches are allowed to deliver data before doing address translation, but only if
there cannot be a pending write under a synonym virtual address. Lack of a write-buffer match
on untranslated address bits is sufficient to guarantee this.
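The reasoning is that synonyms map to the same physical address and therefore share the address bits that translation leaves unchanged, so a mismatch on those bits rules out a pending write to a synonym. A sketch of such a check follows. It assumes 8 KB pages and a write buffer that tracks 32-byte blocks; both constants and both function names are illustrative, not taken from any implementation.

```python
PAGE_OFFSET_BITS = 13  # assumption: 8 KB pages
BLOCK_BITS = 5         # assumption: write buffer tracks 32-byte blocks


def untranslated_tag(addr):
    """Return the block index within the page, which translation does not change."""
    return (addr & ((1 << PAGE_OFFSET_BITS) - 1)) >> BLOCK_BITS


def may_deliver_before_translation(vaddr, pending_write_addrs):
    """True if no pending write matches vaddr on the untranslated bits."""
    target = untranslated_tag(vaddr)
    return all(untranslated_tag(w) != target for w in pending_write_addrs)


pending = [0x12340]                                      # one pending write
blocked = may_deliver_before_translation(0x9A340, pending)  # offsets match
allowed = may_deliver_before_translation(0x9A800, pending)  # offsets differ
```

The check is conservative: a match on the untranslated bits does not prove that a synonym exists, it only means the cache must wait for translation before delivering data.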
Virtual data caches must invalidate or otherwise become coherent with the new value
whenever a PALcode routine is executed that affects the validity, fault behavior, protection
behavior, or virtual-to-physical mapping specified for one or more pages. Becoming coherent
can be delayed until the next subsequent MB instruction or TB fill (using the new mapping) if
the implementation of the PALcode routine always forces a subsequent TB fill.