Compaq ECQD2KCTE Laptop User Manual


 
System Architecture and Programming Implications 5–25
MB [1] ensures that the writes done to save the state of the current process happen before
the ownership is passed.
MB [2] ensures that the reads done to load the state of the new process happen after the
ownership is picked up and hence are reliably the values written by the processor saving
the old state. Leaving this MB out makes the code fail if an old value of the context
remains in the second processor’s cache and the invalidates caused by the writes on the
first processor are not delivered soon enough.
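The ordering role of the two barriers can be illustrated in portable C. The following is a minimal sketch, not the architecture's own sequence: it assumes C11 atomics, with a release fence standing in for MB [1] and an acquire fence standing in for MB [2]. The names saved_context, save_and_release, acquire_and_load, and OWNER_NONE are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical owner value meaning "no processor owns this process". */
enum { OWNER_NONE = -1 };

/* Hypothetical saved-context record; real context holds far more state. */
struct saved_context {
    uint64_t pc;
    uint64_t sp;
    atomic_int owner;   /* CPU number of the owner, or OWNER_NONE */
};

/* First processor: save the state, then pass ownership. */
static void save_and_release(struct saved_context *ctx,
                             uint64_t pc, uint64_t sp)
{
    ctx->pc = pc;
    ctx->sp = sp;
    /* Stands in for MB [1]: the saves above are ordered before the
     * store below that gives up ownership. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ctx->owner, OWNER_NONE, memory_order_relaxed);
}

/* Second processor: pick up ownership, then load the state.
 * Returns false if the process is still owned by another processor. */
static bool acquire_and_load(struct saved_context *ctx, int cpu,
                             uint64_t *pc, uint64_t *sp)
{
    int expected = OWNER_NONE;
    if (!atomic_compare_exchange_strong_explicit(&ctx->owner, &expected, cpu,
                                                 memory_order_relaxed,
                                                 memory_order_relaxed))
        return false;
    /* Stands in for MB [2]: the loads below are ordered after ownership
     * is picked up, so they observe the values saved by the first
     * processor rather than stale cached copies. */
    atomic_thread_fence(memory_order_acquire);
    *pc = ctx->pc;
    *sp = ctx->sp;
    return true;
}
```

In this sketch, removing the acquire fence would permit the loads of pc and sp to be satisfied before the ownership check completes, which corresponds to the failure described above for a missing MB [2].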
The TB on the second processor must be made coherent with any write to the page tables
that may have occurred on the first processor just before the save of the process state. This
must be done with a series of TB invalidate instructions to remove any nonglobal page
mapping for this process, or by assigning an ASN that is unused on the second processor to
the process. One of these actions must occur sometime before starting execution of the
code for the new process that accesses memory (instruction or data) that is not common to
all processes. A common method is to assign a new ASN after gaining ownership of the
new process and before loading its context, which includes its ASN.
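One way to obtain an ASN that is unused on the second processor is a per-processor counter, sketched below in C. This is an illustration under stated assumptions, not a required implementation: MAX_ASN, cpu_asn_state, invalidate_nonglobal_tb, and assign_fresh_asn are hypothetical names, the ASN limit is implementation specific, and the TB invalidate is simulated by a counter rather than performed by real instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical largest ASN; the real limit is implementation specific. */
#define MAX_ASN 255u

/* Hypothetical per-processor bookkeeping for ASN assignment. */
struct cpu_asn_state {
    uint32_t next_asn;     /* lowest ASN not handed out since the last flush */
    unsigned tb_flushes;   /* number of simulated TB invalidations */
};

/* Stand-in for the TB invalidate instructions that remove all nonglobal
 * mappings on this processor; here it only counts the event. */
static void invalidate_nonglobal_tb(struct cpu_asn_state *cpu)
{
    cpu->tb_flushes++;
}

/* Return an ASN that has no stale TB entries on this processor.
 * Intended to be called after gaining ownership of the new process
 * and before loading its context. */
static uint32_t assign_fresh_asn(struct cpu_asn_state *cpu)
{
    if (cpu->next_asn > MAX_ASN) {
        /* Every ASN has been used: fall back to invalidating the
         * nonglobal TB entries, after which all ASNs are clean again. */
        invalidate_nonglobal_tb(cpu);
        cpu->next_asn = 0;
    }
    return cpu->next_asn++;
}
```

A production system would also have to reassign ASNs to other processes after the counter wraps, for example by tracking a generation number; that bookkeeping is omitted here.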
The D-cache on the second processor must be made coherent with any write to the
D-stream that may have occurred on the first processor just before the save of process
state. This is ensured by MB [2] and does not require any additional instructions.
The I-cache on the second processor must be made coherent with any write to the I-stream
that may have occurred on the first processor just before the save of process state. This can
be done with a CALL_PAL IMB sometime before the execution of any code that is not
common to all processes. More commonly, this can be done by forcing a TB miss (via the
new ASN or via TB invalidate instructions) and using the TB-fill rule (see Section 5.6.4.3).
This latter approach does not require any additional instruction.
Combining all these considerations gives the following sequence (on a single processor, the
barriers are not needed):