4–14 Cache and External Interfaces
21264/EV68A Hardware Reference Manual
1. When the Mbox requests a Dcache fill, the Cbox uses the CTAG array entries to determine
whether the Dcache already contains the requested physical address in another virtually
indexed Dcache line. If it does, the Cbox invalidates that cache line, first writing
the data back to the Bcache if the line was in the modified state. The Cbox also checks
whether the Dcache contains an address that differs from the requested address but
maps to the same Bcache line. If it does, that Dcache line is evicted in order to keep
the Dcache a subset of the Bcache.
2. When the Ibox requests an Icache fill, the Cbox uses the CTAG array entries to determine
whether the Dcache contains the requested physical address in the modified state. If it
does, the Cbox forces the line to be written back to the Bcache before servicing the
Icache fill request. The Cbox also checks whether the Dcache contains an address
that differs from the requested address but maps to the same Bcache line. In
this case, the Istream request misses the Bcache, and the Cbox services
the request by launching a noncached Fetch command to the system port
without placing the Istream block into the Bcache. This mechanism allows the
21264/EV68A to use a cache-resident lock flag for LDx_L/STx_C instructions.
3. The Cbox uses the CTAG array entries to determine whether probe addresses are held in
the Dcache without interrupting load/store instruction processing in the processor
core.
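
The Dcache fill checks in item 1 can be sketched as a small software model. This is an illustration only, not the Cbox implementation: the array sizes, the direct-mapped Bcache indexing, and the names `ctag_entry`, `fill_actions`, `bcache_index`, and `ctag_check_dcache_fill` are all assumptions made for the example. The sketch also assumes that a modified line evicted because of a Bcache conflict is written back first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy sizes chosen for illustration; they are not the real array geometry. */
#define DCACHE_LINES 8u   /* Dcache lines tracked by the CTAG model */
#define BCACHE_LINES 4u   /* direct-mapped Bcache lines in this model */
#define BLOCK_SHIFT  6u   /* 64-byte blocks */

typedef struct {
    bool     valid;
    bool     modified;
    uint64_t phys_block;  /* physical address >> BLOCK_SHIFT */
} ctag_entry;

typedef struct {
    unsigned invalidated;   /* lines removed from the Dcache */
    unsigned written_back;  /* modified lines written back to the Bcache first */
} fill_actions;

/* Hypothetical Bcache index function: low bits of the block number. */
static unsigned bcache_index(uint64_t phys_block)
{
    return (unsigned)(phys_block % BCACHE_LINES);
}

/* Model of the checks made before filling req_pa into Dcache slot req_slot. */
fill_actions ctag_check_dcache_fill(ctag_entry ctag[DCACHE_LINES],
                                    uint64_t req_pa, size_t req_slot)
{
    fill_actions act = {0, 0};
    uint64_t req_block = req_pa >> BLOCK_SHIFT;

    assert(req_slot < DCACHE_LINES);

    for (size_t i = 0; i < DCACHE_LINES; i++) {
        if (!ctag[i].valid)
            continue;

        /* Same physical block already held under a different virtual index. */
        bool synonym = ctag[i].phys_block == req_block && i != req_slot;

        /* Different block that maps to the same Bcache line. */
        bool conflict = ctag[i].phys_block != req_block &&
                        bcache_index(ctag[i].phys_block) == bcache_index(req_block);

        if (synonym || conflict) {
            if (ctag[i].modified)
                act.written_back++;   /* write data back before dropping the line */
            ctag[i].valid = false;
            ctag[i].modified = false;
            act.invalidated++;
        }
    }
    return act;
}
```

In this model, both the synonym case and the Bcache-conflict case end with the line leaving the Dcache, which is what keeps the modeled Dcache a subset of the modeled Bcache.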
4.6 Lock Mechanism
The 21264/EV68A does not contain a dedicated lock register, nor are system components
required to contain one.
When a load-lock (LDx_L) instruction executes, data is accessed from the Dcache or
Bcache. If there is a cache miss, data is accessed from memory with a RdBlk command.
The associated cache line is filled into the Dcache in the clean state if it is not
already there.
When the store-conditional (STx_C) instruction executes, it is allowed to succeed if its
associated cache line is still present in the Dcache and can be made writable; otherwise,
it fails.
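
Because a store-conditional can fail, software normally wraps the load-lock/store-conditional pair in a retry loop. The sketch below shows that pattern in portable C11 rather than Alpha assembly. It assumes a compiler that lowers `atomic_compare_exchange_weak` to a load-locked/store-conditional sequence on such machines; the C standard permits the weak form to fail spuriously for exactly that class of hardware. The function name `atomic_increment` is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Atomically add 1 to *counter and return the new value. */
long atomic_increment(_Atomic long *counter)
{
    assert(counter != NULL);

    long old = atomic_load(counter);

    /* On failure, 'old' is refreshed with the current value and the
     * update is attempted again, mirroring a retry after a failed STx_C. */
    while (!atomic_compare_exchange_weak(counter, &old, old + 1)) {
        /* retry */
    }
    return old + 1;
}
```

The loop body is intentionally empty: all the work happens in the compare-and-exchange, and the only response to a failure is to try again with the refreshed value.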
This algorithm works because a write to the cache line by another agent in the system
between the load-lock and the store-conditional would make the cache
line invalid. This mechanism’s coherence is based on the following four items:
1. LDx_L instructions are processed in order relative to the associated STx_C instructions.
2. Once a block is locked by way of an LDx_L instruction, no internal agent can evict
the block from the Dcache as a side effect of its processing.
3. Any external agent that intends to update the contents of the stored block must use
an invalidating probe command to inform the 21264/EV68A.
4. The system is the only agent with sufficient information to manage fairness
and liveness. However, to enable these tasks, the 21264/EV68A generates
external commands only for nonspeculative STx_C instructions and, once given a
success indication from the system, must faithfully update the Dcache with the STx_C
value.
The system is entirely responsible for item three. The 21264/EV68A plays an
active role in items one, two, and four.
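
The interaction between the load-lock, an invalidating probe, and the store-conditional can be sketched with a one-line cache model. This is a simplified illustration, not the hardware algorithm: it tracks a single Dcache line, assumes a present line can always be made writable, and ignores the data an external agent would receive when probing a modified line. The names `dcache_line`, `load_locked`, `invalidating_probe`, and `store_conditional` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { LINE_INVALID, LINE_CLEAN, LINE_DIRTY } line_state;

typedef struct {
    line_state state;
    uint64_t   block;  /* physical block held by this Dcache line */
    uint64_t   data;
} dcache_line;

/* LDx_L: fill the line in the clean state if the block is not already present. */
uint64_t load_locked(dcache_line *line, uint64_t block, uint64_t memory_value)
{
    assert(line != NULL);
    if (line->state == LINE_INVALID || line->block != block) {
        line->state = LINE_CLEAN;
        line->block = block;
        line->data  = memory_value;
    }
    return line->data;
}

/* An external agent that intends to write the block sends an invalidating probe. */
void invalidating_probe(dcache_line *line, uint64_t block)
{
    assert(line != NULL);
    if (line->state != LINE_INVALID && line->block == block)
        line->state = LINE_INVALID;
}

/* STx_C: succeeds only if the line is still present in the Dcache. */
bool store_conditional(dcache_line *line, uint64_t block, uint64_t value)
{
    assert(line != NULL);
    if (line->state == LINE_INVALID || line->block != block)
        return false;          /* line was lost since the LDx_L: fail */
    line->data  = value;
    line->state = LINE_DIRTY;  /* assumed: the line could be made writable */
    return true;
}
```

In this model, a `load_locked` followed directly by `store_conditional` succeeds, while an `invalidating_probe` for the same block between the two makes the store fail and leaves the data unchanged. No separate lock register appears anywhere: the presence of the line in the cache is the lock flag.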