21264/EV68A Hardware Reference Manual
Lock Mechanism
4.6.1 In-Order Processing of LDx_L/STx_C Instructions
The 21264/EV68A uses the stWait logic in the IQ to ensure that LDx_L/STx_C pairs
are issued in order. The stWait logic treats an LDx_L instruction like an STx instruction.
STx_C instructions are always loaded into the IQ with their associated stWait bit set.
Thus, a STx_C instruction is not issued until the older LDx_L is out of the IQ.
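The ordering rule above can be illustrated with a small software model. This is only a sketch under stated assumptions: the `IQEntry` class, the `age` field, and the `enqueue` and `can_issue` helpers are hypothetical names invented for illustration, and the model captures only the one rule stated in the text (a STx_C waits for any older LDx_L to leave the IQ), not the full stWait logic.

```python
from dataclasses import dataclass

# Alpha load-locked and store-conditional mnemonics (longword and quadword).
LOAD_LOCKED = {"LDL_L", "LDQ_L"}
STORE_CONDITIONAL = {"STL_C", "STQ_C"}


@dataclass
class IQEntry:
    op: str        # instruction mnemonic
    age: int       # program-order position; smaller means older (assumption)
    st_wait: bool  # hypothetical stand-in for the stWait bit


def enqueue(iq, op, age):
    """Load an instruction into the toy IQ; a STx_C always gets st_wait set."""
    iq.append(IQEntry(op, age, st_wait=op in STORE_CONDITIONAL))


def can_issue(iq, entry):
    """An entry with st_wait set is held while an older LDx_L is still queued."""
    if not entry.st_wait:
        return True
    return not any(e.op in LOAD_LOCKED and e.age < entry.age for e in iq)
```

In this model, a queue holding LDQ_L followed by STQ_C lets only the LDQ_L issue; once the LDQ_L entry is removed, `can_issue` returns true for the STQ_C.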
4.6.2 Internal Eviction of LDx_L Blocks
The 21264/EV68A prevents the eviction of cache blocks in the Dcache due to either of
the following references:
• Istream references with a Bcache index that matches the Dcache block and a
  Bcache tag that mismatches the Dcache block.
  To avoid evictions of LDx_L blocks, Istream references that match the index of a
  block in the Dcache are converted to noncached references.
• LDx or STx references with a Dcache index that matches the block.
  In the Alpha architecture, Dstream references between a LDx_L/STx_C pair force
  the value of the STx_C success flag to be UNPREDICTABLE. The 21264/EV68A
  forces all STx_C instructions that interrupt an LDx_L/STx_C pair to fail in
  program order.
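The Istream case above amounts to an index-match, tag-mismatch test against the locked block. The following is a minimal sketch of that test, assuming a direct-mapped Bcache with 64-byte blocks and a 4 MB capacity. The constants and all function names are illustrative assumptions, not values or logic taken from this section.

```python
# Assumed geometry for illustration only.
BLOCK_BYTES = 64
BCACHE_BYTES = 4 * 1024 * 1024
BCACHE_BLOCKS = BCACHE_BYTES // BLOCK_BYTES


def bcache_index(pa):
    """Index of the direct-mapped Bcache entry that physical address pa maps to."""
    return (pa // BLOCK_BYTES) % BCACHE_BLOCKS


def bcache_tag(pa):
    """Address bits above the index, which distinguish blocks sharing an index."""
    return pa // BCACHE_BYTES


def istream_conflicts_with_locked_block(locked_pa, istream_pa):
    """True when an Istream fill would displace the locked block:
    same Bcache index, different Bcache tag."""
    same_index = bcache_index(locked_pa) == bcache_index(istream_pa)
    same_tag = bcache_tag(locked_pa) == bcache_tag(istream_pa)
    return same_index and not same_tag
```

Under these assumptions, an instruction fetch one Bcache-size away from the locked address conflicts, while a fetch from the neighboring 64-byte block does not. In the model, a conflicting fetch is the kind of reference the text says is converted to a noncached reference.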
There should be no Dstream references between LDx_L/STx_C pairs; however, the
out-of-order nature of the 21264/EV68A can introduce Dstream references between
LDx_L/STx_C pairs. To prevent load or store instructions older than the LDx_L
from evicting the LDx_L cache block, the Mbox invokes a replay trap on the
incoming load or store instruction, which also aborts the LDx_L. These instructions
are issued in program order in the next iteration of the trap retry down the pipeline.
To prevent newer load or store instructions from evicting the locked cache line, the
Ibox ensures that a STx_C is issued before any newer load or store instruction by
placing the STx_C into the IQ and stalling all subsequent instructions in the map
stage of the pipe until the IQ is empty.
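The map-stage stall described above can be sketched as a tiny state machine. This is a toy model, not the Ibox implementation: the `MapStage` class, its `try_map` and `issue_oldest` methods, and the strictly oldest-first issue are simplifying assumptions made for illustration.

```python
STORE_CONDITIONAL = {"STL_C", "STQ_C"}


class MapStage:
    """Toy map stage feeding a toy IQ (hypothetical model)."""

    def __init__(self):
        self.iq = []          # instructions waiting to issue, oldest first
        self.stalled = False  # set once a STx_C has been placed in the IQ

    def try_map(self, op):
        """Try to move op from the map stage into the IQ.
        Returns False while the stage is stalled behind a STx_C."""
        if self.stalled:
            if self.iq:
                return False      # IQ not yet empty: keep stalling
            self.stalled = False  # IQ drained: resume mapping
        self.iq.append(op)
        if op in STORE_CONDITIONAL:
            self.stalled = True   # everything after the STx_C must wait
        return True

    def issue_oldest(self):
        """Issue (remove) the oldest instruction in the IQ, if any."""
        return self.iq.pop(0) if self.iq else None
```

In this model, a load that follows a STQ_C cannot enter the IQ until both the preceding LDQ_L and the STQ_C have issued, so it cannot be issued ahead of the STx_C.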
Branch instructions between the LDx_L/STx_C pair may be mispredicted,
introducing load and store instructions that evict the locked cache block. To prevent that
from happening, there is a bit in the instruction fetcher that is set for a LDx_L
reference and cleared on any other memory reference. When this bit is set, the branch
predictor predicts all branches to fall through.
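The fetcher bit described above behaves like a one-bit override on the branch predictor. A minimal sketch follows, assuming instructions are classified into three hypothetical kinds ("ldx_l", "mem", "other"). The `FetchLockBit` class and its method names are invented for illustration.

```python
class FetchLockBit:
    """Toy model of the fetcher bit that forces fall-through predictions."""

    def __init__(self):
        self.bit = False

    def observe(self, kind):
        """Update the bit for a fetched instruction.
        kind is "ldx_l" for a load-locked, "mem" for any other memory
        reference (including STx_C), and "other" for everything else."""
        if kind == "ldx_l":
            self.bit = True
        elif kind == "mem":
            self.bit = False
        # Non-memory instructions leave the bit unchanged.

    def predict_taken(self, predictor_says_taken):
        """While the bit is set, every branch is predicted to fall through."""
        return False if self.bit else predictor_says_taken
```

In this sketch, a branch fetched after a LDx_L is predicted not taken regardless of the underlying predictor, and normal prediction resumes once the next memory reference, typically the STx_C, has been fetched.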
4.6.3 Liveness and Fairness
To prevent a livelock condition, the 21264/EV68A processes the STx_C as follows:
1. If a STx_C misses the Dcache, then no system port transaction is started and the
STx_C fails.
2. If a STx_C hits a block that is not dirty, then a ChangeToDirty (Shared or Clean) is
launched after the STx_C retires and all older store queue entries are in the writable
state. This ensures that once the ChangeToDirty command is launched on behalf of
the STx_C, the STx_C will be executed to completion if the ChangeToDirty
command succeeds.
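The two cases above can be summarized as a small decision function. This is a sketch under assumptions: the `Outcome` enum, the function name, and the boolean parameters are hypothetical. The case of a STx_C that hits a dirty block is not described in the text above, so the model flags it rather than guessing.

```python
from enum import Enum, auto


class Outcome(Enum):
    FAIL_NO_TRANSACTION = auto()     # case 1: Dcache miss, STx_C fails
    WAIT = auto()                    # case 2: conditions not yet met
    LAUNCH_CHANGE_TO_DIRTY = auto()  # case 2: launch ChangeToDirty
    NOT_COVERED = auto()             # hit on a dirty block: not stated above


def stx_c_step(hit, dirty, retired, older_stores_writable):
    """Decide what happens to a STx_C in this toy model.

    hit: the STx_C hits the Dcache.
    dirty: the block it hits is dirty.
    retired: the STx_C has retired.
    older_stores_writable: all older store queue entries are writable.
    """
    if not hit:
        # No system port transaction is started; the STx_C simply fails.
        return Outcome.FAIL_NO_TRANSACTION
    if dirty:
        return Outcome.NOT_COVERED
    if retired and older_stores_writable:
        return Outcome.LAUNCH_CHANGE_TO_DIRTY
    return Outcome.WAIT
```

Delaying the launch until both conditions hold is what lets the STx_C run to completion once the ChangeToDirty command succeeds, as the text states.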