Compaq 21264 Network Card User Manual


 
D–22 PALcode Restrictions and Guidelines
21264/EV68A Hardware Reference Manual
D.42 Restriction 46: Avoiding Livelocks in Speculative Load CRD Handlers
Speculative load CRD handlers that release from the interrupt without scrubbing a
cache block could suffer from the following livelock condition:
1. An initial error on a speculative load forces a CRD interrupt.
2. The CRD handler releases without scrubbing the block. A speculative load in the
shadow of the hw_ret (or hw_ret_stall) touches a Dcache location that has the
single-bit error, forcing another CRD interrupt.
3. The CRD handler is entered again immediately.
4. Go to (2).
This problem can be avoided if all jumps in the CRD handler path for speculative loads
use the following sequence:
    mb                                      ; make sure hw_ret goes
    ALIGN_FETCH_BLOCK <^x47FF041F>
    mulq    p6, #1, p6                      ; Hold up loads
    mulq    p6, #1, p6                      ; Hold up loads
    hw_mtpr p6, <EV6__MM_STAT ! ^x44>       ; Hold up loads
    PVC_VIOLATE<43>                         ; Ignore restriction 43
    hw_ret_stall (p23)                      ; Return
This sequence prevents speculative loads from issuing in the shadow of the
hw_ret_stall. Note that it is a violation of restriction 4 to have in the same fetch block an
MTPR that specifies scoreboard bit 2 (an explicit writer in the memory operation
group) and a HW_RET (an implicit reader in the memory operation group). Under
normal circumstances, the intention would be for a HW_RET to wait until the MTPR
issues, and that can only be enforced by putting the two instructions in different fetch
blocks. In this case, the intention is for the HW_RET to issue before the MTPR. The
hardware does not enforce the scoreboarding when the two instructions are in the same
fetch block, and thus the HW_RET can issue and mispredict before any speculative
loads (which are held up by the MTPR) can issue.
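The livelock steps and the effect of holding up loads can be pictured with a small toy model. The following Python sketch is illustrative only: the function name, its parameters, and the entry cap are assumptions made for this example, and the model covers only the shadow of the return, not real 21264 pipeline timing.

```python
def simulate_crd_livelock(scrub_block, hold_up_loads, max_entries=10):
    """Toy model of the CRD livelock (hypothetical, not hardware-accurate).

    scrub_block:   the handler corrects the Dcache block before returning.
    hold_up_loads: the return sequence keeps speculative loads from issuing
                   in the shadow of the return.
    Returns (handler_entries, livelocked).
    """
    block_has_error = True   # single-bit error sits in the Dcache block
    crd_pending = True       # step 1: initial error forces a CRD interrupt
    entries = 0

    while crd_pending and entries < max_entries:
        entries += 1         # step 3: the CRD handler is entered
        if scrub_block:
            block_has_error = False
        # Step 2: on return, a speculative load in the shadow of the
        # hw_ret may touch the block, unless loads are held up.
        shadow_load_issues = not hold_up_loads
        crd_pending = shadow_load_issues and block_has_error

    # If a CRD is still pending at the cap, treat the run as livelocked.
    return entries, crd_pending


if __name__ == "__main__":
    print(simulate_crd_livelock(scrub_block=False, hold_up_loads=False))
    print(simulate_crd_livelock(scrub_block=False, hold_up_loads=True))
    print(simulate_crd_livelock(scrub_block=True, hold_up_loads=False))
```

In this model, only the combination of no scrub and no load hold-up keeps re-entering the handler until the cap is reached. Either scrubbing the block or holding up loads in the shadow of the return breaks the loop after a single entry.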
D.43 Restriction 47: Cache Eviction for Single-Bit Cache Errors
A livelock can occur if issuing instructions out-of-order causes a floating-point store
instruction (with sberr) to replay trap.
A hardware mechanism exists that keeps track of replayed floating-point store
instructions, and cancels the dirty register check. See Section D.5 for more details.
If the floating-point store instruction has an sberr and the CRD_HANDLER is entered and
exited before the instruction is replayed, the mechanism loses track of the instruction.
When the instruction is replayed, the dirty register check is not canceled and a
replay trap occurs, so the floating-point store instruction continues to replay trap
until the sberr is evicted from the cache. The sberr will not be evicted, because the
floating-point store instruction is killed by the replay trap. Killed instructions are not
scrubbed by the Error Recovery Machine, and CBOX_ERR[C_ADDR] may not contain the address
of the sberr. Because CBOX_ERR[C_ADDR] is not guaranteed to hold the address of the
sberr, the CRD_HANDLER might not evict the sberr.
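The interaction between the replay tracking mechanism and the CRD_HANDLER can be sketched as a toy model. The Python below is a simplified illustration, not a description of the actual hardware: the function name, the `evicted_after_attempt` parameter, and the attempt cap are hypothetical, and the model assumes that the first issue of the store always replay traps.

```python
def simulate_fp_store(sberr_present, evicted_after_attempt=None, max_attempts=8):
    """Toy model of a floating-point store that replay traps (hypothetical).

    sberr_present:         the store's cache block has a single-bit error.
    evicted_after_attempt: if set, the sberr leaves the cache once more than
                           this many attempts have been made.
    Returns (attempts, livelocked).
    """
    tracked = False   # hardware has not yet seen this store replay
    attempts = 0

    while attempts < max_attempts:
        attempts += 1
        if tracked:
            # The dirty register check is canceled; the store completes.
            return attempts, False
        # Replay trap: the mechanism starts tracking the store.
        tracked = True
        if evicted_after_attempt is not None and attempts > evicted_after_attempt:
            sberr_present = False
        if sberr_present:
            # CRD_HANDLER is entered and exited before the replay,
            # so the mechanism loses track of the store.
            tracked = False

    return attempts, True


if __name__ == "__main__":
    print(simulate_fp_store(sberr_present=False))
    print(simulate_fp_store(sberr_present=True))
    print(simulate_fp_store(sberr_present=True, evicted_after_attempt=3))
```

In this sketch the store completes on its second attempt when no sberr is present, never completes while the sberr stays in the cache, and completes shortly after the sberr is evicted.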