IBM PPC440X5 Computer Hardware User Manual


 
User’s Manual
PPC440x5 CPU Core Preliminary
September 12, 2002
2. MSR[ME] = 1, so the CPU vectors to the machine check handler (i.e., takes the machine check interrupt)
and resets the MSR[ME] bit. Note that even though the parity error causes an asynchronous interrupt,
that interrupt is guaranteed to be taken before the tlbre instruction completes if CCR0[PRE] (Parity
Recoverability Enable) is set, so the target register (RT) of the tlbre will not be updated.
3. The machine check handler code includes a series of tlbre instructions to query the state of the TLB and
find the erroneous entry. When a tlbre encounters an erroneous entry and MSR[ME] = 0, the parity
exception still happens, setting the MCSR[MCS] and MCSR[TLBE] bits. Additionally, since MSR[ME] = 0,
MCSR[IMCE] is set, indicating that an imprecise machine check was detected. Finally, the instruction
completes (no interrupt is taken because MSR[ME] = 0), updating the target register with data from
the TLB, including the parity information.
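The difference between steps 2 and 3 can be summarized in a small behavioral model. The following Python sketch is illustrative only: the class CoreModel, the function tlbre, and the representation of MCSR as a set of bit names are assumptions made for this example, not part of the PPC440x5 definition. It assumes CCR0[PRE] = 1, and it assumes that MCSR[MCS] and MCSR[TLBE] are set in both cases, as the word "still" in step 3 suggests.

```python
from dataclasses import dataclass, field


@dataclass
class CoreModel:
    """Toy model of the machine check state touched by tlbre (hypothetical)."""
    msr_me: bool = True                       # MSR[ME]
    mcsr: set = field(default_factory=set)    # names of the MCSR bits that are set
    took_machine_check: bool = False


def tlbre(core, entry_data, parity_ok):
    """Return the value written to RT, or None if RT is left unchanged."""
    if parity_ok:
        return entry_data

    # The parity exception is recorded in both cases (assumption, see lead-in).
    core.mcsr.update({"MCS", "TLBE"})

    if core.msr_me:
        # Step 2: the interrupt is taken before tlbre completes
        # (assumes CCR0[PRE] = 1), MSR[ME] is reset, RT is not updated.
        core.took_machine_check = True
        core.msr_me = False
        return None

    # Step 3: no interrupt is taken; the imprecise machine check is flagged
    # and the instruction completes, updating RT.
    core.mcsr.add("IMCE")
    return entry_data
```

In this model the first tlbre of a bad entry (MSR[ME] = 1) returns None and clears msr_me; a second tlbre of the same entry, issued from the handler with MSR[ME] = 0, returns the entry data and adds IMCE to mcsr.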
Note that the tlbre instruction causes an exception when it detects a parity error, but the icread and dcread
instructions do not. The reason for this inconsistency is that OS code commonly uses a sequence of tlbsx and
tlbre instructions to update the “changed” bit in the page table entries. (See section 5.10, “Page Reference
and Change Status Management.”) Forcing the software to check the parity manually for each tlbre would be
a performance limitation. No such functional use exists for the icread and dcread instructions; they are used
only in debugging contexts with no significant performance requirements.
As is the case for any machine check interrupt, after vectoring to the machine check handler, MCSRR0
contains the address of the oldest “uncommitted” instruction in the pipeline at the time of the exception and
MCSRR1 contains the old MSR context. The interrupt handler can query the Machine Check Status
Register (MCSR) to find out that it was called due to a TLB parity exception, and then use tlbre instructions
to find the error in the TLB and restore the entry from a known good copy in main memory.
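The scan-and-restore step of such a handler can be sketched as follows. This is a minimal Python model under stated assumptions, not handler code for the core: ToyTlb, recover_tlb, and the shadow list are hypothetical names, read() stands in for a tlbre issued with MSR[ME] = 0, and write() stands in for tlbwe. The entry count of 64 is taken from the text.

```python
TLB_ENTRIES = 64  # number of TLB entries, as stated in the text


class ToyTlb:
    """Hypothetical stand-in for the hardware TLB."""

    def __init__(self, contents):
        self.data = list(contents)
        self.parity_ok = [True] * len(self.data)

    def corrupt(self, index):
        """Mark one entry as having bad parity (models a soft error)."""
        self.parity_ok[index] = False

    def read(self, index):
        """Models tlbre with MSR[ME] = 0: returns data plus a parity verdict."""
        return self.data[index], self.parity_ok[index]

    def write(self, index, value):
        """Models tlbwe: new data is stored with freshly generated parity."""
        self.data[index] = value
        self.parity_ok[index] = True


def recover_tlb(tlb, shadow):
    """Rewrite every entry that reads back with bad parity from the shadow copy.

    Returns the list of repaired entry indices.
    """
    repaired = []
    for index in range(TLB_ENTRIES):
        _data, parity_ok = tlb.read(index)
        if not parity_ok:
            tlb.write(index, shadow[index])
            repaired.append(index)
    return repaired
```

A real handler would first confirm from MCSR that the interrupt was caused by a TLB parity exception, and would clear the status bits once the entry is repaired; both steps are omitted from this sketch.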
Note: A parity error on the TLB entry which maps the machine check exception handler code
prevents recovery. In effect, one of the 64 TLB entries is unprotected, in that the machine
cannot recover from an error in that entry. It is possible to add logic to get around this problem,
but the reduction in soft error rate (SER) achieved by protecting 63 out of 64 TLB entries is sufficient. Further,
the software technique of simply dedicating a TLB entry to the page that contains the machine
check handler and periodically refreshing that entry from a known good copy can reduce the
probability that the entry will be used with a parity error to near zero.
As mentioned above, any tlbre or tlbsx instruction that causes a machine check interrupt will be flushed
from the pipeline before it completes. Further, any instruction whose DTLB or ITLB refill encounters
a TLB parity error will be flushed before it completes.
5.11.2 Simulating TLB Parity Errors for Software Testing
Because parity errors occur in the TLB infrequently and unpredictably, it is desirable to provide users with a
way to simulate the effect of a TLB parity error so that interrupt handling software may be exercised. This is
exactly the purpose of the 4-bit CCR1[MMUPEI] field.
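The idea behind an error-insertion field can be illustrated with a minimal parity model: a parity bit is generated when data is stored, recomputed when the data is used, and an injection flag inverts the stored bit so that the check fails. The Python sketch below is a generic model; the function names and the single inject_error flag are assumptions for illustration and do not describe how the four MMUPEI bits map onto the fields of a TLB entry.

```python
def even_parity_bit(value):
    """Parity bit that makes the total count of 1 bits (data + parity) even."""
    return bin(value).count("1") & 1


def store(value, inject_error=False):
    """Return (value, stored_parity); invert the parity bit when injecting."""
    parity = even_parity_bit(value)
    if inject_error:
        parity ^= 1  # stored as odd parity, which the checker will reject
    return value, parity


def check(value, stored_parity):
    """True when the stored parity bit matches the expected even parity."""
    return even_parity_bit(value) == stored_parity
```

With this model, check(*store(0b1011)) passes, while check(*store(0b1011, inject_error=True)) fails, which corresponds to the parity exception that the interrupt handling software is meant to exercise.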
Usually, parity is calculated as the even parity for each set of bits to be protected, which is what the checking
hardware expects. This calculation is done as the TLB data is stored with a tlbwe instruction. However, if any of
the CCR1[MMUPEI] bits are set, the calculated parity for the corresponding bits of the data being stored
is inverted and stored as odd parity. Then, when the data stored with odd parity is subsequently used to
refill the DTLB or ITLB, or by a tlbsx or tlbre instruction, it will cause a parity exception type machine check
interrupt and exercise the interrupt handling software. The following pseudo-code is an example of how to
use the CCR1[MMUPEI] field to simulate a parity error on a TLB entry:
mtspr CCR1, Rx ; set one or more CCR1[MMUPEI] bits
isync ; wait for the CCR1 context to update
tlbwe Rs,Ra,0 ; write word 0 of the TLB entry with bad parity
tlbwe Rs,Ra,1 ; write word 1 of the TLB entry with bad parity