IBM PPC440X5 Manual

User’s Manual

PPC440x5 CPU Core Preliminary

September 12, 2002

2. MSR[ME] = 1, so the CPU vectors to the machine check handler (i.e., takes the machine check interrupt) and resets the MSR[ME] bit. Note that even though the parity error causes an asynchronous interrupt, that interrupt is guaranteed to be taken before the tlbre instruction completes if CCR0[PRE] (Parity Recoverability Enable) is set, and so the target register (RT) of the tlbre will not be updated.

3. The machine check handler code includes a series of tlbre instructions to query the state of the TLB and find the erroneous entry. When a tlbre encounters an erroneous entry and MSR[ME] = 0, the parity exception still happens, setting the MCSR[MCS] and MCSR[TLBE] bits. Additionally, since MSR[ME] = 0, MCSR[IMCE] is set, indicating that an imprecise machine check was detected. Finally, the instruction completes (since no interrupt is taken because MSR[ME] = 0), updating the target register with data from the TLB, including the parity information.
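The two cases in steps 2 and 3 can be summarized with a small software model. This is only a sketch in C, not hardware-accurate code: the CpuModel structure and model_tlbre function are hypothetical names, MCSR bit positions are not modeled, CCR0[PRE] is assumed to be set, and MCSR[MCS] and MCSR[TLBE] are assumed to be set in both cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the state named in the text (not a register layout). */
typedef struct {
    bool msr_me;             /* MSR[ME]: machine check enable */
    bool mcsr_mcs;           /* MCSR[MCS] */
    bool mcsr_tlbe;          /* MCSR[TLBE]: TLB parity error */
    bool mcsr_imce;          /* MCSR[IMCE]: imprecise machine check */
    bool took_machine_check; /* true once the model vectors to the handler */
} CpuModel;

/* Models one tlbre of an entry. Returns true if RT was updated.
 * Assumes CCR0[PRE] = 1, so a taken interrupt flushes the tlbre. */
static bool model_tlbre(CpuModel *cpu, uint32_t entry_data,
                        bool parity_bad, uint32_t *rt)
{
    assert(cpu != NULL && rt != NULL);

    if (!parity_bad) {
        *rt = entry_data;            /* normal completion */
        return true;
    }

    cpu->mcsr_mcs = true;            /* exception is recorded either way */
    cpu->mcsr_tlbe = true;

    if (cpu->msr_me) {
        cpu->took_machine_check = true;
        cpu->msr_me = false;         /* taking the interrupt resets MSR[ME] */
        return false;                /* tlbre flushed, RT left unchanged */
    }

    cpu->mcsr_imce = true;           /* MSR[ME] = 0: no interrupt is taken */
    *rt = entry_data;                /* tlbre completes and RT is updated */
    return true;
}
```

In this model, the first tlbre that hits a bad entry with MSR[ME] = 1 leaves RT untouched, while a repeat of the same read inside the handler (MSR[ME] = 0) returns the entry contents and sets the imprecise flag.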

Note that the tlbre instruction causes an exception when it detects a parity error, but the icread and dcread instructions do not. The reason for this inconsistency is that OS code commonly uses a sequence of tlbsx and tlbre instructions to update the “changed” bit in the page table entries. (See section 5.10, “Page Reference and Change Status Management.”) Forcing the software to check the parity manually for each tlbre would be a performance limitation. No such functional use exists for the icread and dcread instructions; they are used only in debugging contexts with no significant performance requirements.

As is the case for any machine check interrupt, after vectoring to the machine check handler, MCSRR0 contains the address of the oldest “uncommitted” instruction in the pipeline at the time of the exception, and MCSRR1 contains the old MSR context. The interrupt handler can query the Machine Check Status Register (MCSR) to determine that a TLB parity error occurred, then use tlbre instructions to find the erroneous entry in the TLB and restore it from a known good copy in main memory.
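The find-and-restore step can be illustrated with a software model of the scan. This is a sketch under stated assumptions, not handler code for the core: TlbEntryModel, fold_parity, and scan_and_repair are hypothetical names, each entry is reduced to one 32-bit word protected by a single even-parity bit, and the known good copy is a plain array standing in for the copy in main memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_ENTRIES 64  /* the text refers to 64 TLB entries */

/* Hypothetical, simplified entry: one data word plus one stored parity bit. */
typedef struct {
    uint32_t data;
    uint8_t  parity;
} TlbEntryModel;

/* XOR of all 32 bits; storing this bit gives even parity overall. */
static uint8_t fold_parity(uint32_t v)
{
    v ^= v >> 16;
    v ^= v >> 8;
    v ^= v >> 4;
    v ^= v >> 2;
    v ^= v >> 1;
    return (uint8_t)(v & 1u);
}

/* Walks every entry (standing in for a series of tlbre reads), and rewrites
 * any entry whose stored parity does not match its data from the good copy.
 * Returns the number of entries repaired. */
static int scan_and_repair(TlbEntryModel tlb[TLB_ENTRIES],
                           const uint32_t good_copy[TLB_ENTRIES])
{
    int repaired = 0;

    assert(tlb != NULL && good_copy != NULL);

    for (size_t i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].parity != fold_parity(tlb[i].data)) {
            tlb[i].data = good_copy[i];
            tlb[i].parity = fold_parity(good_copy[i]);
            repaired++;
        }
    }
    return repaired;
}
```

A single flipped data bit in one modeled entry is detected and repaired on the first pass, and a second pass finds nothing left to fix.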

Note: A parity error on the TLB entry that maps the machine check exception handler code prevents recovery. In effect, one of the 64 TLB entries is unprotected, in that the machine cannot recover from an error in that entry. It is possible to add logic to get around this problem, but the reduction in soft error rate (SER) achieved by protecting 63 out of 64 TLB entries is sufficient. Further, the software technique of simply dedicating a TLB entry to the page that contains the machine check handler and periodically refreshing that entry from a known good copy can reduce the probability that the entry will be used with a parity error to near zero.

As mentioned above, any tlbre or tlbsx instruction that causes a machine check interrupt will be flushed from the pipeline before it completes. Further, any instruction whose DTLB or ITLB refill causes a TLB parity error will be flushed before it completes.

5.11.2 Simulating TLB Parity Errors for Software Testing

Because parity errors occur in the TLB infrequently and unpredictably, it is desirable to provide users with a

way to simulate the effect of a TLB parity error so that interrupt handling software may be exercised. This is

exactly the purpose of the 4-bit CCR1[MMUPEI] field.

Usually, parity is calculated as the even parity for each set of bits to be protected, which the checking hardware expects. This calculation is done as the TLB data is stored with a tlbwe instruction. However, if any of the CCR1[MMUPEI] bits are set, the calculated parity for the corresponding bits of the data being stored is inverted and stored as odd parity. Then, when the data stored with odd parity is subsequently used to refill the DTLB or ITLB, or by a tlbsx or tlbre instruction, it will cause a Parity exception type Machine Check interrupt and exercise the interrupt handling software. The following pseudo-code is an example of how to use the CCR1[MMUPEI] field to simulate a parity error on a TLB entry:

mtspr CCR1, Rx   ; Set some CCR1[MMUPEI] bits
isync            ; wait for the CCR1 context to update
tlbwe Rs,Ra,0    ; write some data to the TLB with bad parity
tlbwe Rs,Ra,1    ; write some data to the TLB with bad parity
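The effect of the MMUPEI bits on stored parity can also be modeled in software. The following C sketch is illustrative only: it assumes one even-parity bit per protected field and one MMUPEI bit per parity bit, and the division of the entry into four fields, along with the names stored_parity and failing_fields, is hypothetical rather than taken from the hardware definition.

```c
#include <assert.h>
#include <stdint.h>

#define PARITY_FIELDS 4  /* matches the 4-bit MMUPEI field; field contents are hypothetical */

/* XOR of all bits in v: the bit that makes the total count of ones even. */
static unsigned xor_fold(uint32_t v)
{
    v ^= v >> 16;
    v ^= v >> 8;
    v ^= v >> 4;
    v ^= v >> 2;
    v ^= v >> 1;
    return v & 1u;
}

/* Models the write side: bit i of the result is the parity stored for
 * fields[i]. A set bit i in mmupei inverts that parity bit (odd parity). */
static unsigned stored_parity(const uint32_t fields[PARITY_FIELDS],
                              unsigned mmupei)
{
    unsigned parity = 0;

    assert((mmupei & ~0xFu) == 0);   /* only four injection bits exist */

    for (unsigned i = 0; i < PARITY_FIELDS; i++)
        parity |= xor_fold(fields[i]) << i;
    return parity ^ mmupei;
}

/* Models the check side: returns a mask of fields whose stored parity
 * disagrees with the even parity the checker expects. */
static unsigned failing_fields(const uint32_t fields[PARITY_FIELDS],
                               unsigned stored)
{
    return stored_parity(fields, 0) ^ stored;
}
```

With no MMUPEI bits set, the checker in this model reports no failing fields; setting a single MMUPEI bit before the modeled write makes exactly the corresponding field fail, which is the condition that would raise the machine check on a later use of the entry.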