MACHINE-CHECK ARCHITECTURE
Guidelines for writing a machine-check exception handler or a machine-
error logging utility are given in the following sections.
15.10.1 Machine-Check Exception Handler
The machine-check exception (#MC) corresponds to vector 18. To service
machine-check exceptions, a trap gate must be added to the IDT. The
pointer in the trap gate must point to a machine-check exception handler.
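As an illustration of the gate setup described above, the following is a minimal sketch for 32-bit protected mode only (IA-32e mode uses 16-byte gates and is not covered here). The names idt_gate32, make_trap_gate32, and install_mc_gate are hypothetical, and the code assumes the caller already owns an IDT array and a ring-0 code-segment selector. The type/attribute byte 8FH encodes a present, DPL 0, 32-bit trap gate.

```c
#include <stdint.h>

/* 8-byte IDT gate descriptor layout for 32-bit protected mode. */
struct idt_gate32 {
    uint16_t offset_low;   /* handler address bits 15:0  */
    uint16_t selector;     /* code-segment selector      */
    uint8_t  reserved;     /* must be zero               */
    uint8_t  type_attr;    /* P, DPL, and gate type      */
    uint16_t offset_high;  /* handler address bits 31:16 */
};

enum { MC_VECTOR = 18 };                 /* #MC vector number */
enum { TRAP_GATE32_PRESENT_DPL0 = 0x8F };/* P=1, DPL=0, type=1111b */

/* Build a trap gate that points at the given handler address. */
static struct idt_gate32 make_trap_gate32(uint32_t handler, uint16_t selector)
{
    struct idt_gate32 g;
    g.offset_low  = (uint16_t)(handler & 0xFFFFu);
    g.selector    = selector;
    g.reserved    = 0;
    g.type_attr   = TRAP_GATE32_PRESENT_DPL0;
    g.offset_high = (uint16_t)(handler >> 16);
    return g;
}

/* Place the machine-check gate in slot 18 of a caller-supplied IDT. */
static void install_mc_gate(struct idt_gate32 *idt, uint32_t handler,
                            uint16_t selector)
{
    idt[MC_VECTOR] = make_trap_gate32(handler, selector);
}
```

Loading the IDT register and writing the handler entry code itself are outside the scope of this sketch.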
Two approaches can be taken to designing the exception handler:
1. The handler can merely log all the machine status and error information, then call
a debugger or shut down the system.
2. The handler can analyze the reported error information and, in some cases,
attempt to correct the error and restart the processor.
For Pentium 4, Intel Xeon, P6 family, and Pentium processors, virtually all
machine-check conditions cannot be corrected (they result in abort-type
exceptions). The logging of status and error information is therefore a baseline
implementation requirement.
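A logging pass of this kind can be sketched as follows. The sketch relies on the register layout described later in this section (bank count in IA32_MCG_CAP, bank 0 starting at 400H, VAL flag in each IA32_MCi_STATUS register) and on two assumptions that should be checked against the target processor: that each bank occupies four consecutive MSRs (CTL, STATUS, ADDR, MISC), and that the count field is bits 7:0 of IA32_MCG_CAP. The names mc_record, mc_bank_count, mc_status_msr, and mc_decode_status are hypothetical. Reading the MSRs (RDMSR at privilege level 0) is left to the caller, so the decoding logic can be exercised outside the kernel.

```c
#include <stdint.h>

#define MC0_CTL_ADDR        0x400u        /* first register of bank 0 */
#define MC_REGS_PER_BANK    4u            /* assumed: CTL, STATUS, ADDR, MISC */
#define MCG_CAP_COUNT_MASK  0xFFull       /* assumed: count field, bits 7:0 */

#define MCI_STATUS_VAL      (1ull << 63)  /* register contents are valid */
#define MCI_STATUS_OVER     (1ull << 62)  /* error overflow */
#define MCI_STATUS_UC       (1ull << 61)  /* error was not corrected */
#define MCI_STATUS_PCC      (1ull << 57)  /* processor context corrupt */
#define MCI_STATUS_MCACOD   0xFFFFull     /* MCA error code, bits 15:0 */

/* One logged entry per bank that reported valid error information. */
struct mc_record {
    unsigned bank;
    uint64_t raw_status;
    uint16_t mca_code;
    int      uncorrected;
    int      context_corrupt;
    int      overflow;
};

/* Number of error-reporting banks, taken from an IA32_MCG_CAP value. */
static unsigned mc_bank_count(uint64_t mcg_cap)
{
    return (unsigned)(mcg_cap & MCG_CAP_COUNT_MASK);
}

/* MSR address of IA32_MCi_STATUS for the given bank. */
static uint32_t mc_status_msr(unsigned bank)
{
    return MC0_CTL_ADDR + MC_REGS_PER_BANK * bank + 1u;
}

/* Fill *out from a status value; return 0 if VAL is clear (nothing to log). */
static int mc_decode_status(unsigned bank, uint64_t status,
                            struct mc_record *out)
{
    if (!(status & MCI_STATUS_VAL))
        return 0;
    out->bank            = bank;
    out->raw_status      = status;
    out->mca_code        = (uint16_t)(status & MCI_STATUS_MCACOD);
    out->uncorrected     = (status & MCI_STATUS_UC) != 0;
    out->context_corrupt = (status & MCI_STATUS_PCC) != 0;
    out->overflow        = (status & MCI_STATUS_OVER) != 0;
    return 1;
}
```

A handler would typically loop from bank 0 to mc_bank_count() - 1, read the MSR at mc_status_msr(i), and store each record for which mc_decode_status() returns 1. Keeping the raw status value in the record preserves model-specific fields that this sketch does not interpret.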
When recovery from a machine-check error may be possible, consider the
following when writing a machine-check exception handler:
• To determine the nature of the error, the handler must read each of the error-
reporting register banks. The count field in the IA32_MCG_CAP register gives
the number of register banks. The first register of register bank 0 is at address 400H.
• The VAL (valid) flag in each IA32_MCi_STATUS register indicates whether the
error information in the register is valid. If this flag is clear, the registers in that
bank do not contain valid error information and do not need to be checked.
• To write a portable exception handler, only the MCA error code field in the
IA32_MCi_STATUS register should be checked. See Section 15.9, “Interpreting
the MCA Error Codes,” for information that can be used to write an algorithm to
interpret this field.
• The PCC and OVER flags in each IA32_MCi_STATUS register, together with the
RIPV flag in the IA32_MCG_STATUS register, indicate whether recovery from the
error is possible. If either PCC or OVER is set, recovery is not possible. If RIPV
is not set, program execution cannot be restarted reliably. When recovery is not
possible, the handler typically records the error information and signals an
abort to the operating system.
• Correctable errors are corrected automatically by the processor. The UC flag in
each IA32_MCi_STATUS register indicates whether the processor automatically
corrected an error: when UC is clear, the error was corrected; when UC is set,
it was not.
• The RIPV flag in the IA32_MCG_STATUS register indicates whether the program
can be restarted at the instruction indicated by the instruction pointer (the
address of the instruction pushed on the stack when the exception was