15-38 Vol. 3
MACHINE-CHECK ARCHITECTURE
For an Instruction-Fetch recoverable error, the affected logical processor should
find that the RIPV flag and the EIPV flag in the IA32_MCG_STATUS register
are cleared, indicating that the instruction pointer saved on the stack may
not be associated with this error and that restarting execution with the
interrupted context is not possible.
Logical processors that observed but were not affected by an SRAR error
should find that the RIPV flag in the IA32_MCG_STATUS register is set and
the EIPV flag in the IA32_MCG_STATUS register is cleared, indicating that it
is safe to restart execution at the instruction saved on the stack for the
machine check exception on these processors after system software has
successfully taken the recovery action.
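The flag combinations described above can be sketched as a small decoder. This is a minimal illustration, not a handler implementation: it assumes the value of IA32_MCG_STATUS has already been read by system software, and the enum and function names (mce_context, classify_mcg_status) are hypothetical. Only the bit positions of RIPV (bit 0) and EIPV (bit 1) are architectural. The flags alone do not identify the error type; the MC bank contents must still be examined.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions in IA32_MCG_STATUS. */
#define MCG_STATUS_RIPV (1ULL << 0) /* restart IP valid */
#define MCG_STATUS_EIPV (1ULL << 1) /* error IP valid */

/* Hypothetical classification, limited to the two cases in the text. */
enum mce_context {
    MCE_CTX_IFETCH_AFFECTED, /* RIPV=0, EIPV=0: consistent with an affected
                                processor on an Instruction-Fetch error;
                                the interrupted context cannot be resumed */
    MCE_CTX_OBSERVER,        /* RIPV=1, EIPV=0: observer; may resume after
                                the recovery action succeeds */
    MCE_CTX_OTHER            /* any other combination; not covered here */
};

/* Classify a previously read IA32_MCG_STATUS value by RIPV and EIPV. */
static enum mce_context classify_mcg_status(uint64_t mcg_status)
{
    bool ripv = (mcg_status & MCG_STATUS_RIPV) != 0;
    bool eipv = (mcg_status & MCG_STATUS_EIPV) != 0;

    if (!ripv && !eipv)
        return MCE_CTX_IFETCH_AFFECTED;
    if (ripv && !eipv)
        return MCE_CTX_OBSERVER;
    return MCE_CTX_OTHER;
}
```

A handler following this sketch would resume the interrupted context only for MCE_CTX_OBSERVER, and only after the recovery action has completed.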
For the Data-Load and the Instruction-Fetch recoverable errors, system
software may take the following recovery actions for the affected logical
processor:
• The currently executing thread cannot be continued. System software must terminate the
interrupted stream of execution and provide a new stream of execution on return
from the machine check handler for the affected logical processor.
In addition to taking the recovery action described above, system software
may also need to stop the program from using the affected page. Doing so
may prevent future consumption errors from that page.
15.9.4 Multiple MCA Errors
When multiple MCA errors are detected within a certain detection window,
the processor may aggregate the reporting of these errors together as a
single event, i.e., a single machine check exception condition. If this occurs,
system software may find multiple MCA errors logged in different MC banks
on one logical processor or find multiple MCA errors logged across different
processors for a single machine check broadcast event. In order to handle
multiple UCR errors reported from a single machine check event and
possibly recover from multiple errors, system software may consider the
following:
• Whether system software can recover from multiple errors is determined by the most severe
error reported on the system. If the most severe error is found to be an unrecoverable
error (VAL=1, UC=1, PCC=1 and EN=1) after system software examines
the MC banks of all processors to which the MCA signal is broadcast, recovery
from the multiple errors is not possible and system software needs to reset the
system.
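The check described in this bullet can be sketched as a scan over the collected bank status values. This is a minimal sketch under stated assumptions: the caller has already gathered the IA32_MCi_STATUS values from every bank of every processor that received the broadcast into one array, and the function names (bank_is_unrecoverable, must_reset_system) are hypothetical. The bit positions of VAL (63), UC (61), EN (60) and PCC (57) are architectural.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural bit positions in IA32_MCi_STATUS. */
#define MCI_STATUS_VAL (1ULL << 63) /* register contents valid */
#define MCI_STATUS_UC  (1ULL << 61) /* uncorrected error */
#define MCI_STATUS_EN  (1ULL << 60) /* error reporting enabled */
#define MCI_STATUS_PCC (1ULL << 57) /* processor context corrupt */

/* True when one bank reports VAL=1, UC=1, PCC=1 and EN=1. */
static bool bank_is_unrecoverable(uint64_t status)
{
    const uint64_t mask = MCI_STATUS_VAL | MCI_STATUS_UC |
                          MCI_STATUS_PCC | MCI_STATUS_EN;
    return (status & mask) == mask;
}

/*
 * Hypothetical helper: bank_status holds the status values gathered from
 * all banks on all processors in the broadcast. A single unrecoverable
 * entry means recovery from the event as a whole is not possible.
 */
static bool must_reset_system(const uint64_t *bank_status, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (bank_is_unrecoverable(bank_status[i]))
            return true;
    }
    return false;
}
```

The scan stops at the first unrecoverable entry because the most severe error decides the outcome; less severe errors in other banks only matter when no such entry exists.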