Intel 460GX Computer Hardware User Manual


 
Intel® 460GX Chipset Software Developers Manual
Data Integrity and Error Handling
Other errors capture the address associated with the failure. This is also for debug and diagnostic
purposes, but it has the potential for use in system recovery as well. For instance, if there is an
uncorrectable error on a data read, and the access can be isolated, then instead of restarting the
whole system, it might be possible to kill only the failing process and allow other users to continue
running.
6.6.1 SAC Address on an Error
The SAC has several registers used to access the address for a failure. After FERR_SAC is read to
determine the precise error that occurred, the address can then be determined for certain errors. The
method is somewhat indirect. The SAC is the only chip that tracks the original address, so it is used
to get the address even when another chip may have detected the error.
The GX keeps track of all transactions using an ITID (Internal Transaction ID). All the chips use
this tag to track a specific transaction through the system. When the transaction is complete, the
ITID is retired and a later transaction may re-use the same value. On an error, the ITID for that
failing transaction is captured and not retired. There are 3 registers used by the SAC to capture the
failing ITID:
SECTID - captures the ITID of the first Single-bit ECC error from memory.
DEDTID - captures the ITID of the first Double-bit ECC error from memory.
FSETID - captures the ITID of the first system bus or PDB data error seen by the SDC.
Each register is set as shown in Table 6-1.
SECTID, DEDTID, and FSETID each have a Valid bit in the register. The ITID is held until the
error is logged and the system is ready to clear all the error indications. Single-bit, Double-bit and
system bus errors are recoverable, and therefore the system can clear those errors and continue running.
To do so, software must write a one to the Valid bit of the register. This causes the system to retire
the ITID, and that transaction is then complete.
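
The capture-and-release behavior described above can be sketched as a small software model. This is
only an illustration, not the hardware's register layout: the class TidCaptureRegister, its field
names and the ITID values are hypothetical, and the model assumes the captured ITID value is simply
left in place after the Valid bit is cleared.

```python
from dataclasses import dataclass


@dataclass
class TidCaptureRegister:
    """Toy model of SECTID, DEDTID or FSETID (field layout not modeled)."""

    name: str
    valid: bool = False
    itid: int = 0

    def capture(self, itid: int) -> bool:
        """Latch the ITID of a failing transaction; only the first one sticks."""
        if self.valid:
            return False  # already holding an earlier, unreleased error
        self.itid = itid
        self.valid = True
        return True

    def write_valid(self, value: int) -> None:
        """Software write to the Valid bit: writing 1 releases the ITID."""
        if value == 1:
            self.valid = False  # ITID retired, register unlocked


dedtid = TidCaptureRegister("DEDTID")
first = dedtid.capture(0x2A)   # first Double-bit error is captured
second = dedtid.capture(0x11)  # ignored while the Valid bit is still set
held = dedtid.itid             # still the ITID of the first error
dedtid.write_valid(1)          # software has logged the error
```

After the write to the Valid bit, the model accepts a new capture, which mirrors the statement that
the ITID is retired and the register is unlocked.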
Note that these 3 registers are sticky through reset, so that the information is preserved. After
BINIT# or reset (but not power-on), the SECTID, DEDTID and FSETID registers are valid and the
failing address can be retrieved from the RAM.
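
As a rough illustration of the sticky behavior, the sketch below models which kinds of reset preserve
the three capture registers. The names ResetKind and after_reset are hypothetical, and representing
"not valid after power-on" as None is an assumption made for the model, not a statement about the
actual power-on register contents.

```python
from enum import Enum, auto


class ResetKind(Enum):
    POWER_ON = auto()
    RESET = auto()
    BINIT = auto()


def after_reset(tid_regs, kind):
    """Return the modeled SECTID/DEDTID/FSETID contents after a reset."""
    if kind is ResetKind.POWER_ON:
        # Nothing captured before power-on can be trusted afterwards.
        return {name: None for name in tid_regs}
    # Sticky through BINIT# and reset: captured ITIDs are preserved.
    return dict(tid_regs)


before = {"SECTID": None, "DEDTID": 0x2A, "FSETID": None}
after_binit = after_reset(before, ResetKind.BINIT)
after_power_on = after_reset(before, ResetKind.POWER_ON)
```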
More than one of these registers may be active at a time. After the status register is read, the log
register can be read to determine the ITID. The SDC signals the first of each type of error. If the
first error the SDC sees is a Double-bit ECC error, then the FERR_SAC and SDC_FERR bits are set for
that error, along with DEDTID in the SAC. If the SDC sees a Single-bit error next, before software has
cleared out the error logging, then the SDC_NERR and NERR_SAC registers are set, as well as SECTID.
Any time the SDC sends an indication of a Single-bit, Double-bit or system bus error, the appropriate
SECTID, DEDTID or FSETID register is set. The SDC is responsible for not sending a second
indication of one of these errors until its FERR and NERR registers are cleared.
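
The first-error and next-error bookkeeping described above can be modeled in a few lines. This is a
simplified sketch: the ErrorLog class and the ITID values are hypothetical, and a single ferr/nerr
pair stands in for both the SAC and SDC copies of the FERR and NERR registers.

```python
from enum import Enum


class Err(Enum):
    # Each error type maps to the register that captures its ITID.
    SINGLE_BIT = "SECTID"
    DOUBLE_BIT = "DEDTID"
    BUS_DATA = "FSETID"


class ErrorLog:
    """Toy model of first-error (FERR) and next-error (NERR) logging."""

    def __init__(self):
        self.ferr = None   # type of the first error seen
        self.nerr = set()  # types of errors seen after the first
        self.tid = {}      # capture register name -> captured ITID

    def report(self, err, itid):
        name = err.value
        if name in self.tid:
            return  # no second indication of a type until logs are cleared
        self.tid[name] = itid
        if self.ferr is None:
            self.ferr = err
        else:
            self.nerr.add(err)


log = ErrorLog()
log.report(Err.DOUBLE_BIT, 0x05)  # first error: FERR plus DEDTID
log.report(Err.SINGLE_BIT, 0x09)  # next error: NERR plus SECTID
log.report(Err.SINGLE_BIT, 0x0C)  # second Single-bit error is not signaled
```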
When there is an error in the SDC and software has finished processing it, software should perform the
following steps in the order given:
1. Write to the Valid bit of SECTID, DEDTID or FSETID to release the ITID and unlock the
   register.
2. Write 1 to clear those bits that software has read as asserted in the FERR_SAC and
   NERR_SAC registers. Software should not just write 1s to the entire register.
3. Since all 3 of these error types are reported through an interrupt and not BERR# or BINIT#,
   the interrupt must be cleared so that the next error can be visible. Write an EOI to the PID.
4. Write 1 to clear the SDC_FERR and SDC_NERR registers. Writing a 1 to either register in
   the SDC will clear both SDC_FERR and SDC_NERR at the same time.
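
The ordering of the clearing procedure can be sketched against a fake register interface that only
records writes. Everything below is illustrative: FakeChipset, clear_sdc_error, the VALID_BIT
position, the "PID_EOI" register name and the example bit values are hypothetical placeholders, since
this section does not give the actual bit layouts or the EOI mechanism.

```python
class FakeChipset:
    """Hypothetical stand-in for register access; records writes in order."""

    def __init__(self):
        self.writes = []

    def write(self, reg, value):
        self.writes.append((reg, value))


VALID_BIT = 0x1  # hypothetical: the real bit position is not given here


def clear_sdc_error(chip, tid_reg, ferr_seen, nerr_seen, eoi_value=0):
    """Issue the clearing writes in the documented order."""
    # 1. Release the ITID and unlock the capture register.
    chip.write(tid_reg, VALID_BIT)
    # 2. Write 1s only to the bits software actually read as asserted,
    #    never to the whole register.
    chip.write("FERR_SAC", ferr_seen)
    chip.write("NERR_SAC", nerr_seen)
    # 3. Clear the interrupt so the next error can become visible.
    chip.write("PID_EOI", eoi_value)
    # 4. One write clears both SDC_FERR and SDC_NERR.
    chip.write("SDC_FERR", 1)


chip = FakeChipset()
clear_sdc_error(chip, "DEDTID", ferr_seen=0x4, nerr_seen=0x1)
```

Passing in the values that software read earlier, rather than re-reading the registers at clear time,
keeps the model from clearing a bit for an error that arrived after processing and was never examined.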