Chapter 1
Overview
Server Errors
1. Detection consists of the hardware checks that recognize that an error has occurred.
2. Transaction handling modifies how the hardware treats the transaction with the detected error.
3. Logging is storing the error indication in the primary error mode register, which sets the error state for the
block.
4. State behavior is any special actions taken in the various error states.
It is preferred that most errors not result in any special transaction handling by the hardware, but rather be
handled by state behavior. For instance, it is preferable to take a link down because a block is in fatal error
mode rather than because a packet arrived with a particular error. Using error state behavior is preferred
because it eliminates many corner cases and makes verification somewhat easier. It is also possible to test
error state behavior by having software set bits in the primary error mode register to insert errors. Testing
transaction handling requires actually creating the error.
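The preference for state behavior can be illustrated with a small behavioral model. This is only a sketch, not the chipset's register definition: the bit positions, the `FATAL_ERRORS` grouping, and the `Block` class with its `software_set` hook are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

# Hypothetical bit assignments; the real register layout is not given here.
ERR_MULTIBIT = 1 << 0
ERR_PROTOCOL = 1 << 1
# Assumption: only the multibit error places the block in fatal error mode.
FATAL_ERRORS = ERR_MULTIBIT


@dataclass
class Block:
    primary_error_mode: int = 0

    def in_fatal_error_mode(self) -> bool:
        return bool(self.primary_error_mode & FATAL_ERRORS)

    def link_up(self) -> bool:
        # State behavior: the link follows the block's error state,
        # not the arrival of any one bad packet.
        return not self.in_fatal_error_mode()

    def software_set(self, bits: int) -> None:
        # Test hook: software sets symptom bits directly, so the state
        # behavior can be exercised without creating a real error.
        self.primary_error_mode |= bits


block = Block()
block.software_set(ERR_MULTIBIT)   # inject the error from software
link_is_up = block.link_up()       # False: the link is down due to state
```

Because the link decision depends only on the register contents, the same check covers every path by which the block can reach fatal error mode.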
The error strategy provides a way to mask the logging of every error (the error enable mask register), and so it
provides a mechanism to avoid error states and the subsequent state behavior.
For instance, if a link goes down when the block is in fatal error mode, and a multibit error puts a block in
fatal error mode, just clearing the enable bit for the error will avoid the need to take the link down.
Unfortunately, some errors require transactional error handling. For these errors, the sx2000 chipset approach
provides separate CSR configuration bits, where appropriate, that mask the transactional handling independently
of the error enable mask register.
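The two independent masking mechanisms can be sketched as follows. This is a simplified model under stated assumptions: a set bit in `error_enable_mask` means logging is enabled for that error, and `txn_handling_enable`, `ERR_BAD_PACKET`, the `dropped` counter, and the bit positions are hypothetical stand-ins for the chipset's actual CSR bits.

```python
# Hypothetical bit positions, for illustration only.
ERR_MULTIBIT = 1 << 0    # puts the block in fatal error mode when logged
ERR_BAD_PACKET = 1 << 1  # an error that needs transactional handling


class Block:
    def __init__(self):
        # Assumption: 1 = error enabled. Zero matches the hardware
        # default described in the text, which masks all errors.
        self.error_enable_mask = 0
        # Hypothetical separate CSR bits for transactional handling.
        self.txn_handling_enable = 0
        self.primary_error_mode = 0
        self.dropped = 0

    def detect(self, err):
        # Transactional handling is gated by its own configuration bits.
        if err & self.txn_handling_enable:
            self.dropped += 1
        # Logging, and therefore the error state, is gated by the mask.
        if err & self.error_enable_mask:
            self.primary_error_mode |= err

    def link_up(self):
        return not (self.primary_error_mode & ERR_MULTIBIT)


masked = Block()              # enable bit clear for every error
masked.detect(ERR_MULTIBIT)   # detected, but never logged
still_up = masked.link_up()   # True: no fatal state, link stays up
```

In this model, clearing an enable bit suppresses the error state without touching the transactional path, and the reverse also holds.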
Although the contents of each interface's error logs and status registers differ, the programming model
for each is the same.
1. Firmware initializes the error enable mask register in each interface at boot time. The default
configuration in hardware is to mask all errors. Firmware may also choose to configure the error upgrade
registers.
2. Hardware detects an error and sets a symptom bit in the interface's primary error mode register. The
corresponding error log is updated with the new error. No other errors of that type will be logged until the
first is cleared. Subsequent errors of the same type will force bits to be set in the secondary error mode
register.
3. Firmware checks the primary error mode register and sees a bit set.
4. Firmware reads the appropriate error log and runs the corresponding error handling code. More information
may exist in the secondary error mode register and the error order status register.
5. If fatal error mode is being cleared, firmware sets the error enable mask register to mask the errors
"Received packet with FE bit set" and "FE wire set" in all interfaces.
6. Firmware clears the symptom bits in the primary and secondary error mode registers. Firmware should
read the secondary register and save its value, and then read the primary register. Firmware should handle
the errors indicated in the saved values, but can read the associated logging registers any time. To clear the
error modes, firmware writes the saved secondary register value to the “clear” address, and then writes the
saved primary register value to its “clear” address. This ensures only errors that have been seen by firmware
are cleared. Clearing the primary error mode register will stop the hardware from setting the FE bits in
outgoing packets. Firmware checks to make sure that both registers have all bits of the particular error type
“cleared”. If they are not cleared, then additional errors have occurred and the data in the associated log
registers may be invalid.
7. Plunge all transactions to clear any queues with FE bit set.
8. Unmask errors in the error enable mask register.
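The logging rule in step 2 and the read-and-clear ordering in step 6 can be sketched with a behavioral model. It is illustrative only: the `Interface` class, the `service_errors` helper, the dictionary used as an error log, and the bit values are hypothetical, and the FE masking, plunge, and unmask steps (5, 7, and 8) are omitted.

```python
class Interface:
    def __init__(self):
        self.enable_mask = 0   # assumption: 1 = error enabled
        self.primary = 0       # primary error mode register
        self.secondary = 0     # secondary error mode register
        self.log = {}          # stand-in for the per-error log registers

    def detect(self, bit, info):
        if not (bit & self.enable_mask):
            return                     # masked: nothing is logged
        if self.primary & bit:
            self.secondary |= bit      # repeat error: log is not overwritten
        else:
            self.primary |= bit
            self.log[bit] = info       # first error of this type is logged

    # Models of writes to the "clear" addresses: only the written bits clear.
    def clear_primary(self, bits):
        self.primary &= ~bits

    def clear_secondary(self, bits):
        self.secondary &= ~bits


def service_errors(iface, handler):
    """Handle and clear logged errors; return bits that are still set."""
    saved_secondary = iface.secondary  # read the secondary register first
    saved_primary = iface.primary      # then the primary register
    bit = 1
    while bit <= saved_primary:
        if saved_primary & bit:
            handler(bit, iface.log.get(bit))
        bit <<= 1
    # Write back only the saved values, so that an error firmware has
    # not seen is never cleared by accident.
    iface.clear_secondary(saved_secondary)
    iface.clear_primary(saved_primary)
    # Nonzero means further errors of a handled type arrived in the
    # meantime, so the associated log contents may be invalid.
    return (iface.primary | iface.secondary) & (saved_primary | saved_secondary)
```

A nonzero return value corresponds to the final check in step 6: an error of a type being handled recurred after firmware took its snapshot, so its bit remains set in the secondary register.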