Intel SE8500HW4 Server User Manual


 
System BIOS Intel® Server Board Set SE8500HW4
Revision 1.0
Intel order number D22893-001
not being able to set the desired configuration during POST, the BIOS reports this error and
continues booting with the maximum performance configuration.
During a hot insertion operation, if bad DIMM(s) found by the memory test prevent the system
from setting the desired memory mode during runtime, the BIOS rejects the new
Memory Board addition request and powers down the newly inserted board.
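The two outcomes above depend only on when the bad DIMM is detected. A minimal Python sketch of that decision follows. The names `Phase`, `Outcome`, and `on_memory_mode_unavailable` are hypothetical and chosen for illustration; they are not BIOS interfaces.

```python
from enum import Enum, auto


class Phase(Enum):
    POST = auto()        # bad DIMM found during power-on self test
    HOT_INSERT = auto()  # bad DIMM found while hot-adding a Memory Board


class Outcome(Enum):
    BOOT_MAX_PERFORMANCE = auto()   # report error, boot in maximum performance configuration
    REJECT_AND_POWER_DOWN = auto()  # reject the addition request, power down the new board


def on_memory_mode_unavailable(phase: Phase) -> Outcome:
    """Map the detection phase to the behavior described in the text."""
    if phase is Phase.POST:
        return Outcome.BOOT_MAX_PERFORMANCE
    return Outcome.REJECT_AND_POWER_DOWN
```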
After the BIOS successfully executes the hardware memory test, it zeros out the contents of
memory. The BIOS also sends BMC memory RAS commands to update the system memory
state.
10.3.2.1 Disabling Failed Memory
The BIOS and chipset disable memory when one of the following occurs:
- The initialization locates a bad DIMM and disables the DIMM bank.
- An uncorrectable ECC error has occurred on a DIMM during runtime. The BIOS disables
  the DIMM bank for subsequent boots.
- A DIMM rank surpasses an error threshold for switching to spare during runtime.
  Hardware disables the DIMM rank after its contents are copied to a spare rank. The
  BIOS disables the DIMM bank for subsequent boots.
- A failed Memory Board is disabled by the system hardware and the BIOS.
On subsequent boots, the disabled memory is not initialized, and the DIMM error LED relights
after system initialization. If all memory in a system has been disabled, the BIOS generates
beep codes to indicate that the system has no usable memory.
Disabled memory may be re-enabled and retested by enabling the setup option “Retest All
System Memory” or “Retest Board Memory”. “Retest All System Memory” re-enables
initialization and testing of all Memory Boards and slots, whereas “Retest Board Memory”
re-enables and retests only the slots on the selected board.
The BIOS records the disabled memory to the SEL.
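The bookkeeping described in this section can be illustrated with a small Python model. This is a sketch under stated assumptions, not the BIOS implementation: the class `DisabledMemoryModel`, the `(board, bank)` identifiers, and the list standing in for the SEL are all hypothetical. The model only tracks which banks are marked disabled; it does not model the outcome of the retest itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

Bank = Tuple[str, int]  # (board name, bank index); hypothetical identifiers


@dataclass
class DisabledMemoryModel:
    all_banks: Set[Bank]
    disabled: Set[Bank] = field(default_factory=set)
    sel: List[str] = field(default_factory=list)  # stand-in for the System Event Log

    def disable(self, bank: Bank, reason: str) -> None:
        """Mark a bank disabled for subsequent boots and record the event."""
        self.disabled.add(bank)
        self.sel.append(f"disabled {bank}: {reason}")

    def retest(self, board: Optional[str] = None) -> None:
        """board=None models "Retest All System Memory";
        a board name models "Retest Board Memory" for that board."""
        self.disabled = {
            b for b in self.disabled if board is not None and b[0] != board
        }

    def banks_to_initialize(self) -> Set[Bank]:
        """Banks that are initialized on the next boot."""
        return self.all_banks - self.disabled

    def needs_beep_code(self) -> bool:
        """True when no usable memory remains."""
        return not self.banks_to_initialize()
```

For example, disabling every bank makes `needs_beep_code()` return `True`, and calling `retest(board="A")` re-enables only the banks on the hypothetical board `"A"`.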
10.3.2.2 Handling ECC Errors and XMB Fail During Runtime
The BIOS handles ECC errors based on whether the error is correctable or uncorrectable and whether
the current memory mode is redundant. A RAID configuration with all good Memory Boards
operates in redundant mode. A redundant group in a mirror configuration is redundant if each of
its boards operates in redundant mode. The maximum performance and maximum compatibility
modes operate in a non-redundant state. RAID configurations and mirror board pairs with failed
or missing boards also operate in a non-redundant state.
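The redundancy rules above can be summarized as a single predicate. The following Python sketch is illustrative only: `MemoryMode` and `is_redundant` are hypothetical names, and representing each board in a RAID set or mirror pair as a boolean (True for good, False for failed or missing) is an assumption made for brevity.

```python
from enum import Enum
from typing import Sequence


class MemoryMode(Enum):
    MAX_PERFORMANCE = "maximum performance"
    MAX_COMPATIBILITY = "maximum compatibility"
    MIRROR = "mirror"
    RAID = "RAID"


def is_redundant(mode: MemoryMode, boards_good: Sequence[bool]) -> bool:
    """Return True if the RAID set or mirror pair operates in a redundant state.

    boards_good holds one entry per board in the group;
    False means the board has failed or is missing.
    """
    if mode in (MemoryMode.MAX_PERFORMANCE, MemoryMode.MAX_COMPATIBILITY):
        return False  # these modes are never redundant
    return len(boards_good) > 0 and all(boards_good)
```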
If the system is operating in a non-redundant state and an uncorrectable ECC
error occurs during runtime, the BIOS reports the error to the SEL, sets the Memory Board LED
to indicate a bad DIMM, and disables the DIMM(s) for subsequent boots. The BIOS triggers a
non-maskable interrupt to halt the system.
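As a rough illustration of the non-redundant case, the following Python sketch lists the handler's actions in the order the text mentions them. The function name and the `dimm` label are hypothetical, and the ordering follows the prose rather than any confirmed firmware sequence.

```python
from typing import List


def handle_uncorrectable_ecc_non_redundant(dimm: str) -> List[str]:
    """Return the actions described for an uncorrectable ECC error
    in a non-redundant state, in the order the text lists them."""
    return [
        f"report uncorrectable ECC error on {dimm} to the SEL",
        f"set Memory Board LED to indicate bad DIMM {dimm}",
        f"disable {dimm} for subsequent boots",
        "trigger NMI to halt the system",
    ]
```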
If the system is operating in a redundant state during runtime and an uncorrectable ECC error
occurs, hardware marks the bad memory location and the system continues to function by
reading from the redundant copy of memory. The BIOS ECC error handler increments the