FIGURE: ILOM Fault Management
The service processor can detect when a fault is no longer present, and it clears the
fault in one of two ways:
■ Fault recovery – The system automatically detects that the fault condition is no
longer present. The service processor extinguishes the Service Required LED and
updates the FRU’s PROM, indicating that the fault is no longer present.
■ Fault repair – The fault has been repaired by human intervention. In most cases,
the service processor detects the repair and extinguishes the Service Required
LED. If the service processor does not do so, you must clear the fault manually by
setting the ILOM component_state or fault_state property of the faulted
component.
The service processor can detect the removal of a FRU, in many cases even if the FRU
is removed while the service processor is powered off (for example, if the system
power cables are unplugged during service procedures). This function enables ILOM
to know that a fault, diagnosed to a specific FRU, has been repaired.
Note – ILOM does not automatically detect hard drive replacement.
Many environmental faults can recover automatically. For example, a temperature
that exceeds a threshold might return to normal limits, or an unplugged power
supply might be plugged back in. Recovery of environmental faults is detected
automatically.
Note – No ILOM command is needed to manually repair an environmental fault.
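The automatic recovery of an environmental fault can be pictured with a minimal Python sketch. It is an illustration only, not ILOM code: the function name fault_events, the sample readings, and the threshold value are all hypothetical, and the actual thresholds and detection logic of the service processor are not described here.

```python
def fault_events(readings, threshold):
    """Return (index, event) pairs for each fault transition.

    A fault is asserted when a reading first exceeds the threshold and
    is cleared, with no manual action, as soon as a later reading is
    back within limits. All values are hypothetical examples.
    """
    events = []
    fault_active = False
    for index, value in enumerate(readings):
        exceeds = value > threshold
        if exceeds and not fault_active:
            events.append((index, "fault asserted"))
        elif not exceeds and fault_active:
            events.append((index, "fault cleared"))
        fault_active = exceeds
    return events


# Hypothetical temperature samples against a hypothetical limit of 50.
for index, event in fault_events([40, 55, 60, 45], threshold=50):
    print(index, event)
```

In this sketch the fault clears on the first reading that falls back under the limit, which mirrors the statement that no command is needed to repair an environmental fault.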
The Predictive Self-Healing technology does not monitor the hard drive for faults. As
a result, the service processor does not recognize hard drive faults and does not light
the fault LEDs on either the chassis or the hard drive itself. Use the Oracle Solaris
message files to view hard drive faults.
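The rules in this section can be summarized as a small decision function. The following Python sketch is a reading aid under stated assumptions, not part of ILOM: the names FaultKind, ClearAction, and clear_action are hypothetical, and the sp_detected_repair flag stands in for whether the service processor noticed the repair on its own.

```python
from enum import Enum


class FaultKind(Enum):
    ENVIRONMENTAL = "environmental"
    FRU = "fru"
    HARD_DRIVE = "hard drive"


class ClearAction(Enum):
    AUTOMATIC = "cleared by the service processor"
    MANUAL = "set component_state or fault_state manually"
    NOT_TRACKED = "check the Oracle Solaris message files"


def clear_action(kind, sp_detected_repair=True):
    """Map a fault kind to the way it is cleared (hypothetical model)."""
    if kind is FaultKind.HARD_DRIVE:
        # Hard drive faults are not recognized by the service processor.
        return ClearAction.NOT_TRACKED
    if kind is FaultKind.ENVIRONMENTAL:
        # Recovery of environmental faults is detected automatically.
        return ClearAction.AUTOMATIC
    # FRU fault: automatic in most cases, manual if the repair was missed.
    if sp_detected_repair:
        return ClearAction.AUTOMATIC
    return ClearAction.MANUAL


print(clear_action(FaultKind.FRU, sp_detected_repair=False).value)
```

The manual branch corresponds to the case in which the service processor does not extinguish the Service Required LED after a repair.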