IBM 750 Server User Manual


 
IBM United States Hardware Announcement
110-009
IBM is a registered trademark of International Business Machines Corporation
Remove/replace procedures
Service labels that contain remove/replace procedures are often found on a cover
of the system or in other spots accessible to the servicer. These labels provide
systematic procedures, including diagrams, detailing how to remove/replace certain
serviceable hardware components.
Arrows
Numbered arrows indicate the order of operation and the direction in which
components are serviced. Some serviceable parts such as latches, levers, and touch
points need to be pulled or pushed in a certain direction and in a certain order for
the mechanisms to engage or disengage. Arrows generally make these parts easier to
service.
Packaging for service
The following service enhancements are included in the physical packaging of the
systems to facilitate service:
Color coding (touch points): Terracotta-colored touch points indicate that a
component (FRU/CRU) can be concurrently maintained. Blue-colored touch points
identify components that are not concurrently maintained, that is, those that
require the system to be turned off for removal or repair.
Tool-less design: Selected IBM systems support tool-less or simple tool designs.
These designs require either no tools or only simple tools, such as flathead
screwdrivers, to service the hardware components.
Positive retention: Positive retention mechanisms help to ensure proper
connections between hardware components, such as between cables and connectors
and between two cards that attach to each other. Without positive retention,
hardware components run the risk of becoming loose during shipping or installation,
preventing a good electrical connection. Positive retention mechanisms like
latches, levers, thumb-screws, pop Nylatches (U-clips), and cables are included
to help prevent loose connections and aid in installing (seating) parts correctly.
These positive retention items do not require tools.
Error Handling and Reporting
In the unlikely event of a system hardware failure or an environmentally induced
failure, the system runtime error capture capability systematically analyzes the
hardware error signature to determine the cause of failure. The analysis result is
stored in system NVRAM. When the system can be successfully restarted, either
manually or automatically, the error is reported to the operating system. Error
Log Analysis (ELA) can then be used to display the failure cause and the physical
location of the failing hardware.
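The capture, store, and report sequence described above can be illustrated with a
minimal Python sketch. This is a toy model, not IBM firmware or any IBM API: the
class names, method names, and the sample cause and location strings are all
hypothetical, and clearing the stored record once it has been reported is an
assumption made only to keep the example simple.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ErrorRecord:
    """Hypothetical analysis result: failure cause plus physical location."""
    cause: str
    location: str


@dataclass
class ToySystem:
    """Toy model of the capture -> store -> report sequence (illustrative only)."""
    nvram: Optional[ErrorRecord] = None  # stands in for system NVRAM
    os_error_log: List[ErrorRecord] = field(default_factory=list)

    def capture_failure(self, cause: str, location: str) -> None:
        # Runtime error capture: keep the analysis result in non-volatile
        # storage so that it survives the restart.
        self.nvram = ErrorRecord(cause, location)

    def restart(self) -> None:
        # After a successful restart, report the stored result to the
        # operating system. Clearing the record afterwards is an assumption.
        if self.nvram is not None:
            self.os_error_log.append(self.nvram)
            self.nvram = None

    def error_log_analysis(self) -> List[str]:
        # ELA-style display: failure cause and physical location.
        return [f"{r.cause} at {r.location}" for r in self.os_error_log]


system = ToySystem()
system.capture_failure("example failure cause", "example location")
system.restart()
for line in system.error_log_analysis():
    print(line)
```

In this sketch nothing reaches the operating system log until the restart
succeeds, which mirrors the ordering described in the paragraph above.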
With the integrated Service Processor, the system has the ability to automatically
send out an alert via phone line to a pager or call for service in the event of a critical
system failure. A hardware fault will also turn on the amber system fault LED located
on the system unit to alert the user of an internal hardware problem. The indicator
may also be set to blink by the operator as a tool to allow system identification.
For identification, the blue locate LED on the enclosure and at the system level will
turn on solid. The amber system fault LED will be on solid when an error condition
occurs.
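As a rough illustration of the indicator behavior described above, the following
Python sketch models only the two unambiguous statements: a hardware fault turns
the amber system fault LED on solid, and an identify request turns the blue locate
LED on solid. The names LedState, Indicators, report_hardware_fault, and identify
are hypothetical and do not correspond to any IBM interface. Operator-initiated
blinking is deliberately left out of the model.

```python
from dataclasses import dataclass
from enum import Enum


class LedState(Enum):
    OFF = "off"
    SOLID = "solid"


@dataclass
class Indicators:
    """Toy model of the two system indicators (illustrative only)."""
    fault_amber: LedState = LedState.OFF
    locate_blue: LedState = LedState.OFF

    def report_hardware_fault(self) -> None:
        # A hardware fault turns the amber system fault LED on solid.
        self.fault_amber = LedState.SOLID

    def identify(self) -> None:
        # An identify request turns the blue locate LED on solid.
        self.locate_blue = LedState.SOLID


panel = Indicators()
panel.report_hardware_fault()
print(panel.fault_amber.value, panel.locate_blue.value)
```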
On POWER7 processor-based servers, hardware and software failures are recorded
in the system log. When an HMC is attached, an ELA routine analyzes the error,
forwards the event to the Service Focal Point (SFP) application running on the
HMC, and notifies the system administrator that it has isolated a likely cause of
the system problem. The Service Processor event log also records unrecoverable
checkstop conditions, forwards them to the SFP application, and notifies the system
administrator. Once the information is logged in the SFP application, if the system is
properly configured, a call home service request will be initiated and the pertinent
failure data with service parts information and part locations will be sent to an IBM
Service organization. Customer contact information and specific system-related