4 Sun Fire T1000 Server Service Manual • January 2006
The ALOM-CMT software is preinstalled as firmware, so ALOM initializes as soon
as you apply power to the system. You can customize ALOM to
work with your particular installation.
ALOM enables you to monitor and control your server over a network, or by using
a dedicated serial port for connection to a terminal or terminal server. ALOM
provides a command-line interface that you can use to remotely administer
geographically distributed or physically inaccessible machines. In addition, ALOM
enables you to remotely run diagnostics (such as POST) that would otherwise
require physical proximity to the server’s serial port.
You can configure ALOM to send email alerts of hardware failures, hardware
warnings, and other events related to the server or to ALOM. The ALOM circuitry
runs independently of the server, using the server’s standby power. Therefore,
ALOM firmware and software continue to function when the server operating
system goes offline or when the server is powered off. ALOM monitors the
following Sun Fire T1000 server components and conditions:
■ Hard disk drive status
■ Enclosure thermal conditions
■ Power supply status
■ Voltage levels
■ Faults detected by POST (Power-On Self-Test)
■ Solaris OS Predictive Self Healing (PSH) diagnostic facilities
For information about configuring and using the ALOM system controller, refer to
the Sun Fire T1000 Server Advanced Lights Out Manager (ALOM) Guide.
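As a rough illustration of the kind of threshold-based monitoring described above, the following Python sketch classifies a sensor reading as ok, warning, or failure. It is not ALOM code and does not reflect how ALOM is implemented: the Threshold class, the classify function, and every limit value are hypothetical and were chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Hypothetical limits for one sensor (not taken from ALOM)."""
    low_fail: float
    low_warn: float
    high_warn: float
    high_fail: float


def classify(reading: float, limits: Threshold) -> str:
    """Return "failure", "warning", or "ok" for a single reading."""
    # Failure limits are checked first because they are the outermost bounds.
    if reading <= limits.low_fail or reading >= limits.high_fail:
        return "failure"
    if reading <= limits.low_warn or reading >= limits.high_warn:
        return "warning"
    return "ok"


# Hypothetical limits for an enclosure temperature sensor, in degrees Celsius.
ENCLOSURE_TEMP = Threshold(low_fail=-10.0, low_warn=0.0,
                           high_warn=45.0, high_fail=55.0)

if __name__ == "__main__":
    for temp in (25.0, 48.0, 60.0):
        print(temp, classify(temp, ENCLOSURE_TEMP))
```

A monitor built along these lines would raise a warning event when a reading crosses a warning limit and a failure event when it crosses a failure limit, which corresponds to the hardware warnings and hardware failures that ALOM can report by email.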
System Reliability, Availability, and Serviceability
Reliability, availability, and serviceability (RAS) are aspects of a system’s design that
affect its ability to operate continuously and to minimize the time necessary to
service the system. Reliability refers to a system’s ability to operate continuously
without failures and to maintain data integrity. System availability refers to the
ability of a system to recover to an operational state after a failure, with minimal
impact. Serviceability relates to the time it takes to restore a system to service
following a system failure. Together, reliability, availability, and serviceability
features provide for near-continuous system operation.
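One common way to relate these three terms, which this manual does not itself state, is the steady-state availability ratio MTBF / (MTBF + MTTR), where mean time between failures (MTBF) reflects reliability and mean time to repair (MTTR) reflects serviceability. The Python sketch below computes that ratio. The function names and the sample figures are hypothetical and are not measurements of the Sun Fire T1000 server.

```python
HOURS_PER_YEAR = 24 * 365  # non-leap year, used only for the estimate below


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability as the fraction of time the system is up."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must be non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_hours_per_year(avail: float) -> float:
    """Expected downtime per year implied by a given availability."""
    return (1.0 - avail) * HOURS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical figures: one failure every 999 hours, one hour to repair.
    a = availability(mtbf_hours=999.0, mttr_hours=1.0)
    print(f"availability: {a:.4f}")
    print(f"downtime per year: {downtime_hours_per_year(a):.2f} hours")
```

The ratio shows why both kinds of feature matter: availability rises either when failures become rarer (larger MTBF) or when repairs become faster (smaller MTTR).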
To deliver high levels of reliability, availability, and serviceability, the Sun Fire T1000
server offers the following features:
■ Environmental monitoring
■ Error detection and correction for improved data integrity
■ Easy access for most component replacements
■ Extensive POST tests that automatically delete faulty components from the
configuration