Compaq 3000 Computer Accessories User Manual


 
Chapter 2. RAID Array Controller
EK–SMCPQ–UG. C01 2–17
When one controller fails, the survivor will process all I/O requests until the
failed controller is repaired and powered on. The subsystem will then return to its
previous state (i.e., ACTIVE / ACTIVE or ACTIVE / PASSIVE).
2.8.1 Initialization
During initialization, the firmware in the RAID 3000 verifies that both controllers
have consistent configurations, including identical memory cache and system
parameters. If the controller setups are incompatible, the set is not bound and
each controller operates in stand-alone mode.
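The check described above can be pictured with a minimal sketch. It is an illustration only, not the RAID 3000 firmware: the names ControllerConfig, cache_mb, system_params, and bind_mode are hypothetical, and the manual does not list which individual system parameters are compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControllerConfig:
    # Hypothetical fields: the manual names only "memory cache" and
    # "system parameters" without listing them individually.
    cache_mb: int
    system_params: tuple  # e.g. (("name", "value"), ...)


def bind_mode(a: ControllerConfig, b: ControllerConfig) -> str:
    """Return 'redundant' when both setups match, otherwise 'stand-alone'."""
    # Dataclass equality compares every field, so any mismatch in cache
    # size or system parameters leaves the pair unbound.
    return "redundant" if a == b else "stand-alone"
```

In this sketch a pair with different cache sizes yields "stand-alone", which mirrors the unbound case described above.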
2.8.2 Message Passing
The two controllers share information through a collection of messages
passed through the backplane connectors. The messages carry configuration
data as well as a heartbeat, which each controller transmits every 500 ms.
If a controller does not receive a heartbeat within one second, it assumes the peer
controller has become inoperable and begins failing over.
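The heartbeat timing can be sketched as follows. This is an illustration under assumptions, not the controller firmware: the class PeerMonitor and its methods are hypothetical, and only the 500 ms interval and the one-second timeout come from the text. Timestamps are passed in explicitly so the logic can be exercised without a real clock.

```python
HEARTBEAT_INTERVAL_S = 0.5  # from the text: sent every 500 ms
HEARTBEAT_TIMEOUT_S = 1.0   # from the text: failover after one second of silence


class PeerMonitor:
    """Tracks when the peer's last heartbeat arrived (hypothetical helper)."""

    def __init__(self, now: float) -> None:
        self.last_heartbeat = now

    def on_heartbeat(self, now: float) -> None:
        # Called whenever a heartbeat message arrives over the backplane.
        self.last_heartbeat = now

    def peer_failed(self, now: float) -> bool:
        # True once more than the timeout has elapsed with no heartbeat.
        return now - self.last_heartbeat > HEARTBEAT_TIMEOUT_S
```

Because the timeout is twice the transmit interval, a single delayed or lost heartbeat does not by itself trigger failover in this sketch.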
If the controllers cannot exchange messages because of communication problems
over the backplane, they break the connection and each controller
switches to stand-alone mode.
2.8.3 Failover
Failover describes the process of transferring data from a failed controller to a
survivor and completing any active tasks. When one controller begins the failover
process, it sends a reset to the other controller, which prevents the failing
unit from processing any more information, and it enables any host ports that are
passive. The survivor then downloads the failed controller’s cache to the unused
portion of its own cache and begins acting upon that data.
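The order of the failover steps can be sketched as follows. This is a simplified model under assumptions, not the actual firmware: the Controller class, its fields, and fail_over are hypothetical names, and the cache is modeled as a plain list.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    name: str
    halted: bool = False
    passive_ports_enabled: bool = False
    cache: list = field(default_factory=list)       # this controller's own entries
    peer_cache: list = field(default_factory=list)  # unused portion, held for the peer


def fail_over(survivor: Controller, failed: Controller) -> list:
    """Apply the steps in the order the text gives and return them as a log."""
    steps = []
    failed.halted = True                     # reset stops the failing unit
    steps.append("reset peer")
    survivor.passive_ports_enabled = True    # passive host ports become active
    steps.append("enable passive host ports")
    survivor.peer_cache = list(failed.cache) # copy into the unused cache portion
    steps.append("download peer cache")
    steps.append("process downloaded data")  # survivor acts on the copied entries
    return steps
```

Keeping the copied entries in a separate peer_cache field reflects the statement that the survivor uses the unused portion of its cache rather than overwriting its own data.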
While downloading the data, the controller responds to I/O by disconnecting (if
allowed) and waiting approximately three seconds before reconnecting and
presenting a BUSY status. The delay is to prevent host operating systems from
seeing too many errors and fencing off the controller.
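The response to host I/O during the cache download can be sketched as follows. This is an illustration only: the function name and the event strings are hypothetical, and the sketch assumes the three-second wait applies whether or not the host allows a disconnect, which the text does not state explicitly.

```python
import time

BUSY_DELAY_S = 3.0  # from the text: approximately three seconds


def respond_during_failover(disconnect_allowed: bool, sleep=time.sleep) -> list:
    """Return the sequence of bus events for one I/O request (hypothetical)."""
    events = []
    if disconnect_allowed:
        events.append("disconnect")
    # Assumption: the delay is applied in both cases.
    sleep(BUSY_DELAY_S)
    if disconnect_allowed:
        events.append("reconnect")
    events.append("BUSY")
    return events
```

The sleep parameter is injectable so the sequence can be checked without actually waiting. Spacing out the BUSY responses limits how many errors the host sees per unit of time, which is the purpose the text gives for the delay.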