IBM SG24-5131-00 Laptop User Manual


 
Verify that all sharedvg file systems and paging spaces are accessible
(df -k and lsps -a).
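The file system part of this check can be scripted. The following is a minimal sketch, not part of HACMP: the function name check_shared_fs is hypothetical, and the mount points passed to it are whatever your cluster plan defines for sharedvg. It only confirms that each mount point appears in the last column of the df -k output; paging spaces still need to be reviewed with lsps -a.

```shell
#!/bin/sh
# Sketch: confirm that each expected shared file system is mounted.
# check_shared_fs is a hypothetical helper; the mount points given as
# arguments are assumptions taken from your own cluster plan.
check_shared_fs() {
    rc=0
    for mp in "$@"; do
        # df -k prints the mount point in the last column
        if df -k | awk -v mp="$mp" '$NF == mp { found = 1 } END { exit !found }'; then
            echo "OK: $mp is mounted"
        else
            echo "MISSING: $mp is not mounted"
            rc=1
        fi
    done
    return $rc
}
```

For example, check_shared_fs /sharedfs1 /sharedfs2 (both names are placeholders) returns a non-zero status if either file system is absent.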
6.2.2 Node Failure / Reintegration
The following sections deal with issues of node failure and reintegration.
6.2.2.1 AIX Crash
Perform the following steps in the event of an AIX crash:
Check, by way of the verification commands, that all the Nodes in the
cluster are up and running.
Optional: Prune the error log on NodeF (errclear 0).
If NodeF is an SMP, you may want to set the fast reboot switch
(mpcfg -cf 11 1).
Monitor cluster logfiles on NodeT.
Crash NodeF by entering cat /etc/hosts > /dev/kmem. (The LED on NodeF
will display 888.)
The OS failure on NodeF will cause a node failover to NodeT.
Verify that failover has occurred (netstat -i and ping for networks,
lsvg -o and vi of a test file for volume groups, and ps -U <appuid> for
application processes).
Power cycle NodeF. If HACMP is not configured to start from /etc/inittab,
start HACMP on NodeF manually after the restart (smit clstart). NodeF will
take back its cascading Resource Groups.
Verify that reintegration has occurred (netstat -i and ping for networks,
lsvg -o and vi of a test file for volume groups, and ps -U <appuid> for
application processes).
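The verification steps above can be gathered into a small helper, to be run on whichever node is expected to hold the resources (NodeT after failover, NodeF after reintegration). This is a sketch only: the function name verify_takeover and its three arguments (a service IP label, a volume group name, and the application user ID) are hypothetical and must be replaced with values from your own cluster. It uses ping, lsvg -o, and ps -U as named in the steps; it does not cover the netstat -i review or the vi test of a file, and it does not replace monitoring the cluster log files.

```shell
#!/bin/sh
# Sketch: check network, volume group, and application after a takeover.
# verify_takeover and its arguments are hypothetical names for illustration.
# Usage: verify_takeover <service_label> <volume_group> <app_uid>
verify_takeover() {
    svc=$1; vg=$2; appuid=$3
    rc=0

    # Network: the service address should answer a single ping
    if ping -c 1 "$svc" >/dev/null 2>&1; then
        echo "OK: $svc answers ping"
    else
        echo "FAIL: $svc does not answer ping"; rc=1
    fi

    # Volume group: lsvg -o lists the varied-on volume groups, one per line
    if lsvg -o 2>/dev/null | grep -x "$vg" >/dev/null; then
        echo "OK: $vg is varied on"
    else
        echo "FAIL: $vg is not varied on"; rc=1
    fi

    # Application: ps -U prints a header line plus one line per process
    nproc=$(ps -U "$appuid" 2>/dev/null | awk 'NR > 1 { n++ } END { print n + 0 }')
    if [ "$nproc" -gt 0 ]; then
        echo "OK: $nproc process(es) owned by $appuid"
    else
        echo "FAIL: no processes owned by $appuid"; rc=1
    fi
    return $rc
}
```

A non-zero return status means at least one of the three checks failed; the printed lines show which one.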
6.2.2.2 CPU Failure
Perform the following steps in the event of CPU failure:
Check, by way of the verification commands, that all the Nodes in the
cluster are up and running.
Optional: Prune the error log on NodeF (errclear 0).
If NodeF is an SMP, you may want to set the fast reboot switch
(mpcfg -cf 11 1).
Monitor cluster logfiles on NodeT.
Power off NodeF. This will cause a node failover to NodeT.