IBM Certification Study Guide AIX HACMP (SG24-5131-00)
Verify that failover has occurred (netstat -i and ping for networks, lsvg -o
and vi of a test file for volume groups, and ps -U <appuid> for application
processes).
Power cycle NodeF. If HACMP is not configured to start from /etc/inittab
(on restart), start HACMP on NodeF (smit clstart). NodeF will take back
its cascading Resource Groups.
Verify that re-integration has occurred (netstat -i and ping for networks,
lsvg -o and vi of a test file for volume groups, and ps -U <appuid> for
application processes).
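The verification commands named above can be collected into one helper so that the same checks are run after failover and after re-integration. The following is only a sketch: the function name verify_resources and its three arguments (a peer address to ping, a test file on a shared file system, and the application user) are illustrative assumptions, and a non-interactive read test stands in for opening the test file with vi.

```shell
# Hypothetical helper: run each verification command and report failures.
# Arguments (all placeholders chosen for this sketch):
#   $1  address or host name to ping
#   $2  test file on a file system in the shared volume group
#   $3  user that owns the application processes
verify_resources() {
    peer=$1
    testfile=$2
    appuid=$3
    rc=0
    # Network interfaces and reachability
    netstat -i                          || rc=1
    ping -c 3 "$peer" >/dev/null        || { echo "ping $peer failed"; rc=1; }
    # Varied-on volume groups and a readable test file
    lsvg -o                             || rc=1
    [ -r "$testfile" ]                  || { echo "cannot read $testfile"; rc=1; }
    # Application processes
    ps -U "$appuid"                     || rc=1
    return $rc
}
```

A call such as verify_resources <peer> <testfile> <appuid> returns 0 only if every check succeeds; the output of netstat -i, lsvg -o, and ps still has to be inspected to confirm that the expected adapters, volume groups, and processes are present.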
6.2.2.3 TCP/IP Subsystem Failure
Check, by way of the verification commands, that all the Nodes in the
cluster are up and running.
Optional: Prune the error log on NodeF (errclear 0).
Monitor the cluster log files on NodeT.
On NodeF, stop the TCP/IP subsystem (sh /etc/tcp.clean) or crash the
subsystem by increasing the size of the sb_max and thewall parameters to
large values (no -o sb_max=10000; no -o thewall=10000) and ping NodeT.
Note that you should record the values for sb_max and thewall prior to
modifying them, and, as an extra check, you may want to add the original
values to the end of /etc/rc.net.
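One possible way to record the original values is sketched below. It assumes that no -o <name> prints the value in the form "<name> = <value>"; the function names and the save file path are hypothetical. Each saved line is itself a no -o command, so the file can be replayed to restore the settings.

```shell
# Sketch: save sb_max and thewall before changing them.
# $1 is a hypothetical path for the save file.
save_tunables() {
    savefile=$1
    : > "$savefile"
    for t in sb_max thewall; do
        # Assumes output of the form "<name> = <value>"
        val=$(no -o "$t" | awk '{print $3}')
        echo "no -o $t=$val" >> "$savefile"
    done
}

# Replay the saved commands to put the original values back.
restore_tunables() {
    . "$1"
}
```

For example, save_tunables /tmp/no.orig before the test and restore_tunables /tmp/no.orig afterwards. The saved lines are also in a form that could be reviewed and appended to /etc/rc.net as the extra check described above.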
The TCP/IP subsystem failure on NodeF will cause a network failure of all
the TCP/IP networks on NodeF. Unless there has been some
customization done to promote this type of failure to a node failure, only
the network failure will occur. The presence of a non-TCP/IP network
(RS232, target mode SCSI or target mode SSA) should prevent the cluster
from triggering a node down in this situation.
Verify that the network_down event has been run by checking the
/tmp/hacmp.out file on either node. By default, the network_down script
does nothing, but it can be customized to do whatever is appropriate for
that situation in your environment.
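A quick way to perform this check is to search the log for the event name. The helper below is a minimal sketch; the name count_event is an assumption, and it simply counts matching lines without relying on any particular log layout.

```shell
# Sketch: count the lines in an HACMP log that mention an event.
#   $1  event name, for example network_down
#   $2  log file (defaults to /tmp/hacmp.out)
count_event() {
    event=$1
    log=${2:-/tmp/hacmp.out}
    # grep -c prints the number of matching lines (0 if none)
    grep -c "$event" "$log"
}
```

Running count_event network_down before and after stopping TCP/IP on NodeF shows whether new entries were logged; the matching lines themselves can then be read with grep network_down /tmp/hacmp.out.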
On NodeF, issue the command startsrc -g tcpip. This should restart the
TCP/IP daemons, and should cause a network_up event to be triggered in
the cluster for each of your TCP/IP networks.
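The restart and the check for the resulting event can be combined, as in the sketch below. The function name, the default log path argument, and the polling interval are assumptions made for illustration. Because the log may already contain older network_up entries, the sketch compares the number of matching lines before and after the restart.

```shell
# Sketch: restart the tcpip group, then wait for a new network_up entry.
#   $1  log file (defaults to /tmp/hacmp.out)
#   $2  number of one-second polls before giving up (defaults to 30)
restart_and_wait() {
    log=${1:-/tmp/hacmp.out}
    tries=${2:-30}
    # Baseline count of existing network_up lines (0 if none or no file)
    before=$(grep -c network_up "$log" 2>/dev/null) || before=0
    startsrc -g tcpip || return 1
    while [ "$tries" -gt 0 ]; do
        now=$(grep -c network_up "$log" 2>/dev/null) || now=0
        # Succeed as soon as a new network_up line appears
        [ "${now:-0}" -gt "${before:-0}" ] && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

A return code of 0 means at least one new network_up line was logged; it does not confirm that an event ran for every TCP/IP network, so the log should still be read for each network name.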
6.2.3 Network Failure
Check, by way of the verification commands, that all the Nodes in the
cluster are up and running.
Optional: Prune the error log on NodeF (errclear 0).