MPCMM0001 Chassis Management Module Software Technical Product Specification 51
Process Monitoring and Integrity
6.7.8 Excessive Restarts, Successful Escalate Failover/Reboot
In this scenario, the PMS detects a process fault. The configured recovery action is to restart the
process. However, the PMS also detects that the process has exceeded the threshold for excessive
process restarts. Therefore, the PMS executes the escalation action. The configured escalation
recovery action is to fail over to the standby CMM and, once the failover has succeeded, reboot the
now-standby CMM. The escalated recovery action is successful.
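
The decision flow described above can be sketched roughly as follows. This is a minimal illustration, not the PMS implementation: `MonitoredProcess`, `recover`, the `restart_threshold` field, the return strings, and the `failover`/`reboot` callbacks are all hypothetical names, and the way restarts are counted is an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class MonitoredProcess:
    """Hypothetical bookkeeping for one monitored process."""
    name: str
    restart_threshold: int   # assumed: restarts allowed before escalating
    restart_count: int = 0


def recover(proc: MonitoredProcess, on_active_cmm: bool,
            failover: Callable[[], bool], reboot: Callable[[], None]) -> str:
    """Apply 'restart process', escalating to 'failover and reboot'."""
    if proc.restart_count < proc.restart_threshold:
        # Normal recovery: restart the process and count the restart.
        proc.restart_count += 1
        return "restart"
    # Excessive restarts: run the escalated recovery action.
    # The failover step is skipped when already running on the standby CMM.
    if on_active_cmm and not failover():
        # Assumed handling: do not reboot if the failover did not succeed.
        return "escalation-failed"
    reboot()  # reboot the CMM that is now (or already was) the standby
    return "escalated"
```

With a threshold of 2, the first two faults return `"restart"`, and the third triggers the failover followed by the reboot.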
Table 12. Existence Fault, Excessive Restarts, Escalate No Action (Sheet 2 of 2)

| Description | Event String | UID | Assert | Severity |
|---|---|---|---|---|
| PMS detects that the process has been restarted excessively. | Recovery failure due to excessive restarts | # | N/A | Configure |
| PMS attempts to execute the escalated recovery action. Since the recovery action is "no action", PMS disables monitoring of the process. | Take no action specified for escalated recovery | # | N/A | Configure |
| No attempt will be made to recover the process. The PMS will stop monitoring the process. See Section 6.7.11, “Process Administrative Action” on page 53, for information about how to re-enable monitoring and de-assert the event. | Process existence fault; monitoring disabled or Thread watchdog fault; monitoring disabled or Process integrity fault; monitoring disabled | # | Assert | Configure |
Table 13. Excessive Restarts, Successful Escalate Failover/Reboot

| Description | Event String | UID | Assert | Severity |
|---|---|---|---|---|
| PMS detects a faulty process. The mechanism (existence, thread watchdog, or integrity) used to detect the fault will determine which of the event type strings will be used. | Process existence fault; attempting recovery or Thread watchdog fault; attempting recovery or Process integrity fault; attempting recovery | # | Assert | Configure |
| The recovery action specified is "restart process". | Attempting process restart recovery action | # | N/A | Configure |
| PMS detects that the process has been restarted excessively. | Recovery failure due to excessive restarts | # | N/A | Configure |
| The escalated recovery action specified is "failover and reboot". | Attempting failover & reboot escalated recovery action | # | N/A | Configure |
| PMS executes a failover. Note: this step is skipped when running on the standby CMM. | The existing code generates the events for failover. They are separate from process monitoring events and are not described here. | - | N/A | N/A |
| With PMS running on the standby CMM (the failover was successful, or PMS was already running on the standby), PMS recovers the CMM by rebooting. Upon initialization of PMS after the reboot, the monitor will de-assert the event. | Monitoring initialized | # | De-assert | OK |
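
As an illustration of the event sequence in Table 13, the sketch below lists the process monitoring events in order for a given detection mechanism. The event strings and the Assert and Severity values are copied from the table. The `Event` type, the `FAULT_PREFIX` mapping, its keys, and the function name are hypothetical, and the separately generated failover events are omitted, as they are in the table.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    text: str        # Event String column
    assertion: str   # Assert column: "Assert", "N/A", or "De-assert"
    severity: str    # Severity column, as printed in the table


# Hypothetical keys for the three detection mechanisms named in the table.
FAULT_PREFIX = {
    "existence": "Process existence fault",
    "thread_watchdog": "Thread watchdog fault",
    "integrity": "Process integrity fault",
}


def table13_events(mechanism: str) -> List[Event]:
    """Return the process monitoring events of Table 13, in order."""
    prefix = FAULT_PREFIX[mechanism]
    return [
        Event(f"{prefix}; attempting recovery", "Assert", "Configure"),
        Event("Attempting process restart recovery action", "N/A", "Configure"),
        Event("Recovery failure due to excessive restarts", "N/A", "Configure"),
        Event("Attempting failover & reboot escalated recovery action",
              "N/A", "Configure"),
        # Logged when PMS initializes after the reboot.
        Event("Monitoring initialized", "De-assert", "OK"),
    ]


if __name__ == "__main__":
    for event in table13_events("thread_watchdog"):
        print(f"{event.assertion:<10} {event.severity:<10} {event.text}")
```

Only the first event string depends on the detection mechanism; the remaining strings are the same for all three.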