CHAPTER 4 PCI Card Hot Maintenance in Red Hat Enterprise Linux 6
4.2 Hot add of SB
C122-E175-01EN
4.2.4 What to do when a timeout occurs while the OS is processing SB hot add
If the OS does not finish SB hot add processing within the predetermined time, the timeout message “DR sequence
timeout: SB hot-add OS failure” is shown on the MMB CLI.
This means that the DR completion message from the OS did not reach the MMB. In this case, some collaboration
programs may have hung even though the DR process is still running on the OS. Rebooting the partition is
recommended because it is difficult to estimate when the process will complete.
SB hot add processing by the OS can be divided into three main parts. Check /var/log/messages to determine
which part is taking a long time.
- Pre-process of collaboration program
- Activating added resources
- Post-process of collaboration program
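As a rough aid for deciding which of the three parts a stalled hot add has reached, the following sketch looks for the dp-util message IDs 800, 808, and 809 that appear in the log samples in this section. The function name dr_phase is hypothetical, and the sketch assumes that the log file passed to it contains only one SB hot add sequence and that each message is written on a single line.

```shell
#!/bin/sh
# Sketch only: dr_phase is a hypothetical helper, not part of dp-util.
# Assumes one SB hot add sequence in the log and the message IDs
# 800, 808, and 809 as they appear in the log samples in this section.
dr_phase() {
    log="${1:-/var/log/messages}"
    if grep -q 'dp-util\[[0-9]*\]: INFO : 809 : Added SB' "$log"; then
        # Resources were activated; only the post-process can still be pending.
        echo "added resources activated"
    elif grep -q 'dp-util\[[0-9]*\]: INFO : 808 : ' "$log"; then
        # Pre-process finished, but message 809 has not been logged yet.
        echo "activating added resources"
    elif grep -q 'dp-util\[[0-9]*\]: INFO : 800 : ' "$log"; then
        # Hot add was detected, but message 808 has not been logged yet.
        echo "pre-process of collaboration program"
    else
        echo "no SB hot add detected"
    fi
}
```

Calling dr_phase with no argument reads /var/log/messages, which normally requires root privileges. Passing a file that holds only the relevant time range avoids mixing in messages from an earlier hot add.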
1. Checking pre-process of collaboration program
The following messages in /var/log/messages correspond to the pre-process of the collaboration program.
Dec 17 00:15:33 xxx dp-util[4457]: INFO : 800 : Detected SB hot-add
Dec 17 00:15:33 xxx dp-util[4457]: INFO : 801 : Added SB3, Node6,7
Dec 17 00:15:33 xxx dp-util[4457]: INFO : 807 : Execute 1 user programs at ADD_PRE timing
...
Dec 17 00:15:34 xxx dp-util[4457]: 10-FJSVdp-util-kdump-restart : INFO : start
...
Dec 17 00:15:34 xxx dp-util[4457]: 10-FJSVdp-util-kdump-restart : INFO : result: 0
...
Dec 17 00:15:34 xxx dp-util[4457]: INFO : 808 : Executed user programs at ADD_PRE timing
If “INFO : 808 : Executed user programs at ADD_PRE timing” is not output, the pre-process of the
collaboration program is delayed. Check which collaboration program is taking a long time by examining
/var/log/messages and, if present, the ‘collaboration program name.log’ file created in the
/opt/FJSVdp-util/var/log directory.
The developer of the collaboration program can be identified with the rpm command below. Ask the developer
about the cause of the delay.
(Example) Checking the developer of the collaboration program “10-FJSVdp-util-kdump-restart”
$ rpm -qif /opt/FJSVdp-util/user_command/10-FJSVdp-util-kdump-restart
...
Rebooting the partition is recommended because the SB hot add process has been left in an incomplete state.
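To narrow down which collaboration program is delayed, one option is to list the programs that logged “INFO : start” but no matching “INFO : result:” line. The sketch below assumes that the log lines are whitespace-separated exactly as in the sample above (program name in the sixth field). The function name pending_programs is hypothetical.

```shell
#!/bin/sh
# Sketch only: pending_programs is a hypothetical helper.
# Assumes lines of the form shown in the sample log, for example:
#   Dec 17 00:15:34 xxx dp-util[4457]: 10-FJSVdp-util-kdump-restart : INFO : start
# Field 5 is the dp-util tag, field 6 the program name, field 10 the event.
pending_programs() {
    log="${1:-/var/log/messages}"
    awk '
        $5 ~ /^dp-util\[/ && $8 == "INFO" && $10 == "start"   { pending[$6] = 1 }
        $5 ~ /^dp-util\[/ && $8 == "INFO" && $10 == "result:" { delete pending[$6] }
        END { for (name in pending) print name }
    ' "$log"
}
```

Each name printed is a candidate to pass to the rpm -qif command, using its path under /opt/FJSVdp-util/user_command. An empty result means every started program also logged a result.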
2. Checking the time for activating added resources
The following messages in /var/log/messages correspond to the process of activating the added resources.
Dec 17 00:15:34 xxx dp-util[4457]: INFO : 802 : Add CPU30-59 (total 30)
Dec 17 00:15:34 xxx dp-util[4457]: INFO : 804 : Add MEM98304-98559,114688-114943 (total 67108864 kiB)
...
Dec 17 00:15:47 xxx dp-util[4457]: INFO : 809 : Added SB3
If “INFO : 809 : Added SBX” is not output, the process of activating the added resources is delayed. Run the
commands below every few seconds to check whether CPU or memory is still being added.
- Checking the number of CPUs
$ grep -c processor /proc/cpuinfo
30
- Checking the size of memory
$ cat /proc/meminfo |grep MemTotal
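Repeating the two checks by hand can be tedious, so a small loop is sketched here. It runs the same grep commands against /proc/cpuinfo and /proc/meminfo at a fixed interval and prefixes each sample with the time. The function names show_resources and watch_resources, the default of 10 samples, and the 5-second interval are all illustrative assumptions.

```shell
#!/bin/sh
# Sketch only: show_resources and watch_resources are hypothetical helpers.

# Print one sample: current time, CPU count, and the MemTotal line.
# The file arguments default to the /proc files used in the manual checks.
show_resources() {
    cpuinfo="${1:-/proc/cpuinfo}"
    meminfo="${2:-/proc/meminfo}"
    cpus=$(grep -c processor "$cpuinfo")
    mem=$(grep MemTotal "$meminfo")
    echo "$(date '+%H:%M:%S') CPUs: $cpus  $mem"
}

# Take COUNT samples, INTERVAL seconds apart (defaults: 10 samples, 5 seconds).
watch_resources() {
    count="${1:-10}"
    interval="${2:-5}"
    i=0
    while [ "$i" -lt "$count" ]; do
        show_resources
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, watch_resources 6 10 takes six samples over about 50 seconds. If the CPU count or MemTotal value keeps rising between samples, resources are still being added.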