C – Troubleshooting
InfiniPath MPI Troubleshooting
IB6054601-00 D C-25
These messages appear in the mpirun output. Most are followed by an abort, and
possibly a backtrace. Each is preceded by the name of the function in which the
exception occurred.
Error sending packet: description
Error receiving packet: description
A fatal protocol error occurred while trying to send or receive an InfiniPath packet.
On Node n, process p seems to have forked.
The new process id is q. Forking is illegal under
InfiniPath. Exiting.
An MPI process has forked and its child process has attempted to make MPI calls.
This is not allowed.
processlabel Fatal Error in filename line_no: error_string
This is always followed by an abort. The processlabel usually takes the form of
the host name followed by the process rank. At the time of writing, the possible
error_strings are:
Illegal label format character.
Recv Error.
Memory allocation failed.
Error creating shared memory object.
Error setting size of shared memory object.
Error mapping shared memory.
Error opening shared memory object.
Error attaching to shared memory.
invalid remaining buffers !!
Node table has inconsistent length!
Timeout waiting for nodetab!
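When a job's output is long, it can help to pull the fatal error lines out of a saved mpirun log. The following is a minimal sketch, not part of the InfiniPath software. It assumes each message sits on one line in the layout shown above, with filename and line_no separated by a single space and line_no an integer. The function name find_fatal_errors is hypothetical.

```python
import re

# Assumed layout (taken from the message template above, details are a guess):
#   <processlabel> Fatal Error in <filename> <line_no>: <error_string>
FATAL_RE = re.compile(
    r"^(?P<label>.+?) Fatal Error in (?P<filename>\S+) "
    r"(?P<line_no>\d+): (?P<error>.*)$"
)


def find_fatal_errors(output):
    """Return one dict per line of mpirun output that matches the assumed layout."""
    hits = []
    for line in output.splitlines():
        match = FATAL_RE.match(line.strip())
        if match:
            hits.append(match.groupdict())
    return hits


if __name__ == "__main__":
    # Hypothetical sample line: the label and file name are invented for illustration.
    sample = "node01:3 Fatal Error in example.c 212: Recv Error."
    for hit in find_fatal_errors(sample):
        print(hit["label"], "->", hit["error"])
```

If the real messages wrap across lines or use a different separator, the pattern would need adjusting.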
The following indicates an unknown host:
$ mpirun -np 2 -m ~/tmp/q mpi_latency 100 100
MPIRUN: Cannot obtain IP address of <nodename>: Unknown host
<nodename> 15:35_~.1019
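Before rerunning the job, each host name can be checked for name resolution from the node that launches mpirun. The sketch below is an illustration only. It assumes the file given with -m lists one host per line, optionally followed by a colon and a process count, with # starting a comment. The function names hosts_from_file and unresolvable are hypothetical.

```python
import socket


def hosts_from_file(text):
    """Extract host names from hosts-file text (assumed format: host[:count])."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if line:
            hosts.append(line.split(":", 1)[0])
    return hosts


def unresolvable(hosts, resolve=socket.gethostbyname):
    """Return the hosts whose names cannot be resolved to an IP address."""
    bad = []
    for host in hosts:
        try:
            resolve(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            bad.append(host)
    return bad


if __name__ == "__main__":
    import sys

    # Usage: python check_hosts.py <hosts-file>
    with open(sys.argv[1]) as handle:
        for name in unresolvable(hosts_from_file(handle.read())):
            print("Cannot resolve:", name)
```

Any host printed here would be expected to trigger the Unknown host message above. The usual fix is to correct the name in the hosts file or in the name service.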
The following indicates that there is no route to a valid host:
$ mpirun -np 2 -m ~/tmp/q mpi_latency 100 100
ssh: connect to host <nodename> port 22: No route to host
MPIRUN: Some node programs ended prematurely without connecting to mpirun.
MPIRUN: No connection received from 1 node process on node <nodename>
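The ssh line in this output shows that the connection to port 22 on the node failed. A quick way to confirm which nodes are affected is to attempt a TCP connection to that port from the launching node. The following is a minimal sketch under that assumption. The function name can_connect and the 3-second default timeout are hypothetical choices, not part of the InfiniPath software.

```python
import socket


def can_connect(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers no route to host, refused, timeout, unknown host
        return False


if __name__ == "__main__":
    import sys

    # Usage: python check_ssh.py host1 host2 ...
    for name in sys.argv[1:]:
        state = "reachable" if can_connect(name) else "NOT reachable"
        print(name, "port 22", state)
```

A node reported as not reachable points to a network or routing problem, or to an ssh daemon that is not running, rather than to an MPI problem.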