3 – Using InfiniPath MPI
Debugging MPI Programs
3-20 IB6054601-00 D
may be desirable to run multiple MPI processes and multiple OpenMP threads per
node.
The number of OpenMP threads is typically controlled by the
OMP_NUM_THREADS environment variable in the .mpirunrc file. This may be
used to adjust the split between MPI processes and OpenMP threads. Usually the
number of MPI processes (per node) times the number of OpenMP threads will be
set to match the number of CPUs per node. An example case would be a node with
4 CPUs, running 1 MPI process and 4 OpenMP threads. In this case,
OMP_NUM_THREADS is set to 4. OMP_NUM_THREADS is set on a per-node basis.
See section 3.5.8 for information on setting environment variables.
The MPI_THREAD_SERIALIZED and MPI_THREAD_MULTIPLE models are not
yet supported.
NOTE: If there are more threads than CPUs, then both MPI and OpenMP
performance can be significantly degraded due to over-subscription of
the CPUs.
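The sizing rule above (MPI processes per node times OpenMP threads equals CPUs per node) can be sketched in C as follows. The helper name omp_threads_per_process is hypothetical and is not part of InfiniPath MPI or OpenMP; it only computes a value that one might then assign to OMP_NUM_THREADS.

```c
#include <assert.h>

/* Hypothetical helper (not an InfiniPath or OpenMP API).
 * Given the number of CPUs on a node and the number of MPI processes
 * placed on that node, return an OpenMP thread count per MPI process
 * such that processes * threads does not exceed the CPU count.
 * If there are more processes than CPUs, the node is over-subscribed
 * regardless, and the function returns the minimum of 1 thread. */
int omp_threads_per_process(int cpus_per_node, int mpi_procs_per_node)
{
    assert(cpus_per_node > 0 && mpi_procs_per_node > 0);

    /* Integer division rounds down, so leftover CPUs stay idle
     * rather than being over-subscribed. */
    int threads = cpus_per_node / mpi_procs_per_node;

    return threads > 0 ? threads : 1;
}
```

For the 4-CPU node described in the text, 1 MPI process per node yields 4 threads, and 2 MPI processes per node would yield 2 threads each.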
3.11 Debugging MPI Programs
Debugging parallel programs is substantially more difficult than debugging serial
programs. Thoroughly debugging the serial parts of your code before parallelizing
is good programming practice.
3.11.1 MPI Errors
Almost all MPI routines (except MPI_Wtime and MPI_Wtick) return an error code,
either as the function return value in C or as the last argument in a Fortran
subroutine call. Before the value is returned, the current MPI error handler is called.
By default, this error handler aborts the MPI job. To get information about MPI
errors in your code, replace the default handler with the predefined
MPI_ERRORS_RETURN handler, or with a handler of your own, so that the error code
is returned to the caller. See the man page for MPI_Errhandler_set for details.
NOTE: MPI does not guarantee that an MPI program can continue past an error.
See the standard MPI documentation referenced in appendix D for details on the
MPI error codes.
3.11.2 Using Debuggers
The InfiniPath software supports the use of multiple debuggers, including pathdb
and gdb, as well as the system call tracing utility strace. The debuggers let you set
breakpoints in a running program, and examine and set its variables.