IB6054601-00 D B-1
Appendix B
Integration with a Batch Queuing System
Most cluster systems use some kind of batch queuing system as an orderly way to
provide users with access to the resources they need to meet their job’s performance
requirements. One of the tasks of the cluster administrator is to provide means for
users to submit MPI jobs through such batch queuing systems. This can take the
form of a script, which your users can invoke much as they would invoke mpirun
to submit their MPI jobs. A sample script is presented in this section.
B.1
A Batch Queuing Script
We give an example of some of the functions that such a script might perform,
in the context of the Simple Linux Utility for Resource Management (SLURM) developed
at Lawrence Livermore National Laboratory. These functions assume the use of the
bash shell. We will call this script batch_mpirun. It is provided here:
#! /bin/sh
# Very simple example batch script for InfiniPath MPI, using slurm
# (http://www.llnl.gov/linux/slurm/)
# Invoked as:
# batch_mpirun #cpus mpi_program_name mpi_program_args ...
#
np=$1 mpi_prog="$2" # assume arguments to script are correct
shift 2 # program args are now $@
eval `srun --allocate --ntasks=$np --no-shell`
mpihosts_file=`mktemp -p /tmp mpihosts_file.XXXXXX`
srun --jobid=${SLURM_JOBID} hostname -s | sort | uniq -c \
| awk '{printf "%s:%s\n", $2, $1}' > $mpihosts_file
mpirun -np $np -m $mpihosts_file "$mpi_prog" "$@"
exit_code=$?
scancel ${SLURM_JOBID}
rm -f $mpihosts_file
exit $exit_code
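The script above assumes that its arguments are correct. As a minimal sketch of how that assumption might be checked before any resources are allocated, a function such as the following could be called first. The function name check_args and the error messages are illustrative only and are not part of the original script.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original batch_mpirun script):
# validate "#cpus mpi_program_name [mpi_program_args ...]" before calling srun.
check_args() {
    # At least a CPU count and a program name are required.
    if [ "$#" -lt 2 ]; then
        echo "usage: batch_mpirun #cpus mpi_program_name [mpi_program_args ...]" >&2
        return 2
    fi
    # Reject an empty count or one containing any non-digit character.
    case "$1" in
        ''|*[!0-9]*)
            echo "batch_mpirun: #cpus must be a positive integer: $1" >&2
            return 2
            ;;
    esac
    # Reject a count of zero.
    if [ "$1" -eq 0 ]; then
        echo "batch_mpirun: #cpus must be greater than zero" >&2
        return 2
    fi
    return 0
}
```

In the script, a line such as check_args "$@" || exit 2 placed before the np=$1 assignment would stop the run before srun is invoked with a bad task count.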
In the following sections, setup and the various functions of the script are discussed
in further detail.
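The step that builds the mpihosts file can be tried without a SLURM allocation. The sketch below runs the same sort, uniq -c, and awk pipeline on a fixed list of host names instead of on srun output. The function name make_mpihosts and the host names node01 and node02 are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read one host name per line on stdin and write
# "hostname:process_count" lines, as the script does with the srun output.
make_mpihosts() {
    sort | uniq -c | awk '{printf "%s:%s\n", $2, $1}'
}

# Stand-in for "srun --jobid=... hostname -s":
# two tasks landed on node01 and one on node02.
printf '%s\n' node01 node02 node01 | make_mpihosts
# prints:
#   node01:2
#   node02:1
```

Each output line pairs a host name with the number of tasks that reported from it, which is the form the script writes to the file it passes to mpirun with the -m option.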
B.1.1
Allocating Resources
When the mpirun command starts, it requires the number of node programs it must
spawn (via the -np option) and an mpihosts file listing the nodes on which the
node programs may be run. (See section 3.5.8 for
more information.) Normally, since performance is usually important, a user might