The resource request is sent to the Partition Manager, pmanager (described in
Section 4.4). The Partition Manager performs access checks (described in
Chapter 6 (Access Control, Usage Limits and Accounting)) and then allocates CPUs
according to the policies established for the partition (see Chapter 7 (RMS Scheduling)).
RMS distinguishes between allocating resources and starting jobs on them. Before
the Partition Manager schedules a parallel program, it ensures that the required
CPUs and memory are allocated. This may cause requests to block for longer than
you might expect, especially when the job has not specified how much memory it
requires. Once CPUs have been allocated, jobs can be started on them immediately.
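The two phases described above can be illustrated with a small model. This is a minimal sketch, not the RMS interface: the class ToyPartition and its methods allocate and start are hypothetical names invented for this example, and a request that cannot be satisfied simply returns None where a real request would block.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPartition:
    """Hypothetical model of a partition; not the RMS API."""
    free_cpus: int
    free_memory_mb: int
    allocations: dict = field(default_factory=dict)
    next_id: int = 0

    def allocate(self, cpus, memory_mb):
        """Phase 1: reserve CPUs and memory.

        Returns an allocation id, or None if the request would have to wait.
        """
        if cpus > self.free_cpus or memory_mb > self.free_memory_mb:
            return None
        self.free_cpus -= cpus
        self.free_memory_mb -= memory_mb
        alloc_id = self.next_id
        self.next_id += 1
        self.allocations[alloc_id] = (cpus, memory_mb)
        return alloc_id

    def start(self, alloc_id, program):
        """Phase 2: start one process per allocated CPU; no resource check is needed."""
        cpus, _ = self.allocations[alloc_id]
        return [f"{program}[{rank}]" for rank in range(cpus)]
```

In this model, only allocate can fail to make progress; start always succeeds on an existing allocation, which mirrors the statement that jobs can be started immediately once CPUs have been allocated.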
3.3 Loading and Running Programs
A simple parallel program is shown in Figure 3.2. It has eight application processes,
distributed over four nodes, two processes per node.
Figure 3.2: Loading and Running a Parallel Program
[Diagram labels: prun, stdio, Partition Manager, RMS Node, rmsd, rmsloader, Four Nodes in a Parallel Partition; application processes labeled in the order 0, 4, 1, 5, 2, 6, 3, 7.]
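The placement of eight processes on four nodes, two per node, can be sketched as follows. The process labels in Figure 3.2 appear in the order 0, 4, 1, 5, 2, 6, 3, 7, which may indicate that consecutive process numbers are dealt out across the nodes in turn; that reading is an inference from the figure labels, not something the text states. The function names block_layout and cyclic_layout are hypothetical and are not part of RMS.

```python
def block_layout(nprocs, nnodes):
    """Place consecutive process numbers on the same node."""
    if nprocs % nnodes != 0:
        raise ValueError("this sketch assumes an equal number of processes per node")
    per_node = nprocs // nnodes
    return [list(range(n * per_node, (n + 1) * per_node)) for n in range(nnodes)]


def cyclic_layout(nprocs, nnodes):
    """Deal process numbers out to the nodes in turn."""
    if nprocs % nnodes != 0:
        raise ValueError("this sketch assumes an equal number of processes per node")
    return [list(range(n, nprocs, nnodes)) for n in range(nnodes)]


print(block_layout(8, 4))   # [[0, 1], [2, 3], [4, 5], [6, 7]]
print(cyclic_layout(8, 4))  # [[0, 4], [1, 5], [2, 6], [3, 7]]
```

Both layouts give two processes per node; only the cyclic layout reproduces the label pairs in the order they appear in the figure.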
Once the CPUs have been allocated, prun asks the pmanager to start the application
processes on the allocated CPUs. The pmanager does this by instructing the daemons
running on each of the allocated nodes to start the loader process rmsloader on the
user’s behalf.
The rmsloader process starts the application processes and forwards their
stdout and stderr streams to prun (unless otherwise directed). Meanwhile, prun
supplies information on the application processes as requested by rmsloader and
forwards stdout and stderr to the controlling terminal or output files.
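The general idea of a launcher forwarding the output of several processes can be sketched with the Python standard library. This is an illustration of the technique only, not how rmsloader or prun is implemented: the function run_and_forward is a hypothetical name, the processes run on the local machine rather than on remote nodes, and the "[rank]" prefix on each line is a choice made for this example.

```python
import subprocess
import sys
import threading


def _pump(stream, sink, rank, lock):
    """Copy lines from one child's pipe to the sink, tagging each with its rank."""
    for line in stream:
        with lock:
            sink.write(f"[{rank}] {line}")
    stream.close()


def run_and_forward(commands, out=sys.stdout, err=sys.stderr):
    """Start one child per command and forward its stdout and stderr.

    Returns the list of exit codes, indexed by rank.
    """
    lock = threading.Lock()  # keeps lines from different children from interleaving
    procs, threads = [], []
    for rank, cmd in enumerate(commands):
        p = subprocess.Popen(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
        )
        procs.append(p)
        # One reader thread per pipe, so a full pipe never stalls a child.
        for stream, sink in ((p.stdout, out), (p.stderr, err)):
            t = threading.Thread(target=_pump, args=(stream, sink, rank, lock))
            t.start()
            threads.append(t)
    for t in threads:
        t.join()
    return [p.wait() for p in procs]
```

Passing file objects for out and err instead of the defaults corresponds to directing output to files rather than to the controlling terminal.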
prun forwards stdin and certain signals (QUIT, USR1, USR2, WINCH) to the application
processes. If prun is killed, RMS cleans up the parallel program, killing the application