The Role of the RMS
Scheduling: deciding when and where to run parallel jobs
Audit: maintaining an audit trail of system state changes
From the user’s point of view, RMS provides tools for:
Information: querying the resources of the system
Execution: loading and running parallel programs on a given set of resources
Monitoring: monitoring the execution of parallel programs
2.3.1 The Structure of the RMS
RMS is implemented as a set of UNIX commands and daemons, programmed in C and
C++, using sockets for communications. All of the details of the system (its
configuration, its current state, usage statistics) are maintained in a SQL database, as
shown in Figure 2.3. See Section 2.3.4 for an overview and Chapter 10 (The RMS
Database) for details of the database.
2.3.2 The RMS Daemons
A set of daemons provides the services required for managing the resources of the system.
To do this, the daemons both query and update the database (see Section 2.3.4).
• The Database Manager, msqld, provides SQL database services.
• The Machine Manager, mmanager, monitors the status of nodes in an RMS system.
• The Partition Manager, pmanager, controls the allocation of resources to users and
the scheduling of parallel programs.
• The Switch Network Manager, swmgr, supervises the operation of the Compaq
AlphaServer SC Interconnect, monitoring it for errors and collecting performance
data.
• The Event Manager, eventmgr, runs handlers in response to system incidents and
notifies clients who have registered an interest in them.
• The Transaction Log Manager, tlogmgr, instigates database transactions that have
been requested in the Transaction Log. All client transactions are made through this
mechanism. This ensures that changes to the database are serialized and an audit
trail is kept.
• The Process Manager, rmsmhd, runs on each node in the system. It starts the other
RMS daemons.
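
The Transaction Log mechanism described above can be illustrated with a small sketch. It is not the RMS implementation: it uses Python's built-in sqlite3 module in place of msqld, and the table names (tlog, node_state), their columns, and the functions open_db, request and run_pending are hypothetical names chosen for the example. It shows only the general idea: clients record requested changes in a log table, and a single manager applies them one at a time in log order, so that changes are serialized and the log doubles as an audit trail.

```python
import json
import sqlite3


def open_db():
    """Create an in-memory database with a hypothetical state table and log."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE node_state (name TEXT PRIMARY KEY, status TEXT);
        CREATE TABLE tlog (
            id        INTEGER PRIMARY KEY AUTOINCREMENT,
            client    TEXT NOT NULL,
            statement TEXT NOT NULL,
            params    TEXT NOT NULL,
            done      INTEGER NOT NULL DEFAULT 0
        );
    """)
    return db


def request(db, client, statement, params):
    """Client side: record a requested change without touching the data tables."""
    cur = db.execute(
        "INSERT INTO tlog (client, statement, params) VALUES (?, ?, ?)",
        (client, statement, json.dumps(params)),
    )
    db.commit()
    return cur.lastrowid


def run_pending(db):
    """Manager side: apply pending requests one at a time, in log order."""
    rows = db.execute(
        "SELECT id, statement, params FROM tlog WHERE done = 0 ORDER BY id"
    ).fetchall()
    for entry_id, statement, params in rows:
        with db:  # one database transaction per log entry
            db.execute(statement, json.loads(params))
            db.execute("UPDATE tlog SET done = 1 WHERE id = ?", (entry_id,))
    return len(rows)


if __name__ == "__main__":
    db = open_db()
    request(db, "mmanager", "INSERT INTO node_state VALUES (?, ?)",
            ["node0", "running"])
    request(db, "admin", "UPDATE node_state SET status = ? WHERE name = ?",
            ["configured out", "node0"])
    run_pending(db)
    # The log rows remain after processing and serve as the audit trail.
    for row in db.execute("SELECT id, client, statement, done FROM tlog"):
        print(row)
```

Because entries are kept in the log after they are applied, the sequence of changes and the client that requested each one can be reconstructed later. A real system would also need to record failures and timestamps, which this sketch omits.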