Chapter 2 HPSS Planning
80 September 2002 HPSS Installation Guide
Release 4.5, Revision 2
requests to the DMAP Gateway. Migration processes (hpss_hdm_mig) migrate data to HPSS, and
purge processes (hpss_hdm_pur) purge migrated data from DFS and XFS. A set of processes
(hpss_hdm_tcp) accept requests from the DMAP Gateway and perform the requested operation in
DFS. A destroy process (hpss_hdm_dst) takes care of deleting files. Finally, XFS HDMs have a
process that watches for stale events (hpss_hdm_stl) and keeps the HDM from getting bogged down
by them.
There are three types of event handlers based on the type of activity that generates the events:
administrative, name space, and data. Administrative activities include mounting and
dismounting aggregates. Name space activities include creating, deleting, or renaming objects, and
changing an object's attributes. Data activities include reading and writing file data. The number of
processes allocated to handle events generated by these activities should be large enough to allow
a reasonable mix of these activities to run in parallel.
When the HDM fetches an event from DFS or XFS, the event is placed on a queue and assigned to an
appropriate event handler when one becomes free. The total number of entries allowed in the
queue is determined by a configuration parameter. If this value is not large enough to handle a
reasonable number of requests, some of the event handlers may be starved. For example, if the
queue fills up with data events, the name space handlers will be starved. Section 7.6.3.3.1: config.dat
File on page 449 discusses the criteria for selecting the size of an event queue.
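
The starvation case described above can be illustrated with a small model. This is only a sketch under stated assumptions: the EventQueue class, its method names, and the capacity of 4 are hypothetical and do not correspond to any HDM source code or config.dat parameter name. It shows a single bounded queue shared by all event types, and how a queue full of data events leaves a free name space handler with nothing to do.

```python
from collections import deque


class EventQueue:
    """Hypothetical bounded queue shared by all event types (not HDM code)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = deque()

    def offer(self, kind, payload):
        """Queue a fetched event; return False if the queue is already full."""
        if len(self.entries) >= self.max_entries:
            return False
        self.entries.append((kind, payload))
        return True

    def take(self, kind):
        """Give the oldest queued event of this kind to a free handler, or None."""
        for i, (queued_kind, payload) in enumerate(self.entries):
            if queued_kind == kind:
                del self.entries[i]
                return payload
        return None


# Fill the (assumed) four-entry queue with data events only.
q = EventQueue(max_entries=4)
for n in range(4):
    q.offer("data", f"read-{n}")

# A name space event cannot be queued, so a free name space handler starves.
accepted = q.offer("namespace", "rename-a")
idle_handler_gets = q.take("namespace")
```

In this model, accepted is False and idle_handler_gets is None until a data handler drains at least one entry, which is why the queue size should be chosen with the expected mix of activities in mind.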
The HDM logs outstanding name space events. If the HDM is interrupted, the log is replayed when the
HDM restarts to ensure that the events have been processed to completion and the DFS/XFS and
HPSS name spaces are synchronized. The size of the log is determined by a configuration
parameter, as discussed in Section 7.6.3.3.1: config.dat File on page 449.
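
The record-then-replay idea can be sketched as follows. The NameSpaceEventLog class, its methods, and the in-memory dictionary are assumptions made for illustration; the actual HDM log format and its behavior when the log fills are not described here, so raising an error on a full log is a placeholder choice.

```python
class NameSpaceEventLog:
    """Hypothetical log of outstanding name space events (illustration only)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries   # stands in for the configured log size
        self.outstanding = {}            # event id -> event description

    def record(self, event_id, description):
        """Log an event before it is processed."""
        if len(self.outstanding) >= self.max_entries:
            # Assumption: what the HDM does when its log is full is not modeled.
            raise RuntimeError("name space event log is full")
        self.outstanding[event_id] = description

    def complete(self, event_id):
        """Drop an event once it has been processed to completion."""
        self.outstanding.pop(event_id, None)

    def replay(self, apply):
        """After a restart, re-apply every event that never completed."""
        for event_id, description in list(self.outstanding.items()):
            apply(description)
            self.complete(event_id)


log = NameSpaceEventLog(max_entries=8)
log.record(1, "create a")
log.record(2, "rename b")
log.record(3, "delete c")
log.complete(1)                 # only the first event finished before the interruption

replayed = []
log.replay(replayed.append)     # restart: finish whatever was left outstanding
```

After the replay, the two interrupted events have been applied in their original order and the log is empty, which is the condition under which the two name spaces can be considered synchronized in this model.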
The HDM has two other logs, each containing a list of files that are candidates for destruction. One
of the logs, called the zap log, keeps track of files on archived aggregates and file systems, while the
other, called the destroy log, keeps track of files on mirrored aggregates. Because of restrictions
imposed by the DFS SMR, the HDM cannot take the time to destroy files immediately, so the logs
serve as a record of files that need to be destroyed by the destroy process. The size of the zap log is
bounded only by the file system where the log is kept, but the size of the destroy log is determined
by a configuration parameter. If the destroy log is too small, the HDM will be forced to wait until
space becomes available.
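
The effect of a bounded destroy log can be modeled with a bounded queue and a worker. This is a sketch, not the HDM design: the capacity of 2, the file names, and the use of a thread in place of the destroy process are all assumptions. The point is only that the producer blocks whenever the log is full and resumes as the destroy process frees entries.

```python
import queue
import threading

# Assumed capacity; in the HDM this would come from a configuration parameter.
destroy_log = queue.Queue(maxsize=2)
destroyed = []


def destroy_process():
    """Stand-in for the destroy process: drain the log and 'destroy' each file."""
    while True:
        name = destroy_log.get()
        if name is None:          # sentinel used only by this sketch to stop
            break
        destroyed.append(name)


worker = threading.Thread(target=destroy_process)
worker.start()

# The event-handling side records candidates instead of destroying them itself.
for name in ["file-1", "file-2", "file-3", "file-4"]:
    destroy_log.put(name)         # blocks while the log already holds 2 entries

destroy_log.put(None)
worker.join()
```

With a larger maxsize the producer would block less often, which is the trade-off behind sizing the destroy log.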
Since the HDM may be running on a machine where it cannot write error messages to the HPSS
message log, it uses its own log. This HDM log consists of a configurable number of files (usually
2) that are written in round-robin fashion. The sizes of these files are determined by a configuration
parameter.
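
Round-robin logging over a fixed set of files can be sketched as below. The RoundRobinLog class, the hdm_log.N file names, and the 20-byte limit are hypothetical and chosen only to keep the example small; the real HDM file names and size parameter are defined in its configuration, not here.

```python
import os
import tempfile


class RoundRobinLog:
    """Hypothetical fixed set of size-limited log files written in rotation."""

    def __init__(self, directory, file_count=2, max_bytes=1024):
        self.paths = [os.path.join(directory, f"hdm_log.{i}")
                      for i in range(file_count)]
        self.max_bytes = max_bytes
        self.current = 0
        open(self.paths[0], "wb").close()       # start with an empty first file

    def write(self, message):
        line = (message + "\n").encode()
        path = self.paths[self.current]
        if os.path.getsize(path) + len(line) > self.max_bytes:
            # Current file is full: move to the next one and overwrite it.
            self.current = (self.current + 1) % len(self.paths)
            path = self.paths[self.current]
            open(path, "wb").close()
        with open(path, "ab") as f:
            f.write(line)


with tempfile.TemporaryDirectory() as tmp:
    log = RoundRobinLog(tmp, file_count=2, max_bytes=20)
    for n in range(1, 6):
        log.write(f"message-{n}")               # each line is 10 bytes
    final_index = log.current
    with open(log.paths[0], "rb") as f:
        file0_contents = f.read()
    with open(log.paths[1], "rb") as f:
        file1_contents = f.read()
```

Each file holds two messages in this setup, so the fifth message wraps around and overwrites the first file. The total disk space used stays bounded by the number of files times the file size, at the cost of losing the oldest messages.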
HDM logging policy allows the system administrator to determine the type of messages written to
the log file: alarm, event, debug, and/or trace messages. Typically, only alarms should be enabled,
although event messages can be useful and do not add significant overhead. If a problem occurs,
activating debug and trace messages may provide additional information to help identify the
problem. However, these messages add overhead, and the system will perform best if messages are
kept to a minimum. The type of messages logged is controlled by a parameter in the configuration
file and can be dynamically changed using the hdm_admin utility.
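
A logging policy of this kind amounts to a set of enabled message types that can be changed while the system runs. The sketch below models that with a flag set. MsgType, HdmLogPolicy, and their methods are hypothetical names; they do not reflect the config.dat syntax or the hdm_admin command set.

```python
from enum import Flag, auto


class MsgType(Flag):
    """The four message types named in the text."""
    ALARM = auto()
    EVENT = auto()
    DEBUG = auto()
    TRACE = auto()


class HdmLogPolicy:
    """Hypothetical holder for the set of message types currently logged."""

    def __init__(self, enabled=MsgType.ALARM):
        self.enabled = enabled          # typical setting: alarms only

    def should_log(self, msg_type):
        return bool(self.enabled & msg_type)

    def set_enabled(self, enabled):
        """Models a dynamic change made by an administrator at run time."""
        self.enabled = enabled


policy = HdmLogPolicy()
before = policy.should_log(MsgType.DEBUG)

# While diagnosing a problem, turn on debug and trace in addition to alarms.
policy.set_enabled(MsgType.ALARM | MsgType.DEBUG | MsgType.TRACE)
during = policy.should_log(MsgType.DEBUG)

# Return to the low-overhead setting once the problem is understood.
policy.set_enabled(MsgType.ALARM)
after = policy.should_log(MsgType.DEBUG)
```

Checking the policy before formatting a message keeps the cost of disabled message types low, which matches the advice to keep logging to a minimum in normal operation.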
The HDM migrates and purges files based on policies defined in the HDM policy configuration file. The
administrator can establish different policies for each aggregate in the system. Migration policy
parameters include the length of time to wait between migration cycles and the amount of time that
must elapse since a file was last accessed before it becomes eligible for migration. Purge policy
parameters include the length of time to wait between purge cycles, the amount of time that must
elapse since a file was last accessed, an upper bound specifying the percentage of DFS space that