Chapter 2 HPSS Planning
44 September 2002 HPSS Installation Guide
Release 4.5, Revision 2
2.2.2 Required Throughputs
Determine the required or expected throughput for the various types of data transfers that the users
will perform. Some users want quick access to small amounts of data. Other users have huge
amounts of data they want to transfer quickly, but are willing to wait for tape mounts, etc. In all
cases, plan for peak loads that can occur during certain time periods. These findings must be used
to determine the type of storage devices and network to be used with HPSS to provide the needed
throughput.
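As a rough illustration of sizing for peak load, the sketch below estimates how many devices are needed to sustain a given peak transfer rate. The function name and the rates used are hypothetical, and the model is deliberately simplified: it ignores tape mount and positioning delays, network limits, and contention, all of which a real plan must account for.

```python
import math

def devices_for_peak(peak_mb_per_s, device_mb_per_s):
    """Smallest number of devices whose combined rate covers the peak.

    Both rates are in MB/s. The per-device rate is an assumed sustained
    rate, not a vendor or HPSS figure.
    """
    if device_mb_per_s <= 0:
        raise ValueError("device rate must be positive")
    return math.ceil(peak_mb_per_s / device_mb_per_s)
```

For example, an assumed peak of 400 MB/s served by devices that each sustain 30 MB/s would call for 14 devices under this model.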
2.2.3 Load Characterization
Understanding the kind of load users are putting on an existing file system provides input that can
be used to configure and schedule the HPSS system. What is the distribution of file sizes? How
many files and how much data is moved in each category? How does the load vary with time (e.g.,
over a day, week, month)? Are any of the data transfer paths saturated?
Having this storage system load information helps to configure HPSS so that it can meet the peak
demands. The same information also allows maintenance activities such as migration, repack, and
reclaim to be scheduled during times when HPSS is less busy.
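The questions above about file size distribution can be answered with a simple tally over file sizes gathered from the existing file system. The following is a minimal sketch: the category boundaries (1 MiB and 1 GiB) and the category names are assumptions chosen for illustration, not HPSS defaults.

```python
from bisect import bisect_right

# Hypothetical size-category boundaries in bytes (assumptions).
BOUNDS = [1 << 20, 1 << 30]            # 1 MiB, 1 GiB
LABELS = ["small", "medium", "large"]  # < 1 MiB, < 1 GiB, everything else

def characterize(sizes):
    """Count files and total bytes in each size category."""
    summary = {label: {"files": 0, "bytes": 0} for label in LABELS}
    for size in sizes:
        # bisect_right returns how many boundaries are <= size,
        # which is the index of the category the file falls into.
        label = LABELS[bisect_right(BOUNDS, size)]
        summary[label]["files"] += 1
        summary[label]["bytes"] += size
    return summary
```

Running the same tally on samples taken at different times of day, week, or month gives a first view of how the load varies over time.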
2.2.4 Usage Trends
To configure the system properly, the growth rates of the various categories of storage must be
known, as well as the growth rates of the number of files accessed and the amount of data moved in
each category. If the amount of data stored and its use are growing rapidly, extra storage and data
transfer hardware must be available.
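One simple way to turn a measured growth rate into a capacity estimate is to assume steady compound growth. The sketch below makes that assumption explicit; the function names are hypothetical, and real growth is rarely this regular, so the results should be treated as rough planning figures.

```python
import math

def project(current, annual_growth, years):
    """Projected amount after `years`, assuming constant compound growth.

    `annual_growth` is a fraction, e.g. 0.5 for 50% per year.
    """
    return current * (1 + annual_growth) ** years

def years_until_full(current, capacity, annual_growth):
    """Years until `current` reaches `capacity` at a constant growth rate."""
    if current <= 0 or annual_growth <= 0:
        raise ValueError("current and annual_growth must be positive")
    if capacity <= current:
        return 0.0
    return math.log(capacity / current) / math.log(1 + annual_growth)
```

For example, 100 TB of stored data growing at an assumed 50% per year would reach 225 TB after two years.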
2.2.5 Duplicate File Policy
A site's policy on duplicating critical files affects both the amount of data stored and the
amount of data moved. If all user files are mirrored, the system will require twice as many tape
devices and twice as much tape storage. If a site lets users control the duplication of their own
files, less data may be duplicated, depending on user needs. Users can be given this control by
offering them a choice between hierarchies that provide duplication and hierarchies that do not.
Note that only files on disk can be duplicated to tape, and only if their associated hierarchies
are configured to support multiple copies.
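The effect of a duplication policy on tape storage can be estimated with simple arithmetic. The sketch below assumes that a known fraction of the data is stored in hierarchies that keep multiple copies; the function name and parameters are hypothetical and are not HPSS configuration values.

```python
def tape_needed(data_tb, duplicated_fraction, copies=2):
    """Tape storage (TB) needed when part of the data is kept in `copies` copies.

    `duplicated_fraction` is the share of data (0.0 to 1.0) stored in
    hierarchies that provide duplication; the rest is stored once.
    """
    if not 0.0 <= duplicated_fraction <= 1.0:
        raise ValueError("duplicated_fraction must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return data_tb * (1 + duplicated_fraction * (copies - 1))
```

With full mirroring (a fraction of 1.0 and two copies) the requirement doubles, matching the case described above; if users choose duplication for only a quarter of their data, 100 TB of user data would need 125 TB of tape.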
2.2.6 Charging Policy
HPSS does not itself charge users for the use of storage system resources. Instead, it
collects information that a site can use to implement a charging policy for HPSS use. The amount
charged for storage system use will also affect the amount of storage needed. If there is no charge
for storage, users will have no incentive to remove outdated files, and more data storage
will be needed.