Chapter 1 HPSS Basics
20 September 2002 HPSS Installation Guide
Release 4.5, Revision 2
provide scalability and parallelism. The basis for this architecture is the IEEE Mass Storage System
Reference Model, Version 5.
1.2.2 High Data Transfer Rate
HPSS achieves high data transfer rates by eliminating overhead normally associated with data
transfer operations. In general, HPSS servers establish transfer sessions but are not involved in
the actual transfer of data.
1.2.3 Parallel Operation Built In
The HPSS Application Program Interface (API) supports parallel or sequential access to storage
devices by clients executing parallel or sequential applications. HPSS also provides a Parallel File
Transfer Protocol. HPSS can even manage data transfers in a situation where the number of data
sources differs from the number of destinations. Parallel data transfer is vital in situations that demand fast
access to very large files.
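
To illustrate how a transfer can be planned when the number of sources differs from the number of destinations, here is a minimal sketch in Python. It is not the HPSS algorithm or API. The function name plan_transfers, the fixed block size, and the round-robin striping rule are all assumptions made for illustration only.

```python
def plan_transfers(total_bytes, n_sources, n_dests, block_size):
    """Plan a striped transfer between unequal numbers of endpoints.

    Hypothetical helper (not part of HPSS). The data is cut into
    fixed-size blocks; block k is read from source k % n_sources and
    written to destination k % n_dests. Returns a list of
    (offset, length, source_index, dest_index) tuples.
    """
    if total_bytes < 0 or block_size <= 0 or n_sources <= 0 or n_dests <= 0:
        raise ValueError("sizes and endpoint counts must be positive")

    plan = []
    offset = 0
    block = 0
    while offset < total_bytes:
        # The last block may be shorter than block_size.
        length = min(block_size, total_bytes - offset)
        plan.append((offset, length, block % n_sources, block % n_dests))
        offset += length
        block += 1
    return plan


if __name__ == "__main__":
    # 10 bytes, 2 sources, 3 destinations, 4-byte blocks.
    for segment in plan_transfers(10, 2, 3, 4):
        print(segment)
```

Each tuple describes one independent segment, so under these assumptions the segments could be moved concurrently, with every source and destination handling only the blocks assigned to it.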
1.2.4 A Design Based on Standard Components
HPSS runs on UNIX with no kernel modifications and is written in ANSI C and Java. It uses the
OSF Distributed Computing Environment (DCE) and Encina from Transarc Corporation as the
basis for its portable, distributed, transaction-based architecture. These components are offered on
many vendors’ platforms. Source code is available to vendors and users for porting HPSS to new
platforms. HPSS Movers and the Client API have been ported to non-DCE platforms. HPSS has
been implemented on the IBM AIX and Sun Solaris platforms. In addition, selected components
have been ported to other vendor platforms. The non-DCE Client API and Mover have been ported
to SGI IRIX, while the non-DCE Client API has also been ported to Linux. Parallel FTP client
software has been ported to a number of vendor platforms and is also supported on Linux. Refer
to Section 1.4: HPSS Hardware Platforms on page 37 and Section 2.3: Prerequisite Software
Considerations on page 46 for additional information.
1.2.5 Data Integrity Through Transaction Management
Transactional metadata management and Kerberos security enable a reliable design that protects
user data both from unauthorized use and from corruption or loss. A transaction is an atomic
grouping of metadata management functions: either all of them take place together, or none of them
takes place. Journaling makes it possible to back out any partially completed transactions if a failure
occurs. Transaction technology is common in relational data management systems but not in
storage systems. HPSS implements transactions through Transarc’s Encina product. Transaction
management is the key to maintaining reliability and security while scaling upward into a large
distributed storage environment.
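
The all-or-nothing behavior and the role of a journal can be sketched in a few lines of Python. This is only an illustration of the general idea, using an in-memory undo journal. It does not show the Encina interface or HPSS internals, and the class name Transaction, its put method, and the sample metadata keys are hypothetical.

```python
class Transaction:
    """Hypothetical all-or-nothing update of a metadata dictionary.

    Every change is first recorded in an undo journal. If the block
    guarded by the transaction raises an exception, the journal is
    replayed in reverse to back out the partial updates.
    """

    _MISSING = object()  # marks keys that did not exist before the update

    def __init__(self, store):
        self.store = store
        self.journal = []  # list of (key, previous value) undo entries

    def put(self, key, value):
        # Journal the old value before changing anything.
        self.journal.append((key, self.store.get(key, self._MISSING)))
        self.store[key] = value

    def rollback(self):
        # Undo in reverse order so the earliest state is restored last.
        while self.journal:
            key, old = self.journal.pop()
            if old is self._MISSING:
                self.store.pop(key, None)
            else:
                self.store[key] = old

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            self.rollback()       # failure: back out partial changes
        else:
            self.journal.clear()  # success: changes stand
        return False              # do not swallow the exception


if __name__ == "__main__":
    metadata = {"file.dat": "tape-1"}
    try:
        with Transaction(metadata) as txn:
            txn.put("file.dat", "disk-3")
            txn.put("file.dat.copy", "tape-7")
            raise RuntimeError("simulated failure mid-transaction")
    except RuntimeError:
        pass
    print(metadata)
```

In this sketch the simulated failure leaves the metadata exactly as it was before the transaction began. A production system would also write the journal to stable storage so that it survives a crash, which this in-memory example does not attempt.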
1.2.6 Multiple Hierarchies and Classes of Services
Most other storage management systems support simple storage hierarchies consisting of one kind
of disk and one kind of tape. HPSS provides multiple hierarchies, which are particularly useful
when inserting new storage technologies over time. As new disks, tapes, or optical media are