Chapter 2 HPSS Planning
HPSS Installation Guide September 2002 61
Release 4.5, Revision 2
that an implementation is thread-safe provided only one thread makes MPI calls. With HPSS MPI-IO, multiple threads make MPI calls. HPSS MPI-IO attempts to impose thread safety on these hosts by using a global lock that must be acquired before any MPI call is made. However, this approach has known problems, and until these hosts provide true thread safety, the potential for deadlock within an MPI application will exist when HPSS MPI-IO is used in conjunction with other MPI operations. See the HPSS Programmer's Reference Guide, Volume 1, Release 4.5 for more details.
Files read and written through the HPSS MPI-IO can also be accessed through the HPSS Client API,
FTP, Parallel FTP, or NFS interfaces. So even though the MPI-IO subsystem does not offer all the
migration, purging, and caching operations that are available in HPSS, parallel applications can
still do these tasks through the HPSS Client API or other HPSS interfaces.
The details of the MPI-IO API are described in the HPSS Programmer’s Reference Guide, Volume 1.
2.5.7 DFS
DFS is offered by the Open Software Foundation (now the Open Group) as part of DCE. DFS is a
distributed file system that allows users to access files using normal Unix utilities and system calls,
regardless of the file’s location. This transparency is one of the major attractions of DFS. The
advantage of DFS over NFS is that it provides greater security and allows files to be shared globally
between many sites using a common name space.
HPSS provides two options for controlling how DFS files are managed by HPSS: archived and
mirrored. The archived option gives users the impression of having an infinitely large DFS file
system that performs at near-native DFS speeds. This option is well suited to sites with large
numbers of small files. However, when using this option, the files can only be accessed through DFS interfaces and cannot be accessed with HPSS utilities, such as Parallel FTP. Therefore, the performance for data transfers is limited to DFS speeds.
The mirrored option gives users the impression of having a single, common (mirrored) name space
where objects have the same path names in DFS and HPSS. With this option, large files can be stored quickly in HPSS, then analyzed at a more leisurely pace from DFS. On the other hand, some operations, such as file creates, perform more slowly with this option than with the archived option.
HPSS and DFS define disk partitions differently. In HPSS, the choice of whether files are mirrored or archived is associated with a fileset. Recall that in DFS, multiple filesets may reside on a single aggregate; however, the XDSM implementation provided in DFS generates events on a per-aggregate basis. Therefore, in DFS this option applies to all filesets on a given aggregate.
To use the DFS/HPSS interface on an aggregate, the aggregate must be on a processor that has
Transarc’s DFS SMT kernel extensions installed. These extensions are available for Sun Solaris and
IBM AIX platforms. Once an aggregate has been set up, end users can access filesets on the
aggregate from any machine that supports DFS client software, including PCs. The wait/retry logic in the DFS client software was modified to account for potential delays caused by staging data from HPSS. Using a DFS client without this change may result in long delays for some I/O requests.
HPSS servers and DFS both use Encina as part of their infrastructure. Because DFS and HPSS may adopt the latest version of Encina on significantly different release cycles, running the DFS server on a different machine from the HPSS servers is recommended.