Chapter 2 HPSS Planning
HPSS Installation Guide September 2002 61
Release 4.5, Revision 2
that an implementation is thread-safe provided only one thread makes MPI calls. With HPSS MPI-IO, multiple threads make MPI calls. HPSS MPI-IO attempts to impose thread safety on these hosts by using a global lock that must be acquired before any MPI call is made. However, this approach has known problems, and until these hosts provide true thread safety, the potential for deadlock within an MPI application will exist when HPSS MPI-IO is used in conjunction with other MPI operations. See the HPSS Programmer's Reference Guide, Volume 1, Release 4.5 for more details.
Files read and written through the HPSS MPI-IO can also be accessed through the HPSS Client API,
FTP, Parallel FTP, or NFS interfaces. So even though the MPI-IO subsystem does not offer all the
migration, purging, and caching operations that are available in HPSS, parallel applications can
still do these tasks through the HPSS Client API or other HPSS interfaces.
The details of the MPI-IO API are described in the HPSS Programmer’s Reference Guide, Volume 1.
2.5.7 DFS
DFS is offered by the Open Software Foundation (now the Open Group) as part of DCE. DFS is a
distributed file system that allows users to access files using normal Unix utilities and system calls,
regardless of the file’s location. This transparency is one of the major attractions of DFS. The
advantage of DFS over NFS is that it provides greater security and allows files to be shared globally
between many sites using a common name space.
HPSS provides two options for controlling how DFS files are managed by HPSS: archived and
mirrored. The archived option gives users the impression of having an infinitely large DFS file
system that performs at near-native DFS speeds. This option is well suited to sites with large
numbers of small files. However, when using this option, the files can only be accessed through DFS interfaces and cannot be accessed with HPSS utilities, such as Parallel FTP. Therefore, the performance for data transfers is limited to DFS speeds.
The mirrored option gives users the impression of having a single, common (mirrored) name space
where objects have the same path names in DFS and HPSS. With this option, large files can be stored quickly in HPSS, then analyzed at a more leisurely pace from DFS. On the other hand, some operations, such as file creates, perform more slowly with this option than with the archived option.
HPSS and DFS define disk partitions differently. In HPSS, the choice of whether files are mirrored or archived is associated with a fileset. Recall that in DFS, multiple filesets may reside on a single aggregate; however, the XDSM implementation provided in DFS generates events on a per-aggregate basis. Therefore, in DFS this option applies to all filesets on a given aggregate.
To use the DFS/HPSS interface on an aggregate, the aggregate must be on a processor that has
Transarc’s DFS SMT kernel extensions installed. These extensions are available for Sun Solaris and
IBM AIX platforms. Once an aggregate has been set up, end users can access filesets on the
aggregate from any machine that supports DFS client software, including PCs. The wait/retry logic in the DFS client software was modified to account for potential delays caused by staging data from HPSS. Using a DFS client without this change may result in long delays for some I/O requests.
HPSS servers and DFS both use Encina as part of their infrastructure. Because DFS and HPSS may adopt the latest version of Encina on significantly different release cycles, running the DFS server on a different machine from the HPSS servers is recommended.