Integrating Portable and Distributed Storage
Niraj Tolia†‡, Jan Harkes†, Michael Kozuch‡, M. Satyanarayanan†‡
†Carnegie Mellon University, ‡Intel Research Pittsburgh
Abstract
We describe a technique called lookaside caching that combines the
strengths of distributed file systems and portable storage devices,
while negating their weaknesses. In spite of its simplicity, this tech-
nique proves to be powerful and versatile. By unifying distributed
storage and portable storage into a single abstraction, lookaside
caching allows users to treat devices they carry as merely perfor-
mance and availability assists for distant file servers. Careless use
of portable storage has no catastrophic consequences.
1 Introduction
Floppy disks were the sole means of sharing data
across users and computers in the early days of per-
sonal computing. Although they were trivial to use,
considerable discipline and foresight were required of
users to ensure data consistency and availability, and to
avoid data loss — if you did not have the right floppy
at the right place and time, you were in trouble! These
limitations were overcome by the emergence of dis-
tributed file systems such as NFS [17], Netware [8],
LanManager [24], and AFS [7]. In such a system, re-
sponsibility for data management is delegated to the
distributed file system and its operational staff.
Personal storage has come full circle in the recent
past. There has been explosive growth in the avail-
ability of USB- and Firewire-connected storage de-
vices such as flash memory keychains and portable disk
drives. Although very different from floppy disks in
capacity, data transfer rate, form factor, and longevity,
their usage model is no different. In other words, they
are just glorified floppy disks and suffer from the same
limitations mentioned above. Why then are portable
storage devices in such demand today? Is there a way
to use them that avoids the messy mistakes of the past,
where a user was often awash in floppy disks trying to
figure out which one had the latest version of a specific
file? If loss, theft or destruction of a portable storage
device occurs, how can one prevent catastrophic data
loss? Since human attention grows ever more scarce,
can we reduce the data management demands on atten-
tion and discipline in the use of portable devices?
We focus on these and related questions in this pa-
per. We describe a technique called lookaside caching
that combines the strengths of distributed file sys-
tems and portable storage devices, while negating their
weaknesses. In spite of its simplicity, this technique
proves to be powerful and versatile. By unifying “stor-
age in the cloud” (distributed storage) and “storage in
the hand” (portable storage) into a single abstraction,
lookaside caching allows users to treat devices they
carry as merely performance and availability assists for
distant file servers. Careless use of portable storage has
no catastrophic consequences.
Lookaside caching has very different goals and de-
sign philosophy from a PersonalRAID system [18], the
only previous research that we are aware of on us-
age models for portable storage devices. Our starting
point is the well-entrenched base of distributed file sys-
tems in existence today. We assume that these are suc-
cessful because they offer genuine value to their users.
Hence, our goal is to integrate portable storage devices
into such a system in a manner that is minimally dis-
ruptive of its existing usage model. In addition, we
make no changes to the native file system format of
a portable storage device; all we require is that the de-
vice be mountable as a local file system at any client of
the distributed file system. In contrast, PersonalRAID
takes a much richer view of the role of portable storage
devices. It views them as first-class citizens rather than
as adjuncts to a distributed file system. It also uses a
customized storage layout on the devices. Our design
and implementation are much simpler, but also more
limited in functionality.
We begin in Section 2 by examining the strengths
and weaknesses of portable storage and distributed file
systems. In Sections 3 and 4, we describe the design
and implementation of lookaside caching. We quantify
the performance benefit of lookaside caching in Sec-
tion 5, using three different benchmarks. We explore
broader use of lookaside caching in Section 6, and con-
clude in Section 7 with a summary.