Bandwidth    No Lookaside    With Lookaside    Win
100 Mb/s     173 (9)         103 (3.9)         40.1%
10 Mb/s      370 (14)        163 (2.9)         55.9%
1 Mb/s       2688 (39)       899 (26.4)        66.6%
100 Kb/s     30531 (1490)    8567 (463.9)      71.9%
This table gives the total operation latency (in seconds) for the
CDA benchmark of Section 5.2 at different bandwidths, with
and without lookaside to a LAN-attached CAS provider. The
CAS provider contains the same state as the DVD used for
the results of Figure 8. Each data point is the mean of three
trials, with standard deviation in parentheses.
Figure 11. Off-machine Lookaside
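The Win column of Figure 11 can be recomputed from the two latency columns. The sketch below assumes that Win is the percentage reduction in mean latency relative to the no-lookaside case; this definition is inferred from the numbers rather than stated in the caption, and the names `FIGURE_11` and `lookaside_win` are illustrative only.

```python
# Mean latencies in seconds, copied from Figure 11:
# bandwidth -> (no lookaside, with lookaside).
FIGURE_11 = {
    "100 Mb/s": (173, 103),
    "10 Mb/s": (370, 163),
    "1 Mb/s": (2688, 899),
    "100 Kb/s": (30531, 8567),
}


def lookaside_win(no_lookaside: float, with_lookaside: float) -> float:
    """Percentage reduction in latency (assumed definition of 'Win')."""
    return 100.0 * (no_lookaside - with_lookaside) / no_lookaside


# Recompute the Win column, rounded to one decimal place like the table.
wins = {
    bandwidth: round(lookaside_win(without, with_), 1)
    for bandwidth, (without, with_) in FIGURE_11.items()
}

if __name__ == "__main__":
    for bandwidth, win in wins.items():
        print(f"{bandwidth:>9}: {win:.1f}%")
```

Under this assumption the three lower-bandwidth rows reproduce the printed values. The 100 Mb/s row recomputes to about 40.5% rather than 40.1%, a gap small enough that it may simply reflect the means having been rounded to whole seconds.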
can be used for lookaside. Distributed hash tables
(DHTs) are one such source. There is growing interest
in DHTs such as Pastry [16], Chord [20], Tapestry [23]
and CAN [15]. There is also growing interest in
planetary-scale services such as PlanetLab [14] and lo-
gistical storage such as the Internet Backplane Proto-
col [2]. Finally, hash-addressable storage hardware
is now available [5]. Together, these trends suggest
that Content-Addressable Storage (CAS) will become
a widely-supported service in the future.
Lookaside caching enables a conventional dis-
tributed file system based on the client-server model
to take advantage of the geographical distribution and
replication of data offered by CAS providers. As with
portable storage, there is no compromise of the consis-
tency model. Lookaside to a CAS provider improves
performance without any negative consequences.
We have recently extended the prototype imple-
mentation described in Section 4 to support off-
machine CAS providers. Experiments with this ex-
tended prototype confirm its performance benefits. For
the ISR benchmark described in Section 5.2, Figure 11
shows the performance benefit of using a LAN-attached
CAS provider with the same contents as the DVD
of Figure 8. Since the CAS provider is on a faster machine
than the file server, Figure 11 shows a substantial
benefit even at 100 Mb/s.
Another potential application of lookaside caching
is in implementing a form of cooperative caching [1,
4]. A collection of distributed file system clients
with mutual trust (typically at one location) can ex-
port each other’s file caches as CAS providers. No
protocol is needed to maintain mutual cache consis-
tency; divergent caches may, at worst, reduce looka-
side performance improvement. This form of cooper-
ative caching can be especially valuable in situations
where the clients have LAN connectivity to each other,
but poor connectivity to a distant file server. The heavy
price of a cache miss on a large file is then borne only
by the first client to access the file. Misses elsewhere
are serviced at LAN speeds, provided the file has not
been replaced in the first client’s cache.
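The cooperative caching idea above can be illustrated with a minimal sketch. It assumes that the client already holds a content hash for the file it wants (obtained from the file server as part of normal operation) and that each peer exports its cache as a simple hash-to-bytes lookup. The `PeerCache` class, the `fetch_with_lookaside` function, and the choice of SHA-1 are hypothetical stand-ins for illustration, not the prototype's actual interfaces.

```python
import hashlib
from typing import Callable, Dict, Iterable, Optional, Tuple


def content_hash(data: bytes) -> str:
    """Hex digest of the file contents (SHA-1 is an assumption here)."""
    return hashlib.sha1(data).hexdigest()


class PeerCache:
    """Hypothetical peer file cache exported as a CAS provider."""

    def __init__(self) -> None:
        self._by_hash: Dict[str, bytes] = {}

    def store(self, data: bytes) -> str:
        digest = content_hash(data)
        self._by_hash[digest] = data
        return digest

    def evict(self, digest: str) -> None:
        # Models cache replacement on the peer.
        self._by_hash.pop(digest, None)

    def lookup(self, digest: str) -> Optional[bytes]:
        return self._by_hash.get(digest)


def fetch_with_lookaside(
    expected_hash: str,
    peers: Iterable[PeerCache],
    fetch_from_server: Callable[[], bytes],
) -> Tuple[bytes, str]:
    """Return (data, source), where source is "peer" or "server"."""
    for peer in peers:
        candidate = peer.lookup(expected_hash)
        # Verify before use: a peer's answer is only a hint, so a stale
        # or wrong answer costs time but never correctness.
        if candidate is not None and content_hash(candidate) == expected_hash:
            return candidate, "peer"
    # No peer could supply matching contents: pay the full miss penalty.
    return fetch_from_server(), "server"
```

Because every candidate is checked against the hash supplied by the server, no consistency protocol between peers is needed in this sketch. A peer that has replaced the file, or holds a different version, simply produces a lookaside miss and the request falls through to the server.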
7 Conclusion
“Sneakernet,” the informal term for manual trans-
port of data, is alive and well today in spite of advances
in networking and distributed file systems. Early in this
paper, we examined why this is the case. Carrying your
data on a portable storage device gives you full confidence
that you will be able to access that data anywhere,
regardless of network quality, network or server out-
ages, and machine configuration. Unfortunately, this
confidence comes at a high price. Remembering to
carry the right device, ensuring that data on it is cur-
rent, tracking updates by collaborators, and guarding
against loss, theft and damage are all burdens borne by
the user. Most harried mobile users would gladly del-
egate these chores if only they could be confident that
they would have good access to their critical data at all
times and places.
Lookaside caching suggests a way of achieving this
goal. Let the true home of your data be in a distributed
file system. Make a copy of your critical data on a
portable storage device. If you find yourself needing
to access the data in a desperate situation, just use the
device directly — you are no worse off than if you
were relying on sneakernet. In all other situations,
use the device for lookaside caching. On a slow net-
work or with a heavily loaded server, you will benefit
from improved performance. With network or server
outages, you will benefit from improved availability if
your distributed file system supports disconnected op-
eration and if you have hoarded all your meta-data.
Notice that you make the decision to use the de-
vice directly or via lookaside caching at the point of
use, not a priori. This preserves maximum flexibility
up front, when there may be uncertainty about the ex-
act future locations where you will need to access the
data. Lookaside caching thus integrates portable stor-
age devices and distributed file systems in a manner
that combines their strengths. It preserves the intrinsic
advantages of performance, availability and ubiquity