[LSAM replication logo] About LSAM Request Routing

-----

Introduction and Related Work

The recent efforts in internet caching started with the Harvest Internet Cache, and the analysis (by the same group of researchers) of internet FTP traffic logs that indicated that a relatively small number of caches placed at strategic places in the network topology could eliminate a significant amount of network traffic.

The Squid internet cache and the Internet Cache Protocol ( RFC2186, RFC2187) efforts applied internet caching to world wide web traffic, and continued the publicly available development of internet caching. This included mechanisms within ICP to query a neighboring (non-parent) cache for the presence of an item, which complement the traditional mechanisms to forward requests within a hierarchical system of proxy caches.

Systems with hierarchical cache organization and forwarding work to create "aggregate request flows", which theoretically may have higher hit rates due to files common to multiple client requests (possibly forwarded via other proxies). Neighbor cache queries that produce a cache miss are generally not forwarded, hence this mechanism does not work to "build" a common cache. Instead, neighbor caches work together to form a large aggregate cache, which theoretically will produce more hits than a single smaller cache.

Initial deployment of web proxy systems primarily relied on request forwarding within a hierarchy. Analysis of the performance of these web proxy systems revealed that the benefits of internet web caching did not meet the expectations of the original internet FTP analysis. Level one proxy caches (one removed from client browsers) generally exhibit only a 30-35% cache hit rate, while proxies further up in the hierarchy were even lower (10-15%).

The low hit rate of hierarchical caching is addressed by "second generation" cache systems, which focus on mechanisms to organize and exploit neighbor caches to produce larger virtual aggregate caches which achieve higher hit rates. Various mechanisms have been proposed to organize these virtual caches. The CacheMesh project basically uses a hash function to assign requests to neighbor caches, hence requests are uniquely sent to caches that may have seen it previously. CRISP uses a central coordinator that tracks cached items in a group of caches, hence a single query can find a nearby cached object. The Adaptive Web Cache uses multicast queries to discover cache hits in any nearby neighbor.

Multicast has been proposed as a mechanism in a number of systems. In addition to its use for cache queries (i.e. resource discovery), it also is naturally used for efficient data distribution. The Adaptive Web Cache, as well as other systems, use multicast for distribution of web date.

The LSAM proxy is based on multicast distribution, combined with a mechanism that provides for pre-sending. As discussed in the LSAM overview, origin servers identify groups of related web pages that are in high demand and create a multicast channel for the distribution of these items. LSAM proxies track request patterns, and subscribe to distribution channels that match client web requests. All subscribers are then refreshed by requests to the server that match the multicast group.

The basic LSAM hierarchical request routing is designed to create aggregate request flows. Thus, those requests that do not have sufficient usage at a particular proxy to merit a channel subscription might be combined with similar requests at the common parent where a shared subscription might provide sufficient benefit. Theoretically, the LSAM approach of pre-sending related files could reduce the number of caches misses throughout the system, and achieve better performance than conventional hierarchical caching systems. Finally, LSAM also provides a mechanism to exploit non-hierarchical neighbor caches, to increase the probability of a nearby hit.

Next Page

  • Auto-Configuring Hierarchy

    -----

    LSAM home ISI home Page maintainer: Steve Hotz Last modified: Mon Jan 6 10:48:20 1997 Copyright © 1996 by USC/ISI