About LSAM
What is LSAM
LSAM is developing middleware infrastructure to support scalable
distributed information services. This is embodied in the LSAM Proxy Cache, an extension of the Apache
proxy cache that uses multicast push of related web pages,
based on automatically-selected interest groups,
to load caches at natural network aggregation points.
The proxy is designed to reduce server and network load,
and increase client performance.
Current implementations of such services rely on
individual components, such as Web servers, WAIS servers, and host clients.
Some resource sharing is provided via distributed indexing systems, e.g.,
Harvest, or manually configured proxy caches. The goal of LSAM is to develop
middleware to support large-scale deployment of distributed information
services with effective economies of scale.
LSAM achieves these goals by using automatic multicast channel management
and server-based information push.
The LSAM system is based on a few key assumptions about web systems:
- Existing caching techniques are relatively effective.
- Hot-spots generate significant traffic that is important to users.
- Hot-spots defeat existing caching techniques.
The LSAM Proxy
The LSAM proxy coalesces the LSAM technology into a single software bundle
that can be deployed throughout an infrastructure to provide intelligent
caching services.
For caching to be effective, the proxy must be central to the sites
with similar interests. Because these interests are dynamic, no single
proxy can provide effective resource sharing.
LSAM replaces the concept of a single, central proxy with that of a
virtual proxy. Multiple local proxies act together over multicast
channels to provide the benefits of a single, central proxy, even where
there is no central location from which one proxy could suffice.
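As a rough illustration of the virtual proxy idea, the following sketch simulates several local proxies sharing one channel entirely in memory. It is not LSAM code: the class names (`Channel`, `LocalProxy`), the `push` method, and the rule of grouping pages by origin host are assumptions made for this example, and no real multicast traffic is sent.

```python
from urllib.parse import urlsplit


class Channel:
    """Hypothetical stand-in for a multicast channel (in-memory only)."""

    def __init__(self, name):
        self.name = name
        self.members = []

    def join(self, proxy):
        self.members.append(proxy)

    def push(self, url, body):
        # One send reaches every subscribed proxy, as multicast would.
        for proxy in self.members:
            proxy.cache[url] = body


class LocalProxy:
    """Hypothetical local proxy holding a simple URL-to-body cache."""

    def __init__(self, name):
        self.name = name
        self.cache = {}

    def get(self, url):
        # Returns the cached body, or None on a miss.
        return self.cache.get(url)


def channel_key(url):
    # Assumption: related pages are grouped by origin host.
    return urlsplit(url).hostname


channels = {}
proxies = [LocalProxy("east"), LocalProxy("west")]
url = "http://www.example.org/news/index.html"

# Both proxies subscribe to the channel for this interest group.
channel = channels.setdefault(channel_key(url), Channel(channel_key(url)))
for p in proxies:
    channel.join(p)

# A single push loads every member cache.
channel.push(url, b"<html>headline</html>")
hits = [p.get(url) is not None for p in proxies]
```

After the single `push`, both simulated proxies can answer a request for the page locally, which is the effect a central proxy would have had for both sites.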
The LSAM self-organizing web cache system provides efficient and intelligent
cache service for distributed information access. The system is implemented
in a proxy, which coordinates with other instances of itself to create
a self-organized system. This system adjusts to support both conventional
object access and meta-data such as aggregated query responses and their
prefetched components. The proxy uses multicast networking to self-configure,
and to reduce its aggregate bandwidth requirements while providing enhanced
performance. It also uses multicast to leverage access affinity patterns
to provide enhanced performance for groups of clients, supporting growth
and scalability of distributed information services, and facilitating their
use in unpredictable environments.
The LSAM proxy reduces the response time for access, for both objects and
query systems. This same mechanism enhances content extraction and the
ability to correlate and manipulate distributed information, by providing
group-enhanced information discovery, where group actions give signals
to individuals about prevalent access patterns. It also enhances the effective
collection capacity, by extending the service effort beyond the original
server to a tree of multicast-fed caches. The LSAM proxy enables federated
service by providing a self-organized system of caches, so that the caches
together provide greater service than the sum of their parts.
An example:
The LSAM proxy's automatic management aids rapid deployment and accommodates
mobile environments. For example, in a rapid-deployment scenario, critical
incoming information would be cached for high availability at the site.
High-interest outbound information would be distributed throughout the
system to reduce the burden on critical networking resources. The same
mechanism used to deploy data for enhanced performance also provides a
key component of information discovery. The multicast channel indicates
group behavior, and can be monitored to hint at what information other
group members are finding useful, suggesting additional directions for
the individual.
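The hinting idea in this example can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `suggest`, its parameters, and the sample URLs are hypothetical, and the source does not say how LSAM ranks or presents such hints. Here the hints are simply the objects seen most often on the channel that the individual has not yet fetched.

```python
from collections import Counter


def suggest(observed_urls, already_seen, limit=3):
    """Return the most frequently observed URLs the client has not seen.

    observed_urls: URLs noticed on the channel, one entry per observation.
    already_seen:  set of URLs this client has already fetched.
    """
    counts = Counter(u for u in observed_urls if u not in already_seen)
    return [url for url, _ in counts.most_common(limit)]


# Made-up observations of one channel.
observed = [
    "http://example.org/a",
    "http://example.org/b",
    "http://example.org/a",
    "http://example.org/c",
    "http://example.org/b",
    "http://example.org/a",
]
hints = suggest(observed, already_seen={"http://example.org/a"}, limit=2)
```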
Specific IM projects enhanced or enabled:
There are a variety of projects in the IM Program which are enabled or
enhanced by the LSAM proxy.
CNRI's Digital
Object Infrastructure includes Distributed Indexing, which can use
multicast channels where each index is a channel, enabling fast access
to indexed information grouped by affinity.
UTennessee's
Evolving Software Repositories can use multicast to update and aggregate
its software libraries based on groupings such as host architecture or
application use.
Both MIT/LCS's
Intelligent Query Access and UMD's
Dynamic Query Management summarize responses via sets, the components
of which are prefetched, requiring prioritized client request ordering
and server processing, provided by the LSAM proxy's object scheduling mechanism.
Further, the proxy channels can be mapped to their query refinement groups,
where observation of channel activity indicates group interest in related
information, signaling the individual client to additional relationships,
enhancing information discovery.
The Open
Group's Distributed WWW Clients, in DARPA's
IC&V Program, can use channels for object working sets, using explicit
group information to govern channel creation and content. This would be a
proactive multicast version of their unicast client-side system, with lower
upstream request bandwidth and even further reduced downstream bandwidth.
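The prioritized request ordering mentioned above for the query projects can be pictured as a priority queue. The sketch below is an assumption-laden illustration, not the LSAM proxy's object scheduling mechanism: the two priority classes (`DEMAND`, `PREFETCH`), the `ObjectScheduler` class, and its method names are all invented for this example.

```python
import heapq
import itertools

# Assumed priority classes; lower numbers are served first.
DEMAND = 0     # a client is waiting for this object
PREFETCH = 1   # component of a summarized response, fetched ahead of need


class ObjectScheduler:
    """Hypothetical scheduler that orders requests by priority class."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order

    def submit(self, url, priority):
        heapq.heappush(self._heap, (priority, next(self._order), url))

    def next_request(self):
        # Returns the URL to fetch next, or None if nothing is queued.
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


sched = ObjectScheduler()
sched.submit("http://example.org/result/1", PREFETCH)
sched.submit("http://example.org/result/2", PREFETCH)
sched.submit("http://example.org/summary", DEMAND)
order = [sched.next_request() for _ in range(3)]
```

In this sketch the summary that a client is waiting on is served before the prefetched components, and components within the same class keep their submission order.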
Earlier LSAM research
Earlier LSAM research focused on preliminary support for the virtual proxy
that is currently being implemented.
That earlier work focused on four major tasks:
- Intelligent Bandwidth (IB): Organizes
the use of distributed caches based on network parameters and usage information.
This work led to the multicast channel architecture of the current LSAM
proxy.
- Replication: Supports copy management
and copy selection to increase performance and reliability. This is used
to support the management of copies of data, such as are created by the
multicast data distribution of the current proxy.
- Security: Integrates emerging authentication
and privacy mechanisms, and augments those services to accommodate LSAM's
new middleware. This is used to support the distributed management of data,
such as used in the current proxy.
- Performance: Examines protocol extensions
to improve the performance of distributed object access. These protocol
modifications affect the access latency of both cached and non-cached objects.
The overall goal of the current LSAM proxy is to reduce access latency
and network load for accessing hot-spot objects. These protocol modifications
increase the performance of the underlying object access system, which
both highlights the need for the LSAM proxy's enhancements and
further increases the performance of the proxy.
Page maintainer: the LSAM project
Last modified: Thu Feb 18 12:48:18 PST 1999
Copyright © 1996 by USC/ISI