About LSAM

What is LSAM

LSAM is developing middleware infrastructure to support scalable distributed information services. This is embodied in the LSAM Proxy Cache, an extension of the Apache proxy cache that uses multicast push of related web pages, based on automatically selected interest groups, to load caches at natural network aggregation points. The proxy is designed to reduce server and network load and to increase client performance.

Current implementations of such services rely on individual components, such as Web servers, WAIS servers, and host clients. Some resource sharing is provided via distributed indexing systems, e.g., Harvest, or manually configured proxy caches. The goal of LSAM is to develop middleware to support large-scale deployment of distributed information services with effective economies of scale.

LSAM achieves these goals by using automatic multicast channel management and server-based information push.

The LSAM system is based on a few key assumptions about web systems:

The LSAM Proxy

The LSAM proxy coalesces the LSAM technology into a single software bundle that can be deployed throughout an infrastructure to provide intelligent caching services.

For caching to be effective, the proxy must be central to the sites with similar interests. Because these interests are dynamic, no single proxy can provide effective resource sharing.

LSAM replaces the concept of a single, central proxy with that of a virtual proxy. Multiple local proxies act together over multicast channels to provide the benefits of a single, central proxy, even where there is no central location from which one proxy could suffice.

The LSAM self-organizing web cache system provides efficient and intelligent cache service for distributed information access. The system is implemented in a proxy, which coordinates with other instances of itself to create a self-organized system. This system adjusts to support both conventional object access and meta-data such as aggregated query responses and their prefetched components. The proxy uses multicast networking to self-configure and to reduce its aggregate bandwidth requirements while providing enhanced performance. It also uses multicast to exploit access affinity patterns, improving performance for groups of clients, supporting the growth and scalability of distributed information services, and facilitating their use in unpredictable environments.
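The virtual-proxy idea described above can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions, not the LSAM implementation: the `Channel` and `Proxy` classes, their methods, and the sample URLs are all hypothetical, and a Python list of subscribers stands in for a real multicast group. It shows one push on a channel loading every subscribed cache, so later client requests are served locally.

```python
class Channel:
    """Hypothetical stand-in for a multicast channel: one send reaches every subscriber."""

    def __init__(self, name):
        self.name = name
        self.subscribers = []
        self.sends = 0  # number of transmissions placed on the channel

    def push(self, url, body):
        # A single transmission is delivered to every proxy that joined the channel.
        self.sends += 1
        for proxy in self.subscribers:
            proxy.cache[url] = body


class Proxy:
    """Hypothetical local proxy with a simple dictionary cache."""

    def __init__(self, name):
        self.name = name
        self.cache = {}
        self.origin_fetches = 0  # unicast requests that had to go to the server

    def join(self, channel):
        channel.subscribers.append(self)

    def get(self, url, origin):
        # Serve from the local cache when possible; otherwise fetch from the origin.
        if url in self.cache:
            return self.cache[url]
        self.origin_fetches += 1
        body = origin[url]
        self.cache[url] = body
        return body


# Assumed sample data: an origin server holding two related pages.
origin = {"/news/a": "A", "/news/b": "B"}

news = Channel("news")
proxies = [Proxy("p1"), Proxy("p2"), Proxy("p3")]
for p in proxies:
    p.join(news)

# The server pushes each related page once; all subscribed caches are loaded.
for url, body in origin.items():
    news.push(url, body)

results = [p.get("/news/a", origin) for p in proxies]
total_origin_fetches = sum(p.origin_fetches for p in proxies)
```

In this toy model the server transmits each page once regardless of how many proxies have joined, which is the bandwidth saving the text attributes to multicast. How proxies select and join channels automatically is outside the scope of the sketch.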

Relationship to the DARPA Information Management Program

The LSAM proxy reduces access response time for both objects and query systems. This same mechanism enhances content extraction and the ability to correlate and manipulate distributed information by providing group-enhanced information discovery, in which group actions signal prevalent access patterns to individuals. It also enhances the effective collection capacity by extending the service effort beyond the original server to a tree of multicast-fed caches. The LSAM proxy enables federated service by providing a self-organized system of caches, so that the caches together provide greater service than the sum of their parts.

An example:

The LSAM proxy's automatic management aids rapid deployment and accommodates mobile environments. For example, in a rapid-deployment scenario, critical incoming information would be cached for high availability at the site. High-interest outbound information would be distributed throughout the system to reduce the burden on critical networking resources. The same mechanism used to deploy data for enhanced performance also provides a key component of information discovery. The multicast channel indicates group behavior, and can be monitored to hint at what information other group members are finding useful, suggesting additional directions for the individual.
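The discovery hint described above, where channel activity suggests what other group members find useful, can be sketched as a simple popularity count. This is an illustrative assumption, not the project's mechanism: the `suggest` function, the `channel_log` list of observed URLs, and the sample paths are all hypothetical.

```python
from collections import Counter


def suggest(channel_log, seen_by_client, limit=3):
    """Rank URLs observed on a channel by frequency, skipping ones the client already has."""
    counts = Counter(channel_log)
    ranked = [url for url, _ in counts.most_common() if url not in seen_by_client]
    return ranked[:limit]


# Assumed sample of URLs observed on one channel, in arrival order.
channel_log = [
    "/maps/region",
    "/weather/today",
    "/maps/region",
    "/logistics/fuel",
    "/weather/today",
    "/maps/region",
]

# The client has already fetched the weather page, so it is not suggested again.
hints = suggest(channel_log, seen_by_client={"/weather/today"}, limit=2)
```

A real proxy would presumably weight recency and channel membership as well; raw frequency is used here only to keep the example short.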

Specific IM projects enhanced or enabled:

There are a variety of projects in the IM Program which are enabled or enhanced by the LSAM proxy.

CNRI's Digital Object Infrastructure includes Distributed Indexing, which can use multicast channels where each index is a channel, enabling fast access to indexed information grouped by affinity.

UTennessee's Evolving Software Repositories can use multicast to update and aggregate its software libraries based on groupings such as host architecture or application use.

Both MIT/LCS's Intelligent Query Access and UMD's Dynamic Query Management summarize responses via sets, the components of which are prefetched. This requires prioritized client request ordering and server processing, which the LSAM proxy's object scheduling mechanism provides. Further, the proxy channels can be mapped to their query refinement groups. Observed channel activity then indicates group interest in related information and alerts the individual client to additional relationships, enhancing information discovery.

The Open Group's Distributed WWW Clients, in DARPA's IC&V Program, can use channels for object working sets, using explicit group information to govern channel creation and content. This would be a proactive, multicast version of their unicast client-side system, with lower upstream request bandwidth and even further reduced downstream bandwidth.

Earlier LSAM research

Earlier LSAM research focused on preliminary support for the virtual proxy that is currently being implemented.

That earlier work focused on four major tasks:

Page maintainer: the LSAM project 

Last modified: Thu Feb 18 12:48:18 PST 1999

Copyright © 1996 by USC/ISI