
We can compute the bandwidth required to provide an object within a
given latency. The latency can be described by a 'budget', which, for
interactive access, is approximately 100 ms between the request and
response. The object need only be approximately one 'screen-full', and
the size can be estimated based on the quality of its data. For example,
text-only data is about 3-5 Kilobytes (KB), HTML text with small icons
is about 20 KB, etc.
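
As a rough illustration of that computation, the sketch below divides the object size by the latency budget. It assumes the whole 100 ms budget is available for transmission (no propagation delay, protocol overhead, or server time) and takes 1 KB as 1000 bytes; the function name and the size table are ours, chosen only for this example.

```python
def required_bandwidth_bps(object_bytes: int, budget_s: float = 0.100) -> float:
    """Bits per second needed to deliver object_bytes within budget_s.

    Assumes the entire budget is spent on transmission (a simplification).
    """
    if budget_s <= 0:
        raise ValueError("latency budget must be positive")
    return object_bytes * 8 / budget_s


# Object sizes taken from the text; 1 KB = 1000 bytes is an assumption.
SIZES_BYTES = {
    "text-only (low)": 3_000,
    "text-only (high)": 5_000,
    "HTML with small icons": 20_000,
}

for label, size in SIZES_BYTES.items():
    mbps = required_bandwidth_bps(size) / 1e6
    print(f"{label}: {mbps:.2f} Mbit/s")
```

Under these assumptions a 5 KB text screen needs about 0.4 Mbit/s and a 20 KB HTML screen about 1.6 Mbit/s. Any real propagation delay comes out of the same budget, so the actual requirement is higher.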
The quality vs. latency trade-off may be most useful as a
user-configurable parameter. The "conventional wisdom" on this is
split - I (J. Touch) believe quality will be sacrificed for speed,
whereas others at the DLI meeting (notably T. Smith, UCSB) believe
quality is of primary importance. An experiment is clearly indicated.
Transaction-TCP (T/TCP) is a reliable, transaction-oriented
protocol. T/TCP provides the reliability of the stream-oriented TCP
protocol, and the packet-boundaries of the unreliable UDP protocol. It
caches connection state across multiple transactions, including
congestion-control information. Further information on T/TCP is
provided in the Internet RFCs on:
The question of "dominant traffic" needs qualification. In what
respect is the traffic dominant:
Conventional wisdom is that mail dominates connections, because some
legacy mail systems open a connection per message. Packets may be
dominated by network control messages in some systems, but are usually
related to the dominant data. Until recently, the dominant data in the
Internet was FTP (file transfer) traffic, but the Web has now
surpassed FTP. Router processing is dominated by exceptions to common
traffic, e.g., multicast traffic load, rather than necessarily by
packet processing. When the MBone was configured using IP source
routes rather than tunnels to interconnect multicast islands, source
route processing became a dominant load. Server processing load is
dominated by context-switching costs among multiple connections.
There are three applications that are expected to dominate the Internet:
The current web is characterized as a system that is:
NOs can be classified as either:
This page written and maintained by Joe Touch touch@isi.edu

What are the bandwidth requirements for DLs?
The bandwidth requirements for digital libraries (DLs) are a function
of the size of the objects, the bandwidth of the network connection,
and the latency between a request and the presentation of the response.

What latency trade-offs are involved?
DLs can be considered one of a class of "interactive distributed
information access" (I-DIA) systems (this acronym could easily be
InDIA). I-DIA requirements are based on the assumption that responses
must occur within 100 ms of a request to be considered interactive to
humans. There are two components to the trade-off - quality vs.
latency, and response-time vs. bandwidth, the latter
described above.

What existing protocols support DLs?
There are a few protocols that are relevant to the DL community, notably:
Multicasting provides a mechanism for distributing data to a set of
recipients without additional server load. Packets are replicated
inside the network, rather than at the source. There are several
Internet RFCs (Requests for Comments) concerning multicast, notably the
description of the multicast-IP protocol. Multicast IP is the basis of
the MBone, or multicast backbone.
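
To make the "replicated inside the network" point concrete, here is a small sketch that counts how many copies of one packet cross each link of a distribution tree under unicast and under multicast. The topology (source S, routers R1-R3, receivers A-D) and all function names are hypothetical, invented for this example; it models only packet counts, not any real multicast routing protocol.

```python
from typing import Dict, List, Set, Tuple

Tree = Dict[str, List[str]]

# Hypothetical topology: S is the server, R* are routers, A-D are receivers.
TREE: Tree = {
    "S": ["R1"],
    "R1": ["R2", "R3"],
    "R2": ["A", "B"],
    "R3": ["C", "D"],
}
RECEIVERS: Set[str] = {"A", "B", "C", "D"}


def receivers_below(tree: Tree, node: str, receivers: Set[str]) -> int:
    """Number of receivers at or below node."""
    count = 1 if node in receivers else 0
    for child in tree.get(node, []):
        count += receivers_below(tree, child, receivers)
    return count


def link_copies(tree: Tree, root: str, receivers: Set[str],
                multicast: bool) -> Dict[Tuple[str, str], int]:
    """Copies of a single packet crossing each (parent, child) link."""
    copies: Dict[Tuple[str, str], int] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in tree.get(node, []):
            below = receivers_below(tree, child, receivers)
            if below:
                # Multicast: one copy per link, replicated at the branch point.
                # Unicast: one copy per downstream receiver, all sent by the source.
                copies[(node, child)] = 1 if multicast else below
            stack.append(child)
    return copies


def source_copies(tree: Tree, root: str, receivers: Set[str],
                  multicast: bool) -> int:
    """Copies the source itself has to transmit."""
    copies = link_copies(tree, root, receivers, multicast)
    return sum(n for (parent, _), n in copies.items() if parent == root)


for mode in (False, True):
    name = "multicast" if mode else "unicast"
    total = sum(link_copies(TREE, "S", RECEIVERS, mode).values())
    print(name, "source copies:", source_copies(TREE, "S", RECEIVERS, mode),
          "total link copies:", total)
```

In this toy tree the source sends four copies per packet under unicast and one under multicast, and the total link load drops from 12 to 7 copies. The server-side saving grows with the number of recipients.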

What prior-work in caching applies?
There is a wealth of prior art in caching and replicated file systems that
applies to DLs. A very few are listed here. We also keep a list of general
web-acceleration techniques, of which file caching is one.

What are the expected network applications?
There are several expected applications that may dominate the
network traffic. The question of dominant applications affects
the design of DLs, and the design of networks to support DLs,
because DL traffic will interoperate with this traffic, or
in fact become the dominant traffic as a result of these apps.
We believe that Web traffic will be dominant, and that
explicitly-transferred messages (e-mail) will be less common as
information organization becomes more standard. Consider telephone
directory services: the better organized the phone book is, the less
likely dialed services are to be used. We also believe the Web is a member of
a general class of I-DIA applications, given sufficient augmentation.
If we relax some of these constraints, we may see the broader class of
I-DIA applications:

What objects might be used on a network?
The object model impacts the organization of data in a DL. The use of
networks can affect the view of the objects, as well. Current network
objects (NOs) aren't simple static objects. NOs are:
Here we present a rudimentary object taxonomy from a networker's perspective.
(this is a personal view - J. Touch).
central, perm, transact, obj first = book
central, perm, transact, req first = pending book request
central, perm, stream,   obj first = movie
central, perm, stream,   req first = pending movie request
central, ephm, transact, obj first = CV's, *today's* paper, web
central, ephm, transact, req first = subscription
central, ephm, stream,   obj first = bcast by item
central, ephm, stream,   req first = scheduled bcasts (by time)

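One way to read the taxonomy above is as three binary axes, all for centrally held objects, with one example per combination. The sketch below encodes it that way. The axis names (Lifetime, Access, Initiation) and the helper `example_for` are our own labels for this illustration; the page itself gives only the abbreviations perm/ephm, transact/stream, and obj first/req first.

```python
from enum import Enum
from itertools import product


class Lifetime(Enum):       # hypothetical axis name
    PERMANENT = "perm"
    EPHEMERAL = "ephm"


class Access(Enum):         # hypothetical axis name
    TRANSACTION = "transact"
    STREAM = "stream"


class Initiation(Enum):     # hypothetical axis name
    OBJECT_FIRST = "obj first"
    REQUEST_FIRST = "req first"


# Every row in the taxonomy is "central"; the examples are copied from it.
P, E = Lifetime.PERMANENT, Lifetime.EPHEMERAL
T, S = Access.TRANSACTION, Access.STREAM
O, R = Initiation.OBJECT_FIRST, Initiation.REQUEST_FIRST

EXAMPLES = {
    (P, T, O): "book",
    (P, T, R): "pending book request",
    (P, S, O): "movie",
    (P, S, R): "pending movie request",
    (E, T, O): "CV's, *today's* paper, web",
    (E, T, R): "subscription",
    (E, S, O): "bcast by item",
    (E, S, R): "scheduled bcasts (by time)",
}


def example_for(lifetime: Lifetime, access: Access, initiation: Initiation) -> str:
    """Look up the taxonomy's example for one combination of the three axes."""
    return EXAMPLES[(lifetime, access, initiation)]


# All 2 x 2 x 2 combinations are listed, one row each.
for key in product(Lifetime, Access, Initiation):
    labels = ", ".join(axis.value for axis in key)
    print(f"central, {labels} = {example_for(*key)}")
```

Writing the taxonomy as a lookup table makes it easy to check that every combination has an entry, and leaves room for a fourth axis if non-central objects are added later.
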
What is Dartnet?
See the presentation from the DLI '95 meeting.
