At ISI we are currently examining HTTP performance across various transport protocols [8, 20]. As part of this work we have examined the performance of HTTP and persistent-HTTP (P-HTTP) in detail. We have developed a model that expresses HTTP performance as a function of server and network characteristics [8].
To validate our HTTP performance model we compared predicted performance to the measured performance of an actual web server. Our early experiments suggested that P-HTTP was ten times slower than the corresponding simple HTTP transactions in a simple page-retrieval benchmark. This result is surprising since P-HTTP is intended to improve performance by amortizing the cost of connection creation across multiple requests [16, 13].
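The amortization argument can be illustrated with a deliberately simplified cost sketch. This is not the analytic model of [8]. The function names and the cost values are hypothetical, and the sketch assumes that every connection setup and every request has a fixed, independent cost, ignoring the TCP interactions discussed in the rest of this document.

```python
def http_time(n_requests, setup_cost, request_cost):
    """Simple HTTP: every request pays for its own connection setup."""
    return n_requests * (setup_cost + request_cost)


def phttp_time(n_requests, setup_cost, request_cost):
    """P-HTTP: a single connection setup is shared by all requests."""
    return setup_cost + n_requests * request_cost


# Hypothetical costs in milliseconds, chosen only for illustration.
n, setup, per_request = 10, 2, 5
print(http_time(n, setup, per_request))   # 70
print(phttp_time(n, setup, per_request))  # 52
```

Under these assumptions P-HTTP can never be slower than simple HTTP, which is why a measured tenfold slowdown points to effects outside such a model.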
We found several interactions between P-HTTP and TCP which explain the exceedingly poor P-HTTP performance. These performance problems are not caused by specific errors in our server (Apache, beta version 1.1b4) or in our TCP implementation (SunOS 4.1.3), but they instead result from interactions between application-level P-HTTP behavior and existing TCP algorithms. We resolved these interactions through application-level implementation changes, providing an HTTP implementation where P-HTTP is 40% faster than simple HTTP over an Ethernet. With these implementation changes, most P-HTTP overhead is accurately accounted for by our analytic model [8].
Although the problems that we found are due to our particular implementations of P-HTTP and TCP, we believe that there are several reasons why a broader understanding of these issues is needed in the web community. First, P-HTTP is a relatively new protocol and is only now becoming standardized in HTTP/1.1 [6]. Although P-HTTP is derived from HTTP, it exhibits very different network dynamics. To a first approximation, simple HTTP is identical to the data channel of FTP: a new connection is opened for each data object. FTP behavior has been studied for many years. P-HTTP involves multiple exchanges over a single TCP connection, so it behaves much more like SMTP or NNTP (the Internet's standard e-mail and news transfer protocols) than like FTP. SMTP is a batch protocol and interactive use of NNTP is usually on a LAN, so it is not surprising that TCP is not tuned for wide-area P-HTTP-style traffic.
Second, we have observed these problems in widely deployed implementations of HTTP and TCP. We have also made an early draft of this work available to others and have been told that similar problems exist in at least one other HTTP server [7]. Together, these observations suggest that the web development community is not widely familiar with these problems.
Finally, HTTP is becoming very widely deployed outside its original domain of hypertext exchange. HTTP server implementations have been deployed for weather sensor arrays, networked disk drives, network routers and gateways, and implementations exist for nearly all types of general-purpose computers. Although many of these platforms will implement only a subset of HTTP (and possibly not P-HTTP), the many potential P-HTTP implementations suggest that a broader understanding of its behavior is important.
This document summarizes two observed performance problems and a third anticipated problem. In each case we describe the problem and demonstrate it with packet traces (where possible). We have implemented solutions to the first two problems we describe and show that, with these solutions, P-HTTP performs better than HTTP. We outline a solution to the third problem.
Problems similar to the first two we describe have been encountered in other contexts [14, 5]. We compare that work to ours in Section 3.