Abstract
I demonstrate how one can utilize the World Wide Web to solve problems that may otherwise be hard to solve. The demonstration consists of a distributed computing application run on a collection of Web servers using the HTTP protocol for communication. A problem that seems ideally suited for this demonstration is the directed Hamiltonian path problem (a restricted form of the traveling salesperson problem), one of the class of combinatorially hard problems. A Hamiltonian path of a connected graph is a path that includes every node exactly once. It is possible to encode a graph onto a small network of Web servers. Messages passed between them via HTTP requests will accumulate a list of visited nodes along the path. Each server then analyzes the path contained in received messages, and when one finds a completed Hamiltonian path, it will send a message to the destination server.
Distributed computing
Distributed computing has been studied extensively over the last two decades as a practical means to solve many kinds of computationally intensive problems. If a problem is split across several processors, either on the same computer, or on different machines distributed over a network, which communicate either via message passing or through a shared memory mechanism, it can be solved in much less time than by performing the computation on a single processor. One particularly hot area of research in distributed computing today is networks of workstations (NOWs), which harness the computational power of many workstations connected together on a fast Local Area Network (LAN)[1]. Individual workstations in NOWs communicate by passing messages using Remote Procedure Calls (RPCs) and through distributed shared memory. RPC is a programming interface supported on many platforms, and is therefore a good communication protocol in a heterogeneous computing environment.
More recently the Object Management Group (OMG), a consortium of software and hardware vendors, has proposed Object Management Architecture (OMA) "with a view to drive the industry towards interoperable, reusable, portable software components based on open, standard object-oriented interfaces"[2]. The Common Object Request Broker Architecture (CORBA), a standard communications layer also defined by the OMG that establishes a client-server relationships between objects in a heterogeneous computing environment, has seen keen interest from the commercial sector, especially for use in client/server applications. Another distributed object model gaining popularity for client/server applications is the Distributed Component Object Model (DCOM) advocated by Microsoft. Like CORBA, DCOM allows objects to be serialized and transmitted over the network to a server computer.
WWW is a distributed computing network
The World Wide Web is already a distributed computing network. It is composed of millions of machines, each supporting one of dozens of software and hardware configurations, all communicating via a platform independent standard --- the HyperText Transfer Protocol (HTTP). Moreover, Web servers routinely include the Common Gateway Interface (CGI) that allow clients, such as Web browsers, to execute commands on these servers. The ability to do local computation and fully asynchronous inter-platform independent communication enables the WWW to serve as the hardware for performing extremely massively parallel computations. Due to the slow nature of HTTP, it is most well suited to performing cpu bound processing, such as prime number searching or factoring, and communication bound processing in which message passing defines the problem, such as in neural network models. Load balanced problems, in which computation and communication are more equally weighted, are better suited to run on massively parallel supercomputers with dedicated communication networks.
Combinatorial problems
A Hamiltonian path exists in a graph only if there is a connected path from a designated starting node to a designated output node that visits each node of the graph exactly once. The directed Hamiltonian path problem has been proven to belong to the class of NP-complete problems, which includes various network optimization, routing and circuit design problems. There are algorithms that can find a Hamiltonian path in an arbitrary graph, but their time complexity grows exponentially with the size of the graph. There are (at this time) no known polynomial time algorithms to solve NP-complete problems. This is also the problem that has been solved by Leonard Adleman using a DNA computer[3].
Because a network such as the Web can be represented by a graph, it seems natural to attempt to solve a directed graph problem by using the Web. Several Web servers were chosen to represent nodes of the graph. Edges of the graph were set programmatically, by providing each server with a list of names of other servers to which it should send messages. HTTP requests were used to pass messages, that encode the path, from one server to the next. The computation was initiated at the start node (server) by sending it an empty path, and the last node in the path emailed the solution to the problem.
The following are the key steps of the algorithm:
Hamiltonian path computation experiment
Seven Web servers were chosen for the distributed Hamiltonian path computation experiment. Of the seven, five were in various parts of California and two were in Illinois. They were linked together as shown in the graph[5]. The red and green lines are two Hamiltonian paths through the graph starting at node 0 and ending at node 6.

The following servers were used as nodes in the graph:
A perl script was installed in the /cgi-bin directory on each server. If the server supports Common Gateway Interface (CGI) standard, the script is executed each time the server receives an HTTP GET request for the URL containing script name. Arguments were passed to the script via the HTTP GET by appending a question mark and a string containing the arguments to the URL. Each server used telnet protocol to send messages to other servers through the HTTP GET requests. All the scripts were identical except for the telnet functionality. If perl distribution 5.003 or above was installed on the server, IO::Socket module was used for telnet functions. For earlier versions of perl, chat2.pl library worked on some platforms, on others, downloading a small Net::Telnet module enabled the script to work. The scripts logged every incoming message.
The experiment was initiated by sending an empty path in an HTTP request to server 0. Server 0 automatically appended its name to the path and sent the new path in a message to servers 1, 2, 3 and 6. Each server that received the message, appended its name to the path, and checked for duplicate names in the path. If a duplicate was found, the script stopped. If no duplicates were found, and the number of servers in the path equaled number of nodes in the graph, a mail message containing the path was sent to the author. Otherwise, the path was sent to server's neighbors, as indicated in the graph above. Less than a minute after the computation was initiated on the first server, the author received two email messages containing the two Hamiltonian paths through the graph (red and green lines in the graph). The second solution was not anticipated by the author before the start of the experiment.
For more details of the experiment, please see results.
Future directions
The experiment described above demonstrates that it is possible to harness the power of distributed servers on the Web to solve real problems. Another experiment of interest to the author is to build a neural net out of several servers and train it on a simple task, such as addition of two numbers. It is intriguing to consider, though highly impractical, the possibility of linking the millions of Web servers into a single powerful neural net. The author would like to see the development of more intelligent, more dynamic Web servers, which could readily respond to environmental stimuli, such as network traffic fluctuations. In addition to computation, collective use of these servers by Web surfers could lead to interesting applications, such as collaboratively enhanced knowledge discovery, for instance.
Acknowledgements
This experiment would have been impossible without the help of several people who graciously agreed to install the script on their servers. I would like to acknowledge James Gunderson, Nicholas Mloukieh, David Saltzberg, Mingming Wu, friends at W3-Design, and especially Luke Tymowski. I am also grateful to Richard Ross for challenging questions and helpful suggestions.
References
1. Survey on Distributed Computing Networks - Networks of Workstations (http://www.cis.ohio-state.edu/~jain/cis788/dist_comp/index.html).
2. Developments in Distributed Computing (http://www-cad.eecs.berkeley.edu/~naji/litsearch/dist_sys.html).
3. Leonard Adleman, Molecular Computation of Solutions to Combinatorial Problems, Science 266, November 11, 1994.
4. J. N. Tsitsiklis and G. D. Stamoulis, On the Average Communication Complexity of Asynchronous Distributed Algorithms, JACM 42(2) March 1995.
5. The graph is the one used by Adleman in Ref. 3, except for the extra edge between nodes 0 and 2. The spurious edge arose from a transcription error when the paper was converted to HTML format. Adleman quotes a unique solution - 0->1->2->3->4->5->6 - he found by using DNA computers. The extra edge introduces a second solution, 0->2->1->3->4->5->6.