The Network

The network is like a collection of wires, each of which transports packets among a small group of machines. Some machines sit on more than one of these wires; these machines are called gateways. To send a message to a machine, you need to decide how to get to that machine. It would be difficult for each machine to know how to get to every other machine it might want to talk to, since the network is constantly changing: new networks are added, old networks break and become unavailable for a while, and networks are replaced with different ones to meet changing needs.

It would be impractical for every machine on the network to keep up-to-date information about every network that it could possibly reach. Instead, each machine only needs to know how to make the first step towards the destination. Usually this means that each machine knows how to get to the gateway which will put the packet on the right track towards the destination. That gateway knows how to get to the next gateway, and so on until the packet reaches the final gateway, which sends it directly to the destination.
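The first-step idea can be sketched in a few lines of Python. This is only an illustration: the machine names, the ROUTES table, and the "*" default entry are all made up for the example, and real routing tables are keyed by network rather than by individual machine.

```python
# Hypothetical topology. Each machine's table maps a destination to the
# next machine on the way there; "*" is the default (the nearest gateway).
ROUTES = {
    "A":   {"B": "B", "*": "gw1"},            # A and B share a wire
    "B":   {"A": "A", "*": "gw1"},
    "gw1": {"A": "A", "B": "B", "*": "gw2"},  # gw1 sits on two wires
    "gw2": {"C": "C", "*": "gw1"},
    "C":   {"*": "gw2"},
}


def next_hop(machine, destination):
    """Return the one step this machine knows how to take."""
    table = ROUTES[machine]
    return table.get(destination, table["*"])


def deliver(source, destination, max_hops=16):
    """Follow first steps until the packet arrives; return the path taken."""
    path = [source]
    current = source
    while current != destination:
        if len(path) > max_hops:
            raise RuntimeError("routing loop or unreachable destination")
        current = next_hop(current, destination)
        path.append(current)
    return path
```

Here deliver("A", "C") passes through gw1 and gw2, yet machine A itself only ever knew to hand the packet to gw1.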

In Nachos, we will have a much simpler system. There will only be a small number of machines, and each machine has a direct link to every other machine, so there is no need for this kind of routing. To send a packet to a machine, you specify which machine it is to go to, and the kernel takes care of sending it there.

This leaves us with the problem of identifying who on that machine is to receive the packet. This is a naming problem: how do we name something on a destination machine? While objects on a machine have several unique identifiers (for instance, in Unix a process has a unique process ID), these names change over time. We want a more global name, so that we can send the message without inspecting the foreign name space.

The model we use is similar to the postal system. Let's say you want to send a letter to your friend Joe in New Jersey. Well, you can't just write "Joe in New Jersey" on the letter, since the post office doesn't know how to deliver that. Even if there were only one Joe in all of New Jersey, he might move around some (drive to work, go to a movie, play in some toxic waste), so the post office can't deliver the message directly to him. Rather, what you do is specify some well-known drop point, say a post office box. To send the letter to Joe you write "Post Office Box 10202, Joisey City, NJ", and the post office moves the message to that box. Some time later, when convenient, Joe comes to the post office building and looks in his box for new letters.

Keep in mind that the only reason you know that sending a letter to a given PO Box (or address) will get to a specific individual is that you know that the person will look for messages in that place. In the postal system there is no inherent connection between an address or a PO Box and the person who gets mail at that location. If you were to give the correct name but the wrong PO Box number, the post office would stick the message in the wrong box and Joe would never get the message.

Interprocess communication in Nachos works in a similar fashion. Each machine provides a number of message holders (called boxes in Nachos and sockets in Unix), which can be the destination of a message. Each box is identified by a number. When a process wants to get messages from other processes, it publishes which box number it will receive messages in, and waits for the messages to come there.
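As a rough picture of this model, here is a small Python sketch of machines with numbered boxes. It is not the Nachos interface: the Machine class, the send function, and the default of ten boxes per machine are assumptions made for the example. A real receive would also block until a message arrives, whereas this one simply returns None when the box is empty.

```python
from collections import deque


class Machine:
    """One machine with a fixed set of numbered boxes (hypothetical model)."""

    def __init__(self, name, num_boxes=10):
        self.name = name
        # Box number i is simply index i in this list.
        self.boxes = [deque() for _ in range(num_boxes)]

    def deposit(self, box, message):
        """Called by the network: put a message in the given box."""
        self.boxes[box].append(message)

    def receive(self, box):
        """Return the oldest message in a box, or None if it is empty."""
        queue = self.boxes[box]
        return queue.popleft() if queue else None


def send(machines, to_machine, to_box, message):
    """Deliver a message to (machine, box); nobody checks who reads that box."""
    machines[to_machine].deposit(to_box, message)
```

A process that has published box 3 on machine "m1" would pick up its mail with machines["m1"].receive(3). As with the wrong PO Box number, a message sent to box 4 by mistake is delivered without complaint and never seen by that process.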

This is very much like how Unix works. In Unix there is a set of standard services, which do things like deliver the mail, report who is on the machine, handle file transfer and remote login, and so forth. Each of these is a process which accepts messages from other machines and does some kind of work. The problem is, how do you know where to send the message to get in touch with the service you are interested in? In Unix, there is a text file called /etc/services which lists the services and the port number at which each can be reached. This is how they publish their mail box number (kind of like a phone book).
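To make the phone book idea concrete, here is a short Python sketch that reads text in the /etc/services layout (service name, then port/protocol, then optional aliases). The SAMPLE text is a hand-written excerpt rather than a copy of any particular system's file, and parse_services is a name invented for this example.

```python
# Hand-written excerpt in the /etc/services layout.
SAMPLE = """\
# service   port/protocol   aliases
ftp         21/tcp
telnet      23/tcp
smtp        25/tcp          mail
login       513/tcp
"""


def parse_services(text):
    """Build a {(service name, protocol): port} table from services-style text."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        name, port_proto, aliases = fields[0], fields[1], fields[2:]
        port, proto = port_proto.split("/")
        for service in [name] + aliases:
            table[(service, proto)] = int(port)
    return table


services = parse_services(SAMPLE)
```

Looking up services[("login", "tcp")] gives 513, the box a client would write to in order to reach the remote login service. On a real Unix system the same lookup is available through the standard library call socket.getservbyname.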

Keep in mind that this just specifies an initial contact point. If a couple of processes want to initiate an extended communication session, then they will probably want to get their own pair of mail boxes to communicate through, so someone who wants to initiate a conversation won't be sending messages into the same box as they are using to carry out their conversation.

Remote login could work something like this. There is a remote login server which sits on, say, mail box 513. When someone tries to rlogin to the machine, this person sends a message to this box saying which box he is listening on and what he wants. The rlogin process forks itself; the child goes off to service this request, and the parent continues listening to the published box for more requests.

The child looks at the intro packet and determines what to do. The intro packet contains the name of the remote machine, and the box number on that machine, to talk to. The child doesn't want to continue using the published box for communication with the client, since other people will be sending messages to that box to start other rlogin sessions. Instead it gets a new, unused box and tells the client to send further communication to the new box. This is kind of like a business starting a new department. They will want to give the new department its own phone number and mail box, rather than having all communication come to one address where they will have to sort it according to whether it is for the old or the new business.

In this situation we have three boxes in use. One is the box the client is using to get messages from the rlogin server. One is the published communications end point that the rlogin server listens to for new connections. The third is a private box on the server machine, used on a per-conversation basis by the client and the server, which is the destination of all of the messages from the client to the server except the first.
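The exchange can be traced with a small single-threaded Python simulation. Everything here is a sketch under assumptions: the Machine class, the dictionary message layout, and the choice to hand out private boxes starting at 1024 are invented for the example, and the fork is only noted in a comment. Since there are only two machines, the machine name that the intro packet would carry is left out.

```python
from collections import deque

PUBLISHED_BOX = 513  # the well-known rlogin box from the text


class Machine:
    """A machine whose boxes are created on first use (hypothetical model)."""

    def __init__(self):
        self.boxes = {}           # box number -> queue of messages
        self.next_private = 1024  # assumed start of the private range

    def box(self, number):
        return self.boxes.setdefault(number, deque())

    def allocate_box(self):
        """Hand out a fresh, unused box number."""
        number = self.next_private
        self.next_private += 1
        self.box(number)
        return number


def rlogin_handshake(client, server, client_box, request, first_command):
    # 1. The client writes the intro packet to the published box.
    server.box(PUBLISHED_BOX).append({"reply_box": client_box, "request": request})
    # 2. The server takes the intro packet; a real server would fork a child here.
    intro = server.box(PUBLISHED_BOX).popleft()
    private_box = server.allocate_box()
    # 3. The child tells the client which box to use from now on.
    client.box(intro["reply_box"]).append({"use_box": private_box})
    # 4. The client reads the reply and sends all later traffic to the private box.
    reply = client.box(client_box).popleft()
    server.box(reply["use_box"]).append(first_command)
    return private_box
```

After the handshake the published box is empty again and ready for the next client, while the conversation continues in the private box. A second client going through the same steps is given a different private box.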