Computer networking is becoming a bigger and bigger issue every day. It's a versatile and inexpensive way to share resources and trade data. This section addresses the basic OS issues involved in communicating between computers.

That's the same diagram from our discussion of the I/O system, only relabelled to represent a computer network. Some of the issues are remarkably similar. The system still has to address:
There are some significant differences between a network and a hardware system, though:
Because there are so many more elements connected sparsely, issues of naming, addressing, and routing become paramount. It's important to grasp the distinction between a name, an address, and a route.
Defining and allocating these entities is one of the most difficult parts of networking. That the Internet provides a global (in the purest sense of that word) naming, addressing, and routing system is nothing short of phenomenal.1 This is only possible because the system was designed to scale to global sizes. Even so, cracks are showing: the address space may not be large enough, and routing information taxes the ability of hardware to store and search it. And the one that we had working, naming, is under assault from lawyers.
Another basic concept that underlies networking is the protocol. A protocol is a set of rules that communicating entities follow in order to communicate meaningfully. For example, exchanging electronic mail requires a sequence of exchanges between the mailing machine and the receiving (or forwarding) machine. The mailer identifies itself, the receiver acknowledges it, the mailer tells who the mail is from, the receiver accepts or rejects the address, the mailer tells who the mail is to, the receiver again accepts or rejects, and finally the message is exchanged and acknowledged. That set of rules is a protocol.
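The mail exchange above can be sketched as a tiny state machine. This is a simplified illustration, not the real mail protocol: the command names (HELLO, FROM, TO, DATA), the `Receiver` class, and the `send_mail` helper are all hypothetical and chosen only to mirror the steps in the paragraph.

```python
class Receiver:
    """Toy receiving machine that enforces the order of the mail exchange."""

    # The steps the mailer must take, in order (hypothetical command names).
    STEPS = ["HELLO", "FROM", "TO", "DATA"]

    def __init__(self, known_users):
        self.known_users = set(known_users)
        self.next_step = 0   # index into STEPS of the command expected next
        self.delivered = []  # message bodies accepted so far

    def handle(self, command, argument):
        """Return 'OK' or 'REJECT' for one command from the mailer."""
        if self.next_step >= len(self.STEPS) or command != self.STEPS[self.next_step]:
            return "REJECT"  # out-of-order command: the rules were not followed
        if command == "TO" and argument not in self.known_users:
            return "REJECT"  # the receiver refuses an unknown recipient
        if command == "DATA":
            self.delivered.append(argument)
        self.next_step += 1
        return "OK"


def send_mail(receiver, me, sender, recipient, body):
    """Run the whole exchange, stopping at the first rejection."""
    dialogue = [("HELLO", me), ("FROM", sender), ("TO", recipient), ("DATA", body)]
    for command, argument in dialogue:
        if receiver.handle(command, argument) != "OK":
            return False
    return True
```

Both sides only make progress because they agree on the order and meaning of the steps; a mailer that skips straight to DATA is simply rejected.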
Protocols give rise to standards. A standard is a formal presentation of a protocol that has been sanctioned by some official body. For example, the electronic mail protocol above has been sanctioned by the Internet Engineering Task Force (IETF). If your system claims to exchange RFC822-compliant mail2, it must follow those rules - this is called conforming to the standard. Of course, if your mailer doesn't conform to the standard but sends mail without losing any, the only thing that happens is that you can't put an RFC822-compliant sticker on it. However, because standards represent an agreement between major practitioners of the field, conforming to them provides a loose guarantee that systems interoperate.
Another word that networkers use a lot is packet. A packet is like a disk block - it's an element of data exchanged between 2 hosts. Depending on the underlying hardware and its associated protocols, packets may be fixed or variable length, and there are different maxima and minima for the various packet parameters. It's best to think of them as atomic elements of networking, although as we'll see, that can be an illusion.
The final basic distinction to draw is between connection-oriented and connectionless communications. This is exactly the difference between the post office and the phone system. In the mail, two units of transmission (2 letters) have no relation to each other. If you send them to the same place on the same day, you can't usually tell what order they were sent in, or even if they bear any relation to each other - even if sent between the same 2 people. There's no state that ties them together. On a phone call, the various transmissions (words, or different family members talking) have some relation that is encapsulated by the idea of a call. The words that go in one end of the phone are not arbitrarily reordered, for example.
There are networks that support both these paradigms. In fact, each can be supported by the other: electronic mail is connectionless in the sense that each piece of mail has no ordering relative to others, yet the mail is transferred using TCP, a connection-oriented protocol.
The seven layer model is the OSI (Open Systems Interconnection) model for designing networking, produced by ISO, an international standards body. As a tool for understanding the various issues in networking, it's not bad. As a model for implementation, it's a recipe for a slow network. We'll use it to talk about protocol design, but think more seriously about what you're doing before you implement something this way.
Each level of the stack provides services to layers above it using building blocks provided by layers below it. Conceptually, this is very nice, but we'll see that some services are replicated and some don't fit neatly into a layer.
Conceptually, each layer adds a header to outgoing packets, and strips them off incoming packets before passing the packet up or down the stack as the case may be. Numbering layers from the bottom up, an outgoing packet would have headers:
| Layer 4 header | Layer 3 header | Layer 2 header | Layer 1 header |
| First bit transmitted -> | | | |
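The header wrapping described above can be sketched in a few lines. This is only an illustration under stated assumptions: the header contents (`L4|`, `L3|`, and so on) and the function names are hypothetical placeholders, and real headers carry addresses, lengths, and checksums. In the byte string returned by `encapsulate`, byte 0 is the first one transmitted, so the Layer 1 header leads on the wire.

```python
# Layers numbered from the bottom up, as in the text; only four are shown.
LAYER_NUMBERS = [4, 3, 2, 1]  # order in which an outgoing packet passes down the stack


def header_for(layer: int) -> bytes:
    """Hypothetical fixed header for a layer; a stand-in for real header fields."""
    return f"L{layer}|".encode()


def encapsulate(payload: bytes) -> bytes:
    """Pass data down the stack: each layer puts its header in front."""
    packet = payload
    for layer in LAYER_NUMBERS:
        packet = header_for(layer) + packet
    return packet


def decapsulate(packet: bytes) -> bytes:
    """Pass a packet up the stack: each layer strips its own header."""
    for layer in reversed(LAYER_NUMBERS):
        header = header_for(layer)
        if not packet.startswith(header):
            raise ValueError(f"missing layer {layer} header")
        packet = packet[len(header):]
    return packet
```

Each layer only looks at its own header and treats everything after it as opaque data, which is what lets the layers be designed independently.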
The layers are (bottom to top): Physical, Link, Network, Transport, Session, Presentation, and Application.
The physical layer specifies the format of bits on the wire and what kinds of wire you can use. This is very nuts and bolts electrical (or optical!) engineering stuff, and I won't discuss it in any great detail.
Each type of hardware has its own standard: there's an Ethernet standard, an FDDI standard, an X.25 standard, and a bunch more. They tell you what kinds of hardware to buy, how far apart nodes can be (or must be), and what you'd see if you hooked up an oscilloscope (or spectrum analyzer) to the medium.
The link layer describes the protocols used by communicating nodes connected by the same physical hardware. The scope of names, addresses and routes is therefore constrained.
In a shared medium network3, the link layer is responsible for medium access. Medium access is the process of determining which host has the right to send information on the shared medium. There are many ways to do this. Ethernet uses CSMA/CD (Carrier Sense Multiple Access with Collision Detection), which means that each host listens to the shared line and doesn't send until the line is silent. That's the CSMA; the CD is needed because, even when listening beforehand, it's possible for 2 hosts far enough apart to hear the line clear, begin transmitting, and have their signals collide. If that happens, they both stop transmitting and remain silent for a random time period before trying again. The range from which that silent time is drawn grows geometrically with each successive collision.
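The random, geometrically growing silence after a collision is usually called binary exponential backoff. A minimal sketch, assuming time is measured in abstract "slots": the function name and parameters are hypothetical, and the cap of 10 doublings mirrors classic Ethernet but should be treated as a tunable assumption here.

```python
import random


def backoff_slots(collisions: int, rng: random.Random, max_exponent: int = 10) -> int:
    """Pick how many slot times to stay silent after the n-th collision in a row.

    The wait is uniform over a window that doubles with each collision,
    so repeated collisions spread the hosts further and further apart.
    """
    exponent = min(collisions, max_exponent)  # stop doubling after the cap
    return rng.randrange(2 ** exponent)       # uniform in [0, 2**exponent - 1]


# Usage: two hosts that just collided for the third time each draw a wait.
rng = random.Random()
waits = [backoff_slots(3, rng) for _host in range(2)]  # each value is in 0..7
```

Because each host draws its wait independently, the chance that both pick the same slot again shrinks as the window grows.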
Other medium access methods involve passing a token from host to host. Like the shell in Lord of the Flies, the token gives the holder the right to send uninterrupted. Tokens generally have a fixed lifetime so that a host can only transmit for a given time period before it is forced to relinquish the token and pass it to the next host. The protocols guarantee that every host gets the token eventually. FDDI (Fiber Distributed Data Interface) and token rings use tokens.
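Token passing can be simulated in a few lines. This sketch makes simplifying assumptions: the token's fixed lifetime is modelled as a maximum number of frames per turn rather than a time period, the token never gets lost, and the function name and data layout are hypothetical.

```python
from collections import deque


def token_ring(queues, max_frames_per_turn=1):
    """Simulate token passing and return the order in which frames were sent.

    queues[i] lists the frames host i wants to send. The token visits hosts
    in a fixed circular order, and its holder may send at most
    max_frames_per_turn frames before passing it on.
    """
    pending = [deque(q) for q in queues]
    sent = []
    holder = 0
    while any(pending):  # keep circulating while some host still has frames
        for _ in range(max_frames_per_turn):
            if not pending[holder]:
                break    # nothing (more) to send: give up the token early
            sent.append((holder, pending[holder].popleft()))
        holder = (holder + 1) % len(pending)  # pass the token to the next host
    return sent
```

For example, `token_ring([["a1", "a2", "a3"], [], ["c1"]])` sends `a1`, then `c1`, then `a2`, then `a3`: host 2 gets its turn even though host 0 has more to send.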
The link layer is also the first layer that detects (and potentially recovers from) transmission errors. This is usually accomplished by including a checksum in each packet. A checksum is a mathematical function that depends on the full contents of the packet, like the one-way functions used for authentication. Upon receiving the packet, a host will recompute the function (assuming the checksum field to be 0) and, unless it gets the same answer as the packet contained, reject the packet.
Choosing a good checksum is a difficult tradeoff. The more effective the checksum is at detecting errors, the slower it is to calculate. Because the checksum must be calculated for every packet, the speed of calculating it can determine the network speed. The science of constructing efficient strong checksums is interesting in its own right.
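The compute-with-the-field-zeroed procedure described above can be sketched as follows. The packet layout (a 4-byte checksum field followed by the data) and the function names are hypothetical, and CRC-32 from Python's `zlib` module is used only as a convenient stand-in; the text does not say which checksum any particular link layer uses.

```python
import struct
import zlib

CHECKSUM_SIZE = 4  # bytes; assumed layout: checksum field first, then data


def seal(data: bytes) -> bytes:
    """Build a packet whose checksum covers the whole packet with the field zeroed."""
    zeroed = bytes(CHECKSUM_SIZE) + data
    checksum = zlib.crc32(zeroed)
    return struct.pack("!I", checksum) + data


def verify(packet: bytes) -> bool:
    """Recompute the checksum as the sender did and compare with the stored value."""
    if len(packet) < CHECKSUM_SIZE:
        return False
    (claimed,) = struct.unpack("!I", packet[:CHECKSUM_SIZE])
    zeroed = bytes(CHECKSUM_SIZE) + packet[CHECKSUM_SIZE:]
    return zlib.crc32(zeroed) == claimed
```

A receiver would drop any packet for which `verify` returns `False`; whether the packet is then retransmitted is a separate decision.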
Some link layers correct errors. One approach is to label the packets, acknowledge each packet's receipt, and retransmit packets if there is no acknowledgement in a reasonable time. Another approach is to send redundant data and reconstruct damaged packets. This idea is also used in disk drive arrays, like RAID. You'll hear more about it in CS 555.
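The redundant-data approach can be illustrated with a single XOR parity packet, the same trick some RAID levels use. This is a minimal sketch under stated assumptions: all packets have the same length, exactly one packet of the group is lost, the receiver knows which one, and the function names are hypothetical.

```python
def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def add_parity(packets):
    """Append one parity packet: the XOR of all the data packets."""
    return list(packets) + [xor_blocks(packets)]


def recover(received):
    """Rebuild the data packets when exactly one entry (marked None) was lost.

    XOR-ing every surviving packet, parity included, yields the missing one.
    """
    missing = received.index(None)
    survivors = [p for p in received if p is not None]
    restored = list(received)
    restored[missing] = xor_blocks(survivors)
    return restored[:-1]  # drop the parity packet, keep the data
```

The cost is one extra packet per group; in exchange, a single loss is repaired without waiting for a retransmission.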
The network layer connects hosts on different physical networks. It extends the ideas of addressing, naming, and routing to their global extreme. The headers added by the network layer are independent of the network hardware.
The network layer solves some difficult distributed problems, e.g., how to store routes from every host to every host efficiently. Actually, it just makes sure that certain routers in the network know enough to send the packet in the right general direction, with each router knowing more about its local area.
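One way a router can know "the right general direction" without storing every host is to keep a short table of address prefixes and pick the most specific one that matches. The sketch below assumes IP-style addresses and longest-prefix matching, which the text above does not spell out; the table entries and next-hop names are made up for illustration.

```python
import ipaddress


def next_hop(table, destination):
    """Return the next hop for the most specific prefix containing destination.

    table is a list of (prefix, hop) pairs, e.g. ("10.0.0.0/8", "campus-router").
    Returns None if no prefix matches.
    """
    dest = ipaddress.ip_address(destination)
    best = None
    for prefix, hop in table:
        network = ipaddress.ip_network(prefix)
        if dest in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, hop)
    return best[1] if best else None


# Hypothetical table: detailed knowledge nearby, one catch-all for everything else.
TABLE = [
    ("0.0.0.0/0", "upstream"),         # everything we know nothing specific about
    ("10.0.0.0/8", "campus-router"),   # the general direction of the campus
    ("10.1.2.0/24", "local-lan"),      # our own neighbourhood, known in detail
]
```

With this table, `next_hop(TABLE, "10.1.2.7")` picks `"local-lan"`, while an address far away such as `"8.8.8.8"` falls through to `"upstream"`.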
I don't have time to really address these problems in this class, but I strongly advise you to check out one of the networking classes to find out for yourself.
The transport layer provides link layer style guarantees across the network layer. For example, transport resends lost packets and prevents reordering. Techniques similar to those of the link layer are used for these.
Transport also provides demultiplexing within the computer systems. The network layer can name, address and route to a given computer. Within the computer, the transport layer provides a way to name, address and route to given processes.
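Demultiplexing within a host can be pictured as a lookup table from a small number carried in the transport header to the process that should receive the data. TCP and UDP call that number a port; the `Transport` class below is a hypothetical toy, not any real stack's API.

```python
class Transport:
    """Toy transport layer that hands incoming payloads to the right process."""

    def __init__(self):
        self.handlers = {}  # port number -> callable that accepts a payload

    def bind(self, port, handler):
        """Register a process (here just a callable) as the owner of a port."""
        if port in self.handlers:
            raise ValueError(f"port {port} already in use")
        self.handlers[port] = handler

    def deliver(self, port, payload):
        """Route a payload that reached this host to whoever owns the port."""
        handler = self.handlers.get(port)
        if handler is None:
            return False  # nobody is listening on that port
        handler(payload)
        return True


# Usage: two "processes" on one host, each collecting its own traffic.
mail_inbox, web_inbox = [], []
transport = Transport()
transport.bind(25, mail_inbox.append)
transport.bind(80, web_inbox.append)
transport.deliver(80, b"GET /")  # ends up in web_inbox only
```

The network layer gets the packet to the machine; this table is what gets it the rest of the way to a process.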
Transport is also the layer that addresses global performance of the network, for example congestion control and resource allocations.
The session layer provides further multiplexing, control over which endpoint is sending data, and some checkpointing behavior. It's not often used. It's something of an open question if this functionality is important.
The presentation layer is responsible for reformatting data between machines, and providing data-based semantics. Converting floating point formats between hosts or only returning packets that contain a given type field are things that fall under Presentation's umbrella.
Like the Session layer, Presentation isn't often used.
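The floating point conversion mentioned above amounts to agreeing on one on-the-wire format and having every host translate to and from it. A minimal sketch, assuming the agreed format is a big-endian IEEE 754 double; that choice and the function names are illustrative assumptions, not something the text specifies.

```python
import struct


def to_wire(value: float) -> bytes:
    """Encode a float in the agreed wire format, whatever the host's byte order."""
    return struct.pack("!d", value)  # "!" = network (big-endian) order, "d" = 8-byte double


def from_wire(data: bytes) -> float:
    """Decode a float from the agreed wire format into the host's native float."""
    (value,) = struct.unpack("!d", data)
    return value
```

A little-endian host and a big-endian host can then exchange the same 8 bytes and both recover the same number.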
Application layer protocols are designed to carry out some useful, concrete service. SMTP - the email protocol - is an application layer protocol. So is HTTP (although it's extending its tentacles into Session and Presentation as well).
These are also standardized; there are lengthy documents on what a valid HTTP request looks like or on what behavior an FTP server has to support. They're dry reading, but important to interoperability.
I probably won't have time to talk about these in detail, but other networking issues include: