Reducing Context Switches

The Problem
NFS, and many other RPC-based network systems, use multiple processes
to provide concurrency for independent RPC requests. At high transfer
bandwidths, context switching between those processes is a noticeable
component of the overhead.
The Solution
We reimplemented the RPC system used by NFS as a single process,
using a mechanism similar to continuations in MACH. The system keeps
state for each outstanding request, recording the destination of the
data that will be returned. That one process also manages
retransmissions and the other aspects of data transfer.
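The idea can be sketched as follows. This is only an illustration of the general technique, not the ATOMIC-2 code: the class and method names (`SingleProcessRpc`, `call`, `handle_reply`, `tick`), the callback used as the "destination" of returned data, and the fixed retransmission timeout are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Pending:
    """State kept for one outstanding request (a hand-rolled continuation)."""
    payload: bytes                     # request body, kept for retransmission
    on_reply: Callable[[bytes], None]  # where the returned data should go
    deadline: float                    # time at which the request is resent


class SingleProcessRpc:
    """One process tracks every outstanding request in a table keyed by xid."""

    def __init__(self, send: Callable[[int, bytes], None], timeout: float = 1.0):
        self._send = send          # hypothetical transport hook: send(xid, payload)
        self._timeout = timeout    # assumed fixed retransmission timeout
        self._pending: Dict[int, Pending] = {}
        self._next_xid = 1

    def call(self, payload: bytes, on_reply: Callable[[bytes], None], now: float) -> int:
        """Issue a request without blocking; the reply is delivered later."""
        xid = self._next_xid
        self._next_xid += 1
        self._pending[xid] = Pending(payload, on_reply, now + self._timeout)
        self._send(xid, payload)
        return xid

    def handle_reply(self, xid: int, data: bytes) -> None:
        """Resume the saved state for xid; duplicate or late replies are ignored."""
        req = self._pending.pop(xid, None)
        if req is not None:
            req.on_reply(data)

    def tick(self, now: float) -> None:
        """Retransmit every request whose deadline has passed."""
        for xid, req in self._pending.items():
            if now >= req.deadline:
                req.deadline = now + self._timeout
                self._send(xid, req.payload)

    def outstanding(self) -> int:
        return len(self._pending)
```

Because every request is just an entry in `_pending`, concurrency comes from the table rather than from one blocked process per request, so no process switch is needed to move from one request to the next.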
Results
| Window Size | Biod Throughput (Mb/sec) | Biod Context Switches | Single Thread Throughput (Mb/sec) | Single Thread Context Switches |
| --- | --- | --- | --- | --- |
| 2 | 37.8 | 8327 | 46.2 | 8227 |
| 3 | 41.0 | 9302 | 47.5 | 4951 |
| 4 | 42.5 | 8587 | 44.5 | 2529 |
| 5 | 44.1 | 9107 | 44.2 | 2226 |
| 6 | 45.1 | 8586 | 44.6 | 2242 |
| 7 | 45.3 | 9450 | 44.9 | 2227 |
| 8 | 44.0 | 9469 | 44.7 | 2220 |
Averages of 25 writes of a 32 MB file using various window sizes (or
numbers of biods). The prototype hardware was used, with disk access
disabled at the server for both implementations. "Single Thread" is
the single-threaded implementation, and "Biod" is SunOS NFS. Standard
deviations are tenths of megabits/sec or tens of context switches.
Highlighted values show the 40% reduction in context switches (red)
and the 4% throughput improvement (blue).
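The per-row differences can be recomputed directly from the table. The short script below copies the measured values and derives, for each window size, the percentage change in context switches and the throughput difference. The names `ROWS`, `reduction_pct`, and `summarize` are illustrative only.

```python
# (window, biod Mb/sec, biod switches, single-thread Mb/sec, single-thread switches)
ROWS = [
    (2, 37.8, 8327, 46.2, 8227),
    (3, 41.0, 9302, 47.5, 4951),
    (4, 42.5, 8587, 44.5, 2529),
    (5, 44.1, 9107, 44.2, 2226),
    (6, 45.1, 8586, 44.6, 2242),
    (7, 45.3, 9450, 44.9, 2227),
    (8, 44.0, 9469, 44.7, 2220),
]


def reduction_pct(biod_switches: int, single_switches: int) -> float:
    """Percentage by which the single-threaded count is below the biod count."""
    return 100.0 * (biod_switches - single_switches) / biod_switches


def summarize(rows):
    """Return (window, switch reduction in %, throughput difference in Mb/sec)."""
    return [
        (w, round(reduction_pct(b_cs, s_cs), 1), round(s_tp - b_tp, 1))
        for (w, b_tp, b_cs, s_tp, s_cs) in rows
    ]


if __name__ == "__main__":
    for window, cs_drop, tp_gain in summarize(ROWS):
        print(f"window {window}: {cs_drop:5.1f}% fewer switches, {tp_gain:+.1f} Mb/sec")
```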
- The single-threaded system shows a 4% throughput improvement over
the best biod throughput, and a 40% reduction in context switches
whenever there is any reduction at all. The values compared are
highlighted in the table. There is no reduction in context switches
for a window size of 2 or less in the single-threaded system because
the pipeline between server and client is almost always empty, so the
receiving process is asleep between packets. As the window gets
larger, the receiving process can handle more packets per context
switch.
- Converting to a single-threaded model enables further enhancements.
For example, the number of outstanding requests is now a variable in a
single process rather than a function of how many biod processes are
running. This potentially allows the ATOMIC file server to
dynamically react to changing network state more simply than a
multiprocess NFS implementation.
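To illustrate why a window held in a single variable is easy to adapt, here is a minimal sketch. The adjustment policy shown (grow by one on each reply, halve on a timeout) is a hypothetical choice for the example and is not described in the source; the class name `WindowController` and its limits are likewise assumptions.

```python
class WindowController:
    """Tracks how many requests may be outstanding at once.

    The window is a plain integer, so changing it is one assignment
    instead of starting or stopping biod processes.
    """

    def __init__(self, initial: int = 2, minimum: int = 1, maximum: int = 8):
        self.minimum = minimum
        self.maximum = maximum
        self.window = initial   # current limit on outstanding requests
        self.in_flight = 0      # requests sent but not yet answered

    def can_send(self) -> bool:
        return self.in_flight < self.window

    def on_send(self) -> None:
        self.in_flight += 1

    def on_reply(self) -> None:
        # Hypothetical policy: a successful reply opens the window by one.
        self.in_flight = max(0, self.in_flight - 1)
        self.window = min(self.maximum, self.window + 1)

    def on_timeout(self) -> None:
        # Hypothetical policy: a timeout halves the window. The timed-out
        # request is retransmitted, so it still counts as in flight.
        self.window = max(self.minimum, self.window // 2)
```

A sender would check `can_send()` before issuing each request and call `on_reply()` or `on_timeout()` from the same process that handles replies and retransmissions.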

This page written and maintained by the
ATOMIC-2 group.
Last modified July 9, 1997