TCP's RTO & RTT-Spikes [was Re: large RTT variation caused by bandwidth oscillation]

From: Reiner Ludwig (Reiner.Ludwig@ericsson.com)
Date: Mon Jan 21 2002 - 16:33:33 EST


Hi Kacheong,

I'm slowly working through my e-mail backlog ...

At 04:16 09.01.2002, Kacheong Poon wrote:
>There is one comment specifically saying that Solaris does not
>comply with Section 5, rule 5.3. It is correct, Solaris does
>not implement the part of restarting timer in rule 5.3. The timer
>is restarted when a fast retransmit happens or during the fast
>recovery phase when missing segments are retransmitted.
>
>Rule 5.3 simply says that the RTO calculation is not good enough so
>that implementations should add a little fudge factor to it. For
>example, if 2 segments are sent and the receiver does not delay
>ack'ing. The second segment is dropped. After the ACK for the
>first segment arrives, rule 5.3 suggests that the timer should
>be restarted. This means that the actual timeout value for the
>second segment is RTO+RTT. The arrival of first data segment
>correlates weakly, if there is any correlation, to the fate of the
>next segment. This point has been mentioned by a lot of other people.

I thought the same for quite a while but our recent research showed that
rule 5.3 in RFC2988 is often an important (although fairly imprecise)
safeguard.

Imagine the case of a bandwidth dominated link where, e.g., the packet
transmission delay across the bottleneck link dominates the RTT. This is
typical for paths across wide-area wireless access. When the queue starts
to build up at the bottleneck link, the RTT will increase significantly.
Thus, when the RTT for packet N has increased significantly compared to
SRTT, chances are that the RTT for packet N+1 will also be increased. Rule
5.3 implicitly factors this in. We found that this often saves you from a
spurious timeout, because the SRTT and RTTVAR are often not sensitive
enough to account for such RTT spikes (due to the gains 1/4 and 1/8).
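
To make the sensitivity argument concrete, here is a minimal sketch of the
RFC2988 estimator (gains 1/8 for SRTT and 1/4 for RTTVAR, K = 4). The
function name, the sample values, and the omission of the 1 second minimum
RTO are assumptions made for illustration only.

```python
# Sketch of the RFC2988 estimator; not taken from any particular stack.
ALPHA, BETA, K = 1 / 8, 1 / 4, 4

def rfc2988_update(srtt, rttvar, sample, g=0.0):
    """Return (srtt, rttvar, rto) in seconds after one RTT sample.

    g is the clock granularity (assumed 0 here); the RFC's 1 s minimum
    RTO is deliberately left out so the raw estimate stays visible.
    """
    if srtt is None:
        # First measurement: SRTT = R, RTTVAR = R/2.
        srtt, rttvar = sample, sample / 2
    else:
        # Later measurements: update RTTVAR first, using the old SRTT.
        rttvar = (1 - BETA) * rttvar + BETA * abs(srtt - sample)
        srtt = (1 - ALPHA) * srtt + ALPHA * sample
    return srtt, rttvar, srtt + max(g, K * rttvar)

# Hypothetical trace: 20 steady samples of 1.0 s, then one spike of 3.0 s.
srtt = rttvar = None
for _ in range(20):
    srtt, rttvar, rto = rfc2988_update(srtt, rttvar, 1.0)
rto_before_spike = rto

spike = 3.0
srtt, rttvar, rto_after_spike = rfc2988_update(srtt, rttvar, spike)
print(rto_before_spike, rto_after_spike)
```

In this made-up trace RTTVAR has decayed during the steady phase, so the
RTO in force when the spike hits is well below the spiked RTT. The
estimator only catches up once the spiked sample has been absorbed, which
is too late for the segment that produced it.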

[Side note: This example also argues in favor of timing every segment
across bandwidth dominated paths. This is because an increasing RTT is
then reflected sooner in an increased RTO. Problem: How does the sender
know that it is running across a bandwidth dominated path?]

See also section 3.3 in our paper in CCR Vol. 30 No. 3, July 2000. At that
time, we thought it would be better to change rule 5.3 as follows:

(1) REXMT = RTO - 'age of oldest outstanding segment';
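
As a rough sketch, rule (1) can be compared with the plain restart of rule
5.3 like this. The function names, the timestamps, and the clamp at zero
are illustrative assumptions, not part of the rule as stated above.

```python
def rexmt_rule_5_3(rto):
    """Rule 5.3: on an ACK for new data, restart the timer with a full RTO."""
    return rto

def rexmt_rule_1(rto, now, oldest_sent_at):
    """Rule (1): subtract the age of the oldest outstanding segment.

    The clamp at zero is an assumption; it only prevents arming a timer
    with a negative interval.
    """
    age = now - oldest_sent_at
    return max(0.0, rto - age)

# Hypothetical numbers, in seconds.
rto = 3.0
normal = rexmt_rule_1(rto, now=10.0, oldest_sent_at=9.0)    # age 1.0
in_spike = rexmt_rule_1(rto, now=10.0, oldest_sent_at=7.5)  # age 2.5
print(rexmt_rule_5_3(rto), normal, in_spike)
```

With rule 5.3 the oldest segment effectively gets RTO plus its age before
it times out; with rule (1) it gets exactly RTO measured from when it was
sent.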

When timing every packet, we also changed the gains (1/8 for SRTT, and 1/4
for RTTVAR) to the single factor 1/(4 x flightsize). This keeps the
estimators (SRTT and RTTVAR) from losing their memory too quickly. With
that change in place, we made the estimators even less sensitive to RTT
spikes. The effect was that when an RTT spike occurred, 'age of oldest
outstanding segment' became very large, and REXMT became very low (often
too low) due to rule (1).
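
The effect of the flightsize-scaled gain can be sketched for SRTT alone.
This assumes flightsize is counted in segments and leaves RTTVAR out,
since its definition is the one from the CCR paper and is not reproduced
here. The function name and numbers are hypothetical.

```python
def srtt_update(srtt, sample, flightsize):
    """One SRTT update with the single gain 1/(4 x flightsize).

    flightsize is assumed to be the number of outstanding segments (>= 1).
    """
    gain = 1.0 / (4 * flightsize)
    return (1 - gain) * srtt + gain * sample

# A 3.0 s sample against SRTT = 1.0 s moves the estimate less and less
# as the flight grows.
small_flight = srtt_update(1.0, 3.0, flightsize=1)   # gain 1/4
large_flight = srtt_update(1.0, 3.0, flightsize=8)   # gain 1/32
print(small_flight, large_flight)
```

The per-sample gain shrinks as more segments are timed per round trip,
which is what keeps the memory but also dulls the reaction to a spike.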

To solve that, our current RTO looks like this:

(2) RTO = MAX(SRTT, RTT) + MAX((K x RTTVAR), (2 x G)); with RTTVAR defined
as given in the mentioned CCR paper.

When timing every packet, we chose K = 2, and find pretty good timer
performance.
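
Formula (2) written out as a small sketch. RTTVAR is taken as an input
here because its definition is the one from the CCR paper; the function
name and the sample values are assumptions for illustration.

```python
def rto_formula_2(srtt, rtt, rttvar, g, k=2):
    """Formula (2): RTO = MAX(SRTT, RTT) + MAX(K x RTTVAR, 2 x G).

    srtt   - smoothed RTT
    rtt    - most recent RTT sample
    rttvar - variation estimate, computed elsewhere
    g      - clock granularity
    k      - 2 when timing every packet, as in the text
    All times are in seconds.
    """
    return max(srtt, rtt) + max(k * rttvar, 2 * g)

# Hypothetical values: a spiked sample of 3.0 s against SRTT = 1.0 s.
spiked = rto_formula_2(srtt=1.0, rtt=3.0, rttvar=0.1, g=0.01)
steady = rto_formula_2(srtt=1.0, rtt=0.9, rttvar=0.1, g=0.01)
print(spiked, steady)
```

Because of the MAX(SRTT, RTT) term, a single spiked sample raises the RTO
immediately, no matter how slowly SRTT itself moves.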

>[...] The current TCP RTO algorithm is not designed to
>handle this kind of wireless environment. And we all know that the
>current assumption of the RTO calculation is that the round trip route
>does not change much. So it seems to me that we may need a better way
>to deal with this in the RTO calculation.

I doubt that we will find an RTO that will work fine under all conditions.
Instead, I agree with Mark Allman that an RTO should be "reset" to be more
conservative "for a while" after a spurious timeout has been detected. This
is the link to the Eifel algorithm (which at this point is a pure detection
scheme) and a corresponding response algorithm (revert CC state & adapt RTO).

>I have another minor comment about what actually happens after a timeout.
>I believe all modern TCP implementations will not have the false fast
>retransmit after timeout problem mentioned in this thread.

I'm not quite sure what you mean. Please explain this.

///Reiner


