RE: LL ARQ on LD links [was: PILC: prioritization]

From: Reiner Ludwig (rludwig@huginn.CS.Berkeley.EDU)
Date: Wed Jan 27 1999 - 16:20:27 EST


Sanjoy,

At 11:59 AM 1/27/99 , Sanjoy Sen wrote:
>There is perhaps no need for the link layer ARQ mechanism to tell TCP to
>retransmit, since, as Phil has rightly pointed out, it can do the
>retransmission itself. BUT, there is always a need for the link layer to
>tell TCP NOT TO RETRANSMIT when it itself is doing retransmissions. One way
>to achieve the same goal is to stop the link layer retransmissions and let
>TCP recover. Question is when?

The link layer cannot know that unless the TCP sender announces its RTT (or
SRTT), e.g. through an IP header option (to stay compatible with IPsec).

Also, as I wrote in an earlier mail, VJ header compression fails if you start
to toss packets at the link layer. As a result an entire flight of packets is
lost. The bad news is that VJ header compression does not work anyway when
IPsec is run over that link.

> The link layer retransmissions should be continued to the maximum possible
>amount of time in order to get the maximum benefit out of it, making sure
>that TCP's RTT doesn't timeout. But since this RTO estimation can be a
>highly variable quantity (thanks to some previous retransmissions over a
>GSM/CDMA radio link with ~ 100 ms delay and Jacobson's algorithm), there
>seems to be a need of "link-layer awareness for TCP " or "TCP-layer
>awareness for the link layer", particularly for high FER conditions.
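To make the RTO variability mentioned above concrete, here is a minimal
sketch of a Jacobson-style estimator. The gains (1/8, 1/4) and the factor 4
are the commonly used values; the class and function names and the RTT
samples are purely illustrative, and clock granularity, minimum RTO, and
timer back-off are deliberately left out.

```python
class RtoEstimator:
    """Jacobson-style smoothed RTT and mean-deviation estimator (sketch)."""

    def __init__(self, first_rtt, alpha=0.125, beta=0.25, k=4.0):
        self.alpha, self.beta, self.k = alpha, beta, k
        self.srtt = first_rtt          # smoothed RTT
        self.rttvar = first_rtt / 2.0  # smoothed mean deviation

    @property
    def rto(self):
        return self.srtt + self.k * self.rttvar

    def update(self, rtt):
        # Update the deviation first, using the SRTT from before this sample.
        self.rttvar = (1 - self.beta) * self.rttvar + self.beta * abs(self.srtt - rtt)
        self.srtt = (1 - self.alpha) * self.srtt + self.alpha * rtt
        return self.rto


def first_spurious_timeout(samples):
    """Index of the first RTT sample that exceeds the RTO in force, else None."""
    est = RtoEstimator(samples[0])
    for i, rtt in enumerate(samples[1:], start=1):
        if rtt > est.rto:
            return i  # the timer would have fired before this ACK arrived
        est.update(rtt)
    return None
```

On a steady link the deviation term decays, so the RTO converges towards the
SRTT; a sudden delay spike caused by link-layer retransmissions then exceeds
the RTO easily, even though no packet was lost.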

I want to argue strongly against "link-layer awareness for TCP" and also
against "TCP-layer awareness for the link layer". All known solutions that
follow either approach break with IPsec, for example. Despite conventional
wisdom, I strongly believe that a well-engineered, _fully_ reliable wireless
link does not, in general, interfere with TCP's end-to-end error recovery.
Moreover, we are working on a new algorithm that can optionally be
implemented at the TCP sender to further improve performance. The proposed
mechanism eliminates the retransmission ambiguity problem (e.g. by using
timestamps or simply 2 bits in the TCP header each way) and thereby allows
the TCP sender to detect spurious timeouts. In fact, the algorithm uses
spurious timeouts as an implicit cross-layer signal to prevent excessive
spurious retransmissions. Thus, usually only the first unacked packet will
have to be retransmitted unnecessarily, which is the same overhead as any
explicit signal.
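As a rough illustration of the timestamp variant, the sketch below shows how
a sender could tell a late ACK for the original transmission from an ACK for
the retransmission. It assumes the receiver echoes the timestamp of the
segment that triggered the ACK; all names (Ack, SpuriousTimeoutDetector, and
so on) are hypothetical, and timestamp wrap-around and coarse clocks are
ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ack:
    ack_seq: int  # cumulative acknowledgment number
    ts_echo: int  # sender timestamp echoed back by the receiver


class SpuriousTimeoutDetector:
    """Resolve the retransmission ambiguity with echoed timestamps (sketch)."""

    def __init__(self):
        self.retx_seq: Optional[int] = None  # first byte of the retransmitted segment
        self.retx_ts: Optional[int] = None   # timestamp carried by the retransmission

    def on_timeout_retransmit(self, seq, now_ts):
        # Remember only the first retransmission after the timeout.
        if self.retx_ts is None:
            self.retx_seq, self.retx_ts = seq, now_ts

    def on_ack(self, ack):
        """True: timeout was spurious. False: genuine. None: ACK not relevant."""
        if self.retx_ts is None or ack.ack_seq <= self.retx_seq:
            return None
        # An echoed timestamp older than the retransmission can only come
        # from the original transmission, so the timer fired too early.
        spurious = ack.ts_echo < self.retx_ts
        self.retx_seq = self.retx_ts = None
        return spurious
```

For example, if the retransmission carries timestamp 500 and the first ACK
covering it echoes 420, that ACK was triggered by the original packet and the
timeout was spurious.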

I have described the basic idea of that algorithm in a recent mail to this
list. However, as I can't find that mail in the archive
(http://pilc.lerc.nasa.gov/pilc/list/archive/), I have attached the relevant
part below.

///Reiner

>X-Sender: rludwig@pop.cs.berkeley.edu
>X-Mailer: QUALCOMM Windows Eudora Pro Version 4.0.2
>Date: Thu, 21 Jan 1999 13:00:03 -0800
>To: Phil Karn <karn@qualcomm.com>
>From: Reiner Ludwig <rludwig@cs.berkeley.edu>
>Subject: RE: PILC: prioritization
>Cc: border@hns.com, mallman@lerc.nasa.gov, pilc@lerc.nasa.gov,
> vern@ee.lbl.gov, adfalk@mail.hac.com, sdawkins@nortelnetworks.com,
> rludwig@cs.berkeley.edu
>
> [Parts deleted]
>
>Spurious timeouts don't have to be as disastrous as they are. To that end we
>are currently implementing a "spurious timeout detection mechanism". The
>problem with spurious timeouts caused by excessive delays is that the
>retransmission ambiguity problem fools the sender into believing that an
>entire window got lost, so it retransmits the whole window. On top of that,
>these duplicate packets will generate DUPACKs, which will then trigger a
>spurious fast retransmit. The basic idea of our mechanism is to mark the
>packets to allow the sender to discriminate between an ACK for the original
>transmission and an ACK for a retransmission (e.g. using timestamps or,
>better, the 2 bits discussed above). This will not prevent the first
>spurious retransmission; however, once the sender gets the ACK for the
>_original_ transmission, i.e. the "spurious timeout" signal, it will know
>that it did the wrong thing. The sender can then use that "signal" and the
>timing derived from that late ACK to (A) restore the congestion window to
>the value it had before the timeout occurred and (B) update the RTO given
>the new measurement, in order to hopefully prevent further spurious
>retransmissions. BTW, the same approach could also be used to detect
>spurious fast retransmits after packet re-orderings > 3 packets. The latter
>is not new and has, e.g., also been proposed by Sally Floyd in a private
>discussion.
>
> [Rest of the mail deleted]
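
Steps A and B from the quoted mail could look roughly like the sketch below.
This is only an illustration under stated assumptions: the SenderState class
is hypothetical, windows are counted in segments, restoring ssthresh along
with cwnd is an added assumption, and the rule "RTO is at least margin times
the late RTT" stands in for whatever RTO update is actually used.

```python
class SenderState:
    """Toy TCP sender state used to illustrate the response (sketch)."""

    def __init__(self, cwnd, ssthresh, rto):
        self.cwnd, self.ssthresh, self.rto = cwnd, ssthresh, rto
        self._saved = None  # (cwnd, ssthresh) from just before the timeout

    def on_timeout(self):
        # Save the state first so it can be restored if the timeout
        # later turns out to be spurious.
        self._saved = (self.cwnd, self.ssthresh)
        self.ssthresh = max(self.cwnd // 2, 2)
        self.cwnd = 1

    def on_spurious_timeout_signal(self, late_rtt, margin=2.0):
        if self._saved is None:
            return
        # A. Restore the congestion window to its pre-timeout value.
        self.cwnd, self.ssthresh = self._saved
        self._saved = None
        # B. Make the RTO reflect the delay that was just observed
        #    (margin is a hypothetical safety factor).
        self.rto = max(self.rto, margin * late_rtt)
```

With cwnd=16 and RTO=1.0 s before the timeout, a late ACK showing a 2.5 s
round trip would bring cwnd back to 16 and raise the RTO to 5.0 s in this
sketch.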


