Re: Some More Concerns

From: Eric Travis ([email protected])
Date: Thu Apr 02 1998 - 14:51:38 EST


Tom,

I'll address the specifics of your issues shortly, but first:

> Much of the recent traffic in the TCPSAT WG has been
>concerned with the error rates of satellite links. There seems to be a
>school of thought that satellite links can be made to perform as well
>as fiber links by the use of FEC schemes and proper design of the
>RF components. Others have commented that, due to uncontrolled
>natural occurrences, there will be occasional degradation of the
>link.
>I think the discussion on these issues is missing one big point.
>TCP is an end to end protocol, one that is affected by all possible
>degradation in the complete path. To focus in on just one link in the
>chain might be shortsighted. Reducing loss on a satellite link will help
>reduce the overall loss in the path, but it does not necessarily
>eliminate it.

(I'm responding mostly to your statement about "missing one big point")

I think you are misinterpreting or deprecating the significance of the BER
discussion. There's lots of previous context behind the amazing traffic
(content, not volume) of the past couple of weeks. The impact of delays
is a required element within the scope of constructing possible solution
spaces - the difficulty has been getting people to realize that mixed losses
need to hold a similar status.

The traffic of the past couple of weeks is significant; people need to
read what industry is saying; it's a reality check that should not be
ignored or deprecated.

If we address the mixed loss environment for satellites, we are also
doing good things for the wireless and mobile environments. Assuming away
the importance of mixed-losses certainly simplifies dealing with the
remaining issues, but it will yield academic rather than real-world
solutions to problems. Real world perspective is an important thing.

You are correct in pointing out that "focusing on only one link in the
chain is shortsighted", but ignoring one significant link is equally
myopic. Even one false congestion event in a long-fat path will have
disastrous effects on a connection's performance.
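To put a rough number on that point, here is a back-of-the-envelope sketch
of what a single false congestion event costs a Reno-style sender; the
window size and RTT values are illustrative assumptions, not measurements:

```python
# Rough cost of a single false congestion event on a long-fat path.
# Assumes Reno-style behavior: cwnd is halved on a loss signal, then
# grows by one segment per RTT in congestion avoidance. The window
# size and RTTs below are illustrative assumptions.

def recovery_time_seconds(cwnd_segments, rtt_seconds):
    """Time to climb from cwnd/2 back to cwnd at one segment per RTT."""
    rtts_needed = cwnd_segments - cwnd_segments // 2
    return rtts_needed * rtt_seconds

# ~66 segments of 1460 bytes keeps a T1-rate GEO pipe (~500 ms RTT) full.
print(recovery_time_seconds(66, 0.5))   # 16.5 s of reduced rate per spurious loss
print(recovery_time_seconds(66, 0.07))  # ~2.3 s on a 70 ms terrestrial path
```

The long-delay path pays for the same spurious signal many times over,
simply because its recovery clock ticks in RTTs.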

Considering the impacts of long-delay paths is important. Attempting to
address the long-delay path without considering the impacts of
mixed-losses is short-sighted.

Now that I've gotten that out of my system:

> If we're still assuming that satellite and fiber are comparable in
>terms of error rates, then the key distinguishing feature of satellites
>(GEO in particular) is the longer latency of the link. We need to fully
>explore how long latency will affect performance in a shared network
>environment where we must assume some losses occur.

If you change the "the key distinguishing feature" to read "a
distinguishing feature" then you are absolutely correct. Sorry if I come
off as being pedantic, but the satellite environment is not a monolithic
entity.

Satellite environments can exhibit significant propagation delays,
but they also can exhibit intermittent connectivity (due to orbital
characteristics, FEC effects or terrain effects), varying propagation
delays, asymmetric paths and large bandwidth-delay products w/o large
delays (big LEOs).
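To make the "large bandwidth-delay products w/o large delays" point
concrete, here is a back-of-the-envelope sketch; the link rates and RTTs
are illustrative assumptions, not measurements of any real system:

```python
# Back-of-the-envelope bandwidth-delay products for two satellite paths.
# The link rates and RTTs below are illustrative assumptions.

def bdp_bytes(link_rate_bps, rtt_seconds):
    """Bandwidth-delay product: bytes that must be in flight to fill the pipe."""
    return link_rate_bps * rtt_seconds / 8

# A GEO hop adds roughly 250 ms of one-way propagation delay (~500 ms RTT).
geo_bdp = bdp_bytes(1_544_000, 0.5)     # T1-rate link over GEO
# A big-LEO path can combine a high link rate with a much shorter RTT.
leo_bdp = bdp_bytes(155_000_000, 0.04)  # OC-3-rate link, ~40 ms RTT

print(int(geo_bdp))  # 96500 bytes (~96 KB) in flight to fill a T1 over GEO
print(int(leo_bdp))  # 775000 bytes (~775 KB): a large BDP without large delay
```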

The point is that the satellite environment can exhibit the same
characteristics as fiber, cable-modem, mobile, cellular and wireless
environments. Generalization of the problem space needs to take these
factors into consideration.

Specific comments:

    : TCP assumes all loss is due to congestion. How true is this
    : assumption? How would the use of ECN improve the way that
    : TCP responds to loss and congestion?

    The validity of TCP's default assumption can only be addressed
    on a case-by-case basis. My *guess* would be that the assumption
    is:
       o good while traversing the Internet backbone
       o equally good for wired stub-networks
       o poor for portions of a path that encounter anything RF
         and/or mobile.

    Losses in RF paths can be due to burst-errors or lack of "visibility".
    Mixing paths that have fundamentally different loss characteristics
    might turn out to be a bad thing - but if we can't find a workable
    solution, I'm concerned that we will be forced to abandon end-to-end
    functionality. I'm not willing to sign off on that one just yet.

    I'm not sure that you can expect ECN to *improve* the way
    that TCP responds to loss, but it should help in reducing
    congestion-based loss. Tossing segments simply to signal congestion
    wastes the work it took to get the packet to the bottleneck in the
    first place. This could represent a significant amount of effort or
    use of scarce resources. ECN allows you to signal congestion before
    it becomes bad enough that tossing packets is necessary - this is a
    HUGE win especially for my environments.

    An ECN signal should be treated just like a packet loss - reduce your
    transmission rate. The expected benefit is that it allows signaling
    before packet drops are necessary at the bottleneck. If this works
    out, you get shorter queues at the bottleneck and less incidence of
    congestion-based loss.
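    The ECN response described above can be sketched as follows; this is
    a minimal illustration, not any particular stack's implementation,
    and the class name, initial values, and halving policy are assumptions:

```python
# Minimal sketch of a sender treating an ECN signal like a loss event,
# as described above: cut the rate, but keep the segment (no retransmit).
# The class name, initial values, and halving policy are assumptions,
# not any particular TCP implementation.

class EcnAwareSender:
    def __init__(self, cwnd=32, ssthresh=64):
        self.cwnd = cwnd          # congestion window, in segments
        self.ssthresh = ssthresh  # slow-start threshold, in segments

    def on_ack(self, ecn_echo=False):
        if ecn_echo:
            # Congestion was signaled before any packet had to be dropped:
            # back off exactly as a loss would force us to, minus the drop.
            self.ssthresh = max(self.cwnd // 2, 2)
            self.cwnd = self.ssthresh
        elif self.cwnd < self.ssthresh:
            self.cwnd += 1              # slow start: +1 segment per ACK
        else:
            self.cwnd += 1 / self.cwnd  # congestion avoidance: ~+1 per RTT

sender = EcnAwareSender()
sender.on_ack(ecn_echo=True)
print(sender.cwnd)  # 16: halved without wasting the work of a dropped packet
```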

    The faster you can signal impending congestion, the more
    effective avoidance will be - so are we talking ECN via
    marking, or via something like Source Quench? Marking will
    take close to an RTT to propagate to the source - if
    the marked segment is not dropped further downstream on the
    way to the destination. A Source Quench should propagate more
    quickly (though it is also unreliable), but it is more expensive
    to generate than a packet marking.

    For long-delay environments with large bandwidth-delay products,
    Source Quench seems like it might be worthwhile. The acknowledged
    problem with Source Quench (the cost of generating quenches at an
    already stressed router) might be reduced if the router employs
    RED and generates the quench as its notification.
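    As a rough illustration of RED-style early notification, here is a
    toy sketch; the thresholds, the maximum probability, and the use of
    a precomputed average queue length are all simplifying assumptions:

```python
import random

# Toy sketch of RED-style early notification as suggested above:
# rather than waiting for the queue to overflow, the router signals
# (marks, or quenches) with rising probability as the average queue
# grows. The thresholds and max probability are illustrative
# assumptions; real RED also maintains an EWMA of the queue length.

MIN_TH, MAX_TH, MAX_P = 5, 15, 0.1

def red_decision(avg_queue_len):
    """Return 'pass', 'notify' (mark or quench), or 'drop'."""
    if avg_queue_len < MIN_TH:
        return "pass"
    if avg_queue_len >= MAX_TH:
        return "drop"
    # Notification probability ramps linearly between the two thresholds.
    p = MAX_P * (avg_queue_len - MIN_TH) / (MAX_TH - MIN_TH)
    return "notify" if random.random() < p else "pass"

print(red_decision(3))   # pass: queue is comfortably short
print(red_decision(20))  # drop: queue past the hard threshold
```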

    In environments where one can detect periods of corruption, it
    should be possible to provide explicit notification of these
    corruption events. This information can be exploited in a
    manner similar to ECN.

    : Is loss that IS caused by congestion distributed fairly? What
    : about non-TCP streams, are they reducing their throughput when
    : congestion is indicated? Some of the research on this issue
    : seems to indicate the answer is NO in both cases. What can be
    : done to correct this? Is RED the answer?

    How do non-TCP streams detect or signal congestion? I'd guess that
    the reason that many (not all, i.e., RTP) non-TCP streams are not
    using TCP is to avoid reacting to congestion-based losses. The
    distinction here is streams versus discrete transactions, and it is
    difficult to make transactions obey congestion avoidance. There probably
    needs to be a stick to punish anti-social flows. RED seems like a
    good part of an answer, but charging for actual bandwidth usage
    is probably the only effective means of forcing cooperative behavior.
    I'd like to be mistaken about this though.

    : TCP recovers from congestion at a rate that is related to the RTT.
    : This would seem to give the advantage to low latency paths when
    : shared resources are in use. I have seen some suggestion for
    : changing the congestion recovery scheme of TCP so that it is no
    : longer linked to RTT but would increase by some constant rate. I
    : haven't seen any research on what the negative implications of
    : this might be.

    Lower latency paths *do* have a distinct advantage when contending
    for shared resources. Like paths that have very different loss
    characteristics, maybe sharing resources is a bad thing... same
    concerns apply about losing end-to-end viability as above.

    Can you provide a pointer here to this suggestion to which you are
    referring? It seems to me that increasing the congestion
    window at some constant rate would require some knowledge by the
    sender of what an acceptable (stable?) operating point should be...
    The nice thing about basing growth on RTT is that you have some
    clue as to cause and effect of cwnd growth. I'm just curious on
    the specifics of the proposal.
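    The low-latency advantage above can be sketched with a back-of-the-
    envelope calculation; the function name and the path RTTs are
    illustrative assumptions:

```python
# Back-of-the-envelope sketch of why RTT-driven growth favors short
# paths: both senders add one segment per RTT in congestion avoidance,
# but the short-RTT sender gets far more RTTs per unit time. The RTT
# values below are illustrative assumptions.

def cwnd_after(seconds, rtt, initial_cwnd=1):
    """Congestion-avoidance growth: +1 segment each RTT, no losses."""
    return initial_cwnd + int(seconds / rtt)

print(cwnd_after(10, 0.05))  # 201 segments on a 50 ms terrestrial path
print(cwnd_after(10, 0.55))  # 19 segments over GEO in the same 10 seconds
```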

   : The only way to improve throughput without dealing with the
   : losses is to lower the latency of the link. This is not possible
   : on a GEO satellite without seriously breaking a few laws of physics.
   : What can be done is to give the APPEARANCE of lower latency
   : by spoofing acks. This approach has its limits though, it only
   : works well when the traffic is asymmetrical, it does nothing for
   : interactive applications, and may have a serious impact on
   : security. These need to be well documented so that those who
   : choose the spoofing route are fully aware of the risks as well as
   : the benefits.

   I agree with the notion that spoofs need to be documented for full
   disclosure and the requisite "Mr. Yuck!" sticker with regard to their
   risks - this is precisely the motivation behind the discussion at
   Tuesday's TCPSAT session in LA.

   In your particular example, there is an alternative to spoofing -
   splitting the end-to-end path length through use of a proxy. This
   really does reduce the latency in at least one of your feedback loops;
   the one whose path you know you can't control - the Internet.

> Let me finish by saying that the purpose of this message is not to
>criticize, but to generate some discussion on some issues that might
>need more attention. If any of my assumptions are way off base
>please let me know, but please don't send me mail telling me to
>check this RFC and that Research paper. I have probably already
>seen them and I am trying to generalize the issues, not detail the
>specifics.

Understood. Please don't take my response the wrong way either; I like
your issues and think they're important. They each surface periodically
on this list and then die down. I'm attempting to capture these ideas
in a document hoping that will help focus the discussion.

In summary, generalizing the issues is a good thing - but remember, they
are applicable to more than just the satellite environment. At the same
time, I want to make sure that, in the process of generalization, no
important factors are filtered out of the discussion.

The problem is *not* one-dimensional; treating it as such won't provide as
robust a solution as possible. We need to address TCP performance over
*all* the emerging environments.

Regards,

Eric



This archive was generated by hypermail 2b29 : Mon Feb 14 2000 - 16:14:38 EST