Re: SCPS-TP, TCP and Rate-Based Protocol Evaluation

From: Eric Travis ([email protected])
Date: Fri Jun 21 2002 - 17:47:00 EDT


    William Ivancic wrote:

    >>2 Rate-based protocols are advisable for *any* environment
    >>where bandwidth reservation is practical and available.
    >
    >Will include as: Rate-based protocols may be applicable for *any*
    >environment where bandwidth reservation is practical and available

    I know that I'm being a nudge but - within a *space environment*
    rate-based protocols really (honestly) are always advisable from a
    performance perspective:

       The biggest performance hit one takes from a well tuned
       TCP operating in a benign environment (no congestion,
       no corruption) is the cost of the initial bandwidth estimation.

       There may indeed be stable/safe alternatives to slow-start
       (an excellent research area), but they can never be cheaper
       to any flow than a priori knowledge (read: not having to probe
       at all).
     
       Once a source of loss is introduced, the benefits of a
       rate-based approach become even more pronounced.

    Hence, when one has bandwidth reservation available and timeliness
    of delivery is critical (pretty much a given if one is being as
    wasteful as providing dedicated bandwidth), then rate-based
    mechanisms will *always* be preferable.
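
    To put a rough number on that initial-estimation cost, here is a
    back-of-the-envelope sketch (the 100 Mbps / 500 ms / 100 MB figures
    are illustrative assumptions, not measurements from the tests):

       # Idealized comparison: slow-start ramp-up versus sending at a
       # known, reserved rate from the very first segment.
       RATE_BPS = 100e6              # assumed reserved bandwidth
       RTT = 0.5                     # assumed round-trip time (500 ms)
       MSS_BITS = 1460 * 8           # bits per segment
       FILE_BITS = 100 * 2**20 * 8   # a 100 MB transfer

       # Slow-start: cwnd doubles each RTT until it covers the
       # bandwidth-delay product (delayed ACKs, ssthresh etc. ignored).
       bdp_segments = RATE_BPS * RTT / MSS_BITS
       cwnd, sent, rtts = 1, 0.0, 0
       while sent < FILE_BITS and cwnd < bdp_segments:
           sent += cwnd * MSS_BITS
           cwnd *= 2
           rtts += 1
       slow_start_time = rtts * RTT + max(FILE_BITS - sent, 0) / RATE_BPS

       # A priori knowledge: no probing, just stream at the reserved rate.
       paced_time = FILE_BITS / RATE_BPS

       print(f"slow-start spends ~{rtts} RTTs ramping up")
       print(f"idealized slow-start transfer: {slow_start_time:.1f} s")
       print(f"rate-paced transfer:           {paced_time:.1f} s")

    Even with nothing going wrong, the probing alone costs several
    seconds on a transfer of that size; with loss in the picture the
    gap only widens.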

    >>3 The performance problems experienced with the rate-based
    >> protocols was most likely due to accidental misconfiguration
    >> of the protocols, causing them to congest the network.
    >
    >I wish this were the case, but we configured and tuned these to
    >the best of our capabilities and consulted with the developers
    >(Mitre, NRL, etc...). The problem appears to be (at least for MDP)
    >that the receiving system choked. It can not process fast enough
    >due to the algorithm and implementation of the protocol.

    OK, but the proper tuning is only as good as the knowledge of the
    effective end-to-end capacity of the path.

    When examined together, the performance graphs (slides 12, 14, 15,
    18 and 19) still strongly suggest that the intermediate buffering
    in the path can only support flows of 35-40Mbps when subjected to
    a 500ms rtt.

    In fact, the measurements from the in-kernel TCP (slide 12,
    in particular the 100MB file results) are what are causing my
    stubbornness in believing that the capacity was actually somewhat
    lower than the assumed 100Mbps.
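
    For reference, the bandwidth-delay product arithmetic behind that
    suspicion (nothing here beyond the rates and rtt already under
    discussion):

       # Bandwidth-delay product at a 500 ms round-trip time.
       RTT = 0.5

       def bdp_bytes(rate_mbps, rtt=RTT):
           return rate_mbps * 1e6 * rtt / 8

       print(bdp_bytes(100))   # 6,250,000 bytes (~6.25 MB) to fill 100 Mbps
       print(bdp_bytes(35))    # ~2.19 MB
       print(bdp_bytes(40))    # 2.50 MB

    A buffering (or window) ceiling somewhere around 2.2 to 2.5 MB in
    the path would produce exactly the 35-40 Mbps plateau the graphs
    show.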

    If you've gone through footnote one, items 1 & 2, verified that the
    end-to-end path was capable of supporting flows greater than
    35-40 Mbps with an rtt of 500ms, and the tcpdump traces indeed
    verify that all the packets were actually making it to the
    receiving host's side of the Cisco 7100, then that's just the way
    it is...
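
    For what it's worth, a capture along these lines on the receiving
    host is usually enough to settle that last point (the interface
    name and sender address are placeholders):

       tcpdump -i eth0 -s 68 -w recv_side.pcap host <sender-address>

    Comparing the sequence-number progression in that trace against a
    matching capture on the sending side shows whether segments are
    being dropped in the path or inside the receiving host.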

    But - out of curiosity, if you were to hook the two test hosts
    together with just a cross-over cable (bypassing the routers and
    the SX/14), do they still choke at 35-40 Mbps?
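
    A memory-to-memory tester makes that a five-minute experiment; for
    example (iperf is just one option, not something used in the tests,
    and the window size below is an assumption):

       # on the receiving host
       iperf -s -w 4M

       # on the sending host
       iperf -c <receiver-address> -w 4M -t 30

    If the hosts still top out around 35-40 Mbps back-to-back, the
    bottleneck is in the hosts; if they run at wire speed, it is in
    the path.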

    I confess that the reason I find the data troubling is that I
    routinely run userspace protocols that can achieve about 94Mbps of
    throughput on now-modest hardware (333MHz Pentium II) using the
    same algorithms (and even some of the same code) as in the SCPS-TP
    implementation you tested, so I'm slightly reluctant to believe
    that the bottleneck was ~35Mbps on hosts capable of sinking an
    in-kernel 65Mbps TCP connection (hosts that should have been able
    to achieve ~94Mbps over 100baseT).

    The cost of crossing the kernel/userspace boundary is really not
    that high any more. Userspace protocols can run at very high speeds,
    so the fact that the MDP and SCPS implementations couldn't handle
    the load is surprising.
     
    >>4 With proper configuration, all the rate-based protocols
    >> could have performed better in the errored environments.
    >>
    >> - An increase in error frequency has the same effect as
    >> increasing the rtt, and hence changes the bandwidth/
    >> delay product
    >>
    >> - The measured curves illustrate the effects of window
    >> limiting rather than the expected error performance
    >> curve decay
    >>
    >> - The buffer sizes used on the rate-based TCPs and MDP
    >> should have been re-tuned to reflect the effective
    >> bandwidth-delay product
    >
    >
    >Same comment as for item 4. We tuned these for best performance
    >at Zero BER. It is impractical to expect one to constantly, manually
    >tune a protocol for a given time-varying unknown link condition.
    >We believe our tuning was valid for today's protocols. That being
    >said, I believe work needs to be done for autotuning as I believe it
    >is impractical and unreasonable to expect the general users of systems
    >to possess the knowledge to tune each protocol for a particular
    >environment. Obviously a research area with lots of work being done
    >here.

    I agree, the tuning was indeed valid for the Zero BER case...

    However, if one of the goals of the effort was to measure where the
    protocols break, then it really makes sense to try not to artificially
    limit their performance - or at least to note the reason(s) for the
    performance degradation...

    Practically speaking, if one can expect the end-users to have *some*
    knowledge of the end-to-end path (say, tuning for a 500ms rtt) then at
    least in *space environments* it is probably reasonable for them to
    also know the expected error characteristics of the path...

    The mission folks have this data readily available.

    Now, I absolutely agree that it is indeed unreasonable to expect a
    general user to be able to - or be willing to - optimally tune their
    protocols (and frankly, that is part of our business model);

    To stay on point, as the space environment is more managed
    (and likely to stay that way for some time) than a general user
    environment, the expected error characteristics of the space-link
    will be known well in advance and the hosts *can* be tuned
    appropriately.

    Auto-tuning is enticing (we've experimented a bit with it and our
    'terrestrial TCP over satellite paths' customers want it, so we still
    have notions of adding it to our code-base someday), but I have
    lingering doubts as to whether any auto-tuning mechanism can tune
    fast enough to still be relevant for any particular flow or valid
    across multiple time-disjoint flows.

    In the real world I always advise folks to tune their rate-based
    buffers for the expected *worst case* error conditions (at least
    where the rate-based protocols are using a flow control window to
    prevent overwhelming the receiver); it is perfectly safe to
    over-provision those buffers.
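
    Concretely, that tuning is a one-time buffer calculation; a minimal
    sketch, with the rate, rtt and worst-case headroom all assumed for
    illustration:

       import socket

       RATE_BPS = 100e6        # reserved rate (assumed)
       RTT = 0.5               # nominal round-trip time (assumed)
       WORST_CASE_FACTOR = 4   # assumed headroom for loss/retransmission delay

       bdp = int(RATE_BPS * RTT / 8)       # ~6.25 MB at zero loss
       bufsize = bdp * WORST_CASE_FACTOR   # over-provisioned for the worst case

       sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
       # Over-provisioning is safe; under-provisioning turns the flow
       # control window itself into the bottleneck.  (Many systems cap
       # these values unless the system-wide limits are raised.)
       sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, bufsize)
       sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, bufsize)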

    >>6 Even with equal performance, the deployment of a
    >> rate-based TCP is more desirable than the deployment
    >> of MDP or other non-TCP protocols when unicast data
    >> delivery is the goal.
    >>
    >> - The rate-based TCP is a sending-side only modification
    >> - All "standard" TCP applications can be deployed
    >> without modification
    >
    >Well, I don't consider a pure rate-based, non-congestion-control
    >protocol to be TCP, for reasons I previously stated - even though the
    >protocol header is the same as TCP. However, an in-kernel reliable
    >rate-based protocol may be more desirable for unicast data than an
    >application level rate-based protocol. There is definitely room
    >for debate here. This is a philosophical debate. I see merit on
    >both sides and I don't really care to carry on such a debate here.
    >IMHO, neither answer is right or wrong and both are right and wrong
    >depending on the situation.
    >
    >I'll consider this, but I definitely cannot accept it as worded.

    Not to be a persistent pain, but RFC 793 is still the root definition
    of TCP, and it is not a "congestion control protocol". I'm *not*
    saying that is a good thing, just that it is.

    A rate-paced TCP interoperates perfectly with any other implementation
    of TCP (paced or not). The protocol definition covers the protocol
    header format, the interaction of the pairwise state machines and
    the end-to-end semantics; the behavior of the bits over the ether
    in response to external stimulus is something else.

    Perhaps it would be difficult to accept a purely rate-based TCP
    as a "TCP friendly" or "congestion friendly" protocol, but it
    is still TCP.

    Remember, a purely rate-paced system *does* have congestion control
    but it is only suitable for use over bandwidth reserved paths:

       doing slow-start just makes no sense in these environments,
       and if you experience loss over your reserved paths, you
       probably don't want to trigger a congestion response.

    However, the important point is that it is better in general
    to use a protocol with TCP's end-to-end semantics rather than
    having to write new applications for new protocols. Having
    a single application that works over a reserved bandwidth
    deployment or a shared resource environment is preferable
    to having two different applications - and needing to know
    which one to use depending on who you are interacting with...

    The obvious advantage for a rate-paced TCP is that the receiving
    host need not even be cognizant that its peer is rate-paced.
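
    A minimal sketch of that asymmetry (addresses, rate and the crude
    sleep-based pacing are all stand-ins for illustration): the sender
    shapes its writes, the receiver is a completely ordinary TCP
    receiver.

       import socket, time

       RATE_BPS = 50e6      # assumed reserved rate
       CHUNK = 64 * 1024    # bytes handed to the socket per write

       def paced_send(sock, data, rate_bps=RATE_BPS):
           # Sender-side-only change: shape the timing of the writes;
           # nothing about the wire format or the peer changes.
           interval = CHUNK * 8 / rate_bps
           for off in range(0, len(data), CHUNK):
               sock.sendall(data[off:off + CHUNK])
               time.sleep(interval)   # a real stack paces per segment

       def plain_receive(sock):
           # The receiver neither knows nor cares that its peer paces.
           while True:
               buf = sock.recv(65536)
               if not buf:
                   break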

    As for user-space versus kernel-space implementation, the real
    requirement is for multiple entities to share the same token-bucket(s).
    The cause of your multi-streams testing grief was that the SCPS
    Reference Implementation really wasn't built to support that - it
    was originally for a spacecraft where we were providing the OS...
    It wasn't expected to be used that way - even for experimentation.
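
    The sharing requirement itself is not exotic; a single token bucket
    that every connection draws from is enough in principle (a
    from-scratch sketch, not how the Reference Implementation is
    structured):

       import threading, time

       class SharedTokenBucket:
           # One bucket, many flows: a caller blocks until its send fits
           # under the reserved aggregate rate.  burst_bytes must be at
           # least as large as the largest single write.
           def __init__(self, rate_bps, burst_bytes):
               self.rate = rate_bps / 8.0      # bytes per second
               self.capacity = burst_bytes
               self.tokens = burst_bytes
               self.stamp = time.monotonic()
               self.lock = threading.Lock()

           def consume(self, nbytes):
               while True:
                   with self.lock:
                       now = time.monotonic()
                       self.tokens = min(self.capacity,
                                         self.tokens + (now - self.stamp) * self.rate)
                       self.stamp = now
                       if self.tokens >= nbytes:
                           self.tokens -= nbytes
                           return
                       shortfall = nbytes - self.tokens
                   time.sleep(shortfall / self.rate)

       # Every flow on the host calls bucket.consume(len(chunk)) before
       # writing, so the aggregate never exceeds the reserved rate.
       bucket = SharedTokenBucket(rate_bps=50e6, burst_bytes=64 * 1024)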

    Generally, if you are using the SCPS Reference Implementation
    and want/need to support multiple applications, the best way is
    to configure it as a transparent gateway.

    Eric


