Network Working Group                                       Matt Mathis
INTERNET-DRAFT                         Pittsburgh Supercomputing Center
Expiration Date: Aug 1999                                      Feb 1999

                     TReno Bulk Transfer Capacity

                  < draft-ietf-ippm-treno-btc-03.txt >

Status of this Document

This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups.  Note that
other groups may also distribute working documents as
Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time.
It is inappropriate to use Internet-Drafts as reference material or
to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Abstract:

TReno is a tool to measure Bulk Transport Capacity (BTC) as defined
in [ippm-btc-framework].  This document specifies the details of the
TReno algorithm required by the BTC framework document.

2. Introduction:

This memo defines a Bulk Transport Capacity (BTC) based on the TReno
(``tree-no'') diagnostic [Mathis97a].  It builds on notions
introduced in the BTC framework document [ippm-btc-framework] and
the IPPM Framework document, RFC 2330 [@@]; the reader is assumed to
be familiar with both documents.

The BTC framework document defines pure Congestion Avoidance
Capacity (CAC) as the data rate (bits per second) of the Congestion
Avoidance algorithm, subject to the restriction that the
Retransmission Timeout and Slow-Start algorithms are not invoked.
In principle a CAC metric would be an ideal BTC metric, but there
are rather substantial difficulties in using it as such.  The
Self-Clocking of the Congestion Avoidance algorithm can be very
fragile, depending on the specific details of the Fast Retransmit,
Fast Recovery or other advanced recovery algorithms.  When TCP
loses its Self-Clock, it is reestablished through a retransmission
timeout and Slow-Start.  These algorithms nearly always take more
time than Congestion Avoidance would have taken.

The TReno program implements BTC, CAC and ancillary metrics.  The
ancillary metrics are designed to instrument all network events that
might cause discrepancies between an ideal CAC metric and the TReno
BTC, other BTC metrics, or real TCP implementations.
We use this multiple-metrics approach because the CAC metric is more
suitable for analytic modeling, while the BTC metrics are more
suited to applied measurement.  We believe that future research will
lead to a strong analytic framework (A-frame) [ippm-btc-framework]
that will result in understanding the relationships between CAC
metrics and other metrics, including simple metrics (delay, loss) as
well as the various different BTC metrics and TCP implementations.

3. The TReno BTC Definition

3.1. Metric Name:

TReno-Type-P-Bulk-Transfer-Capacity

3.2. Metric Parameters:

+ Src, the IP address of a host

+ Dst, the IP address of a host

+ Initial Maximum Segment Size

+ a test duration

+ T, a time

3.3. Metric Units:

Bits per second

3.4. Definition:

The average data rate attained by the TReno program over the path
under test.

3.5 Congestion Control Algorithms

The BTC framework document [ippm-btc-framework] makes the
observation that the standard specifying congestion control
algorithms [RFC2001.bis] allows more latitude in their
implementation than is appropriate for a metric.  Some of the
details of the congestion control algorithms that are left to the
discretion of the implementor must be fully specified in a metric.

3.5.1 Congestion Avoidance details

TReno computes the window size in bytes.  Each acknowledgment opens
the congestion window (cwnd) by MSS*MSS/cwnd bytes.  The actual
number of outstanding bytes in the network is always an integral
number of segments such that the total size is less than or equal to
cwnd.

@@@ The framework needs to require that delayed ACK emulation be
specified.

When a loss is detected the window is reduced using an algorithm
that sends one segment per two acknowledgments for exactly one
round trip (as determined by sequence numbers).
This reduces the window to exactly half of the data that was
actually held by the network at the time the first loss was
detected.  This algorithm, called Rate-Halving, is described in
detail in a separate technical note [facknote].  The new cwnd will
be (old_cwnd - loss)/2.

The technical note also describes an additional group of
algorithms, collectively called bounding parameters, that assure
that rate halving always arrives at a reasonable congestion window,
even under pathological conditions.  The bounding parameter
algorithms have no effect on TReno under normal conditions.  If the
bounding parameters are invoked, they are instrumented as an
exceptional network event.

One of the bounding parameters is to set ssthresh to 1/4 of the
pre-recovery cwnd.  Thus recovery normally ends with cwnd larger
than ssthresh, so TReno does not do a one-segment slow-start as
permitted by RFC2001.  However, if more than half a window of data
was lost, rate halving can arrive at a new cwnd which is smaller
than ssthresh, resulting in a slow-start up to ssthresh (which would
be 1/4 the prior value of cwnd).

3.5.2 Retransmission Timeouts

The current version of TReno does not include an accurate model of
the TCP retransmission timer.  Under nearly all normal conditions
the timers in TReno are much more conservative than real TCP
implementations.  TReno takes the view that timeouts indicate a
failure to attain a CAC measurement, which indicates an abnormality
in the network that should be diagnosed.  TReno does not experience
timeouts unless an entire window of data is lost.

3.5.3 Slow-Start

TReno invokes Slow-Start if cwnd is equal to or less than ssthresh.
Unlike most TCP implementations this condition is not normally true
at the end of recovery.
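The window arithmetic described in sections 3.5.1 through 3.5.3 can be
sketched as follows.  This is an illustrative Python sketch, not the
TReno source; the function names and the 1460-byte MSS are assumptions
made for the example.

```python
MSS = 1460  # assumed segment size in bytes (not specified here)

def open_window(cwnd, mss=MSS):
    """Congestion Avoidance: each ACK opens cwnd by MSS*MSS/cwnd bytes."""
    return cwnd + mss * mss / cwnd

def rate_halve(old_cwnd, lost_bytes):
    """Rate-Halving: after one round trip of sending one segment per
    two ACKs, the new cwnd is (old_cwnd - loss)/2, i.e. half the data
    actually held by the network when the first loss was detected.
    The bounding parameter sets ssthresh to 1/4 of the pre-recovery
    cwnd, so recovery normally ends with cwnd larger than ssthresh."""
    new_cwnd = (old_cwnd - lost_bytes) / 2
    ssthresh = old_cwnd / 4
    return new_cwnd, ssthresh

def in_slow_start(cwnd, ssthresh):
    """TReno slow-starts only while cwnd is <= ssthresh."""
    return cwnd <= ssthresh
```

With these rules, losing a single segment out of a 14600-byte window
leaves cwnd (6570 bytes) above ssthresh (3650 bytes), so no slow-start
occurs; only a loss of more than half the window drops cwnd below
ssthresh, matching the text above.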
3.5.4 Advanced Recovery Algorithms

The algorithm used by TReno to emulate the TCP reassembly queue
naturally emulates SACK [RFC2018] with the Forward Acknowledgment
Algorithm [Mathis96] as updated by [facknote].

3.5.5 Segment Size

TReno can dynamically discover the correct Maximum Segment Size
through path MTU discovery.  A smaller MTU can be explicitly
selected.

3.6 Ancillary results:

@@@ expand

- Statistics over the entire test
  (data transferred, duration and average rate)
- Statistics over the Congestion Avoidance portion of the test
  (data transferred, duration and average rate)
- Path property statistics (MTU, minimum RTT, maximum congestion
  window during Congestion Avoidance and during Slow-Start)
- Direct measures of the analytic model parameters (number of
  congestion signals, average RTT)
- Indications of which TCP algorithms must be present to attain
  the same performance.
- The estimated load/BW/buffering used on the return path
- Warnings about data transmission abnormalities
  (e.g. packets out-of-order, events that cause timeouts)
- Warnings about conditions which may affect metric accuracy
  (e.g. insufficient tester buffering)
- Alarms about serious data transmission abnormalities
  (e.g. data duplicated in the network)
- Alarms about internal inconsistencies of the tester and events
  which might invalidate the results.
- IP address/name of the responding target.
- TReno version.

3.7 Manual calibration checks:

The following discussion assumes that the TReno diagnostic is
implemented as a user-mode program running under a standard
operating system.  Other implementations, such as those in dedicated
measurement instruments, can have stronger built-in calibration
checks.

3.7.1 Tester performance

Verify that the tester and target have sufficient data rates to
sustain the test.
The raw performance (data rate) limitations of both the tester and
target should be measured by running TReno in a controlled
environment (e.g. a bench test).  Ideally the observed performance
limits should be validated by determining the nature of the
bottleneck and verifying that it agrees with other benchmarks of the
tester and target (e.g. that TReno performance agrees with direct
measures of backplane or memory bandwidth, or of other bottlenecks
as appropriate).  Currently no routers are reliable targets,
although under some conditions they can be used for meaningful
measurements.  When testing between a pair of modern computer
systems at a few megabits per second or less, the tester and target
are unlikely to be the bottleneck.

TReno may be less accurate at average rates above half of the known
tester or target limits.  This is because during the initial
Slow-Start TReno needs to send bursts at twice the average data
rate.

Likewise, if the link to the first hop is not more than twice as
fast as the entire path, some of the path properties, such as the
maximum congestion window during Slow-Start, may reflect the
tester's link interface and not the path itself.

3.7.2 Tester Buffering

Verify that the tester and target have sufficient buffering to
support the window needed by the test.

If they do not have sufficient buffer space, then losses at their
own queues may contribute to the apparent losses along the path.
There are several difficulties in verifying the tester and target
buffer capacity.  First, there are no good tests of the target's
buffer capacity at all.  Second, all validation of the tester's
buffering depends in some way on the accuracy of reports by the
tester's own operating system.
Third, there is the confusing result that under many circumstances
(particularly when there is much more than sufficient average tester
performance) insufficient buffering in the tester does not adversely
impact measured performance.

TReno reports (as calibration alarms) any events in which transmit
packets were refused due to insufficient buffer space.  It reports a
warning if the maximum measured congestion window is larger than the
reported buffer space.  Although these checks are likely to be
sufficient in most cases, they are probably not sufficient in all
cases, and will be the subject of future research.

Note that on a timesharing or multi-tasking system, other activity
on the tester introduces burstiness due to operating system
scheduler latency.  Since some queuing disciplines discriminate
against bursty sources, it is important that there be no other
system activity during a test.  This should be confirmed with
operating-system-specific tools.

3.7.3 Return Path performance

Verify that the return path is not a bottleneck at the load needed
to sustain the test.

In ICMP mode TReno measures the net effect of both the forward and
return paths on a single data stream.  Bottlenecks and packet losses
in the forward and return paths are treated equally.

In traceroute mode, TReno computes and reports the load it
contributes to the return path.  Unlike real TCP, TReno can not
distinguish between losses on the forward and return paths, so
ideally we want the return path to introduce as little loss as
possible.  A good way to test whether the return path has a large
effect on a measurement is to reduce the forward path messages down
to ACK size (40 bytes) and verify that the measured packet rate
improves by at least a factor of two.  [More research is needed.]
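The return-path check above can be sketched as follows.  These are
hypothetical helpers for illustration only, not part of TReno; the
function names are assumptions, and the 40-byte packet size is the
ACK size given in the text.

```python
def return_path_suspect(full_size_pps, ack_size_pps):
    """Apply the heuristic above: if shrinking the forward-path
    messages to ACK size (40 bytes) does not at least double the
    measured packet rate, the return path may be a bottleneck."""
    return ack_size_pps < 2 * full_size_pps

def return_path_load_bps(ack_rate_pps, ack_bytes=40):
    """Rough estimate of the load contributed to the return path:
    one 40-byte message per acknowledgment, converted to bits/s."""
    return ack_rate_pps * ack_bytes * 8
```

For example, a test that measures 1000 packets/s with full-size
messages but only 1800 packets/s with 40-byte messages fails the
factor-of-two check and warrants suspicion of the return path.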
3.8 Discussion:

There are many possible reasons why a TReno measurement might not
agree with the performance obtained by a TCP-based application.
Some key ones include: older TCPs missing key algorithms such as MTU
discovery, support for large windows or SACK, or mis-tuning of
either the data source or sink.  Network conditions which require
the newer TCP algorithms are detected by TReno and reported in the
ancillary results.  Other documents will cover methods to diagnose
the difference between TReno and TCP performance.

People using the TReno metric as part of procurement documents
should be aware that in many circumstances MTU has an intrinsic and
large impact on overall path performance.  Under some conditions the
difficulty in meeting a given performance specification is inversely
proportional to the square of the path MTU.  (e.g. Halving the
specified MTU makes meeting the bandwidth specification 4 times
harder.)

When used as an end-to-end metric TReno presents exactly the same
load to the network as a properly tuned state-of-the-art bulk TCP
stream between the same pair of hosts.  Although the connection is
not transferring useful data, it is no more wasteful than fetching
an unwanted web page with the same transfer time.

References

[Jacobson88] Jacobson, V., "Congestion Avoidance and Control",
Proceedings of SIGCOMM '88, Stanford, CA., August 1988.

[Mathis96] Mathis, M. and Mahdavi, J., "Forward Acknowledgment:
Refining TCP Congestion Control", Proceedings of ACM SIGCOMM '96,
Stanford, CA., August 1996.

[RFC2018] Mathis, M., Mahdavi, J.,
Floyd, S., Romanow, A., "TCP Selective Acknowledgment Options",
RFC 2018, 1996.  Obtain via:
ftp://ds.internic.net/rfc/rfc2018.txt

[Mathis97a] Mathis, M., TReno source distribution.  Obtain via:
ftp://ftp.psc.edu/pub/networking/tools/treno.shar

[Mathis97b] Mathis, M., Semke, J., Mahdavi, J., Ott, T., "The
Macroscopic Behavior of the TCP Congestion Avoidance Algorithm",
Computer Communications Review, 27(3), July 1997.

[RFC2001] Stevens, W., "TCP Slow Start, Congestion Avoidance, Fast
Retransmit, and Fast Recovery Algorithms", RFC 2001.  Obtain via:
ftp://ds.internic.net/rfc/rfc2001.txt

[facknote] Mathis, M. and Mahdavi, J., "TCP Rate-Halving with
Bounding Parameters",
http://www.psc.edu/networking/papers/FACKnotes/current/

Author's Address

Matt Mathis
email: mathis@psc.edu
Pittsburgh Supercomputing Center
4400 Fifth Ave.
Pittsburgh PA 15213