idnits 2.17.1 

draft-ietf-dccp-rfc3448bis-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 19.

  -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
     line 1578.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1589.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1596.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1602.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust Copyright Line does not match the
     current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (4 March 2007) is 6263 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Obsolete informational reference (is this intentional?): RFC 2140
     (Obsoleted by RFC 9040)

  -- Obsolete informational reference (is this intentional?): RFC 2988
     (Obsoleted by RFC 6298)

  -- Obsolete informational reference (is this intentional?): RFC 3448
     (Obsoleted by RFC 5348)


     Summary: 1 error (**), 0 flaws (~~), 1 warning (==), 10 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Internet Engineering Task Force                               M. Handley
2	INTERNET-DRAFT                                 University College London
3	Intended status: Proposed Standard                              S. Floyd
4	Expires: September 2007                                             ICIR
5	                                                               J. Padhye
6	                                                               Microsoft
7	                                                               J. Widmer
8	                                                  University of Mannheim
9	                                                            4 March 2007

11	        TCP Friendly Rate Control (TFRC): Protocol Specification
12	                   draft-ietf-dccp-rfc3448bis-01.txt

14	Status of this Memo

16	    By submitting this Internet-Draft, each author represents that any
17	    applicable patent or other IPR claims of which he or she is aware
18	    have been or will be disclosed, and any of which he or she becomes
19	    aware will be disclosed, in accordance with Section 6 of BCP 79.

21	    Internet-Drafts are working documents of the Internet Engineering
22	    Task Force (IETF), its areas, and its working groups.  Note that
23	    other groups may also distribute working documents as Internet-
24	    Drafts.

26	    Internet-Drafts are draft documents valid for a maximum of six
27	    months and may be updated, replaced, or obsoleted by other documents
28	    at any time.  It is inappropriate to use Internet-Drafts as
29	    reference material or to cite them other than as "work in progress."

31	    The list of current Internet-Drafts can be accessed at
32	    http://www.ietf.org/ietf/1id-abstracts.txt.

34	    The list of Internet-Draft Shadow Directories can be accessed at
35	    http://www.ietf.org/shadow.html.

37	    This Internet-Draft will expire in September 2007.

39	Copyright Notice

41	    Copyright (C) The IETF Trust (2007).

43	Abstract

45	    This document specifies TCP-Friendly Rate Control (TFRC).  TFRC is a
46	    congestion control mechanism for unicast flows operating in a best-
47	    effort Internet environment.  It is reasonably fair when competing
48	    for bandwidth with TCP flows, but has a much lower variation of
49	    throughput over time compared with TCP, making it more suitable for
50	    applications such as streaming media where a relatively smooth
51	    sending rate is of importance.

53	Table of Contents

55	    1. Introduction
56	    2. Conventions
57	    3. Protocol Mechanism
58	       3.1. TCP Throughput Equation
59	       3.2. Packet Contents
60	            3.2.1. Data Packets
61	            3.2.2. Feedback Packets
62	    4. Data Sender Protocol
63	       4.1. Measuring the Segment Size
64	       4.2. Sender Initialization
65	       4.3. Sender behavior when a feedback packet is received
66	       4.4. Expiration of nofeedback timer
67	       4.5. Sending a packet after an idle or data-limited period
68	       4.6. Preventing Oscillations
69	       4.7. Scheduling of Packet Transmissions
70	    5. Calculation of the Loss Event Rate (p)
71	       5.1. Detection of Lost or Marked Packets
72	       5.2. Translation from Loss History to Loss Events
73	       5.3. Inter-loss Event Interval
74	       5.4. Average Loss Interval
75	       5.5. History Discounting
76	    6. Data Receiver Protocol
77	       6.1. Receiver behavior when a data packet is received
78	       6.2. Expiration of feedback timer
79	       6.3. Receiver initialization
80	            6.3.1. Initializing the Loss History after the First Loss
81	            Event
82	    7. Sender-based Variants
83	    8. Implementation Issues
84	    9. Changes from RFC 3448
85	    10. Security Considerations
86	    11. IANA Considerations
87	    12. Acknowledgments
88	    13. Terminology
89	    14. Normative References
90	    15. Informational References
91	    16. Authors' Addresses
92	    Full Copyright Statement
93	    Intellectual Property
94	    NOTE TO RFC EDITOR: PLEASE DELETE THIS NOTE UPON PUBLICATION.

96	     Changes from draft-ietf-dccp-rfc3448bis-00.txt:

98	     * When initializing the loss history after the first
99	       data packet sent is lost or ECN-marked, TFRC uses
100	       a minimum receive rate of 0.5 packets per second.

102	     * For initializing the estimated packet drop rate
103	       for the first loss interval when coming out of slow-start,
104	       it is ok to use the maximum receive rate so far, not just
105	       the receive rate in the last round-trip time.
106	       Feedback from Ladan Gharai.

108	     * General feedback from Gorry Fairhurst:
109	       - Added a reference for TFRC-SP.
110	       - Clarified that R_m is sender's estimate of RTT, as reported
111	         in Section 3.2.1.
112	       - Added a definition of terms.
113	       - Added a discussion of why the initial value of the nofeedback
114	         timer is two seconds, instead of three seconds for the
115	         recommended initial value for TCP's retransmit timer.

117	     * General feedback from Arjuna Sathiaseelan:
118	       - Added more details about sending multiple feedback
119	          packets per RTT.
120	       - Added change to Section 4.3 to use the first feedback
121	          packet, or the first feedback packet after a
122	          nofeedback timer during slow-start, *if min_rate > X*.

124	     * General feedback from Gerrit Renker:
125	       - Changed "delta" to "t_delta".
126	       - Changed X_calc to X_Bps, clarified X.
127	       - Clarified send times in Section 4.7.
128	       - Changed so that tld can be initialized to either 0 or -1.
129	       - Fixed Section 5.5 to say that the most recent loss
130	         interval has weight 1/(0.75*n) *when there have been
131	         at least eight loss intervals*.
132	       - Clarified introduction about fixed-size and variable-size
133	         packets.

135	     * Added more about sender-based variants.
136	       Feedback from Guillaume Jourjon.

138	     * Corrected that the loss interval I_0 includes all transmitted
139	       packets, including lost and marked packets (as defined in Section
140	       5.3 in the general definition.)  Email from Eddie Kohler and
141	       Gerrit Renker.

143	     * Open issue:  Feedback from Ian about problems being limited by
144	       X_recv after a loss event.  There might not be an easy answer.

146	     * Related open issue:  Add Faster Restart to RFC3448bis?  Or not?
147	       From Ian McDonald.

149	     * Open issue: Adopt something like DCCP's Receive Rate Length,
150	       instead of ignoring one feedback packet?  From Eddie Kohler.

152	     * Open issue: Add possible mechanisms for limiting the maximum
153	       burst size?  Using a token bucket size based on the
154	       current rate?  Or not?  Email from Eddie Kohler and Gerrit
155	       Renker.

157	     * Related open issue: To deal with idle periods and the like,
158	       in Section 4.7 say that t_i := max(t_i, t_now - RTT/2), to
159	       limit bursts to RTT/2 packets?  Has anyone implemented this?
160	       Email from Eddie Kohler and Ian McDonald.

162	     * Not done:  I didn't add a minimum value for the nofeedback
163	       timer.  (Why would a nofeedback timer need to be bigger
164	       than max(4*R, 2*s/X)?)  Email discussing pros and cons from
165	       Arjuna.

167	     * Not addressed yet: Email thread on "RFC 3448, 4.4:  Modifying
168	       X_recv if p = 0 at the time of last feedback".

170	     * Todo: Update Section 9 on "Changes from RFC 3448" with
171	       changes since draft-floyd-rfc3448bis-00.txt.

173	     Changes from draft-floyd-rfc3448bis-00.txt:

175	     * Name change to draft-ietf-dccp-rfc3448bis-00.txt.

177	     * Specified the receiver's initialization of the feedback timer
178	       when the first data packet doesn't have an estimate of the
179	       RTT.  From feedback from Dado Colussi.

181	     * Added the procedure for sending receiver
182	       feedback packets when a coarse-grained
183	       timestamp is used. From RFC 4243.

185	     Changes from RFC 3448:

187	     * Incorporated changes in the RFC 3448 errata:

189	       -  "If the sender does not receive a feedback report for
190	          four round trip times, it cuts its sending rate in half."
191	          ("Two" changed to "four", for consistency with the rest
192	          of the document.  Reported by Joerg Widmer).

194	       - "If the nofeedback timer expires when the sender does not
195	         yet have an RTT sample, and has not yet received any
196	         feedback from the receiver, or when p == 0,..."
197	         (Added "or when p == 0,", reported by Wim Heirman).

199	       - In Section 5.5, changed:
200	           for (i = 1 to n) { DF_i = 1; }
201	         to:
202	           for (i = 0 to n) { DF_i = 1; }
203	         Reported by Michele R.

205	     * Changed RFC 3448 to correspond to the larger initial windows
206	       specified in RFC 3390.  This includes the following:

208	       - Incorporated Section 5.1 from [RFC4342], saying that
209	         when reducing the sending rate after an idle period, don't
210	         reduce the sending rate below the initial sending rate.

212	       - Change for a datalimited sender:
213	         When the sender has been datalimited, the sender doesn't
214	         let the receive rate limit it to a sending rate less than
215	         the initial rate.

217	       - Small change to slow-start:
218	         Changed so that for the first feedback packet received,
219	         or for the first feedback packet received after an idle
220	         period, the receive rate is not used to limit the
221	         sending rate.  This is because the receiver might not yet
222	         have seen an entire window of data.

224	     * Clarified how the average loss interval is calculated when
225	       the receiver has not yet seen eight loss intervals.

227	     * Discussed more about estimating the average segment size:

229	       - For initializing the loss history after the first loss event,
230	         either the receiver knows the sender's value for s, or
231	         the receiver uses the throughput equation for X_pps and does
232	         not need to know an estimate for s.

234	       - Added a discussion about estimating the average segment size
235	         s in Section 4.1 on "Measuring the Segment Size".

237	       - Changed "packet size" to "segment size".

239	    END OF NOTE TO RFC EDITOR.

241	1.  Introduction

243	    This document specifies TCP-Friendly Rate Control (TFRC).  TFRC is a
244	    congestion control mechanism designed for unicast flows operating in
245	    an Internet environment and competing with TCP traffic [FHPW00].
246	    Instead of specifying a complete protocol, this document simply
247	    specifies a congestion control mechanism that could be used in a
248	    transport protocol such as DCCP (Datagram Congestion Control
249	    Protocol) [RFC4340], in an application incorporating end-to-end
250	    congestion control at the application level, or in the context of
251	    endpoint congestion management [BRS99]. This document does not
252	    discuss packet formats or reliability.  Implementation-related
253	    issues are discussed only briefly, in Section 8.

255	    TFRC is designed to be reasonably fair when competing for bandwidth
256	    with TCP flows, where a flow is "reasonably fair" if its sending
257	    rate is generally within a factor of two of the sending rate of a
258	    TCP flow under the same conditions.  However, TFRC has a much lower
259	    variation of throughput over time compared with TCP, which makes it
260	    more suitable for applications such as telephony or streaming media
261	    where a relatively smooth sending rate is of importance.

263	    The penalty of having smoother throughput than TCP while competing
264	    fairly for bandwidth is that TFRC responds more slowly than TCP to
265	    changes in available bandwidth.  Thus TFRC should only be used when
266	    the application has a requirement for smooth throughput, in
267	    particular, avoiding TCP's halving of the sending rate in response
268	    to a single packet drop.  For applications that simply need to
269	    transfer as much data as possible in as short a time as possible we
270	    recommend using TCP, or if reliability is not required, using an
271	    Additive-Increase, Multiplicative-Decrease (AIMD) congestion control
272	    scheme with similar parameters to those used by TCP.

274	    TFRC is designed for best performance with applications that use a
275	    fixed segment size, and vary their sending rate in packets per
276	    second in response to congestion.  TFRC can also be used, perhaps
277	    with less optimal performance, with applications that don't have a
278	    fixed segment size, but where the segment size varies according to
279	    the needs of the application (e.g., video applications).

281	    Some applications (e.g., some audio applications) require a fixed
282	    interval of time between packets and vary their segment size instead
283	    of their packet rate in response to congestion.  The congestion
284	    control mechanism in this document is not designed for those
285	    applications; TFRC-SP (Small-Packet TFRC) is a variant of TFRC for
286	    applications that have a fixed sending rate in packets per second
287	    but either use small packets, or vary their packet size in response
288	    to congestion.  TFRC-SP will be specified in a later document [TFRC-
289	    SP].

291	    This document specifies TFRC as a receiver-based mechanism, with the
292	    calculation of the congestion control information (i.e., the loss
293	    event rate) in the data receiver rather than in the data sender.
294	    This is well-suited to an application where the sender is a large
295	    server handling many concurrent connections, and the receiver has
296	    more memory and CPU cycles available for computation.  In addition,
297	    a receiver-based mechanism is more suitable as a building block for
298	    multicast congestion control.  However, it is also possible to
299	    implement TFRC in sender-based variants, as allowed in DCCP's
300	    Congestion Control ID 3 (CCID 3) [RFC4342].

302	2.  Conventions

304	    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
305	    "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
306	    document are to be interpreted as described in [RFC2119].

308	    Appendix A gives a list of terms used in this document.

310	3.  Protocol Mechanism

312	    For its congestion control mechanism, TFRC directly uses a
313	    throughput equation for the allowed sending rate as a function of
314	    the loss event rate and round-trip time.  In order to compete fairly
315	    with TCP, TFRC uses the TCP throughput equation, which roughly
316	    describes TCP's sending rate as a function of the loss event rate,
317	    round-trip time, and segment size.  We define a loss event as one or
318	    more lost or marked packets from a window of data, where a marked
319	    packet refers to a congestion indication from Explicit Congestion
320	    Notification (ECN) [RFC3168].

322	    Generally speaking, TFRC's congestion control mechanism works as
323	    follows:

325	    o   The receiver measures the loss event rate and feeds this
326	        information back to the sender.

328	    o   The sender also uses these feedback messages to measure the
329	        round-trip time (RTT).

331	    o   The loss event rate and RTT are then fed into TFRC's throughput
332	        equation, giving the acceptable transmit rate.

334	    o   The sender then adjusts its transmit rate to match the
335	        calculated rate.

337	    The dynamics of TFRC are sensitive to how the measurements are
338	    performed and applied.  We recommend specific mechanisms below to
339	    perform and apply these measurements.  Other mechanisms are
340	    possible, but it is important to understand how the interactions
341	    between mechanisms affect the dynamics of TFRC.

343	3.1.  TCP Throughput Equation

345	    Any realistic equation giving TCP throughput as a function of loss
346	    event rate and RTT should be suitable for use in TFRC.  However, we
347	    note that the TCP throughput equation used must reflect TCP's
348	    retransmit timeout behavior, as this dominates TCP throughput at
349	    higher loss rates.  We also note that the assumptions implicit in
350	    the throughput equation about the loss event rate parameter have to
351	    be a reasonable match to how the loss rate or loss event rate is
352	    actually measured.  While this match is not perfect for the
353	    throughput equation and loss rate measurement mechanisms given
354	    below, in practice the assumptions turn out to be close enough.

356	    The throughput equation we currently recommend for TFRC is a
357	    slightly simplified version of the throughput equation for Reno TCP
358	    from [PFTK98]. Ideally we'd prefer a throughput equation based on
359	    SACK TCP, but no one has yet derived the throughput equation for
360	    SACK TCP, and both simulations and experiments suggest that the
361	    differences between the two equations would be relatively minor.

363	    The throughput equation is:

365	                                 s
366	    X_Bps = ----------------------------------------------------------
367	            R*sqrt(2*b*p/3) + (t_RTO * (3*sqrt(3*b*p/8)*p*(1+32*p^2)))

369	    Where:

371	        X_Bps is the transmit rate in bytes/second.

373	        s is the segment size in bytes.

375	        R is the round trip time in seconds.

377	        p is the loss event rate, between 0 and 1.0: the number of
378	        loss events as a fraction of the number of packets transmitted.

380	        t_RTO is the TCP retransmission timeout value in seconds.

382	        b is the number of packets acknowledged by a single TCP
383	        acknowledgement.

385	    We further simplify this by setting t_RTO = 4*R.  A more accurate
386	    calculation of t_RTO is possible, but experiments with the current
387	    setting have resulted in reasonable fairness with existing TCP
388	    implementations [W00].  Another possibility would be to set t_RTO =
389	    max(4R, one second), to match the recommended minimum of one second
390	    on the RTO [RFC2988].

392	    Many current TCP connections use delayed acknowledgements, sending
393	    an acknowledgement for every two data packets received, and thus
394	    have a sending rate modeled by b = 2.  However, TCP is also allowed
395	    to send an acknowledgement for every data packet, and this would be
396	    modeled by b = 1.  Because many TCP implementations do not use
397	    delayed acknowledgements, we recommend b = 1.

399	    In future, different TCP equations may be substituted for this
400	    equation.  The requirement is that the throughput equation be a
401	    reasonable approximation of the sending rate of TCP for conformant
402	    TCP congestion control.

404	    The throughput equation can also be expressed as

406	    X_Bps =  X_pps * s ,

408	    with X_pps, the sending rate in packets per second, given as

410	                                      1
411	    X_pps =  --------------------------------------------------------
412	            R*sqrt(2*b*p/3) + (t_RTO*(3*sqrt(3*b*p/8)*p*(1+32*p^2)))

414	    The parameters s (segment size), p (loss event rate) and R (RTT)
415	    need to be measured or calculated by a TFRC implementation.  The
416	    measurement of s is specified in Section 4.1, measurement of R is
417	    specified in Section 4.3, and measurement of p is specified in
418	    Section 5. In the rest of this document all data rates are measured
419	    in bytes/second.
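
    As an illustration only, the throughput equation above can be
    sketched in Python.  The function and parameter names are
    hypothetical and not part of this specification; the defaults
    b = 1 and t_RTO = 4*R follow the recommendations in this section,
    and rejecting p = 0 is an assumption of the sketch, since the
    equation is undefined there.

```python
import math


def tfrc_x_pps(R, p, b=1, t_rto=None):
    """Sending rate in packets/second from the throughput equation.

    R     round-trip time in seconds
    p     loss event rate, 0 < p <= 1
    b     packets acknowledged by a single TCP acknowledgement
    t_rto retransmission timeout in seconds (defaults to 4*R)
    """
    if R <= 0:
        raise ValueError("R must be positive")
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    if t_rto is None:
        t_rto = 4 * R  # simplification recommended in the text
    denom = (R * math.sqrt(2 * b * p / 3)
             + t_rto * (3 * math.sqrt(3 * b * p / 8) * p * (1 + 32 * p ** 2)))
    return 1.0 / denom


def tfrc_x_bps(s, R, p, b=1, t_rto=None):
    """Sending rate in bytes/second: X_Bps = X_pps * s."""
    return s * tfrc_x_pps(R, p, b, t_rto)
```

    The alternative setting mentioned above can be tried by passing
    t_rto=max(4*R, 1.0) explicitly.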

421	3.2.  Packet Contents

423	    Before specifying the sender and receiver functionality, we describe
424	    the contents of the data packets sent by the sender and feedback
425	    packets sent by the receiver.  As TFRC will be used along with a
426	    transport protocol, we do not specify packet formats, as these
427	    depend on the details of the transport protocol used.

429	3.2.1.  Data Packets

431	    Each data packet sent by the data sender contains the following
432	    information:

434	    o   A sequence number. This number is incremented by one for each
435	        data packet transmitted.  The field must be sufficiently large
436	        that it does not wrap, causing two different packets with the
437	        same sequence number to be in the receiver's recent packet
438	        history at the same time.

440	    o   A timestamp indicating when the packet is sent. We denote by
441	        ts_i the timestamp of the packet with sequence number i.  The
442	        resolution of the timestamp should typically be measured in
443	        milliseconds.
444	        This timestamp is used by the receiver to determine which losses
445	        belong to the same loss event.  The timestamp is also echoed by
446	        the receiver to enable the sender to estimate the round-trip
447	        time, for senders that do not save timestamps of transmitted
448	        data packets.
449	        We note that as an alternative to a timestamp incremented in
450	        milliseconds, a "timestamp" that increments every quarter of a
451	        round-trip time would be sufficient for determining when losses
452	        belong to the same loss event, in the context of a protocol
453	        where this is understood by both sender and receiver, and where
454	        the sender saves the timestamps of transmitted data packets.

456	    o   The sender's current estimate of the round trip time. The
457	        estimate reported in packet i is denoted by R_i.  The round-trip
458	        time estimate is used by the receiver, along with the timestamp,
459	        to determine when multiple losses belong to the same loss event.
460	        The round-trip time estimate is also used by the receiver to
461	        determine the interval to use for calculating the receive rate,
462	        and to determine when to send feedback packets.
463	        If the sender sends a coarse-grained "timestamp" that increments
464	        every quarter of a round-trip time, as discussed above, then the
465	        sender does not need to send its current estimate of the round
466	        trip time.
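
    The per-packet information listed above could be held in a small
    record such as the following Python sketch.  The class name, the
    field names, the choice of milliseconds for both time fields, and
    the 48-bit sequence number width are all assumptions made for
    illustration; the actual encoding depends on the transport
    protocol used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPacketInfo:
    """TFRC information carried in each data packet (illustrative)."""
    seq: int     # sequence number, incremented by one per data packet
    ts_ms: int   # ts_i: time the packet was sent, in milliseconds
    rtt_ms: int  # R_i: sender's current round-trip time estimate


def next_seq(seq, seq_bits=48):
    """Return the following sequence number, wrapping at 2**seq_bits.

    The field width is hypothetical; it only needs to be large enough
    that two packets in the receiver's recent history never share a
    sequence number.
    """
    return (seq + 1) % (1 << seq_bits)
```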

468	3.2.2.  Feedback Packets

470	    Each feedback packet sent by the data receiver contains the
471	    following information:

473	    o   The timestamp of the last data packet received. We denote this
474	        by t_recvdata.  If the last packet received at the receiver has
475	        sequence number i, then t_recvdata = ts_i.

477	        This timestamp is used by the sender to estimate the round-trip
478	        time, and is only needed if the sender does not save timestamps
479	        of transmitted data packets.

481	    o   The amount of time elapsed between the receipt of the last data
482	        packet at the receiver, and the generation of this feedback
483	        report. We denote this by t_delay.

485	    o   The rate at which the receiver estimates that data was received
486	        since the last feedback report was sent. We denote this by
487	        X_recv.

489	    o   The receiver's current estimate of the loss event rate, p.
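
    A feedback report with the four items above might be represented
    as follows.  This is a sketch under stated assumptions: the class
    and field names are hypothetical, times are taken to be in
    seconds, and the range checks are a convenience of the example
    rather than a requirement of this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedbackInfo:
    """TFRC information carried in each feedback packet (illustrative)."""
    t_recvdata: float  # timestamp ts_i of the last data packet received
    t_delay: float     # time between receiving that packet and this report
    x_recv: float      # estimated receive rate since the last report, bytes/s
    p: float           # receiver's current estimate of the loss event rate

    def __post_init__(self):
        # Sanity checks only; these are assumptions of the sketch.
        if self.t_delay < 0:
            raise ValueError("t_delay must be non-negative")
        if self.x_recv < 0:
            raise ValueError("x_recv must be non-negative")
        if not 0.0 <= self.p <= 1.0:
            raise ValueError("p must be between 0 and 1.0")
```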

491	4.  Data Sender Protocol

493	    The data sender sends a stream of data packets to the data receiver
494	    at a controlled rate. When a feedback packet is received from the
495	    data receiver, the data sender changes its sending rate, based on
496	    the information contained in the feedback report. If the sender does
497	    not receive a feedback report for four round trip times, it cuts its
498	    sending rate in half.  This is achieved by means of a timer called
499	    the nofeedback timer.

501	    We specify the sender-side protocol in the following steps:

503	    o   Measurement of the mean segment size being sent.

505	    o   The sender behavior when a feedback packet is received.

507	    o   The sender behavior when the nofeedback timer expires.

509	    o   Oscillation prevention (optional).

511	    o   Scheduling of transmission on non-realtime operating systems.

513	4.1.  Measuring the Segment Size

515	    The parameter s (segment size) is normally known to an application.
516	    This may not be so in two cases:

518	    o   (1) The segment size naturally varies depending on the data.  In
519	        this case, although the segment size varies, that variation is
520	        not coupled to the transmit rate.  The TFRC sender can either
521	        compute the average segment size or use the maximum segment size
522	        for the segment size s.

524	    o   (2) The application needs to change the segment size rather than
525	        the number of segments per second to perform congestion control.
526	        This would normally be the case with packet audio applications
527	        where a fixed interval of time needs to be represented by each
528	        packet.  Such applications need to have a completely different
529	        way of measuring parameters.

531	    For the first class of applications where the segment size varies
532	    depending on the data, the sender MAY estimate the segment size s as
533	    the average segment size over the last four loss intervals.  The
534	    sender MAY also estimate the average segment size over longer time
535	    intervals, if so desired.  The TFRC sender uses the segment size s
536	    in the throughput equation, in the setting of the maximum receive
537	    rate and the minimum sending rate, and in the setting of the
538	    nofeedback timer.
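
    One possible way to estimate s over the last four loss intervals
    is sketched below.  The function name and the input layout are
    hypothetical, and the sketch assumes a per-packet average (total
    bytes divided by total packets) rather than an average of
    per-interval averages; this document does not mandate either
    choice.

```python
def average_segment_size(interval_packet_sizes, n_intervals=4):
    """Estimate s as the mean segment size over recent loss intervals.

    interval_packet_sizes is a list of loss intervals, oldest first;
    each interval is a list of the segment sizes (in bytes) sent
    during that interval.
    """
    recent = interval_packet_sizes[-n_intervals:]
    sizes = [size for interval in recent for size in interval]
    if not sizes:
        raise ValueError("no packets in the selected loss intervals")
    return sum(sizes) / len(sizes)
```

    A sender that prefers a longer averaging period can pass a larger
    n_intervals.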

540	    The TFRC receiver may use the average segment size s in initializing
541	    the loss history after the first loss event, but Section 6.3.1 also
542	    gives an alternate procedure that does not use the average segment
543	    size s.

545	    The second class of applications is discussed in a separate
546	    document on TFRC-SP.  For the remainder of this section we
547	    assume the sender can estimate the segment size, and that congestion
548	    control is performed by adjusting the number of packets sent per
549	    second.

551	4.2.  Sender Initialization

553	    The initial values for X (the allowed sending rate in bytes per
554	    second) and tld (the Time Last Doubled during slow-start) are
555	    undefined until they are set as described below.  If the sender is
556	    ready to send data when it does not yet have a round trip sample,
557	    the value of X is set to 1 MSS/second (for MSS the Maximum Segment
558	    Size), the nofeedback timer is set to expire after two seconds, and
559	    tld is set either to 0 or to -1.  Upon receiving a round trip time
560	    measurement (e.g., after the first feedback packet), tld is set to
561	    the current time, and the allowed transmit rate X is set to
562	    W_init/R, for W_init below from [RFC3390]:

564	         W_init = min(4*MSS, max(2*MSS, 4380)).

566	    For responding to the initial feedback packet, this replaces step
567	    (4) of Section 4.3 below.

569	    If the sender does have a round trip sample when it is ready to
570	    first send data (e.g., from the SYN exchange or from a previous
571	    connection [RFC2140]), the initial transmit rate X is set to
572	    W_init/R, and tld is set to the current time.
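
    The initial rate selection described above can be sketched as
    follows.  The function names are hypothetical; using None to mean
    "no round trip sample yet" is an assumption of the example.

```python
def initial_window(mss):
    """W_init in bytes: min(4*MSS, max(2*MSS, 4380))."""
    return min(4 * mss, max(2 * mss, 4380))


def initial_rate(mss, R=None):
    """Initial allowed sending rate X in bytes/second.

    Without a round trip sample (R is None), X is 1 MSS/second.
    With a sample R in seconds, X is W_init/R.
    """
    if R is None:
        return float(mss)
    if R <= 0:
        raise ValueError("R must be positive")
    return initial_window(mss) / R
```

    For example, with MSS = 1460 bytes W_init is 4380 bytes, so a
    sender with a round trip sample of 0.5 seconds starts at 8760
    bytes/second.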

574	    Why is the initial value of TFRC's nofeedback timer set to two
575	    seconds, instead of the recommended initial value of three seconds
576	    for TCP's retransmit timer, from [RFC2988]?  There isn't any
577	    particular reason why TFRC's nofeedback timer should have the same
578	    initial value as TCP's retransmit timer.  TCP's retransmit timer is
579	    used not only to reduce the sending rate in response to congestion,
580	    but also to retransmit a packet that is assumed to have been dropped
581	    in the network.  In contrast, TFRC's nofeedback timer is only used
582	    to reduce the allowed sending rate, not to trigger the sending of a
583	    new packet.  As a result, there is no danger to the network for the
584	    initial value of TFRC's nofeedback timer to be smaller than the
585	    recommended initial value for TCP's retransmit timer.

587	4.3.  Sender behavior when a feedback packet is received

589	    The sender knows its current allowed sending rate, X, and maintains
590	    an estimate of the current round trip time, R, and an estimate of
591	    the timeout interval, t_RTO.

593	    When a feedback packet is received by the sender at time t_now, the
594	    following actions should be performed:

596	    1)  Calculate a new round trip sample.
597	        R_sample = (t_now - t_recvdata) - t_delay.

599	    2)  Update the round trip time estimate:

601	             If no feedback has been received before
602	                 R = R_sample;
603	             Else
604	                 R = q*R + (1-q)*R_sample;

606	        TFRC is not sensitive to the precise value for the filter
607	        constant q, but we recommend a default value of 0.9.

609	    3)  Update the timeout interval:

611	             t_RTO = 4*R.

613	    4)  Update the sending rate as follows:

615	             If (sender has been idle or data-limited
616	                   within last two round-trip times)
617	                 min_rate = max(2*X_recv, W_init/R);
618	             Else
619	                 min_rate = 2*X_recv;
620	             If (p > 0)
621	                 Calculate X_Bps using the TCP throughput equation.
622	                 X = max(min(X_Bps, min_rate), s/t_mbi);
623	             Else if ((min_rate < X) and (the first feedback packet, or
624	                   the first feedback packet after a nofeedback timer))
625	                 Do nothing;
626	             Else if (t_now - tld >= R)
627	                 X = max(min(2*X, min_rate), s/R);
628	                 tld = t_now;

630	    The condition ``if (sender has been idle or data-limited within last
631	    two round-trip times)'' prevents an idle or data-limited sender from
632	    having to reduce the sending rate to less than the initial sending
633	    rate as a result of limitations from a small receive rate.  The
634	    condition ``if ((min_rate < X) and (the first feedback packet, or
635	    the first feedback packet after a nofeedback timer))'' prevents a
636	    sender from reducing the sending rate in response to a feedback
637	    packet that reports the receipt of only a few packets after start-up
638	    or after an idle period.

640	    Note that if p == 0, then the sender is in slow-start phase, where
641	    it approximately doubles the sending rate each round-trip time until
642	    a loss occurs. The s/R term gives a minimum sending rate during
643	    slow-start of one packet per RTT.  The parameter t_mbi is 64
644	    seconds, and represents the maximum inter-packet backoff interval in
645	    the persistent absence of feedback.  Thus, when p > 0 the sender
646	    sends at least one packet every 64 seconds.

648	    5)  Reset the nofeedback timer to expire after max(4*R, 2*s/X)
649	        seconds.
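Steps (1) through (5) above can be sketched as a single Python
method.  This is a minimal illustration under stated assumptions:
the class and parameter names are hypothetical, X_Bps is computed by
the caller from the throughput equation and passed in as x_bps, and
the flags idle_or_limited and first_feedback stand for the two
conditions tested in step (4).  The special handling of the initial
feedback packet from Section 4.2 is left to the caller.

```python
from dataclasses import dataclass
from typing import Optional

T_MBI = 64.0  # maximum inter-packet backoff interval, in seconds
Q = 0.9       # recommended default for the RTT filter constant q


@dataclass
class SenderState:
    s: float                       # segment size, bytes
    w_init: float                  # initial window W_init, bytes
    X: float                       # allowed sending rate, bytes/second
    tld: float = 0.0               # Time Last Doubled
    R: Optional[float] = None      # RTT estimate; None until first feedback
    t_rto: Optional[float] = None  # timeout interval t_RTO

    def on_feedback(self, t_now, t_recvdata, t_delay, x_recv, p,
                    x_bps=0.0, idle_or_limited=False, first_feedback=False):
        """Process one feedback packet; return the nofeedback timer interval."""
        # Steps 1 and 2: new round trip sample, then update the estimate.
        r_sample = (t_now - t_recvdata) - t_delay
        self.R = r_sample if self.R is None else Q * self.R + (1 - Q) * r_sample
        # Step 3: update the timeout interval.
        self.t_rto = 4 * self.R
        # Step 4: update the allowed sending rate.
        min_rate = 2 * x_recv
        if idle_or_limited:
            min_rate = max(min_rate, self.w_init / self.R)
        if p > 0:
            self.X = max(min(x_bps, min_rate), self.s / T_MBI)
        elif min_rate < self.X and first_feedback:
            pass  # do nothing
        elif t_now - self.tld >= self.R:
            self.X = max(min(2 * self.X, min_rate), self.s / self.R)
            self.tld = t_now
        # Step 5: interval after which the nofeedback timer should expire.
        return max(4 * self.R, 2 * self.s / self.X)
```

With p == 0 the rate at most doubles per round-trip time and is
capped at twice the reported receive rate; with p > 0 the throughput
equation result is capped the same way.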

651	4.4.  Expiration of nofeedback timer

653	    If the nofeedback timer expires, the sender should perform the
654	    following actions:

656	    1)  Cut the sending rate in half.  If the sender has received
657	        feedback from the receiver, this is done by modifying the
658	        sender's cached copy of X_recv (the receive rate).  Because the
659	        sending rate is limited to at most twice X_recv, modifying
660	        X_recv limits the current sending rate, but allows the sender to
661	        slow-start, doubling its sending rate each RTT, if feedback
662	        messages resume reporting no losses.

664	            If (X_Bps > 2*X_recv)
665	                X_recv = max(X_recv/2, s/(2*t_mbi));
666	            Else
667	                X_recv = X_Bps/4;

669	        The term s/(2*t_mbi) limits the backoff to one packet every 64
670	        seconds in the case of persistent absence of feedback.

672	    2)  The value of X must then be recalculated as described under
673	        point (4) above.

675	        If the nofeedback timer expires when the sender does not yet
676	        have an RTT sample and has not yet received any feedback from
677	        the receiver, or when p == 0, then step (1) can be skipped, and
678	        the sending rate cut in half directly:

680	               X = max(X/2, s/t_mbi)

682	    3)  Restart the nofeedback timer to expire after max(4*R, 2*s/X)
683	        seconds.

685	    Note that when the sender stops sending, the receiver will stop
686	    sending feedback.  When the sender's nofeedback timer expires, the
687	    sender will decrease X_recv.  If the sender subsequently starts to
688	    send again, X_recv will limit the transmit rate, and a normal
689	    slow-start phase will occur until the transmit rate reaches X_Bps.
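The timer-expiry procedure above can be sketched as follows.  The
function name and argument list are hypothetical.  As a simplifying
assumption, the recalculation in step (2) uses min_rate = 2*X_recv
and ignores the idle or data-limited branch of step (4) in
Section 4.3.

```python
T_MBI = 64.0  # maximum inter-packet backoff interval, in seconds


def on_nofeedback_timeout(X, x_recv, x_bps, s, p, have_feedback):
    """Return the new (X, X_recv) pair after the nofeedback timer expires."""
    if not have_feedback or p == 0:
        # Step (1) is skipped and the sending rate is halved directly.
        return max(X / 2, s / T_MBI), x_recv
    # Step 1: reduce the cached receive rate.
    if x_bps > 2 * x_recv:
        x_recv = max(x_recv / 2, s / (2 * T_MBI))
    else:
        x_recv = x_bps / 4
    # Step 2: recalculate X as in step (4) of Section 4.3, with p > 0.
    X = max(min(x_bps, 2 * x_recv), s / T_MBI)
    return X, x_recv
```

In both branches of step (1) the resulting cap 2*X_recv is at most
half of what the sender was previously allowed, which is what "cut
the sending rate in half" means here.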

691	4.5.  Sending a packet after an idle or data-limited period

693	    If the sender has been idle (unable to send because there is little
694	    or no data from the application), the allowed sending rate could
695	    have been reduced due to the nofeedback timer, as specified in the
696	    section above.  Because the sender is always restricted to sending
697	    at most twice the receive rate reported by the receiver, the sender
698	    will be limited to at most doubling its sending rate each round-trip
699	    time, until the sending rate reaches the allowed sending rate
700	    calculated by the throughput equation.

702	4.6.  Preventing Oscillations
703	    To prevent oscillatory behavior in environments with a low degree of
704	    statistical multiplexing it is useful to modify the sender's transmit
705	    rate to provide congestion avoidance behavior by reducing the
706	    transmit rate as the queuing delay (and hence RTT) increases.  To do
707	    this the sender maintains an estimate of the long-term RTT and
708	    modifies its sending rate depending on how the most recent sample of
709	    the RTT differs from this value.  The long-term sample is R_sqmean,
710	    and is set as follows:

712	         If no feedback has been received before
713	             R_sqmean = sqrt(R_sample);
714	         Else
715	             R_sqmean = q2*R_sqmean + (1-q2)*sqrt(R_sample);

717	    Thus R_sqmean gives the exponentially weighted moving average of the
718	    square root of the RTT samples.  The constant q2 should be set
719	    similarly to q, and we recommend a value of 0.9 as the default.

721	    The sender obtains the base allowed transmit rate, X, from the
722	    throughput function.  It then calculates a modified instantaneous
723	    transmit rate X_inst, as follows:

725	         X_inst = X * R_sqmean / sqrt(R_sample);

727	    When sqrt(R_sample) is greater than R_sqmean then the queue is
728	    typically increasing and so the transmit rate needs to be decreased
729	    for stable operation.

731	    Note: This modification is not always strictly required, especially
732	    if the degree of statistical multiplexing in the network is high.
733	    However, we recommend that it is done because it does make TFRC
734	    behave better in environments with a low level of statistical
735	    multiplexing.  If it is not done, we recommend using a very low
736	    value of q, such that q is close to or exactly zero.
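A minimal sketch of the R_sqmean filter and the X_inst adjustment
follows.  The class and method names are hypothetical; only the two
formulas come from the text above.

```python
import math


class RttSmoother:
    """Tracks R_sqmean, the moving average of sqrt(R_sample)."""

    def __init__(self, q2=0.9):
        self.q2 = q2          # filter constant, 0.9 recommended
        self.r_sqmean = None  # undefined until the first feedback

    def update(self, r_sample):
        root = math.sqrt(r_sample)
        if self.r_sqmean is None:
            self.r_sqmean = root
        else:
            self.r_sqmean = self.q2 * self.r_sqmean + (1 - self.q2) * root
        return self.r_sqmean

    def x_inst(self, X, r_sample):
        """Instantaneous rate: lower than X when the latest RTT is above average."""
        return X * self.r_sqmean / math.sqrt(r_sample)
```

For example, after samples of 40 ms and then 90 ms, R_sqmean is
0.21 and a base rate X is scaled by 0.21/0.3, that is, to 70% of X.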

738	4.7.  Scheduling of Packet Transmissions

740	    As TFRC is rate-based, and as operating systems typically cannot
741	    schedule events precisely, it is necessary to be opportunistic about
742	    sending data packets so that the correct average rate is maintained
743	    despite the coarse-grain or irregular scheduling of the operating
744	    system.  Thus a typical sending loop will calculate the correct
745	    inter-packet interval, t_ipi, as follows:

747	         t_ipi = s/X_inst;

749	    Let t_now be the current time and i be a natural number, i = 0, 1,
750	    ..., with t_i the nominal send time for the i-th packet.  Then the
751	    nominal send time t_(i+1) derives recursively as

753	           t_0 = t_now,
754	           t_(i+1) = t_i + t_ipi.

756	    The parameter t_delta allows a degree of flexibility in the send
757	    time of a packet.  When the application becomes idle, it requests
758	    re-scheduling for time t_i = t_(i-1) + t_ipi, for t_(i-1) the send
759	    time for the previous packet.  When the application is re-scheduled,
760	    it checks the current time, t_now.  If (t_now > t_i - t_delta) then
761	    packet i is sent.

763	    In some cases, when the nominal send time, t_i, of the next packet
764	    is calculated, it may already be the case that t_now > t_i -
765	    t_delta.  In such a case the packet should be sent immediately.
766	    Thus if the operating system has coarse timer granularity and the
767	    transmit rate is high, then TFRC may send short bursts of several
768	    packets separated by intervals of the OS timer granularity.

770	    If the operating system has a scheduling timer granularity of t_gran
771	    seconds, then t_delta would typically be set to:

773	         t_delta = min(t_ipi/2, t_gran/2);

775	    t_gran is 10ms on many Unix systems.  If t_gran is not known, a
776	    value of 10ms can be safely assumed.
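The send decision can be sketched as a function that is called each
time the sending loop is re-scheduled.  The names packets_due and
t_next are hypothetical, and the default t_gran of 10 ms is the
fallback value suggested above.

```python
def t_delta(t_ipi, t_gran=0.010):
    """Send-time slack: half of the smaller of t_ipi and t_gran."""
    return min(t_ipi / 2, t_gran / 2)


def packets_due(t_now, t_next, t_ipi, t_gran=0.010):
    """Return (packets to send now, nominal send time of the next packet).

    t_next is the nominal send time t_i of the next unsent packet.
    A packet is sent whenever t_now > t_i - t_delta, so a coarse timer
    and a high rate produce a short burst.
    """
    delta = t_delta(t_ipi, t_gran)
    sent = 0
    while t_now > t_next - delta:
        sent += 1
        t_next += t_ipi
    return sent, t_next
```

With t_ipi = 1 ms and a 10 ms timer tick, a sender that wakes up
10 ms after its last nominal send time releases a burst of 11
packets, which matches the bursty behavior described above.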

778	5.  Calculation of the Loss Event Rate (p)

780	    Obtaining an accurate and stable measurement of the loss event rate
781	    is of primary importance for TFRC. Loss rate measurement is
782	    performed at the receiver, based on the detection of lost or marked
783	    packets from the sequence numbers of arriving packets. We describe
784	    this process before describing the rest of the receiver protocol.

786	5.1.  Detection of Lost or Marked Packets

788	    TFRC assumes that all packets contain a sequence number that is
789	    incremented by one for each packet that is sent.  For the purposes
790	    of this specification, we require that if a lost packet is
791	    retransmitted, the retransmission is given a new sequence number
792	    that is the latest in the transmission sequence, and not the same
793	    sequence number as the packet that was lost.  If a transport
794	    protocol has the requirement that it must retransmit with the
795	    original sequence number, then the transport protocol designer must
796	    figure out how to distinguish delayed from retransmitted packets and
797	    how to detect lost retransmissions.

799	    The receiver maintains a data structure that keeps track of which
800	    packets have arrived and which are missing.  For the purposes of
801	    specification, we assume that the data structure consists of a list
802	    of packets that have arrived along with the receiver timestamp when
803	    each packet was received.  In practice this data structure will
804	    normally be stored in a more compact representation, but this is
805	    implementation-specific.

807	    The loss of a packet is detected by the arrival of at least NDUPACK
808	    packets with a higher sequence number than the lost packet, for
809	    NDUPACK set to 3.  The requirement for NDUPACK subsequent packets is
810	    the same as with TCP, and is to make TFRC more robust in the
811	    presence of reordering.  In contrast to TCP, if a packet arrives
812	    late (after NDUPACK subsequent packets arrived) in TFRC, the late
813	    packet can fill the hole in TFRC's reception record, and the
814	    receiver can recalculate the loss event rate.  Future versions of
815	    TFRC might make the requirement for NDUPACK subsequent packets
816	    adaptive based on experienced packet reordering, but we do not
817	    specify such a mechanism here.

819	    For an ECN-capable connection, a marked packet is detected as a
820	    congestion event as soon as it arrives, without having to wait for
821	    the arrival of subsequent packets.

823	5.2.  Translation from Loss History to Loss Events

825	    TFRC requires that the loss fraction be robust to several
826	    consecutive packets lost or marked where those packets are part of
827	    the same loss event.  This is similar to TCP, which (typically) only
828	    performs one halving of the congestion window during any single RTT.
829	    Thus the receiver needs to map the packet loss history into a loss
830	    event record, where a loss event is one or more packets lost or
831	    marked in an RTT.  To perform this mapping, the receiver needs to
832	    know the RTT to use, and this is supplied periodically by the
833	    sender, typically as control information piggy-backed onto a data
834	    packet.  TFRC is not sensitive to how the RTT measurement sent to
835	    the receiver is made, but we recommend using the sender's calculated
836	    RTT, R, (see Section 4.3) for this purpose.

838	    To determine whether a lost or marked packet should start a new loss
839	    event, or be counted as part of an existing loss event, we need to
840	    compare the sequence numbers and timestamps of the packets that
841	    arrived at the receiver.  For a marked packet S_new, its reception
842	    time T_new can be noted directly.  For a lost packet, we can
843	    interpolate to infer the nominal "arrival time".  Assume:

845	        S_loss is the sequence number of a lost packet.

847	        S_before is the sequence number of the last packet to arrive
848	        with sequence number before S_loss.

850	        S_after is the sequence number of the first packet to arrive
851	        with sequence number after S_loss.

853	        S_max is the largest sequence number.

855	        T_loss is the nominal estimated arrival time for the lost
856	        packet.

858	        T_before is the reception time of S_before.

860	        T_after is the reception time of S_after.

862	    Note that T_before can either be before or after T_after due to
863	    reordering.

865	    For a lost packet S_loss, we can interpolate its nominal "arrival
866	    time" at the receiver from the arrival times of S_before and
867	    S_after. Thus:

869	         T_loss = T_before + ( (T_after - T_before)
870	                     * (S_loss - S_before)/(S_after - S_before) );

872	    Note that if the sequence space wrapped between S_before and
873	    S_after, then the sequence numbers must be modified to take this
874	    into account before performing this calculation.  If the largest
875	    possible sequence number is S_max, and S_before > S_after, then
876	    modifying each sequence number S by S' = (S + (S_max + 1)/2) mod
877	    (S_max + 1) would normally be sufficient.

879	    If the lost packet S_old was determined to have started the previous
880	    loss event, and we have just determined that S_new has been lost,
881	    then we interpolate the nominal arrival times of S_old and S_new,
882	    called T_old and T_new respectively.

884	    If T_old + R >= T_new, then S_new is part of the existing loss
885	    event. Otherwise S_new is the first packet in a new loss event.
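The interpolation, the wrap-around adjustment, and the loss event
test can be sketched as three small functions.  The function names
are hypothetical, and integer division is assumed for (S_max + 1)/2.

```python
def shift_for_wrap(seq, s_max):
    """Shift a sequence number by half the sequence space (S' in the text)."""
    return (seq + (s_max + 1) // 2) % (s_max + 1)


def interpolate_loss_time(s_loss, s_before, t_before, s_after, t_after):
    """Nominal arrival time T_loss of a lost packet."""
    return t_before + ((t_after - t_before)
                       * (s_loss - s_before) / (s_after - s_before))


def starts_new_loss_event(t_old, t_new, rtt):
    """True if the loss at T_new is outside the event that began at T_old."""
    return not (t_old + rtt >= t_new)
```

For example, with packets 10 and 14 received at times 1.0 and 2.0,
lost packet 12 gets a nominal arrival time of 1.5.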

887	5.3.  Inter-loss Event Interval

889	    If a loss interval, A, is determined to have started with packet
890	    sequence number S_A and the next loss interval, B, started with
891	    packet sequence number S_B, then the number of packets in loss
892	    interval A is given by (S_B - S_A).  Thus, loss interval A contains
893	    all of the packets transmitted by the sender starting with the first
894	    packet transmitted in loss interval A, and ending with but not
895	    including the first packet transmitted in loss interval B.

897	5.4.  Average Loss Interval

899	    To calculate the loss event rate p, we first calculate the average
900	    loss interval.  This is done using a filter that weights the n most
901	    recent loss event intervals in such a way that the measured loss
902	    event rate changes smoothly.

904	    Weights w_0 to w_(n-1) are calculated as:

906	         If (i < n/2)
907	            w_i = 1;
908	         Else
909	            w_i = 1 - (i - (n/2 - 1))/(n/2 + 1);

911	    Thus if n=8, the values of w_0 to w_7 are:

913	        1.0, 1.0, 1.0, 1.0, 0.8, 0.6, 0.4, 0.2

915	    The value n for the number of loss intervals used in calculating the
916	    loss event rate determines TFRC's speed in responding to changes in
917	    the level of congestion.  As currently specified, TFRC should not be
918	    used for values of n significantly greater than 8, for traffic that
919	    might compete in the global Internet with TCP.  At the very least,
920	    safe operation with values of n greater than 8 would require a
921	    slight change to TFRC's mechanisms to include a more severe response
922	    to two or more round-trip times with heavy packet loss.

924	    When calculating the average loss interval we need to decide whether
925	    to include the interval since the most recent packet loss event.  We
926	    only do this if it is sufficiently large to increase the average
927	    loss interval.

929	    Let the most recent loss intervals be I_0 to I_k, where I_0 is the
930	    interval starting with the most recent loss event (if there has been
931	    one).  If there have been at least n loss intervals, then k is set
932	    to n; otherwise k is the maximum number of loss intervals seen so
933	    far.  We calculate the average loss interval I_mean as follows:

935	         I_tot0 = 0;
936	         I_tot1 = 0;
937	         W_tot = 0;
938	         for (i = 0 to k-1) {
939	           I_tot0 = I_tot0 + (I_i * w_i);
940	           W_tot = W_tot + w_i;
941	         }
942	         for (i = 1 to k) {
943	           I_tot1 = I_tot1 + (I_i * w_(i-1));
944	         }
945	         I_tot = max(I_tot0, I_tot1);
946	         I_mean = I_tot/W_tot;

948	    The loss event rate, p, is simply:

950	         p = 1 / I_mean;
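The weight calculation and the average loss interval can be sketched
in Python as follows.  The function names are hypothetical, and the
list layout, with intervals[0] holding I_0 and later entries holding
older intervals, is an assumption of this sketch.

```python
def weights(n=8):
    """Weights w_0 .. w_(n-1) for the n most recent loss intervals."""
    half = n / 2
    return [1.0 if i < half else 1 - (i - (half - 1)) / (half + 1)
            for i in range(n)]


def loss_event_rate(intervals, n=8):
    """Compute p = 1/I_mean from the intervals I_0 .. I_k."""
    k = min(n, len(intervals) - 1)
    if k < 1:
        raise ValueError("need I_0 and at least one earlier loss interval")
    w = weights(n)
    # I_tot0 includes the interval since the most recent loss event.
    i_tot0 = sum(intervals[i] * w[i] for i in range(k))
    # I_tot1 leaves it out and uses only I_1 .. I_k.
    i_tot1 = sum(intervals[i] * w[i - 1] for i in range(1, k + 1))
    w_tot = sum(w[:k])
    i_mean = max(i_tot0, i_tot1) / w_tot
    return 1.0 / i_mean
```

Taking the maximum of the two totals means that a short I_0 never
raises p, while a long I_0 lowers it.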

952	5.5.  History Discounting

954	    As described in Section 5.4, when there have been at least eight
955	    loss intervals, the most recent loss interval is only assigned
956	    1/(0.75*n) of the total weight in calculating the average loss
957	    interval, regardless of the size of the most recent loss interval.
958	    This section describes an optional history discounting mechanism,
959	    discussed further in [FHPW00a] and [W00], that allows the TFRC
960	    receiver to adjust the weights, concentrating more of the relative
961	    weight on the most recent loss interval, when the most recent loss
962	    interval is more than twice as large as the computed average loss
963	    interval.

965	    To carry out history discounting, we associate a discount factor
966	    DF_i with each loss interval I_i, for i > 0, where each discount
967	    factor is a floating point number.  The discount array maintains the
968	    cumulative history of discounting for each loss interval.  At the
969	    beginning, the values of DF_i in the discount array are initialized
970	    to 1:

972	         for (i = 0 to n) {
973	           DF_i = 1;
974	         }

976	    History discounting also uses a general discount factor DF, also a
977	    floating point number, that is also initialized to 1.  First we show
978	    how the discount factors are used in calculating the average loss
979	    interval, and then we describe later in this section how the
980	    discount factors are modified over time.

982	    As described in Section 5.4 the average loss interval is calculated
983	    using the n previous loss intervals I_1, ..., I_n, and the interval
984	    I_0 that represents the number of packets sent since the beginning
985	    of the last loss event.  The computation of the average loss
986	    interval using the discount factors is a simple modification of the
987	    procedure in Section 5.4, as follows:

989	         I_tot0 = I_0 * w_0;
990	         I_tot1 = 0;
991	         W_tot0 = w_0;
992	         W_tot1 = 0;
993	         for (i = 1 to n-1) {
994	           I_tot0 = I_tot0 + (I_i * w_i * DF_i * DF);
995	           W_tot0 = W_tot0 + w_i * DF_i * DF;
996	         }
997	         for (i = 1 to n) {
998	           I_tot1 = I_tot1 + (I_i * w_(i-1) * DF_i);
999	           W_tot1 = W_tot1 + w_(i-1) * DF_i;
1000	         }
1001	         p = min(W_tot0/I_tot0, W_tot1/I_tot1);

1003	    The general discount factor, DF, is updated on every packet
1004	    arrival as follows. First, the receiver computes the weighted
1005	    average I_mean of the loss intervals I_1, ..., I_n:

1007	         I_tot = 0;
1008	         W_tot = 0;
1009	         for (i = 1 to n) {
1010	           W_tot = W_tot + w_(i-1) * DF_i;
1011	           I_tot = I_tot + (I_i * w_(i-1) * DF_i);
1012	         }
1013	         I_mean = I_tot / W_tot;

1015	    This weighted average I_mean is compared to I_0, the number of
1016	    packets sent since the beginning of the last loss event.  If I_0 is
1017	    greater than twice I_mean, then the new loss interval is
1018	    considerably larger than the old ones, and the general discount
1019	    factor DF is updated to decrease the relative weight on the older
1020	    intervals, as follows:

1022	         if (I_0 > 2 * I_mean) {
1023	           DF = 2 * I_mean/I_0;
1024	           if (DF < THRESHOLD)
1025	             DF = THRESHOLD;
1026	         } else
1027	           DF = 1;

1029	    A nonzero value for THRESHOLD ensures that older loss intervals from
1030	    an earlier time of high congestion are not discounted entirely.  We
1031	    recommend a THRESHOLD of 0.5.  Note that with each new packet
1032	    arrival, I_0 will increase further, and the discount factor DF will
1033	    be updated.

1035	    When a new loss event occurs, the current interval shifts from I_0
1036	    to I_1, loss interval I_i shifts to interval I_(i+1), and the loss
1037	    interval I_n is forgotten.  The previous discount factor DF has to
1038	    be incorporated into the discount array.  Because DF_i carries the
1039	    discount factor associated with loss interval I_i, the DF_i array
1040	    has to be shifted as well. This is done as follows:

1042	         for (i = 1 to n) {
1043	           DF_i = DF * DF_i;
1044	         }
1045	         for (i = n-1 to 0 step -1) {
1046	           DF_(i+1) = DF_i;
1047	         }
1048	         I_0 = 1;
1049	         DF_0 = 1;
1050	         DF = 1;

1052	    This completes the description of the optional history discounting
1053	    mechanism. We emphasize that this is an optional mechanism whose
1054	    sole purpose is to allow TFRC to respond somewhat more quickly to
1055	    the sudden absence of congestion, as represented by a long current
1056	    loss interval.
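The update of the general discount factor and the discounted
calculation of p can be sketched as follows.  The function names are
hypothetical, and the sketch assumes a full history: I and DF_i are
lists of n+1 values indexed 0..n.  The shifting of the discount array
on a new loss event is not shown.

```python
THRESHOLD = 0.5  # recommended lower bound for DF


def weights(n=8):
    half = n / 2
    return [1.0 if i < half else 1 - (i - (half - 1)) / (half + 1)
            for i in range(n)]


def general_discount(I, DF_i, w):
    """Recompute DF from I_0 and the weighted mean of I_1 .. I_n."""
    n = len(w)
    w_tot = sum(w[i - 1] * DF_i[i] for i in range(1, n + 1))
    i_tot = sum(I[i] * w[i - 1] * DF_i[i] for i in range(1, n + 1))
    i_mean = i_tot / w_tot
    if I[0] > 2 * i_mean:
        return max(2 * i_mean / I[0], THRESHOLD)
    return 1.0


def discounted_p(I, DF_i, DF, w):
    """Loss event rate p using the discount factors."""
    n = len(w)
    i_tot0, w_tot0 = I[0] * w[0], w[0]
    for i in range(1, n):
        i_tot0 += I[i] * w[i] * DF_i[i] * DF
        w_tot0 += w[i] * DF_i[i] * DF
    i_tot1 = sum(I[i] * w[i - 1] * DF_i[i] for i in range(1, n + 1))
    w_tot1 = sum(w[i - 1] * DF_i[i] for i in range(1, n + 1))
    return min(w_tot0 / i_tot0, w_tot1 / i_tot1)
```

With eight past intervals of 100 packets and a current interval of
1000 packets, DF is clamped to 0.5 and p drops to 0.0028, compared
with 0.004 without discounting.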

1058	6.  Data Receiver Protocol

1060	    The receiver periodically sends feedback messages to the sender.
1061	    Feedback packets should normally be sent at least once per RTT,
1062	    unless the sender is sending at a rate of less than one packet per
1063	    RTT, in which case a feedback packet should be sent for every data
1064	    packet received.  A feedback packet should also be sent whenever a
1065	    new loss event is detected without waiting for the end of an RTT,
1066	    and whenever an out-of-order data packet is received that removes a
1067	    loss event from the history.

1069	    If the sender is transmitting at a high rate (many packets per RTT)
1070	    there may be some advantages to sending periodic feedback messages
1071	    more than once per RTT as this allows faster response to changing
1072	    RTT measurements, and more resilience to feedback packet loss.  If
1073	    the receiver was sending k feedback packets per RTT, step (4) of
1074	    Section 6.2 would be modified to set the feedback timer to expire
1075	    after R_m/k seconds.  However, each feedback packet would still
1076	    report the receiver rate over the last RTT, not over a fraction of
1077	    an RTT.  We note that there is little gain from sending a large
1078	    number of feedback messages per RTT.

1080	6.1.  Receiver behavior when a data packet is received

1082	    When a data packet is received, the receiver performs the following
1083	    steps:

1085	    1)  Add the packet to the packet history.

1087	    2)  Let the previous value of p be p_prev.  Calculate the new value
1088	        of p as described in Section 5.

1090	    3)  If p > p_prev, cause the feedback timer to expire, and perform
1091	        the actions described in Section 6.2.

1093	        If p <= p_prev no action need be performed.

1095	        However, an optimization might check to see if the arrival of the
1096	        packet caused a hole in the packet history to be filled and
1097	        consequently two loss intervals were merged into one.  If this
1098	        is the case, the receiver might also send feedback immediately.
1099	        The effects of such an optimization are normally expected to be
1100	        small.

1102	6.2.  Expiration of feedback timer

1104	    When the feedback timer at the receiver expires, the action to be
1105	    taken depends on whether data packets have been received since the
1106	    last feedback was sent.

1108	    Let the maximum sequence number of a packet at the receiver so far
1109	    be S_m, and the value of the RTT measurement included in packet S_m
1110	    be R_m.  As described in Section 3.2.1, R_m is the sender's current
1111	    estimate of the round trip time, reported in data packets.  If data
1112	    packets have been received since the previous feedback was sent, the
1113	    receiver performs the following steps:

1115	    1)  Calculate the average loss event rate using the algorithm
1116	        described above.

1118	    2)  Calculate the measured receive rate, X_recv, based on the
1119	        packets received within the previous R_m seconds.

1121	    3)  Prepare and send a feedback packet containing the information
1122	        described in Section 3.2.2.

1124	    4)  Restart the feedback timer to expire after R_m seconds.

1126	    Note that rule 2) above gives a minimum value for the measured
1127	    receive rate X_recv of one packet per round-trip time.  If the
1128	    sender is limited to a sending rate of less than one packet per
1129	    round-trip time, this will be due to the loss event rate, not from a
1130	    limit imposed by the measured receive rate at the receiver.

1132	    If no data packets have been received since the last feedback was
1133	    sent, no feedback packet is sent, and the feedback timer is
1134	    restarted to expire after R_m seconds.

1136	6.3.  Receiver initialization

1138	    The receiver is initialized by the first data packet that arrives at
1139	    the receiver. Let the sequence number of this packet be i.

1141	    When the first packet is received:

1143	    o   Set p = 0.

1145	    o   Set X_recv = 0.

1147	    o   Prepare and send a feedback packet.

1149	    o   Set the feedback timer to expire after R_i seconds.

1151	    If the first data packet doesn't contain an estimate R_i of the
1152	    round-trip time, then the receiver sends a feedback packet for every
1153	    arriving data packet, until a data packet arrives containing an
1154	    estimate of the round-trip time.

1156	    If the sender is using a coarse-grained timestamp that increments
1157	    every quarter of a round-trip time, then a feedback timer is not
1158	    needed, and the following procedure from RFC 4342 is used to
1159	    determine when to send feedback messages.

1161	    o   Whenever the receiver sends a feedback message, the receiver
1162	        sets a local variable last_counter to the greatest received
1163	        value of the window counter since the last feedback message was
1164	        sent, if any data packets have been received since the last
1165	        feedback message was sent.

1167	    o   If the receiver receives a data packet with a window counter
1168	        value greater than or equal to last_counter + 4, then the
1169	        receiver sends a new feedback packet.  ("Greater" and "greatest"
1170	        are measured in circular window counter space.)

1172	6.3.1.  Initializing the Loss History after the First Loss Event

1174	    The number of packets until the first loss cannot be used to
1175	    compute the allowed sending rate directly, as the sending rate
1176	    changes rapidly during this time.  TFRC assumes that the correct
1177	    data rate after the first loss is half of the maximum sending rate
1178	    before the loss occurred.  TFRC approximates this target rate
1179	    X_target by the maximum X_recv so far, for X_recv the receive rate
1180	    over a single round-trip time.  (For a TFRC sender that always has
1181	    data to send, it is sufficient to approximate the target rate by the
1182	    most recent X_recv.  However, for a TFRC sender that is sometimes
1183	    data-limited or idle, it is best to use the maximum X_recv so far.)

1185	    After the first loss, instead of initializing the first loss
1186	    interval to the number of packets sent until the first loss, the
1187	    TFRC receiver calculates the loss interval that would be required to
1188	    produce the data rate X_target, and uses this synthetic loss
1189	    interval to seed the loss history mechanism.

1191	    TFRC does this by finding some value p for which the throughput
1192	    equation in Section 3.1 gives a sending rate within 5% of X_target,
1193	    given the round-trip time R, and the first loss interval is then set
1194	    to 1/p.  If the receiver knows the segment size s used by the
1195	    sender, then the receiver can use the throughput equation for X;
1196	    otherwise, the receiver can measure the receive rate in packets per
1197	    second instead of bytes per second for this purpose, and use the
1198	    throughput equation for X_pps.  (The 5% tolerance is introduced
1199	    simply because the throughput equation is difficult to invert, and
1200	    we want to reduce the costs of calculating p numerically.)

1202	    Special care is needed for initializing the first loss interval when
1203	    the first data packet is lost or marked.  When the first data packet
1204	    is lost in TCP, the TCP sender retransmits the packet after the
1205	    retransmit timer expires.  If TCP's first data packet is ECN-marked,
1206	    the TCP sender resets the retransmit timer, and sends a new data
1207	    packet only when the retransmit timer expires [RFC3168] (Section
1208	    6.1.2).  For TFRC, if the first data packet is lost or ECN-marked,
1209	    then the first loss interval consists of the null interval with no
1210	    data packets.  In this case, the loss interval length for this
1211	    (null) loss interval should be set to give a similar sending rate to
1212	    that of TCP.

1214	    When the first TFRC loss interval is null, meaning that the first
1215	    data packet is lost or ECN-marked, in order to follow the behavior
1216	    of TCP, TFRC wants the allowed sending rate to be 1 packet every two
1217	    round-trip times, or equivalently, 0.5 packets per RTT.  Thus, the
1218	    TFRC receiver calculates the loss interval that would be required to
1219	    produce the target rate X_target of 0.5/R packets per second, for
1220	    the round-trip time R, and uses this synthetic loss interval for the
1221	    first loss interval.  The TFRC receiver uses 0.5/R packets per
1222	    second as the minimum value for X_target when initializing the first
1223	    loss interval.
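The numerical search for the first loss interval can be sketched as
follows.  This is an illustration under stated assumptions: the
function names are hypothetical, the throughput equation is taken to
be the one in Section 3.1 with b = 1 and t_RTO = 4*R, and the search
bounds and the use of geometric bisection are choices made for this
sketch only.  With s = 1 the rates are in packets per second.

```python
import math


def tfrc_rate(p, R, s=1.0, b=1):
    """Throughput equation, assuming t_RTO = 4*R."""
    t_rto = 4 * R
    return s / (R * math.sqrt(2 * b * p / 3)
                + t_rto * (3 * math.sqrt(3 * b * p / 8) * p * (1 + 32 * p * p)))


def first_loss_interval(x_target, R, s=1.0, tol=0.05):
    """Return 1/p for some p whose rate is within tol of x_target."""
    # Enforce the minimum target rate of 0.5 packets per RTT.
    x_target = max(x_target, 0.5 * s / R)
    lo, hi = 1e-12, 1.0  # assumed search bounds for p
    p = hi
    for _ in range(200):
        p = math.sqrt(lo * hi)  # bisect on a logarithmic scale
        x = tfrc_rate(p, R, s)
        if abs(x - x_target) <= tol * x_target:
            break
        if x > x_target:
            lo = p  # rate too high, so p must be larger
        else:
            hi = p
    return 1.0 / p
```

The search relies on the rate decreasing as p increases, so each
step discards the half of the range that cannot contain a solution.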

1225	7.  Sender-based Variants

1227	    In a sender-based variant of TFRC, the receiver would use reliable
1228	    delivery to send information about packet losses to the sender, and
1229	    the sender would compute the packet loss rate and the acceptable
1230	    transmit rate.

1232	    The main advantage of a sender-based variant of TFRC would be that
1233	    the sender would not have to trust the receiver's calculation of the
1234	    packet loss rate.  However, with the requirement of reliable
1235	    delivery of loss information from the receiver to the sender, a
1236	    sender-based TFRC would have much tighter constraints on the
1237	    transport protocol in which it is embedded.

1239	    In contrast, the receiver-based variant of TFRC specified in this
1240	    document is robust to the loss of feedback packets, and therefore
1241	    does not require the reliable delivery of feedback packets.  It is
1242	    also better suited for applications where it is desirable to offload
1243	    work from the server to the client as much as possible.

1245	    RFC 4340 and RFC 4342 together specify CCID 3, which can be used as
1246	    a sender-based variant of TFRC.  In CCID 3, each feedback packet
1247	    from the receiver contains a Loss Intervals option, reporting the
1248	    lengths of the most recent loss intervals.  Feedback packets may
1249	    also include the Ack Vector option, allowing the sender to determine
1250	    exactly which packets were dropped or marked, and to check the
1251	    information reported in the Loss Intervals options.  The Ack Vector
1252	    option can also include ECN Nonce Echoes, allowing the sender to
1253	    verify the receiver's report of having received a data packet.  The
1254	    Ack Vector option allows the sender to determine for itself which
1255	    data packets were lost or ECN-marked, to determine loss intervals,
1256	    and to calculate the loss event rate.  Section 9.2 of RFC 4342
1257	    discusses issues in the sender verifying information reported by the
1258	    receiver.

1260	8.  Implementation Issues

1262	    This document has specified the TFRC congestion control mechanism,
1263	    for use by applications and transport protocols.  This section
1264	    briefly mentions a few implementation issues.

1266	    For t_RTO = 4*R and b = 1, the throughput equation in Section 3.1
1267	    can be expressed as follows:

1269	                     s
1270	         X_Bps =  --------
1271	                  R * f(p)

1273	    for

1275	         f(p) =  sqrt(2*p/3) + (12*sqrt(3*p/8) * p * (1+32*p^2)).

1277	    A table lookup could be used for the function f(p).
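
	    One possible form of such a table lookup is sketched below.  The
	    table size, the uniform spacing in p, and the use of linear
	    interpolation are all assumptions made for illustration; a uniform
	    table is coarse for very small p, where f(p) changes quickly, so an
	    implementation might prefer non-uniform spacing there.

```python
import math

def f(p):
    """f(p) for t_RTO = 4*R and b = 1, as given above."""
    return (math.sqrt(2.0 * p / 3.0)
            + 12.0 * math.sqrt(3.0 * p / 8.0) * p * (1.0 + 32.0 * p * p))

TABLE_SIZE = 1000   # hypothetical resolution: p in steps of 0.001
F_TABLE = [f(i / TABLE_SIZE) for i in range(TABLE_SIZE + 1)]

def f_lookup(p):
    """Approximate f(p) for 0 < p <= 1 by linear interpolation."""
    pos = p * TABLE_SIZE
    i = min(int(pos), TABLE_SIZE - 1)
    frac = pos - i
    return F_TABLE[i] + frac * (F_TABLE[i + 1] - F_TABLE[i])

def x_bps(s, rtt, p):
    """X_Bps = s / (R * f(p)), using the table; requires p > 0."""
    return s / (rtt * f_lookup(p))
```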

1279	    Many of the multiplications (e.g., q and 1-q for the round-trip time
1280	    average, a factor of 4 for the timeout interval) are, or could be,
1281	    multiplications by powers of two, and therefore could be implemented
1282	    as simple shift operations.
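
	    A minimal sketch of this idea follows.  It assumes integer times in
	    microseconds and a filter constant of q = 7/8, chosen here only
	    because 1-q is then a power of two; neither choice is mandated by
	    this document, and the function names are hypothetical.

```python
def update_rtt(r_us, r_sample_us):
    """R = q*R + (1-q)*R_sample with q = 7/8, using shifts only.

    Integer shifts truncate, so results can differ slightly from
    the floating-point average.
    """
    return r_us - (r_us >> 3) + (r_sample_us >> 3)

def timeout_interval(r_us):
    """t_RTO = 4*R, computed as a left shift by two bits."""
    return r_us << 2
```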

1284	    We note that the optional sender mechanism for preventing
1285	    oscillations described in Section 4.6 uses a square-root
1286	    computation.

1288	    For the calculation of the nominal arrival time T_loss for a lost
1289	    packet from Section 5.2, one way to implement this that would avoid
1290	    concerns about wrapped sequence space would be to use the following:

1292	         T_loss = T_before +  (T_after - T_before) * Dist(S_loss,
1293	         S_before)/Dist(S_after, S_before)

1295	    where

1297	         Dist(Seqno_A, Seqno_B) = (Seqno_A + 2^48 - Seqno_B) % 2^48
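
	    The two formulas above can be sketched as follows, assuming a
	    48-bit sequence space as in the definition of Dist() and time
	    values held as floating-point seconds.  The function and parameter
	    names are illustrative.

```python
SEQ_MOD = 1 << 48   # size of the sequence space used by Dist()

def dist(seqno_a, seqno_b):
    """Distance from seqno_b forward to seqno_a, modulo 2^48."""
    return (seqno_a + SEQ_MOD - seqno_b) % SEQ_MOD

def t_loss(s_loss, s_before, t_before, s_after, t_after):
    """Interpolate the nominal arrival time of the lost packet."""
    return (t_before + (t_after - t_before)
            * dist(s_loss, s_before) / dist(s_after, s_before))
```

	    For example, with S_before = 2^48 - 2, S_loss = 0, and S_after = 2,
	    the lost packet lies halfway between its neighbors even though the
	    sequence numbers have wrapped.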

1299	    The calculation of the average loss interval in Section 5.4 involves
1300	    multiplications by the weights w_0 to w_(n-1), which for n=8 are:

1302	        1.0, 1.0, 1.0, 1.0, 0.8, 0.6, 0.4, 0.2.

1304	    With a minor loss of smoothness, it would be possible to use weights
1305	    that were powers of two or sums of powers of two, e.g.,
1306	        1.0, 1.0, 1.0, 1.0, 0.75, 0.5, 0.25, 0.25.
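
	    The effect of the two weight sets can be compared with a sketch
	    such as the one below.  It is a simplification that takes a
	    weighted mean over one fixed set of eight loss intervals and does
	    not reproduce the full procedure of Section 5.4; the function name
	    is hypothetical.

```python
WEIGHTS = [1.0, 1.0, 1.0, 1.0, 0.8, 0.6, 0.4, 0.2]
WEIGHTS_POW2 = [1.0, 1.0, 1.0, 1.0, 0.75, 0.5, 0.25, 0.25]

def weighted_mean_interval(intervals, weights):
    """Weighted mean of eight loss intervals, most recent first."""
    if len(intervals) != len(weights):
        raise ValueError("expected one interval per weight")
    total = sum(i * w for i, w in zip(intervals, weights))
    return total / sum(weights)
```

	    When all eight intervals are equal, both weight sets return that
	    same value; they differ only in how quickly older intervals are
	    discounted.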

1308	    The optional history discounting mechanism described in Section 5.5
1309	    is used in the calculation of the average loss rate.  The history
1310	    discounting mechanism is invoked only when there has been an
1311	    unusually long interval with no packet losses.  For a more efficient
1312	    operation, the discount factor DF_i could be restricted to be a
1313	    power of two.

1315	9.  Changes from RFC 3448

1317	    The changes from RFC 3448 are as follows:

1319	    o   Changes to the initial sending rate: In RFC 3448, the initial
1320	        sending rate was two packets per round trip time.  In this
1321	        document, the initial sending rate can be as high as four
1322	        packets per round trip time, following RFC 3390.

1324	        Following Section 5.1 from [RFC4342], this document also
1325	        specifies that when the sending rate is reduced after an idle
1326	        period, it is not reduced below the initial sending rate.  In
1327	        addition, when the sender has been data-limited and the sender
1328	        is reducing the allowed transmit rate to twice the receive
1329	        rate, the sender doesn't reduce the allowed transmit rate to
1330	        less than the initial sending rate.

1332	        A larger initial sending rate is of little use if the receiver
1333	        sends a feedback packet after the first packet is received, and
1334	        the sender in response reduces the allowed sending rate to at
1335	        most twice the receive rate.  In the current document, the
1336	        sender does not reduce the allowed sending rate to at most twice
1337	        the receive rate in response to the first feedback packet.

1339	    o   RFC 3448 had contradictory text about whether the sender halved
1340	        its sending rate after *two* round-trip times without receiving
1341	        a feedback report, or after *four* round-trip times.  This
1342	        document clarifies that the sender halves its sending rate after
1343	        four round-trip times without receiving a feedback report
1344	        [RFC3448Err].

1346	    o   Section 4.4 was clarified to specify that on the expiration of
1347	        the nofeedback timer, if p = 0, step (2) applies instead of step
1348	        (1) [RFC3448Err].

1350	    o   A line in Section 5.5 was changed from ``for (i = 1 to n) { DF_i
1351	        = 1; }'' to ``for (i = 0 to n) { DF_i = 1; }'' [RFC3448Err].

1353	    o   Section 5.4 was modified to clarify the receiver's calculation
1354	        of the average loss interval when the receiver has not yet seen
1355	        eight loss intervals.

1357	    o   Section 4.1 was modified to give a specific algorithm that could
1358	        be used for estimating the average segment size.
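
	    As an illustration of the first change listed above, the sketch
	    below computes an initial sending rate from the initial window of
	    RFC 3390, min(4*MSS, max(2*MSS, 4380)) bytes.  Dividing that window
	    by the round-trip time to obtain a rate in bytes per second is an
	    assumption made here for illustration, as are the function names.

```python
def w_init(mss):
    """Initial window in bytes, following the formula in RFC 3390."""
    return min(4 * mss, max(2 * mss, 4380))

def initial_rate(mss, rtt):
    """Assumed initial rate: one initial window per round-trip time."""
    return w_init(mss) / rtt
```

	    With an MSS of 536 bytes the window is four packets; with an MSS of
	    1460 bytes it is three packets.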

1360	10.  Security Considerations

1362	    TFRC is not a transport protocol in its own right, but a congestion
1363	    control mechanism that is intended to be used in conjunction with a
1364	    transport protocol.  Therefore security primarily needs to be
1365	    considered in the context of a specific transport protocol and its
1366	    authentication mechanisms.

1368	    Congestion control mechanisms can potentially be exploited to create
1369	    denial of service.  This may occur through spoofed feedback.  Thus
1370	    any transport protocol that uses TFRC should take care to ensure
1371	    that feedback is only accepted from the receiver of the data.  The
1372	    precise mechanism to achieve this will however depend on the
1373	    transport protocol itself.

1375	    In addition, congestion control mechanisms may potentially be
1376	    manipulated by a greedy receiver that wishes to receive more than
1377	    its fair share of network bandwidth.  A receiver might do this by
1378	    claiming to have received packets that in fact were lost due to
1379	    congestion.  Possible defenses against such a receiver would
1380	    normally include some form of nonce that the receiver must feed back
1381	    to the sender to prove receipt.  However, the details of such a
1382	    nonce would depend on the transport protocol, and in particular on
1383	    whether the transport protocol is reliable or unreliable.

1385	    We expect that protocols incorporating ECN with TFRC will also want
1386	    to incorporate feedback from the receiver to the sender using the
1387	    ECN nonce [RFC3540].   The ECN nonce is a modification to ECN that
1388	    protects the sender from the accidental or malicious concealment of
1389	    marked packets.  Again, the details of such a nonce would depend on
1390	    the transport protocol, and are not addressed in this document.

1392	11.  IANA Considerations

1394	    There are no IANA actions required for this document.

1396	12.  Acknowledgments

1398	    We would like to acknowledge feedback and discussions on equation-
1399	    based congestion control with a wide range of people, including
1400	    members of the Reliable Multicast Research Group, the Reliable
1401	    Multicast Transport Working Group, and the End-to-End Research
1402	    Group.   We would like to thank Dado Colussi, Gorry Fairhurst, Ladan
1403	    Gharai, Wim Heirman, Eddie Kohler, Ken Lofgren, Mike Luby, Ian
1404	    McDonald, Michele R., Gerrit Renker, Arjuna Sathiaseelan, Vladica
1405	    Stanisic, Randall Stewart, Eduardo Urzaiz, Shushan Wen, and Wendy
1406	    Lee (lhh@zsu.edu.cn) for feedback on earlier versions of this
1407	    document, and to thank Mark Allman for his extensive feedback from
1408	    using the document to produce a working implementation.

1410	13.  Terminology

1412	    This document uses the following terms:

1414	    DF: discount factor for a loss interval

1416	    last_counter : greatest received value of the window counter

1418	    min_rate : minimum transmit rate

1420	    MSS : Maximum Segment Size (constant)

1422	    n : number of loss intervals

1424	    NDUPACK : number of dupacks for inferring loss (constant)

1426	    nofeedback timer : sender-side timer

1428	    p : measured Loss Event Rate

1430	    p_prev : previous value of p

1432	    q : filter constant for RTT (constant)

1434	    q2 : filter constant for long-term RTT (constant)

1436	    R : estimated path round-trip time

1438	    R_sample : measured path RTT

1440	    R_sqmean : estimated long-term RTT

1442	    s : nominal packet size in bytes (constant)

1444	    S : sequence number

1446	    t_delta : parameter for flexibility in send time

1448	    t_gran : scheduler granularity (constant)

1450	    t_ipi : calculated inter-packet interval for sending packets

1452	    t_mbi : maximum RTO value of TCP (constant)

1454	    tld : Time Last Doubled

1456	    t_now : current time

1458	    t_RTO : estimated RTO of TCP

1460	    X : allowed transmit rate
1461	    X_Bps : calculated sending rate in bytes per second

1463	    X_pps : calculated sending rate in packets per second

1465	    X_recv : estimated receive rate at the receiver

1467	    X_inst : instantaneous transmit rate

1469	    W_init : TCP initial window (constant)

1471	14.  Normative References

1473	15.  Informational References

1475	     [BRS99]        Balakrishnan, H., Rahul, H., and Seshan, S., "An
1476	                    Integrated Congestion Management Architecture for
1477	                    Internet Hosts," Proc. ACM SIGCOMM, Cambridge, MA,
1478	                    September 1999.

1480	     [FHPW00]       S. Floyd, M. Handley, J. Padhye, and J. Widmer,
1481	                    "Equation-Based Congestion Control for Unicast
1482	                    Applications", August 2000, Proc SIGCOMM 2000.

1484	     [FHPW00a]      S. Floyd, M. Handley, J. Padhye, and J. Widmer,
1485	                    "Equation-Based Congestion Control for Unicast
1486	                    Applications: the Extended Version", ICSI tech
1487	                    report TR-00-03, March 2000.

1489	     [PFTK98]       Padhye, J. and  Firoiu, V. and Towsley, D. and
1490	                    Kurose, J., "Modeling TCP Throughput: A Simple Model
1491	                    and its Empirical Validation", Proc ACM SIGCOMM
1492	                    1998.

1494	     [RFC2119]      S. Bradner, "Key Words For Use in RFCs to Indicate
1495	                    Requirement Levels", RFC 2119, March 1997.

1497	     [RFC2140]      J. Touch, "TCP Control Block Interdependence", RFC
1498	                    2140, April 1997.

1500	     [RFC2988]      V. Paxson and M. Allman, "Computing TCP's
1501	                    Retransmission Timer", RFC 2988, November 2000.

1503	     [RFC3168]      K. Ramakrishnan and S. Floyd, "The Addition of
1504	                    Explicit Congestion Notification (ECN) to IP", RFC
1505	                    3168, September 2001.

1507	     [RFC3390]      Allman, M., Floyd, S., and C. Partridge, "Increasing
1508	                    TCP's Initial Window", RFC 3390, October 2002.

1510	     [RFC3448Err]   RFC 3448 Errata, URL
1511	                    ``http://www.icir.org/tfrc/rfc3448.errata''.

1513	     [RFC3540]      Wetherall, D., Ely, D., and Spring, N., "Robust ECN
1514	                    Signaling with Nonces", RFC 3540, Experimental, June
1515	                    2003.

1517	     [RFC4340]      Kohler, E., Handley, M., and S. Floyd, "Datagram
1518	                    Congestion Control Protocol (DCCP)", RFC 4340, March
1519	                    2006.

1521	     [RFC4342]      Floyd, S., Kohler, E., and J. Padhye, "Profile for
1522	                    Datagram Congestion Control Protocol (DCCP)
1523	                    Congestion Control ID 3: TCP-Friendly Rate Control
1524	                    (TFRC)", RFC 4342, March 2006.

1526	     [TFRC-SP]      Floyd, S., and E. Kohler, "TCP Friendly Rate Control
1527	                    (TFRC): the Small-Packet (SP) Variant", Internet
1528	                    draft draft-ietf-dccp-tfrc-voip-07.txt, work in
1529	                    progress, November 2006.  Approved for Experimental.

1532	     [W00]          Widmer, J., "Equation-Based Congestion Control",
1533	                    Diploma Thesis, University of Mannheim, February
1534	                    2000.  URL "http://www.icir.org/tfrc/".

1536	16.  Authors' Addresses
1537	         Mark Handley
1538	         Department of Computer Science
1539	         University College London
1540	         Gower Street
1541	         London WC1E 6BT
1542	         UK
1543	         EMail: M.Handley@cs.ucl.ac.uk

1545	         Sally Floyd
1546	         ICIR/ICSI
1547	         1947 Center St, Suite 600
1548	         Berkeley, CA 94708
1549	         floyd@icir.org

1551	         Jitendra Padhye
1552	         Microsoft Research
1553	         padhye@microsoft.com

1555	         Joerg Widmer
1556	         Lehrstuhl Praktische Informatik IV
1557	         Universitat Mannheim
1558	         L 15, 16 - Room 415
1559	         D-68131 Mannheim
1560	         Germany
1561	         widmer@informatik.uni-mannheim.de

1563	Full Copyright Statement

1565	    Copyright (C) The IETF Trust (2007).

1567	    This document is subject to the rights, licenses and restrictions
1568	    contained in BCP 78, and except as set forth therein, the authors
1569	    retain all their rights.

1571	    This document and the information contained herein are provided on
1572	    an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
1573	    REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
1574	    IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
1575	    WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
1576	    WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE
1577	    ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
1578	    FOR A PARTICULAR PURPOSE.

1580	Intellectual Property

1582	    The IETF takes no position regarding the validity or scope of any
1583	    Intellectual Property Rights or other rights that might be claimed
1584	    to pertain to the implementation or use of the technology described
1585	    in this document or the extent to which any license under such
1586	    rights might or might not be available; nor does it represent that
1587	    it has made any independent effort to identify any such rights.
1588	    Information on the procedures with respect to rights in RFC
1589	    documents can be found in BCP 78 and BCP 79.

1591	    Copies of IPR disclosures made to the IETF Secretariat and any
1592	    assurances of licenses to be made available, or the result of an
1593	    attempt made to obtain a general license or permission for the use
1594	    of such proprietary rights by implementers or users of this
1595	    specification can be obtained from the IETF on-line IPR repository
1596	    at http://www.ietf.org/ipr.

1598	    The IETF invites any interested party to bring to its attention any
1599	    copyrights, patents or patent applications, or other proprietary
1600	    rights that may cover technology that may be required to implement
1601	    this standard.  Please address the information to the IETF at ietf-
1602	    ipr@ietf.org.