idnits 2.17.1 

draft-ietf-rmt-bb-tfmcc-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 16.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 1363.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead
     of the newer disclaimer which includes the IETF Trust according to RFC
     4748.

  ** The document seems to lack an RFC 3979 Section 5, para. 1 IPR Disclosure
     Acknowledgement. 

  ** The document seems to lack an RFC 3979 Section 5, para. 2 IPR Disclosure
     Acknowledgement. 

  ** The document seems to lack an RFC 3979 Section 5, para. 3 IPR Disclosure
     Invitation. 


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There is 1 instance of too long lines in the document, the longest one
     being 1 character in excess of 72.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (9 May 2006) is 6561 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Downref: Normative reference to an Informational RFC: RFC 3048 (ref. '1')

  ** Downref: Normative reference to an Informational RFC: RFC 3269 (ref. '3')

  -- Obsolete informational reference (is this intentional?): RFC 3941 (ref.
     '6') (Obsoleted by RFC 5401)

  -- Obsolete informational reference (is this intentional?): RFC 2481 (ref.
     '10') (Obsoleted by RFC 3168)

  -- Obsolete informational reference (is this intentional?): RFC 1889 (ref.
     '12') (Obsoleted by RFC 3550)


     Summary: 10 errors (**), 0 flaws (~~), 2 warnings (==), 7 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Internet Engineering Task Force                                   RMT WG
2	INTERNET-DRAFT                             Joerg Widmer/DoCoMo Euro-Labs
3	draft-ietf-rmt-bb-tfmcc-07.txt                          Mark Handley/UCL

5	                                                              9 May 2006
6	                                                  Expires: November 2006

8	           TCP-Friendly Multicast Congestion Control (TFMCC):
9	                         Protocol Specification

11	Status of this Document

13	By submitting this Internet-Draft, each author represents that any
14	applicable patent or other IPR claims of which he or she is aware have
15	been or will be disclosed, and any of which he or she becomes aware will
16	be disclosed, in accordance with Section 6 of BCP 79.

18	Internet-Drafts are working documents of the Internet Engineering Task
19	Force (IETF), its areas, and its working groups.  Note that other groups
20	may also distribute working documents as Internet-Drafts.

22	Internet-Drafts are draft documents valid for a maximum of six months
23	and may be updated, replaced, or obsoleted by other documents at any
24	time.  It is inappropriate to use Internet- Drafts as reference material
25	or to cite them other than as "work in progress."

27	The list of current Internet-Drafts can be accessed at
28	http://www.ietf.org/ietf/1id-abstracts.txt

30	The list of Internet-Draft Shadow Directories can be accessed at
31	http://www.ietf.org/shadow.html.

33	This document is a product of the IETF RMT WG.  Comments should be
34	addressed to the authors, or the WG's mailing list at rmt@lbl.gov.

36	                                Abstract

38	     This document specifies TCP-Friendly Multicast Congestion
39	     Control (TFMCC).  TFMCC is a congestion control mechanism for
40	     multicast transmissions in a best-effort Internet environment.

42	     It is a single-rate congestion control scheme, where the
43	     sending rate is adapted to the receiver experiencing the worst
44	     network conditions.  TFMCC is reasonably fair when competing
45	     for bandwidth with TCP flows and has a relatively low
46	     variation of throughput over time, making it suitable for
47	     applications such as streaming media where a relatively smooth
48	     sending rate is of importance.

50	                           Table of Contents

52	     1. Introduction. . . . . . . . . . . . . . . . . . . . . .   4
53	      1.1. Terminology. . . . . . . . . . . . . . . . . . . . .   4
54	      1.2. Related Documents. . . . . . . . . . . . . . . . . .   5
55	      1.3. Environmental Requirements and Considerations. . . .   5
56	     2. Protocol Overview . . . . . . . . . . . . . . . . . . .   6
57	      2.1. TCP Throughput Equation. . . . . . . . . . . . . . .   7
58	      2.2. Packet Contents. . . . . . . . . . . . . . . . . . .   8
59	       2.2.1. Sender Packets. . . . . . . . . . . . . . . . . .   9
60	       2.2.2. Feedback Packets. . . . . . . . . . . . . . . . .  10
61	     3. Data Sender Protocol. . . . . . . . . . . . . . . . . .  11
62	      3.1. Sender Initialization. . . . . . . . . . . . . . . .  11
63	      3.2. Determining the Maximum RTT. . . . . . . . . . . . .  11
64	      3.3. Adjusting the Sending Rate . . . . . . . . . . . . .  12
65	      3.4. Controlling Receiver Feedback. . . . . . . . . . . .  13
66	      3.5. Assisting Receiver-Side RTT Measurements . . . . . .  14
67	      3.6. Slowstart. . . . . . . . . . . . . . . . . . . . . .  15
68	      3.7. Scheduling of Packet Transmissions . . . . . . . . .  16
69	     4. Data Receiver Protocol. . . . . . . . . . . . . . . . .  17
70	      4.1. Receiver Initialization. . . . . . . . . . . . . . .  17
71	      4.2. Receiver Leave . . . . . . . . . . . . . . . . . . .  17
72	      4.3. Measurement of the Network Conditions. . . . . . . .  18
73	       4.3.1. Updating the Loss Event Rate. . . . . . . . . . .  18
74	       4.3.2. Basic Round-Trip Time Measurement . . . . . . . .  18
75	       4.3.3. One-Way Delay Adjustments . . . . . . . . . . . .  19
76	       4.3.4. Receive Rate Measurements . . . . . . . . . . . .  19
77	      4.4. Setting the Desired Rate . . . . . . . . . . . . . .  20
78	      4.5. Feedback and Feedback Suppression. . . . . . . . . .  20
79	     5. Calculation of the Loss Event Rate. . . . . . . . . . .  22
80	      5.1. Detection of Lost or Marked Packets. . . . . . . . .  22
81	      5.2. Translation from Loss History to Loss Events . . . .  23
82	      5.3. Inter-Loss Event Interval. . . . . . . . . . . . . .  24
83	      5.4. Average Loss Interval. . . . . . . . . . . . . . . .  24
84	      5.5. History Discounting. . . . . . . . . . . . . . . . .  25
85	      5.6. Initializing the Loss History after the First
86	      Loss Event. . . . . . . . . . . . . . . . . . . . . . . .  28
87	     6. Security Considerations . . . . . . . . . . . . . . . .  29
88	     7. IANA Considerations . . . . . . . . . . . . . . . . . .  29
89	     8. Authors' Addresses. . . . . . . . . . . . . . . . . . .  30
90	     9. Acknowledgments . . . . . . . . . . . . . . . . . . . .  30
91	     10. References . . . . . . . . . . . . . . . . . . . . . .  30
92	      10.1. Normative References. . . . . . . . . . . . . . . .  30
93	      10.2. Informative References. . . . . . . . . . . . . . .  30
94	     11. Copyright Notice . . . . . . . . . . . . . . . . . . .  31

96	1.  Introduction

98	This document specifies TCP-Friendly Multicast Congestion Control
99	(TFMCC) [4].  TFMCC is a source-based, single-rate congestion control
100	scheme that builds upon the unicast TCP-Friendly Rate Control mechanism
101	(TFRC) [5].  TFMCC is stable and responsive under a wide range of
102	network conditions and scales to receiver sets on the order of several
103	thousand receivers.  To support scalability, as much congestion control
104	functionality as possible is located at the receivers.  Each receiver
105	continuously determines a desired receive rate that is TCP-friendly for
106	the path from the sender to this receiver.  Selected receivers then
107	report the rate to the sender in feedback packets.

109	TFMCC is a building block as defined in RFC 3048 [1].  Instead of
110	specifying a complete protocol, this document simply specifies a
111	congestion control mechanism that could be used in a transport protocol
112	such as RTP [12], in an application incorporating end-to-end congestion
113	control at the application level.  This document does not discuss packet
114	formats, reliability, or implementation-related issues.

116	TFMCC is designed to be reasonably fair when competing for bandwidth
117	with TCP flows.  A multicast flow is ``reasonably fair'' if its sending
118	rate is generally within a factor of two of the sending rate of a TCP
119	flow from the sender to the slowest receiver of the multicast group
120	under the same network conditions.

122	In general, TFMCC has a low variation of throughput, which makes it
123	suitable for applications such as streaming media where a relatively
124	smooth sending rate is of importance.  The penalty of having smooth
125	throughput while competing fairly for bandwidth is a reduced
126	responsiveness to changes in available bandwidth.  Thus TFMCC should be
127	used when the application has a requirement for smooth throughput, in
128	particular, avoiding halving of the sending rate in response to a single
129	packet drop.  For applications that simply need to multicast as much
130	data as possible in as short a time as possible, PGMCC [11] may be more
131	suitable.

133	1.1.  Terminology

135	In this document, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL",
136	"SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
137	"OPTIONAL" are to be interpreted as described in BCP 14, RFC 2119 [2]
138	and indicate requirement levels for compliant TFMCC implementations.

140	1.2.  Related Documents

142	As described in RFC 3048 [1], TFMCC is a building block that is intended
143	to be used, in conjunction with other building blocks, to help specify a
144	protocol instantiation.  It follows the general guidelines provided in
145	RFC 3269 [3].  In particular, TFMCC is a suitable congestion control
146	building block for NACK-Oriented Reliable Multicast (NORM) [6].

148	1.3.  Environmental Requirements and Considerations

150	TFMCC is intended to be a congestion control scheme that can be used in
151	a complete protocol instantiation that delivers objects and streams
152	(both reliable content delivery and streaming of multimedia
153	information).

155	TFMCC is most applicable for sessions that deliver a substantial amount
156	of data, i.e., in length from hundreds of kilobytes to many gigabytes,
157	and whose duration is in the order of tens of seconds or more.

159	TFMCC is intended for multicast delivery.  There are currently two
160	models of multicast delivery, the Any-Source Multicast (ASM) model as
161	defined in [7] defined in [8].  TFMCC works with both multicast models,
162	but in a slightly different way.  When using ASM, feedback from the
163	receivers is multicast to the sender as well as to all other receivers.
164	Feedback can be either multicast on the same group address used for
165	sending data or on a separate multicast feedback group address.  For
166	SSM, the receivers must unicast the feedback directly to the sender.
167	Hence, feedback from a receiver will not be received by other receivers.

169	TFMCC inherently works with all types of networks that allow bi-
170	directional communication, including LANs, WANs, Intranets, the
171	Internet, asymmetric networks, wireless networks, and satellite
172	networks.  However, in some network environments varying the sending
173	rate to the receivers may not be advantageous (e.g., for a satellite or
174	wireless network, there may be no mechanism for receivers to effectively
175	reduce their reception rate since there may be a fixed transmission rate
176	allocated to the session).

178	The difference in responsiveness of TFMCC and TCP may result in
179	significant throughput differences in case of a very low bitrate.  TFMCC
180	requires an estimate of the loss event rate to calculate a fair sending
181	rate.  This estimate may be inaccurate in case TFMCC receives only very
182	few packets per RTT.  TFMCC should not be used together with TCP if the
183	capacity of the bottleneck link is less than 30KBit/s (e.g., a very slow
184	modem connection).  TFMCC may also achieve a rate that is very different
185	from the average TCP rate in case buffer space at the bottleneck is
186	severely underprovisioned.  In particular, TFMCC is less susceptible to
187	small buffer sizes since TFMCC spaces out packets in time, while TCP
188	sends them back-to-back.  Thus TCP is much more likely to see a packet
189	loss if buffer space is scarce.

191	TFMCC is designed for applications that use a fixed packet size, and
192	vary their sending rate in packets per second in response to congestion.
193	Some applications, e.g. those using audio, require a fixed interval of
194	time between packets and vary their packet size instead of their packet
195	rate in response to congestion.  The congestion control mechanism in
196	this document cannot be used by those applications.

198	2.  Protocol Overview

200	TFMCC extends the basic mechanisms of TFRC into the multicast domain.
201	In order to compete fairly with TCP, TFMCC receivers individually
202	measure the prevalent network conditions and calculate a rate that is
203	TCP-friendly on the path from the sender to themselves.  The rate is
204	determined using an equation for TCP throughput, which roughly describes
205	TCP's sending rate as a function of the loss event rate, round-trip time
206	(RTT), and packet size.  We define a loss event as one or more lost or
207	marked packets from the packets received during one RTT, where a marked
208	packet refers to a congestion indication from Explicit Congestion
209	Notification (ECN) [10].  The sending rate of the multicast transmission
210	is adapted to the receiver experiencing the worst network conditions.

212	Basically, TFMCC's congestion control mechanism works as follows:

214	o Each receiver measures the loss event rate and its RTT to the sender.

216	o Each receiver then uses this information, together with an equation
217	  for TCP throughput, to derive a TCP-friendly sending rate.

219	o Through a distributed feedback suppression mechanism, only a subset of
220	  the receivers are allowed to give feedback to prevent a feedback
221	  implosion at the sender.  The feedback mechanism ensures that
222	  receivers reporting a low desired transmission rate have a high
223	  probability of sending feedback.

225	o Receivers whose feedback is not suppressed report the calculated
226	  transmission rate back to the sender in so-called receiver reports.
227	  The receiver reports serve two purposes: they inform the sender about
228	  the appropriate transmit rate, and they allow the receivers to measure
229	  their RTT.

231	o The sender selects the receiver that reports the lowest rate as
232	  current limiting receiver (CLR).  Whenever feedback with an even lower
233	  rate reaches the sender, the corresponding receiver becomes CLR and
234	  the sending rate is reduced to match that receiver's calculated rate.
235	  The sending rate increases when the CLR reports a calculated rate
236	  higher than the current sending rate.

238	The dynamics of TFMCC are sensitive to how the measurements are
239	performed and applied and what feedback suppression mechanism is chosen.
240	We recommend specific mechanisms below to perform and apply these
241	measurements.  Other mechanisms are possible, but it is important to
242	understand how the interactions between mechanisms affect the dynamics
243	of TFMCC.

245	2.1.  TCP Throughput Equation

247	Any realistic equation giving TCP throughput as a function of loss event
248	rate and RTT should be suitable for use in TFMCC.  However, we note that
249	the TCP throughput equation used must reflect TCP's retransmit timeout
250	behavior, as this dominates TCP throughput at higher loss rates.  We
251	also note that the assumptions implicit in the throughput equation about
252	the loss event rate parameter have to be a reasonable match to how the
253	loss rate or loss event rate is actually measured.  While this match is
254	not perfect for the throughput equation and loss rate measurement
255	mechanisms given below, in practice the assumptions turn out to be close
256	enough.

258	The throughput equation we currently recommend for TFMCC is a slightly
259	simplified version of the throughput equation for Reno TCP from [9]:

261	                                    8 s
262	     X =  ---------------------------------------------------------   (1)
263	           R * (sqrt(2*p/3) + (12*sqrt(3*p/8) * p * (1+32*p^2)))

265	where

267	     X is the transmit rate in bits/second.

269	     s is the packet size in bytes.

271	     R is the round-trip time in seconds.

273	     p is the loss event rate, between 0.0 and 1.0, of the number of
274	     loss events as a fraction of the number of packets transmitted.

276	In future, different TCP equations may be substituted for this equation.
277	The requirement is that the throughput equation be a reasonable
278	approximation of the sending rate of TCP for conformant TCP congestion
279	control.

281	The parameters s (packet size), p (loss event rate) and R (RTT) need to
282	be measured or calculated by a TFMCC implementation.  The measurement of
283	R is specified in Section 4.3.2, and the measurement of p is specified
284	in Section 5.  The parameter s (packet size) is normally known to an
285	application.  This may not be so in two cases:

287	o The packet size naturally varies depending on the data.  In this case,
288	  although the packet size varies, that variation is not coupled to the
289	  transmit rate.  It should normally be safe to use an estimate of the
290	  mean packet size for s.

292	o The application needs to change the packet size rather than the number
293	  of packets per second to perform congestion control.  This would
294	  normally be the case with packet audio applications where a fixed
295	  interval of time needs to be represented by each packet.  Such
296	  applications need to have a different way of measuring parameters.

298	Currently, TFMCC cannot be used for the second class of applications.

300	2.2.  Packet Contents

302	Before specifying the sender and receiver functionality, we describe the
303	congestion control information contained in packets sent by the sender
304	and feedback packets from the receivers.  Information from the sender
305	can either be sent in separate congestion control messages or
306	piggybacked onto data packets.  If separate congestion control messages
307	are sent at time intervals larger than the time interval between data
308	packets (e.g., once per feedback round), it is necessary to be able to
309	include timestamp information destined for more than one receiver to
310	allow a sufficient number of receivers to measure their RTT.

312	As TFMCC will be used along with a transport protocol, we do not specify
313	packet formats, since these depend on the details of the transport
314	protocol used.  The recommended representation of the header fields is
315	given below.  Alternatively, if the computational overhead of a floating
316	point representation is prohibitive, fixed point arithmetic can be used
317	at the expense of larger packet headers.

319	2.2.1.  Sender Packets

321	Each packet sent by the data sender contains the following information:

323	o A sequence number i.  This number is incremented by one for each data
324	  packet transmitted.  The field must be sufficiently large that it does
325	  not wrap causing two different packets with the same sequence number
326	  to be in the receiver's recent packet history at the same time.  In
327	  most cases the sequence number will be supplied by the transport
328	  protocol used along with TFMCC.

330	o A suppression rate X_supp in bits/s.  Only receivers with a calculated
331	  rate lower than the suppression rate are eligible to give feedback,
332	  unless their RTT is higher than the maximum RTT described below in
333	  which case they are also eligible to give feedback.  The suppression
334	  rate should be represented as a 12 bit floating point value with 5
335	  bits for the unsigned exponent and 7 bits for the unsigned mantissa
336	  (to represent rates from 100 bit/s to 400 Gbit/s with an error of less
337	  than 1%).

339	o A timestamp ts_i indicating when the packet is sent.  The resolution
340	  of the timestamp should typically be milliseconds and the timestamp
341	  should be an unsigned integer value no less than 16 bits wide.

343	o A receiver ID r and a copy of the timestamp tr_r' = tr_r of that
344	  receiver's last report, which allows the receiver to measure its RTT.
345	  If there is a delay ts_d between receiving the report from receiver r
346	  and sending the data packet, then tr_r' = tr_r + ts_d is included in
347	  the packet instead.  The receiver ID is described in the next section.
348	  The resolution of the timestamp echo should be milliseconds and the
349	  timestamp should be an unsigned integer value no less than 16 bits
350	  wide.  If separate instead of piggybacked congestion control messages
351	  are used, the packet needs to contain a list of receiver IDs with
352	  corresponding timestamps to allow a sufficient number of receivers to
353	  simultaneously measure their RTT.  For the default values used for the
354	  feedback process this corresponds to a list size on the order of 10 to
355	  20 entries.

357	o A flag is_CLR indicating whether the receiver with ID r is the CLR.

359	o A feedback round counter fb_nr.  This counter is incremented by the
360	  sender at the beginning of a new feedback round to notify the
361	  receivers than all feedback for older rounds should be suppressed.
362	  The feedback round counter should be at least 4 bits wide.

364	o A maximum RTT value R_max, representing the maximum of the RTTs of all
365	  receivers.  The RTT should be measured in milliseconds.  An 8 bit
366	  floating point value with 4 bits for the unsigned exponent and 4 bits
367	  for the unsigned mantissa (to represent RTTs from 1 millisecond to 64
368	  seconds with an error of ca. 6%) should be used for the
369	  representation.

371	2.2.2.  Feedback Packets

373	Each feedback packet sent by a data receiver contains the following
374	information:

376	o A unique receiver ID r.  In most cases the receiver ID will be
377	  supplied by the transport protocol, but it may simply be the IP
378	  address of the receiver.

380	o A flag have_RTT indicating whether the receiver has made at least one
381	  RTT measurement since it joined the session.

383	o A flag have_loss indicating whether the receiver experienced at least
384	  one loss event since it joined the session.

386	o A flag receiver_leave indicating that the receiver will leave the
387	  session (and should therefore not be CLR).

389	o A timestamp tr_r indicating when the feedback packet is sent.  The
390	  representation of the timestamp should be the same as that of the
391	  timestamp echo in the data packets.

393	o An echo ts_i' of the timestamp of the last data packet received.  If
394	  the last packet received at the receiver has sequence number i, then
395	  ts_i' = ts_i is included in the feedback.  If there is a delay tr_d
396	  between receiving that last data packet and sending feedback, then
397	  ts_i' = ts_i + tr_d is included in the feedback instead.  The
398	  representation of the timestamp echo should be the same as that of the
399	  timestamp in the data packets.

401	o A feedback round echo fb_nr, reflecting the highest feedback round
402	  counter value received so far.  The representation of the feedback
403	  round echo should be the same as the one used for the feedback round
404	  counter in the data packets.

406	o The desired sending rate X_r.  This is the rate calculated by the
407	  receiver to be TCP-friendly on the path from the sender to this
408	  receiver.  The representation of the desired sending rate should be
409	  the same as that of the suppression rate in the data packets.

411	3.  Data Sender Protocol

413	The data sender multicasts a stream of data packets to the data
414	receivers at a controlled rate.  Whenever feedback is received, the
415	sender checks if it is necessary to switch CLRs and to readjust the
416	sending rate.

418	The main tasks that have to be provided by a TFMCC sender are:

420	o adjusting the sending rate,

422	o controlling receiver feedback, and

424	o assisting receiver-side RTT measurements.

426	3.1.  Sender Initialization

428	At initialization of the sender, the maximum RTT is set to a value that
429	should be larger than the highest RTT to any of the receivers.  It
430	should not be smaller than 500 milliseconds for operation in the public
431	Internet.  The sending rate X is initialized to 1 packet per maximum
432	RTT.

434	3.2.  Determining the Maximum RTT

436	For each feedback packet that arrives at the sender, the sender computes
437	the instantaneous RTT to the receiver as

439	     R_r = ts_now - ts_i'

441	where ts_now is the time the feedback packet arrived.  Receivers will
442	have adjusted ts_i' for the time interval between receiving the last
443	data packet and sending the corresponding report so that this interval
444	will not be included in R_r.  If the actual RTT is smaller than the
445	resolution of the timestamps and ts_now equals ts_i', then R_r is set to
446	the smallest positive RTT value larger than 0 (i.e., 1 millisecond in
447	our case).  If the instantaneous RTT is larger than the current maximum
448	RTT, the maximum RTT is increased to that value

450	     R_max = R_r

452	Otherwise, if no feedback with a higher instantaneous RTT than the
453	maximum RTT is received during a feedback round (see Section 3.4), the
454	maximum RTT is reduced to
455	     R_max = MAX(R_max * 0.9, R_peak)

457	where R_peak is the peak receiver RTT measured during the feedback
458	round.

460	The maximum RTT is mainly used for feedback suppression among receivers
461	with heterogeneous RTTs.  Feedback suppression is closely coupled to the
462	sending of data packets and for this reason, the maximum RTT must not
463	decrease below the maximum time interval between consecutive data
464	packets:

466	     R_max = max(R_max, 8s/X + ts_gran)

468	where ts_gran is the granularity of the sender's system clock (see
469	Section 3.7).

471	3.3.  Adjusting the Sending Rate

473	When a feedback packet from receiver r arrives at the sender, the sender
474	has to check whether it is necessary to adjust the transmission rate and
475	to switch to a new CLR.

477	How the rate is adjusted depends on the desired rate X_r of the receiver
478	report.  We distinguish four cases:

480	1    If no CLR is present, receiver r becomes the current limiting
481	     receiver.  The sending rate X is directly set to X_r, so long as
482	     this would result in a rate increase of less than 8s/R_max bits/s
483	     (i.e., 1 packet per R_max).  Otherwise X is gradually increased to
484	     X_r at an increase rate of no more than 8s/R_max bits/s every R_max
485	     seconds.

487	2    If receiver r is not the CLR but a CLR is present, then receiver r
488	     becomes the current limiting receiver if X_r is less than the
489	     current sending rate X and the receiver_leave flag of that
490	     receiver's report is not set.  Furthermore, the sending rate is
491	     reduced to X_r.

493	3    If receiver r is not the CLR but a CLR is present and the
494	     receiver_leave flag of the CLR's last report was set, then receiver
495	     r becomes the current limiting receiver.  However, if X_r > X, the
496	     sending rate is not increased to X_r for the duration of a feedback
497	     round to allow other (lower rate) receivers to give feedback and be
498	     selected as CLR.

500	4    If receiver r is the CLR, the sending rate is set to the minimum of
501	     X_r and X + 8s/R_max bits/s.

503	If the receiver has not yet measured its RTT but already experienced
504	packet loss (indicated by the corresponding flags in the receiver
505	report), the receiver report will include a desired rate that is based
506	on the maximum RTT rather than the actual RTT to that receiver.  In this
507	case, the sender adjusts the desired rate using its measurement of the
508	instantaneous RTT R_r to that receiver:

510	     X_r' = X_r * R_max / R_r

512	X_r' is then used instead of X_r to detect whether to switch to a new
513	CLR.

515	If the TFMCC sender receives no reports from the CLR for 4 RTTs, the
516	sending rate is cut in half unless the CLR was selected less than 10
517	RTTs ago.  In addition, if the sender receives no reports from the CLR
518	for at least 10 RTTs, it assumes that the CLR crashed or left the group.
519	A new CLR is selected from the feedback that subsequently arrives at the
520	sender, and we increase as in case 3 above.

522	If no new CLR can be selected (i.e., in the absence of any feedback from
523	any of the receivers) it is necessary to further reduce the sending
524	rate.  For every 10 consecutive RTTs without feedback, the sending rate
525	is cut in half.  The rate is at most reduced to one packet every 8
526	seconds.

528	Note that when receivers stop receiving data packets, they will stop
529	sending feedback.  This eventually causes the sending rate to be reduced
530	in the case of network failure.  If the network subsequently recovers, a
531	linear increase to the calculated rate of the CLR will occur at 8s/R_max
532	bits/s every R_max.

534	3.4.  Controlling Receiver Feedback

536	The receivers allowed to send a receiver report are determined in so-
537	called feedback rounds.  Feedback rounds have a duration T of six times
538	the maximum RTT.  In case the multicast model is ASM, i.e., receiver
539	feedback is multicast to the whole group, the duration of a feedback
540	round may be reduced to four times the maximum RTT.

542	Only receivers wishing to report a rate that is lower than the
543	suppression rate X_supp, or those with a higher RTT than R_max may send
544	feedback.  At the beginning of each feedback round, X_supp is set to the
545	highest possible value that can be represented.  When feedback arrives
546	at the sender over the course of a feedback round, X_supp is decreased
547	such that more and more feedback is suppressed towards the end of the
548	round.  How receiver feedback is spread out over the feedback round is
549	discussed in Section 4.5.

551	Whenever non-CLR feedback for the current round arrives at the sender,
552	X_supp is reduced to

554	     X_supp = (1-g) * X_r

556	if X_supp > X_r.  Feedback that causes the corresponding receiver to be
557	selected as CLR, but was from a non-CLR receiver at the time of sending
558	also contributes to the feedback suppression.  Note that X_r must not be
559	adjusted by the sender to reflect the receiver's real RTT in case X_r
560	was calculated using the maximum RTT, as is done for setting the sending
561	rate (Section 3.3), otherwise a feedback implosion is possible.  The
562	parameter g determines to what extent higher rate feedback can suppress
563	lower rate feedback.  This mechanism guarantees, that the lowest
564	calculated rate reported lies within a factor of g of the actual lowest
565	calculated rate of the receiver set (see [14]).  A value of g of 0.1 is
566	recommended.

568	To allow receivers to suppress their feedback, the sender's suppression
569	rate needs to be updated whenever feedback is received.  This
570	suppression rate has to be communicated to the receivers in a timely
571	manner, either by including it in the data packet header or, if separate
572	congestion control messages are used, by sending a message with the
573	suppression rate whenever the rate changes significantly (i.e., when it
574	is reduced to less than (1-g) times the previously advertised
575	suppression rate).

577	After a time span of T, the feedback round ends if non-CLR feedback was
578	received during that time.  Otherwise, the feedback round ends as soon
579	as the first non-CLR feedback message arrives at the sender but at most
580	after 2T.  The feedback round counter is incremented by one and the
581	suppression rate X_supp is reset to the highest representable value.
582	The feedback round counter restarts with round 0 after a wrap-around.

584	3.5.  Assisting Receiver-Side RTT Measurements

586	Receivers measure their RTT by sending a timestamp with a receiver
587	report, which is echoed by the sender.  If congestion control
588	information is piggybacked onto data packets, usually only one receiver
589	ID and timestamp can be included.  If multiple feedback messages from
590	different receivers arrive at the sender during the time interval
591	between two data packets, the sender has to decide which receiver to
592	allow to measure RTT.  The same applies if separate congestion control
593	messages allow to echo multiple receiver timestamps simultaneously but
594	the number of receivers that gave feedback since the last congestion
595	control message exceeds the list size.

597	The sender's timestamp echoes are prioritized in the following order:

599	1.   a new CLR (after a change of CLR's) or a CLR without any previous
600	     RTT measurements

602	2.   receivers without any previous RTT measurements in the order of the
603	     feedback round echo of the corresponding receiver report (i.e.,
604	     older feedback first)

606	3.   non-CLR receivers with previous RTT measurements, again in
607	     ascending order of the feedback round echo of the report

609	4.   the CLR

611	Ties are broken in favor of the receiver with the lowest reported rate.

613	It is necessary to account for the time that elapses between receiving a
614	report and sending the next data packet.  This time needs to be deducted
615	from the RTT and thus has to be added to the receiver's timestamp value.

617	Whenever no feedback packets arrive in the interval between two data
618	packets, the CLR's last timestamp, adjusted by the appropriate offset,
619	is echoed.  When the number of packets per RTT is so low that all
620	packets carry a non-CLR receiver's timestamp, the CLR's timestamp and ID
621	are included in a data packet at least once per feedback round.

623	3.6.  Slowstart

625	TFMCC uses a slowstart mechanism to quickly approach its fair bandwidth
626	share at the start of a session.  During slowstart, the sending rate
627	increases exponentially.  The rate increase is limited to the minimum of
628	the rates included in the receiver reports and receivers report twice
629	the rate at which they currently receive data.  As in normal congestion
630	control mode, the receiver with the smallest reported rate becomes CLR.
631	Since a receiver can never receive data at a rate higher than its link
632	bandwidth, this effectively limits the overshoot to twice this
633	bandwidth.  In case the resulting increase over R_max is less than
634	8s/R_max bits/s, the sender may choose to increase the rate by up to
635	8s/R_max bits/s every R_max.  The current sending rate is gradually
636	adjusted to the target rate reported in the receiver reports over the
637	course of a RTT.  Slowstart is terminated as soon as any one of the
638	receivers experiences its first packet loss.  Since that receiver's
639	calculated rate will be lower than the current sending rate, the
640	receiver will be selected as CLR.

642	During slowstart, the upper bound on the rate increase of 8s/R_max
643	bits/s every RTT does not apply.  Only after the TFMCC sender receives
644	the first report with the have_loss flag set is the rate increase
645	limited in this way.

647	Slowstart may also be used after the sender has been idle for some time,
648	to quickly reach the previous sending rate.  When the sender stops
649	sending data packets, it records the current sending rate X' = X.  Every
650	10 RTTs, the allowed sending rate will be halved due to lack of receiver
651	feedback as specified in Section 3.3.  This halving may take place
652	multiple times.  When the sender resumes, it may perform a slowstart
653	from the current allowed rate up to the recorded rate X'.  Slowstart
654	ends after the first packet loss by any of the receivers or as soon as
655	X' is reached.

657	To this end, receivers have to clear the have_loss flag after 10 RTTs
658	without data packets as specified in Section 4.3.1.  The have_loss flag
659	is only used during slowstart.  Therefore, clearing the flag has no
660	effect if no packets arrived due to network partitioning or packet loss.

662	3.7.  Scheduling of Packet Transmissions

664	As TFMCC is rate-based, and as operating systems typically cannot
665	schedule events precisely, it is necessary to be opportunistic about
666	sending data packets so that the correct average rate is maintained
667	despite the coarse-grain or irregular scheduling of the operating
668	system.  Thus a typical sending loop will calculate the correct inter-
669	packet interval, ts_ipi, as follows:

671	     ts_ipi = 8s/X

673	When a sender first starts sending at time t_0, it calculates ts_ipi,
674	and calculates a nominal send time t_1 = t_0 + ts_ipi for packet 1.
675	When the application becomes idle, it checks the current time, ts_now,
676	and then requests re-scheduling after (ts_ipi - (ts_now - t_0)) seconds.
677	When the application is re-scheduled, it checks the current time,
678	ts_now, again.  If (ts_now > t_1 - delta) then packet 1 is sent (see
679	below for delta).

681	Now a new ts_ipi may be calculated, and used to calculate a nominal send
682	time t_2 for packet 2: t_2 = t_1 + ts_ipi.  The process then repeats,
683	with each successive packet's send time being calculated from the
684	nominal send time of the previous packet.  Note that the actual send
685	time ts_i and not the nominal send time is included as timestamp in the
686	packet header.

688	In some cases, when the nominal send time, t_i, of the next packet is
689	calculated, it may already be the case that ts_now > t_i - delta.  In
690	such a case the packet should be sent immediately.  Thus if the
691	operating system has coarse timer granularity and the transmit rate is
692	high, then TFMCC may send short bursts of several packets separated by
693	intervals of the OS timer granularity.

695	The parameter delta is to allow a degree of flexibility in the send time
696	of a packet.  If the operating system has a scheduling timer granularity
697	of ts_gran seconds, then delta would typically be set to:

699	     delta = min(ts_ipi/2, ts_gran/2)

701	ts_gran is 10 milliseconds on many Unix systems.  If ts_gran is not
702	known, a value of 10 milliseconds can be safely assumed.

704	4.  Data Receiver Protocol

706	Receivers measure the current network conditions, namely RTT and loss
707	event rate, and use this information to calculate a rate that is fair to
708	competing traffic.  The rate is then communicated to the sender in
709	receiver reports.  Due to the potentially large number of receivers, it
710	is undesirable that all receivers send reports, especially not at the
711	same time.

713	In the description of the receiver functionality, we will first address
714	how the receivers measure the network parameters and then discuss the
715	feedback process.

717	4.1.  Receiver Initialization

719	The receiver is initialized when it receives the first data packet.  The
720	RTT is set to the maximum RTT value contained in the data packet.  This
721	initial value is used as the receiver's RTT until the first real RTT
722	measurement is made.  The loss event rate is initialized to 0.  Also,
723	the flags receiver_leave, have_RTT, and have_loss are cleared.

725	4.2.  Receiver Leave

727	A receiver that sends feedback but wishes to leave the TFMCC session
728	within the next feedback round may indicate the pending leave by setting
729	the receiver_leave flag in its report.  If the leaving receiver is the
730	CLR, the receiver_leave flag should be set for all the reports within
731	the feedback round before the leave takes effect.

733	4.3.  Measurement of the Network Conditions

735	Receivers have to update their estimate of the network parameters with
736	each new data packet they receive.

738	4.3.1.  Updating the Loss Event Rate

740	When a data packet is received, the receiver adds the packet to the
741	packet history.  It then recalculates the new value of the loss event
742	rate p.  The loss event rate measurement mechanism is described
743	separately in Section 5.

745	When a loss event is detected, the flag have_loss is set.  In case no
746	data packets are received for 10 consecutive RTTs, the flag is cleared
747	to allow the sender to slowstart.  It is set again when new data packets
748	arrive and a loss event is detected.

750	4.3.2.  Basic Round-Trip Time Measurement

752	When a receiver gets a data packet that carries the receiver's own ID in
753	the r field, the receiver updates its RTT estimate.

755	1.   The current RTT is calculated as:

757	          R_sample = tr_now - tr_r'

759	     where tr_now is the time the data packet arrives at the receiver
760	     and tr_r' is the receiver report timestamp echoed in the data
761	     packet.  If the actual RTT is smaller than the resolution of the
762	     timestamps and tr_now equals tr_r', then R_sample is set to the
763	     smallest positive RTT value larger than 0 (i.e., 1 millisecond in
764	     our case).

766	2.   The smoothed RTT estimate R is updated:

768	          If no feedback has been received before
769	              R = R_sample
770	          Else
771	              R = q*R + (1-q)*R_sample

773	     A filter parameter q of 0.5 is recommended for non-CLR receivers.
774	     The CLR performs RTT measurements much more frequently and hence
775	     should use a higher filter value.  We recommend using q=0.9.  Note
776	     that TFMCC is not sensitive to the precise value for the filter
777	     constant.

779	Optionally, sender-based instead of receiver-based RTT measurements may
780	be used.  The sender already determines the RTT to a receiver from the
781	receiver's echo of the sender's own timestamp for the calculation of the
782	maximum RTT.  For sender-based RTT measurements, this RTT measurement
783	needs to be communicated to the receiver.  Instead of including an echo
784	of the receiver's timestamp, the sender includes the receiver's RTT in
785	the next data packet, using the prioritization rules described in
786	Section 3.5.

788	To simplify sender operation, smoothing of RTT samples as described
789	above should still be done at the receiver.

791	4.3.3.  One-Way Delay Adjustments

793	When a RTT measurement is performed, the receiver also determines the
794	one-way delay D_r from itself to the sender:

796	     D_r = tr_r' - ts_i

798	where ts_i and tr_r' are the timestamp and receiver report timestamp
799	echo contained in the data packet.  With each new data packet j, the
800	receiver can now calculate an updated RTT estimate as:

802	     R' = max(D_r + tr_now - ts_j, 1 millisecond)

804	In between RTT measurements, the updated R' is used instead of the
805	smoothed RTT R.  Like the RTT samples, R' must be strictly positive.
806	When a new measurement is made, all interim one-way delay measurements
807	are discarded (i.e., the smoothed RTT is updated according to Section
808	4.3.2 without taking the interim one-way delay adjustments into
809	account).

811	For the one-way delay measurements, the clocks of sender and receivers
812	need not be synchronized.  Clock skew will cancel itself out when both
813	one-way measurements are added to form a RTT estimate, as long as clock
814	drift between real RTT measurements is negligible.

816	The same one-way delay adjustments should be applied to the RTT supplied
817	by the sender when using sender-based RTT measurements.

819	4.3.4.  Receive Rate Measurements

821	When a receiver has not experienced any loss events, it cannot calculate
822	a TCP-friendly rate to include in the receiver reports.  Instead, the
823	receiver measures the current receive rate and sets the desired rate X_r
824	to twice the receive rate.

826	The receive rate in bits/s is measured as the number of bits received
827	over the last k RTTs, taking into account the IP and transport packet
828	headers, but excluding the link-layer packet headers.  A value for k
829	between 2 and 4 is recommended.

831	4.4.  Setting the Desired Rate

833	When a receiver measures a non-zero loss event rate, it calculates the
834	desired rate using Equation (1).  In case no RTT measurement is
835	available yet, the maximum RTT is used instead of the receiver's RTT.
836	The desired rate X_r is updated whenever the loss event rate or the RTT
837	changes.

839	A receiver may decide to not report desired rates that are below 1
840	packet per 8 seconds, since a sender is very slow to recover from such
841	low sending rates.  In this case, the receiver reports a desired rate of
842	1 packet per 8 seconds.  However, it must leave the multicast group if
843	for more than 120 seconds the calculated rate falls below the reported
844	rate and the current sending rate is higher than the receiver's
845	calculated rate.

847	As mentioned above, calculation of the desired rate is not possible
848	before the receiver experiences the first loss event.  In that case,
849	twice the rate at which data is received is included in the receiver
850	reports as X_r to allow the sender to slowstart as described in Section
851	3.6.  This is also done when the sender resumes sending data packets
852	after the have_loss flag was cleared due to the sender being idle.

854	4.5.  Feedback and Feedback Suppression

856	Let fb_nr be the highest feedback round counter value received by a
857	receiver.  When a new data packet arrives with a higher feedback round
858	counter than fb_nr, a new feedback round begins and fb_nr is updated.
859	Outstanding feedback for the old round is canceled.  In case a feedback
860	number with a value that is more than half the feedback number space
861	lower than fb_nr is received, the receiver assumes that the feedback
862	round counter wrapped and also cancels the feedback timer and updates
863	fb_nr.

865	The CLR sends its feedback independently from all the other receivers
866	once per RTT.  Its feedback does not suppress other feedback and cannot
867	be suppressed by other receiver's feedback.

869	Non-CLR receivers set a feedback timer at the beginning of a feedback
870	round.  Using an exponentially weighted random timer mechanism, the
871	feedback timer is set to expire after
872	     t = max(T * (1 + log(x)/log(N)), 0)

874	where

876	     x is a random variable uniformly distributed in (0,1],

878	     T is the duration of a feedback round (i.e., 6 * R_max),

880	     N is an estimated upper bound on the number of receivers.

882	N is a constant specific to the TFMCC protocol.  Since TFMCC scales to
883	up to thousands of receivers, setting N to 10,000 for all receivers (and
884	limiting the TFMCC session to at most 10,000 receivers) is recommended.

886	A feedback packet is sent when the feedback timer expires, unless the
887	timer is canceled beforehand.  When the multicast model is ASM, feedback
888	is multicast to the whole group, otherwise the feedback is unicast to
889	the sender.  The feedback packet includes the calculated rate valid at
890	the time the feedback packet is sent (not the rate at the point of time
891	when the feedback timer is set).  The copy of the timestamp ts_i of the
892	last data packet received, which is included in the feedback packet,
893	needs to be adjusted by the time interval between receiving the data
894	packet and sending the report to allow the sender to correctly infer the
895	instantaneous RTT (i.e., that time interval has to be added to the
896	timestamp value).

898	The timer is canceled if a data packet with a lower suppression rate
899	than the receiver's calculated rate and a higher or equal maximum RTT
900	than the receiver's RTT is received.  Likewise, a data packet indicating
901	the beginning of a new feedback round cancels all feedback for older
902	rounds.  In case of ASM, the timer is also canceled if a feedback packet
903	from another non-CLR receiver reporting a lower rate is received.

905	The feedback suppression process is complicated by the fact that the
906	calculated rates of the receivers will change during a feedback round.
907	If the calculated rates decrease rapidly for all receivers, feedback
908	suppression can no longer prevent a feedback implosion since earlier
909	feedback will always report a higher rate than current feedback.  To
910	make the feedback suppression mechanism robust in the face of changing
911	rates, it is necessary to introduce X_fbr, the calculated rate of a
912	receiver at the beginning of a feedback round.  A receiver needs to
913	suppress its feedback not only when the suppression rate is less than
914	the receiver's current calculated rate but also in the case that the
915	suppression rate falls below X_fbr.

917	When the maximum RTT changes significantly during one feedback round, it
918	is necessary to reschedule the feedback timer in proportion to the
919	change.

921	     t = t * R_max / R_max'

923	where R_max is the new maximum RTT and R_max' is the previous maximum
924	RTT.  The same considerations hold, when the last data packets were
925	received more than a time interval of R_max ago.  In this case, it is
926	necessary to add the difference of the inter-packet gap and the maximum
927	RTT to the feedback time to prevent a feedback implosion (e.g., in case
928	the sender crashed).

930	     t = t + max(tr_now - tr_i - R_max, 0)

932	where tr_i is the time when the last data packet arrived at the
933	receiver.

935	More details on the characteristics of the feedback suppression
936	mechanism can be found in [14] and [4].

938	5.  Calculation of the Loss Event Rate

940	Obtaining an accurate and stable measurement of the loss event rate is
941	of primary importance for TFMCC.  Loss rate measurement is performed at
942	the receiver, based on the detection of lost or marked packets from the
943	sequence numbers of arriving packets.

945	5.1.  Detection of Lost or Marked Packets

947	TFMCC assumes that all packets contain a sequence number that is
948	incremented by one for each packet that is sent.  For the purposes of
949	this specification, we require that if a lost packet is retransmitted,
950	the retransmission is given a new sequence number that is the latest in
951	the transmission sequence, and not the same sequence number as the
952	packet that was lost.  If a transport protocol has the requirement that
953	it must retransmit with the original sequence number, then the transport
954	protocol designer must figure out how to distinguish delayed from
955	retransmitted packets and how to detect lost retransmissions.

957	The receivers each maintain a data structure that keeps track of which
958	packets have arrived and which are missing.  For the purposes of
959	specification, we assume that the data structure consists of a list of
960	packets that have arrived along with the timestamp when each packet was
961	received.  In practice this data structure will normally be stored in a
962	more compact representation, but this is implementation-specific.

964	The loss of a packet is detected by the arrival of at least three
965	packets with a higher sequence number than the lost packet.  The
966	requirement for three subsequent packets is the same as with TCP, and is
967	to make TFMCC more robust in the presence of reordering.  In contrast to
968	TCP, if a packet arrives late (after 3 subsequent packets arrived) at a
969	receiver, the late packet can fill the hole in the reception record, and
970	the receiver can recalculate the loss event rate.  Future versions of
971	TFMCC might make the requirement for three subsequent packets adaptive
972	based on experienced packet reordering, but we do not specify such a
973	mechanism here.

975	For an ECN-capable connection, a marked packet is detected as a
976	congestion event as soon as it arrives, without having to wait for the
977	arrival of subsequent packets.

979	5.2.  Translation from Loss History to Loss Events

981	TFMCC requires that the loss event rate be robust to several consecutive
982	packets lost where those packets are part of the same loss event.  This
983	is similar to TCP, which (typically) only performs one halving of the
984	congestion window during any single RTT.  Thus the receivers needs to
985	map the packet loss history into a loss event record, where a loss event
986	is one or more packets lost in a RTT.

988	To determine whether a lost or marked packet should start a new loss
989	event, or be counted as part of an existing loss event, we need to
990	compare the sequence numbers and timestamps of the packets that arrived
991	at the receiver.  For a marked packet S_new, its reception time T_new
992	can be noted directly.  For a lost packet, we can interpolate to infer
993	the nominal "arrival time".  Assume:

995	     S_loss is the sequence number of a lost packet.

997	     S_before is the sequence number of the last packet to arrive with
998	     sequence number before S_loss.

1000	     S_after is the sequence number of the first packet to arrive with
1001	     sequence number after S_loss.

1003	     T_before is the reception time of S_before.

1005	     T_after is the reception time of S_after.

1007	Note that T_before can either be before or after T_after due to
1008	reordering.

1010	For a lost packet S_loss, we can interpolate its nominal "arrival time"
1011	at the receiver from the arrival times of S_before and S_after.  Thus
1012	     T_loss = T_before + ( (T_after - T_before)
1013	                 * (S_loss - S_before)/(S_after - S_before) );

1015	Note that if the sequence space wrapped between S_before and S_after,
1016	then the sequence numbers must be modified to take this into account
1017	before performing the calculation.  If the largest possible sequence
1018	number is S_max, and S_before > S_after, then modifying each sequence
1019	number S by S' = (S + (S_max + 1)/2) mod (S_max + 1) would normally be
1020	sufficient.

1022	If the lost packet S_old was determined to have started the previous
1023	loss event, and we have just determined that S_new has been lost, then
1024	we interpolate the nominal arrival times of S_old and S_new, called
1025	T_old and T_new respectively.

1027	If T_old + R >= T_new, then S_new is part of the existing loss event.
1028	Otherwise S_new is the first packet of a new loss event.

1030	5.3.  Inter-Loss Event Interval

1032	If a loss interval, A, is determined to have started with packet
1033	sequence number S_A and the next loss interval, B, started with packet
1034	sequence number S_B, then the number of packets in loss interval A is
1035	given by (S_B - S_A).

1037	5.4.  Average Loss Interval

1039	To calculate the loss event rate p, we first calculate the average loss
1040	interval.  This is done using a filter that weights the n most recent
1041	loss event intervals in such a way that the measured loss event rate
1042	changes smoothly.

1044	Weights w_0 to w_(n-1) are calculated as:

1046	     If (i < n/2)
1047	        w_i = 1;
1048	     Else
1049	        w_i = 1 - (i - (n/2 - 1))/(n/2 + 1);

1051	Thus if n=8, the values of w_0 to w_7 are:

1053	     1.0, 1.0, 1.0, 1.0, 0.8, 0.6, 0.4, 0.2

1055	The value n for the number of loss intervals used in calculating the
1056	loss event rate determines TFMCC's speed in responding to changes in the
1057	level of congestion.  As currently specified, TFMCC should not be used
1058	for values of n significantly greater than 8, for traffic that might
1059	compete in the global Internet with TCP.  At the very least, safe
1060	operation with values of n greater than 8 would require a slight change
1061	to TFMCC's mechanisms to include a more severe response to two or more
1062	round-trip times with heavy packet loss.

1064	When calculating the average loss interval we need to decide whether to
1065	include the interval since the most recent packet loss event.  We only
1066	do this if it is sufficiently large to increase the average loss
1067	interval.

1069	Thus if the most recent loss intervals are I_0 to I_n, with I_0 being
1070	the interval since the most recent loss event, then we calculate the
1071	average loss interval I_mean as:

1073	     I_tot0 = 0;
1074	     I_tot1 = 0;
1075	     W_tot = 0;
1076	     for (i = 0 to n-1) {
1077	       I_tot0 = I_tot0 + (I_i * w_i);
1078	       W_tot = W_tot + w_i;
1079	     }
1080	     for (i = 1 to n) {
1081	       I_tot1 = I_tot1 + (I_i * w_(i-1));
1082	     }
1083	     I_tot = max(I_tot0, I_tot1);
1084	     I_mean = I_tot/W_tot;

1086	The loss event rate, p is simply:

1088	     p = 1 / I_mean;

1090	5.5.  History Discounting

1092	As described in Section 5.4, the most recent loss interval is only
1093	assigned 4/(3*n) of the total weight in calculating the average loss
1094	interval, regardless of the size of the most recent loss interval.  This
1095	section describes an optional history discounting mechanism that allows
1096	the TFMCC receivers to adjust the weights, concentrating more of the
1097	relative weight on the most recent loss interval, when the most recent
1098	loss interval is more than twice as large as the computed average loss
1099	interval.

1101	To carry out history discounting, we associate a discount factor DF_i
1102	with each loss interval L_i, where each discount factor is a floating
1103	point number.  The discount array maintains the cumulative history of
1104	discounting for each loss interval.  At the beginning, the values of
1105	DF_i in the discount array are initialized to 1:

1107	     for (i = 0 to n) {
1108	       DF_i = 1;
1109	     }

1111	History discounting also uses a general discount factor DF, also a
1112	floating point number, that is also initialized to 1.  First we show how
1113	the discount factors are used in calculating the average loss interval,
1114	and then we describe later in this section how the discount factors are
1115	modified over time.

1117	As described in Section 5.4 the average loss interval is calculated
1118	using the n previous loss intervals I_1, ..., I_n, and the interval I_0
1119	that represents the number of packets received since the last loss
1120	event.  The computation of the average loss interval using the discount
1121	factors is a simple modification of the procedure in Section 5.4, as
1122	follows:

1124	     I_tot0 = I_0 * w_0
1125	     I_tot1 = 0;
1126	     W_tot0 = w_0
1127	     W_tot1 = 0;
1128	     for (i = 1 to n-1) {
1129	       I_tot0 = I_tot0 + (I_i * w_i * DF_i * DF);
1130	       W_tot0 = W_tot0 + w_i * DF_i * DF;
1131	     }
1132	     for (i = 1 to n) {
1133	       I_tot1 = I_tot1 + (I_i * w_(i-1) * DF_i);
1134	       W_tot1 = W_tot1 + w_(i-1) * DF_i;
1135	     }
1136	     p = min(W_tot0/I_tot0, W_tot1/I_tot1);

1138	The general discounting factor, DF is updated on every packet arrival as
1139	follows.  First, a receiver computes the weighted average I_mean of the
1140	loss intervals I_1, ..., I_n:

1142	     I_tot = 0;
1143	     W_tot = 0;
1144	     for (i = 1 to n) {
1145	       W_tot = w_(i-1) * DF_i;
1146	       I_tot = I_tot + (I_i * w_(i-1) * DF_i);
1147	     }
1148	     I_mean = I_tot / W_tot;

1150	This weighted average I_mean is compared to I_0, the number of packets
1151	received since the last loss event.  If I_0 is greater than twice
1152	I_mean, then the new loss interval is considerably larger than the old
1153	ones, and the general discount factor DF is updated to decrease the
1154	relative weight on the older intervals, as follows:

1156	     if (I_0 > 2 * I_mean) {
1157	       DF = 2 * I_mean/I_0;
1158	       if (DF < THRESHOLD)
1159	         DF = THRESHOLD;
1160	     } else
1161	       DF = 1;

1163	A nonzero value for THRESHOLD ensures that older loss intervals from an
1164	earlier time of high congestion are not discounted entirely.  We
1165	recommend a THRESHOLD of 0.5.  Note that with each new packet arrival,
1166	I_0 will increase further, and the discount factor DF will be updated.

1168	When a new loss event occurs, the current interval shifts from I_0 to
1169	I_1, loss interval I_i shifts to interval I_(i+1), and the loss interval
1170	I_n is forgotten.  The previous discount factor DF has to be
1171	incorporated into the discount array.  Because DF_i carries the discount
1172	factor associated with loss interval I_i, the DF_i array has to be
1173	shifted as well.  This is done as follows:

1175	     for (i = 1 to n) {
1176	       DF_i = DF * DF_i;
1177	     }
1178	     for (i = n-1 to 0 step -1) {
1179	       DF_(i+1) = DF_i;
1180	     }
1181	     I_0 = 1;
1182	     DF_0 = 1;
1183	     DF = 1;

1185	This completes the description of the optional history discounting
1186	mechanism.  We emphasize that this is an optional mechanism whose sole
1187	purpose is to allow TFMCC to response somewhat more quickly to the
1188	sudden absence of congestion, as represented by a long current loss
1189	interval.

1191	5.6.  Initializing the Loss History after the First Loss Event

1193	The number of packets received before the first loss event usually does
1194	not reflect the current loss event rate.  When the first loss event
1195	occurs, a TFMCC receiver assumes that the correct data rate is the rate
1196	at which data was received during the last RTT when the loss occurred.
1197	Instead of initializing the first loss interval to the number of packets
1198	sent until the first loss event, the TFMCC receiver calculates the loss
1199	interval that would be required to produce the receive rate X_recv, and
1200	uses this synthetic loss interval l_0 to seed the loss history
1201	mechanism.

1203	The initial loss interval is calculated by inverting a simplified
1204	version of the TCP Equation (1).

1206	                                 8s
1207	     X_recv = sqrt(3/2) * -----------------
1208	                           R * sqrt(1/l_0)

1210	                   X_recv * R
1211	     ==> l_0 = (----------------)^2
1212	                 sqrt(3/2) * 8s

1214	The resulting initial loss interval is too small at higher loss rates
1215	compared to using the more accurate Equation (1), which leads to a more
1216	conservative initial loss event rate.

1218	If a receiver still uses the initial RTT R_max instead of its real RTT,
1219	the initial loss interval is too large in case the initial RTT is higher
1220	than the actual RTT.  As a consequence, the receiver will calculate a
1221	too high desired rate when the first RTT measurement R is made and the
1222	initial loss interval is still in the loss history.  The receiver has to
1223	adjust l_0 as follows:

1225	     l_0 = l_0 * (R/R_max)^2

1227	No action needs to be taken when the first RTT measurement is made after
1228	the initial loss interval left the loss history.

1230	6.  Security Considerations

1232	TFMCC is not a transport protocol in its own right, but a congestion
1233	control mechanism that is intended to be used in conjunction with a
1234	transport protocol.  Therefore security primarily needs to be considered
1235	in the context of a specific transport protocol and its authentication
1236	mechanisms.

1238	Congestion control mechanisms can potentially be exploited to create
1239	denial of service.  This may occur through spoofed feedback.  Thus any
1240	transport protocol that uses TFMCC should take care to ensure that
1241	feedback is only accepted from valid receivers of the data.  The precise
1242	mechanism to achieve this will however depend on the transport protocol
1243	itself.

1245	Congestion control mechanisms may potentially be manipulated by a greedy
1246	receiver that wishes to receive more than its fair share of network
1247	bandwidth.  However, in TFMCC a receiver can only influence the sending
1248	rate if it is the CLR and thus has the lowest calculated rate of all
1249	receivers.  If the calculated rate is then manipulated such that it
1250	exceeds the calculated rate of the second to lowest receiver, it will
1251	cease to be CLR.  A greedy receiver can only significantly increase then
1252	transmission rate if it is the only participant in the session.  If such
1253	scenarios are of concern, possible defenses against such a receiver
1254	would normally include some form of nonce that the receiver must feed
1255	back to the sender to prove receipt.  However, the details of such a
1256	nonce would depend on the transport protocol, and in particular on
1257	whether the transport protocol is reliable or unreliable.

1259	It is possible that a receiver sends feedback claiming it has a very low
1260	calculated rate.  This will reduce the rate of the multicast session and
1261	might render it useless but obviously cannot hurt the network itself.

1263	We expect that protocols incorporating ECN with TFMCC will also want to
1264	incorporate feedback from the receiver to the sender using the ECN nonce
1265	[13].  The ECN nonce is a modification to ECN that protects the sender
1266	from the accidental or malicious concealment of marked packets.  Again,
1267	the details of such a nonce would depend on the transport protocol, and
1268	are not addressed in this document.

1270	7.  IANA Considerations

1272	There are no IANA actions required for this document.

1274	8.  Authors' Addresses

1276	     Joerg Widmer
1277	     DoCoMo Euro-Labs
1278	     Landsberger Str. 312, Munich, Germany
1279	     widmer@acm.org

1281	     Mark Handley
1282	     UCL (University College London)
1283	     Gower Street, London WC1E 6BT, UK
1284	     m.handley@cs.ucl.ac.uk

1286	9.  Acknowledgments

1288	We would like to acknowledge feedback and discussions on equation-based
1289	congestion control with a wide range of people, including members of the
1290	Reliable Multicast Research Group, the Reliable Multicast Transport
1291	Working Group, and the End-to-End Research Group.  We would particularly
1292	like to thank Brian Adamson, Mark Pullen, Fei Zhao, and Magnus
1293	Westerlund for feedback on earlier versions of this document.

1295	10.  References

1297	10.1.  Normative References

1299	[1] B. Whetten, L. Vicisano, R. Kermode, M. Handley, S. Floyd, and M.
1300	     Luby, "Reliable Multicast Transport Building Blocks for One-to-Many
1301	     Bulk-Data Transfer", RFC 3048, January 2001.

1303	[2] S. Bradner, "Key words for use in RFCs to Indicate Requirement
1304	     Levels", BCP 14, RFC 2119, March 1997.

1306	[3] R. Kermode and L. Vicisano, "Author Guidelines for Reliable
1307	     Multicast Transport (RMT) Building Blocks and Protocol
1308	     Instantiation documents", RFC 3269, April 2002.

1310	10.2.  Informative References

1312	[4] J. Widmer and M. Handley, "Extending Equation-Based Congestion
1313	     Control to Multicast Applications", Proc ACM Sigcomm 2001, San
1314	     Diego, August 2001.

1316	[5] S. Floyd, M. Handley, J. Padhye, and J. Widmer, "Equation-Based
1317	     Congestion Control for Unicast Applications", Proc ACM SIGCOMM
1318	     2000, Stockholm, August 2000.

1320	[6] B. Adamson, C. Bormann, M. Handley, and J. Macker, "Negative-
1321	     Acknowledgment (NACK)-Oriented Reliable Multicast (NORM) Building
1322	     Blocks", RFC 3941, November 2004.

1324	[7] S. Deering, "Host Extensions for IP Multicasting", STD 5, RFC 1112,
1325	     August 1989.

1327	[8] H. W. Holbrook, "A Channel Model for Multicast," Ph.D.
1328	     Dissertation, Stanford University, Department of Computer Science,
1329	     Stanford, California, August 2001.

1331	[9] J. Padhye, V. Firoiu, D. Towsley, and J. Kurose, "Modeling TCP
1332	     Throughput: A Simple Model and its Empirical Validation", Proc ACM
1333	     SIGCOMM 1998.

1335	[10] K. Ramakrishnan and S. Floyd, "A Proposal to add Explicit
1336	     Congestion Notification (ECN) to IP", RFC 2481, January 1999.

1338	[11] L. Rizzo, "pgmcc: a TCP-friendly single-rate multicast congestion
1339	     control scheme", Proc ACM Sigcomm 2000, Stockholm, August 2000.

1341	[12] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: A
1342	     Transport Protocol for Real-Time Applications", RFC 1889, January
1343	     1996.

1345	[13] N. Spring, D. Wetherall, and D. Ely, "Robust Explicit Congestion
1346	     Notification (ECN) Signaling with Nonces", RFC 3540, June 2003.

1348	[14] J. Widmer and T. Fuhrmann, "Extremum Feedback for Very Large
1349	     Multicast Groups", Proc NGC 2001, London, November 2001.

1351	11.  Copyright Notice

1353	Copyright (C) The Internet Society (2006).  This document is subject to
1354	the rights, licenses and restrictions contained in BCP 78, and except as
1355	set forth therein, the authors retain all their rights.

1357	This document and the information contained herein are provided on an
1358	"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR
1359	IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
1360	ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
1361	INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
1362	INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1363	WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.