idnits 2.17.1 

draft-ietf-tsvwg-tfrc-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (20 July 2001) is 8316 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: 'BRS99' on line 92

  -- Looks like a reference, but probably isn't: 'WES01' on line 882

  == Unused Reference: '1' is defined on line 920, but no explicit reference
     was found in the text

  == Unused Reference: '9' is defined on line 950, but no explicit reference
     was found in the text

  -- Possible downref: Non-RFC (?) normative reference: ref. '1'

  -- Possible downref: Non-RFC (?) normative reference: ref. '2'

  -- Possible downref: Non-RFC (?) normative reference: ref. '3'

  -- Possible downref: Non-RFC (?) normative reference: ref. '4'

  -- Possible downref: Non-RFC (?) normative reference: ref. '5'

  ** Obsolete normative reference: RFC 1889 (ref. '6') (Obsoleted by RFC 3550)

  ** Obsolete normative reference: RFC 2481 (ref. '7') (Obsoleted by RFC 3168)

  ** Obsolete normative reference: RFC 2988 (ref. '8') (Obsoleted by RFC 6298)

  == Outdated reference: A later version (-04) exists of
     draft-ietf-tsvwg-tcp-nonce-00

  ** Downref: Normative reference to an Historic draft:
     draft-ietf-tsvwg-tcp-nonce (ref. '9')


     Summary: 7 errors (**), 0 flaws (~~), 4 warnings (==), 9 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Internet Engineering Task Force                                   TSV WG
2	INTERNET-DRAFT                                        Mark Handley/ACIRI
3	draft-ietf-tsvwg-tfrc-03.txt                       Jitendra Padhye/ACIRI
4	                                                       Sally Floyd/ACIRI

6	                                             Joerg Widmer/Univ. Mannheim
7	                                                            20 July 2001
8	                                                   Expires: January 2002

10	                   TCP Friendly Rate Control (TFRC):
11	                         Protocol Specification

13	Status of this Document

15	This document is an Internet-Draft and is in full conformance with all
16	provisions of Section 10 of RFC2026.

18	Internet-Drafts are working documents of the Internet Engineering Task
19	Force (IETF), its areas, and its working groups.  Note that other groups
20	may also distribute working documents as Internet-Drafts.

22	Internet-Drafts are draft documents valid for a maximum of six months
23	and may be updated, replaced, or obsoleted by other documents at any
24	time.  It is inappropriate to use Internet- Drafts as reference material
25	or to cite them other than as "work in progress."

27	The list of current Internet-Drafts can be accessed at
28	http://www.ietf.org/ietf/1id-abstracts.txt

30	The list of Internet-Draft Shadow Directories can be accessed at
31	http://www.ietf.org/shadow.html.

33	This document is a product of the IETF TSV WG.  Comments should be
34	addressed to the authors.

36	                                Abstract

38	     This document specifies TCP-Friendly Rate Control (TFRC).
39	     TFRC is a congestion control mechanism for unicast flows
40	     operating in a best-effort Internet environment.  It is
41	     reasonably fair when competing for bandwidth with TCP flows,
42	     but has a much lower variation of throughput over time
43	     compared with TCP, making it more suitable for applications
44	     such as telephony or streaming media where a relatively smooth
45	     sending rate is of importance.

47	                           Table of Contents

49	     1. Introduction. . . . . . . . . . . . . . . . . . . . . .   4
50	     2. Terminology . . . . . . . . . . . . . . . . . . . . . .   5
51	     3. Protocol Mechanism. . . . . . . . . . . . . . . . . . .   5
52	      3.1. TCP Throughput Equation. . . . . . . . . . . . . . .   5
53	      3.2. Packet Contents. . . . . . . . . . . . . . . . . . .   7
54	       3.2.1. Data Packets. . . . . . . . . . . . . . . . . . .   7
55	       3.2.2. Feedback Packets. . . . . . . . . . . . . . . . .   8
56	     4. Data Sender Protocol. . . . . . . . . . . . . . . . . .   8
57	      4.1. Measuring the Packet Size. . . . . . . . . . . . . .   8
58	      4.2. Sender Initialization. . . . . . . . . . . . . . . .   9
59	      4.3. Sender behavior when a feedback packet is
60	      received. . . . . . . . . . . . . . . . . . . . . . . . .   9
61	      4.4. Expiration of nofeedback timer . . . . . . . . . . .  10
62	      4.5. Preventing Oscillations. . . . . . . . . . . . . . .  11
63	      4.6. Scheduling of Packet Transmissions . . . . . . . . .  12
64	     5. Calculation of the Loss Event Rate (p). . . . . . . . .  13
65	      5.1. Detection of Lost or Marked Packets. . . . . . . . .  13
66	      5.2. Translation from Loss History to Loss Events . . . .  13
67	      5.3. Inter-loss Event Interval. . . . . . . . . . . . . .  15
68	      5.4. Average Loss Interval. . . . . . . . . . . . . . . .  15
69	      5.5. History Discounting. . . . . . . . . . . . . . . . .  16
70	     6. Data Receiver Protocol. . . . . . . . . . . . . . . . .  18
71	      6.1. Receiver behavior when a data packet is
72	      received. . . . . . . . . . . . . . . . . . . . . . . . .  19
73	      6.2. Expiration of feedback timer . . . . . . . . . . . .  19
74	      6.3. Receiver initialization. . . . . . . . . . . . . . .  20
75	       6.3.1. Initializing the Loss History after the
76	       First Loss Event . . . . . . . . . . . . . . . . . . . .  20
77	     7. Security Considerations . . . . . . . . . . . . . . . .  20
78	     8. IANA Considerations . . . . . . . . . . . . . . . . . .  21
79	     9. Authors' Addresses. . . . . . . . . . . . . . . . . . .  21
80	     10. Acknowledgments. . . . . . . . . . . . . . . . . . . .  22
81	     11. References . . . . . . . . . . . . . . . . . . . . . .  22

83	1.  Introduction

85	This document specifies TCP-Friendly Rate Control (TFRC).  TFRC is a
86	congestion control mechanism designed for unicast flows operating in a
87	Internet environment and competing with TCP traffic [2]. Instead of
88	specifying a complete protocol, this document simply specifies a
89	congestion control mechanism that could be used in a transport protocol
90	such as RTP [6], in an application incorporating end-to-end congestion
91	control at the application level, or in the context of endpoint
92	congestion management [BRS99].  This document does not discuss packet
93	formats, reliability, or implementation-related issues.

95	TFRC is designed to be reasonably fair when competing for bandwidth with
96	TCP flows, where a flow is ``reasonably fair'' if its sending rate is
97	generally within a factor of two of the sending rate of a TCP flow under
98	the same conditions.  However TFRC has a much lower variation of
99	throughput over time compared with TCP, which makes it more suitable for
100	applications such as telephony or streaming media where a relatively
101	smooth sending rate is of importance.

103	The penalty of having smoother throughput than TCP while competing
104	fairly for bandwidth is that TFRC responds slower than TCP to changes in
105	available bandwidth.  Thus TFRC should only be used when the application
106	has a requirement for smooth throughput, in particular, avoiding TCP's
107	halving of the sending rate in response to a single packet drop.  For
108	applications that simply need to transfer as much data as possible in as
109	short a time as possible we recommend using TCP, or if reliability is
110	not required, using an Additive-Increase, Multiplicative-Decrease (AIMD)
111	congestion control scheme with similar parameters to those used by TCP.

113	TFRC is designed for applications that use a fixed packet size, and vary
114	their sending rate in packets per second in response to congestion.
115	Some audio applications require a fixed interval of time between packets
116	and vary their packet size instead of their packet rate in response to
117	congestion.  The congestion control mechanism in this document cannot be
118	used by those applications; TFRC-PS (for TFRC-PacketSize) is a variant
119	of TFRC for applications that have a fixed sending rate but vary their
120	packet size in response to congestion.  TFRC-PS will be specified in a
121	later document.

123	TFRC is a receiver-based mechanism, with the calculation of the
124	congestion control information (i.e., the loss event rate) in the data
125	receiver rather in the data sender.  This is well-suited to an
126	application where the sender is a large server handling many concurrent
127	connections, and the receiver has more memory and CPU cycles available
128	for computation.  In addition, a receiver-based mechanism is more
129	suitable as a building block for multicast congestion control.

131	2.  Terminology

133	In this document, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL",
134	"SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
135	"OPTIONAL" are to be interpreted as described in RFC 2119 and indicate
136	requirement levels for compliant TFRC implementations.

138	3.  Protocol Mechanism

140	For its congestion control mechanism, TFRC directly uses a throughput
141	equation for the allowed sending rate as a function of the loss event
142	rate and round-trip time.  In order to compete fairly with TCP, TFRC
143	uses the TCP throughput equation, which roughly describes TCP's sending
144	rate as a function of the loss event rate, round-trip time, and packet
145	size.  We define a loss event as one or more lost or marked packets from
146	a window of data, where a marked packet refers to a congestion
147	indication from Explicit Congestion Notification (ECN) [7].

149	Generally speaking, TFRC's congestion control mechanism works as
150	follows:

152	o    The receiver measures the loss event rate and feeds this
153	     information back to the sender.

155	o    The sender also uses these feedback messages to measure the round-
156	     trip time (RTT).

158	o    The loss event rate and RTT are then fed into TFRC's throughput
159	     equation, giving the acceptable transmit rate.

161	o    The sender then adjusts its transmit rate to match the calculated
162	     rate.

164	The dynamics of TFRC are sensitive to how the measurements are performed
165	and applied.  We recommend specific mechanisms below to perform and
166	apply these measurements.  Other mechanisms are possible, but it is
167	important to understand how the interactions between mechanisms affect
168	the dynamics of TFRC.

170	3.1.  TCP Throughput Equation

172	Any realistic equation giving TCP throughput as a function of loss event
173	rate and RTT should be suitable for use in TFRC.  However, we note that
174	the TCP throughput equation used must reflect TCP's retransmit timeout
175	behavior, as this dominates TCP throughput at higher loss rates.  We
176	also note that the assumptions implicit in the throughput equation about
177	the loss event rate parameter have to be a reasonable match to how the
178	loss rate or loss event rate is actually measured.  While this match is
179	not perfect for the throughput equation and loss rate measurement
180	mechanisms given below, in practice the assumptions turn out to be close
181	enough.

183	The throughput equation we currently recommend for TFRC is a slightly
184	simplified version of the throughput equation for Reno TCP from [4].
185	Ideally we'd prefer a throughput equation based on SACK TCP, but no one
186	has yet derived the throughput equation for SACK TCP, and from both
187	simulations and experiments, the differences between the two equations
188	are relatively minor.

190	The throughput equation is:

192	                                     s
193	     X =  ----------------------------------------------------------
194	          R*sqrt(2*b*p/3) + (t_RTO * (3*sqrt(3*b*p/8) * p * (1+32*p^2)))

196	Where:

198	     X is the transmit rate in bytes/second.

200	     s is the packet size in bytes.

202	     R is the round trip time in seconds.

204	     p is the loss event rate, between 0 and 1.0, of the number of loss
205	     events as a fraction of the number of packets transmitted.

207	     t_RTO is the TCP retransmission timeout value in seconds.

209	     b is the number of packets acknowledged by a single TCP
210	     acknowledgement.

212	We further simplify this by setting t_RTO = 4*R.  A more accurate
213	calculation of t_RTO is possible, but experiments with the current
214	setting have resulted in reasonable fairness with existing TCP
215	implementations [5]. Another possibility would be to set t_RTO = max(4R,
216	one second), to match the recommended minimum of one second on the RTO
217	[8].

219	Many current TCP connections use delayed acknowledgements, sending an
220	acknowledgement for every two data packets received, and thus have a
221	sending rate modeled by b = 2.  However, TCP is also allowed to send an
222	acknowledgement for every data packet, and this would be modeled by b =
223	1.  Because many TCP implementations do not use delayed
224	acknowledgements, we recommend b = 1.

226	In future, different TCP equations may be substituted for this equation.
227	The requirement is that the throughput equation be a reasonable
228	approximation of the sending rate of TCP for conformant TCP congestion
229	control.

231	The parameters s (packet size), p (loss event rate) and R (RTT) need to
232	be measured or calculated by a TFRC implementation.  The measurement of
233	s is specified in Section 4.1, measurement of R is specified in Section
234	4.3, and measurement of p is specified in Section 5. In the rest of this
235	document all data rates are measured in bytes/second.

237	3.2.  Packet Contents

239	Before specifying the sender and receiver functionality, we describe the
240	contents of the data packets sent by the sender and feedback packets
241	sent by the receiver.  As TFRC will be used along with a transport
242	protocol, we do not specify packet formats, as these depend on the
243	details of the transport protocol used.

245	3.2.1.  Data Packets

247	Each data packet sent by the data sender contains the following
248	information:

250	o    A sequence number. This number is incremented by one for each data
251	     packet transmitted.  The field must be sufficiently large that it
252	     does not wrap causing two different packets with the same sequence
253	     number to be in the receiver's recent packet history at the same
254	     time.

256	o    A timestamp indicating when the packet is sent. We denote by ts_i
257	     the timestamp of the packet with sequence number i.  The resolution
258	     of the timestamp should typically be measured in milliseconds.

260	o    The sender's current estimate of the round trip time. The estimate
261	     reported in packet i is denoted by R_i.

263	o    The sender's current transmit rate. The estimate reported in packet
264	     i is denoted by X_i.

266	3.2.2.  Feedback Packets

268	Each feedback packet sent by the data receiver contains the following
269	information:

271	o    The timestamp of the last data packet received. We denote this by
272	     t_recvdata.  If the last packet received at the receiver has
273	     sequence number i, then t_recvdata = ts_i.

275	o    The amount of time elapsed between the receipt of the last data
276	     packet at the receiver, and the generation of this feedback report.
277	     We denote this by t_delay.

279	o    The rate at which the receiver estimates that data was received
280	     since the last feedback report was sent. We denote this by X_recv.

282	o    The receiver's current estimate of the loss event rate, p.

284	4.  Data Sender Protocol

286	The data sender sends a stream of data packets to the data receiver at a
287	controlled rate. When a feedback packet is received from the data
288	receiver, the data sender changes its sending rate, based on the
289	information contained in the feedback report. If the sender does not
290	receive a feedback report for two round trip times, it cuts its sending
291	rate in half.  This is achieved by means of a timer called the
292	nofeedback timer.

294	We specify the sender-side protocol in the following steps:

296	o    Measurement of the mean packet size being sent.

298	o    The sender behavior when a feedback packet is received.

300	o    The sender behavior when the nofeedback timer expires.

302	o    Oscillation prevention (optional)

304	o    Scheduling of transmission on non-realtime operating systems.

306	4.1.  Measuring the Packet Size

308	The parameter s (packet size) is normally known to an application.  This
309	may not be so in two cases:

311	o    The packet size naturally varies depending on the data.  In this
312	     case, although the packet size varies, that variation is not
313	     coupled to the transmit rate.  It should normally be safe to use an
314	     estimate of the mean packet size for s.

316	o    The application needs to change the packet size rather than the
317	     number of packets per second to perform congestion control.  This
318	     would normally be the case with packet audio applications where a
319	     fixed interval of time needs to be represented by each packet.
320	     Such applications need to have a completely different way of
321	     measuring parameters.

323	The second class of applications are discussed separately in a separate
324	document on TFRC-PS.  For the remainder of this section we assume the
325	sender can estimate the packet size, and that congestion control is
326	performed by adjusting the number of packets sent per second.

328	4.2.  Sender Initialization

330	To initialize the sender, the value of X is set to 1 packet/second and
331	the nofeedback timer is set to expire after 2 seconds. The initial
332	values for R (RTT) and t_RTO are undefined until they are set as
333	described above. The intial value of tld, for the Time Last Doubled
334	during slow-start, is set to -1.

336	4.3.  Sender behavior when a feedback packet is received

338	The sender knows its current sending rate, X, and maintains an estimate
339	of the current round trip time, R, and an estimate of the timeout
340	interval, t_RTO.

342	When a feedback packet is received by the sender at time t_now, the
343	following actions should be performed:

345	1)   Calculate a new round trip sample.
346	     R_sample = (t_now - t_recvdata) - t_delay.

348	2)   Update the round trip time estimate:

350	          If no feedback has been received before
351	              R = R_sample;
352	          Else
353	              R = q*R + (1-q)*R_sample;

355	     TFRC is not sensitive to the precise value for the filter constant
356	     q, but we recommend a default value of 0.9.

358	3)   Update the timeout interval:

360	          t_RTO = 4*R.

362	4)   Update the sending rate as follows:

364	               If (p > 0)
365	                   Calculate X_calc using the TCP throughput equation.
366	                   X = max(min(X_calc, 2*X_recv), s/64);
367	               Else
368	                   If (t_now - tld >= RTT)
369	                       X = max(min(2*X, 2*X_recv), s/RTT);
370	                       tld = t_now;

372	          Note that if p == 0, then the sender is in slow-start phase,
373	          where it approximately doubles the sending rate each round-
374	          trip time until a loss occurs. The s/RTT term gives a minimum
375	          sending rate during slow-start of one packet per RTT.  When p
376	          > 0, the sender sends at least one packet every 64 seconds.

378	5)   Reset the nofeedback timer to expire after max(4*R, 2*s/X) seconds.

380	4.4.  Expiration of nofeedback timer

382	If the nofeedback timer expires, the sender should perform the following
383	actions:

385	1)   Cut the sending rate in half.  This is done by modifying the
386	     sender's cached copy of X_recv (the receive rate).  Because the
387	     sending rate is limited to at most twice X_recv, modifying X_recv
388	     limits the current sending rate, but allows the sender to slow-
389	     start, doubling its sending rate each RTT, if feedback messages
390	     resume reporting no losses.

392	               If (X_calc > 2*X_recv)
393	                   X_recv = max(X_recv/2, s/128);
394	               Else
395	                   X_recv = X_calc/4;

397	     The s/128 term limits the backoff to one packet every 64 seconds in
398	     the case of persistent absence of feedback.

400	2)   The value of X must then be recalculated as described under point
401	     (4) above.

403	     If the nofeedback timer expires when the sender does not yet have
404	     an RTT sample, and has not yet received any feedback from the
405	     sender, then step (1) can be skipped, and the sending rate cut in
406	     half directly:

408	            X = max(X/2, s/64)

410	3)   Restart the nofeedback timer to expire after max(4*R, 2*s/X)
411	     seconds.

413	Note that when the sender stops sending, the receiver will stop sending
414	feedback.  This will cause the nofeedback timer to start to expire and
415	decrease X_recv.  If the sender subsequently starts to send again,
416	X_recv will limit the transmit rate, and a normal slowstart phase will
417	occur until the transmit rate reaches X_calc.

419	4.5.  Preventing Oscillations

421	To prevent oscillatory behavior in environments with a low degree of
422	statistical multiplexing it is useful to modify sender's transmit rate
423	to provide congestion avoidance behavior by reducing the transmit rate
424	as the queuing delay (and hence RTT) increases.  To do this the sender
425	maintains an estimate of the long-term RTT and modifies its sending rate
426	depending on how the most recent sample of the RTT differs from this
427	value.  The long-term sample is R_sqmean, and is set as follows:

429	     If no feedback has been received before
430	         R_sqmean = sqrt(R_sample);
431	     Else
432	         R_sqmean = q2*R_sqmean + (1-q2)*sqrt(R_sample);

434	Thus R_sqmean gives the exponentially weighted moving average of the
435	square root of the RTT samples.  The constant q2 should be set similarly
436	to q, and we recommend a value of 0.9 as the default.

438	The sender obtains the base transmit rate, X, from the throughput
439	function.  It then calculates a modified instantaneous transmit rate
440	X_inst, as follows:

442	     X_inst = X * R_sqmean / sqrt(R_sample);

444	When sqrt(R_sample) is greater than R_sqmean then the queue is typically
445	increasing and so the transmit rate needs to be decreased for stable
446	operation.

448	Note: This modification is not always strictly required, especially if
449	the degree of statistical multiplexing in the network is high.  However
450	we recommend that it is done because it does make TFRC behave better in
451	environments with a low level of statistical multiplexing.  If it is not
452	done, we recommend using a very low value of q, such that q is close to
453	or exactly zero.

455	4.6.  Scheduling of Packet Transmissions

457	As TFRC is rate-based, and as operating systems typically cannot
458	schedule events precisely, it is necessary to be opportunistic about
459	sending data packets so that the correct average rate is maintained
460	despite the course-grain or irregular scheduling of the operating
461	system.  Thus a typical sending loop will calculate the correct inter-
462	packet interval, t_ipi, as follows:

464	     t_ipi = s/X_inst;

466	When a sender first starts sending at time t_0, it calculates t_ipi, and
467	calculates a nominal send time t_1 = t_0 + t_ipi for packet 1.  When the
468	application becomes idle, it checks the current time, t_now, and then
469	requests re-scheduling after (t_ipi - (t_now - t_0)) seconds.  When the
470	application is re-scheduled, it checks the current time, t_now, again.
471	If (t_now > t_1 - delta) then packet 1 is sent.

473	Now a new t_ipi may be calculated, and used to calculate a nominal send
474	time t_2 for packet 2: t2 = t_1 + t_ipi.  The process then repeats, with
475	each successive packet's send time being calculated from the nominal
476	send time of the previous packet.

478	In some cases, when the nominal send time, t_i, of the next packet is
479	calculated, it may already be the case that t_now > t_i - delta.  In
480	such a case the packet should be sent immediately.  Thus if the
481	operating system has coarse timer granularity and the transmit rate is
482	high, then TFRC may send short bursts of several packets separated by
483	intervals of the OS timer granularity.

485	The parameter delta is to allow a degree of flexibility in the send time
486	of a packet.  If the operating system has a scheduling timer granularity
487	of t_gran seconds, then delta would typically be set to:

489	     delta = min(t_ipi/2, t_gran/2);

491	t_gran is 10ms on many Unix systems.  If t_gran is not known, a value of
492	10ms can be safely assumed.

494	5.  Calculation of the Loss Event Rate (p)

496	Obtaining a accurate and stable measurement of the loss event rate is of
497	primary importance for TFRC. Loss rate measurement is performed at the
498	receiver, based on the detection of lost or marked packets from the
499	sequence numbers of arriving packets. We describe this process before
500	describing the rest of the receiver protocol.

502	5.1.  Detection of Lost or Marked Packets

504	TFRC assumes that all packets contain a sequence number that is
505	incremented by one for each packet that is sent.  For the purposes of
506	this specification, we require that if a lost packet is retransmitted,
507	the retransmission is given a new sequence number that is the latest in
508	the transmission sequence, and not the same sequence number as the
509	packet that was lost.  If a transport protocol has the requirement that
510	it must retransmit with the original sequence number, then the transport
511	protocol designer must figure out how to distinguish delayed from
512	retransmitted packets and how to detect lost retransmissions.

514	The receiver maintains a data structure that keeps track of which
515	packets have arrived and which are missing.  For the purposes of
516	specification, we assume that the data structure consists of a list of
517	packets that have arrived along with the receiver timestamp when each
518	packet was received.  In practice this data structure will normally be
519	stored in a more compact representation, but this is implementation-
520	specific.

522	The loss of a packet is detected by the arrival of at least three
523	packets with a higher sequence number than the lost packet.  The
524	requirement for three subsequent packets is the same as with TCP, and is
525	to make TFRC more robust in the presence of reordering.  In contrast to
526	TCP, if a packet arrives late (after 3 subsequent packets arrived) in
527	TFRC, the late packet can fill the hole in TFRC's reception record, and
528	the receiver can recalculate the loss event rate.  Future versions of
529	TFRC might make the requirement for three subsequent packets adaptive
530	based on experienced packet reordering, but we do not specify such a
531	mechanism here.

533	For an ECN-capable connection, a marked packet is detected as a
534	congestion event as soon as it arrives, without having to wait for the
535	arrival of subsequent packets.

537	5.2.  Translation from Loss History to Loss Events

539	TFRC requires that the loss fraction be robust to several consecutive
540	packets lost where those packets are part of the same loss event.  This
541	is similar to TCP, which (typically) only performs one halving of the
542	congestion window during any single RTT.  Thus the receiver needs to map
543	the packet loss history into a loss event record, where a loss event is
544	one or more packets lost in an RTT.  To perform this mapping, the
545	receiver needs to know the RTT to use, and this is supplied periodically
546	by the sender, typically as control information piggy-backed onto a data
547	packet.  TFRC is not sensitive to how the RTT measurement sent to the
548	receiver is made, but we recommend using the sender's calculated RTT, R,
549	(see Section 4.3) for this purpose.

551	To determine whether a lost or marked packet should start a new loss
552	event, or be counted as part of an existing loss event, we need to
553	compare the sequence numbers and timestamps of the packets that arrived
554	at the receiver.  For a marked packet S_new, its reception time T_new
555	can be noted directly.  For a lost packet, we can interpolate to infer
556	the nominal "arrival time".  Assume:

558	     S_loss is the sequence number of a lost packet.

560	     S_before is the sequence number of the last packet to arrive with
561	     sequence number before S_loss.

563	     S_after is the sequence number of the first packet to arrive with
564	     sequence number after S_loss.

566	     T_before is the reception time of S_before.

568	     T_after is the reception time of S_after.

570	Note that T_before can either be before or after T_after due to
571	reordering.

573	For a lost packet S_loss, we can interpolate its nominal "arrival time"
574	at the receiver from the arrival times of S_before and S_after. Thus

576	     T_loss = T_before + ( (T_after - T_before)
577	                 * (S_loss - S_before)/(S_after - S_before) );

579	Note that if the sequence space wrapped between S_before and S_after,
580	then the sequence numbers must be modified to take this into account
581	before performing this calculation.  If the largest possible sequence
582	number is S_max, and S_before > S_after, then modifying each sequence
583	number S by S' = (S + (S_max + 1)/2) mod (S_max + 1) would normally be
584	sufficient.

586	If the lost packet S_old was determined to have started the previous
587	loss event, and we have just determined that S_new has been lost, then
588	we interpolate the nominal arrival times of S_old and S_new, called
589	T_old and T_new respectively.

591	If T_old + R >= T_new, then S_new is part of the existing loss event.
592	Otherwise S_new is the first packet in a new loss event.

594	5.3.  Inter-loss Event Interval

596	If a loss interval, A, is determined to have started with packet
597	sequence number S_A and the next loss interval, B, started with packet
598	sequence number S_B, then the number of packets in loss interval A is
599	given by (S_B - S_A).

601	5.4.  Average Loss Interval

603	To calculate the loss event rate p, we first calculate the average loss
604	interval.  This is done using a filter that weights the n most recent
605	loss event intervals in such a way that the measured loss event rate
606	changes smoothly.

608	Weights w_0 to w_(n-1) are calculated as:

610	     If (i < n/2)
611	        w_i = 1;
612	     Else
613	        w_i = 1 - (i - (n/2 - 1))/(n/2 + 1);

615	Thus if n=8, the values of w_0 to w_7 are:

617	     1.0, 1.0, 1.0, 1.0, 0.8, 0.6, 0.4, 0.2

619	The value n for the number of loss intervals used in calculating the
620	loss event rate determines TRFC's speed in responding to changes in the
621	level of congestion.  As currently specified, TFRC should not be used
622	for values of n significantly greater than 8, for traffic that might
623	compete in the global Internet with TCP.  At the very least, safe
624	operation with values of n greater than 8 would require a slight change
625	to TFRC's mechanisms to include a more severe response to two or more
626	round-trip times with heavy packet loss.

628	When calculating the average loss interval we need to decide whether to
629	include the interval since the most recent packet loss event.  We only
630	do this if it is sufficiently large to increase the average loss
631	interval.

633	Thus if the most recent loss intervals are I_0 to I_n, with I_0 being
634	the interval since the most recent loss event, then we calculate the
635	average loss interval I_mean as:

637	     I_tot0 = 0;
638	     I_tot1 = 0;
639	     W_tot = 0;
640	     for (i = 0 to n-1) {
641	       I_tot0 = I_tot0 + (I_i * w_i);
642	       W_tot = W_tot + w_i;
643	     }
644	     for (i = 1 to n) {
645	       I_tot1 = I_tot1 + (I_i * w_(i-1));
646	     }
647	     I_tot = max(I_tot0, I_tot1);
648	     I_mean = I_tot/W_tot;

650	The loss event rate, p is simply:

652	     p = 1 / I_mean;

654	5.5.  History Discounting

656	As described in Section 5.4, the most recent loss interval is only
657	assigned 1/(0.75*n) of the total weight in calculating the average loss
658	interval, regardless of the size of the most recent loss interval.  This
659	section describes an optional history discounting mechanism, discussed
660	further in [3] and [5], that allows the TFRC receiver to adjust the
661	weights, concentrating more of the relative weight on the most recent
662	loss interval, when the most recent loss interval is more than twice as
663	large as the computed average loss interval.

665	To carry out history discounting, we associate a discount factor DF_i
666	with each loss interval L_i, for i > 0, where each discount factor is a
667	floating point number.  The discount array maintains the cumulative
668	history of discounting for each loss interval.  At the beginning, the
669	values of DF_i in the discount array are initialized to 1:

671	     for (i = 1 to n) {
672	       DF_i = 1;
673	     }

675	History discounting also uses a general discount factor DF, also a
676	floating point number, that is also initialized to 1.  First we show how
677	the discount factors are used in calculating the average loss interval,
678	and then we describe later in this section how the discount factors are
679	modified over time.

681	As described in Section 5.4 the average loss interval is calculated
682	using the n previous loss intervals I_1, ..., I_n, and the interval I_0
683	that represents the number of packets received since the last loss
684	event.  The computation of the average loss interval using the discount
685	factors is a simple modification of the procedure in Section 5.4, as
686	follows:

688	     I_tot0 = I_0 * w_0
689	     I_tot1 = 0;
690	     W_tot0 = w_0
691	     W_tot1 = 0;
692	     for (i = 1 to n-1) {
693	       I_tot0 = I_tot0 + (I_i * w_i * DF_i * DF);
694	       W_tot0 = W_tot0 + w_i * DF_i * DF;
695	     }
696	     for (i = 1 to n) {
697	       I_tot1 = I_tot1 + (I_i * w_(i-1) * DF_i);
698	       W_tot1 = W_tot1 + w_(i-1) * DF_i;
699	     }
700	     p = min(W_tot0/I_tot0, W_tot1/I_tot1);

702	The general discounting factor, DF is updated on every packet arrival as
703	follows. First, the receiver computes the weighted average I_mean of the
704	loss intervals I_1, ..., I_n:

706	     I_tot = 0;
707	     W_tot = 0;
708	     for (i = 1 to n) {
709	       W_tot = w_(i-1) * DF_i;
710	       I_tot = I_tot + (I_i * w_(i-1) * DF_i);
711	     }
712	     I_mean = I_tot / W_tot;

714	This weighted average I_mean is compared to I_0, the number of packets
715	received since the last loss event.  If I_0 is greater than twice
716	I_mean, then the new loss interval is considerably larger than the old
717	ones, and the general discount factor DF is updated to decrease the
718	relative weight on the older intervals, as follows:

720	     if (I_0 > 2 * I_mean) {
721	       DF = 2 * I_mean/I_0;
722	       if (DF < THRESHOLD)
723	         DF = THRESHOLD;
724	     } else
725	       DF = 1;

727	A nonzero value for THRESHOLD ensures that older loss intervals from an
728	earlier time of high congestion are not discounted entirely.  We
729	recommend a THRESHOLD of 0.5.  Note that with each new packet arrival,
730	I_0 will increase further, and the discount factor DF will be updated.

732	When a new loss event occurs, the current interval shifts from I_0 to
733	I_1, loss interval I_i shifts to interval I_(i+1), and the loss interval
734	I_n is forgotten.  The previous discount factor DF has to be
735	incorporated into the discount array.  Because DF_i carries the discount
736	factor associated with loss interval I_i, the DF_i array has to be
737	shifted as well. This is done as follows:

739	     for (i = 1 to n) {
740	       DF_i = DF * DF_i;
741	     }
742	     for (i = n-1 to 0 step -1) {
743	       DF_(i+1) = DF_i;
744	     }
745	     I_0 = 1;
746	     DF_0 = 1;
747	     DF = 1;

749	This completes the description of the optional history discounting
750	mechanism. We emphasize that this is an optional mechanism whose sole
751	purpose is to allow TFRC to response somewhat more quickly to the sudden
752	absence of congestion, as represented by a long current loss interval.

754	6.  Data Receiver Protocol

756	The receiver periodically sends feedback messages to the sender.
757	Feedback packets should normally be sent at least once per RTT, unless
758	the sender is sending at a rate of less than one packet per RTT, in
759	which case a feedback packet should be send for every data packet
760	received.  A feedback packet should also be sent whenever a new loss
761	event is detected without waiting for the end of an RTT, and whenever an
762	out-of-order data packet is received that removes a loss event from the
763	history.

765	If the sender is transmitting at a high rate (many packets per RTT)
766	there may be some advantages to sending periodic feedback messages more
767	than once per RTT as this allows faster response to changing RTT
768	measurements, and more resilience to feedback packet loss.  However
769	there is little gain from sending a large number of feedback messages
770	per RTT.

772	6.1.  Receiver behavior when a data packet is received

774	When a data packet is received, the receiver performs the following
775	steps:

777	1)   Add the packet to the packet history.

779	2)   Let the previous value of p be p_prev.  Calculate the new value of
780	     p as described in Section 5.

782	3)   If p > p_prev, cause the feedback timer to expire, and perform the
783	     actions described in Section 6.2

785	     If p <= p_prev no action need be performed.

787	     However an optimization might check to see if the arrival of the
788	     packet caused a hole in the packet history to be filled and
789	     consequently two loss intervals were merged into one.  If this is
790	     the case, the receiver might also send feedback immediately.  The
791	     effects of such an optimization are normally expected to be small.

793	6.2.  Expiration of feedback timer

795	When the feedback timer at the receiver expires, the action to be taken
796	depends on whether data packets have been received since the last
797	feedback was sent.

799	Let the maximum sequence number of a packet at the receiver so far be
800	S_m, and the value of the RTT measurement included in packet S_m be R_m.
801	If data packets have been received since the pervious feedback was sent,
802	the receiver performs the following steps:

804	1)   Calculate the average loss event rate using the algorithm described
805	     above.

807	2)   Calculate the measured receive rate, X_recv, based on the packets
808	     received within the previous R_m seconds.

810	3)   Prepare and send a feedback packet containing the information
811	     described in Section 3.2.2

813	4)   Restart the feedback timer to expire after R_m seconds.

815	If no data packets have been received since the last feedback was sent,
816	no feedback packet is sent, and the feedback timer is restarted to
817	expire after R_m seconds.

819	6.3.  Receiver initialization

821	The receiver is initialized by the first packet that arrives at the
822	receiver. Let the sequence number of this packet be i.

824	When the first packet is received:

826	o    Set p=0

828	o    Set  X_recv = X_send, where X_send is the sending rate the sender
829	     reports in the first data packet.

831	o    Prepare and send a feedback packet.

833	o    Set the feedback timer to expire after R_i seconds.

835	6.3.1.  Initializing the Loss History after the First Loss Event

837	The number of packets until the first loss can not be used to compute
838	the sending rate directly, as the sending rate changes rapidly during
839	this time.  TFRC assumes that the correct data rate after the first loss
840	is half of the sending rate when the loss occurred.  TFRC approximates
841	this target rate by X_recv, the receive rate over the most recent round-
842	trip time.  After the first loss, instead of initializing the first loss
843	interval to the number of packets sent until the first loss, the TFRC
844	receiver calculates the loss interval that would be required to produce
845	the data rate X_recv, and uses this synthetic loss interval to seed the
846	loss history mechanism.

848	TFRC does this by finding some value p for which the throughput equation
849	in Section 3.1 gives a sending rate within 5% of X_recv, given the
850	current packet size s and round-trip time R.  The first loss interval is
851	then set to 1/p.  (The 5% tolerance is introduced simply because the
852	throughput equation is difficult to invert, and we want to reduce the
853	costs of calculating p numerically.)

855	7.  Security Considerations

857	TFRC is not a transport protocol in its own right, but a congestion
858	control mechanism that is intended to be used in conjunction with a
859	transport protocol.  Therefore security primarily needs to be considered
860	in the context of a specific transport protocol and its authentication
861	mechanisms.

863	Congestion control mechanisms can potentially be exploited to create
864	denial of service.  This may occur through spoofed feedback.  Thus any
865	transport protocol that uses TFRC should take care to ensure that
866	feedback is only accepted from the receiver of the data.  The precise
867	mechanism to achieve this will however depend on the transport protocol
868	itself.

870	In addition, congestion control mechanisms may potentially be
871	manipulated by a greedy receiver that wishes to receive more than its
872	fair share of network bandwidth.  A receiver might do this by claiming
873	to have received packets that in fact were lost due to congestion.
874	Possible defenses against such a receiver would normally include some
875	form of nonce that the receiver must feed back to the sender to prove
876	receipt.  However, the details of such a nonce would depend on the
877	transport protocol, and in particular on whether the transport protocol
878	is reliable or unreliable.

880	We expect that protocols incorporating ECN with TFRC will also want to
881	incorporate feedback from the receiver to the sender using the ECN nonce
882	[WES01].   The ECN nonce is a modification to ECN that protects the
883	sender from the accidental or malicious concealment of marked packets.
884	Again, the details of such a nonce would depend on the transport
885	protocol, and are not addressed in this document.

887	8.  IANA Considerations

889	There are no IANA actions required for this document.

891	9.  Authors' Addresses

893	     Mark Handley, Jitendra Padhye, Sally Floyd
894	     ACIRI/ICSI
895	     1947 Center St, Suite 600
896	     Berkeley, CA 94708
897	     mjh@aciri.org, padhye@aciri.org, floyd@aciri.org

899	     Joerg Widmer
900	     Lehrstuhl Praktische Informatik IV
901	     Universitat Mannheim
902	     L 15, 16 - Room 415
903	     D-68131 Mannheim
904	     Germany
905	     widmer@informatik.uni-mannheim.de

907	10.  Acknowledgments

909	We would like to acknowledge feedback and discussions on equation-based
910	congestion control with a wide range of people, including members of the
911	Reliable Multicast Research Group, the Reliable Multicast Transport
912	Working Group, and the End-to-End Research Group.   We would like to
913	thank Eduardo Urzaiz, Vladica Stanisic, and Shushan Wen for feedback on
914	earlier versions of this document, and to thank Mark Allman for his
915	extensive feedback from using the draft to produce a working
916	implementation.

918	11.  References

920	[1] Balakrishnan, H., Rahul, H., and Seshan, S., "An Integrated
921	     Congestion Management Architecture for Internet Hosts," Proc. ACM
922	     SIGCOMM, Cambridge, MA, September 1999.

924	[2] S. Floyd, M. Handley, J. Padhye, and J. Widmer, "Equation-Based
925	     Congestion Control for Unicast Applications", August 2000, Proc
926	     SIGCOMM 2000.

928	[3] S. Floyd, M. Handley, J. Padhye, and J. Widmer, "Equation-Based
929	     Congestion Control for Unicast Applications: the Extended Version",
930	     ICSI tech report TR-00-03, March 2000.

932	[4] Padhye, J. and  Firoiu, V. and Towsley, D. and Kurose, J., "Modeling
933	     TCP Throughput: A Simple Model and its Empirical Validation", Proc
934	     ACM SIGCOMM 1998.

936	[5] Widmer, J., "Equation-Based Congestion Control", Diploma Thesis,
937	     University of Mannheim, February 2000.  URL
938	     "http://www.aciri.org/tfrc/".

940	[6] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: A
941	     Transport Protocol for Real-Time Applications", RFC 1889, January
942	     1996.

944	[7] K. Ramakrishnan and S. Floyd, "A Proposal to add Explicit Congestion
945	     Notification (ECN) to IP", RFC 2481, January 1999.

947	[8] V. Paxson and M. Allman, "Computing TCP's Retransmission Timer", RFC
948	     2988, November 2000.

950	[9] Wetherall, D., Ely, D., and Spring, N., "Robust ECN Signaling with
951	     Nonces", draft-ietf-tsvwg-tcp-nonce-00.txt, internet draft, work in
952	     progress, January 2001.  Citation for informational purposes only.