idnits 2.17.1 draft-ietf-tsvwg-tfrc-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. Miscellaneous warnings: ---------------------------------------------------------------------------- -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (18 May 2001) is 8373 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Looks like a reference, but probably isn't: 'BRS99' on line 91 -- Looks like a reference, but probably isn't: 'WES01' on line 878 == Unused Reference: '1' is defined on line 912, but no explicit reference was found in the text == Unused Reference: '9' is defined on line 942, but no explicit reference was found in the text -- Possible downref: Non-RFC (?) normative reference: ref. '1' -- Possible downref: Non-RFC (?) normative reference: ref. '2' -- Possible downref: Non-RFC (?) normative reference: ref. '3' -- Possible downref: Non-RFC (?) normative reference: ref. '4' -- Possible downref: Non-RFC (?) normative reference: ref. '5' ** Obsolete normative reference: RFC 1889 (ref. '6') (Obsoleted by RFC 3550) ** Obsolete normative reference: RFC 2481 (ref. '7') (Obsoleted by RFC 3168) ** Obsolete normative reference: RFC 2988 (ref. '8') (Obsoleted by RFC 6298) == Outdated reference: A later version (-04) exists of draft-ietf-tsvwg-tcp-nonce-00 ** Downref: Normative reference to an Historic draft: draft-ietf-tsvwg-tcp-nonce (ref. '9') Summary: 8 errors (**), 0 flaws (~~), 4 warnings (==), 9 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Internet Engineering Task Force TSV WG 2 INTERNET-DRAFT Mark Handley/ACIRI 3 draft-ietf-tsvwg-tfrc-02.txt Jitendra Padhye/ACIRI 4 Sally Floyd/ACIRI 6 Joerg Widmer/Univ. Mannheim 7 18 May 2001 8 Expires: November 2001 10 TCP Friendly Rate Control (TFRC): 11 Protocol Specification 13 Status of this Document 15 This document is an Internet-Draft and is in full conformance with all 16 provisions of Section 10 of RFC2026. 18 Internet-Drafts are working documents of the Internet Engineering Task 19 Force (IETF), its areas, and its working groups. Note that other groups 20 may also distribute working documents as Internet-Drafts. 22 Internet-Drafts are draft documents valid for a maximum of six months 23 and may be updated, replaced, or obsoleted by other documents at any 24 time. It is inappropriate to use Internet- Drafts as reference material 25 or to cite them other than as "work in progress." 27 The list of current Internet-Drafts can be accessed at 28 http://www.ietf.org/ietf/1id-abstracts.txt 30 The list of Internet-Draft Shadow Directories can be accessed at 31 http://www.ietf.org/shadow.html. 33 This document is a product of the IETF TSV WG. Comments should be 34 addressed to the authors. 36 Abstract 38 This document specifies TCP-Friendly Rate Control (TFRC). 39 TFRC is a congestion control mechanism for unicast flows 40 operating in a best-effort Internet environment. It is 41 reasonably fair when competing for bandwidth with TCP flows, 42 but has a much lower variation of throughput over time 43 compared with TCP, making it more suitable for applications 44 such as telephony or streaming media where a relatively smooth 45 sending rate is of importance. 47 Table of Contents 49 1. Introduction. . . . . . . . . . . . . . . . . . . . . . 4 50 2. Terminology . . . . . . . . . . . . . . . . . . . . . . 5 51 3. Protocol Mechanism. . . . . . . . . . . . . . . . . . . 5 52 3.1. TCP Throughput Equation. . . . . . . . . . . . . . . 5 53 3.2. Packet Contents. . . . . . . . . . . . . . . . . . . 7 54 3.2.1. Data Packets. . . . . . . . . . . . . . . . . . . 7 55 3.2.2. Feedback Packets. . . . . . . . . . . . . . . . . 8 56 4. Data Sender Protocol. . . . . . . . . . . . . . . . . . 8 57 4.1. Measuring the Packet Size. . . . . . . . . . . . . . 8 58 4.2. Sender Initialization. . . . . . . . . . . . . . . . 9 59 4.3. Sender behavior when a feedback packet is 60 received. . . . . . . . . . . . . . . . . . . . . . . . . 9 61 4.4. Expiration of nofeedback timer . . . . . . . . . . . 10 62 4.5. Preventing Oscillations. . . . . . . . . . . . . . . 11 63 4.6. Scheduling of Packet Transmissions . . . . . . . . . 12 64 5. Calculation of the Loss Event Rate (p). . . . . . . . . 13 65 5.1. Detection of Lost or Marked Packets. . . . . . . . . 13 66 5.2. Translation from Loss History to Loss Events . . . . 13 67 5.3. Inter-loss Event Interval. . . . . . . . . . . . . . 15 68 5.4. Average Loss Interval. . . . . . . . . . . . . . . . 15 69 5.5. History Discounting. . . . . . . . . . . . . . . . . 16 70 6. Data Receiver Protocol. . . . . . . . . . . . . . . . . 18 71 6.1. Receiver behavior when a data packet is 72 received. . . . . . . . . . . . . . . . . . . . . . . . . 19 73 6.2. Expiration of feedback timer . . . . . . . . . . . . 19 74 6.3. Receiver initialization. . . . . . . . . . . . . . . 20 75 6.3.1. Initializing the Loss History after the 76 First Loss Event . . . . . . . . . . . . . . . . . . . . 20 77 7. Security Considerations . . . . . . . . . . . . . . . . 20 78 8. Authors' Addresses. . . . . . . . . . . . . . . . . . . 21 79 9. Acknowledgments . . . . . . . . . . . . . . . . . . . . 21 80 10. References . . . . . . . . . . . . . . . . . . . . . . 22 82 1. Introduction 84 This document specifies TCP-Friendly Rate Control (TFRC). TFRC is a 85 congestion control mechanism designed for unicast flows operating in a 86 Internet environment and competing with TCP traffic. Instead of 87 specifying a complete protocol, this document simply specifies a 88 congestion control mechanism that could be used in a transport protocol 89 such as RTP [6], in an application incorporating end-to-end congestion 90 control at the application level, or in the context of endpoint 91 congestion management [BRS99]. This document does not discuss packet 92 formats, reliability, or implementation-related issues. 94 TFRC is designed to be reasonably fair when competing for bandwidth with 95 TCP flows [2]. However it has a much lower variation of throughput over 96 time compared with TCP, which makes it more suitable for applications 97 such as telephony or streaming media where a relatively smooth sending 98 rate is of importance. 100 The penalty of having smoother throughput than TCP while competing 101 fairly for bandwidth is that TFRC responds slower than TCP to changes in 102 available bandwidth. Thus TFRC should only be used when the application 103 has a requirement for smooth throughput, in particular, avoiding TCP's 104 halving of the sending rate in response to a single packet drop. For 105 applications that simply need to transfer as much data as possible in as 106 short a time as possible we recommend using TCP, or if reliability is 107 not required, using an Additive-Increase, Multiplicative-Decrease (AIMD) 108 congestion control scheme with similar parameters to those used by TCP. 110 TFRC is designed for applications that use a fixed packet size, and vary 111 their sending rate in packets per second in response to congestion. 112 Some audio applications require a fixed interval of time between packets 113 and vary their packet size instead of their packet rate in response to 114 congestion. The congestion control mechanism in this document cannot be 115 used by those applications; variants of TFRC for applications that have 116 a fixed sending rate but vary their packet size in response to 117 congestion will be addressed in a separate document. 119 TFRC is a receiver-based mechanism, with the calculation of the 120 congestion control information (i.e., the loss event rate) in the data 121 receiver rather in the data sender. This is well-suited to an 122 application where the sender is a large server handling many concurrent 123 connections, and the receiver has more memory and CPU cycles available 124 for computation. In addition, a receiver-based mechanism is more 125 suitable as a building block for multicast congestion control. 127 2. Terminology 129 In this document, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL", 130 "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and 131 "OPTIONAL" are to be interpreted as described in RFC 2119 and indicate 132 requirement levels for compliant TFRC implementations. 134 3. Protocol Mechanism 136 For its congestion control mechanism, TFRC directly uses a throughput 137 equation for the allowed sending rate as a function of the loss event 138 rate and round-trip time. In order to compete fairly with TCP, TFRC 139 uses the TCP throughput equation, which roughly describes TCP's sending 140 rate as a function of the loss event rate, round-trip time, and packet 141 size. We define a loss event as one or more lost or marked packets from 142 a window of data, where a marked packet refers to a congestion 143 indication from Explicit Congestion Notification (ECN) [7]. 145 Generally speaking, TFRC's congestion control mechanism works as 146 follows: 148 o The receiver measures the loss event rate and feeds this 149 information back to the sender. 151 o The sender also uses these feedback messages to measure the round- 152 trip time (RTT). 154 o The loss event rate and RTT are then fed into TFRC's throughput 155 equation, giving the acceptable transmit rate. 157 o The sender then adjusts its transmit rate to match the calculated 158 rate. 160 The dynamics of TFRC are sensitive to how the measurements are performed 161 and applied. We recommend specific mechanisms below to perform and 162 apply these measurements. Other mechanisms are possible, but it is 163 important to understand how the interactions between mechanisms affect 164 the dynamics of TFRC. 166 3.1. TCP Throughput Equation 168 Any realistic equation giving TCP throughput as a function of loss event 169 rate and RTT should be suitable for use in TFRC. However, we note that 170 the TCP throughput equation used must reflect TCP's retransmit timeout 171 behavior, as this dominates TCP throughput at higher loss rates. We 172 also note that the assumptions implicit in the throughput equation about 173 the loss event rate parameter have to be a reasonable match to how the 174 loss rate or loss event rate is actually measured. While this match is 175 not perfect for the throughput equation and loss rate measurement 176 mechanisms given below, in practice the assumptions turn out to be close 177 enough. 179 The throughput equation we currently recommend for TFRC is a slightly 180 simplified version of the throughput equation for Reno TCP from [4]. 181 Ideally we'd prefer a throughput equation based on SACK TCP, but no one 182 has yet derived the throughput equation for SACK TCP, and from both 183 simulations and experiments, the differences between the two equations 184 are relatively minor. 186 The throughput equation is: 188 s 189 X = ---------------------------------------------------------- 190 R*sqrt(2*b*p/3) + (t_RTO * (3*sqrt(3*b*p/8) * p * (1+32*p^2))) 192 Where: 194 X is the transmit rate in bytes/second. 196 s is the packet size in bytes. 198 R is the round trip time in seconds. 200 p is the loss event rate, between 0 and 1.0, of the number of loss 201 events as a fraction of the number of packets transmitted. 203 t_RTO is the TCP retransmission timeout value in seconds. 205 b is the number of packets acknowledged by a single TCP 206 acknowledgement. 208 We further simplify this by setting t_RTO = 4*R. A more accurate 209 calculation of t_RTO is possible, but experiments with the current 210 setting have resulted in reasonable fairness with existing TCP 211 implementations [5]. Another possibility would be to set t_RTO = max(4R, 212 one second), to match the recommended minimum of one second on the RTO 213 [8]. 215 Many current TCP connections use delayed acknowledgements, sending an 216 acknowledgement for every two data packets received, and thus have a 217 sending rate modeled by b = 2. However, TCP is also allowed to send an 218 acknowledgement for every data packet, and this would be modeled by b = 219 1. Because many TCP implementations do not use delayed 220 acknowledgements, we recommend b = 1. 222 In future, different TCP equations may be substituted for this equation. 223 The requirement is that the throughput equation be a reasonable 224 approximation of the sending rate of TCP for conformant TCP congestion 225 control. 227 The parameters s (packet size), p (loss event rate) and R (RTT) need to 228 be measured or calculated by a TFRC implementation. The measurement of 229 s is specified in Section 4.1, measurement of R is specified in Section 230 4.3, and measurement of p is specified in Section 5. In the rest of this 231 document all data rates are measured in bytes/second. 233 3.2. Packet Contents 235 Before specifying the sender and receiver functionality, we describe the 236 contents of the data packets sent by the sender and feedback packets 237 sent by the receiver. As TFRC will be used along with a transport 238 protocol, we do not specify packet formats, as these depend on the 239 details of the transport protocol used. 241 3.2.1. Data Packets 243 Each data packet sent by the data sender contains the following 244 information: 246 o A sequence number. This number is incremented by one for each data 247 packet transmitted. The field must be sufficiently large that it 248 does not wrap causing two different packets with the same sequence 249 number to be in the receiver's recent packet history at the same 250 time. 252 o A timestamp indicating when the packet is sent. We denote by ts_i 253 the timestamp of the packet with sequence number i. The resolution 254 of the timestamp should typically be measured in milliseconds. 256 o The sender's current estimate of the round trip time. The estimate 257 reported in packet i is denoted by R_i. 259 o The sender's current transmit rate. The estimate reported in packet 260 i is denoted by X_i. 262 3.2.2. Feedback Packets 264 Each feedback packet sent by the data receiver contains the following 265 information: 267 o The timestamp of the last data packet received. We denote this by 268 t_recvdata. If the last packet received at the receiver has 269 sequence number i, then t_recvdata = ts_i. 271 o The amount of time elapsed between the receipt of the last data 272 packet at the receiver, and the generation of this feedback report. 273 We denote this by t_delay. 275 o The rate at which the receiver estimates that data was received 276 since the last feedback report was sent. We denote this by X_recv. 278 o The receiver's current estimate of the loss event rate, p. 280 4. Data Sender Protocol 282 The data sender sends a stream of data packets to the data receiver at a 283 controlled rate. When a feedback packet is received from the data 284 receiver, the data sender changes its sending rate, based on the 285 information contained in the feedback report. If the sender does not 286 receive a feedback report for two round trip times, it cuts its sending 287 rate in half. This is achieved by means of a timer called the 288 nofeedback timer. 290 We specify the sender-side protocol in the following steps: 292 o Measurement of the mean packet size being sent. 294 o The sender behavior when a feedback packet is received. 296 o The sender behavior when the nofeedback timer expires. 298 o Oscillation prevention (optional) 300 o Scheduling of transmission on non-realtime operating systems. 302 4.1. Measuring the Packet Size 304 The parameter s (packet size) is normally known to an application. This 305 may not be so in two cases: 307 o The packet size naturally varies depending on the data. In this 308 case, although the packet size varies, that variation is not 309 coupled to the transmit rate. It should normally be safe to use an 310 estimate of the mean packet size for s. 312 o The application needs to change the packet size rather than the 313 number of packets per second to perform congestion control. This 314 would normally be the case with packet audio applications where a 315 fixed interval of time needs to be represented by each packet. 316 Such applications need to have a completely different way of 317 measuring parameters. 319 The second class of applications are discussed separately in a separate 320 document. For the remainder of this section we assume the sender can 321 estimate the packet size, and that congestion control is performed by 322 adjusting the number of packets sent per second. 324 4.2. Sender Initialization 326 To initialize the sender, the value of X is set to 1 packet/second and 327 the nofeedback timer is set to expire after 2 seconds. The initial 328 values for R (RTT) and t_RTO are undefined until they are set as 329 described above. The intial value of tld, for the Time Last Doubled 330 during slow-start, is set to -1. 332 4.3. Sender behavior when a feedback packet is received 334 The sender knows its current sending rate, X, and maintains an estimate 335 of the current round trip time, R, and an estimate of the timeout 336 interval, t_RTO. 338 When a feedback packet is received by the sender at time t_now, the 339 following actions should be performed: 341 1) Calculate a new round trip sample. 342 R_sample = (t_now - t_recvdata) - t_delay. 344 2) Update the round trip time estimate: 346 If no feedback has been received before 347 R = R_sample; 348 Else 349 R = q*R + (1-q)*R_sample; 351 TFRC is not sensitive to the precise value for the filter constant 352 q, but we recommend a default value of 0.9. 354 3) Update the timeout interval: 356 t_RTO = 4*R. 358 4) Update the sending rate as follows: 360 If (p > 0) 361 Calculate X_calc using the TCP throughput equation. 362 X = max(min(X_calc, 2*X_recv), s/64); 363 Else 364 If (t_now - tld >= RTT) 365 X = max(min(2*X, 2*X_recv), s/RTT); 366 tld = t_now; 368 Note that if p == 0, then the sender is in slow-start phase, 369 where it approximately doubles the sending rate each round- 370 trip time until a loss occurs. The s/RTT term gives a minimum 371 sending rate during slow-start of one packet per RTT. When p 372 > 0, the sender sends at least one packet every 64 seconds. 374 5) Reset the nofeedback timer to expire after max(4*R, 2*s/X) seconds. 376 4.4. Expiration of nofeedback timer 378 If the nofeedback timer expires, the sender should perform the following 379 actions: 381 1) Cut the sending rate in half. This is done by modifying the 382 sender's cached copy of X_recv (the receive rate). Because the 383 sending rate is limited to at most twice X_recv, modifying X_recv 384 limits the current sending rate, but allows the sender to slow- 385 start, doubling its sending rate each RTT, if feedback messages 386 resume reporting no losses. 388 If (X_calc > 2*X_recv) 389 X_recv = max(X_recv/2, s/128); 390 Else 391 X_recv = X_calc/4; 393 The s/128 term limits the backoff to one packet every 64 seconds in 394 the case of persistent absence of feedback. 396 2) The value of X must then be recalculated as described under point 397 (4) above. 399 If the nofeedback timer expires when the sender does not yet have 400 an RTT sample, and has not yet received any feedback from the 401 sender, then step (1) can be skipped, and the sending rate cut in 402 half directly: 404 X = max(X/2, s/64) 406 3) Restart the nofeedback timer to expire after max(4*R, 2*s/X) 407 seconds. 409 Note that when the sender stops sending, the receiver will stop sending 410 feedback. This will cause the nofeedback timer to start to expire and 411 decrease X_recv. If the sender subsequently starts to send again, 412 X_recv will limit the transmit rate, and a normal slowstart phase will 413 occur until the transmit rate reaches X_calc. 415 4.5. Preventing Oscillations 417 To prevent oscillatory behavior in environments with a low degree of 418 statistical multiplexing it is useful to modify sender's transmit rate 419 to provide congestion avoidance behavior by reducing the transmit rate 420 as the queuing delay (and hence RTT) increases. To do this the sender 421 maintains an estimate of the long-term RTT and modifies its sending rate 422 depending on how the most recent sample of the RTT differs from this 423 value. The long-term sample is R_sqmean, and is set as follows: 425 If no feedback has been received before 426 R_sqmean = sqrt(R_sample); 427 Else 428 R_sqmean = q2*R_sqmean + (1-q2)*sqrt(R_sample); 430 Thus R_sqmean gives the exponentially weighted moving average of the 431 square root of the RTT samples. The constant q2 should be set similarly 432 to q, and we recommend a value of 0.9 as the default. 434 The sender obtains the base transmit rate, X, from the throughput 435 function. It then calculates a modified instantaneous transmit rate 436 X_inst, as follows: 438 X_inst = X * R_sqmean / sqrt(R_sample); 440 When sqrt(R_sample) is greater than R_sqmean then the queue is typically 441 increasing and so the transmit rate needs to be decreased for stable 442 operation. 444 Note: This modification is not always strictly required, especially if 445 the degree of statistical multiplexing in the network is high. However 446 we recommend that it is done because it does make TFRC behave better in 447 environments with a low level of statistical multiplexing. If it is not 448 done, we recommend using a very low value of q, such that q is close to 449 or exactly zero. 451 4.6. Scheduling of Packet Transmissions 453 As TFRC is rate-based, and as operating systems typically cannot 454 schedule events precisely, it is necessary to be opportunistic about 455 sending data packets so that the correct average rate is maintained 456 despite the course-grain or irregular scheduling of the operating 457 system. Thus a typical sending loop will calculate the correct inter- 458 packet interval, t_ipi, as follows: 460 t_ipi = s/X_inst; 462 When a sender first starts sending at time t_0, it calculates t_ipi, and 463 calculates a nominal send time t_1 = t_0 + t_ipi for packet 1. When the 464 application becomes idle, it checks the current time, t_now, and then 465 requests re-scheduling after (t_ipi - (t_now - t_0)) seconds. When the 466 application is re-scheduled, it checks the current time, t_now, again. 467 If (t_now > t_1 - delta) then packet 1 is sent. 469 Now a new t_ipi may be calculated, and used to calculate a nominal send 470 time t_2 for packet 2: t2 = t_1 + t_ipi. The process then repeats, with 471 each successive packet's send time being calculated from the nominal 472 send time of the previous packet. 474 In some cases, when the nominal send time, t_i, of the next packet is 475 calculated, it may already be the case that t_now > t_i - delta. In 476 such a case the packet should be sent immediately. Thus if the 477 operating system has coarse timer granularity and the transmit rate is 478 high, then TFRC may send short bursts of several packets separated by 479 intervals of the OS timer granularity. 481 The parameter delta is to allow a degree of flexibility in the send time 482 of a packet. If the operating system has a scheduling timer granularity 483 of t_gran seconds, then delta would typically be set to: 485 delta = min(t_ipi/2, t_gran/2); 487 t_gran is 10ms on many Unix systems. If t_gran is not known, a value of 488 10ms can be safely assumed. 490 5. Calculation of the Loss Event Rate (p) 492 Obtaining a accurate and stable measurement of the loss event rate is of 493 primary importance for TFRC. Loss rate measurement is performed at the 494 receiver, based on the detection of lost or marked packets from the 495 sequence numbers of arriving packets. We describe this process before 496 describing the rest of the receiver protocol. 498 5.1. Detection of Lost or Marked Packets 500 TFRC assumes that all packets contain a sequence number that is 501 incremented by one for each packet that is sent. For the purposes of 502 this specification, we require that if a lost packet is retransmitted, 503 the retransmission is given a new sequence number that is the latest in 504 the transmission sequence, and not the same sequence number as the 505 packet that was lost. If a transport protocol has the requirement that 506 it must retransmit with the original sequence number, then the transport 507 protocol designer must figure out how to distinguish delayed from 508 retransmitted packets and how to detect lost retransmissions. 510 The receiver maintains a data structure that keeps track of which 511 packets have arrived and which are missing. For the purposes of 512 specification, we assume that the data structure consists of a list of 513 packets that have arrived along with the receiver timestamp when each 514 packet was received. In practice this data structure will normally be 515 stored in a more compact representation, but this is implementation- 516 specific. 518 The loss of a packet is detected by the arrival of at least three 519 packets with a higher sequence number than the lost packet. The 520 requirement for three subsequent packets is the same as with TCP, and is 521 to make TFRC more robust in the presence of reordering. In contrast to 522 TCP, if a packet arrives late (after 3 subsequent packets arrived) in 523 TFRC, the late packet can fill the hole in TFRC's reception record, and 524 the receiver can recalculate the loss event rate. Future versions of 525 TFRC might make the requirement for three subsequent packets adaptive 526 based on experienced packet reordering, but we do not specify such a 527 mechanism here. 529 For an ECN-capable connection, a marked packet is detected as a 530 congestion event as soon as it arrives, without having to wait for the 531 arrival of subsequent packets. 533 5.2. Translation from Loss History to Loss Events 535 TFRC requires that the loss fraction be robust to several consecutive 536 packets lost where those packets are part of the same loss event. This 537 is similar to TCP, which (typically) only performs one halving of the 538 congestion window during any single RTT. Thus the receiver needs to map 539 the packet loss history into a loss event record, where a loss event is 540 one or more packets lost in an RTT. To perform this mapping, the 541 receiver needs to know the RTT to use, and this is supplied periodically 542 by the sender, typically as control information piggy-backed onto a data 543 packet. TFRC is not sensitive to how the RTT measurement sent to the 544 receiver is made, but we recommend using the sender's calculated RTT, R, 545 (see Section 4.3) for this purpose. 547 To determine whether a lost or marked packet should start a new loss 548 event, or be counted as part of an existing loss event, we need to 549 compare the sequence numbers and timestamps of the packets that arrived 550 at the receiver. For a marked packet S_new, its reception time T_new 551 can be noted directly. For a lost packet, we can interpolate to infer 552 the nominal "arrival time". Assume: 554 S_loss is the sequence number of a lost packet. 556 S_before is the sequence number of the last packet to arrive with 557 sequence number before S_loss. 559 S_after is the sequence number of the first packet to arrive with 560 sequence number after S_loss. 562 T_before is the reception time of S_before. 564 T_after is the reception time of S_after. 566 Note that T_before can either be before or after T_after due to 567 reordering. 569 For a lost packet S_loss, we can interpolate its nominal "arrival time" 570 at the receiver from the arrival times of S_before and S_after. Thus 572 T_loss = T_before + ( (T_after - T_before) 573 * (S_loss - S_before)/(S_after - S_before) ); 575 Note that if the sequence space wrapped between S_before and S_after, 576 then the sequence numbers must be modified to take this into account 577 before performing this calculation. If the largest possible sequence 578 number is S_max, and S_before > S_after, then modifying each sequence 579 number S by S' = (S + (S_max + 1)/2) mod (S_max + 1) would normally be 580 sufficient. 582 If the lost packet S_old was determined to have started the previous 583 loss event, and we have just determined that S_new has been lost, then 584 we interpolate the nominal arrival times of S_old and S_new, called 585 T_old and T_new respectively. 587 If T_old + R >= T_new, then S_new is part of the existing loss event. 588 Otherwise S_new is the first packet in a new loss event. 590 5.3. Inter-loss Event Interval 592 If a loss interval, A, is determined to have started with packet 593 sequence number S_A and the next loss interval, B, started with packet 594 sequence number S_B, then the number of packets in loss interval A is 595 given by (S_B - S_A). 597 5.4. Average Loss Interval 599 To calculate the loss event rate p, we first calculate the average loss 600 interval. This is done using a filter that weights the n most recent 601 loss event intervals in such a way that the measured loss event rate 602 changes smoothly. 604 Weights w_0 to w_(n-1) are calculated as: 606 If (i < n/2) 607 w_i = 1; 608 Else 609 w_i = 1 - (i - (n/2 - 1))/(n/2 + 1); 611 Thus if n=8, the values of w_0 to w_7 are: 613 1.0, 1.0, 1.0, 1.0, 0.8, 0.6, 0.4, 0.2 615 The value n for the number of loss intervals used in calculating the 616 loss event rate determines TRFC's speed in responding to changes in the 617 level of congestion. As currently specified, TFRC should not be used 618 for values of n significantly greater than 8, for traffic that might 619 compete in the global Internet with TCP. At the very least, safe 620 operation with values of n greater than 8 would require a slight change 621 to TFRC's mechanisms to include a more severe response to two or more 622 round-trip times with heavy packet loss. 624 When calculating the average loss interval we need to decide whether to 625 include the interval since the most recent packet loss event. We only 626 do this if it is sufficiently large to increase the average loss 627 interval. 629 Thus if the most recent loss intervals are I_0 to I_n, with I_0 being 630 the interval since the most recent loss event, then we calculate the 631 average loss interval I_mean as: 633 I_tot0 = 0; 634 I_tot1 = 0; 635 W_tot = 0; 636 for (i = 0 to n-1) { 637 I_tot0 = I_tot0 + (I_i * w_i); 638 W_tot = W_tot + w_i; 639 } 640 for (i = 1 to n) { 641 I_tot1 = I_tot1 + (I_i * w_(i-1)); 642 } 643 I_tot = max(I_tot0, I_tot1); 644 I_mean = I_tot/W_tot; 646 The loss event rate, p is simply: 648 p = 1 / I_mean; 650 5.5. History Discounting 652 As described in Section 5.4, the most recent loss interval is only 653 assigned 1/(0.75*n) of the total weight in calculating the average loss 654 interval, regardless of the size of the most recent loss interval. This 655 section describes an optional history discounting mechanism, discussed 656 further in [3] and [5], that allows the TFRC receiver to adjust the 657 weights, concentrating more of the relative weight on the most recent 658 loss interval, when the most recent loss interval is more than twice as 659 large as the computed average loss interval. 661 To carry out history discounting, we associate a discount factor DF_i 662 with each loss interval L_i, for i > 0, where each discount factor is a 663 floating point number. The discount array maintains the cumulative 664 history of discounting for each loss interval. At the beginning, the 665 values of DF_i in the discount array are initialized to 1: 667 for (i = 1 to n) { 668 DF_i = 1; 669 } 671 History discounting also uses a general discount factor DF, also a 672 floating point number, that is also initialized to 1. First we show how 673 the discount factors are used in calculating the average loss interval, 674 and then we describe later in this section how the discount factors are 675 modified over time. 677 As described in Section 5.4 the average loss interval is calculated 678 using the n previous loss intervals I_1, ..., I_n, and the interval I_0 679 that represents the number of packets received since the last loss 680 event. The computation of the average loss interval using the discount 681 factors is a simple modification of the procedure in Section 5.4, as 682 follows: 684 I_tot0 = I_0 * w_0 685 I_tot1 = 0; 686 W_tot0 = w_0 687 W_tot1 = 0; 688 for (i = 1 to n-1) { 689 I_tot0 = I_tot0 + (I_i * w_i * DF_i * DF); 690 W_tot0 = W_tot0 + w_i * DF_i * DF; 691 } 692 for (i = 1 to n) { 693 I_tot1 = I_tot1 + (I_i * w_(i-1) * DF_i); 694 W_tot1 = W_tot1 + w_(i-1) * DF_i; 695 } 696 p = min(W_tot0/I_tot0, W_tot1/I_tot1); 698 The general discounting factor, DF is updated on every packet arrival as 699 follows. First, the receiver computes the weighted average I_mean of the 700 loss intervals I_1, ..., I_n: 702 I_tot = 0; 703 W_tot = 0; 704 for (i = 1 to n) { 705 W_tot = w_(i-1) * DF_i; 706 I_tot = I_tot + (I_i * w_(i-1) * DF_i); 707 } 708 I_mean = I_tot / W_tot; 710 This weighted average I_mean is compared to I_0, the number of packets 711 received since the last loss event. If I_0 is greater than twice 712 I_mean, then the new loss interval is considerably larger than the old 713 ones, and the general discount factor DF is updated to decrease the 714 relative weight on the older intervals, as follows: 716 if (I_0 > 2 * I_mean) { 717 DF = 2 * I_mean/I_0; 718 if (DF < THRESHOLD) 719 DF = THRESHOLD; 720 } else 721 DF = 1; 723 A nonzero value for THRESHOLD ensures that older loss intervals from an 724 earlier time of high congestion are not discounted entirely. We 725 recommend a THRESHOLD of 0.5. Note that with each new packet arrival, 726 I_0 will increase further, and the discount factor DF will be updated. 728 When a new loss event occurs, the current interval shifts from I_0 to 729 I_1, loss interval I_i shifts to interval I_(i+1), and the loss interval 730 I_n is forgotten. The previous discount factor DF has to be 731 incorporated into the discount array. Because DF_i carries the discount 732 factor associated with loss interval I_i, the DF_i array has to be 733 shifted as well. This is done as follows: 735 for (i = 1 to n) { 736 DF_i = DF * DF_i; 737 } 738 for (i = n-1 to 0 step -1) { 739 DF_(i+1) = DF_i; 740 } 741 I_0 = 1; 742 DF_0 = 1; 743 DF = 1; 745 This completes the description of the optional history discounting 746 mechanism. We emphasize that this is an optional mechanism whose sole 747 purpose is to allow TFRC to response somewhat more quickly to the sudden 748 absence of congestion, as represented by a long current loss interval. 750 6. Data Receiver Protocol 752 The receiver periodically sends feedback messages to the sender. 753 Feedback packets should normally be sent at least once per RTT, unless 754 the sender is sending at a rate of less than one packet per RTT, in 755 which case a feedback packet should be send for every data packet 756 received. A feedback packet should also be sent whenever a new loss 757 event is detected without waiting for the end of an RTT, and whenever an 758 out-of-order data packet is received that removes a loss event from the 759 history. 761 If the sender is transmitting at a high rate (many packets per RTT) 762 there may be some advantages to sending periodic feedback messages more 763 than once per RTT as this allows faster response to changing RTT 764 measurements, and more resilience to feedback packet loss. However 765 there is little gain from sending a large number of feedback messages 766 per RTT. 768 6.1. Receiver behavior when a data packet is received 770 When a data packet is received, the receiver performs the following 771 steps: 773 1) Add the packet to the packet history. 775 2) Let the previous value of p be p_prev. Calculate the new value of 776 p as described in Section 5. 778 3) If p > p_prev, cause the feedback timer to expire, and perform the 779 actions described in Section 6.2 781 If p <= p_prev no action need be performed. 783 However an optimization might check to see if the arrival of the 784 packet caused a hole in the packet history to be filled and 785 consequently two loss intervals were merged into one. If this is 786 the case, the receiver might also send feedback immediately. The 787 effects of such an optimization are normally expected to be small. 789 6.2. Expiration of feedback timer 791 When the feedback timer at the receiver expires, the action to be taken 792 depends on whether data packets have been received since the last 793 feedback was sent. 795 Let the maximum sequence number of a packet at the receiver so far be 796 S_m, and the value of the RTT measurement included in packet S_m be R_m. 797 If data packets have been received since the pervious feedback was sent, 798 the receiver performs the following steps: 800 1) Calculate the average loss event rate using the algorithm described 801 above. 803 2) Calculate the measured receive rate, X_recv, based on the packets 804 received within the previous R_m seconds. 806 3) Prepare and send a feedback packet containing the information 807 described in Section 3.2.2 809 4) Restart the feedback timer to expire after R_m seconds. 811 If no data packets have been received since the last feedback was sent, 812 no feedback packet is sent, and the feedback timer is restarted to 813 expire after R_m seconds. 815 6.3. Receiver initialization 817 The receiver is initialized by the first packet that arrives at the 818 receiver. Let the sequence number of this packet be i. 820 When the first packet is received: 822 o Set p=0 824 o Set X_recv = X_send, where X_send is the sending rate the sender 825 reports in the first data packet. 827 o Prepare and send a feedback packet. 829 o Set the feedback timer to expire after R_i seconds. 831 6.3.1. Initializing the Loss History after the First Loss Event 833 The number of packets until the first loss can not be used to compute 834 the sending rate directly, as the sending rate changes rapidly during 835 this time. TFRC assumes that the correct data rate after the first loss 836 is half of the sending rate when the loss occurred. TFRC approximates 837 this target rate by X_recv, the receive rate over the most recent round- 838 trip time. After the first loss, instead of initializing the first loss 839 interval to the number of packets sent until the first loss, the TFRC 840 receiver calculates the loss interval that would be required to produce 841 the data rate X_recv, and uses this synthetic loss interval to seed the 842 loss history mechanism. 844 TFRC does this by finding some value p for which the throughput equation 845 in Section 3.1 gives a sending rate within 5% of X_recv, given the 846 current packet size s and round-trip time R. The first loss interval is 847 then set to 1/p. (The 5% tolerance is introduced simply because the 848 throughput equation is difficult to invert, and we want to reduce the 849 costs of calculating p numerically.) 851 7. Security Considerations 853 TFRC is not a transport protocol in its own right, but a congestion 854 control mechanism that is intended to be used in conjunction with a 855 transport protocol. Therefore security primarily needs to be considered 856 in the context of a specific transport protocol and its authentication 857 mechanisms. 859 Congestion control mechanisms can potentially be exploited to create 860 denial of service. This may occur through spoofed feedback. Thus any 861 transport protocol that uses TFRC should take care to ensure that 862 feedback is only accepted from the receiver of the data. The precise 863 mechanism to achieve this will however depend on the transport protocol 864 itself. 866 In addition, congestion control mechanisms may potentially be 867 manipulated by a greedy receiver that wishes to receive more than its 868 fair share of network bandwidth. A receiver might do this by claiming 869 to have received packets that in fact were lost due to congestion. 870 Possible defenses against such a receiver would normally include some 871 form of nonce that the receiver must feed back to the sender to prove 872 receipt. However, the details of such a nonce would depend on the 873 transport protocol, and in particular on whether the transport protocol 874 is reliable or unreliable. 876 We expect that protocols incorporating ECN with TFRC will also want to 877 incorporate feedback from the receiver to the sender using the ECN nonce 878 [WES01]. The ECN nonce is a modification to ECN that protects the 879 sender from the accidental or malicious concealment of marked packets. 880 Again, the details of such a nonce would depend on the transport 881 protocol, and are not addressed in this document. 883 8. Authors' Addresses 885 Mark Handley, Jitendra Padhye, Sally Floyd 886 ACIRI/ICSI 887 1947 Center St, Suite 600 888 Berkeley, CA 94708 889 mjh@aciri.org, padhye@aciri.org, floyd@aciri.org 891 Joerg Widmer 892 Lehrstuhl Praktische Informatik IV 893 Universitat Mannheim 894 L 15, 16 - Room 415 895 D-68131 Mannheim 896 Germany 897 widmer@informatik.uni-mannheim.de 899 9. Acknowledgments 901 We would like to acknowledge feedback and discussions on equation-based 902 congestion control with a wide range of people, including members of the 903 Reliable Multicast Research Group, the Reliable Multicast Transport 904 Working Group, and the End-to-End Research Group. We would like to 905 thank Eduardo Urzaiz, Vladica Stanisic, and Shushan Wen for feedback on 906 earlier versions of this document, and to thank Mark Allman for his 907 extensive feedback from using the draft to produce a working 908 implementation. 910 10. References 912 [1] Balakrishnan, H., Rahul, H., and Seshan, S., "An Integrated 913 Congestion Management Architecture for Internet Hosts," Proc. ACM 914 SIGCOMM, Cambridge, MA, September 1999. 916 [2] S. Floyd, M. Handley, J. Padhye, and J. Widmer, "Equation-Based 917 Congestion Control for Unicast Applications", August 2000, Proc 918 SIGCOMM 2000. 920 [3] S. Floyd, M. Handley, J. Padhye, and J. Widmer, "Equation-Based 921 Congestion Control for Unicast Applications: the Extended Version", 922 ICSI tech report TR-00-03, March 2000. 924 [4] Padhye, J. and Firoiu, V. and Towsley, D. and Kurose, J., "Modeling 925 TCP Throughput: A Simple Model and its Empirical Validation", Proc 926 ACM SIGCOMM 1998. 928 [5] Widmer, J., "Equation-Based Congestion Control", Diploma Thesis, 929 University of Mannheim, February 2000. URL 930 "http://www.aciri.org/tfrc/". 932 [6] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: A 933 Transport Protocol for Real-Time Applications", RFC 1889, January 934 1996. 936 [7] K. Ramakrishnan and S. Floyd, "A Proposal to add Explicit Congestion 937 Notification (ECN) to IP", RFC 2481, January 1999. 939 [8] V. Paxson and M. Allman, "Computing TCP's Retransmission Timer", RFC 940 2988, November 2000. 942 [9] Wetherall, D., Ely, D., and Spring, N., "Robust ECN Signaling with 943 Nonces", draft-ietf-tsvwg-tcp-nonce-00.txt, internet draft, work in 944 progress, January 2001. Citation for informational purposes only.