idnits 2.17.1 draft-ietf-dccp-rfc3448bis-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 20. -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on line 2189. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 2200. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 2207. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 2213. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (8 July 2007) is 6137 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) ** Obsolete normative reference: RFC 3448 (Obsoleted by RFC 5348) -- Obsolete informational reference (is this intentional?): RFC 2140 (Obsoleted by RFC 9040) -- Obsolete informational reference (is this intentional?): RFC 2581 (Obsoleted by RFC 5681) -- Obsolete informational reference (is this intentional?): RFC 2861 (Obsoleted by RFC 7661) -- Obsolete informational reference (is this intentional?): RFC 2988 (Obsoleted by RFC 6298) -- Duplicate reference: RFC3448, mentioned in 'RFC3448Err', was also mentioned in 'RFC3448'. -- Obsolete informational reference (is this intentional?): RFC 3448 (Obsoleted by RFC 5348) Summary: 2 errors (**), 0 flaws (~~), 1 warning (==), 13 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Internet Engineering Task Force S. Floyd 3 INTERNET-DRAFT ICIR 4 Intended status: Proposed Standard M. Handley 5 Expires: January 2008 University College London 6 J. Padhye 7 Microsoft 8 J. Widmer 9 University of Mannheim 10 8 July 2007 12 TCP Friendly Rate Control (TFRC): Protocol Specification 13 draft-ietf-dccp-rfc3448bis-02.txt 15 Status of this Memo 17 By submitting this Internet-Draft, each author represents that any 18 applicable patent or other IPR claims of which he or she is aware 19 have been or will be disclosed, and any of which he or she becomes 20 aware will be disclosed, in accordance with Section 6 of BCP 79. 22 Internet-Drafts are working documents of the Internet Engineering 23 Task Force (IETF), its areas, and its working groups. Note that 24 other groups may also distribute working documents as Internet- 25 Drafts. 
27 Internet-Drafts are draft documents valid for a maximum of six 28 months and may be updated, replaced, or obsoleted by other documents 29 at any time. It is inappropriate to use Internet-Drafts as 30 reference material or to cite them other than as "work in progress." 32 The list of current Internet-Drafts can be accessed at 33 http://www.ietf.org/ietf/1id-abstracts.txt. 35 The list of Internet-Draft Shadow Directories can be accessed at 36 http://www.ietf.org/shadow.html. 38 This Internet-Draft will expire on January 2008. 40 Copyright Notice 42 Copyright (C) The IETF Trust (2007). 44 Abstract 46 This document specifies TCP-Friendly Rate Control (TFRC). TFRC is a 47 congestion control mechanism for unicast flows operating in a best- 48 effort Internet environment. It is reasonably fair when competing 49 for bandwidth with TCP flows, but has a much lower variation of 50 throughput over time compared with TCP, making it more suitable for 51 applications such as streaming media where a relatively smooth 52 sending rate is of importance.
54 Table of Contents
56 1. Introduction
57 2. Conventions
58 3. Protocol Mechanism
59    3.1. TCP Throughput Equation
60    3.2. Packet Contents
61       3.2.1. Data Packets
62       3.2.2. Feedback Packets
63 4. Data Sender Protocol
64    4.1. Measuring the Segment Size
65    4.2. Sender Initialization
66    4.3. Sender Behavior When a Feedback Packet is Received
67    4.4. Expiration of Nofeedback Timer
68    4.5. Reducing Oscillations
69    4.6. Scheduling of Packet Transmissions
70       4.6.1. Sending Packets Before their Nominal Send Time
71 5. Calculation of the Loss Event Rate (p)
72    5.1. Detection of Lost or Marked Packets
73    5.2. Translation from Loss History to Loss Events
74    5.3. Inter-loss Event Interval
75    5.4. Average Loss Interval
76    5.5. History Discounting
77 6. Data Receiver Protocol
78    6.1. Receiver Behavior When a Data Packet is Received
79    6.2. Expiration of Feedback Timer
80    6.3. Receiver Initialization
81       6.3.1. Initializing the Loss History after the First Loss Event
83 7. Sender-based Variants
84 8. Implementation Issues
85 9. Changes from RFC 3448
86 10. Security Considerations
87 11. IANA Considerations
88 12. Acknowledgments
89 A. Terminology
90 B. The Initial Value of the Nofeedback Timer
91 C. Response to Idle or Data-limited Periods
92    C.1. Long Idle or Data-limited Periods
93    C.2. Short Idle or Data-limited Periods
94    C.3. Moderate Idle or Data-limited Periods
95    C.4. Other Patterns
96 Normative References
97 Informational References
98 Authors' Addresses
99 Full Copyright Statement
100 Intellectual Property
101 NOTE TO RFC EDITOR: PLEASE DELETE THIS NOTE UPON PUBLICATION. 103 Changes from draft-ietf-dccp-rfc3448bis-01.txt: 105 * Specified that the sender is not limited by the receive rate 106 if the sender has been data-limited for an entire feedback 107 interval. 109 * Added variables "initial_rate" and "recover_rate", for the 110 initial transmit rate and the rate for resuming after an idle 111 period, for easier specification of Faster Restart (in a separate 112 document). Also added the variable "recv_limit" to specify 113 the limit on the sending rate that is computed from the receive 114 rate, and the variable "timer_limit" to specify the 115 limit on the sending rate from the expiration of the nofeedback 116 timer. 117 Explained why recover_rate is not used as a lower bound 118 for nofeedback timer expirations after a data-limited period. 120 * Added Appendix C on "Response to Idle or Data-limited Periods". 122 * Revised the section on "Scheduling of Packet Transmissions" 123 to make clear what is specification, and what is 124 implementation. From Gerrit. Also stated that the 125 accumulation of sending credits should be limited 126 to a round-trip time's worth of packets. 128 * For measuring the receive rate, added that after a loss event, 129 the receive rate SHOULD be measured over the most recent RTT, 130 but for simplicity of implementation, MAY be measured over 131 a slightly longer time interval.
133 * Clarified that RTT measurements don't necessarily come from 134 feedback packets; they could also come from other places, 135 e.g., from the SYN exchange. 137 * Specified that the sender may maintain unused send credits 138 up to one RTT. This gives behavior similar to TCP. 139 Also specified that the sender should not send packets more 140 than rtt/2 seconds before their nominal send time. 142 * Reinserted the last paragraph of Section 4.4 from RFC 3448. 143 It must have been deleted accidentally. 145 * TODO in ns-2 146 - Add a variable to ns-2 to allow either TFRC or CCID3. 148 * Feedback from Arjuna Sathiaseelan: 150 - Changing W_init to be in terms of segment size s, not MSS. 152 * Changed THRESHOLD, the lower bound on the history 153 discounting parameter DF, from 0.5 to 0.25, for more 154 history discounting when the current interval is long. 156 * Relying on the sender not to use X_recv from data-limited 157 periods. This gives behavior similar to TCP, when 158 ACK-clocking is not in effect in data-limited periods. 159 The largest X_recv over the most recent two round-trip 160 times is used to limit the sending rate. This is 161 maintained using X_recv_set. Taken together, these avoid 162 problems with the first feedback packet after an idle 163 period, and also avoid problems with limitations 164 from X_recv during data-limited periods. 166 * Clarified that when the receiver receives a data packet, 167 and didn't send a feedback packet when the feedback timer 168 last expired (because no data packets were received), 169 then the receiver sends a feedback packet immediately. 171 * Clarified that the feedback packet reports the rate over 172 the last RTT, not necessarily the rate since the 173 last feedback packet was sent (if no feedback packet was 174 sent when the feedback timer last expired).
176 * Corrected earlier code designed to prevent the receive 177 rate from limiting the sending rate when the first feedback 178 packet is received, or when the first feedback packet is received 179 after an idle period. 181 * Clarified that we have p=0 only until the first loss event. 182 After the first loss event, p>0, and it is not possible to go 183 back to p=0. In response to old email. 185 * Clarified in Section 6.1 that the loss event rate does not 186 have to be recalculated with the arrival of each new data 187 packet. 189 * Clarified the section on Reducing Oscillations. Feedback from 190 Gerrit Renker. 192 Changes from draft-ietf-dccp-rfc3448bis-00.txt: 194 * When initializing the loss history after the first 195 data packet sent is lost or ECN-marked, TFRC uses 196 a minimum receive rate of 0.5 packets per second. 198 * For initializing the estimated packet drop rate 199 for the first loss interval when coming out of slow-start, 200 it is OK to use the maximum receive rate so far, not just 201 the receive rate in the last round-trip time. 202 Feedback from Ladan Gharai. 204 * General feedback from Gorry Fairhurst: 205 - Added a reference for RFC4828. 206 - Clarified that R_m is sender's estimate of RTT, as reported 207 in Section 3.2.1. 208 - Added a definition of terms. 209 - Added a discussion of why the initial value of the nofeedback 210 timer is two seconds, instead of three seconds for the 211 recommended initial value for TCP's retransmit timer. 213 * General feedback from Arjuna Sathiaseelan: 214 - Added more details about sending multiple feedback 215 packets per RTT. 216 - Added change to Section 4.3 to use the first feedback 217 packet, or the first feedback packet after a 218 nofeedback timer during slow-start, *if min_rate > X*. 220 * General feedback from Gerrit Renker: 221 - Changed "delta" to "t_delta". 222 - Changed X_calc to X_Bps, clarified X. 223 - Clarified send times in "Scheduling of Packet Transmissions".
224 - Changed so that tld can be initialized to either 0 or -1. 225 - Fixed Section 5.5 to say that the most recent loss 226 interval has weight 1/(0.75*n) *when there have been 227 at least eight loss intervals*. 228 - Clarified introduction about fixed-size and variable-size 229 packets. 231 * Added more about sender-based variants. 232 Feedback from Guillaume Jourjon. 234 * Corrected that the loss interval I_0 includes all transmitted 235 packets, including lost and marked packets (as defined in Section 236 5.3 in the general definition). Email from Eddie Kohler and 237 Gerrit Renker. 239 * Not done: I didn't add a minimum value for the nofeedback 240 timer. (Why would a nofeedback timer need to be bigger 241 than max(4*R, 2*s/X)?) Email discussing pros and cons from 242 Arjuna. 244 Changes from draft-floyd-rfc3448bis-00.txt: 246 * Name change to draft-ietf-dccp-rfc3448bis-00.txt. 248 * Specified the receiver's initialization of the feedback timer 249 when the first data packet doesn't have an estimate of the 250 RTT. From feedback from Dado Colussi. 252 * Added the procedure for sending receiver 253 feedback packets when a coarse-grained 254 timestamp is used. From RFC 4243. 256 Changes from RFC 3448: 258 * Incorporated changes in the RFC 3448 errata: 260 - "If the sender does not receive a feedback report for 261 four round trip times, it cuts its sending rate in half." 262 ("Two" changed to "four", for consistency with the rest 263 of the document. Reported by Joerg Widmer). 265 - "If the nofeedback timer expires when the sender does not 266 yet have an RTT sample, and has not yet received any 267 feedback from the receiver, or when p == 0,..." 268 (Added "or when p == 0,", reported by Wim Heirman). 270 - In Section 5.5, changed: 271 for (i = 1 to n) { DF_i = 1; } 272 to: 273 for (i = 0 to n) { DF_i = 1; } 274 Reported by Michele R. 276 * Changed RFC 3448 to correspond to the larger initial windows 277 specified in RFC 3390.
This includes the following: 279 - Incorporated Section 5.1 from [RFC4342], saying that 280 when reducing the sending rate after an idle period, don't 281 reduce the sending rate below the initial sending rate. 283 - Change for a datalimited sender: 284 When the sender has been datalimited, the sender doesn't 285 let the receive rate limit it to a sending rate less than 286 the initial rate. 288 - Small change to slow-start: 289 Changed so that for the first feedback packet received, 290 or for the first feedback packet received after an idle 291 period, the receive rate is not used to limit the 292 sending rate. This is because the receiver might not yet 293 have seen an entire window of data. 295 * Clarified how the average loss interval is calculated when 296 the receiver has not yet seen eight loss intervals. 298 * Discussed more about estimating the average segment size: 300 - For initializing the loss history after the first loss event, 301 either the receiver knows the sender's value for s, or 302 the receiver uses the throughput equation for X_pps and does 303 not need to know an estimate for s. 305 - Added a discussion about estimating the average segment size 306 s in Section 4.1 on "Measuring the Segment Size". 308 - Changed "packet size" to "segment size". 310 END OF NOTE TO RFC EDITOR. 312 1. Introduction 314 This document specifies TCP-Friendly Rate Control (TFRC). TFRC is a 315 congestion control mechanism designed for unicast flows operating in 316 an Internet environment and competing with TCP traffic [FHPW00]. 317 Instead of specifying a complete protocol, this document simply 318 specifies a congestion control mechanism that could be used in a 319 transport protocol such as DCCP (Datagram Congestion Control 320 Protocol) [RFC4340], in an application incorporating end-to-end 321 congestion control at the application level, or in the context of 322 endpoint congestion management [BRS99]. 
This document does not 323 discuss packet formats or reliability. Implementation-related 324 issues are discussed only briefly, in Section 8. 326 TFRC is designed to be reasonably fair when competing for bandwidth 327 with TCP flows, where a flow is "reasonably fair" if its sending 328 rate is generally within a factor of two of the sending rate of a 329 TCP flow under the same conditions. However, TFRC has a much lower 330 variation of throughput over time compared with TCP, which makes it 331 more suitable for applications such as telephony or streaming media 332 where a relatively smooth sending rate is of importance. 334 The penalty of having smoother throughput than TCP while competing 335 fairly for bandwidth is that TFRC responds slower than TCP to 336 changes in available bandwidth. Thus TFRC should only be used when 337 the application has a requirement for smooth throughput, in 338 particular, avoiding TCP's halving of the sending rate in response 339 to a single packet drop. For applications that simply need to 340 transfer as much data as possible in as short a time as possible we 341 recommend using TCP, or if reliability is not required, using an 342 Additive-Increase, Multiplicative-Decrease (AIMD) congestion control 343 scheme with similar parameters to those used by TCP. 345 TFRC is designed for best performance with applications that use a 346 fixed segment size, and vary their sending rate in packets per 347 second in response to congestion. TFRC can also be used, perhaps 348 with less optimal performance, with applications that don't have a 349 fixed segment size, but where the segment size varies according to 350 the needs of the application (e.g., video applications). 352 Some applications (e.g., some audio applications) require a fixed 353 interval of time between packets and vary their segment size instead 354 of their packet rate in response to congestion. 
The congestion 355 control mechanism in this document is not designed for those 356 applications; TFRC-SP (Small-Packet TFRC) is a variant of TFRC for 357 applications that have a fixed sending rate in packets per second 358 but either use small packets, or vary their packet size in response 359 to congestion. TFRC-SP is specified in a separate document 361 [RFC4828]. 363 This document specifies TFRC as a receiver-based mechanism, with the 364 calculation of the congestion control information (i.e., the loss 365 event rate) in the data receiver rather than in the data sender. This is 366 well-suited to an application where the sender is a large server 367 handling many concurrent connections, and the receiver has more 368 memory and CPU cycles available for computation. In addition, a 369 receiver-based mechanism is more suitable as a building block for 370 multicast congestion control. However, it is also possible to 371 implement TFRC in sender-based variants, as allowed in DCCP's 372 Congestion Control ID 3 (CCID 3) [RFC4342]. 374 2. Conventions 376 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 377 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 378 document are to be interpreted as described in [RFC2119]. 380 Appendix A gives a list of technical terms used in this document. 382 3. Protocol Mechanism 384 For its congestion control mechanism, TFRC directly uses a 385 throughput equation for the allowed sending rate as a function of 386 the loss event rate and round-trip time. In order to compete fairly 387 with TCP, TFRC uses the TCP throughput equation, which roughly 388 describes TCP's sending rate as a function of the loss event rate, 389 round-trip time, and segment size. We define a loss event as one or 390 more lost or marked packets from a window of data, where a marked 391 packet refers to a congestion indication from Explicit Congestion 392 Notification (ECN) [RFC3168].
394 Generally speaking, TFRC's congestion control mechanism works as 395 follows: 397 o The receiver measures the loss event rate and feeds this 398 information back to the sender. 400 o The sender also uses these feedback messages to measure the 401 round-trip time (RTT). 403 o The loss event rate and RTT are then fed into TFRC's throughput 404 equation, and the resulting sending rate is limited to at most 405 twice the receive rate to give the allowed transmit rate X. 407 o The sender then adjusts its transmit rate to match the allowed 408 transmit rate X. 410 The dynamics of TFRC are sensitive to how the measurements are 411 performed and applied. We recommend specific mechanisms below to 412 perform and apply these measurements. Other mechanisms are 413 possible, but it is important to understand how the interactions 414 between mechanisms affect the dynamics of TFRC. 416 3.1. TCP Throughput Equation 418 Any realistic equation giving TCP throughput as a function of loss 419 event rate and RTT should be suitable for use in TFRC. However, we 420 note that the TCP throughput equation used must reflect TCP's 421 retransmit timeout behavior, as this dominates TCP throughput at 422 higher loss rates. We also note that the assumptions implicit in 423 the throughput equation about the loss event rate parameter have to 424 be a reasonable match to how the loss rate or loss event rate is 425 actually measured. While this match is not perfect for the 426 throughput equation and loss rate measurement mechanisms given 427 below, in practice the assumptions turn out to be close enough. 429 The throughput equation we currently recommend for TFRC is a 430 slightly simplified version of the throughput equation for Reno TCP 431 from [PFTK98]. 
Ideally we'd prefer a throughput equation based on 432 SACK TCP, but no one has yet derived the throughput equation for 433 SACK TCP, and from both simulations and experiments, the differences 434 between the two equations are relatively minor. 436 The throughput equation is: 439 X_Bps = s / (R*sqrt(2*b*p/3) + (t_RTO * (3*sqrt(3*b*p/8)*p*(1+32*p^2)))) 442 Where: 444 X_Bps is the transmit rate in bytes/second. (X_Bps is the same 445 as X_calc in RFC 3448.) 447 s is the segment size in bytes. 449 R is the round trip time in seconds. 451 p is the loss event rate, between 0 and 1.0: the number of 452 loss events as a fraction of the number of packets transmitted. 454 t_RTO is the TCP retransmission timeout value in seconds. 456 b is the maximum number of packets acknowledged by a single TCP 457 acknowledgement. 459 We further simplify this by setting t_RTO = 4*R. A more accurate 460 calculation of t_RTO is possible, but experiments with the current 461 setting have resulted in reasonable fairness with existing TCP 462 implementations [W00]. Another possibility would be to set t_RTO = 463 max(4*R, one second), to match the recommended minimum of one second 464 on the RTO [RFC2988]. 466 Many current TCP connections use delayed acknowledgements, sending 467 an acknowledgement for every two data packets received, and thus 468 have a sending rate modeled by b = 2. However, TCP is also allowed 469 to send an acknowledgement for every data packet, and this would be 470 modeled by b = 1. Because many TCP implementations do not use 471 delayed acknowledgements, we recommend b = 1. 473 In future, different TCP equations may be substituted for this 474 equation. The requirement is that the throughput equation be a 475 reasonable approximation of the sending rate of TCP for conformant 476 TCP congestion control.
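As an illustration only, the throughput equation of this section can be sketched in Python as follows. The function name tfrc_x_bps and its argument names are hypothetical and are not part of this specification; the defaults b = 1 and t_RTO = 4*R are the values recommended above. The sketch assumes p > 0, since the equation is undefined for p = 0.

```python
import math


def tfrc_x_bps(s, R, p, b=1, t_rto=None):
    """Return X_Bps in bytes/second from the Section 3.1 throughput equation.

    s: segment size in bytes; R: round-trip time in seconds;
    p: loss event rate, 0 < p <= 1; b: packets acknowledged per ACK.
    t_rto defaults to 4*R, the simplification recommended in the text.
    (Function and argument names are illustrative assumptions.)
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must satisfy 0 < p <= 1")
    if R <= 0.0:
        raise ValueError("R must be positive")
    if t_rto is None:
        t_rto = 4.0 * R
    # Denominator: R*sqrt(2*b*p/3) + t_RTO*(3*sqrt(3*b*p/8)*p*(1+32*p^2))
    denom = (R * math.sqrt(2.0 * b * p / 3.0)
             + t_rto * (3.0 * math.sqrt(3.0 * b * p / 8.0) * p * (1.0 + 32.0 * p ** 2)))
    return s / denom
```

With t_RTO tied to 4*R, the denominator scales linearly with R, so halving the round-trip time doubles the computed rate; the rate also falls as p rises, which is the behavior the equation is meant to capture.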
478 The throughput equation can also be expressed as 480 X_Bps = X_pps * s , 482 with X_pps, the sending rate in packets per second, given as 485 X_pps = 1 / (R*sqrt(2*b*p/3) + (t_RTO*(3*sqrt(3*b*p/8)*p*(1+32*p^2)))) 488 The parameters s (segment size), p (loss event rate) and R (RTT) 489 need to be measured or calculated by a TFRC implementation. The 490 measurement of s is specified in Section 4.1, measurement of R is 491 specified in Section 4.3, and measurement of p is specified in 492 Section 5. In the rest of this document data rates are measured in 493 bytes/second unless otherwise specified. 495 3.2. Packet Contents 497 Before specifying the sender and receiver functionality, we describe 498 the contents of the data packets sent by the sender and feedback 499 packets sent by the receiver. As TFRC will be used along with a 500 transport protocol, we do not specify packet formats, as these 501 depend on the details of the transport protocol used. 503 3.2.1. Data Packets 505 Each data packet sent by the data sender contains the following 506 information: 508 o A sequence number. This number is incremented by one for each 509 data packet transmitted. The field must be sufficiently large 510 that it does not wrap causing two different packets with the 511 same sequence number to be in the receiver's recent packet 512 history at the same time. 514 o A timestamp indicating when the packet is sent. We denote by 515 ts_i the timestamp of the packet with sequence number i. The 516 resolution of the timestamp should typically be measured in 517 milliseconds. 519 This timestamp is used by the receiver to determine which losses 520 belong to the same loss event. The timestamp is also echoed by 521 the receiver to enable the sender to estimate the round-trip 522 time, for senders that do not save timestamps of transmitted 523 data packets.
525 We note that as an alternative to a timestamp incremented in 526 milliseconds, a "timestamp" that increments every quarter of a 527 round-trip time would be sufficient for determining when losses 528 belong to the same loss event, in the context of a protocol 529 where this is understood by both sender and receiver, and where 530 the sender saves the timestamps of transmitted data packets. 532 o The sender's current estimate of the round trip time. The 533 estimate reported in packet i is denoted by R_i. The round-trip 534 time estimate is used by the receiver, along with the timestamp, 535 to determine when multiple losses belong to the same loss event. 536 The round-trip time estimate is also used by the receiver to 537 determine the interval to use for calculating the receive rate, 538 and to determine when to send feedback packets. 540 If the sender sends a coarse-grained "timestamp" that increments 541 every quarter of a round-trip time, as discussed above, then the 542 sender does not need to send its current estimate of the round 543 trip time. 545 3.2.2. Feedback Packets 547 Each feedback packet sent by the data receiver contains the 548 following information: 550 o The timestamp of the last data packet received. We denote this 551 by t_recvdata. If the last packet received at the receiver has 552 sequence number i, then t_recvdata = ts_i. 553 This timestamp is used by the sender to estimate the round-trip 554 time, and is only needed if the sender does not save timestamps 555 of transmitted data packets. 557 o The amount of time elapsed between the receipt of the last data 558 packet at the receiver, and the generation of this feedback 559 report. We denote this by t_delay. 561 o The rate at which the receiver estimates that data was received 562 in the previous round-trip time. We denote this by X_recv. 564 o The receiver's current estimate of the loss event rate p. 566 4. 
Data Sender Protocol 568 The data sender sends a stream of data packets to the data receiver 569 at a controlled rate. When a feedback packet is received from the 570 data receiver, the data sender changes its sending rate, based on 571 the information contained in the feedback report. If the sender does 572 not receive a feedback report for four round trip times, it cuts its 573 sending rate in half. This is achieved by means of a timer called 574 the nofeedback timer. 576 We specify the sender-side protocol in the following steps: 578 o Measurement of the mean segment size being sent. 580 o Sender initialization. 582 o The sender behavior when a feedback packet is received. 584 o The sender behavior when the nofeedback timer expires. 586 o Oscillation prevention (optional) 588 o Scheduling of transmission on non-realtime operating systems. 590 4.1. Measuring the Segment Size 592 The parameter s (segment size) is normally known to an application. 593 This may not be so in two cases: 595 o (1) The segment size naturally varies depending on the data. In 596 this case, although the segment size varies, that variation is 597 not coupled to the transmit rate. The TFRC sender can either 598 compute the average segment size or use the maximum segment size 599 for the segment size s. 601 o (2) The application needs to change the segment size rather than 602 the number of segments per second to perform congestion control. 603 This would normally be the case with packet audio applications 604 where a fixed interval of time needs to be represented by each 605 packet. Such applications need to have a completely different 606 way of measuring parameters. 608 For the first class of applications where the segment size varies 609 depending on the data, the sender MAY estimate the segment size s as 610 the average segment size over the last four loss intervals. The 611 sender MAY also estimate the average segment size over longer time 612 intervals, if so desired. 
The TFRC sender uses the segment size s 613 in the throughput equation, in the setting of the maximum receive 614 rate and the minimum and initial sending rates, and in the setting 615 of the nofeedback timer. 617 The TFRC receiver may use the average segment size s in initializing 618 the loss history after the first loss event, but Section 6.3.1 also 619 gives an alternate procedure that does not use the average segment 620 size s. 622 The second class of applications is discussed in a 623 separate document on TFRC-SP. For the remainder of this section we 624 assume the sender can estimate the segment size, and that congestion 625 control is performed by adjusting the number of packets sent per 626 second. 628 4.2. Sender Initialization 630 The initial values for X (the allowed sending rate in bytes per 631 second) and tld (the Time Last Doubled during slow-start) are 632 undefined until they are set as described below. If the sender is 633 ready to send data when it does not yet have a round trip sample, 634 the value of X is set to s bytes per second, for segment size s, the 635 nofeedback timer is set to expire after two seconds, and tld is set 636 either to 0 or to -1. Upon receiving a round trip time measurement 637 (e.g., after the first feedback packet or the SYN exchange, or from 638 a previous connection [RFC2140]), tld is set to the current time, 639 and the allowed transmit rate X is set to the initial_rate, specified 640 as W_init/R, for W_init based on [RFC3390]: 642 W_init = min(4*s, max(2*s, 4380)). 644 For responding to the initial feedback packet, this replaces step 645 (4) of Section 4.3 below. 647 Appendix B explains why the initial value of TFRC's nofeedback timer 648 is set to two seconds, instead of the recommended initial value of 649 three seconds for TCP's retransmit timer from [RFC2988]. 651 4.3.
Sender Behavior When a Feedback Packet is Received 653 The sender knows its current allowed sending rate X, and maintains 654 an estimate of the current round trip time R. The sender also 655 maintains X_recv_set as a small set of recent X_recv values. 656 X_recv_set is first initialized to contain the value Infinity (or a 657 suitably large number). The variable recv_limit is defined as the 658 limit on the sending rate that is computed from the receive rate. 659 In this document, in step (4) below, recv_limit is specified as 660 twice the maximum value in X_recv_set. Future documents might 661 specify alternate values for recv_limit. 663 When a feedback packet is received by the sender at time t_now, the 664 following actions should be performed. 666 1) Calculate a new round trip sample. 667 R_sample = (t_now - t_recvdata) - t_delay. 669 2) Update the round trip time estimate: 671 If no feedback has been received before 672 R = R_sample; 673 Else 674 R = q*R + (1-q)*R_sample; 676 TFRC is not sensitive to the precise value for the filter 677 constant q, but we recommend a default value of 0.9. 679 3) Update the timeout interval: 681 RTO = max(4*R, 2*s/X) 683 4) Update the allowed sending rate as follows: 685 If (the entire interval covered by the feedback packet 686 was a data-limited interval) { 687 Replace X_recv_set contents by Infinity; 688 } Else // typical behavior 689 Update X_recv_set; 690 recv_limit = 2 * max (X_recv_set); 691 If (p > 0) // congestion avoidance phase 692 Calculate X_Bps using the TCP throughput equation. 693 X = max(min(X_Bps, recv_limit), s/t_mbi); 694 Else if (t_now - tld >= R) // initial slow-start 695 X = max(min(2*X, recv_limit), initial_rate); 696 tld = t_now; 698 5) If oscillation reduction is used, calculate the instantaneous 699 transmit rate X_inst, following Section 4.5. 701 6) Reset the nofeedback timer to expire after RTO seconds. 
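As an illustration only, steps (1) through (4) above can be sketched in Python. This is a minimal sketch under stated assumptions, not a reference implementation: the names SenderState, on_feedback, and x_bps_fn are hypothetical, the throughput equation is passed in as a callable because it is defined elsewhere in this document, the caller is assumed to decide whether the feedback packet covered a data-limited interval, and storing a timestamp alongside each X_recv value is just one way to age entries out of X_recv_set.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

T_MBI = 64.0  # maximum inter-packet backoff interval, in seconds


@dataclass
class SenderState:  # hypothetical container for the sender variables
    s: float                    # segment size, bytes
    X: float                    # allowed sending rate, bytes/second
    initial_rate: float         # W_init / R
    tld: float                  # time last doubled
    R: Optional[float] = None   # round trip time estimate
    q: float = 0.9              # RTT filter constant
    # (timestamp, X_recv) pairs; starts out holding the value Infinity.
    x_recv_set: List[Tuple[float, float]] = field(
        default_factory=lambda: [(0.0, math.inf)])


def on_feedback(st: SenderState, t_now: float, t_recvdata: float,
                t_delay: float, x_recv: float, p: float,
                data_limited: bool,
                x_bps_fn: Callable[[float, float, float], float]) -> float:
    """Apply steps (1)-(4); return the RTO for the nofeedback timer."""
    r_sample = (t_now - t_recvdata) - t_delay                  # step 1
    if st.R is None:                                           # step 2
        st.R = r_sample
    else:
        st.R = st.q * st.R + (1 - st.q) * r_sample
    rto = max(4 * st.R, 2 * st.s / st.X)                       # step 3
    if data_limited:                                           # step 4
        st.x_recv_set = [(t_now, math.inf)]
    else:
        # Update X_recv_set: add the new value, drop values older
        # than two round-trip times.
        st.x_recv_set.append((t_now, x_recv))
        st.x_recv_set = [(t, x) for (t, x) in st.x_recv_set
                         if t_now - t <= 2 * st.R]
    recv_limit = 2 * max(x for _, x in st.x_recv_set)
    if p > 0:                                  # congestion avoidance
        x_bps = x_bps_fn(st.s, st.R, p)
        st.X = max(min(x_bps, recv_limit), st.s / T_MBI)
    elif t_now - st.tld >= st.R:               # initial slow-start
        st.X = max(min(2 * st.X, recv_limit), st.initial_rate)
        st.tld = t_now
    return rto
```

In this sketch the RTO is computed from the value of X in effect before step (4) runs, which follows the order of the steps as listed; steps (5) and (6) are left to the caller.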
703 The subroutine for updating X_recv_set below keeps a set of X_recv 704 values received for non-data-limited periods over the most recent 705 two round-trip times. 707 Update X_recv_set: 708 Add X_recv to X_recv_set; 709 Delete from X_recv_set values older than 710 two round-trip times. 712 We define a sender as data-limited any time it is not sending as much 713 as it is allowed to send (including unused send credits discussed in 714 Section 4.6). We define an interval as a `data-limited interval' if 715 the sender was data-limited over the *entire* interval. The first 716 ``if'' condition in step (4) prevents a sender from having to reduce 717 the sending rate as a result of a feedback packet reporting the 718 receive rate from a data-limited period. 720 Thus, consider a sender that is sending at its full allowed rate, 721 except that it is sending packets in pairs, rather than sending each 722 packet as soon as it can. Such a sender is considered data-limited 723 part of the time, because it is not always sending packets as soon 724 as it can. However, any interval that covers the transmission of at 725 least two data packets is not a data-limited interval for this 726 sender. 728 Because X_recv_set is initialized with the value Infinity, 729 recv_limit is set to Infinity for the first two round-trip times of 730 the connection. As a result, the sending rate is not limited by the 731 receive rate during that period. This avoids the problem of the 732 sending rate being limited by the value of X_recv from the first 733 feedback packet, which reports only one segment received in the last 734 round-trip time. 736 How does the sender determine the period covered by a feedback 737 packet? In general, the receiver will be sending a feedback packet 738 once per round-trip time, so typically the sender will be able to 739 determine exactly the period covered by the current feedback packet 740 from the previous feedback packet.
However, in cases when the 741 previous feedback packet was lost, or when the receiver sends a 742 feedback packet early because it detected a lost or ECN-marked 743 packet, the sender will have to estimate the interval covered by the 744 feedback packet. As specified in Section 6.2, each feedback packet 745 sent by the receiver covers a round-trip time, for the round-trip 746 time estimate R_m maintained by the receiver R_m seconds before the 747 feedback packet was sent. 749 Note that when p=0, the sender has not yet learned of any loss 750 events, and the sender is in the initial slow-start phase. In this 751 initial slow-start phase, the sender can approximately double the 752 sending rate each round-trip time until a loss occurs. The 753 initial_rate term in step (4) gives a minimum allowed sending rate 754 during slow-start of the initial allowed sending rate. We note that 755 if the sender is data-limited during slow-start, or if the 756 connection is limited by the path bandwidth, then the sender isn't 757 necessarily able to double its sending rate each round-trip time; 758 the sender's sending rate is limited to at most twice the receive 759 rate, or at most initial_rate, whichever is larger. 761 This is similar to TCP's behavior, where the sending rate is 762 limited by the rate of incoming acknowledgment packets, along with 763 the modification of the window increase algorithm. Thus in TCP's 764 Slow-Start, for the most aggressive case of the TCP receiver 765 acknowledging every data packet, the TCP sender's sending rate is 766 limited to at most twice the rate of these incoming acknowledgment 767 packets. 769 The parameter t_mbi is 64 seconds, and represents the maximum inter- 770 packet backoff interval in the persistent absence of feedback. 771 Thus, when p > 0 the sender sends at least one packet every 64 772 seconds. 774 4.4. Expiration of Nofeedback Timer 776 This section specifies the sender's response to a nofeedback timer.
777 The nofeedback timer could expire because of an idle period, or 778 because of data or feedback packets dropped in the network. 780 This section uses the variable recover_rate. When the TFRC sender 781 reduces the allowed sending rate in response to a nofeedback timer, 782 and the sender has been idle ever since the nofeedback timer was 783 set, the allowed sending rate is not reduced below the recover_rate. 784 For this document, the recover_rate is set to the initial_rate. 785 Future documents may explore other possible values for the 786 recover_rate. 788 If the nofeedback timer expires, the sender should perform the 789 following actions: 791 1) Cut the allowed sending rate in half. If the sender has an RTT 792 measurement, the allowed sending rate is reduced by setting 793 X_recv; the sending rate is limited to at most twice X_recv. 794 Modifying X_recv limits the sending rate, but also allows the 795 sender to slow-start, doubling its sending rate each RTT, if 796 feedback messages resume reporting no losses. 798 If the nofeedback timer expires when the sender does not yet 799 have an RTT sample and has not yet received any feedback from 800 the receiver, or when p == 0, then X_recv is not halved, and the 801 sending rate is cut in half directly. 803 If the sender has been idle since this nofeedback timer was set 804 and X_recv is less than the recover_rate, then X_recv should not 805 be halved in response to the timer expiration. This ensures 806 that the allowed sending rate is never reduced to less than half 807 the recover_rate as a result of an idle period. 809 X_recv_set is the set of recent X_recv values. We use the variable 810 timer_limit for the limit on the sending rate computed from the 811 expiration of the nofeedback timer. 813 X_recv = max (X_recv_set); 814 If (sender does not have an RTT sample and has not received 815 any feedback from receiver) 816 // We don't have X_Bps or recover_rate yet. 
817 X = max(X/2, s/t_mbi); 818 Else if (X_recv < recover_rate, and 819 sender has been idle ever 820 since nofeedback timer was set) 821 timer_limit is not updated; 822 Else if (p==0) 823 // We don't have X_Bps yet. 824 X = max(X/2, s/t_mbi); 825 Else if (X_Bps > 2*X_recv) 826 // 2*X_recv was already limiting the sending rate. 827 timer_limit = X_recv; 828 Else 829 // The sending rate was limited by X_Bps, not by X_recv. 830 timer_limit = X_Bps/2; 831 If (timer_limit < s/t_mbi) 832 timer_limit = s/t_mbi; 834 The term s/t_mbi limits the backoff to one packet every 64 835 seconds. 837 2) If timer_limit has been changed, then do the following: 839 If (timer_limit has been updated) 840 Replace X_recv_set contents with timer_limit/2. 841 Recalculate X as in step (4) of Section 4.3. 843 3) Restart the nofeedback timer to expire after max(4*R, 2*s/X) 844 seconds. 846 Note that when the allowed sending rate is limited after an idle 847 period, it is never reduced below half the recover_rate. 849 If the sender has been data-limited but not idle since the 850 nofeedback timer was set, it is possible that the nofeedback timer 851 expired because data or feedback packets were dropped in the 852 network. In this case, the nofeedback timer is the backup mechanism 853 for the sender to detect these losses, similar to the retransmit 854 timer in TCP. 856 Note that when the sender stops sending, the receiver will stop 857 sending feedback. When the sender's nofeedback timer expires, the 858 sender could use the procedure above to limit the sending rate. If 859 the sender subsequently starts to send again, X_recv_set will be 860 used to limit the transmit rate, and slow-start behavior will occur 861 until the transmit rate reaches X_Bps.
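For illustration, steps 1) and 2) of the nofeedback timer procedure above can be sketched in Python. This is a minimal sketch under assumptions, not a definitive implementation: the function name on_nofeedback_timeout and its parameter names are hypothetical, X_recv_set is modeled as a plain list of rates, X_Bps is supplied by the caller (it is ignored when p == 0), and "timer_limit is not updated" is modeled with None. Because timer_limit is only updated when p > 0, the recalculation of X uses the congestion avoidance branch of step (4) of Section 4.3. Restarting the timer (step 3) is left to the caller.

```python
T_MBI = 64.0  # maximum inter-packet backoff interval, in seconds


def on_nofeedback_timeout(X, s, x_recv_set, x_bps, p, recover_rate,
                          no_feedback_yet, idle_since_timer_set):
    """Return the new (X, x_recv_set) after the nofeedback timer expires."""
    floor = s / T_MBI            # one packet every 64 seconds
    x_recv = max(x_recv_set)
    timer_limit = None           # None means "timer_limit is not updated"

    if no_feedback_yet:
        # No RTT sample and no feedback: halve the rate directly.
        X = max(X / 2, floor)
    elif x_recv < recover_rate and idle_since_timer_set:
        pass                     # idle sender, already below recover_rate
    elif p == 0:
        # No X_Bps yet: halve the rate directly.
        X = max(X / 2, floor)
    elif x_bps > 2 * x_recv:
        # 2*X_recv was already limiting the sending rate.
        timer_limit = x_recv
    else:
        # The sending rate was limited by X_Bps, not by X_recv.
        timer_limit = x_bps / 2

    if timer_limit is not None:
        timer_limit = max(timer_limit, floor)
        x_recv_set = [timer_limit / 2]
        # Step (4) of Section 4.3 with p > 0:
        # recv_limit = 2 * max(X_recv_set) = timer_limit.
        recv_limit = 2 * max(x_recv_set)
        X = max(min(x_bps, recv_limit), floor)
    return X, x_recv_set
```

With X_Bps = 8000 and X_recv_set = {3000}, the sender was limited to 6000 bytes per second by the receive rate; after the timer expires this sketch yields X = 3000, which is the halving described in step 1).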
863 The TFRC sender's reduction of the allowed sending rate after the 864 nofeedback timer expires is similar to TCP's reduction of the 865 congestion window cwnd after each RTO seconds of an idle period, for 866 TCP with Congestion Window Validation [RFC2861]. 868 4.5. Reducing Oscillations 870 To reduce oscillations in queueing delay and sending rate in 871 environments with a low degree of statistical multiplexing at the 872 congested link, it can be useful for the sender to reduce the 873 transmit rate as the queuing delay (and hence RTT) increases. To do 874 this the sender maintains R_sqmean, a long-term estimate of the 875 square root of the RTT, and modifies its sending rate depending on 876 how the square root of R_sample, the most recent sample of the RTT, 877 differs from the long-term estimate. The long-term estimate 878 R_sqmean is set as follows: 880 If no feedback has been received before 881 R_sqmean = sqrt(R_sample); 882 Else 883 R_sqmean = q2*R_sqmean + (1-q2)*sqrt(R_sample); 885 Thus R_sqmean gives the exponentially weighted moving average of the 886 square root of the RTT samples. The constant q2 should be set 887 similarly to q, the constant used in the round trip time estimate R. 888 We recommend a value of 0.9 as the default for q2. 890 When sqrt(R_sample) is greater than R_sqmean then the current round- 891 trip time is greater than the long-term average, implying that 892 queueing delay is probably increasing. In this case, the transmit 893 rate is decreased to minimize oscillations in queueing delay. 895 The sender obtains the base allowed transmit rate, X, as described 896 in step (4) of Section 4.3 above. 
It then calculates a modified 897 instantaneous transmit rate X_inst, as follows: 899 X_inst = X * R_sqmean / sqrt(R_sample); 900 If (p > 0) // congestion avoidance phase 901 X_inst = max(X_inst, s/t_mbi) 902 Else if (t_now - tld >= R) // initial slow-start 903 X_inst = max(X_inst, s/R) 905 Because we are using square roots, there is generally only a 906 moderate difference between the instantaneous transmit rate X_inst 907 and the allowed transmit rate X. For example, in a somewhat extreme 908 case when the current RTT sample R_sample is twice as large as the 909 long-term average, then sqrt(R_sample) will be roughly 1.41 times 910 R_sqmean, and the allowed transmit rate will be reduced by a factor 911 of roughly 0.7. 913 Note: This modification for reducing oscillatory behavior is not 914 always needed, especially if the degree of statistical multiplexing 915 in the network is high. However, it SHOULD be implemented because 916 it makes TFRC behave better in environments with a low level of 917 statistical multiplexing. The performance of this modification is 918 illustrated in Section 3.1.3 of [FHPW00]. If it is not implemented, 919 we recommend using a very low value of the weight q for the average 920 round-trip time. 922 4.6. Scheduling of Packet Transmissions 924 As TFRC is rate-based, and as operating systems typically cannot 925 schedule events precisely, it is necessary to be opportunistic about 926 sending data packets so that the correct average rate is maintained 927 despite the coarse-grain or irregular scheduling of the operating 928 system. To help maintain the correct average sending rate, the TFRC 929 sender may send some packets before their nominal send time. 931 In addition, the scheduling of packet transmissions controls the 932 allowed burstiness of senders after an idle or data-limited period.
933 Allowing the TFRC sender to accumulate sending `credits' for past 934 unused send times allows the TFRC sender to send a burst of data 935 after an idle period. As a comparison with TCP, TCP may send up to 936 a round-trip time's worth of packets in a single burst, but never 937 more. As examples, for TCP bursts can be sent when an ACK arrives 938 acknowledging a window of data, or when a data-limited sender, after 939 a delay of nearly a round-trip time, suddenly has a window of data 940 to send. 942 To limit burstiness, a TFRC implementation MUST prevent bursts of 943 arbitrary size. This limit MUST be less than or equal to one round- 944 trip time's worth of packets. A TFRC implementation MAY limit 945 bursts to less than a round-trip time's worth of packets, if so 946 desired. However, we note that such limits also constrain TFRC's 947 performance beyond the case for the current TCP. 949 A typical sending loop will calculate the correct inter-packet 950 interval, t_ipi, as follows: 952 t_ipi = s/X_inst; 954 Let t_now be the current time and i be a natural number, i = 0, 1, 955 ..., with t_i the nominal send time for the i-th packet. Then the 956 nominal send time t_(i+1) derives recursively as 958 t_0 = t_now, 959 t_(i+1) = t_i + t_ipi. 961 For TFRC senders allowed to accumulate sending `credits' for unused 962 send time over the last T seconds, the sender would be allowed to 963 use unused nominal send times t_j for t_j < now - T. We recommend T 964 set to the round-trip time. 966 4.6.1. Sending Packets Before their Nominal Send Time 968 Let t_gran be the scheduling timer granularity of the operating 969 system. If the operating system has a coarse timer granularity or 970 otherwise cannot support short t_ipi intervals, then either the TFRC 971 sender will be restricted to a sending rate of at most 1 packet 972 every t_gran seconds, or the TFRC sender must be allowed to send 973 short bursts of packets.
In addition to allowing the sender to 974 accumulate sending `credits' for past unused send times, it can be 975 useful to allow the sender to send a packet before its scheduled 976 send time, as described in the section below. 978 A parameter t_delta MAY be used to allow a packet to be sent before 979 its nominal send time. Consider an application that becomes idle 980 and requests re-scheduling for time t_i = t_(i-1) + t_ipi, for 981 t_(i-1) the send time for the previous packet. When the application 982 is re-scheduled, it checks the current time, t_now. If (t_now > t_i 983 - t_delta) then packet i is sent. When the nominal send time, t_i, 984 of the next packet is calculated, it may already be the case that 985 t_now > t_i - t_delta. In such a case the packet would be sent 986 immediately. 988 In order to send at most one packet before its nominal send time, 989 and never to send a packet more than a round-trip time before its 990 nominal send time the parameter t_delta would be set as follows: 992 t_delta = min(t_ipi, t_gran, rtt)/2; 994 The scheduling granularity t_gran is 10ms on many Unix systems. If 995 t_gran is not known, a value of 10ms could be assumed. 997 As an example, consider a TFRC flow with an allowed sending rate X 998 of 100 packets per round-trip time, a round-trip time of 100 ms, a 999 system with a scheduling granularity t_gran of 10 ms, and the 1000 ability to accumulate unused sending credits for a round-trip time. 1001 In this case, t_ipi is 1 ms. The TFRC sender would be allowed to 1002 send packets 0.5 ms before their nominal sending time, and would be 1003 allowed to save unused sending credits for 100 ms. The scheduling 1004 granularity of 10 ms would not significantly affect the performance 1005 of the connection.
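The t_delta rule and the early-send check described above can be illustrated with a short Python sketch. The function names t_delta and may_send are hypothetical and chosen only for this illustration; all times are in milliseconds.

```python
def t_delta(t_ipi, t_gran, rtt):
    """How far before its nominal send time a packet may be sent."""
    return min(t_ipi, t_gran, rtt) / 2


def may_send(t_now, t_i, delta):
    """True if packet i, with nominal send time t_i, may be sent now."""
    return t_now > t_i - delta


# Values from the example above: t_ipi = 1 ms, t_gran = 10 ms,
# round-trip time = 100 ms.
delta = t_delta(t_ipi=1.0, t_gran=10.0, rtt=100.0)   # 0.5 ms
```

With these values a packet whose nominal send time is t_i = 10 ms may be sent once the clock has passed 9.5 ms, but not earlier.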
1007 As a different example, consider a TFRC flow with a scheduling 1008 granularity greater than the round-trip time, for example, with a 1009 round-trip time of 0.1 ms and a system with a scheduling granularity 1010 of 1 ms, and with the ability to accumulate unused sending credits 1011 for a round-trip time. The TFRC sender would be allowed to save 1012 unused sending credits for 0.1 ms. If the scheduling granularity 1013 *did not* affect the sender's response to an incoming feedback 1014 packet, then the TFRC sender would be able to send an RTT of data 1015 (as determined by the allowed sending rate) each RTT, in response to 1016 incoming feedback packets. In this case, the coarse scheduling 1017 granularity would not significantly reduce the sending rate, but the 1018 sending rate would be bursty, with a round-trip time of data sent in 1019 response to each feedback packet. 1021 However, performance would be different in this case if the 1022 operating system scheduling granularity affected the sender's 1023 response to feedback packets as well as the general scheduling of 1024 the sender. In this case the sender's performance would be severely 1025 limited by the scheduling granularity being greater than the round-trip 1026 time, with the sender able to send an RTT of data, at the allowed 1027 sending rate, at most once every 1 ms. This restriction of the 1028 sending rate is an unavoidable consequence of allowing burstiness of 1029 at most a round-trip time of data. 1031 5. Calculation of the Loss Event Rate (p) 1033 Obtaining an accurate and stable measurement of the loss event rate 1034 is of primary importance for TFRC. Loss rate measurement is 1035 performed at the receiver, based on the detection of lost or marked 1036 packets from the sequence numbers of arriving packets. We describe 1037 this process before describing the rest of the receiver protocol.
1038 If the receiver has not yet detected a lost or marked packet, then 1039 the receiver doesn't calculate the loss event rate, but reports a 1040 loss event rate of zero. 1042 5.1. Detection of Lost or Marked Packets 1044 TFRC assumes that all packets contain a sequence number that is 1045 incremented by one for each packet that is sent. For the purposes 1046 of this specification, we require that if a lost packet is 1047 retransmitted, the retransmission is given a new sequence number 1048 that is the latest in the transmission sequence, and not the same 1049 sequence number as the packet that was lost. If a transport 1050 protocol has the requirement that it must retransmit with the 1051 original sequence number, then the transport protocol designer must 1052 figure out how to distinguish delayed from retransmitted packets and 1053 how to detect lost retransmissions. 1055 The receiver maintains a data structure that keeps track of which 1056 packets have arrived and which are missing. For the purposes of 1057 specification, we assume that the data structure consists of a list 1058 of packets that have arrived along with the receiver timestamp when 1059 each packet was received. In practice this data structure will 1060 normally be stored in a more compact representation, but this is 1061 implementation-specific. 1063 The loss of a packet is detected by the arrival of at least NDUPACK 1064 packets with a higher sequence number than the lost packet, for 1065 NDUPACK set to 3. The requirement for NDUPACK subsequent packets is 1066 the same as with TCP, and is to make TFRC more robust in the 1067 presence of reordering. In contrast to TCP, if a packet arrives 1068 late (after NDUPACK subsequent packets arrived) in TFRC, the late 1069 packet can fill the hole in TFRC's reception record, and the 1070 receiver can recalculate the loss event rate. 
Future versions of 1071 TFRC might make the requirement for NDUPACK subsequent packets 1072 adaptive based on experienced packet reordering, but we do not 1073 specify such a mechanism here. 1075 For an ECN-capable connection, a marked packet is detected as a 1076 congestion event as soon as it arrives, without having to wait for 1077 the arrival of subsequent packets. 1079 5.2. Translation from Loss History to Loss Events 1081 TFRC requires that the loss fraction be robust to several 1082 consecutive packets lost or marked where those packets are part of 1083 the same loss event. This is similar to TCP, which (typically) only 1084 performs one halving of the congestion window during any single RTT. 1085 Thus the receiver needs to map the packet loss history into a loss 1086 event record, where a loss event is one or more packets lost or 1087 marked in an RTT. To perform this mapping, the receiver needs to 1088 know the RTT to use, and this is supplied periodically by the 1089 sender, typically as control information piggy-backed onto a data 1090 packet. TFRC is not sensitive to how the RTT measurement sent to 1091 the receiver is made, but we recommend using the sender's calculated 1092 RTT, R, (see Section 4.3) for this purpose. 1094 To determine whether a lost or marked packet should start a new loss 1095 event, or be counted as part of an existing loss event, we need to 1096 compare the sequence numbers and timestamps of the packets that 1097 arrived at the receiver. For a marked packet S_new, its reception 1098 time T_new can be noted directly. For a lost packet, we can 1099 interpolate to infer the nominal "arrival time". Assume: 1101 S_loss is the sequence number of a lost packet. 1103 S_before is the sequence number of the last packet to arrive 1104 with sequence number before S_loss. 1106 S_after is the sequence number of the first packet to arrive 1107 with sequence number after S_loss. 1109 S_max is the largest sequence number. 
1111 Therefore, S_before < S_loss < S_after <= S_max. 1113 T_loss is the nominal estimated arrival time for the lost 1114 packet. 1116 T_before is the reception time of S_before. 1118 T_after is the reception time of S_after. 1120 Note that T_before can either be before or after T_after due to 1121 reordering. 1123 For a lost packet S_loss, we can interpolate its nominal "arrival 1124 time" at the receiver from the arrival times of S_before and 1125 S_after. Thus: 1127 T_loss = T_before + ( (T_after - T_before) 1128 * (S_loss - S_before)/(S_after - S_before) ); 1130 Note that if the sequence space wrapped between S_before and 1131 S_after, then the sequence numbers must be modified to take this 1132 into account before performing this calculation. If the largest 1133 possible sequence number is S_max, and S_before > S_after, then 1134 modifying each sequence number S by S' = (S + (S_max + 1)/2) mod 1135 (S_max + 1) would normally be sufficient. 1137 If the lost packet S_old was determined to have started the previous 1138 loss event, and we have just determined that S_new has been lost, 1139 then we interpolate the nominal arrival times of S_old and S_new, 1140 called T_old and T_new respectively. 1142 If T_old + R >= T_new, then S_new is part of the existing loss 1143 event. Otherwise S_new is the first packet in a new loss event. 1145 5.3. Inter-loss Event Interval 1147 If a loss interval, A, is determined to have started with packet 1148 sequence number S_A and the next loss interval, B, started with 1149 packet sequence number S_B, then the number of packets in loss 1150 interval A is given by (S_B - S_A). Thus, loss interval A contains 1151 all of the packets transmitted by the sender starting with the first 1152 packet transmitted in loss interval A, and ending with but not 1153 including the first packet transmitted in loss interval B. 1155 5.4. Average Loss Interval 1157 To calculate the loss event rate p, we first calculate the average 1158 loss interval. 
This is done using a filter that weights the n most 1159 recent loss event intervals in such a way that the measured loss 1160 event rate changes smoothly. If the receiver has not yet seen a 1161 lost or marked packet, then the receiver doesn't calculate the 1162 average loss interval. 1164 Weights w_0 to w_(n-1) are calculated as: 1166 If (i < n/2) 1167 w_i = 1; 1168 Else 1169 w_i = 1 - (i - (n/2 - 1))/(n/2 + 1); 1171 Thus if n=8, the values of w_0 to w_7 are: 1173 1.0, 1.0, 1.0, 1.0, 0.8, 0.6, 0.4, 0.2 1175 The value n for the number of loss intervals used in calculating the 1176 loss event rate determines TFRC's speed in responding to changes in 1177 the level of congestion. As currently specified, TFRC SHOULD NOT 1178 use values of n significantly greater than 8, for traffic that might 1179 compete in the global Internet with TCP. At the very least, safe 1180 operation with values of n greater than 8 would require a slight 1181 change to TFRC's mechanisms to include a more severe response to two 1182 or more round-trip times with heavy packet loss. 1184 When calculating the average loss interval we need to decide whether 1185 to include the current loss interval, defined as the loss interval 1186 containing the most recent loss event. We only include the current 1187 loss interval if it is sufficiently large to increase the average 1188 loss interval. 1190 Let the most recent loss intervals be I_0 to I_k, where I_0 is the 1191 current loss interval. If there have been at least n loss 1192 intervals, then k is set to n; otherwise k is the maximum number of 1193 loss intervals seen so far. 
We calculate the average loss interval 1194 I_mean as follows: 1196 I_tot0 = 0; 1197 I_tot1 = 0; 1198 W_tot = 0; 1199 for (i = 0 to k-1) { 1200 I_tot0 = I_tot0 + (I_i * w_i); 1201 W_tot = W_tot + w_i; 1202 } 1203 for (i = 1 to k) { 1204 I_tot1 = I_tot1 + (I_i * w_(i-1)); 1205 } 1206 I_tot = max(I_tot0, I_tot1); 1207 I_mean = I_tot/W_tot; 1209 The loss event rate, p is simply: 1211 p = 1 / I_mean; 1213 5.5. History Discounting 1215 As described in Section 5.4, when there have been at least eight 1216 loss intervals, the most recent loss interval is only assigned 1217 1/(0.75*n) of the total weight in calculating the average loss 1218 interval, regardless of the size of the most recent loss interval. 1219 This section describes an optional history discounting mechanism, 1220 discussed further in [FHPW00a] and [W00], that allows the TFRC 1221 receiver to adjust the weights, concentrating more of the relative 1222 weight on the most recent loss interval, when the most recent loss 1223 interval is more than twice as large as the computed average loss 1224 interval. 1226 To carry out history discounting, we associate a discount factor 1227 DF_i with each loss interval L_i, for i > 0, where each discount 1228 factor is a floating point number. The discount array maintains the 1229 cumulative history of discounting for each loss interval. At the 1230 beginning, the values of DF_i in the discount array are initialized 1231 to 1: 1233 for (i = 0 to n) { 1234 DF_i = 1; 1235 } 1237 History discounting also uses a general discount factor DF, also a 1238 floating point number, that is also initialized to 1. First we show 1239 how the discount factors are used in calculating the average loss 1240 interval, and then we describe later in this section how the 1241 discount factors are modified over time. 1243 As described in Section 5.4 the average loss interval is calculated 1244 using the n previous loss intervals I_1, ..., I_n and the current 1245 loss interval I_0. 
The computation of the average loss interval 1246 using the discount factors is a simple modification of the procedure 1247 in Section 5.4, as follows: 1249 I_tot0 = I_0 * w_0 1250 I_tot1 = 0; 1251 W_tot0 = w_0 1252 W_tot1 = 0; 1253 for (i = 1 to n-1) { 1254 I_tot0 = I_tot0 + (I_i * w_i * DF_i * DF); 1255 W_tot0 = W_tot0 + w_i * DF_i * DF; 1256 } 1257 for (i = 1 to n) { 1258 I_tot1 = I_tot1 + (I_i * w_(i-1) * DF_i); 1259 W_tot1 = W_tot1 + w_(i-1) * DF_i; 1260 } 1261 p = min(W_tot0/I_tot0, W_tot1/I_tot1); 1263 The general discounting factor DF is updated on every packet arrival 1264 as follows. First, the receiver computes the weighted average I_mean 1265 of the loss intervals I_1, ..., I_n: 1267 I_tot = 0; 1268 W_tot = 0; 1269 for (i = 1 to n) { 1270 W_tot = W_tot + w_(i-1) * DF_i; 1271 I_tot = I_tot + (I_i * w_(i-1) * DF_i); 1272 } 1273 I_mean = I_tot / W_tot; 1275 This weighted average I_mean is compared to I_0, the size of current 1276 loss interval. If I_0 is greater than twice I_mean, then the new 1277 loss interval is considerably larger than the old ones, and the 1278 general discount factor DF is updated to decrease the relative 1279 weight on the older intervals, as follows: 1281 if (I_0 > 2 * I_mean) { 1282 DF = 2 * I_mean/I_0; 1283 if (DF < THRESHOLD) 1284 DF = THRESHOLD; 1285 } else 1286 DF = 1; 1288 A nonzero value for THRESHOLD ensures that older loss intervals from 1289 an earlier time of high congestion are not discounted entirely. We 1290 recommend a THRESHOLD of 0.25. Note that with each new packet 1291 arrival, I_0 will increase further, and the discount factor DF will 1292 be updated. 1294 When a new loss event occurs, the current interval shifts from I_0 1295 to I_1, loss interval I_i shifts to interval I_(i+1), and the loss 1296 interval I_n is forgotten. The previous discount factor DF has to 1297 be incorporated into the discount array. 
Because DF_i carries the 1298 discount factor associated with loss interval I_i, the DF_i array 1299 has to be shifted as well. This is done as follows: 1301 for (i = 1 to n) { 1302 DF_i = DF * DF_i; 1303 } 1304 for (i = n-1 to 0 step -1) { 1305 DF_(i+1) = DF_i; 1306 } 1307 I_0 = 1; 1308 DF_0 = 1; 1309 DF = 1; 1311 This completes the description of the optional history discounting 1312 mechanism. We emphasize that this is an optional mechanism whose 1313 sole purpose is to allow TFRC to respond somewhat more quickly to 1314 the sudden absence of congestion, as represented by a long current 1315 loss interval. 1317 6. Data Receiver Protocol 1319 The receiver periodically sends feedback messages to the sender. 1320 Feedback packets should normally be sent at least once per RTT, 1321 unless the sender is sending at a rate of less than one packet per 1322 RTT, in which case a feedback packet should be sent for every data 1323 packet received. A feedback packet should also be sent whenever a 1324 new loss event is detected without waiting for the end of an RTT, 1325 and whenever an out-of-order data packet is received that removes a 1326 loss event from the history. 1328 If the sender is transmitting at a high rate (many packets per RTT) 1329 there may be some advantages to sending periodic feedback messages 1330 more than once per RTT as this allows faster response to changing 1331 RTT measurements, and more resilience to feedback packet loss. If 1332 the receiver was sending k feedback packets per RTT, step (4) of 1333 Section 6.2 would be modified to set the feedback timer to expire 1334 after R_m/k seconds. However, each feedback packet would still 1335 report the receiver rate over the last RTT, not over a fraction of 1336 an RTT. We note that there is little gain from sending a large 1337 number of feedback messages per RTT. 1339 6.1.
Receiver Behavior When a Data Packet is Received 1341 When a data packet is received, the receiver performs the following 1342 steps: 1344 1) Add the packet to the packet history. 1346 2) Check if done: If the new packet results in the detection of a 1347 new loss event, or if no feedback packet was sent when the 1348 feedback timer last expired, go to step 3). Otherwise, no 1349 action need be performed (unless the optimization in the next 1350 paragraph is used), so exit the procedure. 1352 An optimization might check to see if the arrival of the packet 1353 caused a hole in the packet history to be filled and 1354 consequently two loss intervals were merged into one. If this 1355 is the case, the receiver might also send feedback immediately. 1356 The effects of such an optimization are normally expected to be 1357 small. 1359 3) Calculate p: Let the previous value of p be p_prev. Calculate 1360 the new value of p as described in Section 5. 1362 4) Expire feedback timer?: If p > p_prev, cause the feedback timer 1363 to expire, and perform the actions described in Section 6.2. 1365 If p <= p_prev and no feedback packet was sent when the feedback 1366 timer last expired, cause the feedback timer to expire, and 1367 perform the actions described in Section 6.2. If p <= p_prev and 1368 a feedback packet was sent when the feedback timer last expired, 1369 no action need be performed. 1371 6.2. Expiration of Feedback Timer 1373 When the feedback timer at the receiver expires, the action to be 1374 taken depends on whether data packets have been received since the 1375 last feedback was sent. 1377 For the m-th expiration of the feedback timer, let the maximum 1378 sequence number of a packet at the receiver so far be S_m, and the 1379 value of the RTT measurement included in packet S_m be R_m. As 1380 described in Section 3.2.1, R_m is the sender's current estimate of 1381 the round trip time, reported in data packets.
   If data packets have been received since the previous feedback was
   sent, the receiver performs the following steps:

   1) Calculate the average loss event rate using the algorithm
      described in Section 5.

   2a) If the feedback timer expired at its normal time, or expired
      early due to a new lost or marked packet (i.e., step (3) in
      Section 6.1), calculate the measured receive rate, X_recv, based
      on the packets received within the previous R_(m-1) seconds. In
      the typical case, when the receiver is sending only one feedback
      packet per round-trip time and the feedback timer did not expire
      early due to an idle period, then R_(m-1) would be the time
      interval since the feedback timer last expired.

   2b) If the feedback timer expired early due to a new lost or marked
      packet (i.e., step (3) in Section 6.1), the receive rate X_recv
      SHOULD be calculated based on the packets received within the
      previous R_(m-1) seconds. For ease of implementation, the
      receive rate MAY be calculated over a longer time interval, the
      time interval going back to the most recent feedback timer
      expiration that was at least R_(m-1) seconds ago.

   3) Prepare and send a feedback packet containing the information
      described in Section 3.2.2.

   4) Restart the feedback timer to expire after R_m seconds.

   Note that rule 2) above gives a minimum value for the measured
   receive rate X_recv of one packet per round-trip time. If the
   sender is limited to a sending rate of less than one packet per
   round-trip time, this will be due to the loss event rate, not to a
   limit imposed by the measured receive rate at the receiver.

   If no data packets have been received since the last feedback was
   sent, no feedback packet is sent, and the feedback timer is
   restarted to expire after R_m seconds.

6.3.
Receiver Initialization

   The receiver is initialized by the first data packet that arrives at
   the receiver. Let the sequence number of this packet be i.

   When the first packet is received:

   o  Set p = 0.

   o  Set X_recv = 0.

   o  Prepare and send a feedback packet.

   o  Set the feedback timer to expire after R_i seconds.

   If the first data packet doesn't contain an estimate R_i of the
   round-trip time, then the receiver sends a feedback packet for every
   arriving data packet, until a data packet arrives containing an
   estimate of the round-trip time.

   If the sender is using a coarse-grained timestamp that increments
   every quarter of a round-trip time, then a feedback timer is not
   needed, and the following procedure from RFC 4342 is used to
   determine when to send feedback messages.

   o  Whenever the receiver sends a feedback message, the receiver
      sets a local variable last_counter to the greatest received
      value of the window counter since the last feedback message was
      sent, if any data packets have been received since the last
      feedback message was sent.

   o  If the receiver receives a data packet with a window counter
      value greater than or equal to last_counter + 4, then the
      receiver sends a new feedback packet. ("Greater" and "greatest"
      are measured in circular window counter space.)

6.3.1. Initializing the Loss History after the First Loss Event

   The number of packets until the first loss cannot be used to
   compute the allowed sending rate directly, as the sending rate
   changes rapidly during this time. TFRC assumes that the correct
   data rate after the first loss is half of the maximum sending rate
   before the loss occurred. TFRC approximates this target rate
   X_target by the maximum X_recv so far, for X_recv the receive rate
   over a single round-trip time.
   (For a TFRC sender that always has data to send, it is sufficient to
   approximate the target rate by the most recent X_recv. However, for
   a TFRC sender that is sometimes data-limited or idle, it is best to
   use the maximum X_recv so far.) After the first loss, instead of
   initializing the first loss interval to the number of packets sent
   until the first loss, the TFRC receiver calculates the loss interval
   that would be required to produce the data rate X_target, and uses
   this synthetic loss interval to seed the loss history mechanism.

   TFRC does this by finding some value p for which the throughput
   equation in Section 3.1 gives a sending rate within 5% of X_target,
   given the round-trip time R, and the first loss interval is then set
   to 1/p. If the receiver knows the segment size s used by the
   sender, then the receiver can use the throughput equation for X;
   otherwise, the receiver can measure the receive rate in packets per
   second instead of bytes per second for this purpose, and use the
   throughput equation for X_pps. (The 5% tolerance is introduced
   simply because the throughput equation is difficult to invert, and
   we want to reduce the costs of calculating p numerically.)

   Special care is needed for initializing the first loss interval when
   the first data packet is lost or marked. When the first data packet
   is lost in TCP, the TCP sender retransmits the packet after the
   retransmit timer expires. If TCP's first data packet is ECN-marked,
   the TCP sender resets the retransmit timer, and sends a new data
   packet only when the retransmit timer expires [RFC3168] (Section
   6.1.2). For TFRC, if the first data packet is lost or ECN-marked,
   then the first loss interval consists of the null interval with no
   data packets.
   In this case, the loss interval length for this (null) loss interval
   should be set to give a similar sending rate to that of TCP.

   When the first TFRC loss interval is null, meaning that the first
   data packet is lost or ECN-marked, in order to follow the behavior
   of TCP, TFRC wants the allowed sending rate to be 1 packet every two
   round-trip times, or equivalently, 0.5 packets per RTT. Thus, the
   TFRC receiver calculates the loss interval that would be required to
   produce the target rate X_target of 0.5/R packets per second, for
   the round-trip time R, and uses this synthetic loss interval for the
   first loss interval. The TFRC receiver uses 0.5/R packets per
   second as the minimum value for X_target when initializing the first
   loss interval.

7. Sender-based Variants

   In a sender-based variant of TFRC, the receiver uses reliable
   delivery to send information about packet losses to the sender, and
   the sender computes the packet loss rate and the acceptable transmit
   rate.

   The main advantage of a sender-based variant of TFRC is that the
   sender does not have to trust the receiver's calculation of the
   packet loss rate. However, with the requirement of reliable
   delivery of loss information from the receiver to the sender, a
   sender-based TFRC would have much tighter constraints on the
   transport protocol in which it is embedded.

   In contrast, the receiver-based variant of TFRC specified in this
   document is robust to the loss of feedback packets, and therefore
   does not require the reliable delivery of feedback packets. It is
   also better suited for applications where it is desirable to offload
   work from the server to the client as much as possible.

   RFC 4340 and RFC 4342 together specify CCID 3, which can be used as
   a sender-based variant of TFRC.
   In CCID 3, each feedback packet from the receiver contains a Loss
   Intervals option, reporting the lengths of the most recent loss
   intervals. Feedback packets may also include the Ack Vector option,
   allowing the sender to determine exactly which packets were dropped
   or marked and to check the information reported in the Loss
   Intervals options. The Ack Vector option can also include ECN Nonce
   Echoes, allowing the sender to verify the receiver's report of
   having received an unmarked data packet. The Ack Vector option
   allows the sender to see for itself which data packets were lost or
   ECN-marked, to determine loss intervals, and to calculate the loss
   event rate. Section 9.2 of RFC 4342 discusses issues in the sender
   verifying information reported by the receiver.

8. Implementation Issues

   This document has specified the TFRC congestion control mechanism,
   for use by applications and transport protocols. This section
   mentions briefly some of the implementation issues.

   Computing the throughput equation (Section 3.1): For t_RTO = 4*R
   and b = 1, the throughput equation in Section 3.1 can be expressed
   as follows:

                  s
      X_Bps = --------
              R * f(p)

   for

      f(p) = sqrt(2*p/3) + (12*sqrt(3*p/8) * p * (1+32*p^2)).

   A table lookup could be used for the function f(p).

   Many of the multiplications (e.g., q and 1-q for the round-trip time
   average, a factor of 4 for the timeout interval) are or could be by
   powers of two, and therefore could be implemented as simple shift
   operations.

   The sender mechanism for preventing oscillations (Section 4.5): We
   note that the optional sender mechanism for preventing oscillations
   described in Section 4.5 uses a square-root computation.

   Calculating the nominal packet arrival time (Section 5.2).
   For the calculation of the nominal arrival time T_loss for a lost
   packet from Section 5.2, one way to implement this that would avoid
   concerns about wrapped sequence space would be to use the following:

      T_loss = T_before + (T_after - T_before)
                  * Dist(S_loss, S_before)/Dist(S_after, S_before)

   where

      Dist(Seqno_A, Seqno_B) = (Seqno_A + 2^48 - Seqno_B) % 2^48

   The calculation of the average loss interval (Section 5.4): The
   calculation of the average loss interval in Section 5.4 involves
   multiplications by the weights w_0 to w_(n-1), which for n=8 are:

      1.0, 1.0, 1.0, 1.0, 0.8, 0.6, 0.4, 0.2.

   With a minor loss of smoothness, it would be possible to use weights
   that were powers of two or sums of powers of two, e.g.,

      1.0, 1.0, 1.0, 1.0, 0.75, 0.5, 0.25, 0.25.

   The optional history discounting mechanism (Section 5.5): The
   optional history discounting mechanism described in Section 5.5 is
   used in the calculation of the average loss rate. The history
   discounting mechanism is invoked only when there has been an
   unusually long interval with no packet losses. For a more efficient
   operation, the discount factor DF_i could be restricted to be a
   power of two.

9. Changes from RFC 3448

   This section summarizes the changes from RFC 3448.

   Section 4.1, estimating the average segment size: Section 4.1 was
   modified to give a specific algorithm that could be used for
   estimating the average segment size.

   Section 4.2, update to the initial sending rate: In RFC 3448, the
   initial sending rate was two packets per round trip time. In this
   document, the initial sending rate can be as high as four packets
   per round trip time, following RFC 3390. The initial sending rate
   was changed to be in terms of the segment size s, not in terms of
   the MSS.
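
   As an illustration only, the initial sending rate change above can
   be sketched as follows. The sketch assumes the initial-window
   formula of RFC 3390, min(4*MSS, max(2*MSS, 4380 bytes)), with the
   segment size s substituted for the MSS. The function names
   initial_window and initial_rate are hypothetical and are not
   defined by this document.

```python
def initial_window(s: int) -> int:
    """Initial window in bytes.

    Assumption: the RFC 3390 formula with the segment size s (bytes)
    used in place of the MSS.
    """
    return min(4 * s, max(2 * s, 4380))


def initial_rate(s: int, rtt: float) -> float:
    """Allowed initial sending rate in bytes per second.

    rtt is the round-trip time in seconds; one initial window is
    allowed per round-trip time.
    """
    if rtt <= 0:
        raise ValueError("rtt must be positive")
    return initial_window(s) / rtt
```

   Under these assumptions, a 1000-byte segment size gives an initial
   window of four packets, a 1460-byte segment size gives three
   packets, and segment sizes above 2190 bytes give two packets.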

   Section 4.2 now says that tld, the Time Last Doubled during slow-
   start, can be initialized to either 0 or to -1. Section 4.2 was
   also clarified to say that RTT measurements don't only come from
   feedback packets; they could also come from other places, such as
   the SYN exchange.

   Section 4.3, response to feedback packets: Section 4.3 was modified
   to change the way that the receive rate is used in limiting the
   sender's allowed sending rate, by using the set of receive rate
   values of the last two round-trip times, and initializing the set of
   receive rate values by a large value.

   The larger initial sending rate in Section 4.2 is of little use if
   the receiver sends a feedback packet after the first packet is
   received, and the sender in response reduces the allowed sending
   rate to at most two packets per RTT, which would be twice the
   receive rate. Because of the change in the sender's processing of
   the receive rate, the sender now does not reduce the allowed sending
   rate to twice the reported receive rate in response to the first
   feedback packet.

   The sender never uses the receive rate from a data-limited period to
   restrict the allowed sending rate. Appendix C discusses this
   response further.

   Section 4.4, response to an idle period: Following Section 5.1 from
   [RFC4342], this document specifies that when the sending rate is
   reduced after an idle period that covers the period since the
   nofeedback timer was set, the allowed sending rate is not reduced
   below the initial sending rate.

   Section 4.4, correction from [RFC3448Err]. RFC 3448 had
   contradictory text about whether the sender halved its sending rate
   after *two* round-trip times without receiving a feedback report, or
   after *four* round-trip times.
   This document clarifies that the sender halves its sending rate
   after four round-trip times without receiving a feedback report
   [RFC3448Err].

   Section 4.4, clarification for Slow-Start: Section 4.4 was clarified
   to specify that on the expiration of the nofeedback timer, if p = 0,
   X_Bps can't be used, because the sender doesn't yet have a value for
   X_Bps.

   Section 4.7, credits for unused send time: Section 4.7 has been
   clarified to say that the TFRC sender gets to accumulate up to an
   RTT of "credits" for unused send time. Section 4.7 was also
   rewritten to clarify what is specification and what is
   implementation.

   Section 5.4, clarification: Section 5.4 was modified to clarify the
   receiver's calculation of the average loss interval when the
   receiver has not yet seen eight loss intervals.

   Section 5.5, correction: Section 5.5 was corrected to say that the
   loss interval I_0 includes all transmitted packets, including lost
   and marked packets (as defined in Section 5.3 in the general
   definition of loss intervals).

   Section 5.5, correction from [RFC3448Err]: A line in Section 5.5 was
   changed from

      "for (i = 1 to n) { DF_i = 1; }"

   to

      "for (i = 0 to n) { DF_i = 1; }"

   [RFC3448Err].

   Section 5.5, history discounting: THRESHOLD, the lower bound on the
   history discounting parameter DF, has been changed from 0.5 to 0.25,
   to allow more history discounting when the current interval is long.

   Section 6, multiple feedback packets: Section 6 now contains more
   discussion of procedures if the receiver sends multiple feedback
   packets each round-trip time.

   Section 6.3, initialization of the feedback timer: Section 6.3 now
   specifies the receiver's initialization of the feedback timer if the
   first data packet received doesn't have an estimate of the round-
   trip time.
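
   The feedback timer initialization from Section 6.3 might be
   sketched as follows. This is an illustration under stated
   assumptions, not part of the specification: the class
   FeedbackScheduler and its method are hypothetical, the data packet
   that first carries a round-trip time estimate is assumed to
   trigger feedback itself, and the loss-triggered feedback of
   Section 6.1 is left out.

```python
from typing import Optional


class FeedbackScheduler:
    """Hypothetical receiver-side helper; not defined by the document."""

    def __init__(self) -> None:
        # Feedback timer interval in seconds; None until a data packet
        # carrying an RTT estimate has been received.
        self.timer_interval: Optional[float] = None

    def on_data_packet(self, rtt_estimate: Optional[float]) -> bool:
        """Return True if feedback should be sent at once for this packet."""
        timer_was_armed = self.timer_interval is not None
        if rtt_estimate is not None:
            # Arm (or refresh) the feedback timer with the sender's
            # round-trip time estimate carried in the data packet.
            self.timer_interval = rtt_estimate
        # Until the timer has been armed, every arriving data packet
        # triggers feedback (assumption: this includes the packet that
        # first carries an RTT estimate).
        return not timer_was_armed
```

   Once the timer is armed, feedback in this sketch is driven by the
   timer expiration described in Section 6.2 rather than by each
   arriving packet.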

   Section 6.3, a coarse-grained timestamp: Section 6.3 was modified to
   incorporate, as an option, a coarse-grained timestamp from the
   sender that increments every quarter of a round-trip time, instead
   of a more fine-grained timestamp. This follows RFC 4342.

   Section 6.3.1, after the first loss event: Section 6.3.1 now says
   that for initializing the loss history after the first loss event,
   the receiver uses the maximum receive rate so far, instead of the
   receive rate in the last round-trip time.

   Section 6.3.1, if the first data packet is dropped: Section 6.3.1
   now contains a specification for initializing the loss history if
   the first data packet sent is lost or ECN-marked.

   Section 7, sender-based variants: Section 7's discussion of sender-
   based variants has been expanded, with reference to RFC 4342.

10. Security Considerations

   TFRC is not a transport protocol in its own right, but a congestion
   control mechanism that is intended to be used in conjunction with a
   transport protocol. Therefore security primarily needs to be
   considered in the context of a specific transport protocol and its
   authentication mechanisms.

   Congestion control mechanisms can potentially be exploited to create
   denial of service. This may occur through spoofed feedback. Thus
   any transport protocol that uses TFRC should take care to ensure
   that feedback is only accepted from the receiver of the data. The
   precise mechanism to achieve this will however depend on the
   transport protocol itself.

   In addition, congestion control mechanisms may potentially be
   manipulated by a greedy receiver that wishes to receive more than
   its fair share of network bandwidth. A receiver might do this by
   claiming to have received packets that in fact were lost due to
   congestion.
   Possible defenses against such a receiver would normally include
   some form of nonce that the receiver must feed back to the sender to
   prove receipt. However, the details of such a nonce would depend on
   the transport protocol, and in particular on whether the transport
   protocol is reliable or unreliable.

   We expect that protocols incorporating ECN with TFRC will also want
   to incorporate feedback from the receiver to the sender using the
   ECN nonce [RFC3540]. The ECN nonce is a modification to ECN that
   protects the sender from the accidental or malicious concealment of
   marked packets. Again, the details of such a nonce would depend on
   the transport protocol, and are not addressed in this document.

11. IANA Considerations

   There are no IANA actions required for this document.

12. Acknowledgments

   We would like to acknowledge feedback and discussions on equation-
   based congestion control with a wide range of people, including
   members of the Reliable Multicast Research Group, the Reliable
   Multicast Transport Working Group, and the End-to-End Research
   Group. We would like to thank Dado Colussi, Gorry Fairhurst, Ladan
   Gharai, Wim Heirman, Eddie Kohler, Ken Lofgren, Mike Luby, Ian
   McDonald, Michele R., Gerrit Renker, Arjuna Sathiaseelan, Vladica
   Stanisic, Randall Stewart, Eduardo Urzaiz, Shushan Wen, and Wendy
   Lee (lhh@zsu.edu.cn) for feedback on earlier versions of this
   document, and to thank Mark Allman for his extensive feedback from
   using [RFC3448] to produce a working implementation.

A. Terminology

   This document uses the following terms.

   DF: Discount factor for a loss interval (Section 5.5).

   initial_rate:
      Allowed initial sending rate.

   last_counter:
      Greatest received value of the window counter (Section 6.3).

   min_rate:
      Minimum transmit rate (Section 4.3).

   n: Number of loss intervals.

   NDUPACK:
      Number of dupacks for inferring loss (constant) (Section 5.1).

   nofeedback timer:
      Sender-side timer (Section 4).

   p: Estimated Loss Event Rate.

   p_prev:
      Previous value of p (Section 6.1).

   q: Filter constant for RTT (constant) (Section 4.3).

   q2: Filter constant for long-term RTT (constant) (Section 4.6).

   R: Estimated path round-trip time.

   R_sample:
      Measured path RTT (Section 4.3).

   R_sqmean:
      Long-term estimate of the square root of the RTT (Section 4.6).

   recover_rate:
      Allowed rate for resuming after an idle period.

   recv_limit:
      Limit on sending rate computed from the receive rate.

   s: Nominal packet size in bytes.

   S: Sequence number.

   t_delay:
      Reported time delay between receipt of the last packet at the
      receiver and the generation of the feedback packet (Section
      3.2.2).

   t_delta:
      Parameter for flexibility in send time (Section 4.7).

   t_gran:
      Scheduler granularity (constant) (Section 4.7).

   t_ipi:
      Inter-packet interval for sending packets (Section 4.7).

   t_mbi:
      Maximum RTO value of TCP (constant) (Section 4.3).

   tld:
      Time Last Doubled (Section 4.2).

   t_now:
      Current time (Section 4.3).

   t_RTO:
      Estimated RTO of TCP (Section 4.3).

   X: Allowed transmit rate, as limited by the receive rate.

   X_Bps:
      Calculated sending rate in bytes per second (Section 3.1).

   X_pps:
      Calculated sending rate in packets per second (Section 3.1).

   X_recv:
      Estimated receive rate at the receiver (Section 3.2.2).

   X_inst:
      Instantaneous allowed transmit rate (Section 4.6).

   W_init:
      TCP initial window (constant) (Section 4.2).

B. The Initial Value of the Nofeedback Timer

   Why is the initial value of TFRC's nofeedback timer set to two
   seconds, instead of the recommended initial value of three seconds
   for TCP's retransmit timer, from [RFC2988]?
   There isn't any particular reason why TFRC's nofeedback timer should
   have the same initial value as TCP's retransmit timer. TCP's
   retransmit timer is used not only to reduce the sending rate in
   response to congestion, but also to retransmit a packet that is
   assumed to have been dropped in the network. In contrast, TFRC's
   nofeedback timer is only used to reduce the allowed sending rate,
   not to trigger the sending of a new packet. As a result, there is
   no danger to the network for the initial value of TFRC's nofeedback
   timer to be smaller than the recommended initial value for TCP's
   retransmit timer.

   Further, when the nofeedback timer has not yet expired, TFRC has a
   more slowly-responding congestion control mechanism than TCP, and
   TFRC's use of the receive rate for limiting the sending rate is
   somewhat less precise than TCP's use of windows and ack-clocking, so
   the nofeedback timer is a particularly important safety mechanism
   for TFRC. For all of these reasons, it is perfectly reasonable for
   TFRC's nofeedback timer to have a smaller initial value than that of
   TCP's retransmit timer.

C. Response to Idle or Data-limited Periods

   Future work could explore alternate responses to using the receive
   rate during a data-limited period.

C.1. Long Idle or Data-limited Periods

   Table 1 summarizes the response of Standard TCP [RFC2581], TCP with
   Congestion Window Validation [RFC2861], Standard TFRC [RFC3448], and
   Revised TFRC (this document) in response to long idle or data-
   limited periods. For the purposes of this section, we define a long
   period as a period of at least an RTO.

   Protocol         Long idle periods           Long data-limited periods
   --------------   --------------------        ----------------------
   Standard TCP:    Window -> initial.          No change in window.

   TCP with CWV:    Halve window                Reduce window half way
                    (not below initial cwnd).   to used window.

   Standard TFRC:   Halve rate                  Rate limited to
                    (not below 1 pkt/64 sec).   twice receive rate.
                    One RTT after sending pkt,
                    rate is limited by X_recv.

   Revised TFRC:    Halve rate                  Rate not limited to
                    (not below initial rate).   twice receive rate.

          Table 1: Response to long idle or data-limited periods.

   Standard TCP after long idle periods: For Standard TCP, [RFC2581]
   specifies that TCP SHOULD set the congestion window to no more than
   the initial window after an idle period of at least an RTO.

   Standard TCP after long data-limited periods: Standard TCP [RFC2581]
   does not reduce TCP's congestion window after a data-limited period,
   when the congestion window is not fully used. Standard TCP in
   [RFC2581] uses the FlightSize, the amount of outstanding data in the
   network, only in setting the slow-start threshold after a retransmit
   timeout. Standard TCP is not limited by TCP's ack-clocking
   mechanism during a data-limited period.

   Standard TCP's lax response to a data-limited period is quite
   different from its stringent response to an idle period.

   TCP with Congestion Window Validation (CWV) after long idle periods:
   As an experimental alternative, [RFC2861] specifies a more moderate
   response to an idle period than that of Standard TCP, where during
   an idle period the TCP sender halves cwnd after each RTO, down to
   the initial cwnd.

   TCP with Congestion Window Validation after long data-limited
   periods: As an experimental alternative, [RFC2861] specifies a more
   stringent response to a data-limited period than that of Standard
   TCP, where after each RTO seconds of a data-limited period, the
   congestion window is reduced half way down to the window that is
   actually used.

   The response of TCP with CWV to an idle period is similar to its
   response to a data-limited period.
   TCP with CWV is less restrictive than Standard TCP in response to an
   idle period, and more restrictive than Standard TCP in response to a
   data-limited period.

   Standard TFRC after long idle periods: For Standard TFRC, [RFC3448]
   specifies that the allowed sending rate is halved after each RTO
   seconds of an idle period. The allowed sending rate is not reduced
   below one packet in 64 seconds. After an idle period, the first
   feedback packet received reports a receive rate of one packet per
   round-trip time, and this receive rate is used to limit the sending
   rate. Standard TFRC effectively slow-starts up from this allowed
   sending rate.

   Standard TFRC after long data-limited periods: [RFC3448] does not
   distinguish between data-limited and not-data-limited periods. As a
   consequence, the allowed sending rate is limited to at most twice
   the receive rate during and after a data-limited period. This is a
   very restrictive response, more restrictive than that of either
   Standard TCP or of TCP with CWV.

   Revised TFRC after long idle periods: For Revised TFRC, this
   document specifies that the allowed sending rate is halved after
   each RTO seconds of an idle period. The allowed sending rate is not
   reduced below the initial sending rate as the result of an idle
   period. The first feedback packet received after the idle period
   reports a receive rate of one packet per round-trip time. However,
   the Revised TFRC sender does not use this receive rate for limiting
   the sending rate. Thus, Revised TFRC differs from Standard TFRC in
   the lower limit used in the reduction of the sending rate, and in
   the better response to the first feedback packet received after the
   idle period.

   Revised TFRC after long data-limited periods: For Revised TFRC, this
   document distinguishes between data-limited and not-data-limited
   periods.
   As specified in Section 4.3, Revised TFRC does not reduce the
   allowed sending rate in response to the receive rate during a data-
   limited period. This is perhaps an overly-lax response, but it is
   similar to the response of Standard TCP, and is quite different from
   the very restrictive response of Standard TFRC to a data-limited
   period.

   Recovery after idle or data-limited periods: When TCP reduces the
   congestion window after an idle or data-limited period, TCP can set
   the slow-start threshold ssthresh to allow the TCP sender to slow-
   start back up towards its old sending rate when the idle or data-
   limited period is over. However in TFRC, even when the TFRC
   sender's sending rate is restricted by twice the previous receive
   rate, this results in the sender being able to double the sending
   rate from one round-trip time to the next, if permitted by the
   throughput equation. Thus, TFRC doesn't need a mechanism such as
   TCP's setting of ssthresh to allow a slow-start after an idle or
   data-limited period.

   For future work, one avenue to explore would be the addition of
   Congestion Window Validation mechanisms for TFRC's response to data-
   limited periods. Currently, following Standard TCP, during data-
   limited periods Revised TFRC does not limit its allowed sending rate
   as a function of the receive rate.

C.2. Short Idle or Data-limited Periods

   Table 2 summarizes the response of Standard TCP [RFC2581], TCP with
   Congestion Window Validation [RFC2861], Standard TFRC [RFC3448], and
   Revised TFRC (this document) in response to short idle or data-
   limited periods. For the purposes of this section, we define a
   short period as a period of less than an RTT.

   Protocol         Short idle periods         Short data-limited periods
   --------------   --------------------       ----------------------
   Standard TCP:    Send a burst up to cwnd.   Send a burst up to cwnd.

   TCP with CWV:    Send a burst up to cwnd.   Send a burst up to cwnd.

   Standard TFRC:   ?                          ?

   Revised TFRC:    Send a burst               Send a burst
                    (up to an RTT of           (up to an RTT of
                    unused send credits).      unused send credits).

          Table 2: Response to short idle or data-limited periods.

   Table 2 shows that Revised TFRC has a similar response to that of
   Standard TCP and of TCP with CWV to a short idle or data-limited
   period. For a short idle or data-limited period, TCP is limited
   only by the size of the unused congestion window, and Revised TFRC
   is limited only by the number of unused send credits (up to an RTT's
   worth). For Standard TFRC, [RFC3448] did not explicitly specify the
   behavior with respect to unused send credits.

C.3. Moderate Idle or Data-limited Periods

   Table 3 summarizes the response of Standard TCP [RFC2581], TCP with
   Congestion Window Validation [RFC2861], Standard TFRC [RFC3448], and
   Revised TFRC (this document) in response to moderate idle or data-
   limited periods. For the purposes of this section, we define a
   moderate period as a period greater than an RTT, but less than an
   RTO.

   Protocol         Moderate idle periods      Moderate data-limited periods
   -------------    ---------------------      -------------------------
   Standard TCP:    Send a burst up to cwnd.   Send a burst up to cwnd.

   TCP with CWV:    Send a burst up to cwnd.   Send a burst up to cwnd.

   Standard TFRC:   ?                          Limited by X_recv.

   Revised TFRC:    Send a burst               Send a burst
                    (up to an RTT of           (up to an RTT of
                    unused send credits).      unused send credits).

        Table 3: Response to moderate idle or data-limited periods.

   Table 3 shows that Revised TFRC has a similar response to that of
   Standard TCP and of TCP with CWV to a moderate idle or data-limited
   period. For a moderate idle or data-limited period, TCP is limited
   only by the size of the unused congestion window.
   For a moderate idle period, Revised TFRC is limited only by the
   number of unused send credits (up to an RTT's worth). For a
   moderate data-limited period, Standard TFRC would be limited by
   X_recv from the most recent feedback packet. In contrast, Revised
   TFRC isn't limited by the receive rate from data-limited periods
   that cover an entire feedback period of a round-trip time. For
   Standard TFRC, [RFC3448] did not explicitly specify the behavior
   with respect to unused send credits.

C.4. Other Patterns

   Other possible patterns to consider in evaluating Revised TFRC would
   be to compare the behavior of TCP, Standard TFRC, and Revised TFRC
   for connections with alternating busy and idle periods, alternating
   idle and data-limited periods, or with idle or data-limited periods
   during Slow-Start.

Normative References

   [RFC3448]    M. Handley, S. Floyd, J. Padhye, and J. Widmer, "TCP
                Friendly Rate Control (TFRC): Protocol
                Specification", RFC 3448, January 2003.

Informational References

   [BRS99]      Balakrishnan, H., Rahul, H., and Seshan, S., "An
                Integrated Congestion Management Architecture for
                Internet Hosts", Proc. ACM SIGCOMM, Cambridge, MA,
                September 1999.

   [FHPW00]     S. Floyd, M. Handley, J. Padhye, and J. Widmer,
                "Equation-Based Congestion Control for Unicast
                Applications", August 2000, Proc SIGCOMM 2000.

   [FHPW00a]    S. Floyd, M. Handley, J. Padhye, and J. Widmer,
                "Equation-Based Congestion Control for Unicast
                Applications: the Extended Version", ICSI tech
                report TR-00-03, March 2000.

   [PFTK98]     Padhye, J. and Firoiu, V. and Towsley, D. and
                Kurose, J., "Modeling TCP Throughput: A Simple Model
                and its Empirical Validation", Proc ACM SIGCOMM
                1998.

   [RFC2119]    S. Bradner, "Key Words For Use in RFCs to Indicate
                Requirement Levels", RFC 2119.

   [RFC2140]    J.
Touch, "TCP Control Block Interdependence", RFC 2106 2140, April 1997. 2108 [RFC2581] Allman, M., Paxson, V., and W. Stevens, "TCP 2109 Congestion Control", RFC 2581, April 1999. 2111 [RFC2861] M. Handley, J. Padhye, and S. Floyd, TCP Congestion 2112 Window Validation, RFC2861, June 2000. 2114 [RFC2988] V. Paxson and M. Allman, "Computing TCP's 2115 Retransmission Timer", RFC 2988, November 2000. 2117 [RFC3168] K. Ramakrishnan and S. Floyd, "The Addition of 2118 Explicit Congestion Notification (ECN) to IP", RFC 2119 3168, September 2001. 2121 [RFC3390] Allman, M., Floyd, S., and C. Partridge, "Increasing 2122 TCP's Initial Window", RFC 3390, October 2002. 2124 [RFC3448Err] RFC 3448 Errata, URL 2125 ``http://www.icir.org/tfrc/rfc3448.errata''. 2127 [RFC3540] Wetherall, D., Ely, D., and Spring, N., "Robust ECN 2128 Signaling with Nonces", RFC 3540, Experimental, June 2129 2003 2131 [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram 2132 Congestion Control Protocol (DCCP)", RFC 4340, March 2133 2006. 2135 [RFC4342] Floyd, S., Kohler, E., and J. Padhye, "Profile for 2136 Datagram Congestion Control Protocol (DCCP) 2137 Congestion Control ID 3: TCP-Friendly Rate Control 2138 (TFRC)", RFC 4342, March 2006. 2140 [RFC4828] Floyd, S., and E. Kohler, TCP Friendly Rate Control 2141 (TFRC): the Small-Packet (SP) Variant, RFC 4828, 2142 Experimental, April 2007. 2144 [W00] Widmer, J., "Equation-Based Congestion Control", 2145 Diploma Thesis, University of Mannheim, February 2146 2000. URL "http://www.icir.org/tfrc/". 
Authors' Addresses

   Mark Handley
   Department of Computer Science
   University College London
   Gower Street
   London WC1E 6BT
   UK
   EMail: M.Handley@cs.ucl.ac.uk

   Sally Floyd
   ICSI
   1947 Center St, Suite 600
   Berkeley, CA 94708
   EMail: floyd@icir.org

   Jitendra Padhye
   Microsoft Research
   EMail: padhye@microsoft.com

   Joerg Widmer
   DoCoMo Euro-Labs
   Landsberger Strasse 312
   80687 Munich
   Germany
   EMail: widmer@acm.org

Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
   IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
   WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
   WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE
   ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
   FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.