idnits 2.17.1 draft-ietf-rmt-bb-tfmcc-07.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 16. -- Found old boilerplate from RFC 3978, Section 5.5 on line 1363. ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748. ** The document seems to lack an RFC 3979 Section 5, para. 1 IPR Disclosure Acknowledgement. ** The document seems to lack an RFC 3979 Section 5, para. 2 IPR Disclosure Acknowledgement. ** The document seems to lack an RFC 3979 Section 5, para. 3 IPR Disclosure Invitation. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There is 1 instance of too long lines in the document, the longest one being 1 character in excess of 72. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (9 May 2006) is 6561 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) ** Downref: Normative reference to an Informational RFC: RFC 3048 (ref. '1') ** Downref: Normative reference to an Informational RFC: RFC 3269 (ref. '3') -- Obsolete informational reference (is this intentional?): RFC 3941 (ref. '6') (Obsoleted by RFC 5401) -- Obsolete informational reference (is this intentional?): RFC 2481 (ref. '10') (Obsoleted by RFC 3168) -- Obsolete informational reference (is this intentional?): RFC 1889 (ref. '12') (Obsoleted by RFC 3550) Summary: 10 errors (**), 0 flaws (~~), 2 warnings (==), 7 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Internet Engineering Task Force RMT WG 2 INTERNET-DRAFT Joerg Widmer/DoCoMo Euro-Labs 3 draft-ietf-rmt-bb-tfmcc-07.txt Mark Handley/UCL 5 9 May 2006 6 Expires: November 2006 8 TCP-Friendly Multicast Congestion Control (TFMCC): 9 Protocol Specification 11 Status of this Document 13 By submitting this Internet-Draft, each author represents that any 14 applicable patent or other IPR claims of which he or she is aware have 15 been or will be disclosed, and any of which he or she becomes aware will 16 be disclosed, in accordance with Section 6 of BCP 79. 18 Internet-Drafts are working documents of the Internet Engineering Task 19 Force (IETF), its areas, and its working groups. Note that other groups 20 may also distribute working documents as Internet-Drafts. 22 Internet-Drafts are draft documents valid for a maximum of six months 23 and may be updated, replaced, or obsoleted by other documents at any 24 time. It is inappropriate to use Internet- Drafts as reference material 25 or to cite them other than as "work in progress." 27 The list of current Internet-Drafts can be accessed at 28 http://www.ietf.org/ietf/1id-abstracts.txt 30 The list of Internet-Draft Shadow Directories can be accessed at 31 http://www.ietf.org/shadow.html. 33 This document is a product of the IETF RMT WG. Comments should be 34 addressed to the authors, or the WG's mailing list at rmt@lbl.gov. 36 Abstract 38 This document specifies TCP-Friendly Multicast Congestion 39 Control (TFMCC). TFMCC is a congestion control mechanism for 40 multicast transmissions in a best-effort Internet environment. 42 It is a single-rate congestion control scheme, where the 43 sending rate is adapted to the receiver experiencing the worst 44 network conditions. TFMCC is reasonably fair when competing 45 for bandwidth with TCP flows and has a relatively low 46 variation of throughput over time, making it suitable for 47 applications such as streaming media where a relatively smooth 48 sending rate is of importance. 50 Table of Contents 52 1. Introduction. . . . . . . . . . . . . . . . . . . . . . 4 53 1.1. Terminology. . . . . . . . . . . . . . . . . . . . . 4 54 1.2. Related Documents. . . . . . . . . . . . . . . . . . 5 55 1.3. Environmental Requirements and Considerations. . . . 5 56 2. Protocol Overview . . . . . . . . . . . . . . . . . . . 6 57 2.1. TCP Throughput Equation. . . . . . . . . . . . . . . 7 58 2.2. Packet Contents. . . . . . . . . . . . . . . . . . . 8 59 2.2.1. Sender Packets. . . . . . . . . . . . . . . . . . 9 60 2.2.2. Feedback Packets. . . . . . . . . . . . . . . . . 10 61 3. Data Sender Protocol. . . . . . . . . . . . . . . . . . 11 62 3.1. Sender Initialization. . . . . . . . . . . . . . . . 11 63 3.2. Determining the Maximum RTT. . . . . . . . . . . . . 11 64 3.3. Adjusting the Sending Rate . . . . . . . . . . . . . 12 65 3.4. Controlling Receiver Feedback. . . . . . . . . . . . 13 66 3.5. Assisting Receiver-Side RTT Measurements . . . . . . 14 67 3.6. Slowstart. . . . . . . . . . . . . . . . . . . . . . 15 68 3.7. Scheduling of Packet Transmissions . . . . . . . . . 16 69 4. Data Receiver Protocol. . . . . . . . . . . . . . . . . 17 70 4.1. Receiver Initialization. . . . . . . . . . . . . . . 17 71 4.2. Receiver Leave . . . . . . . . . . . . . . . . . . . 17 72 4.3. Measurement of the Network Conditions. . . . . . . . 18 73 4.3.1. Updating the Loss Event Rate. . . . . . . . . . . 18 74 4.3.2. Basic Round-Trip Time Measurement . . . . . . . . 18 75 4.3.3. One-Way Delay Adjustments . . . . . . . . . . . . 19 76 4.3.4. Receive Rate Measurements . . . . . . . . . . . . 19 77 4.4. Setting the Desired Rate . . . . . . . . . . . . . . 20 78 4.5. Feedback and Feedback Suppression. . . . . . . . . . 20 79 5. Calculation of the Loss Event Rate. . . . . . . . . . . 22 80 5.1. Detection of Lost or Marked Packets. . . . . . . . . 22 81 5.2. Translation from Loss History to Loss Events . . . . 23 82 5.3. Inter-Loss Event Interval. . . . . . . . . . . . . . 24 83 5.4. Average Loss Interval. . . . . . . . . . . . . . . . 24 84 5.5. History Discounting. . . . . . . . . . . . . . . . . 25 85 5.6. Initializing the Loss History after the First 86 Loss Event. . . . . . . . . . . . . . . . . . . . . . . . 28 87 6. Security Considerations . . . . . . . . . . . . . . . . 29 88 7. IANA Considerations . . . . . . . . . . . . . . . . . . 29 89 8. Authors' Addresses. . . . . . . . . . . . . . . . . . . 30 90 9. Acknowledgments . . . . . . . . . . . . . . . . . . . . 30 91 10. References . . . . . . . . . . . . . . . . . . . . . . 30 92 10.1. Normative References. . . . . . . . . . . . . . . . 30 93 10.2. Informative References. . . . . . . . . . . . . . . 30 94 11. Copyright Notice . . . . . . . . . . . . . . . . . . . 31 96 1. Introduction 98 This document specifies TCP-Friendly Multicast Congestion Control 99 (TFMCC) [4]. TFMCC is a source-based, single-rate congestion control 100 scheme that builds upon the unicast TCP-Friendly Rate Control mechanism 101 (TFRC) [5]. TFMCC is stable and responsive under a wide range of 102 network conditions and scales to receiver sets on the order of several 103 thousand receivers. To support scalability, as much congestion control 104 functionality as possible is located at the receivers. Each receiver 105 continuously determines a desired receive rate that is TCP-friendly for 106 the path from the sender to this receiver. Selected receivers then 107 report the rate to the sender in feedback packets. 109 TFMCC is a building block as defined in RFC 3048 [1]. Instead of 110 specifying a complete protocol, this document simply specifies a 111 congestion control mechanism that could be used in a transport protocol 112 such as RTP [12], in an application incorporating end-to-end congestion 113 control at the application level. This document does not discuss packet 114 formats, reliability, or implementation-related issues. 116 TFMCC is designed to be reasonably fair when competing for bandwidth 117 with TCP flows. A multicast flow is ``reasonably fair'' if its sending 118 rate is generally within a factor of two of the sending rate of a TCP 119 flow from the sender to the slowest receiver of the multicast group 120 under the same network conditions. 122 In general, TFMCC has a low variation of throughput, which makes it 123 suitable for applications such as streaming media where a relatively 124 smooth sending rate is of importance. The penalty of having smooth 125 throughput while competing fairly for bandwidth is a reduced 126 responsiveness to changes in available bandwidth. Thus TFMCC should be 127 used when the application has a requirement for smooth throughput, in 128 particular, avoiding halving of the sending rate in response to a single 129 packet drop. For applications that simply need to multicast as much 130 data as possible in as short a time as possible, PGMCC [11] may be more 131 suitable. 133 1.1. Terminology 135 In this document, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL", 136 "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and 137 "OPTIONAL" are to be interpreted as described in BCP 14, RFC 2119 [2] 138 and indicate requirement levels for compliant TFMCC implementations. 140 1.2. Related Documents 142 As described in RFC 3048 [1], TFMCC is a building block that is intended 143 to be used, in conjunction with other building blocks, to help specify a 144 protocol instantiation. It follows the general guidelines provided in 145 RFC 3269 [3]. In particular, TFMCC is a suitable congestion control 146 building block for NACK-Oriented Reliable Multicast (NORM) [6]. 148 1.3. Environmental Requirements and Considerations 150 TFMCC is intended to be a congestion control scheme that can be used in 151 a complete protocol instantiation that delivers objects and streams 152 (both reliable content delivery and streaming of multimedia 153 information). 155 TFMCC is most applicable for sessions that deliver a substantial amount 156 of data, i.e., in length from hundreds of kilobytes to many gigabytes, 157 and whose duration is in the order of tens of seconds or more. 159 TFMCC is intended for multicast delivery. There are currently two 160 models of multicast delivery, the Any-Source Multicast (ASM) model as 161 defined in [7] defined in [8]. TFMCC works with both multicast models, 162 but in a slightly different way. When using ASM, feedback from the 163 receivers is multicast to the sender as well as to all other receivers. 164 Feedback can be either multicast on the same group address used for 165 sending data or on a separate multicast feedback group address. For 166 SSM, the receivers must unicast the feedback directly to the sender. 167 Hence, feedback from a receiver will not be received by other receivers. 169 TFMCC inherently works with all types of networks that allow bi- 170 directional communication, including LANs, WANs, Intranets, the 171 Internet, asymmetric networks, wireless networks, and satellite 172 networks. However, in some network environments varying the sending 173 rate to the receivers may not be advantageous (e.g., for a satellite or 174 wireless network, there may be no mechanism for receivers to effectively 175 reduce their reception rate since there may be a fixed transmission rate 176 allocated to the session). 178 The difference in responsiveness of TFMCC and TCP may result in 179 significant throughput differences in case of a very low bitrate. TFMCC 180 requires an estimate of the loss event rate to calculate a fair sending 181 rate. This estimate may be inaccurate in case TFMCC receives only very 182 few packets per RTT. TFMCC should not be used together with TCP if the 183 capacity of the bottleneck link is less than 30KBit/s (e.g., a very slow 184 modem connection). TFMCC may also achieve a rate that is very different 185 from the average TCP rate in case buffer space at the bottleneck is 186 severely underprovisioned. In particular, TFMCC is less susceptible to 187 small buffer sizes since TFMCC spaces out packets in time, while TCP 188 sends them back-to-back. Thus TCP is much more likely to see a packet 189 loss if buffer space is scarce. 191 TFMCC is designed for applications that use a fixed packet size, and 192 vary their sending rate in packets per second in response to congestion. 193 Some applications, e.g. those using audio, require a fixed interval of 194 time between packets and vary their packet size instead of their packet 195 rate in response to congestion. The congestion control mechanism in 196 this document cannot be used by those applications. 198 2. Protocol Overview 200 TFMCC extends the basic mechanisms of TFRC into the multicast domain. 201 In order to compete fairly with TCP, TFMCC receivers individually 202 measure the prevalent network conditions and calculate a rate that is 203 TCP-friendly on the path from the sender to themselves. The rate is 204 determined using an equation for TCP throughput, which roughly describes 205 TCP's sending rate as a function of the loss event rate, round-trip time 206 (RTT), and packet size. We define a loss event as one or more lost or 207 marked packets from the packets received during one RTT, where a marked 208 packet refers to a congestion indication from Explicit Congestion 209 Notification (ECN) [10]. The sending rate of the multicast transmission 210 is adapted to the receiver experiencing the worst network conditions. 212 Basically, TFMCC's congestion control mechanism works as follows: 214 o Each receiver measures the loss event rate and its RTT to the sender. 216 o Each receiver then uses this information, together with an equation 217 for TCP throughput, to derive a TCP-friendly sending rate. 219 o Through a distributed feedback suppression mechanism, only a subset of 220 the receivers are allowed to give feedback to prevent a feedback 221 implosion at the sender. The feedback mechanism ensures that 222 receivers reporting a low desired transmission rate have a high 223 probability of sending feedback. 225 o Receivers whose feedback is not suppressed report the calculated 226 transmission rate back to the sender in so-called receiver reports. 227 The receiver reports serve two purposes: they inform the sender about 228 the appropriate transmit rate, and they allow the receivers to measure 229 their RTT. 231 o The sender selects the receiver that reports the lowest rate as 232 current limiting receiver (CLR). Whenever feedback with an even lower 233 rate reaches the sender, the corresponding receiver becomes CLR and 234 the sending rate is reduced to match that receiver's calculated rate. 235 The sending rate increases when the CLR reports a calculated rate 236 higher than the current sending rate. 238 The dynamics of TFMCC are sensitive to how the measurements are 239 performed and applied and what feedback suppression mechanism is chosen. 240 We recommend specific mechanisms below to perform and apply these 241 measurements. Other mechanisms are possible, but it is important to 242 understand how the interactions between mechanisms affect the dynamics 243 of TFMCC. 245 2.1. TCP Throughput Equation 247 Any realistic equation giving TCP throughput as a function of loss event 248 rate and RTT should be suitable for use in TFMCC. However, we note that 249 the TCP throughput equation used must reflect TCP's retransmit timeout 250 behavior, as this dominates TCP throughput at higher loss rates. We 251 also note that the assumptions implicit in the throughput equation about 252 the loss event rate parameter have to be a reasonable match to how the 253 loss rate or loss event rate is actually measured. While this match is 254 not perfect for the throughput equation and loss rate measurement 255 mechanisms given below, in practice the assumptions turn out to be close 256 enough. 258 The throughput equation we currently recommend for TFMCC is a slightly 259 simplified version of the throughput equation for Reno TCP from [9]: 261 8 s 262 X = --------------------------------------------------------- (1) 263 R * (sqrt(2*p/3) + (12*sqrt(3*p/8) * p * (1+32*p^2))) 265 where 267 X is the transmit rate in bits/second. 269 s is the packet size in bytes. 271 R is the round-trip time in seconds. 273 p is the loss event rate, between 0.0 and 1.0, of the number of 274 loss events as a fraction of the number of packets transmitted. 276 In future, different TCP equations may be substituted for this equation. 277 The requirement is that the throughput equation be a reasonable 278 approximation of the sending rate of TCP for conformant TCP congestion 279 control. 281 The parameters s (packet size), p (loss event rate) and R (RTT) need to 282 be measured or calculated by a TFMCC implementation. The measurement of 283 R is specified in Section 4.3.2, and the measurement of p is specified 284 in Section 5. The parameter s (packet size) is normally known to an 285 application. This may not be so in two cases: 287 o The packet size naturally varies depending on the data. In this case, 288 although the packet size varies, that variation is not coupled to the 289 transmit rate. It should normally be safe to use an estimate of the 290 mean packet size for s. 292 o The application needs to change the packet size rather than the number 293 of packets per second to perform congestion control. This would 294 normally be the case with packet audio applications where a fixed 295 interval of time needs to be represented by each packet. Such 296 applications need to have a different way of measuring parameters. 298 Currently, TFMCC cannot be used for the second class of applications. 300 2.2. Packet Contents 302 Before specifying the sender and receiver functionality, we describe the 303 congestion control information contained in packets sent by the sender 304 and feedback packets from the receivers. Information from the sender 305 can either be sent in separate congestion control messages or 306 piggybacked onto data packets. If separate congestion control messages 307 are sent at time intervals larger than the time interval between data 308 packets (e.g., once per feedback round), it is necessary to be able to 309 include timestamp information destined for more than one receiver to 310 allow a sufficient number of receivers to measure their RTT. 312 As TFMCC will be used along with a transport protocol, we do not specify 313 packet formats, since these depend on the details of the transport 314 protocol used. The recommended representation of the header fields is 315 given below. Alternatively, if the computational overhead of a floating 316 point representation is prohibitive, fixed point arithmetic can be used 317 at the expense of larger packet headers. 319 2.2.1. Sender Packets 321 Each packet sent by the data sender contains the following information: 323 o A sequence number i. This number is incremented by one for each data 324 packet transmitted. The field must be sufficiently large that it does 325 not wrap causing two different packets with the same sequence number 326 to be in the receiver's recent packet history at the same time. In 327 most cases the sequence number will be supplied by the transport 328 protocol used along with TFMCC. 330 o A suppression rate X_supp in bits/s. Only receivers with a calculated 331 rate lower than the suppression rate are eligible to give feedback, 332 unless their RTT is higher than the maximum RTT described below in 333 which case they are also eligible to give feedback. The suppression 334 rate should be represented as a 12 bit floating point value with 5 335 bits for the unsigned exponent and 7 bits for the unsigned mantissa 336 (to represent rates from 100 bit/s to 400 Gbit/s with an error of less 337 than 1%). 339 o A timestamp ts_i indicating when the packet is sent. The resolution 340 of the timestamp should typically be milliseconds and the timestamp 341 should be an unsigned integer value no less than 16 bits wide. 343 o A receiver ID r and a copy of the timestamp tr_r' = tr_r of that 344 receiver's last report, which allows the receiver to measure its RTT. 345 If there is a delay ts_d between receiving the report from receiver r 346 and sending the data packet, then tr_r' = tr_r + ts_d is included in 347 the packet instead. The receiver ID is described in the next section. 348 The resolution of the timestamp echo should be milliseconds and the 349 timestamp should be an unsigned integer value no less than 16 bits 350 wide. If separate instead of piggybacked congestion control messages 351 are used, the packet needs to contain a list of receiver IDs with 352 corresponding timestamps to allow a sufficient number of receivers to 353 simultaneously measure their RTT. For the default values used for the 354 feedback process this corresponds to a list size on the order of 10 to 355 20 entries. 357 o A flag is_CLR indicating whether the receiver with ID r is the CLR. 359 o A feedback round counter fb_nr. This counter is incremented by the 360 sender at the beginning of a new feedback round to notify the 361 receivers than all feedback for older rounds should be suppressed. 362 The feedback round counter should be at least 4 bits wide. 364 o A maximum RTT value R_max, representing the maximum of the RTTs of all 365 receivers. The RTT should be measured in milliseconds. An 8 bit 366 floating point value with 4 bits for the unsigned exponent and 4 bits 367 for the unsigned mantissa (to represent RTTs from 1 millisecond to 64 368 seconds with an error of ca. 6%) should be used for the 369 representation. 371 2.2.2. Feedback Packets 373 Each feedback packet sent by a data receiver contains the following 374 information: 376 o A unique receiver ID r. In most cases the receiver ID will be 377 supplied by the transport protocol, but it may simply be the IP 378 address of the receiver. 380 o A flag have_RTT indicating whether the receiver has made at least one 381 RTT measurement since it joined the session. 383 o A flag have_loss indicating whether the receiver experienced at least 384 one loss event since it joined the session. 386 o A flag receiver_leave indicating that the receiver will leave the 387 session (and should therefore not be CLR). 389 o A timestamp tr_r indicating when the feedback packet is sent. The 390 representation of the timestamp should be the same as that of the 391 timestamp echo in the data packets. 393 o An echo ts_i' of the timestamp of the last data packet received. If 394 the last packet received at the receiver has sequence number i, then 395 ts_i' = ts_i is included in the feedback. If there is a delay tr_d 396 between receiving that last data packet and sending feedback, then 397 ts_i' = ts_i + tr_d is included in the feedback instead. The 398 representation of the timestamp echo should be the same as that of the 399 timestamp in the data packets. 401 o A feedback round echo fb_nr, reflecting the highest feedback round 402 counter value received so far. The representation of the feedback 403 round echo should be the same as the one used for the feedback round 404 counter in the data packets. 406 o The desired sending rate X_r. This is the rate calculated by the 407 receiver to be TCP-friendly on the path from the sender to this 408 receiver. The representation of the desired sending rate should be 409 the same as that of the suppression rate in the data packets. 411 3. Data Sender Protocol 413 The data sender multicasts a stream of data packets to the data 414 receivers at a controlled rate. Whenever feedback is received, the 415 sender checks if it is necessary to switch CLRs and to readjust the 416 sending rate. 418 The main tasks that have to be provided by a TFMCC sender are: 420 o adjusting the sending rate, 422 o controlling receiver feedback, and 424 o assisting receiver-side RTT measurements. 426 3.1. Sender Initialization 428 At initialization of the sender, the maximum RTT is set to a value that 429 should be larger than the highest RTT to any of the receivers. It 430 should not be smaller than 500 milliseconds for operation in the public 431 Internet. The sending rate X is initialized to 1 packet per maximum 432 RTT. 434 3.2. Determining the Maximum RTT 436 For each feedback packet that arrives at the sender, the sender computes 437 the instantaneous RTT to the receiver as 439 R_r = ts_now - ts_i' 441 where ts_now is the time the feedback packet arrived. Receivers will 442 have adjusted ts_i' for the time interval between receiving the last 443 data packet and sending the corresponding report so that this interval 444 will not be included in R_r. If the actual RTT is smaller than the 445 resolution of the timestamps and ts_now equals ts_i', then R_r is set to 446 the smallest positive RTT value larger than 0 (i.e., 1 millisecond in 447 our case). If the instantaneous RTT is larger than the current maximum 448 RTT, the maximum RTT is increased to that value 450 R_max = R_r 452 Otherwise, if no feedback with a higher instantaneous RTT than the 453 maximum RTT is received during a feedback round (see Section 3.4), the 454 maximum RTT is reduced to 455 R_max = MAX(R_max * 0.9, R_peak) 457 where R_peak is the peak receiver RTT measured during the feedback 458 round. 460 The maximum RTT is mainly used for feedback suppression among receivers 461 with heterogeneous RTTs. Feedback suppression is closely coupled to the 462 sending of data packets and for this reason, the maximum RTT must not 463 decrease below the maximum time interval between consecutive data 464 packets: 466 R_max = max(R_max, 8s/X + ts_gran) 468 where ts_gran is the granularity of the sender's system clock (see 469 Section 3.7). 471 3.3. Adjusting the Sending Rate 473 When a feedback packet from receiver r arrives at the sender, the sender 474 has to check whether it is necessary to adjust the transmission rate and 475 to switch to a new CLR. 477 How the rate is adjusted depends on the desired rate X_r of the receiver 478 report. We distinguish four cases: 480 1 If no CLR is present, receiver r becomes the current limiting 481 receiver. The sending rate X is directly set to X_r, so long as 482 this would result in a rate increase of less than 8s/R_max bits/s 483 (i.e., 1 packet per R_max). Otherwise X is gradually increased to 484 X_r at an increase rate of no more than 8s/R_max bits/s every R_max 485 seconds. 487 2 If receiver r is not the CLR but a CLR is present, then receiver r 488 becomes the current limiting receiver if X_r is less than the 489 current sending rate X and the receiver_leave flag of that 490 receiver's report is not set. Furthermore, the sending rate is 491 reduced to X_r. 493 3 If receiver r is not the CLR but a CLR is present and the 494 receiver_leave flag of the CLR's last report was set, then receiver 495 r becomes the current limiting receiver. However, if X_r > X, the 496 sending rate is not increased to X_r for the duration of a feedback 497 round to allow other (lower rate) receivers to give feedback and be 498 selected as CLR. 500 4 If receiver r is the CLR, the sending rate is set to the minimum of 501 X_r and X + 8s/R_max bits/s. 503 If the receiver has not yet measured its RTT but already experienced 504 packet loss (indicated by the corresponding flags in the receiver 505 report), the receiver report will include a desired rate that is based 506 on the maximum RTT rather than the actual RTT to that receiver. In this 507 case, the sender adjusts the desired rate using its measurement of the 508 instantaneous RTT R_r to that receiver: 510 X_r' = X_r * R_max / R_r 512 X_r' is then used instead of X_r to detect whether to switch to a new 513 CLR. 515 If the TFMCC sender receives no reports from the CLR for 4 RTTs, the 516 sending rate is cut in half unless the CLR was selected less than 10 517 RTTs ago. In addition, if the sender receives no reports from the CLR 518 for at least 10 RTTs, it assumes that the CLR crashed or left the group. 519 A new CLR is selected from the feedback that subsequently arrives at the 520 sender, and we increase as in case 3 above. 522 If no new CLR can be selected (i.e., in the absence of any feedback from 523 any of the receivers) it is necessary to further reduce the sending 524 rate. For every 10 consecutive RTTs without feedback, the sending rate 525 is cut in half. The rate is at most reduced to one packet every 8 526 seconds. 528 Note that when receivers stop receiving data packets, they will stop 529 sending feedback. This eventually causes the sending rate to be reduced 530 in the case of network failure. If the network subsequently recovers, a 531 linear increase to the calculated rate of the CLR will occur at 8s/R_max 532 bits/s every R_max. 534 3.4. Controlling Receiver Feedback 536 The receivers allowed to send a receiver report are determined in so- 537 called feedback rounds. Feedback rounds have a duration T of six times 538 the maximum RTT. In case the multicast model is ASM, i.e., receiver 539 feedback is multicast to the whole group, the duration of a feedback 540 round may be reduced to four times the maximum RTT. 542 Only receivers wishing to report a rate that is lower than the 543 suppression rate X_supp, or those with a higher RTT than R_max may send 544 feedback. At the beginning of each feedback round, X_supp is set to the 545 highest possible value that can be represented. When feedback arrives 546 at the sender over the course of a feedback round, X_supp is decreased 547 such that more and more feedback is suppressed towards the end of the 548 round. How receiver feedback is spread out over the feedback round is 549 discussed in Section 4.5. 551 Whenever non-CLR feedback for the current round arrives at the sender, 552 X_supp is reduced to 554 X_supp = (1-g) * X_r 556 if X_supp > X_r. Feedback that causes the corresponding receiver to be 557 selected as CLR, but was from a non-CLR receiver at the time of sending 558 also contributes to the feedback suppression. Note that X_r must not be 559 adjusted by the sender to reflect the receiver's real RTT in case X_r 560 was calculated using the maximum RTT, as is done for setting the sending 561 rate (Section 3.3), otherwise a feedback implosion is possible. The 562 parameter g determines to what extent higher rate feedback can suppress 563 lower rate feedback. This mechanism guarantees, that the lowest 564 calculated rate reported lies within a factor of g of the actual lowest 565 calculated rate of the receiver set (see [14]). A value of g of 0.1 is 566 recommended. 568 To allow receivers to suppress their feedback, the sender's suppression 569 rate needs to be updated whenever feedback is received. This 570 suppression rate has to be communicated to the receivers in a timely 571 manner, either by including it in the data packet header or, if separate 572 congestion control messages are used, by sending a message with the 573 suppression rate whenever the rate changes significantly (i.e., when it 574 is reduced to less than (1-g) times the previously advertised 575 suppression rate). 577 After a time span of T, the feedback round ends if non-CLR feedback was 578 received during that time. Otherwise, the feedback round ends as soon 579 as the first non-CLR feedback message arrives at the sender but at most 580 after 2T. The feedback round counter is incremented by one and the 581 suppression rate X_supp is reset to the highest representable value. 582 The feedback round counter restarts with round 0 after a wrap-around. 584 3.5. Assisting Receiver-Side RTT Measurements 586 Receivers measure their RTT by sending a timestamp with a receiver 587 report, which is echoed by the sender. If congestion control 588 information is piggybacked onto data packets, usually only one receiver 589 ID and timestamp can be included. If multiple feedback messages from 590 different receivers arrive at the sender during the time interval 591 between two data packets, the sender has to decide which receiver to 592 allow to measure RTT. The same applies if separate congestion control 593 messages allow to echo multiple receiver timestamps simultaneously but 594 the number of receivers that gave feedback since the last congestion 595 control message exceeds the list size. 597 The sender's timestamp echoes are prioritized in the following order: 599 1. a new CLR (after a change of CLR's) or a CLR without any previous 600 RTT measurements 602 2. receivers without any previous RTT measurements in the order of the 603 feedback round echo of the corresponding receiver report (i.e., 604 older feedback first) 606 3. non-CLR receivers with previous RTT measurements, again in 607 ascending order of the feedback round echo of the report 609 4. the CLR 611 Ties are broken in favor of the receiver with the lowest reported rate. 613 It is necessary to account for the time that elapses between receiving a 614 report and sending the next data packet. This time needs to be deducted 615 from the RTT and thus has to be added to the receiver's timestamp value. 617 Whenever no feedback packets arrive in the interval between two data 618 packets, the CLR's last timestamp, adjusted by the appropriate offset, 619 is echoed. When the number of packets per RTT is so low that all 620 packets carry a non-CLR receiver's timestamp, the CLR's timestamp and ID 621 are included in a data packet at least once per feedback round. 623 3.6. Slowstart 625 TFMCC uses a slowstart mechanism to quickly approach its fair bandwidth 626 share at the start of a session. During slowstart, the sending rate 627 increases exponentially. The rate increase is limited to the minimum of 628 the rates included in the receiver reports and receivers report twice 629 the rate at which they currently receive data. As in normal congestion 630 control mode, the receiver with the smallest reported rate becomes CLR. 631 Since a receiver can never receive data at a rate higher than its link 632 bandwidth, this effectively limits the overshoot to twice this 633 bandwidth. In case the resulting increase over R_max is less than 634 8s/R_max bits/s, the sender may choose to increase the rate by up to 635 8s/R_max bits/s every R_max. The current sending rate is gradually 636 adjusted to the target rate reported in the receiver reports over the 637 course of a RTT. Slowstart is terminated as soon as any one of the 638 receivers experiences its first packet loss. Since that receiver's 639 calculated rate will be lower than the current sending rate, the 640 receiver will be selected as CLR. 642 During slowstart, the upper bound on the rate increase of 8s/R_max 643 bits/s every RTT does not apply. Only after the TFMCC sender receives 644 the first report with the have_loss flag set is the rate increase 645 limited in this way. 647 Slowstart may also be used after the sender has been idle for some time, 648 to quickly reach the previous sending rate. When the sender stops 649 sending data packets, it records the current sending rate X' = X. Every 650 10 RTTs, the allowed sending rate will be halved due to lack of receiver 651 feedback as specified in Section 3.3. This halving may take place 652 multiple times. When the sender resumes, it may perform a slowstart 653 from the current allowed rate up to the recorded rate X'. Slowstart 654 ends after the first packet loss by any of the receivers or as soon as 655 X' is reached. 657 To this end, receivers have to clear the have_loss flag after 10 RTTs 658 without data packets as specified in Section 4.3.1. The have_loss flag 659 is only used during slowstart. Therefore, clearing the flag has no 660 effect if no packets arrived due to network partitioning or packet loss. 662 3.7. Scheduling of Packet Transmissions 664 As TFMCC is rate-based, and as operating systems typically cannot 665 schedule events precisely, it is necessary to be opportunistic about 666 sending data packets so that the correct average rate is maintained 667 despite the coarse-grain or irregular scheduling of the operating 668 system. Thus a typical sending loop will calculate the correct inter- 669 packet interval, ts_ipi, as follows: 671 ts_ipi = 8s/X 673 When a sender first starts sending at time t_0, it calculates ts_ipi, 674 and calculates a nominal send time t_1 = t_0 + ts_ipi for packet 1. 675 When the application becomes idle, it checks the current time, ts_now, 676 and then requests re-scheduling after (ts_ipi - (ts_now - t_0)) seconds. 677 When the application is re-scheduled, it checks the current time, 678 ts_now, again. If (ts_now > t_1 - delta) then packet 1 is sent (see 679 below for delta). 681 Now a new ts_ipi may be calculated, and used to calculate a nominal send 682 time t_2 for packet 2: t_2 = t_1 + ts_ipi. The process then repeats, 683 with each successive packet's send time being calculated from the 684 nominal send time of the previous packet. Note that the actual send 685 time ts_i and not the nominal send time is included as timestamp in the 686 packet header. 688 In some cases, when the nominal send time, t_i, of the next packet is 689 calculated, it may already be the case that ts_now > t_i - delta. In 690 such a case the packet should be sent immediately. Thus if the 691 operating system has coarse timer granularity and the transmit rate is 692 high, then TFMCC may send short bursts of several packets separated by 693 intervals of the OS timer granularity. 695 The parameter delta is to allow a degree of flexibility in the send time 696 of a packet. If the operating system has a scheduling timer granularity 697 of ts_gran seconds, then delta would typically be set to: 699 delta = min(ts_ipi/2, ts_gran/2) 701 ts_gran is 10 milliseconds on many Unix systems. If ts_gran is not 702 known, a value of 10 milliseconds can be safely assumed. 704 4. Data Receiver Protocol 706 Receivers measure the current network conditions, namely RTT and loss 707 event rate, and use this information to calculate a rate that is fair to 708 competing traffic. The rate is then communicated to the sender in 709 receiver reports. Due to the potentially large number of receivers, it 710 is undesirable that all receivers send reports, especially not at the 711 same time. 713 In the description of the receiver functionality, we will first address 714 how the receivers measure the network parameters and then discuss the 715 feedback process. 717 4.1. Receiver Initialization 719 The receiver is initialized when it receives the first data packet. The 720 RTT is set to the maximum RTT value contained in the data packet. This 721 initial value is used as the receiver's RTT until the first real RTT 722 measurement is made. The loss event rate is initialized to 0. Also, 723 the flags receiver_leave, have_RTT, and have_loss are cleared. 725 4.2. Receiver Leave 727 A receiver that sends feedback but wishes to leave the TFMCC session 728 within the next feedback round may indicate the pending leave by setting 729 the receiver_leave flag in its report. If the leaving receiver is the 730 CLR, the receiver_leave flag should be set for all the reports within 731 the feedback round before the leave takes effect. 733 4.3. Measurement of the Network Conditions 735 Receivers have to update their estimate of the network parameters with 736 each new data packet they receive. 738 4.3.1. Updating the Loss Event Rate 740 When a data packet is received, the receiver adds the packet to the 741 packet history. It then recalculates the new value of the loss event 742 rate p. The loss event rate measurement mechanism is described 743 separately in Section 5. 745 When a loss event is detected, the flag have_loss is set. In case no 746 data packets are received for 10 consecutive RTTs, the flag is cleared 747 to allow the sender to slowstart. It is set again when new data packets 748 arrive and a loss event is detected. 750 4.3.2. Basic Round-Trip Time Measurement 752 When a receiver gets a data packet that carries the receiver's own ID in 753 the r field, the receiver updates its RTT estimate. 755 1. The current RTT is calculated as: 757 R_sample = tr_now - tr_r' 759 where tr_now is the time the data packet arrives at the receiver 760 and tr_r' is the receiver report timestamp echoed in the data 761 packet. If the actual RTT is smaller than the resolution of the 762 timestamps and tr_now equals tr_r', then R_sample is set to the 763 smallest positive RTT value larger than 0 (i.e., 1 millisecond in 764 our case). 766 2. The smoothed RTT estimate R is updated: 768 If no feedback has been received before 769 R = R_sample 770 Else 771 R = q*R + (1-q)*R_sample 773 A filter parameter q of 0.5 is recommended for non-CLR receivers. 774 The CLR performs RTT measurements much more frequently and hence 775 should use a higher filter value. We recommend using q=0.9. Note 776 that TFMCC is not sensitive to the precise value for the filter 777 constant. 779 Optionally, sender-based instead of receiver-based RTT measurements may 780 be used. The sender already determines the RTT to a receiver from the 781 receiver's echo of the sender's own timestamp for the calculation of the 782 maximum RTT. For sender-based RTT measurements, this RTT measurement 783 needs to be communicated to the receiver. Instead of including an echo 784 of the receiver's timestamp, the sender includes the receiver's RTT in 785 the next data packet, using the prioritization rules described in 786 Section 3.5. 788 To simplify sender operation, smoothing of RTT samples as described 789 above should still be done at the receiver. 791 4.3.3. One-Way Delay Adjustments 793 When a RTT measurement is performed, the receiver also determines the 794 one-way delay D_r from itself to the sender: 796 D_r = tr_r' - ts_i 798 where ts_i and tr_r' are the timestamp and receiver report timestamp 799 echo contained in the data packet. With each new data packet j, the 800 receiver can now calculate an updated RTT estimate as: 802 R' = max(D_r + tr_now - ts_j, 1 millisecond) 804 In between RTT measurements, the updated R' is used instead of the 805 smoothed RTT R. Like the RTT samples, R' must be strictly positive. 806 When a new measurement is made, all interim one-way delay measurements 807 are discarded (i.e., the smoothed RTT is updated according to Section 808 4.3.2 without taking the interim one-way delay adjustments into 809 account). 811 For the one-way delay measurements, the clocks of sender and receivers 812 need not be synchronized. Clock skew will cancel itself out when both 813 one-way measurements are added to form a RTT estimate, as long as clock 814 drift between real RTT measurements is negligible. 816 The same one-way delay adjustments should be applied to the RTT supplied 817 by the sender when using sender-based RTT measurements. 819 4.3.4. Receive Rate Measurements 821 When a receiver has not experienced any loss events, it cannot calculate 822 a TCP-friendly rate to include in the receiver reports. Instead, the 823 receiver measures the current receive rate and sets the desired rate X_r 824 to twice the receive rate. 826 The receive rate in bits/s is measured as the number of bits received 827 over the last k RTTs, taking into account the IP and transport packet 828 headers, but excluding the link-layer packet headers. A value for k 829 between 2 and 4 is recommended. 831 4.4. Setting the Desired Rate 833 When a receiver measures a non-zero loss event rate, it calculates the 834 desired rate using Equation (1). In case no RTT measurement is 835 available yet, the maximum RTT is used instead of the receiver's RTT. 836 The desired rate X_r is updated whenever the loss event rate or the RTT 837 changes. 839 A receiver may decide to not report desired rates that are below 1 840 packet per 8 seconds, since a sender is very slow to recover from such 841 low sending rates. In this case, the receiver reports a desired rate of 842 1 packet per 8 seconds. However, it must leave the multicast group if 843 for more than 120 seconds the calculated rate falls below the reported 844 rate and the current sending rate is higher than the receiver's 845 calculated rate. 847 As mentioned above, calculation of the desired rate is not possible 848 before the receiver experiences the first loss event. In that case, 849 twice the rate at which data is received is included in the receiver 850 reports as X_r to allow the sender to slowstart as described in Section 851 3.6. This is also done when the sender resumes sending data packets 852 after the have_loss flag was cleared due to the sender being idle. 854 4.5. Feedback and Feedback Suppression 856 Let fb_nr be the highest feedback round counter value received by a 857 receiver. When a new data packet arrives with a higher feedback round 858 counter than fb_nr, a new feedback round begins and fb_nr is updated. 859 Outstanding feedback for the old round is canceled. In case a feedback 860 number with a value that is more than half the feedback number space 861 lower than fb_nr is received, the receiver assumes that the feedback 862 round counter wrapped and also cancels the feedback timer and updates 863 fb_nr. 865 The CLR sends its feedback independently from all the other receivers 866 once per RTT. Its feedback does not suppress other feedback and cannot 867 be suppressed by other receiver's feedback. 869 Non-CLR receivers set a feedback timer at the beginning of a feedback 870 round. Using an exponentially weighted random timer mechanism, the 871 feedback timer is set to expire after 872 t = max(T * (1 + log(x)/log(N)), 0) 874 where 876 x is a random variable uniformly distributed in (0,1], 878 T is the duration of a feedback round (i.e., 6 * R_max), 880 N is an estimated upper bound on the number of receivers. 882 N is a constant specific to the TFMCC protocol. Since TFMCC scales to 883 up to thousands of receivers, setting N to 10,000 for all receivers (and 884 limiting the TFMCC session to at most 10,000 receivers) is recommended. 886 A feedback packet is sent when the feedback timer expires, unless the 887 timer is canceled beforehand. When the multicast model is ASM, feedback 888 is multicast to the whole group, otherwise the feedback is unicast to 889 the sender. The feedback packet includes the calculated rate valid at 890 the time the feedback packet is sent (not the rate at the point of time 891 when the feedback timer is set). The copy of the timestamp ts_i of the 892 last data packet received, which is included in the feedback packet, 893 needs to be adjusted by the time interval between receiving the data 894 packet and sending the report to allow the sender to correctly infer the 895 instantaneous RTT (i.e., that time interval has to be added to the 896 timestamp value). 898 The timer is canceled if a data packet with a lower suppression rate 899 than the receiver's calculated rate and a higher or equal maximum RTT 900 than the receiver's RTT is received. Likewise, a data packet indicating 901 the beginning of a new feedback round cancels all feedback for older 902 rounds. In case of ASM, the timer is also canceled if a feedback packet 903 from another non-CLR receiver reporting a lower rate is received. 905 The feedback suppression process is complicated by the fact that the 906 calculated rates of the receivers will change during a feedback round. 907 If the calculated rates decrease rapidly for all receivers, feedback 908 suppression can no longer prevent a feedback implosion since earlier 909 feedback will always report a higher rate than current feedback. To 910 make the feedback suppression mechanism robust in the face of changing 911 rates, it is necessary to introduce X_fbr, the calculated rate of a 912 receiver at the beginning of a feedback round. A receiver needs to 913 suppress its feedback not only when the suppression rate is less than 914 the receiver's current calculated rate but also in the case that the 915 suppression rate falls below X_fbr. 917 When the maximum RTT changes significantly during one feedback round, it 918 is necessary to reschedule the feedback timer in proportion to the 919 change. 921 t = t * R_max / R_max' 923 where R_max is the new maximum RTT and R_max' is the previous maximum 924 RTT. The same considerations hold, when the last data packets were 925 received more than a time interval of R_max ago. In this case, it is 926 necessary to add the difference of the inter-packet gap and the maximum 927 RTT to the feedback time to prevent a feedback implosion (e.g., in case 928 the sender crashed). 930 t = t + max(tr_now - tr_i - R_max, 0) 932 where tr_i is the time when the last data packet arrived at the 933 receiver. 935 More details on the characteristics of the feedback suppression 936 mechanism can be found in [14] and [4]. 938 5. Calculation of the Loss Event Rate 940 Obtaining an accurate and stable measurement of the loss event rate is 941 of primary importance for TFMCC. Loss rate measurement is performed at 942 the receiver, based on the detection of lost or marked packets from the 943 sequence numbers of arriving packets. 945 5.1. Detection of Lost or Marked Packets 947 TFMCC assumes that all packets contain a sequence number that is 948 incremented by one for each packet that is sent. For the purposes of 949 this specification, we require that if a lost packet is retransmitted, 950 the retransmission is given a new sequence number that is the latest in 951 the transmission sequence, and not the same sequence number as the 952 packet that was lost. If a transport protocol has the requirement that 953 it must retransmit with the original sequence number, then the transport 954 protocol designer must figure out how to distinguish delayed from 955 retransmitted packets and how to detect lost retransmissions. 957 The receivers each maintain a data structure that keeps track of which 958 packets have arrived and which are missing. For the purposes of 959 specification, we assume that the data structure consists of a list of 960 packets that have arrived along with the timestamp when each packet was 961 received. In practice this data structure will normally be stored in a 962 more compact representation, but this is implementation-specific. 964 The loss of a packet is detected by the arrival of at least three 965 packets with a higher sequence number than the lost packet. The 966 requirement for three subsequent packets is the same as with TCP, and is 967 to make TFMCC more robust in the presence of reordering. In contrast to 968 TCP, if a packet arrives late (after 3 subsequent packets arrived) at a 969 receiver, the late packet can fill the hole in the reception record, and 970 the receiver can recalculate the loss event rate. Future versions of 971 TFMCC might make the requirement for three subsequent packets adaptive 972 based on experienced packet reordering, but we do not specify such a 973 mechanism here. 975 For an ECN-capable connection, a marked packet is detected as a 976 congestion event as soon as it arrives, without having to wait for the 977 arrival of subsequent packets. 979 5.2. Translation from Loss History to Loss Events 981 TFMCC requires that the loss event rate be robust to several consecutive 982 packets lost where those packets are part of the same loss event. This 983 is similar to TCP, which (typically) only performs one halving of the 984 congestion window during any single RTT. Thus the receivers needs to 985 map the packet loss history into a loss event record, where a loss event 986 is one or more packets lost in a RTT. 988 To determine whether a lost or marked packet should start a new loss 989 event, or be counted as part of an existing loss event, we need to 990 compare the sequence numbers and timestamps of the packets that arrived 991 at the receiver. For a marked packet S_new, its reception time T_new 992 can be noted directly. For a lost packet, we can interpolate to infer 993 the nominal "arrival time". Assume: 995 S_loss is the sequence number of a lost packet. 997 S_before is the sequence number of the last packet to arrive with 998 sequence number before S_loss. 1000 S_after is the sequence number of the first packet to arrive with 1001 sequence number after S_loss. 1003 T_before is the reception time of S_before. 1005 T_after is the reception time of S_after. 1007 Note that T_before can either be before or after T_after due to 1008 reordering. 1010 For a lost packet S_loss, we can interpolate its nominal "arrival time" 1011 at the receiver from the arrival times of S_before and S_after. Thus 1012 T_loss = T_before + ( (T_after - T_before) 1013 * (S_loss - S_before)/(S_after - S_before) ); 1015 Note that if the sequence space wrapped between S_before and S_after, 1016 then the sequence numbers must be modified to take this into account 1017 before performing the calculation. If the largest possible sequence 1018 number is S_max, and S_before > S_after, then modifying each sequence 1019 number S by S' = (S + (S_max + 1)/2) mod (S_max + 1) would normally be 1020 sufficient. 1022 If the lost packet S_old was determined to have started the previous 1023 loss event, and we have just determined that S_new has been lost, then 1024 we interpolate the nominal arrival times of S_old and S_new, called 1025 T_old and T_new respectively. 1027 If T_old + R >= T_new, then S_new is part of the existing loss event. 1028 Otherwise S_new is the first packet of a new loss event. 1030 5.3. Inter-Loss Event Interval 1032 If a loss interval, A, is determined to have started with packet 1033 sequence number S_A and the next loss interval, B, started with packet 1034 sequence number S_B, then the number of packets in loss interval A is 1035 given by (S_B - S_A). 1037 5.4. Average Loss Interval 1039 To calculate the loss event rate p, we first calculate the average loss 1040 interval. This is done using a filter that weights the n most recent 1041 loss event intervals in such a way that the measured loss event rate 1042 changes smoothly. 1044 Weights w_0 to w_(n-1) are calculated as: 1046 If (i < n/2) 1047 w_i = 1; 1048 Else 1049 w_i = 1 - (i - (n/2 - 1))/(n/2 + 1); 1051 Thus if n=8, the values of w_0 to w_7 are: 1053 1.0, 1.0, 1.0, 1.0, 0.8, 0.6, 0.4, 0.2 1055 The value n for the number of loss intervals used in calculating the 1056 loss event rate determines TFMCC's speed in responding to changes in the 1057 level of congestion. As currently specified, TFMCC should not be used 1058 for values of n significantly greater than 8, for traffic that might 1059 compete in the global Internet with TCP. At the very least, safe 1060 operation with values of n greater than 8 would require a slight change 1061 to TFMCC's mechanisms to include a more severe response to two or more 1062 round-trip times with heavy packet loss. 1064 When calculating the average loss interval we need to decide whether to 1065 include the interval since the most recent packet loss event. We only 1066 do this if it is sufficiently large to increase the average loss 1067 interval. 1069 Thus if the most recent loss intervals are I_0 to I_n, with I_0 being 1070 the interval since the most recent loss event, then we calculate the 1071 average loss interval I_mean as: 1073 I_tot0 = 0; 1074 I_tot1 = 0; 1075 W_tot = 0; 1076 for (i = 0 to n-1) { 1077 I_tot0 = I_tot0 + (I_i * w_i); 1078 W_tot = W_tot + w_i; 1079 } 1080 for (i = 1 to n) { 1081 I_tot1 = I_tot1 + (I_i * w_(i-1)); 1082 } 1083 I_tot = max(I_tot0, I_tot1); 1084 I_mean = I_tot/W_tot; 1086 The loss event rate, p is simply: 1088 p = 1 / I_mean; 1090 5.5. History Discounting 1092 As described in Section 5.4, the most recent loss interval is only 1093 assigned 4/(3*n) of the total weight in calculating the average loss 1094 interval, regardless of the size of the most recent loss interval. This 1095 section describes an optional history discounting mechanism that allows 1096 the TFMCC receivers to adjust the weights, concentrating more of the 1097 relative weight on the most recent loss interval, when the most recent 1098 loss interval is more than twice as large as the computed average loss 1099 interval. 1101 To carry out history discounting, we associate a discount factor DF_i 1102 with each loss interval L_i, where each discount factor is a floating 1103 point number. The discount array maintains the cumulative history of 1104 discounting for each loss interval. At the beginning, the values of 1105 DF_i in the discount array are initialized to 1: 1107 for (i = 0 to n) { 1108 DF_i = 1; 1109 } 1111 History discounting also uses a general discount factor DF, also a 1112 floating point number, that is also initialized to 1. First we show how 1113 the discount factors are used in calculating the average loss interval, 1114 and then we describe later in this section how the discount factors are 1115 modified over time. 1117 As described in Section 5.4 the average loss interval is calculated 1118 using the n previous loss intervals I_1, ..., I_n, and the interval I_0 1119 that represents the number of packets received since the last loss 1120 event. The computation of the average loss interval using the discount 1121 factors is a simple modification of the procedure in Section 5.4, as 1122 follows: 1124 I_tot0 = I_0 * w_0 1125 I_tot1 = 0; 1126 W_tot0 = w_0 1127 W_tot1 = 0; 1128 for (i = 1 to n-1) { 1129 I_tot0 = I_tot0 + (I_i * w_i * DF_i * DF); 1130 W_tot0 = W_tot0 + w_i * DF_i * DF; 1131 } 1132 for (i = 1 to n) { 1133 I_tot1 = I_tot1 + (I_i * w_(i-1) * DF_i); 1134 W_tot1 = W_tot1 + w_(i-1) * DF_i; 1135 } 1136 p = min(W_tot0/I_tot0, W_tot1/I_tot1); 1138 The general discounting factor, DF is updated on every packet arrival as 1139 follows. First, a receiver computes the weighted average I_mean of the 1140 loss intervals I_1, ..., I_n: 1142 I_tot = 0; 1143 W_tot = 0; 1144 for (i = 1 to n) { 1145 W_tot = w_(i-1) * DF_i; 1146 I_tot = I_tot + (I_i * w_(i-1) * DF_i); 1147 } 1148 I_mean = I_tot / W_tot; 1150 This weighted average I_mean is compared to I_0, the number of packets 1151 received since the last loss event. If I_0 is greater than twice 1152 I_mean, then the new loss interval is considerably larger than the old 1153 ones, and the general discount factor DF is updated to decrease the 1154 relative weight on the older intervals, as follows: 1156 if (I_0 > 2 * I_mean) { 1157 DF = 2 * I_mean/I_0; 1158 if (DF < THRESHOLD) 1159 DF = THRESHOLD; 1160 } else 1161 DF = 1; 1163 A nonzero value for THRESHOLD ensures that older loss intervals from an 1164 earlier time of high congestion are not discounted entirely. We 1165 recommend a THRESHOLD of 0.5. Note that with each new packet arrival, 1166 I_0 will increase further, and the discount factor DF will be updated. 1168 When a new loss event occurs, the current interval shifts from I_0 to 1169 I_1, loss interval I_i shifts to interval I_(i+1), and the loss interval 1170 I_n is forgotten. The previous discount factor DF has to be 1171 incorporated into the discount array. Because DF_i carries the discount 1172 factor associated with loss interval I_i, the DF_i array has to be 1173 shifted as well. This is done as follows: 1175 for (i = 1 to n) { 1176 DF_i = DF * DF_i; 1177 } 1178 for (i = n-1 to 0 step -1) { 1179 DF_(i+1) = DF_i; 1180 } 1181 I_0 = 1; 1182 DF_0 = 1; 1183 DF = 1; 1185 This completes the description of the optional history discounting 1186 mechanism. We emphasize that this is an optional mechanism whose sole 1187 purpose is to allow TFMCC to response somewhat more quickly to the 1188 sudden absence of congestion, as represented by a long current loss 1189 interval. 1191 5.6. Initializing the Loss History after the First Loss Event 1193 The number of packets received before the first loss event usually does 1194 not reflect the current loss event rate. When the first loss event 1195 occurs, a TFMCC receiver assumes that the correct data rate is the rate 1196 at which data was received during the last RTT when the loss occurred. 1197 Instead of initializing the first loss interval to the number of packets 1198 sent until the first loss event, the TFMCC receiver calculates the loss 1199 interval that would be required to produce the receive rate X_recv, and 1200 uses this synthetic loss interval l_0 to seed the loss history 1201 mechanism. 1203 The initial loss interval is calculated by inverting a simplified 1204 version of the TCP Equation (1). 1206 8s 1207 X_recv = sqrt(3/2) * ----------------- 1208 R * sqrt(1/l_0) 1210 X_recv * R 1211 ==> l_0 = (----------------)^2 1212 sqrt(3/2) * 8s 1214 The resulting initial loss interval is too small at higher loss rates 1215 compared to using the more accurate Equation (1), which leads to a more 1216 conservative initial loss event rate. 1218 If a receiver still uses the initial RTT R_max instead of its real RTT, 1219 the initial loss interval is too large in case the initial RTT is higher 1220 than the actual RTT. As a consequence, the receiver will calculate a 1221 too high desired rate when the first RTT measurement R is made and the 1222 initial loss interval is still in the loss history. The receiver has to 1223 adjust l_0 as follows: 1225 l_0 = l_0 * (R/R_max)^2 1227 No action needs to be taken when the first RTT measurement is made after 1228 the initial loss interval left the loss history. 1230 6. Security Considerations 1232 TFMCC is not a transport protocol in its own right, but a congestion 1233 control mechanism that is intended to be used in conjunction with a 1234 transport protocol. Therefore security primarily needs to be considered 1235 in the context of a specific transport protocol and its authentication 1236 mechanisms. 1238 Congestion control mechanisms can potentially be exploited to create 1239 denial of service. This may occur through spoofed feedback. Thus any 1240 transport protocol that uses TFMCC should take care to ensure that 1241 feedback is only accepted from valid receivers of the data. The precise 1242 mechanism to achieve this will however depend on the transport protocol 1243 itself. 1245 Congestion control mechanisms may potentially be manipulated by a greedy 1246 receiver that wishes to receive more than its fair share of network 1247 bandwidth. However, in TFMCC a receiver can only influence the sending 1248 rate if it is the CLR and thus has the lowest calculated rate of all 1249 receivers. If the calculated rate is then manipulated such that it 1250 exceeds the calculated rate of the second to lowest receiver, it will 1251 cease to be CLR. A greedy receiver can only significantly increase then 1252 transmission rate if it is the only participant in the session. If such 1253 scenarios are of concern, possible defenses against such a receiver 1254 would normally include some form of nonce that the receiver must feed 1255 back to the sender to prove receipt. However, the details of such a 1256 nonce would depend on the transport protocol, and in particular on 1257 whether the transport protocol is reliable or unreliable. 1259 It is possible that a receiver sends feedback claiming it has a very low 1260 calculated rate. This will reduce the rate of the multicast session and 1261 might render it useless but obviously cannot hurt the network itself. 1263 We expect that protocols incorporating ECN with TFMCC will also want to 1264 incorporate feedback from the receiver to the sender using the ECN nonce 1265 [13]. The ECN nonce is a modification to ECN that protects the sender 1266 from the accidental or malicious concealment of marked packets. Again, 1267 the details of such a nonce would depend on the transport protocol, and 1268 are not addressed in this document. 1270 7. IANA Considerations 1272 There are no IANA actions required for this document. 1274 8. Authors' Addresses 1276 Joerg Widmer 1277 DoCoMo Euro-Labs 1278 Landsberger Str. 312, Munich, Germany 1279 widmer@acm.org 1281 Mark Handley 1282 UCL (University College London) 1283 Gower Street, London WC1E 6BT, UK 1284 m.handley@cs.ucl.ac.uk 1286 9. Acknowledgments 1288 We would like to acknowledge feedback and discussions on equation-based 1289 congestion control with a wide range of people, including members of the 1290 Reliable Multicast Research Group, the Reliable Multicast Transport 1291 Working Group, and the End-to-End Research Group. We would particularly 1292 like to thank Brian Adamson, Mark Pullen, Fei Zhao, and Magnus 1293 Westerlund for feedback on earlier versions of this document. 1295 10. References 1297 10.1. Normative References 1299 [1] B. Whetten, L. Vicisano, R. Kermode, M. Handley, S. Floyd, and M. 1300 Luby, "Reliable Multicast Transport Building Blocks for One-to-Many 1301 Bulk-Data Transfer", RFC 3048, January 2001. 1303 [2] S. Bradner, "Key words for use in RFCs to Indicate Requirement 1304 Levels", BCP 14, RFC 2119, March 1997. 1306 [3] R. Kermode and L. Vicisano, "Author Guidelines for Reliable 1307 Multicast Transport (RMT) Building Blocks and Protocol 1308 Instantiation documents", RFC 3269, April 2002. 1310 10.2. Informative References 1312 [4] J. Widmer and M. Handley, "Extending Equation-Based Congestion 1313 Control to Multicast Applications", Proc ACM Sigcomm 2001, San 1314 Diego, August 2001. 1316 [5] S. Floyd, M. Handley, J. Padhye, and J. Widmer, "Equation-Based 1317 Congestion Control for Unicast Applications", Proc ACM SIGCOMM 1318 2000, Stockholm, August 2000. 1320 [6] B. Adamson, C. Bormann, M. Handley, and J. Macker, "Negative- 1321 Acknowledgment (NACK)-Oriented Reliable Multicast (NORM) Building 1322 Blocks", RFC 3941, November 2004. 1324 [7] S. Deering, "Host Extensions for IP Multicasting", STD 5, RFC 1112, 1325 August 1989. 1327 [8] H. W. Holbrook, "A Channel Model for Multicast," Ph.D. 1328 Dissertation, Stanford University, Department of Computer Science, 1329 Stanford, California, August 2001. 1331 [9] J. Padhye, V. Firoiu, D. Towsley, and J. Kurose, "Modeling TCP 1332 Throughput: A Simple Model and its Empirical Validation", Proc ACM 1333 SIGCOMM 1998. 1335 [10] K. Ramakrishnan and S. Floyd, "A Proposal to add Explicit 1336 Congestion Notification (ECN) to IP", RFC 2481, January 1999. 1338 [11] L. Rizzo, "pgmcc: a TCP-friendly single-rate multicast congestion 1339 control scheme", Proc ACM Sigcomm 2000, Stockholm, August 2000. 1341 [12] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: A 1342 Transport Protocol for Real-Time Applications", RFC 1889, January 1343 1996. 1345 [13] N. Spring, D. Wetherall, and D. Ely, "Robust Explicit Congestion 1346 Notification (ECN) Signaling with Nonces", RFC 3540, June 2003. 1348 [14] J. Widmer and T. Fuhrmann, "Extremum Feedback for Very Large 1349 Multicast Groups", Proc NGC 2001, London, November 2001. 1351 11. Copyright Notice 1353 Copyright (C) The Internet Society (2006). This document is subject to 1354 the rights, licenses and restrictions contained in BCP 78, and except as 1355 set forth therein, the authors retain all their rights. 1357 This document and the information contained herein are provided on an 1358 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR 1359 IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET 1360 ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, 1361 INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE 1362 INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 1363 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.