idnits 2.17.1 draft-ietf-rmt-bb-tfmcc-06.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 16. -- Found old boilerplate from RFC 3978, Section 5.5 on line 1317. ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748. ** The document seems to lack an RFC 3979 Section 5, para. 1 IPR Disclosure Acknowledgement. ** The document seems to lack an RFC 3979 Section 5, para. 2 IPR Disclosure Acknowledgement. ** The document seems to lack an RFC 3979 Section 5, para. 3 IPR Disclosure Invitation. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** There is 1 instance of too long lines in the document, the longest one being 1 character in excess of 72. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (3 March 2006) is 6628 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- No information found for draft-ietf-rmt-norm-bb - is the name correct? -- Possible downref: Normative reference to a draft: ref. '1' -- Possible downref: Non-RFC (?) normative reference: ref. '2' -- Possible downref: Non-RFC (?) normative reference: ref. '3' -- Possible downref: Non-RFC (?) normative reference: ref. '4' ** Obsolete normative reference: RFC 2481 (ref. '5') (Obsoleted by RFC 3168) -- Possible downref: Non-RFC (?) normative reference: ref. '6' ** Obsolete normative reference: RFC 1889 (ref. '7') (Obsoleted by RFC 3550) ** Downref: Normative reference to an Historic RFC: RFC 3540 (ref. '8') -- Possible downref: Non-RFC (?) normative reference: ref. '9' -- Possible downref: Non-RFC (?) normative reference: ref. '10' Summary: 12 errors (**), 0 flaws (~~), 2 warnings (==), 12 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Internet Engineering Task Force RMT WG 2 INTERNET-DRAFT Joerg Widmer/DoCoMo Euro-Labs 3 draft-ietf-rmt-bb-tfmcc-06.txt Mark Handley/UCL 5 3 March 2006 6 Expires: September 2006 8 TCP-Friendly Multicast Congestion Control (TFMCC): 9 Protocol Specification 11 Status of this Document 13 By submitting this Internet-Draft, each author represents that any 14 applicable patent or other IPR claims of which he or she is aware have 15 been or will be disclosed, and any of which he or she becomes aware will 16 be disclosed, in accordance with Section 6 of BCP 79. 18 Internet-Drafts are working documents of the Internet Engineering Task 19 Force (IETF), its areas, and its working groups. Note that other groups 20 may also distribute working documents as Internet-Drafts. 22 Internet-Drafts are draft documents valid for a maximum of six months 23 and may be updated, replaced, or obsoleted by other documents at any 24 time. It is inappropriate to use Internet- Drafts as reference material 25 or to cite them other than as "work in progress." 27 The list of current Internet-Drafts can be accessed at 28 http://www.ietf.org/ietf/1id-abstracts.txt 30 The list of Internet-Draft Shadow Directories can be accessed at 31 http://www.ietf.org/shadow.html. 33 This document is a product of the IETF RMT WG. Comments should be 34 addressed to the authors, or the WG's mailing list at rmt@lbl.gov. 36 Abstract 38 This document specifies TCP-Friendly Multicast Congestion 39 Control (TFMCC). TFMCC is a congestion control mechanism for 40 multicast transmissions in a best-effort Internet environment. 42 It is a single-rate congestion control scheme, where the 43 sending rate is adapted to the receiver experiencing the worst 44 network conditions. TFMCC is reasonably fair when competing 45 for bandwidth with TCP flows and has a relatively low 46 variation of throughput over time, making it suitable for 47 applications such as streaming media where a relatively smooth 48 sending rate is of importance. 50 Table of Contents 52 1. Introduction. . . . . . . . . . . . . . . . . . . . . . 4 53 1.1. Terminology. . . . . . . . . . . . . . . . . . . . . 4 54 1.2. Related Documents. . . . . . . . . . . . . . . . . . 5 55 1.3. Environmental Requirements and Considerations. . . . 5 56 2. Protocol Overview . . . . . . . . . . . . . . . . . . . 6 57 2.1. TCP Throughput Equation. . . . . . . . . . . . . . . 7 58 2.2. Packet Contents. . . . . . . . . . . . . . . . . . . 8 59 2.2.1. Sender Packets. . . . . . . . . . . . . . . . . . 8 60 2.2.2. Feedback Packets. . . . . . . . . . . . . . . . . 10 61 3. Data Sender Protocol. . . . . . . . . . . . . . . . . . 10 62 3.1. Sender Initialization. . . . . . . . . . . . . . . . 11 63 3.2. Determining the Maximum RTT. . . . . . . . . . . . . 11 64 3.3. Adjusting the Sending Rate . . . . . . . . . . . . . 12 65 3.4. Controlling Receiver Feedback. . . . . . . . . . . . 13 66 3.5. Assisting Receiver-Side RTT Measurements . . . . . . 14 67 3.6. Slowstart. . . . . . . . . . . . . . . . . . . . . . 15 68 3.7. Scheduling of Packet Transmissions . . . . . . . . . 16 69 4. Data Receiver Protocol. . . . . . . . . . . . . . . . . 16 70 4.1. Receiver Initialization. . . . . . . . . . . . . . . 17 71 4.2. Receiver Leave . . . . . . . . . . . . . . . . . . . 17 72 4.3. Measurement of the Network Conditions. . . . . . . . 17 73 4.3.1. Updating the Loss Event Rate. . . . . . . . . . . 17 74 4.3.2. Basic Round-Trip Time Measurement . . . . . . . . 17 75 4.3.3. One-Way Delay Adjustments . . . . . . . . . . . . 18 76 4.3.4. Receive Rate Measurements . . . . . . . . . . . . 19 77 4.4. Setting the Desired Rate . . . . . . . . . . . . . . 19 78 4.5. Feedback and Feedback Suppression. . . . . . . . . . 20 79 5. Calculation of the Loss Event Rate. . . . . . . . . . . 21 80 5.1. Detection of Lost or Marked Packets. . . . . . . . . 22 81 5.2. Translation from Loss History to Loss Events . . . . 22 82 5.3. Inter-Loss Event Interval. . . . . . . . . . . . . . 23 83 5.4. Average Loss Interval. . . . . . . . . . . . . . . . 24 84 5.5. History Discounting. . . . . . . . . . . . . . . . . 25 85 5.6. Initializing the Loss History after the First 86 Loss Event. . . . . . . . . . . . . . . . . . . . . . . . 27 87 6. Security Considerations . . . . . . . . . . . . . . . . 28 88 7. IANA Considerations . . . . . . . . . . . . . . . . . . 29 89 8. Authors' Addresses. . . . . . . . . . . . . . . . . . . 29 90 9. Acknowledgments . . . . . . . . . . . . . . . . . . . . 29 91 10. References . . . . . . . . . . . . . . . . . . . . . . 30 92 11. Copyright Notice . . . . . . . . . . . . . . . . . . . 31 94 1. Introduction 96 This document specifies TCP-Friendly Multicast Congestion Control 97 (TFMCC) [10]. TFMCC is a source-based, single-rate congestion control 98 scheme that builds upon the unicast TCP-Friendly Rate Control mechanism 99 (TFRC) [2]. TFMCC is stable and responsive under a wide range of network 100 conditions and scales to receiver sets on the order of several thousand 101 receivers. To support scalability, as much congestion control 102 functionality as possible is located at the receivers. Each receiver 103 continuously determines a desired receive rate that is TCP-friendly for 104 the path from the sender to this receiver. Selected receivers then 105 report the rate to the sender in feedback packets. 107 TFMCC is a building block as defined in RFC 3048. Instead of specifying 108 a complete protocol, this document simply specifies a congestion control 109 mechanism that could be used in a transport protocol such as RTP [7], in 110 an application incorporating end-to-end congestion control at the 111 application level. This document does not discuss packet formats, 112 reliability, or implementation-related issues. 114 TFMCC is designed to be reasonably fair when competing for bandwidth 115 with TCP flows. A multicast flow is ``reasonably fair'' if its sending 116 rate is generally within a factor of two of the sending rate of a TCP 117 flow from the sender to the slowest receiver of the multicast group 118 under the same network conditions. 120 In general, TFMCC has a low variation of throughput, which makes it 121 suitable for applications such as streaming media where a relatively 122 smooth sending rate is of importance. The penalty of having smooth 123 throughput while competing fairly for bandwidth is a reduced 124 responsiveness to changes in available bandwidth. Thus TFMCC should be 125 used when the application has a requirement for smooth throughput, in 126 particular, avoiding halving of the sending rate in response to a single 127 packet drop. For applications that simply need to multicast as much 128 data as possible in as short a time as possible, PGMCC [6] may be more 129 suitable. 131 1.1. Terminology 133 In this document, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL", 134 "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and 135 "OPTIONAL" are to be interpreted as described in RFC 2119 and indicate 136 requirement levels for compliant TFMCC implementations. 138 1.2. Related Documents 140 As described in RFC 3048, TFMCC is a building block that is intended to 141 be used, in conjunction with other building blocks, to help specify a 142 protocol instantiation. In particular, TFMCC is a suitable congestion 143 control building block for NACK-Oriented Reliable Multicast (NORM) [1]. 145 1.3. Environmental Requirements and Considerations 147 TFMCC is intended to be a congestion control scheme that can be used in 148 a complete protocol instantiation that delivers objects and streams 149 (both reliable content delivery and streaming of multimedia 150 information). 152 TFMCC is most applicable for sessions that deliver a substantial amount 153 of data, i.e., in length from hundreds of kilobytes to many gigabytes, 154 and whose duration is in the order of tens of seconds or more. 156 TFMCC is intended for multicast delivery. There are currently two 157 models of multicast delivery, the Any-Source Multicast (ASM) model as 158 defined in RFC 1112 and the Source-Specific Multicast (SSM) model as 159 defined in [3]. TFMCC works with both multicast models, but in a 160 slightly different way. When using ASM, feedback from the receivers is 161 multicast to the sender as well as to all other receivers. Feedback can 162 be either multicast on the same group address used for sending data or 163 on a separate multicast feedback group address. For SSM, the receivers 164 must unicast the feedback directly to the sender. Hence, feedback from 165 a receiver will not be received by other receivers. 167 TFMCC inherently works with all types of networks, including LANs, WANs, 168 Intranets, the Internet, asymmetric networks, wireless networks, and 169 satellite networks. However, in some network environments varying the 170 sending rate to the receivers may not be advantageous (e.g., for a 171 satellite or wireless network, there may be no mechanism for receivers 172 to effectively reduce their reception rate since there may be a fixed 173 transmission rate allocated to the session). 175 The difference in responsiveness of TFMCC and TCP may result in 176 significant throughput differences in case of a very low bitrate. TFMCC 177 requires an estimate of the loss event rate to calculate a fair sending 178 rate. This estimate may be inaccurate in case TFMCC receives only very 179 few packets per RTT. TFMCC should not be used together with TCP if the 180 capacity of the bottleneck link is less than 30KBit/s (e.g., a very slow 181 modem connection). TFMCC may also achieve a rate that is very different 182 from the average TCP rate in case buffer space at the bottleneck is 183 severely underprovisioned. In particular, TFMCC is less susceptible to 184 small buffer sizes since TFMCC spaces out packets in time, while TCP 185 sends them back-to-back. Thus TCP is much more likely to see a packet 186 loss if buffer space is scarce. 188 TFMCC is designed for applications that use a fixed packet size, and 189 vary their sending rate in packets per second in response to congestion. 190 Some audio applications require a fixed interval of time between packets 191 and vary their packet size instead of their packet rate in response to 192 congestion. The congestion control mechanism in this document cannot be 193 used by those applications. 195 2. Protocol Overview 197 TFMCC extends the basic mechanisms of TFRC into the multicast domain. 198 In order to compete fairly with TCP, TFMCC receivers individually 199 measure the prevalent network conditions and calculate a rate that is 200 TCP-friendly on the path from the sender to themselves. The rate is 201 determined using an equation for TCP throughput, which roughly describes 202 TCP's sending rate as a function of the loss event rate, round-trip time 203 (RTT), and packet size. We define a loss event as one or more lost or 204 marked packets from the packets received during one RTT, where a marked 205 packet refers to a congestion indication from Explicit Congestion 206 Notification (ECN) [5]. The sending rate of the multicast transmission 207 is adapted to the receiver experiencing the worst network conditions. 209 Basically, TFMCC's congestion control mechanism works as follows: 211 o Each receiver measures the loss event rate and its RTT to the sender. 213 o Each receiver then uses this information, together with an equation 214 for TCP throughput, to derive a TCP-friendly sending rate. 216 o Through a distributed feedback suppression mechanism, only a subset of 217 the receivers are allowed to give feedback to prevent a feedback 218 implosion at the sender. The feedback mechanism ensures that 219 receivers reporting a low desired transmission rate have a high 220 probability of sending feedback. 222 o Receivers whose feedback is not suppressed report the calculated 223 transmission rate back to the sender in so-called receiver reports. 224 The receiver reports serve two purposes: they inform the sender about 225 the appropriate transmit rate, and they allow the receivers to measure 226 their RTT. 228 o The sender selects the receiver that reports the lowest rate as 229 current limiting receiver (CLR). Whenever feedback with an even lower 230 rate reaches the sender, the corresponding receiver becomes CLR and 231 the sending rate is reduced to match that receiver's calculated rate. 233 The sending rate increases when the CLR reports a calculated rate 234 higher than the current sending rate. 236 The dynamics of TFMCC are sensitive to how the measurements are 237 performed and applied and what feedback suppression mechanism is chosen. 238 We recommend specific mechanisms below to perform and apply these 239 measurements. Other mechanisms are possible, but it is important to 240 understand how the interactions between mechanisms affect the dynamics 241 of TFMCC. 243 2.1. TCP Throughput Equation 245 Any realistic equation giving TCP throughput as a function of loss event 246 rate and RTT should be suitable for use in TFMCC. However, we note that 247 the TCP throughput equation used must reflect TCP's retransmit timeout 248 behavior, as this dominates TCP throughput at higher loss rates. We 249 also note that the assumptions implicit in the throughput equation about 250 the loss event rate parameter have to be a reasonable match to how the 251 loss rate or loss event rate is actually measured. While this match is 252 not perfect for the throughput equation and loss rate measurement 253 mechanisms given below, in practice the assumptions turn out to be close 254 enough. 256 The throughput equation we currently recommend for TFMCC is a slightly 257 simplified version of the throughput equation for Reno TCP from [4]: 259 8 s 260 X = --------------------------------------------------------- (1) 261 R * (sqrt(2*p/3) + (12*sqrt(3*p/8) * p * (1+32*p^2))) 263 where 265 X is the transmit rate in bits/second. 267 s is the packet size in bytes. 269 R is the round-trip time in seconds. 271 p is the loss event rate, between 0.0 and 1.0, of the number of 272 loss events as a fraction of the number of packets transmitted. 274 In future, different TCP equations may be substituted for this equation. 275 The requirement is that the throughput equation be a reasonable 276 approximation of the sending rate of TCP for conformant TCP congestion 277 control. 279 The parameters s (packet size), p (loss event rate) and R (RTT) need to 280 be measured or calculated by a TFMCC implementation. The measurement of 281 R is specified in Section 4.3.2, and the measurement of p is specified 282 in Section 5. The parameter s (packet size) is normally known to an 283 application. This may not be so in two cases: 285 o The packet size naturally varies depending on the data. In this case, 286 although the packet size varies, that variation is not coupled to the 287 transmit rate. It should normally be safe to use an estimate of the 288 mean packet size for s. 290 o The application needs to change the packet size rather than the number 291 of packets per second to perform congestion control. This would 292 normally be the case with packet audio applications where a fixed 293 interval of time needs to be represented by each packet. Such 294 applications need to have a different way of measuring parameters. 296 Currently, TFMCC cannot be used for the second class of applications. 298 2.2. Packet Contents 300 Before specifying the sender and receiver functionality, we describe the 301 congestion control information contained in packets sent by the sender 302 and feedback packets from the receivers. Information from the sender 303 can either be sent in separate congestion control messages or 304 piggybacked onto data packets. If separate congestion control messages 305 are sent at time intervals larger than the time interval between data 306 packets (e.g., once per feedback round), it is necessary to be able to 307 include timestamp information destined for more than one receiver to 308 allow a sufficient number of receivers to measure their RTT. 310 As TFMCC will be used along with a transport protocol, we do not specify 311 packet formats, since these depend on the details of the transport 312 protocol used. The recommended representation of the header fields is 313 given below. Alternatively, if the computational overhead of a floating 314 point representation is prohibitive, fixed point arithmetic can be used 315 at the expense of larger packet headers. 317 2.2.1. Sender Packets 319 Each packet sent by the data sender contains the following information: 321 o A sequence number i. This number is incremented by one for each data 322 packet transmitted. The field must be sufficiently large that it does 323 not wrap causing two different packets with the same sequence number 324 to be in the receiver's recent packet history at the same time. In 325 most cases the sequence number will be supplied by the transport 326 protocol used along with TFMCC. 328 o A suppression rate X_supp in bits/s. Only receivers with a calculated 329 rate lower than the suppression rate are eligible to give feedback, 330 unless their RTT is higher than the maximum RTT described below in 331 which case they are also eligible to give feedback. The suppression 332 rate should be represented as a 12 bit floating point value with 5 333 bits for the unsigned exponent and 7 bits for the unsigned mantissa 334 (to represent rates from 100 bit/s to 400 Gbit/s with an error of less 335 than 1%). 337 o A timestamp ts_i indicating when the packet is sent. The resolution 338 of the timestamp should typically be milliseconds and the timestamp 339 should be an unsigned integer value no less than 16 bits wide. 341 o A receiver ID r and a copy of the timestamp ts_r of that receiver's 342 last report, which allows the receiver to measure its RTT. The 343 receiver ID is described in the next section. The resolution of the 344 timestamp echo should be milliseconds and the timestamp should be an 345 unsigned integer value no less than 16 bits wide. If separate instead 346 of piggybacked congestion control messages are used, the packet needs 347 to contain a list of receiver IDs with corresponding timestamps to 348 allow a sufficient number of receivers to simultaneously measure their 349 RTT. For the default values used for the feedback process this 350 corresponds to a list size on the order of 10 to 20 entries. 352 o A flag is_CLR indicating whether the receiver with ID r is the CLR. 354 o A feedback round counter fb_nr. This counter is incremented by the 355 sender at the beginning of a new feedback round to notify the 356 receivers than all feedback for older rounds should be suppressed. 357 The feedback round counter should be at least 4 bits wide. 359 o A maximum RTT value R_max, representing the maximum of the RTTs of all 360 receivers. The RTT should be measured in milliseconds. An 8 bit 361 floating point value with 4 bits for the unsigned exponent and 4 bits 362 for the unsigned mantissa (to represent RTTs from 1 millisecond to 64 363 seconds with an error of ca. 6%) should be used for the 364 representation. 366 2.2.2. Feedback Packets 368 Each feedback packet sent by a data receiver contains the following 369 information: 371 o A unique receiver ID r. In most cases the receiver ID will be 372 supplied by the transport protocol, but it may simply be the IP 373 address of the receiver. 375 o A flag have_RTT indicating whether the receiver has made at least one 376 RTT measurement since it joined the session. 378 o A flag have_loss indicating whether the receiver experienced at least 379 one loss event since it joined the session. 381 o A flag receiver_leave indicating that the receiver will leave the 382 session (and should therefore not be CLR). 384 o A timestamp ts_r indicating when the feedback packet is sent. The 385 representation of the timestamp should be the same as that of the 386 timestamp echo in the data packets. 388 o An echo of the timestamp of the last data packet received. If the 389 last packet received at the receiver has sequence number i, then ts_i 390 is included in the feedback. If there is a delay t_d between 391 receiving that last data packet and sending feedback, then ts_i' = 392 ts_i + t_d is included in the feedback instead of ts_i. The 393 representation of the timestamp echo should be the same as that of the 394 timestamp in the data packets. 396 o A feedback round echo fb_nr, reflecting the highest feedback round 397 counter value received so far. The representation of the feedback 398 round echo should be the same as the one used for the feedback round 399 counter in the data packets. 401 o The desired sending rate X_r. This is the rate calculated by the 402 receiver to be TCP-friendly on the path from the sender to this 403 receiver. The representation of the desired sending rate should be 404 the same as that of the suppression rate in the data packets. 406 3. Data Sender Protocol 408 The data sender multicasts a stream of data packets to the data 409 receivers at a controlled rate. Whenever feedback is received, the 410 sender checks if it is necessary to switch CLRs and to readjust the 411 sending rate. 413 The main tasks that have to be provided by a TFMCC sender are: 415 o adjusting the sending rate, 417 o controlling receiver feedback, and 419 o assisting receiver-side RTT measurements. 421 3.1. Sender Initialization 423 At initialization of the sender, the maximum RTT is set to a value that 424 should be larger than the highest RTT to any of the receivers. It 425 should not be smaller than 500 milliseconds for operation in the public 426 Internet. The sending rate X is initialized to 1 packet per maximum 427 RTT. 429 3.2. Determining the Maximum RTT 431 For each feedback packet that arrives at the sender, the sender computes 432 the instantaneous RTT to the receiver as 434 R_r = t_now - ts_i 436 where t_now is the time the feedback packet arrived. Receivers will 437 have adjusted ts_i for the time interval between receiving the last data 438 packet and sending the corresponding report so that this interval will 439 not be included in R_r. If the actual RTT is smaller than the 440 resolution of the timestamps and t_now equals ts_i, then R_r is set to 441 the smallest positive RTT value larger than 0 (i.e., 1 millisecond in 442 our case). If the instantaneous RTT is larger than the current maximum 443 RTT, the maximum RTT is increased to that value 445 R_max = R_r 447 Otherwise, if no feedback with a higher instantaneous RTT than the 448 maximum RTT is received during a feedback round (see Section 3.4), the 449 maximum RTT is reduced to 451 R_max = MAX(R_max * 0.9, R_peak) 453 where R_peak is the peak receiver RTT measured during the feedback 454 round. 456 The maximum RTT is mainly used for feedback suppression among receivers 457 with heterogeneous RTTs. Feedback suppression is closely coupled to the 458 sending of data packets and for this reason, the maximum RTT must not 459 decrease below the maximum time interval between consecutive data 460 packets: 462 R_max = max(R_max, s/X + t_gran) 464 where t_gran is the granularity of the sender's system clock (see 465 Section 3.7). 467 3.3. Adjusting the Sending Rate 469 When a feedback packet from receiver r arrives at the sender, the sender 470 has to check whether it is necessary to adjust the transmission rate and 471 to switch to a new CLR. 473 How the rate is adjusted depends on the desired rate X_r of the receiver 474 report. We distinguish four cases: 476 1 If no CLR is present, receiver r becomes the current limiting 477 receiver. The sending rate X is directly set to X_r, so long as 478 this would result in a rate increase of less than 8s/R_max bits/s 479 (i.e., 1 packet per R_max). Otherwise X is gradually increased to 480 X_r at an increase rate of no more than 8s/R_max bits/s every R_max 481 seconds. 483 2 If receiver r is not the CLR but a CLR is present, then receiver r 484 becomes the current limiting receiver if X_r is less than the 485 current sending rate X and the receiver_leave flag of that 486 receiver's report is not set. Furthermore, the sending rate is 487 reduced to X_r. 489 3 If receiver r is not the CLR but a CLR is present and the 490 receiver_leave flag of the CLR's last report was set, then receiver 491 r becomes the current limiting receiver. However, if X_r > X, the 492 sending rate is not increased to X_r for the duration of a feedback 493 round to allow other (lower rate) receivers to give feedback and be 494 selected as CLR. 496 4 If receiver r is the CLR, the sending rate is set to the minimum of 497 X_r and X + 8s/R_max bits/s. 499 If the receiver has not yet measured its RTT but already experienced 500 packet loss (indicated by the corresponding flags in the receiver 501 report), the receiver report will include a desired rate that is based 502 on the maximum RTT rather than the actual RTT to that receiver. In this 503 case, the sender adjusts the desired rate using its measurement of the 504 instantaneous RTT R_r to that receiver: 506 X_r' = X_r * R_max / R_r 508 X_r' is then used instead of X_r to detect whether to switch to a new 509 CLR. 511 If the TFMCC sender receives no reports from the CLR for 4 RTTs, the 512 sending rate is cut in half unless the CLR was selected less than 10 513 RTTs ago. In addition, if the sender receives no reports from the CLR 514 for at least 10 RTTs, it assumes that the CLR crashed or left the group. 515 A new CLR is selected from the feedback that subsequently arrives at the 516 sender, and we increase as in case 3 above. 518 If no new CLR can be selected (i.e., in the absence of any feedback from 519 any of the receivers) it is necessary to further reduce the sending 520 rate. For every 10 consecutive RTTs without feedback, the sending rate 521 is cut in half. The rate is at most reduced to one packet every 8 522 seconds. 524 Note that when receivers stop receiving data packets, they will stop 525 sending feedback. This eventually causes the sending rate to be reduced 526 in the case of network failure. If the network subsequently recovers, a 527 linear increase to the calculated rate of the CLR will occur at 8s/R_max 528 bits/s every R_max. 530 3.4. Controlling Receiver Feedback 532 The receivers allowed to send a receiver report are determined in so- 533 called feedback rounds. Feedback rounds have a duration T of six times 534 the maximum RTT. In case the multicast model is ASM, i.e., receiver 535 feedback is multicast to the whole group, the duration of a feedback 536 round may be reduced to four times the maximum RTT. 538 Only receivers wishing to report a rate that is lower than the 539 suppression rate X_supp, or those with a higher RTT than R_max may send 540 feedback. At the beginning of each feedback round, X_supp is set to the 541 highest possible value that can be represented. When feedback arrives 542 at the sender over the course of a feedback round, X_supp is decreased 543 such that more and more feedback is suppressed towards the end of the 544 round. How receiver feedback is spread out over the feedback round is 545 discussed in Section 4.5. 547 Whenever non-CLR feedback for the current round arrives at the sender, 548 X_supp is reduced to 550 X_supp = (1-g) * X_r 552 if X_supp > X_r. Feedback that causes the corresponding receiver to be 553 selected as CLR, but was from a non-CLR receiver at the time of sending 554 also contributes to the feedback suppression. Note that X_r must not be 555 adjusted by the sender to reflect the receiver's real RTT in case X_r 556 was calculated using the maximum RTT, as is done for setting the sending 557 rate (Section 3.3), otherwise a feedback implosion is possible. The 558 parameter g determines to what extent higher rate feedback can suppress 559 lower rate feedback. This mechanism guarantees, that the lowest 560 calculated rate reported lies within a factor of g of the actual lowest 561 calculated rate of the receiver set (see [9]). A value of g of 0.1 is 562 recommended. 564 To allow receivers to suppress their feedback, the sender's suppression 565 rate needs to be updated whenever feedback is received. This 566 suppression rate has to be communicated to the receivers in a timely 567 manner, either by including it in the data packet header or, if separate 568 congestion control messages are used, by sending a message with the 569 suppression rate whenever the rate changes significantly (i.e., when it 570 is reduced to less than (1-g) times the previously advertised 571 suppression rate). 573 After a time span of T, the feedback round ends if non-CLR feedback was 574 received during that time. Otherwise, the feedback round ends as soon 575 as the first non-CLR feedback message arrives at the sender but at most 576 after 2T. The feedback round counter is incremented by one and the 577 suppression rate X_supp is reset to the highest representable value. 578 The feedback round counter restarts with round 0 after a wrap-around. 580 3.5. Assisting Receiver-Side RTT Measurements 582 Receivers measure their RTT by sending a timestamp with a receiver 583 report, which is echoed by the sender. If congestion control 584 information is piggybacked onto data packets, usually only one receiver 585 ID and timestamp can be included. If multiple feedback messages from 586 different receivers arrive at the sender during the time interval 587 between two data packets, the sender has to decide which receiver to 588 allow to measure RTT. The same applies if separate congestion control 589 messages allow to echo multiple receiver timestamps simultaneously but 590 the number of receivers that gave feedback since the last congestion 591 control message exceeds the list size. 593 The sender's timestamp echoes are prioritized in the following order: 595 1. a new CLR (after a change of CLR's) or a CLR without any previous 596 RTT measurements 598 2. receivers without any previous RTT measurements in the order of the 599 feedback round echo of the corresponding receiver report (i.e., 600 older feedback first) 602 3. non-CLR receivers with previous RTT measurements, again in 603 ascending order of the feedback round echo of the report 605 4. the CLR 607 Ties are broken in favor of the receiver with the lowest reported rate. 609 It is necessary to account for the time that elapses between receiving a 610 report and sending the next data packet. This time needs to be deducted 611 from the RTT and thus has to be added to the receiver's timestamp value. 613 Whenever no feedback packets arrive in the interval between two data 614 packets, the CLR's last timestamp, adjusted by the appropriate offset, 615 is echoed. When the number of packets per RTT is so low that all 616 packets carry a non-CLR receiver's timestamp, the CLR's timestamp and ID 617 are included in a data packet at least once per feedback round. 619 3.6. Slowstart 621 TFMCC uses a slowstart mechanism to quickly approach its fair bandwidth 622 share at the start of a session. During slowstart, the sending rate 623 increases exponentially. The rate increase is limited to the minimum of 624 the rates included in the receiver reports and receivers report twice 625 the rate at which they currently receive data. As in normal congestion 626 control mode, the receiver with the smallest reported rate becomes CLR. 627 Since a receiver can never receive data at a rate higher than its link 628 bandwidth, this effectively limits the overshoot to twice this 629 bandwidth. In case the resulting increase over R_max is less than 630 8s/R_max bits/s, the sender may choose to increase the rate by up to 631 8s/R_max bits/s every R_max. The current sending rate is gradually 632 adjusted to the target rate reported in the receiver reports over the 633 course of a RTT. Slowstart is terminated as soon as any one of the 634 receivers experiences its first packet loss. Since that receiver's 635 calculated rate will be lower than the current sending rate, the 636 receiver will be selected as CLR. 638 During slowstart, the upper bound on the rate increase of 8s/R_max 639 bits/s every RTT does not apply. Only after the TFMCC sender receives 640 the first report with the have_loss flag set is the rate increase 641 limited in this way. 643 3.7. Scheduling of Packet Transmissions 645 As TFMCC is rate-based, and as operating systems typically cannot 646 schedule events precisely, it is necessary to be opportunistic about 647 sending data packets so that the correct average rate is maintained 648 despite the coarse-grain or irregular scheduling of the operating 649 system. Thus a typical sending loop will calculate the correct inter- 650 packet interval, t_ipi, as follows: 652 t_ipi = s/X 654 When a sender first starts sending at time t_0, it calculates t_ipi, and 655 calculates a nominal send time t_1 = t_0 + t_ipi for packet 1. When the 656 application becomes idle, it checks the current time, t_now, and then 657 requests re-scheduling after (t_ipi - (t_now - t_0)) seconds. When the 658 application is re-scheduled, it checks the current time, t_now, again. 659 If (t_now > t_1 - delta) then packet 1 is sent (see below for delta). 661 Now a new t_ipi may be calculated, and used to calculate a nominal send 662 time t_2 for packet 2: t2 = t_1 + t_ipi. The process then repeats, with 663 each successive packet's send time being calculated from the nominal 664 send time of the previous packet. 666 In some cases, when the nominal send time, t_i, of the next packet is 667 calculated, it may already be the case that t_now > t_i - delta. In 668 such a case the packet should be sent immediately. Thus if the 669 operating system has coarse timer granularity and the transmit rate is 670 high, then TFMCC may send short bursts of several packets separated by 671 intervals of the OS timer granularity. 673 The parameter delta is to allow a degree of flexibility in the send time 674 of a packet. If the operating system has a scheduling timer granularity 675 of t_gran seconds, then delta would typically be set to: 677 delta = min(t_ipi/2, t_gran/2) 679 t_gran is 10 milliseconds on many Unix systems. If t_gran is not known, 680 a value of 10 milliseconds can be safely assumed. 682 4. Data Receiver Protocol 684 Receivers measure the current network conditions, namely RTT and loss 685 event rate, and use this information to calculate a rate that is fair to 686 competing traffic. The rate is then communicated to the sender in 687 receiver reports. Due to the potentially large number of receivers, it 688 is undesirable that all receivers send reports, especially not at the 689 same time. 691 In the description of the receiver functionality, we will first address 692 how the receivers measure the network parameters and then discuss the 693 feedback process. 695 4.1. Receiver Initialization 697 The receiver is initialized when it receives the first data packet. The 698 RTT is set to the maximum RTT value contained in the data packet. This 699 initial value is used as the receiver's RTT until the first real RTT 700 measurement is made. The loss event rate is initialized to 0. 702 4.2. Receiver Leave 704 A receiver that sends feedback but wishes to leave the TFMCC session 705 within the next feedback round may indicate the pending leave by setting 706 the receiver_leave flag in its report. If the leaving receiver is the 707 CLR, the receiver_leave flag should be set for all the reports within 708 the feedback round before the leave takes effect. 710 4.3. Measurement of the Network Conditions 712 Receivers have to update their estimate of the network parameters with 713 each new data packet they receive. 715 4.3.1. Updating the Loss Event Rate 717 When a data packet is received, the receiver adds the packet to the 718 packet history. It then recalculates the new value of the loss event 719 rate p. The loss event rate measurement mechanism is described 720 separately in Section 5. 722 4.3.2. Basic Round-Trip Time Measurement 724 When a receiver gets a data packet that carries the receiver's own ID in 725 the r field, the receiver updates its RTT estimate. 727 1. The current RTT is calculated as: 729 R_sample = t_now - ts_r 731 where t_now is the time the data packet arrives at the receiver and 732 ts_r is the receiver report timestamp echoed in the data packet. If 733 the actual RTT is smaller than the resolution of the timestamps and 734 t_now equals ts_r, then R_sample is set to the smallest positive 735 RTT value larger than 0 (i.e., 1 millisecond in our case). 737 2. The smoothed RTT estimate R is updated: 739 If no feedback has been received before 740 R = R_sample 741 Else 742 R = q*R + (1-q)*R_sample 744 A filter parameter q of 0.5 is recommended for non-CLR receivers. 745 The CLR performs RTT measurements much more frequently and hence 746 should use a higher filter value. We recommend using q=0.9. Note 747 that TFMCC is not sensitive to the precise value for the filter 748 constant. 750 Optionally, sender-based instead of receiver-based RTT measurements may 751 be used. The sender already determines the RTT to a receiver from the 752 receiver's echo of the sender's own timestamp for the calculation of the 753 maximum RTT. For sender-based RTT measurements, this RTT measurement 754 needs to be communicated to the receiver. Instead of including an echo 755 of the receiver's timestamp, the sender includes the receiver's RTT in 756 the next data packet, using the prioritization rules described in 757 Section 3.5. 759 To simplify sender operation, smoothing of RTT samples as described 760 above should still be done at the receiver. 762 4.3.3. One-Way Delay Adjustments 764 When a RTT measurement is performed, the receiver also determines the 765 one-way delay D_r from itself to the sender: 767 D_r = ts_r - ts_i 769 where ts_i and ts_r are the timestamp and timestamp echo contained in 770 the data packet. With each new data packet i', the receiver can now 771 calculate an updated RTT estimate as: 773 R' = max(D_r + t_now - ts_i', 1 millisecond) 775 In between RTT measurements, the updated R' is used instead of the 776 smoothed RTT R. Like the RTT samples, R' must be strictly positive. When 777 a new measurement is made, all interim one-way delay measurements are 778 discarded (i.e., the smoothed RTT is updated according to Section 4.3.2 779 without taking one-way delay adjustments into account). 781 For the one-way delay measurements, the clocks of sender and receivers 782 need not be synchronized. Clock skew will cancel itself out when both 783 one-way measurements are added to form a RTT estimate, as long as clock 784 drift between real RTT measurements is negligible. 786 The same one-way delay adjustments should be applied to the RTT supplied 787 by the sender when using sender-based RTT measurements. 789 4.3.4. Receive Rate Measurements 791 When a receiver has not experienced any loss events, it cannot calculate 792 a TCP-friendly rate to include in the receiver reports. Instead, the 793 receiver measures the current receive rate and sets the desired rate X_r 794 to twice the receive rate. 796 The receive rate in bits/s is measured as the number of bits received 797 over the last k RTTs, taking into account the IP and transport packet 798 headers, but excluding the link-layer packet headers. A value for k 799 between 2 and 4 is recommended. 801 4.4. Setting the Desired Rate 803 When a receiver measures a non-zero loss event rate, it calculates the 804 desired rate using Equation (1). In case no RTT measurement is 805 available yet, the maximum RTT is used instead of the receiver's RTT. 806 The desired rate X_r is updated whenever the loss event rate or the RTT 807 changes. 809 A receiver may decide to not report desired rates that are below 1 810 packet per 8 seconds, since a sender is very slow to recover from such 811 low sending rates. In this case, the receiver reports a desired rate of 812 1 packet per 8 seconds. However, it must leave the multicast group if 813 for more than 120 seconds the calculated rate falls below the reported 814 rate and the current sending rate is higher than the receiver's 815 calculated rate. 817 As mentioned above, calculation of the desired rate is not possible 818 before the receiver experiences the first loss event and in that case 819 twice the rate at which data is received is included in the receiver 820 reports as X_r. This mechanism allows the sender to slowstart as 821 described in Section 3.6. 823 4.5. Feedback and Feedback Suppression 825 Let fb_nr be the highest feedback round counter value received by a 826 receiver. When a new data packet arrives with a higher feedback round 827 counter than fb_nr, a new feedback round begins and fb_nr is updated. 828 Outstanding feedback for the old round is canceled. In case a feedback 829 number with a value that is more than half the feedback number space 830 lower than fb_nr is received, the receiver assumes that the feedback 831 round counter wrapped and also cancels the feedback timer and updates 832 fb_nr. 834 The CLR sends its feedback independently from all the other receivers 835 once per RTT. Its feedback does not suppress other feedback and cannot 836 be suppressed by other receiver's feedback. 838 Non-CLR receivers set a feedback timer at the beginning of a feedback 839 round. Using an exponentially weighted random timer mechanism, the 840 feedback timer is set to expire after 842 t = max(T * (1 + log(x)/log(N)), 0) 844 where 846 x is a random variable uniformly distributed in (0,1], 848 T is the duration of a feedback round (i.e., 6 * R_max), 850 N is an estimated upper bound on the number of receivers. 852 N is a constant specific to the TFMCC protocol. Since TFMCC scales to 853 up to thousands of receivers, setting N to 10,000 for all receivers (and 854 limiting the TFMCC session to at most 10,000 receivers) is recommended. 856 A feedback packet is sent when the feedback timer expires, unless the 857 timer is canceled beforehand. When the multicast model is ASM, feedback 858 is multicast to the whole group, otherwise the feedback is unicast to 859 the sender. The feedback packet includes the calculated rate valid at 860 the time the feedback packet is sent (not the rate at the point of time 861 when the feedback timer is set). The copy of the timestamp ts_i of the 862 last data packet received, which is included in the feedback packet, 863 needs to be adjusted by the time interval between receiving the data 864 packet and sending the report to allow the sender to correctly infer the 865 instantaneous RTT (i.e., that time interval has to be added to the 866 timestamp value). 868 The timer is canceled if a data packet with a lower suppression rate 869 than the receiver's calculated rate and a higher or equal maximum RTT 870 than the receiver's RTT is received. Likewise, a data packet indicating 871 the beginning of a new feedback round cancels all feedback for older 872 rounds. In case of ASM, the timer is also canceled if a feedback packet 873 from another non-CLR receiver reporting a lower rate is received. 875 The feedback suppression process is complicated by the fact that the 876 calculated rates of the receivers will change during a feedback round. 877 If the calculated rates decrease rapidly for all receivers, feedback 878 suppression can no longer prevent a feedback implosion since earlier 879 feedback will always report a higher rate than current feedback. To make 880 the feedback suppression mechanism robust in the face of changing rates, 881 it is necessary to introduce X_fbr, the calculated rate of a receiver at 882 the beginning of a feedback round. A receiver needs to suppress its 883 feedback not only when the suppression rate is less than the receiver's 884 current calculated rate but also in the case that the suppression rate 885 falls below X_fbr. 887 When the maximum RTT changes significantly during one feedback round, it 888 is necessary to reschedule the feedback timer in proportion to the 889 change. 891 t = t * R_max / R_max' 893 where R_max is the new maximum RTT and R_max' is the previous maximum 894 RTT. The same considerations hold, when the last data packets were 895 received more than a time interval of R_max ago. In this case, it is 896 necessary to add the difference of the inter-packet gap and the maximum 897 RTT to the feedback time to prevent a feedback implosion (e.g., in case 898 the sender crashed). 900 t = t + max(t_now - tr_i - R_max, 0) 902 where tr_i is the time when the last data packet arrived at the 903 receiver. 905 More details on the characteristics of the feedback suppression 906 mechanism can be found in [9] and [10]. 908 5. Calculation of the Loss Event Rate 910 Obtaining an accurate and stable measurement of the loss event rate is 911 of primary importance for TFMCC. Loss rate measurement is performed at 912 the receiver, based on the detection of lost or marked packets from the 913 sequence numbers of arriving packets. 915 5.1. Detection of Lost or Marked Packets 917 TFMCC assumes that all packets contain a sequence number that is 918 incremented by one for each packet that is sent. For the purposes of 919 this specification, we require that if a lost packet is retransmitted, 920 the retransmission is given a new sequence number that is the latest in 921 the transmission sequence, and not the same sequence number as the 922 packet that was lost. If a transport protocol has the requirement that 923 it must retransmit with the original sequence number, then the transport 924 protocol designer must figure out how to distinguish delayed from 925 retransmitted packets and how to detect lost retransmissions. 927 The receivers each maintain a data structure that keeps track of which 928 packets have arrived and which are missing. For the purposes of 929 specification, we assume that the data structure consists of a list of 930 packets that have arrived along with the timestamp when each packet was 931 received. In practice this data structure will normally be stored in a 932 more compact representation, but this is implementation-specific. 934 The loss of a packet is detected by the arrival of at least three 935 packets with a higher sequence number than the lost packet. The 936 requirement for three subsequent packets is the same as with TCP, and is 937 to make TFMCC more robust in the presence of reordering. In contrast to 938 TCP, if a packet arrives late (after 3 subsequent packets arrived) at a 939 receiver, the late packet can fill the hole in the reception record, and 940 the receiver can recalculate the loss event rate. Future versions of 941 TFMCC might make the requirement for three subsequent packets adaptive 942 based on experienced packet reordering, but we do not specify such a 943 mechanism here. 945 For an ECN-capable connection, a marked packet is detected as a 946 congestion event as soon as it arrives, without having to wait for the 947 arrival of subsequent packets. 949 5.2. Translation from Loss History to Loss Events 951 TFMCC requires that the loss event rate be robust to several consecutive 952 packets lost where those packets are part of the same loss event. This 953 is similar to TCP, which (typically) only performs one halving of the 954 congestion window during any single RTT. Thus the receivers needs to 955 map the packet loss history into a loss event record, where a loss event 956 is one or more packets lost in a RTT. 958 To determine whether a lost or marked packet should start a new loss 959 event, or be counted as part of an existing loss event, we need to 960 compare the sequence numbers and timestamps of the packets that arrived 961 at the receiver. For a marked packet S_new, its reception time T_new 962 can be noted directly. For a lost packet, we can interpolate to infer 963 the nominal "arrival time". Assume: 965 S_loss is the sequence number of a lost packet. 967 S_before is the sequence number of the last packet to arrive with 968 sequence number before S_loss. 970 S_after is the sequence number of the first packet to arrive with 971 sequence number after S_loss. 973 T_before is the reception time of S_before. 975 T_after is the reception time of S_after. 977 Note that T_before can either be before or after T_after due to 978 reordering. 980 For a lost packet S_loss, we can interpolate its nominal "arrival time" 981 at the receiver from the arrival times of S_before and S_after. Thus 983 T_loss = T_before + ( (T_after - T_before) 984 * (S_loss - S_before)/(S_after - S_before) ); 986 Note that if the sequence space wrapped between S_before and S_after, 987 then the sequence numbers must be modified to take this into account 988 before performing the calculation. If the largest possible sequence 989 number is S_max, and S_before > S_after, then modifying each sequence 990 number S by S' = (S + (S_max + 1)/2) mod (S_max + 1) would normally be 991 sufficient. 993 If the lost packet S_old was determined to have started the previous 994 loss event, and we have just determined that S_new has been lost, then 995 we interpolate the nominal arrival times of S_old and S_new, called 996 T_old and T_new respectively. 998 If T_old + R >= T_new, then S_new is part of the existing loss event. 999 Otherwise S_new is the first packet of a new loss event. 1001 5.3. Inter-Loss Event Interval 1003 If a loss interval, A, is determined to have started with packet 1004 sequence number S_A and the next loss interval, B, started with packet 1005 sequence number S_B, then the number of packets in loss interval A is 1006 given by (S_B - S_A). 1008 5.4. Average Loss Interval 1010 To calculate the loss event rate p, we first calculate the average loss 1011 interval. This is done using a filter that weights the n most recent 1012 loss event intervals in such a way that the measured loss event rate 1013 changes smoothly. 1015 Weights w_0 to w_(n-1) are calculated as: 1017 If (i < n/2) 1018 w_i = 1; 1019 Else 1020 w_i = 1 - (i - (n/2 - 1))/(n/2 + 1); 1022 Thus if n=8, the values of w_0 to w_7 are: 1024 1.0, 1.0, 1.0, 1.0, 0.8, 0.6, 0.4, 0.2 1026 The value n for the number of loss intervals used in calculating the 1027 loss event rate determines TFMCC's speed in responding to changes in the 1028 level of congestion. As currently specified, TFMCC should not be used 1029 for values of n significantly greater than 8, for traffic that might 1030 compete in the global Internet with TCP. At the very least, safe 1031 operation with values of n greater than 8 would require a slight change 1032 to TFMCC's mechanisms to include a more severe response to two or more 1033 round-trip times with heavy packet loss. 1035 When calculating the average loss interval we need to decide whether to 1036 include the interval since the most recent packet loss event. We only 1037 do this if it is sufficiently large to increase the average loss 1038 interval. 1040 Thus if the most recent loss intervals are I_0 to I_n, with I_0 being 1041 the interval since the most recent loss event, then we calculate the 1042 average loss interval I_mean as: 1044 I_tot0 = 0; 1045 I_tot1 = 0; 1046 W_tot = 0; 1047 for (i = 0 to n-1) { 1048 I_tot0 = I_tot0 + (I_i * w_i); 1049 W_tot = W_tot + w_i; 1050 } 1051 for (i = 1 to n) { 1052 I_tot1 = I_tot1 + (I_i * w_(i-1)); 1053 } 1054 I_tot = max(I_tot0, I_tot1); 1055 I_mean = I_tot/W_tot; 1057 The loss event rate, p is simply: 1059 p = 1 / I_mean; 1061 5.5. History Discounting 1063 As described in Section 5.4, the most recent loss interval is only 1064 assigned 4/(3*n) of the total weight in calculating the average loss 1065 interval, regardless of the size of the most recent loss interval. This 1066 section describes an optional history discounting mechanism that allows 1067 the TFMCC receivers to adjust the weights, concentrating more of the 1068 relative weight on the most recent loss interval, when the most recent 1069 loss interval is more than twice as large as the computed average loss 1070 interval. 1072 To carry out history discounting, we associate a discount factor DF_i 1073 with each loss interval L_i, where each discount factor is a floating 1074 point number. The discount array maintains the cumulative history of 1075 discounting for each loss interval. At the beginning, the values of 1076 DF_i in the discount array are initialized to 1: 1078 for (i = 0 to n) { 1079 DF_i = 1; 1080 } 1082 History discounting also uses a general discount factor DF, also a 1083 floating point number, that is also initialized to 1. First we show how 1084 the discount factors are used in calculating the average loss interval, 1085 and then we describe later in this section how the discount factors are 1086 modified over time. 1088 As described in Section 5.4 the average loss interval is calculated 1089 using the n previous loss intervals I_1, ..., I_n, and the interval I_0 1090 that represents the number of packets received since the last loss 1091 event. The computation of the average loss interval using the discount 1092 factors is a simple modification of the procedure in Section 5.4, as 1093 follows: 1095 I_tot0 = I_0 * w_0 1096 I_tot1 = 0; 1097 W_tot0 = w_0 1098 W_tot1 = 0; 1099 for (i = 1 to n-1) { 1100 I_tot0 = I_tot0 + (I_i * w_i * DF_i * DF); 1101 W_tot0 = W_tot0 + w_i * DF_i * DF; 1102 } 1103 for (i = 1 to n) { 1104 I_tot1 = I_tot1 + (I_i * w_(i-1) * DF_i); 1105 W_tot1 = W_tot1 + w_(i-1) * DF_i; 1106 } 1107 p = min(W_tot0/I_tot0, W_tot1/I_tot1); 1109 The general discounting factor, DF is updated on every packet arrival as 1110 follows. First, a receiver computes the weighted average I_mean of the 1111 loss intervals I_1, ..., I_n: 1113 I_tot = 0; 1114 W_tot = 0; 1115 for (i = 1 to n) { 1116 W_tot = w_(i-1) * DF_i; 1117 I_tot = I_tot + (I_i * w_(i-1) * DF_i); 1118 } 1119 I_mean = I_tot / W_tot; 1121 This weighted average I_mean is compared to I_0, the number of packets 1122 received since the last loss event. If I_0 is greater than twice 1123 I_mean, then the new loss interval is considerably larger than the old 1124 ones, and the general discount factor DF is updated to decrease the 1125 relative weight on the older intervals, as follows: 1127 if (I_0 > 2 * I_mean) { 1128 DF = 2 * I_mean/I_0; 1129 if (DF < THRESHOLD) 1130 DF = THRESHOLD; 1131 } else 1132 DF = 1; 1134 A nonzero value for THRESHOLD ensures that older loss intervals from an 1135 earlier time of high congestion are not discounted entirely. We 1136 recommend a THRESHOLD of 0.5. Note that with each new packet arrival, 1137 I_0 will increase further, and the discount factor DF will be updated. 1139 When a new loss event occurs, the current interval shifts from I_0 to 1140 I_1, loss interval I_i shifts to interval I_(i+1), and the loss interval 1141 I_n is forgotten. The previous discount factor DF has to be 1142 incorporated into the discount array. Because DF_i carries the discount 1143 factor associated with loss interval I_i, the DF_i array has to be 1144 shifted as well. This is done as follows: 1146 for (i = 1 to n) { 1147 DF_i = DF * DF_i; 1148 } 1149 for (i = n-1 to 0 step -1) { 1150 DF_(i+1) = DF_i; 1151 } 1152 I_0 = 1; 1153 DF_0 = 1; 1154 DF = 1; 1156 This completes the description of the optional history discounting 1157 mechanism. We emphasize that this is an optional mechanism whose sole 1158 purpose is to allow TFMCC to response somewhat more quickly to the 1159 sudden absence of congestion, as represented by a long current loss 1160 interval. 1162 5.6. Initializing the Loss History after the First Loss Event 1164 The number of packets received before the first loss event usually does 1165 not reflect the current loss event rate. When the first loss event 1166 occurs, a TFMCC receiver assumes that the correct data rate is the rate 1167 at which data was received during the last RTT when the loss occurred. 1168 Instead of initializing the first loss interval to the number of packets 1169 sent until the first loss event, the TFMCC receiver calculates the loss 1170 interval that would be required to produce the receive rate X_recv, and 1171 uses this synthetic loss interval l_0 to seed the loss history 1172 mechanism. 1174 The initial loss interval is calculated by inverting a simplified 1175 version of the TCP Equation (1). 1177 s 1178 X_recv = sqrt(3/2) * ----------------- 1179 R * sqrt(1/l_0) 1181 X_recv * R 1182 ==> l_0 = (---------------)^2 1183 sqrt(3/2) * s 1185 The resulting initial loss interval is too small at higher loss rates 1186 compared to using the more accurate Equation (1), which leads to a more 1187 conservative initial loss event rate. 1189 If a receiver still uses the initial RTT R_max instead of its real RTT, 1190 the initial loss interval is too large in case the initial RTT is higher 1191 than the actual RTT. As a consequence, the receiver will calculate a 1192 too high desired rate when the first RTT measurement R is made and the 1193 initial loss interval is still in the loss history. The receiver has to 1194 adjust l_0 as follows: 1196 l_0 = l_0 * (R/R_max)^2 1198 No action needs to be taken when the first RTT measurement is made after 1199 the initial loss interval left the loss history. 1201 6. Security Considerations 1203 TFMCC is not a transport protocol in its own right, but a congestion 1204 control mechanism that is intended to be used in conjunction with a 1205 transport protocol. Therefore security primarily needs to be considered 1206 in the context of a specific transport protocol and its authentication 1207 mechanisms. 1209 Congestion control mechanisms can potentially be exploited to create 1210 denial of service. This may occur through spoofed feedback. Thus any 1211 transport protocol that uses TFMCC should take care to ensure that 1212 feedback is only accepted from valid receivers of the data. The precise 1213 mechanism to achieve this will however depend on the transport protocol 1214 itself. 1216 Congestion control mechanisms may potentially be manipulated by a greedy 1217 receiver that wishes to receive more than its fair share of network 1218 bandwidth. However, in TFMCC a receiver can only influence the sending 1219 rate if it is the CLR and thus has the lowest calculated rate of all 1220 receivers. If the calculated rate is then manipulated such that it 1221 exceeds the calculated rate of the second to lowest receiver, it will 1222 cease to be CLR. A greedy receiver can only significantly increase then 1223 transmission rate if it is the only participant in the session. If such 1224 scenarios are of concern, possible defenses against such a receiver 1225 would normally include some form of nonce that the receiver must feed 1226 back to the sender to prove receipt. However, the details of such a 1227 nonce would depend on the transport protocol, and in particular on 1228 whether the transport protocol is reliable or unreliable. 1230 It is possible that a receiver sends feedback claiming it has a very low 1231 calculated rate. This will reduce the rate of the multicast session and 1232 might render it useless but obviously cannot hurt the network itself. 1234 We expect that protocols incorporating ECN with TFMCC will also want to 1235 incorporate feedback from the receiver to the sender using the ECN nonce 1236 [8]. The ECN nonce is a modification to ECN that protects the sender 1237 from the accidental or malicious concealment of marked packets. Again, 1238 the details of such a nonce would depend on the transport protocol, and 1239 are not addressed in this document. 1241 7. IANA Considerations 1243 There are no IANA actions required for this document. 1245 8. Authors' Addresses 1247 Joerg Widmer 1248 DoCoMo Euro-Labs 1249 Landsberger Str. 312, Munich, Germany 1250 widmer@acm.org 1252 Mark Handley 1253 UCL (University College London) 1254 Gower Street, London WC1E 6BT, UK 1255 m.handley@cs.ucl.ac.uk 1257 9. Acknowledgments 1259 We would like to acknowledge feedback and discussions on equation-based 1260 congestion control with a wide range of people, including members of the 1261 Reliable Multicast Research Group, the Reliable Multicast Transport 1262 Working Group, and the End-to-End Research Group. We would particularly 1263 like to thank Brian Adamson, Mark Pullen, and Fei Zhao for feedback on 1264 earlier versions of this document. 1266 10. References 1268 [1] B. Adamson, C. Bormann, M. Handley, and J. Macker, "NACK-Oriented 1269 Reliable Multicast (NORM) Protocol Building Blocks", Internet Draft 1270 draft-ietf-rmt-norm-bb-09.txt, July 2004, work in progress. 1271 Citation for informational purposes only. 1273 [2] S. Floyd, M. Handley, J. Padhye, and J. Widmer, "Equation-Based 1274 Congestion Control for Unicast Applications", Proc ACM SIGCOMM 1275 2000, Stockholm, August 2000 1277 [3] H. W. Holbrook, "A Channel Model for Multicast," Ph.D. 1278 Dissertation, Stanford University, Department of Computer Science, 1279 Stanford, California, August 2001. 1281 [4] J. Padhye, V. Firoiu, D. Towsley, and J. Kurose, "Modeling TCP 1282 Throughput: A Simple Model and its Empirical Validation", Proc ACM 1283 SIGCOMM 1998. 1285 [5] K. Ramakrishnan and S. Floyd, "A Proposal to add Explicit Congestion 1286 Notification (ECN) to IP", RFC 2481, January 1999. 1288 [6] L. Rizzo, "pgmcc: a TCP-friendly single-rate multicast congestion 1289 control scheme", Proc ACM SIGCOMM 2000, Stockholm, August 2000 1291 [7] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: A 1292 Transport Protocol for Real-Time Applications", RFC 1889, January 1293 1996. 1295 [8] N. Spring, D. Wetherall, and D. Ely, "Robust Explicit Congestion 1296 Notification (ECN) Signaling with Nonces", RFC 3540, June 2003. 1298 [9] J. Widmer and T. Fuhrmann, "Extremum Feedback for Very Large 1299 Multicast Groups", Proc NGC 2001, London, November 2001. 1301 [10] J. Widmer and M. Handley, "Extending Equation-Based Congestion 1302 Control to Multicast Applications", Proc ACM SIGCOMM 2001, San 1303 Diego, August 2001 1305 11. Copyright Notice 1307 Copyright (C) The Internet Society (2006). This document is subject to 1308 the rights, licenses and restrictions contained in BCP 78, and except as 1309 set forth therein, the authors retain all their rights. 1311 This document and the information contained herein are provided on an 1312 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR 1313 IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET 1314 ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, 1315 INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE 1316 INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 1317 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.