idnits 2.17.1 draft-ietf-rmt-bb-tfmcc-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3667, Section 5.1 on line 16. -- Found old boilerplate from RFC 3978, Section 5.5 on line 1288. ** The document seems to lack an RFC 3978 Section 5.1 IPR Disclosure Acknowledgement -- however, there's a paragraph with a matching beginning. Boilerplate error? ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748. ** The document seems to lack an RFC 3979 Section 5, para. 1 IPR Disclosure Acknowledgement. ** The document seems to lack an RFC 3979 Section 5, para. 2 IPR Disclosure Acknowledgement. ** The document seems to lack an RFC 3979 Section 5, para. 3 IPR Disclosure Invitation. ** The document uses RFC 3667 boilerplate or RFC 3978-like boilerplate instead of verbatim RFC 3978 boilerplate. After 6 May 2005, submission of drafts without verbatim RFC 3978 boilerplate is not accepted. The following non-3978 patterns matched text found in the document. That text should be removed or replaced: By submitting this Internet-Draft, I certify that any applicable patent or other IPR claims of which I am aware have been disclosed, or will be disclosed, and any of which I become aware will be disclosed, in accordance with RFC 3668. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** There is 1 instance of too long lines in the document, the longest one being 1 character in excess of 72. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (13 July 2004) is 7228 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Looks like a reference, but probably isn't: 'WES01' on line 1205 == Unused Reference: '5' is defined on line 1252, but no explicit reference was found in the text == Unused Reference: '9' is defined on line 1265, but no explicit reference was found in the text -- No information found for draft-ietf-rmt-norm-bb - is the name correct? -- Possible downref: Normative reference to a draft: ref. '1' -- Possible downref: Non-RFC (?) normative reference: ref. '2' -- Possible downref: Non-RFC (?) normative reference: ref. '3' -- Possible downref: Non-RFC (?) normative reference: ref. '4' ** Obsolete normative reference: RFC 2988 (ref. '5') (Obsoleted by RFC 6298) ** Obsolete normative reference: RFC 2481 (ref. '6') (Obsoleted by RFC 3168) -- Possible downref: Non-RFC (?) normative reference: ref. '7' ** Obsolete normative reference: RFC 1889 (ref. '8') (Obsoleted by RFC 3550) ** Downref: Normative reference to an Historic draft: draft-ietf-tsvwg-tcp-nonce (ref. '9') -- Possible downref: Non-RFC (?) normative reference: ref. '10' -- Possible downref: Non-RFC (?) normative reference: ref. '11' Summary: 15 errors (**), 0 flaws (~~), 4 warnings (==), 13 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Internet Engineering Task Force RMT WG 2 INTERNET-DRAFT Joerg Widmer/EPFL 3 draft-ietf-rmt-bb-tfmcc-03.txt Mark Handley/ICIR 5 13 July 2004 6 Expires: January 2005 8 TCP-Friendly Multicast Congestion Control (TFMCC): 9 Protocol Specification 11 Status of this Document 13 By submitting this Internet-Draft, I certify that any applicable patent 14 or other IPR claims of which I am aware have been disclosed, or will be 15 disclosed, and any of which I become aware will be disclosed, in 16 accordance with RFC 3668. 18 Internet-Drafts are working documents of the Internet Engineering Task 19 Force (IETF), its areas, and its working groups. Note that other groups 20 may also distribute working documents as Internet-Drafts. 22 Internet-Drafts are draft documents valid for a maximum of six months 23 and may be updated, replaced, or obsoleted by other documents at any 24 time. It is inappropriate to use Internet- Drafts as reference material 25 or to cite them other than as "work in progress." 27 The list of current Internet-Drafts can be accessed at 28 http://www.ietf.org/ietf/1id-abstracts.txt 30 The list of Internet-Draft Shadow Directories can be accessed at 31 http://www.ietf.org/shadow.html. 33 This document is a product of the IETF RMT WG. Comments should be 34 addressed to the authors, or the WG's mailing list at rmt@lbl.gov. 36 Abstract 38 This document specifies TCP-Friendly Multicast Congestion 39 Control (TFMCC). TFMCC is a congestion control mechanism for 40 multicast transmissions in a best-effort Internet environment. 42 It is a single-rate congestion control scheme, where the 43 sending rate is adapted to the receiver experiencing the worst 44 network conditions. TFMCC is reasonably fair when competing 45 for bandwidth with TCP flows and has a relatively low 46 variation of throughput over time, making it suitable for 47 applications such as streaming media where a relatively smooth 48 sending rate is of importance. 50 Table of Contents 52 1. Introduction. . . . . . . . . . . . . . . . . . . . . . 4 53 1.1. Terminology. . . . . . . . . . . . . . . . . . . . . 5 54 1.2. Related Documents. . . . . . . . . . . . . . . . . . 5 55 1.3. Environmental Requirements and Considerations. . . . 5 56 2. Protocol Overview . . . . . . . . . . . . . . . . . . . 6 57 2.1. TCP Throughput Equation. . . . . . . . . . . . . . . 7 58 2.2. Packet Contents. . . . . . . . . . . . . . . . . . . 8 59 2.2.1. Sender Packets. . . . . . . . . . . . . . . . . . 8 60 2.2.2. Feedback Packets. . . . . . . . . . . . . . . . . 9 61 3. Data Sender Protocol. . . . . . . . . . . . . . . . . . 10 62 3.1. Sender Initialization. . . . . . . . . . . . . . . . 11 63 3.2. Determining the Maximum RTT. . . . . . . . . . . . . 11 64 3.3. Adjusting the Sending Rate . . . . . . . . . . . . . 12 65 3.4. Controlling Receiver Feedback. . . . . . . . . . . . 13 66 3.5. Assisting Receiver-Side RTT Measurements . . . . . . 14 67 3.6. Slowstart. . . . . . . . . . . . . . . . . . . . . . 15 68 3.7. Scheduling of Packet Transmissions . . . . . . . . . 15 69 4. Data Receiver Protocol. . . . . . . . . . . . . . . . . 16 70 4.1. Receiver Initialization. . . . . . . . . . . . . . . 16 71 4.2. Receiver Leave . . . . . . . . . . . . . . . . . . . 16 72 4.3. Measurement of the Network Conditions. . . . . . . . 17 73 4.3.1. Updating the Loss Event Rate. . . . . . . . . . . 17 74 4.3.2. Basic Round-Trip Time Measurement . . . . . . . . 17 75 4.3.3. One-Way Delay Adjustments . . . . . . . . . . . . 18 76 4.3.4. Receive Rate Measurements . . . . . . . . . . . . 18 77 4.4. Setting the Desired Rate . . . . . . . . . . . . . . 19 78 4.5. Feedback and Feedback Suppression. . . . . . . . . . 19 79 5. Calculation of the Loss Event Rate. . . . . . . . . . . 21 80 5.1. Detection of Lost or Marked Packets. . . . . . . . . 21 81 5.2. Translation from Loss History to Loss Events . . . . 22 82 5.3. Inter-Loss Event Interval. . . . . . . . . . . . . . 23 83 5.4. Average Loss Interval. . . . . . . . . . . . . . . . 23 84 5.5. History Discounting. . . . . . . . . . . . . . . . . 24 85 5.6. Initializing the Loss History after the First 86 Loss Event. . . . . . . . . . . . . . . . . . . . . . . . 26 87 6. Security Considerations . . . . . . . . . . . . . . . . 27 88 7. IANA Considerations . . . . . . . . . . . . . . . . . . 28 89 8. Authors' Addresses. . . . . . . . . . . . . . . . . . . 28 90 9. Acknowledgments . . . . . . . . . . . . . . . . . . . . 28 91 10. References . . . . . . . . . . . . . . . . . . . . . . 29 92 11. Copyright Notice . . . . . . . . . . . . . . . . . . . 30 94 1. Introduction 96 This document specifies TCP-Friendly Multicast Congestion Control 97 (TFMCC) [11]. TFMCC is a source-based, single-rate congestion control 98 scheme that builds upon the unicast TCP-Friendly Rate Control mechanism 99 (TFRC) [2]. TFMCC is stable and responsive under a wide range of network 100 conditions and scales to receivers sets on the order of several thousand 101 receivers. To support scalability, as much congestion control 102 functionality as possible is located at the receivers. Each receiver 103 continuously determines a desired receive rate that is TCP-friendly for 104 the path from the sender to this receiver. Selected receivers then 105 report the rate to the sender in feedback packets. 107 TFMCC is a building block as defined in RFC3048. Instead of specifying 108 a complete protocol, this document simply specifies a congestion control 109 mechanism that could be used in a transport protocol such as RTP [8], in 110 an application incorporating end-to-end congestion control at the 111 application level. This document does not discuss packet formats, 112 reliability, or implementation-related issues. 114 TFMCC is designed to be reasonably fair when competing for bandwidth 115 with TCP flows. A multicast flow is ``reasonably fair'' if its sending 116 rate is generally within a factor of two of the sending rate of a TCP 117 flow from the sender to the slowest receiver of the multicast group 118 under the same network conditions. 120 In general, TFMCC has a low variation of throughput, which makes it 121 suitable for applications such as streaming media where a relatively 122 smooth sending rate is of importance. The penalty of having smooth 123 throughput while competing fairly for bandwidth is a reduced 124 responsiveness to changes in available bandwidth. Thus TFMCC should be 125 used when the application has a requirement for smooth throughput, in 126 particular, avoiding halving of the sending rate in response to a single 127 packet drop. For applications that simply need to multicast as much 128 data as possible in as short a time as possible, PGMCC [7] may be more 129 suitable. 131 TFMCC is designed for applications that use a fixed packet size, and 132 vary their sending rate in packets per second in response to congestion. 133 Some audio applications require a fixed interval of time between packets 134 and vary their packet size instead of their packet rate in response to 135 congestion. The congestion control mechanism in this document cannot be 136 used by those applications. 138 1.1. Terminology 140 In this document, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL", 141 "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and 142 "OPTIONAL" are to be interpreted as described in RFC 2119 and indicate 143 requirement levels for compliant TFMCC implementations. 145 1.2. Related Documents 147 As described in RFC3048, TFMCC is a building block that is intended to 148 be used, in conjunction with other building blocks, to help specify a 149 protocol instantiation. In particular, TFMCC is a suitable congestion 150 control building block for NACK-Oriented Reliable Multicast (NORM) [1]. 152 1.3. Environmental Requirements and Considerations 154 TFMCC is intended to be a congestion control scheme that can be used in 155 a complete protocol instantiation that delivers objects and streams 156 (both reliable content delivery and streaming of multimedia 157 information). 159 TFMCC is most applicable for sessions that deliver a substantial amount 160 of data, i.e., in length from hundreds of kilobytes to many gigabytes, 161 and whose duration is in the order of tens of seconds or more. 163 TFMCC is intended for multicast delivery. There are currently two 164 models of multicast delivery, the Any-Source Multicast (ASM) model as 165 defined in RFC1112 and the Source-Specific Multicast (SSM) model as 166 defined in [3]. TFMCC works with both multicast models, but in a 167 slightly different way. When using ASM, feedback from the receivers is 168 multicast to the sender as well as to all other receivers. Feedback can 169 be either multicast on the same group address used for sending data or 170 on a separate multicast feedback group address. For SSM, the receivers 171 must unicast the feedback directly to the sender. Hence, feedback from 172 a receiver will not be received by other receivers. 174 TFMCC inherently works with all types of networks, including LANs, WANs, 175 Intranets, the Internet, asymmetric networks, wireless networks, and 176 satellite networks. However, in some network environments varying the 177 sending rate to the receivers may not be advantageous (e.g., for a 178 satellite or wireless network, there may be no mechanism for receivers 179 to effectively reduce their reception rate since there may be a fixed 180 transmission rate allocated to the session). 182 2. Protocol Overview 184 TFMCC extends the basic mechanisms of TFRC into the multicast domain. 185 In order to compete fairly with TCP, TFMCC receivers individually 186 measure the prevalent network conditions and calculate a rate that is 187 TCP-friendly on the path from the sender to themselves. The rate is 188 determined using an equation for TCP throughput, which roughly describes 189 TCP's sending rate as a function of the loss event rate, round-trip time 190 (RTT), and packet size. We define a loss event as one or more lost or 191 marked packets from the packets received during one RTT, where a marked 192 packet refers to a congestion indication from Explicit Congestion 193 Notification (ECN) [6]. The sending rate of the multicast transmission 194 is adapted to the receiver experiencing the worst network conditions. 196 Basically, TFMCC's congestion control mechanism works as follows: 198 o Each receiver measures the loss event rate and its RTT to the sender. 200 o Each receiver then uses this information, together with an equation 201 for TCP throughput, to derive a TCP-friendly sending rate. 203 o Through a distributed feedback suppression mechanism, only a subset of 204 the receivers are allowed to give feedback to prevent a feedback 205 implosion at the sender. The feedback mechanism ensures that 206 receivers reporting a low desired transmission rate have a high 207 probability of sending feedback. 209 o Receivers whose feedback is not suppressed report the calculated 210 transmission rate back to the sender in so-called receiver reports. 211 The receiver reports serve two purposes: they inform the sender about 212 the appropriate transmit rate, and they allow the receivers to measure 213 their RTT. 215 o The sender selects the receiver that reports the lowest rate as 216 current limiting receiver (CLR). Whenever feedback with an even lower 217 rate reaches the sender, the corresponding receiver becomes CLR and 218 the sending rate is reduced to match that receiver's calculated rate. 219 The sending rate increases when the CLR reports a calculated rate 220 higher than the current sending rate. 222 The dynamics of TFMCC are sensitive to how the measurements are 223 performed and applied and what feedback suppression mechanism is chosen. 224 We recommend specific mechanisms below to perform and apply these 225 measurements. Other mechanisms are possible, but it is important to 226 understand how the interactions between mechanisms affect the dynamics 227 of TFMCC. 229 2.1. TCP Throughput Equation 231 Any realistic equation giving TCP throughput as a function of loss event 232 rate and RTT should be suitable for use in TFMCC. However, we note that 233 the TCP throughput equation used must reflect TCP's retransmit timeout 234 behavior, as this dominates TCP throughput at higher loss rates. We 235 also note that the assumptions implicit in the throughput equation about 236 the loss event rate parameter have to be a reasonable match to how the 237 loss rate or loss event rate is actually measured. While this match is 238 not perfect for the throughput equation and loss rate measurement 239 mechanisms given below, in practice the assumptions turn out to be close 240 enough. 242 The throughput equation we currently recommend for TFMCC is a slightly 243 simplified version of the throughput equation for Reno TCP from [4]: 245 8 s 246 X = --------------------------------------------------------- (1) 247 R * (sqrt(2*p/3) + (12*sqrt(3*p/8) * p * (1+32*p^2))) 249 where 251 X is the transmit rate in bits/second. 253 s is the packet size in bytes. 255 R is the round-trip time in seconds. 257 p is the loss event rate, between 0.0 and 1.0, of the number of 258 loss events as a fraction of the number of packets transmitted. 260 In future, different TCP equations may be substituted for this equation. 261 The requirement is that the throughput equation be a reasonable 262 approximation of the sending rate of TCP for conformant TCP congestion 263 control. 265 The parameters s (packet size), p (loss event rate) and R (RTT) need to 266 be measured or calculated by a TFMCC implementation. The measurement of 267 R is specified in Section 4.3.2, and the measurement of p is specified 268 in Section 5. The parameter s (packet size) is normally known to an 269 application. This may not be so in two cases: 271 o The packet size naturally varies depending on the data. In this case, 272 although the packet size varies, that variation is not coupled to the 273 transmit rate. It should normally be safe to use an estimate of the 274 mean packet size for s. 276 o The application needs to change the packet size rather than the number 277 of packets per second to perform congestion control. This would 278 normally be the case with packet audio applications where a fixed 279 interval of time needs to be represented by each packet. Such 280 applications need to have a different way of measuring parameters. 282 Currently, TFMCC cannot be used for the second class of applications. 284 2.2. Packet Contents 286 Before specifying the sender and receiver functionality, we describe the 287 congestion control information contained in packets sent by the sender 288 and feedback packets from the receivers. Information from the sender 289 can either be sent in separate congestion control messages or 290 piggybacked onto data packets. If separate congestion control messages 291 are sent at time intervals larger than the time interval between data 292 packets (e.g., once per feedback round), it is necessary to be able to 293 include timestamp information destined for more than one receiver to 294 allow a sufficient number of receivers to measure their RTT. 296 As TFMCC will be used along with a transport protocol, we do not specify 297 packet formats, since these depend on the details of the transport 298 protocol used. The recommended representation of the header fields is 299 given below. Alternatively, if the computational overhead of a floating 300 point representation is prohibitive, fixed point arithmetic can be used 301 at the expense of larger packet headers. 303 2.2.1. Sender Packets 305 Each packet sent by the data sender contains the following information: 307 o A sequence number i. This number is incremented by one for each data 308 packet transmitted. The field must be sufficiently large that it does 309 not wrap causing two different packets with the same sequence number 310 to be in the receiver's recent packet history at the same time. In 311 most cases the sequence number will be supplied by the transport 312 protocol used along with TFMCC. 314 o A suppression rate X_supp in bits/s. Only receivers with a calculated 315 rate lower than the suppression rate are eligible to give feedback, 316 unless their RTT is higher than the maximum RTT described below in 317 which case they are also eligible to give feedback. The suppression 318 rate should be represented as a 12 bit floating point value with 5 319 bits for the unsigned exponent and 7 bits for the unsigned mantissa 320 (to represent rates from 100 bit/s to 400 Gbit/s with an error of less 321 than 1%). 323 o A timestamp ts_i indicating when the packet is sent. The resolution 324 of the timestamp should typically be milliseconds and the timestamp 325 should be an unsigned integer value no less than 16 bits wide. 327 o A receiver ID r and a copy of the timestamp ts_r of that receiver's 328 last report, which allows the receiver to measure its RTT. The 329 receiver ID is described in the next section. The resolution of the 330 timestamp echo should be milliseconds and the timestamp should be an 331 unsigned integer value no less than 16 bits wide. If separate instead 332 of piggybacked congestion control messages are used, the packet needs 333 to contain a list of receiver IDs with corresponding timestamps to 334 allow a sufficient number of receivers to simultaneously measure their 335 RTT. For the default values used for the feedback process this 336 corresponds to a list size on the order of 10 to 20 entries. 338 o A flag is_CLR indicating whether the receiver with ID r is the CLR. 340 o A feedback round counter fb_nr. This counter is incremented by the 341 sender at the beginning of a new feedback round to notify the 342 receivers than all feedback for older rounds should be suppressed. 343 The feedback round counter should be at least 4 bits wide. 345 o A maximum RTT value R_max, representing the maximum of the RTTs of all 346 receivers. The RTT should be measured in milliseconds. An 8 bit 347 floating point value with 4 bits for the unsigned exponent and 4 bits 348 for the unsigned mantissa (to represent RTTs from 1 millisecond to 64 349 seconds with an error of ca. 6%) should be used for the 350 representation. 352 2.2.2. Feedback Packets 354 TFMCC receivers use two different formats for feedback packets depending 355 on whether they have made at least one RTT measurement or not. Each 356 feedback packet sent by a data receiver contains the following 357 information: 359 o A unique receiver ID r. In most cases the receiver ID will be 360 supplied by the transport protocol, but it may simply be the IP 361 address of the receiver. 363 o A flag have_RTT indicating whether the receiver has made at least one 364 RTT measurement since it joined the session. 366 o A flag have_loss indicating whether the receiver experienced at least 367 one loss event since it joined the session. 369 o A flag receiver_leave indicating that the receiver will leave the 370 session (and should therefore not be CLR). 372 o A timestamp ts_r indicating when the feedback packet is sent. The 373 representation of the timestamp should be the same as that of the 374 timestamp echo in the data packets. 376 o An echo of the timestamp of the last data packet received. If the 377 last packet received at the receiver has sequence number i, then ts_i 378 is included in the feedback. If there is a delay t_d between 379 receiving that last data packet and sending feedback, then ts_i' = 380 ts_i + t_d is included in the feedback instead of ts_i. The 381 representation of the timestamp echo should be the same as that of the 382 timestamp in the data packets. 384 o A feedback round echo fb_nr, reflecting the highest feedback round 385 counter value received so far. The representation of the feedback 386 round echo should be the same as the one used for the feedback round 387 counter in the data packets. 389 o The desired sending rate X_r. This is the rate calculated by the 390 receiver to be TCP-friendly on the path from the sender to this 391 receiver. The representation of the desired sending rate should be 392 the same as that of the suppression rate in the data packets. 394 3. Data Sender Protocol 396 The data sender multicasts a stream of data packets to the data 397 receivers at a controlled rate. Whenever feedback is received, the 398 sender checks if it is necessary to switch CLRs and to readjust the 399 sending rate. 401 The main tasks that have to be provided by a TFMCC sender are: 403 o adjusting the sending rate, 405 o controlling receiver feedback, and 407 o assisting receiver-side RTT measurements. 409 3.1. Sender Initialization 411 At initialization of the sender, the maximum RTT is set to a value that 412 should be larger than the highest RTT to any of the receivers. It 413 should not be smaller than 500 milliseconds for operation in the public 414 Internet. The sending rate X is initialized to 1 packet per maximum 415 RTT. 417 3.2. Determining the Maximum RTT 419 For each feedback packet that arrives at the sender, the sender computes 420 the instantaneous RTT to the receiver as 422 R_r = t_now - ts_i 424 where t_now is the time the feedback packet arrived. Receivers will 425 have adjusted ts_i for the time interval between receiving the last data 426 packet and sending the corresponding report so that this interval will 427 not be included in R_r. If the instantaneous RTT is larger than the 428 current maximum RTT, the maximum RTT is increased to that value 430 R_max = R_r 432 Otherwise, if no feedback with a higher instantaneous RTT than the 433 maximum RTT is received during a feedback round (see Section 3.4), the 434 maximum RTT is reduced to 436 R_max = MAX(R_max * 0.9, R_peak) 438 where R_peak is the peak receiver RTT measured during the feedback 439 round. 441 The maximum RTT is mainly used for feedback suppression among receivers 442 with heterogeneous RTTs. Feedback suppression is closely coupled to the 443 sending of data packets and for this reason, the maximum RTT must not 444 decrease below the maximum time interval between consecutive data 445 packets: 447 R_max = max(R_max, s/X + t_gran) 449 where t_gran is the granularity of the sender's system clock (see 450 Section 3.7). 452 3.3. Adjusting the Sending Rate 454 When a feedback packet from receiver r arrives at the sender, the sender 455 has to check whether it is necessary to adjust the transmission rate and 456 to switch to a new CLR. 458 How the rate is adjusted depends on the desired rate X_r of the receiver 459 report. We distinguish four cases: 461 1 If no CLR is present, receiver r becomes the current limiting 462 receiver. The sending rate X is directly set to X_r, so long as 463 this would result in a rate increase of less than 8s/R_max bits/s. 464 Otherwise X is gradually increased to X_r at an increase rate of no 465 more than 8s/R_max bits/s every R_max seconds. 467 2 If receiver r is not the CLR but a CLR is present, then receiver r 468 becomes the current limiting receiver if X_r is less than the 469 current sending rate X and the receiver_leave flag of that 470 receiver's report is not set. Furthermore, the sending rate is 471 reduced to X_r. 473 3 If receiver r is not the CLR but a CLR is present and the 474 receiver_leave flag of the CLR's last report was set, then receiver 475 r becomes the current limiting receiver. However, if X_r > X, the 476 sending rate is not increased to X_r for the duration of a feedback 477 round to allow other (lower rate) receivers to give feedback and be 478 selected as CLR. 480 4 If receiver r is the CLR, the sending rate is set to the minimum of 481 X_r and X + 8s/R_max bits/s. 483 If the receiver has not yet measured its RTT (i.e., the have_RTT flag is 484 set), the receiver report will include a desired rate that is based on 485 the maximum RTT rather than the actual RTT to that receiver. In this 486 case, the sender adjusts the desired rate using its measurement of the 487 instantaneous RTT R_r to that receiver: 489 X_r' = X_r * R_max / R_r 491 X_r' is then used instead of X_r to detect whether to switch to a new 492 CLR. 494 If the TFMCC sender receives no reports from the CLR for 4 RTTs, the 495 sending rate is cut in half unless the CLR was selected less than 10 496 RTTs ago. In addition, if the sender receives no reports from the CLR 497 for at least 10 RTTs, it assumes that the CLR crashed or left the group. 498 A new CLR is selected from the feedback that subsequently arrives at the 499 sender, and we increase as in case 3 above. 501 If no new CLR can be selected (i.e., in the absence of any feedback from 502 any of the receivers) it is necessary to further reduce the sending 503 rate. For every 10 consecutive RTTs without feedback, the sending rate 504 is cut in half. The rate is at most reduced to one packet every 64 505 seconds. Note that when receivers stop receiving data packets, they 506 will stop sending feedback. This eventually causes the sending rate to 507 be reduced in the case of network failure. If the network subsequently 508 recovers, a linear increase to the calculated rate of the CLR will occur 509 at 8s/R_max bits/s every R_max. 511 3.4. Controlling Receiver Feedback 513 The receivers allowed to send a receiver report are determined in so- 514 called feedback rounds. Feedback rounds have a duration T of six times 515 the maximum RTT. In case the multicast model is ASM, i.e., receiver 516 feedback is multicast to the whole group, the duration of a feedback 517 round may be reduced to four times the maximum RTT. 519 Only receivers wishing to report a rate that is lower than the 520 suppression rate X_supp, or those with a higher RTT than R_max may send 521 feedback. At the beginning of each feedback round, X_supp is set to the 522 highest possible value that can be represented. When feedback arrives 523 at the sender over the course of a feedback round, X_supp is decreased 524 such that more and more feedback is suppressed towards the end of the 525 round. How receiver feedback is spread out over the feedback round is 526 discussed in Section 4.5. 528 Whenever non-CLR feedback for the current round arrives at the sender, 529 X_supp is reduced to 531 X_supp = (1-g) * X_r 533 if X_supp > X_r. Feedback that causes the corresponding receiver to be 534 selected as CLR, but was from a non-CLR receiver at the time of sending 535 also contributes to the feedback suppression. Note that X_r must not be 536 adjusted by the sender to reflect the receiver's real RTT in case X_r 537 was calculated using the maximum RTT, as is done for setting the sending 538 rate (Section 3.3), otherwise a feedback implosion is possible. The 539 parameter g determines to what extent higher rate feedback can suppress 540 lower rate feedback. This mechanism guarantees, that the lowest 541 calculated rate reported lies within a factor of g of the actual lowest 542 calculated rate of the receiver set (see [10]). A value of g of 0.1 is 543 recommended. 545 To allow receivers to suppress their feedback, the sender's suppression 546 rate needs to be updated whenever feedback is received. This 547 suppression rate has to be communicated to the receivers in a timely 548 manner, either by including it in the data packet header or, if separate 549 congestion control messages are used, by sending a message with the 550 suppression rate whenever the rate changes significantly (i.e., when it 551 is reduced to less than (1-g) times the previous suppression rate). 553 After a time span of T, the feedback round ends if non-CLR feedback was 554 received during that time. Otherwise, the feedback round ends as soon 555 as the first non-CLR feedback message arrives at the sender but at most 556 after 2T. The feedback round counter is incremented by one and the 557 suppression rate X_supp is reset to the highest representable value. 558 The feedback round counter restarts with round 0 after a wrap-around. 560 3.5. Assisting Receiver-Side RTT Measurements 562 Receivers measure their RTT by sending a timestamp with a receiver 563 report, which is echoed by the sender. If congestion control 564 information is piggybacked onto data packets, usually only one receiver 565 ID and timestamp can be included. If multiple feedback messages from 566 different receivers arrive at the sender during the time interval 567 between two data packets, the sender has to decide which receiver to 568 allow to measure RTT. The same applies if separate congestion control 569 messages allow to echo multiple receiver timestamps simultaneously but 570 the number of receivers that gave feedback since the last congestion 571 control message exceeds the list size. 573 The sender's timestamp echoes are prioritized in the following order: 575 1. a new CLR (after a change of CLR's) or a CLR without any previous 576 RTT measurements 578 2. receivers without any previous RTT measurements in the order of the 579 feedback round echo of the corresponding receiver report (i.e., 580 older feedback first) 582 3. non-CLR receivers with previous RTT measurements, again in 583 ascending order of the feedback round echo of the report 585 4. the CLR 587 Ties are broken in favor of the receiver with the lowest reported rate. 589 It is necessary to account for the time that elapses between receiving a 590 report and sending the next data packet. This time needs to be deducted 591 from the RTT and thus has to be added to the receiver's timestamp value. 593 Whenever no feedback packets arrive in the interval between two data 594 packets, the CLR's last timestamp, adjusted by the appropriate offset, 595 is echoed. When the number of packets per RTT is so low that all 596 packets carry a non-CLR receiver's timestamp, the CLR's timestamp and ID 597 are included in a data packet at least once per feedback round. 599 3.6. Slowstart 601 TFMCC uses a slowstart mechanism to quickly approach its fair bandwidth 602 share at the start of a session. During slowstart, the sending rate 603 increases exponentially. The rate increase is limited to the minimum of 604 the rates included in the receiver reports and receivers report twice 605 the rate at which they currently receive data. As in normal congestion 606 control mode, the receiver with the smallest reported rate becomes CLR. 607 Since a receiver can never receive data at a rate higher than its link 608 bandwidth, this effectively limits the overshoot to twice this 609 bandwidth. In case the resulting increase over R_max is less than 610 8s/R_max bits/s, the sender may choose to increase the rate by up to 611 8s/R_max bits/s every R_max. The current sending rate is gradually 612 adjusted to the target rate reported in the receiver reports over the 613 course of a RTT. Slowstart is terminated as soon as any one of the 614 receivers experiences its first packet loss. Since that receiver's 615 calculated rate will be lower than the current sending rate, the 616 receiver will be selected as CLR. 618 During slowstart, the upper bound on the rate increase of 8s/R_max 619 bits/s every RTT does not apply. Only after the TFMCC sender receives 620 the first report with the have_loss flag set is the rate increase 621 limited in this way. 623 3.7. Scheduling of Packet Transmissions 625 As TFMCC is rate-based, and as operating systems typically cannot 626 schedule events precisely, it is necessary to be opportunistic about 627 sending data packets so that the correct average rate is maintained 628 despite the coarse-grain or irregular scheduling of the operating 629 system. Thus a typical sending loop will calculate the correct inter- 630 packet interval, t_ipi, as follows: 632 t_ipi = s/X 634 When a sender first starts sending at time t_0, it calculates t_ipi, and 635 calculates a nominal send time t_1 = t_0 + t_ipi for packet 1. When the 636 application becomes idle, it checks the current time, t_now, and then 637 requests re-scheduling after (t_ipi - (t_now - t_0)) seconds. When the 638 application is re-scheduled, it checks the current time, t_now, again. 639 If (t_now > t_1 - delta) then packet 1 is sent (see below for delta). 641 Now a new t_ipi may be calculated, and used to calculate a nominal send 642 time t_2 for packet 2: t2 = t_1 + t_ipi. The process then repeats, with 643 each successive packet's send time being calculated from the nominal 644 send time of the previous packet. 646 In some cases, when the nominal send time, t_i, of the next packet is 647 calculated, it may already be the case that t_now > t_i - delta. In 648 such a case the packet should be sent immediately. Thus if the 649 operating system has coarse timer granularity and the transmit rate is 650 high, then TFMCC may send short bursts of several packets separated by 651 intervals of the OS timer granularity. 653 The parameter delta is to allow a degree of flexibility in the send time 654 of a packet. If the operating system has a scheduling timer granularity 655 of t_gran seconds, then delta would typically be set to: 657 delta = min(t_ipi/2, t_gran/2) 659 t_gran is 10 milliseconds on many Unix systems. If t_gran is not known, 660 a value of 10 milliseconds can be safely assumed. 662 4. Data Receiver Protocol 664 Receivers measure the current network conditions, namely RTT and loss 665 event rate, and use this information to calculate a rate that is fair to 666 competing traffic. The rate is then communicated to the sender in 667 receiver reports. Due to the potentially large number of receivers, it 668 is undesirable that all receivers send reports, especially not at the 669 same time. 671 In the description of the receiver functionality, we will first address 672 how the receivers measure the network parameters and then discuss the 673 feedback process. 675 4.1. Receiver Initialization 677 The receiver is initialized when it receives the first data packet. The 678 RTT is set to the maximum RTT value contained in the data packet. This 679 initial value is used as the receiver's RTT until the first real RTT 680 measurement is made. The loss event rate is initialized to 0. 682 4.2. Receiver Leave 684 A receiver that sends feedback but wishes to leave the TFMCC session 685 within the next feedback round may indicate the pending leave by setting 686 the receiver_leave flag in its report. If the leaving receiver is the 687 CLR, the receiver_leave flag should be set for all the reports within 688 the feedback round before the leave takes effect. 690 4.3. Measurement of the Network Conditions 692 Receivers have to update their estimate of the network parameters with 693 each new data packet they receive. 695 4.3.1. Updating the Loss Event Rate 697 When a data packet is received, the receiver adds the packet to the 698 packet history. It then recalculates the new value of the loss event 699 rate p. The loss event rate measurement mechanism is described 700 separately in Section 5. 702 4.3.2. Basic Round-Trip Time Measurement 704 When a receiver gets a data packet that carries the receiver's own ID in 705 the r field, the receiver updates its RTT estimate. 707 1. The current RTT is calculated as: 709 R_sample = t_now - ts_r 711 where t_now is the time the data packet arrives at the receiver and 712 ts_r is the receiver report timestamp echoed in the data packet. 714 2. The smoothed RTT estimate R is updated: 716 If no feedback has been received before 717 R = R_sample 718 Else 719 R = q*R + (1-q)*R_sample 721 A filter parameter q of 0.5 is recommended for non-CLR receivers. 722 The CLR performs RTT measurements much more frequently and hence 723 should use a higher filter value. We recommend using q=0.9. Note 724 that TFMCC is not sensitive to the precise value for the filter 725 constant. 727 Optionally, sender-based instead of receiver-based RTT measurements may 728 be used. The sender already determines the RTT to a receiver from the 729 receiver's echo of the sender's own timestamp for the calculation of the 730 maximum RTT. For sender-based RTT measurements, this RTT measurement 731 needs to be communicated to the receiver. Instead of including an echo 732 of the receiver's timestamp, the sender includes the receiver's RTT in 733 the next data packet, using the prioritization rules described in 734 Section 3.5. 736 To simplify sender operation, smoothing of RTT samples as described 737 above should still be done at the receiver. 739 4.3.3. One-Way Delay Adjustments 741 When a RTT measurement is performed, the receiver also determines the 742 one-way delay D_r from itself to the sender: 744 D_r = ts_r - ts_i 746 where ts_i and ts_r are the timestamp and timestamp echo contained in 747 the data packet. With each new data packet i', the receiver can now 748 calculate an updated RTT estimate as: 750 R' = D_r + t_now - ts_i' 752 In between RTT measurements, the updated R' is used instead of the 753 smoothed RTT R. When a new measurement is made, all interim one-way 754 delay measurements are discarded (i.e., the smoothed RTT is updated 755 according to Section 4.3.2 without taking one-way delay adjustments into 756 account). 758 For the one-way delay measurements, the clocks of sender and receivers 759 need not be synchronized. Clock skew will cancel itself out when both 760 one-way measurements are added to form a RTT estimate, as long as clock 761 drift between real RTT measurements is negligible. 763 The same one-way delay adjustments should be applied to the RTT supplied 764 by the sender when using sender-based RTT measurements. 766 4.3.4. Receive Rate Measurements 768 When a receiver has not experienced any loss events, it cannot calculate 769 a TCP-friendly rate to include in the receiver reports. Instead, the 770 receiver measures the current receive rate and sets the desired rate X_r 771 to twice the receive rate. 773 The receive rate in bits/s is measured as the number of bits received 774 over the last k RTTs, taking into account the IP and transport packet 775 headers, but excluding the link-layer packet headers. A value for k 776 between 2 and 4 is recommended. 778 4.4. Setting the Desired Rate 780 When a receiver measures a non-zero loss event rate, it calculates the 781 desired rate using Equation (1). In case no RTT measurement is 782 available yet, the maximum RTT is used instead of the receiver's RTT. 783 The desired rate is updated whenever the loss event rate or the RTT 784 changes. 786 As mentioned above, calculation of the desired rate is not possible 787 before the receiver experiences the first loss event and in that case 788 twice the rate at which data is received is included in the receiver 789 reports as X_r. This mechanism allows the sender to slowstart as 790 described in Section 3.6. 792 4.5. Feedback and Feedback Suppression 794 Let fb_nr be the highest feedback round counter value received by a 795 receiver. When a new data packet arrives with a higher feedback round 796 counter than fb_nr, a new feedback round begins and fb_nr is updated. 797 Outstanding feedback for the old round is canceled. In case a feedback 798 number with a value that is more than half the feedback number space 799 lower than fb_nr is received, the receiver assumes that the feedback 800 round counter wrapped and also cancels the feedback timer and updates 801 fb_nr. 803 The CLR sends its feedback independently from all the other receivers 804 once per RTT. Its feedback does not suppress other feedback and cannot 805 be suppressed by other receiver's feedback. 807 Non-CLR receivers set a feedback timer at the beginning of a feedback 808 round. Using an exponentially weighted random timer mechanism, the 809 feedback timer is set to expire after 811 t = max(T * (1 + log(x)/log(N)), 0) 813 where 815 x is a random variable uniformly distributed in (0,1], 817 T is the duration of a feedback round (i.e., 6 * R_max), 819 N is an estimated upper bound on the number of receivers. 821 N is a constant specific to the TFMCC protocol. Since TFMCC scales to 822 up to thousands of receivers, setting N to 10,000 for all receivers (and 823 limiting the TFMCC session to at most 10,000 receivers) is recommended. 825 A feedback packet is sent when the feedback timer expires, unless the 826 timer is canceled beforehand. When the multicast model is ASM, feedback 827 is multicast to the whole group, otherwise the feedback is unicast to 828 the sender. The feedback packet includes the calculated rate valid at 829 the time the feedback packet is sent (not the rate at the point of time 830 when the feedback timer is set). The copy of the timestamp ts_i of the 831 last data packet received, which is included in the feedback packet, 832 needs to be adjusted by the time interval between receiving the data 833 packet and sending the report to allow the sender to correctly infer the 834 instantaneous RTT (i.e., that time interval has to be added to the 835 timestamp value). 837 The timer is canceled if a data packet with a lower suppression rate 838 than the receiver's calculated rate and a higher or equal maximum RTT 839 than the receiver's RTT is received. Likewise, a data packet indicating 840 the beginning of a new feedback round cancels all feedback for older 841 rounds. In case of ASM, the timer is also canceled if a feedback packet 842 from another non-CLR receiver reporting a lower rate is received. 844 The feedback suppression process is complicated by the fact that the 845 calculated rates of the receivers will change during a feedback round. 846 If the calculated rates decrease rapidly for all receivers, feedback 847 suppression can no longer prevent a feedback implosion since earlier 848 feedback will always report a higher rate than current feedback. To make 849 the feedback suppression mechanism robust in the face of changing rates, 850 it is necessary to introduce X_fbr, the calculated rate of a receiver at 851 the beginning of a feedback round. A receiver needs to suppress its 852 feedback not only when the suppression rate is less than the receiver's 853 current calculated rate but also in the case that the suppression rate 854 falls below X_fbr. 856 When the maximum RTT changes significantly during one feedback round, it 857 is necessary to reschedule the feedback timer in proportion to the 858 change. 860 t = t * R_max / R_max' 862 where R_max is the new maximum RTT and R_max' is the previous maximum 863 RTT. The same considerations hold, when the last data packets were 864 received more than a time interval of R_max ago. In this case, it is 865 necessary to add the difference of the inter-packet gap and the maximum 866 RTT to the feedback time to prevent a feedback implosion (e.g., in case 867 the sender crashed). 869 t = t + max(t_now - tr_i - R_max, 0) 871 where tr_i is the time when the last data packet arrived at the 872 receiver. 874 More details on the characteristics of the feedback suppression 875 mechanism can be found in [10] and [11]. 877 5. Calculation of the Loss Event Rate 879 Obtaining an accurate and stable measurement of the loss event rate is 880 of primary importance for TFMCC. Loss rate measurement is performed at 881 the receiver, based on the detection of lost or marked packets from the 882 sequence numbers of arriving packets. 884 5.1. Detection of Lost or Marked Packets 886 TFMCC assumes that all packets contain a sequence number that is 887 incremented by one for each packet that is sent. For the purposes of 888 this specification, we require that if a lost packet is retransmitted, 889 the retransmission is given a new sequence number that is the latest in 890 the transmission sequence, and not the same sequence number as the 891 packet that was lost. If a transport protocol has the requirement that 892 it must retransmit with the original sequence number, then the transport 893 protocol designer must figure out how to distinguish delayed from 894 retransmitted packets and how to detect lost retransmissions. 896 The receivers each maintain a data structure that keeps track of which 897 packets have arrived and which are missing. For the purposes of 898 specification, we assume that the data structure consists of a list of 899 packets that have arrived along with the timestamp when each packet was 900 received. In practice this data structure will normally be stored in a 901 more compact representation, but this is implementation-specific. 903 The loss of a packet is detected by the arrival of at least three 904 packets with a higher sequence number than the lost packet. The 905 requirement for three subsequent packets is the same as with TCP, and is 906 to make TFMCC more robust in the presence of reordering. In contrast to 907 TCP, if a packet arrives late (after 3 subsequent packets arrived) at a 908 receiver, the late packet can fill the hole in the reception record, and 909 the receiver can recalculate the loss event rate. Future versions of 910 TFMCC might make the requirement for three subsequent packets adaptive 911 based on experienced packet reordering, but we do not specify such a 912 mechanism here. 914 For an ECN-capable connection, a marked packet is detected as a 915 congestion event as soon as it arrives, without having to wait for the 916 arrival of subsequent packets. 918 5.2. Translation from Loss History to Loss Events 920 TFMCC requires that the loss event rate be robust to several consecutive 921 packets lost where those packets are part of the same loss event. This 922 is similar to TCP, which (typically) only performs one halving of the 923 congestion window during any single RTT. Thus the receivers needs to 924 map the packet loss history into a loss event record, where a loss event 925 is one or more packets lost in a RTT. 927 To determine whether a lost or marked packet should start a new loss 928 event, or be counted as part of an existing loss event, we need to 929 compare the sequence numbers and timestamps of the packets that arrived 930 at the receiver. For a marked packet S_new, its reception time T_new 931 can be noted directly. For a lost packet, we can interpolate to infer 932 the nominal "arrival time". Assume: 934 S_loss is the sequence number of a lost packet. 936 S_before is the sequence number of the last packet to arrive with 937 sequence number before S_loss. 939 S_after is the sequence number of the first packet to arrive with 940 sequence number after S_loss. 942 T_before is the reception time of S_before. 944 T_after is the reception time of S_after. 946 Note that T_before can either be before or after T_after due to 947 reordering. 949 For a lost packet S_loss, we can interpolate its nominal "arrival time" 950 at the receiver from the arrival times of S_before and S_after. Thus 952 T_loss = T_before + ( (T_after - T_before) 953 * (S_loss - S_before)/(S_after - S_before) ); 955 Note that if the sequence space wrapped between S_before and S_after, 956 then the sequence numbers must be modified to take this into account 957 before performing the calculation. If the largest possible sequence 958 number is S_max, and S_before > S_after, then modifying each sequence 959 number S by S' = (S + (S_max + 1)/2) mod (S_max + 1) would normally be 960 sufficient. 962 If the lost packet S_old was determined to have started the previous 963 loss event, and we have just determined that S_new has been lost, then 964 we interpolate the nominal arrival times of S_old and S_new, called 965 T_old and T_new respectively. 967 If T_old + R >= T_new, then S_new is part of the existing loss event. 968 Otherwise S_new is the first packet of a new loss event. 970 5.3. Inter-Loss Event Interval 972 If a loss interval, A, is determined to have started with packet 973 sequence number S_A and the next loss interval, B, started with packet 974 sequence number S_B, then the number of packets in loss interval A is 975 given by (S_B - S_A). 977 5.4. Average Loss Interval 979 To calculate the loss event rate p, we first calculate the average loss 980 interval. This is done using a filter that weights the n most recent 981 loss event intervals in such a way that the measured loss event rate 982 changes smoothly. 984 Weights w_0 to w_(n-1) are calculated as: 986 If (i < n/2) 987 w_i = 1; 988 Else 989 w_i = 1 - (i - (n/2 - 1))/(n/2 + 1); 991 Thus if n=8, the values of w_0 to w_7 are: 993 1.0, 1.0, 1.0, 1.0, 0.8, 0.6, 0.4, 0.2 995 The value n for the number of loss intervals used in calculating the 996 loss event rate determines TFMCC's speed in responding to changes in the 997 level of congestion. As currently specified, TFMCC should not be used 998 for values of n significantly greater than 8, for traffic that might 999 compete in the global Internet with TCP. At the very least, safe 1000 operation with values of n greater than 8 would require a slight change 1001 to TFMCC's mechanisms to include a more severe response to two or more 1002 round-trip times with heavy packet loss. 1004 When calculating the average loss interval we need to decide whether to 1005 include the interval since the most recent packet loss event. We only 1006 do this if it is sufficiently large to increase the average loss 1007 interval. 1009 Thus if the most recent loss intervals are I_0 to I_n, with I_0 being 1010 the interval since the most recent loss event, then we calculate the 1011 average loss interval I_mean as: 1013 I_tot0 = 0; 1014 I_tot1 = 0; 1015 W_tot = 0; 1016 for (i = 0 to n-1) { 1017 I_tot0 = I_tot0 + (I_i * w_i); 1018 W_tot = W_tot + w_i; 1019 } 1020 for (i = 1 to n) { 1021 I_tot1 = I_tot1 + (I_i * w_(i-1)); 1022 } 1023 I_tot = max(I_tot0, I_tot1); 1024 I_mean = I_tot/W_tot; 1026 The loss event rate, p is simply: 1028 p = 1 / I_mean; 1030 5.5. History Discounting 1032 As described in Section 5.4, the most recent loss interval is only 1033 assigned 4/(3*n) of the total weight in calculating the average loss 1034 interval, regardless of the size of the most recent loss interval. This 1035 section describes an optional history discounting mechanism that allows 1036 the TFMCC receivers to adjust the weights, concentrating more of the 1037 relative weight on the most recent loss interval, when the most recent 1038 loss interval is more than twice as large as the computed average loss 1039 interval. 1041 To carry out history discounting, we associate a discount factor DF_i 1042 with each loss interval L_i, where each discount factor is a floating 1043 point number. The discount array maintains the cumulative history of 1044 discounting for each loss interval. At the beginning, the values of 1045 DF_i in the discount array are initialized to 1: 1047 for (i = 1 to n) { 1048 DF_i = 1; 1049 } 1051 History discounting also uses a general discount factor DF, also a 1052 floating point number, that is also initialized to 1. First we show how 1053 the discount factors are used in calculating the average loss interval, 1054 and then we describe later in this section how the discount factors are 1055 modified over time. 1057 As described in Section 5.4 the average loss interval is calculated 1058 using the n previous loss intervals I_1, ..., I_n, and the interval I_0 1059 that represents the number of packets received since the last loss 1060 event. The computation of the average loss interval using the discount 1061 factors is a simple modification of the procedure in Section 5.4, as 1062 follows: 1064 I_tot0 = I_0 * w_0 1065 I_tot1 = 0; 1066 W_tot0 = w_0 1067 W_tot1 = 0; 1068 for (i = 1 to n-1) { 1069 I_tot0 = I_tot0 + (I_i * w_i * DF_i * DF); 1070 W_tot0 = W_tot0 + w_i * DF_i * DF; 1071 } 1072 for (i = 1 to n) { 1073 I_tot1 = I_tot1 + (I_i * w_(i-1) * DF_i); 1074 W_tot1 = W_tot1 + w_(i-1) * DF_i; 1075 } 1076 p = min(W_tot0/I_tot0, W_tot1/I_tot1); 1078 The general discounting factor, DF is updated on every packet arrival as 1079 follows. First, a receiver computes the weighted average I_mean of the 1080 loss intervals I_1, ..., I_n: 1082 I_tot = 0; 1083 W_tot = 0; 1084 for (i = 1 to n) { 1085 W_tot = w_(i-1) * DF_i; 1086 I_tot = I_tot + (I_i * w_(i-1) * DF_i); 1087 } 1088 I_mean = I_tot / W_tot; 1090 This weighted average I_mean is compared to I_0, the number of packets 1091 received since the last loss event. If I_0 is greater than twice 1092 I_mean, then the new loss interval is considerably larger than the old 1093 ones, and the general discount factor DF is updated to decrease the 1094 relative weight on the older intervals, as follows: 1096 if (I_0 > 2 * I_mean) { 1097 DF = 2 * I_mean/I_0; 1098 if (DF < THRESHOLD) 1099 DF = THRESHOLD; 1100 } else 1101 DF = 1; 1103 A nonzero value for THRESHOLD ensures that older loss intervals from an 1104 earlier time of high congestion are not discounted entirely. We 1105 recommend a THRESHOLD of 0.5. Note that with each new packet arrival, 1106 I_0 will increase further, and the discount factor DF will be updated. 1108 When a new loss event occurs, the current interval shifts from I_0 to 1109 I_1, loss interval I_i shifts to interval I_(i+1), and the loss interval 1110 I_n is forgotten. The previous discount factor DF has to be 1111 incorporated into the discount array. Because DF_i carries the discount 1112 factor associated with loss interval I_i, the DF_i array has to be 1113 shifted as well. This is done as follows: 1115 for (i = 1 to n) { 1116 DF_i = DF * DF_i; 1117 } 1118 for (i = n-1 to 0 step -1) { 1119 DF_(i+1) = DF_i; 1120 } 1121 I_0 = 1; 1122 DF_0 = 1; 1123 DF = 1; 1125 This completes the description of the optional history discounting 1126 mechanism. We emphasize that this is an optional mechanism whose sole 1127 purpose is to allow TFMCC to response somewhat more quickly to the 1128 sudden absence of congestion, as represented by a long current loss 1129 interval. 1131 5.6. Initializing the Loss History after the First Loss Event 1133 The number of packets received before the first loss event usually does 1134 not reflect the current loss event rate. When the first loss event 1135 occurs, a TFMCC receiver assumes that the correct data rate is the rate 1136 at which data was received during the last RTT when the loss occurred. 1137 Instead of initializing the first loss interval to the number of packets 1138 sent until the first loss event, the TFMCC receiver calculates the loss 1139 interval that would be required to produce the receive rate X_recv, and 1140 uses this synthetic loss interval l_0 to seed the loss history 1141 mechanism. 1143 The initial loss interval is calculated by inverting a simplified 1144 version of the TCP Equation (1). 1146 s 1147 X_recv = sqrt(3/2) * ----------------- 1148 R * sqrt(1/l_0) 1150 X_recv * R 1151 ==> l_0 = (---------------)^2 1152 sqrt(3/2) * s 1154 The resulting initial loss interval is too small at higher loss rates 1155 compared to using the more accurate Equation (1), which leads to a more 1156 conservative initial loss event rate. 1158 If a receiver still uses the initial RTT R_max instead of its real RTT, 1159 the initial loss interval is too large in case the initial RTT is higher 1160 than the actual RTT. As a consequence, the receiver will calculate a 1161 too high desired rate when the first RTT measurement R is made and the 1162 initial loss interval is still in the loss history. The receiver has to 1163 adjust l_0 as follows: 1165 l_0 = l_0 * (R/R_max)^2 1167 No action needs to be taken when the first RTT measurement is made after 1168 the initial loss interval left the loss history. 1170 6. Security Considerations 1172 TFMCC is not a transport protocol in its own right, but a congestion 1173 control mechanism that is intended to be used in conjunction with a 1174 transport protocol. Therefore security primarily needs to be considered 1175 in the context of a specific transport protocol and its authentication 1176 mechanisms. 1178 Congestion control mechanisms can potentially be exploited to create 1179 denial of service. This may occur through spoofed feedback. Thus any 1180 transport protocol that uses TFMCC should take care to ensure that 1181 feedback is only accepted from valid receivers of the data. The precise 1182 mechanism to achieve this will however depend on the transport protocol 1183 itself. 1185 Congestion control mechanisms may potentially be manipulated by a greedy 1186 receiver that wishes to receive more than its fair share of network 1187 bandwidth. However, in TFMCC a receiver can only influence the sending 1188 rate if it is the CLR and thus has the lowest calculated rate of all 1189 receivers. If the calculated rate is then manipulated such that it 1190 exceeds the calculated rate of the second to lowest receiver, it will 1191 cease to be CLR. A greedy receiver can only significantly increase then 1192 transmission rate if it is the only participant in the session. If such 1193 scenarios are of concern, possible defenses against such a receiver 1194 would normally include some form of nonce that the receiver must feed 1195 back to the sender to prove receipt. However, the details of such a 1196 nonce would depend on the transport protocol, and in particular on 1197 whether the transport protocol is reliable or unreliable. 1199 It is possible that a receiver sends feedback claiming it has a very low 1200 calculated rate. This will reduce the rate of the multicast session and 1201 might render it useless but obviously cannot hurt the network itself. 1203 We expect that protocols incorporating ECN with TFMCC will also want to 1204 incorporate feedback from the receiver to the sender using the ECN nonce 1205 [WES01]. The ECN nonce is a modification to ECN that protects the 1206 sender from the accidental or malicious concealment of marked packets. 1207 Again, the details of such a nonce would depend on the transport 1208 protocol, and are not addressed in this document. 1210 7. IANA Considerations 1212 There are no IANA actions required for this document. 1214 8. Authors' Addresses 1216 Joerg Widmer 1217 EPFL (Swiss Federal Institute of Technology - Lausanne) 1218 CH-1015 Lausanne, Switzerland 1219 joerg.widmer@epfl.ch 1221 Mark Handley 1222 UCL (University College London) 1223 Gower Street, London WC1E 6BT, UK 1224 m.handley@cs.ucl.ac.uk 1226 9. Acknowledgments 1228 We would like to acknowledge feedback and discussions on equation-based 1229 congestion control with a wide range of people, including members of the 1230 Reliable Multicast Research Group, the Reliable Multicast Transport 1231 Working Group, and the End-to-End Research Group. 1233 10. References 1235 [1] B. Adamson, C. Bormann, M. Handley, and J. Macker, "NACK-Oriented 1236 Reliable Multicast (NORM) Protocol Building Blocks", Internet Draft 1237 draft-ietf-rmt-norm-bb-08.txt, November 2003, work in progress. 1238 Citation for informational purposes only. 1240 [2] S. Floyd, M. Handley, J. Padhye, and J. Widmer, "Equation-Based 1241 Congestion Control for Unicast Applications", Proc ACM SIGCOMM 1242 2000, Stockholm, August 2000 1244 [3] H. W. Holbrook, "A Channel Model for Multicast," Ph.D. 1245 Dissertation, Stanford University, Department of Computer Science, 1246 Stanford, California, August 2001. 1248 [4] J. Padhye, V. Firoiu, D. Towsley, and J. Kurose, "Modeling TCP 1249 Throughput: A Simple Model and its Empirical Validation", Proc ACM 1250 SIGCOMM 1998. 1252 [5] V. Paxson and M. Allman, "Computing TCP's Retransmission Timer", RFC 1253 2988, November 2000. 1255 [6] K. Ramakrishnan and S. Floyd, "A Proposal to add Explicit Congestion 1256 Notification (ECN) to IP", RFC 2481, January 1999. 1258 [7] L. Rizzo, "pgmcc: a TCP-friendly single-rate multicast congestion 1259 control scheme", Proc ACM SIGCOMM 2000, Stockholm, August 2000 1261 [8] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: A 1262 Transport Protocol for Real-Time Applications", RFC 1889, January 1263 1996. 1265 [9] D. Wetherall, D. Ely, and N. Spring, "Robust ECN Signaling with 1266 Nonces", Internet Draft draft-ietf-tsvwg-tcp-nonce-04.txt, October 1267 2002, work in progress. Citation for informational purposes only. 1269 [10] J. Widmer and T. Fuhrmann, "Extremum Feedback for Very Large 1270 Multicast Groups", Proc NGC 2001, London, November 2001. 1272 [11] J. Widmer and M. Handley, "Extending Equation-Based Congestion 1273 Control to Multicast Applications", Proc ACM SIGCOMM 2001, San 1274 Diego, August 2001 1276 11. Copyright Notice 1278 Copyright (C) The Internet Society (2004). This document is subject to 1279 the rights, licenses and restrictions contained in BCP 78, and except as 1280 set forth therein, the authors retain all their rights. 1282 This document and the information contained herein are provided on an 1283 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR 1284 IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET 1285 ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, 1286 INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE 1287 INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 1288 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.