2 RTP Media Congestion Avoidance S. Islam 3 Techniques (rmcat) M. Welzl 4 Internet-Draft S. Gjessing 5 Intended status: Experimental University of Oslo 6 Expires: June 10, 2017 December 7, 2016 8 Coupled congestion control for RTP media 9 draft-ietf-rmcat-coupled-cc-05 11 Abstract 13 When multiple congestion controlled RTP sessions traverse the same 14 network bottleneck, combining their controls can improve the total 15 on-the-wire behavior in terms of delay, loss and fairness. This 16 document describes such a method for flows that have the same sender, 17 in a way that is as flexible and simple as possible while minimizing 18 the amount of changes needed to existing RTP applications. It 19 specifies how to apply the method for the NADA congestion control 20 algorithm, and provides suggestions on how to apply it to other 21 congestion control algorithms. 23 Status of this Memo 25 This Internet-Draft is submitted in full conformance with the 26 provisions of BCP 78 and BCP 79. 28 Internet-Drafts are working documents of the Internet Engineering 29 Task Force (IETF). Note that other groups may also distribute 30 working documents as Internet-Drafts. The list of current Internet- 31 Drafts is at http://datatracker.ietf.org/drafts/current/. 33 Internet-Drafts are draft documents valid for a maximum of six months 34 and may be updated, replaced, or obsoleted by other documents at any 35 time. It is inappropriate to use Internet-Drafts as reference 36 material or to cite them other than as "work in progress." 38 This Internet-Draft will expire on June 10, 2017. 40 Copyright Notice 42 Copyright (c) 2016 IETF Trust and the persons identified as the 43 document authors. All rights reserved.
45 This document is subject to BCP 78 and the IETF Trust's Legal 46 Provisions Relating to IETF Documents 47 (http://trustee.ietf.org/license-info) in effect on the date of 48 publication of this document. Please review these documents 49 carefully, as they describe your rights and restrictions with respect 50 to this document. Code Components extracted from this document must 51 include Simplified BSD License text as described in Section 4.e of 52 the Trust Legal Provisions and are provided without warranty as 53 described in the Simplified BSD License. 55 Table of Contents 57 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 58 2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 3 59 3. Limitations . . . . . . . . . . . . . . . . . . . . . . . . . 4 60 4. Architectural overview . . . . . . . . . . . . . . . . . . . . 4 61 5. Roles . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 62 5.1. SBD . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 63 5.2. FSE . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 64 5.3. Flows . . . . . . . . . . . . . . . . . . . . . . . . . . 8 65 5.3.1. Example algorithm 1 - Active FSE . . . . . . . . . . . 8 66 5.3.2. Example algorithm 2 - Conservative Active FSE . . . . 9 67 6. Application . . . . . . . . . . . . . . . . . . . . . . . . . 10 68 6.1. NADA . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 69 6.2. General recommendations . . . . . . . . . . . . . . . . . 11 70 7. Expected feedback from experiments . . . . . . . . . . . . . . 12 71 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 12 72 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 12 73 10. Security Considerations . . . . . . . . . . . . . . . . . . . 12 74 11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 13 75 11.1. Normative References . . . . . . . . . . . . . . . . . . . 13 76 11.2. Informative References . . . . . . . . . . . . . . . . . . 13 77 Appendix A. Application to GCC . . . . . . . . . . . . . . . . . 15 78 Appendix B. Scheduling . . . . . . . . . . . . . . . . . . . . . 15 79 Appendix C. Example algorithm - Passive FSE . . . . . . . . . . . 15 80 C.1. Example operation (passive) . . . . . . . . . . . . . . . 18 81 Appendix D. Change log . . . . . . . . . . . . . . . . . . . . . 22 82 D.1. draft-welzl-rmcat-coupled-cc . . . . . . . . . . . . . . . 22 83 D.1.1. Changes from -00 to -01 . . . . . . . . . . . . . . . 22 84 D.1.2. Changes from -01 to -02 . . . . . . . . . . . . . . . 22 85 D.1.3. Changes from -02 to -03 . . . . . . . . . . . . . . . 23 86 D.1.4. Changes from -03 to -04 . . . . . . . . . . . . . . . 23 87 D.1.5. Changes from -04 to -05 . . . . . . . . . . . . . . . 23 88 D.2. draft-ietf-rmcat-coupled-cc . . . . . . . . . . . . . . . 23 89 D.2.1. Changes from draft-welzl-rmcat-coupled-cc-05 . . . . . 23 90 D.2.2. Changes from -00 to -01 . . . . . . . . . . . . . . . 23 91 D.2.3. Changes from -01 to -02 . . . . . . . . . . . . . . . 23 92 D.2.4. Changes from -02 to -03 . . . . . . . . . . . . . . . 24 93 D.2.5. Changes from -03 to -04 . . . . . . . . . . . . . . . 24 94 D.2.6. Changes from -04 to -05 . . . . . . . . . . . . . . . 24 95 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 24 97 1. 
Introduction 99 When there is enough data to send, a congestion controller must 100 increase its sending rate until the path's capacity has been reached; 101 depending on the controller, sometimes the rate is increased further, 102 until packets are ECN-marked or dropped. This process inevitably 103 creates undesirable queuing delay when multiple congestion controlled 104 connections traverse the same network bottleneck. 106 The Congestion Manager (CM) [RFC3124] couples flows by providing a 107 single congestion controller. It is hard to implement because it 108 requires an additional congestion controller and removes all per- 109 connection congestion control functionality, which is quite a 110 significant change to existing RTP based applications. This document 111 presents a method to combine the behavior of congestion control 112 mechanisms that is easier to implement than the Congestion Manager 113 [RFC3124] and also requires less significant changes to existing RTP 114 based applications. It attempts to roughly approximate the CM 115 behavior by sharing information between existing congestion 116 controllers. It is able to honor user-specified priorities, which is 117 required by rtcweb [RFC7478]. 119 2. Definitions 121 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 122 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 123 document are to be interpreted as described in RFC 2119 [RFC2119]. 125 Available Bandwidth: 126 The available bandwidth is the nominal link capacity minus the 127 amount of traffic that traversed the link during a certain time 128 interval, divided by that time interval. 130 Bottleneck: 131 The first link with the smallest available bandwidth along the 132 path between a sender and receiver. 134 Flow: 135 A flow is the entity that congestion control is operating on. 136 It could, for example, be a transport layer connection, an RTP 137 session, or a subsession that is multiplexed onto a single RTP 138 session together with other subsessions. 140 Flow Group Identifier (FGI): 141 A unique identifier for each subset of flows that is limited by 142 a common bottleneck. 144 Flow State Exchange (FSE): 145 The entity that maintains information that is exchanged between 146 flows. 148 Flow Group (FG): 149 A group of flows having the same FGI. 151 Shared Bottleneck Detection (SBD): 152 The entity that determines which flows traverse the same 153 bottleneck in the network, or the process of doing so. 155 3. Limitations 157 Sender-side only: 158 Coupled congestion control as described here only operates 159 inside a single host on the sender side. This is because, 160 irrespective of where the major decisions for congestion 161 control are taken, the sender of a flow needs to eventually 162 decide on the transmission rate. Additionally, the necessary 163 information about how much data an application can currently 164 send on a flow is often only available at the sender side, 165 making the sender an obvious choice for placement of the 166 elements and mechanisms described here. 168 Shared bottlenecks do not change quickly: 169 As per the definition above, a bottleneck depends on cross 170 traffic, and since such traffic can heavily fluctuate, 171 bottlenecks can change at a high frequency (e.g., there can be 172 oscillation between two or more links). This means that, when 173 flows are partially routed along different paths, they may 174 quickly change between sharing and not sharing a bottleneck. 
175 For simplicity, here it is assumed that a shared bottleneck is 176 valid for a time interval that is significantly longer than the 177 interval at which congestion controllers operate. Note that, 178 for the only SBD mechanism defined in this document 179 (multiplexing on the same five-tuple), the notion of a shared 180 bottleneck stays correct even in the presence of fast traffic 181 fluctuations: since all flows that are assumed to share a 182 bottleneck are routed in the same way, if the bottleneck 183 changes, it will still be shared. 185 4. Architectural overview 187 Figure 1 shows the elements of the architecture for coupled 188 congestion control: the Flow State Exchange (FSE), Shared Bottleneck 189 Detection (SBD) and Flows. The FSE is a storage element that can be 190 implemented in two ways: active and passive. In the active version, 191 it initiates communication with flows and SBD. However, in the 192 passive version, it does not actively initiate communication with 193 flows and SBD; its only active role is internal state maintenance 194 (e.g., an implementation could use soft state to remove a flow's data 195 after long periods of inactivity). Every time a flow's congestion 196 control mechanism would normally update its sending rate, the flow 197 instead updates information in the FSE and performs a query on the 198 FSE, leading to a sending rate that can be different from what the 199 congestion controller originally determined. Using information 200 about/from the currently active flows, SBD updates the FSE with the 201 correct Flow Group Identifiers (FGIs). This document describes both 202 the active and the passive version; however, the passive version is 203 described only in the appendix, as it is highly experimental. Figure 2 shows the 204 interaction between flows and the FSE, using the variable names 205 defined in Section 5.2. 207 ------- <--- Flow 1 208 | FSE | <--- Flow 2 .. 209 ------- <--- .. Flow N 210 ^ 211 | | 212 ------- | 213 | SBD | <-------| 214 ------- 216 Figure 1: Coupled congestion control architecture 218 Flow#1(cc) FSE Flow#2(cc) 219 ---------- --- ---------- 220 #1 JOIN ----register--> REGISTER 222 REGISTER <--register-- JOIN #1 224 #2 CC_R ----UPDATE----> UPDATE (in) 226 #3 NEW RATE <---FSE_R------ UPDATE (out) --FSE_R----> #3 NEW RATE 228 Figure 2: Flow-FSE interaction 230 Since everything shown in Figure 1 is assumed to operate on a single 231 host (the sender) only, this document only describes aspects that 232 have an influence on the resulting on-the-wire behavior. It does 233 not, for instance, define how many bits must be used to represent 234 FGIs, or in which way the entities communicate. Implementations can 235 take various forms: for instance, all the elements in the figure 236 could be implemented within a single application, thereby operating 237 on flows generated by that application only. Another alternative 238 could be to implement both the FSE and SBD together in a separate 239 process with which different applications communicate via some form 240 of Inter-Process Communication (IPC). Such an implementation would 241 extend the scope to flows generated by multiple applications. The 242 FSE and SBD could also be included in the Operating System kernel. 244 5. Roles 246 This section gives an overview of the roles of the elements of 247 coupled congestion control, and provides an example of how coupled 248 congestion control can operate. 250 5.1.
SBD 252 SBD uses knowledge about the flows to determine which flows belong in 253 the same Flow Group (FG), and assigns FGIs accordingly. This 254 knowledge can be derived in three basic ways: 256 1. From multiplexing: it can be based on the simple assumption that 257 packets sharing the same five-tuple (IP source and destination 258 address, protocol, and transport layer port number pair) and 259 having the same Differentiated Services Code Point (DSCP) in the 260 IP header are typically treated in the same way along the path. 261 This multiplexing-based method is the only one specified in this document: SBD 262 MAY consider all flows that use the same five-tuple and DSCP to 263 belong to the same FG. This classification applies to certain 264 tunnels, or RTP flows that are multiplexed over one transport 265 (cf. [transport-multiplex]). Such multiplexing is also a 266 recommended usage of RTP in rtcweb [rtcweb-rtp-usage]. 268 2. Via configuration: e.g. by assuming that a common wireless uplink 269 is also a shared bottleneck. 271 3. From measurements: e.g. by considering correlations among 272 measured delay and loss as an indication of a shared bottleneck. 274 The methods above have some essential trade-offs: e.g., multiplexing 275 is a completely reliable measure; however, it is limited in scope to 276 two end points (i.e., it cannot be applied to couple congestion 277 controllers of one sender talking to multiple receivers). A 278 measurement-based SBD mechanism is described in [I-D.ietf-rmcat-sbd]. 280 Measurements can never be 100% reliable, in particular because they 281 are based on the past, whereas applying coupled congestion control 282 means making an assumption about the future; it is therefore recommended 283 to implement cautionary measures, e.g. by disabling coupled 284 congestion control if enabling it causes a significant increase in 285 delay and/or packet loss. Measurements also take time, which entails 286 a certain delay for turning on coupling (refer to 287 [I-D.ietf-rmcat-sbd] for details). Using system configuration to 288 decide about shared bottlenecks can be more efficient (faster to 289 obtain) than using measurements, but it relies on assumptions about 290 the network environment. 292 5.2. FSE 294 The FSE contains a list of all flows that have registered with it. 295 For each flow, it stores the following: 297 o a unique flow number to identify the flow 299 o the FGI of the FG that it belongs to (based on the definitions in 300 this document, a flow has only one bottleneck, and can therefore 301 be in only one FG) 303 o a priority P, which here is assumed to be represented as a 304 floating point number in the range from 0.1 (unimportant) to 1 305 (very important). 307 o the rate used by the flow in bits per second, FSE_R. 309 Note that the priority does not need to be a floating point value and 310 its value range does not matter for this algorithm: the algorithm 311 only uses a flow's priority relative to the sum of all priority 312 values. Priorities can therefore be mapped to the "very-low", "low", 313 "medium" or "high" priority levels described in 314 [I-D.ietf-rtcweb-transports] using the values 1, 2, 4 and 8, 315 respectively. 317 In the FSE, each FG contains one static variable S_CR, which is the 318 sum of the calculated rates of all flows in the same FG. This value 319 is used to calculate the sending rate. 321 The information listed here is enough to implement the sample flow 322 algorithm given below.
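As a non-normative illustration, the FSE state described above could be held as in the following sketch; it uses Python, and all names in it (FlowState, FlowGroup, FSE, register, deregister) are assumptions of this example only, not part of the specification:

   # Illustrative sketch only: one possible in-memory layout of the
   # per-flow and per-FG state described in Section 5.2.
   from dataclasses import dataclass, field
   from typing import Dict

   @dataclass
   class FlowState:
       fgi: int          # Flow Group Identifier assigned by SBD
       priority: float   # P, e.g. 0.1 (unimportant) to 1 (very important)
       fse_r: float      # FSE_R, the rate calculated by the FSE, in bits per second

   @dataclass
   class FlowGroup:
       s_cr: float = 0.0                                          # S_CR of this FG
       flows: Dict[int, FlowState] = field(default_factory=dict)  # keyed by flow number

   class FSE:
       """Minimal storage element; real implementations could store more."""
       def __init__(self) -> None:
           self.groups: Dict[int, FlowGroup] = {}

       def register(self, flow_id: int, fgi: int, priority: float,
                    initial_rate: float) -> None:
           # A new flow's FSE_R starts as its congestion controller's
           # initial rate and is added to the FG's S_CR (step (1) below).
           group = self.groups.setdefault(fgi, FlowGroup())
           group.flows[flow_id] = FlowState(fgi, priority, initial_rate)
           group.s_cr += initial_rate

       def deregister(self, flow_id: int, fgi: int) -> None:
           # A stopped or paused flow is removed from the list (step (2) below).
           self.groups[fgi].flows.pop(flow_id, None)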
FSE implementations could easily be extended 323 to store, e.g., a flow's current sending rate for statistics 324 gathering or future potential optimizations. 326 5.3. Flows 328 Flows register themselves with SBD and FSE when they start, 329 deregister from the FSE when they stop, and carry out an UPDATE 330 function call every time their congestion controller calculates a new 331 sending rate. Via UPDATE, they provide the newly calculated rate and 332 optionally (if the algorithm supports it) the desired rate. The 333 desired rate is less than the calculated rate in case of application- 334 limited flows; otherwise, it is the same as the calculated rate. 336 Below, two example algorithms are described. While other algorithms 337 could be used instead, the same algorithm must be applied to all 338 flows. Names of variables used in the algorithms are explained 339 below. 341 o CC_R - The rate received from a flow's congestion controller when 342 it calls UPDATE. 344 o FSE_R - The rate calculated by the FSE for a flow. 346 o S_CR - The sum of the calculated rates of all flows in the same 347 FG; this value is used to calculate the sending rate. 349 o FG - A group of flows having the same FGI, and hence sharing the 350 same bottleneck. 352 o P - The priority of a flow, which is received from the flow's 353 congestion controller; the FSE uses this variable when calculating 354 FSE_R. 356 o S_P - The sum of all the priorities. 358 5.3.1. Example algorithm 1 - Active FSE 360 This algorithm was designed to be the simplest possible method to 361 assign rates according to the priorities of flows. Simulation 362 results in [fse] indicate, however, that it does not significantly 363 reduce queuing delay and packet loss. 365 (1) When a flow f starts, it registers itself with SBD and the FSE. 366 FSE_R is initialized with the congestion controller's initial 367 rate. SBD will assign the correct FGI. When a flow is assigned 368 an FGI, it adds its FSE_R to S_CR. 370 (2) When a flow f stops or pauses, its entry is removed from the 371 list. 373 (3) Every time the congestion controller of the flow f determines a 374 new sending rate CC_R, the flow calls UPDATE, which carries out 375 the tasks listed below to derive the new sending rates for all 376 the flows in the FG. A flow's UPDATE function uses a local 377 (i.e. per-flow) temporary variable S_P, which is the sum of all 378 the priorities. 380 (a) It updates S_CR. 382 S_CR = S_CR + CC_R - FSE_R(f) 384 (b) It calculates the sum of all the priorities, S_P. 386 S_P = 0 387 for all flows i in FG do 388 S_P = S_P + P(i) 389 end for 391 (c) It calculates the sending rates for all the flows in an FG 392 and distributes them. 394 for all flows i in FG do 395 FSE_R(i) = (P(i)*S_CR)/S_P 396 send FSE_R(i) to the flow i 397 end for
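Expressed in the same non-normative Python style as the storage sketch in Section 5.2 (the callback send_rate is an assumption of this example and stands for whatever mechanism "sends" FSE_R(i) to flow i), the Active FSE's UPDATE could look like this:

   # Non-normative sketch of the Active FSE UPDATE, steps 3(a)-(c) above.
   def active_fse_update(group, flow_id, cc_r, send_rate):
       # group: the FlowGroup of the calling flow, flow_id: the flow f,
       # cc_r: CC_R just calculated by f's congestion controller,
       # send_rate(i, rate): callback delivering FSE_R(i) to flow i.
       f = group.flows[flow_id]

       # (a) Update S_CR with the difference between CC_R and FSE_R(f).
       group.s_cr += cc_r - f.fse_r

       # (b) Calculate the sum of all the priorities, S_P.
       s_p = sum(flow.priority for flow in group.flows.values())

       # (c) Distribute S_CR to all flows in proportion to their priority.
       for i, flow in group.flows.items():
           flow.fse_r = (flow.priority * group.s_cr) / s_p
           send_rate(i, flow.fse_r)

For example, with two flows of priority 1 and 0.5 and S_CR = 9 Mbit/s after step (a), the flows would be assigned 6 and 3 Mbit/s, respectively.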
399 5.3.2. Example algorithm 2 - Conservative Active FSE 401 This algorithm extends algorithm 1 to conservatively emulate the 402 behavior of a single flow by proportionally reducing the aggregate 403 rate on congestion. Simulation results in [fse] indicate that it 404 can significantly reduce queuing delay and packet loss. 406 (1) When a flow f starts, it registers itself with SBD and the FSE. 407 FSE_R is initialized with the congestion controller's initial 408 rate. SBD will assign the correct FGI. When a flow is assigned 409 an FGI, it adds its FSE_R to S_CR. 411 (2) When a flow f stops or pauses, its entry is removed from the 412 list. 414 (3) Every time the congestion controller of the flow f determines a 415 new sending rate CC_R, the flow calls UPDATE, which carries out 416 the tasks listed below to derive the new sending rates for all 417 the flows in the FG. A flow's UPDATE function uses a local 418 (i.e. per-flow) temporary variable S_P, which is the sum of all 419 the priorities, and a local variable DELTA, which is used to 420 calculate the difference between CC_R and the previously stored 421 FSE_R. To prevent flows from either ignoring congestion or 422 overreacting, a timer keeps them from changing their rates 423 immediately after the common rate reduction that follows a 424 congestion event. This timer is set to 2 RTTs of the flow that 425 experienced congestion because it is assumed that a congestion 426 event can persist for up to one RTT of that flow, with another 427 RTT added to compensate for fluctuations in the measured RTT 428 value. 430 (a) It updates S_CR based on DELTA. 432 if Timer has expired or not set then 433 DELTA = CC_R - FSE_R(f) 434 if DELTA < 0 then // Reduce S_CR proportionally 435 S_CR = S_CR * CC_R / FSE_R(f) 436 Set Timer for 2 RTTs 437 else 438 S_CR = S_CR + DELTA 439 end if 440 end if 442 (b) It calculates the sum of all the priorities, S_P. 444 S_P = 0 445 for all flows i in FG do 446 S_P = S_P + P(i) 447 end for 449 (c) It calculates the sending rates for all the flows in an FG 450 and distributes them. 452 for all flows i in FG do 453 FSE_R(i) = (P(i)*S_CR)/S_P 454 send FSE_R(i) to the flow i 455 end for
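Continuing the non-normative Python sketch from Section 5.3.1, only step (a) changes in the conservative variant; the expiry-timestamp handling and the rtt parameter below are assumptions of this example (one possible way to realize "Set Timer for 2 RTTs"), not a normative mechanism:

   import time

   # Non-normative sketch of the Conservative Active FSE UPDATE.
   def conservative_fse_update(group, flow_id, cc_r, rtt, send_rate):
       # rtt: the RTT of the flow f that calls UPDATE.
       f = group.flows[flow_id]

       # (a) Update S_CR based on DELTA, unless the timer is still running.
       if time.monotonic() >= getattr(group, "timer_expiry", 0.0):
           delta = cc_r - f.fse_r
           if delta < 0:
               # Congestion: reduce the aggregate S_CR in the same
               # proportion as the flow's own controller reduced its rate.
               group.s_cr = group.s_cr * cc_r / f.fse_r
               group.timer_expiry = time.monotonic() + 2 * rtt
           else:
               group.s_cr += delta

       # (b) and (c) are identical to the Active FSE.
       s_p = sum(flow.priority for flow in group.flows.values())
       for i, flow in group.flows.items():
           flow.fse_r = (flow.priority * group.s_cr) / s_p
           send_rate(i, flow.fse_r)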
457 6. Application 459 This section specifies how the FSE can be applied to specific 460 congestion control mechanisms and makes general recommendations that 461 facilitate applying the FSE to future congestion controls. 463 6.1. NADA 465 Network-Assisted Dynamic Adaptation (NADA) [I-D.ietf-rmcat-nada] is a 466 congestion control scheme for rtcweb. It calculates a reference rate 467 r_ref upon receiving an acknowledgment, and then, based on the 468 reference rate, it calculates a video target rate r_vin and a sending 469 rate for the flows, r_send. 471 When applying the FSE to NADA, the UPDATE function call described in 472 Section 5.3 gives the FSE NADA's reference rate r_ref. The 473 recommended algorithm for NADA is the Active FSE in Section 5.3.1. 474 In step 3 (c), when the FSE_R(i) is "sent" to the flow i, this means 475 updating r_ref (and in turn r_vin and r_send) of flow i with the value of 476 FSE_R(i). 478 6.2. General recommendations 480 This section provides general advice for applying the FSE to 481 congestion control mechanisms. 483 Receiver-side calculations: 484 When receiver-side calculations make assumptions about the rate 485 of the sender, the calculations need to be synchronized or the 486 receiver needs to be updated accordingly. This applies to TFRC 487 [RFC5348], for example, where simulations showed somewhat less 488 favorable results when using the FSE without a receiver-side 489 change [fse]. 491 Stateful algorithms: 492 When a congestion control algorithm is stateful (e.g., TCP, 493 with Slow Start, Congestion Avoidance and Fast Recovery), these 494 states should be carefully considered such that the overall 495 state of the aggregate flow is correct. This may require 496 sharing more information in the UPDATE call. 498 Rate jumps: 499 The FSE-based coupling algorithms can let a flow quickly 500 increase its rate to its fair share, e.g. when a new flow joins 501 or after a quiescent period. In case of window-based 502 congestion controls, this may produce a burst which should be 503 mitigated in some way. An example of how this could be done 504 without using a timer is presented in [anrw2016], using TCP as 505 an example. 507 7. Expected feedback from experiments 509 The algorithm described in this memo has so far been evaluated using 510 simulations covering all the tests for more than one flow from 511 [I-D.ietf-rmcat-eval-test] (see [IETF-93], [IETF-94]). Experiments 512 should confirm these results using at least the NADA congestion 513 control algorithm with real-life code (e.g., browsers communicating 514 over an emulated network covering the conditions in 515 [I-D.ietf-rmcat-eval-test]). The tests with real-life code should be 516 repeated afterwards in real network environments and monitored. 517 Experiments should investigate cases where the media coder's output 518 rate is below the rate that is calculated by the coupling algorithm 519 (FSE_R in algorithms 1 and 2, Section 5.3). Implementers and testers 520 are invited to document their findings in an Internet draft. 522 8. Acknowledgements 524 This document has benefitted from discussions with and feedback from 525 Andreas Petlund, Anna Brunstrom, David Hayes, David Ros (who also 526 gave the FSE its name), Ingemar Johansson, Karen Nielsen, Kristian 527 Hiorth, Mirja Kuehlewind, Martin Stiemerling, Varun Singh, Xiaoqing 528 Zhu, and Zaheduzzaman Sarker. The authors would like to especially 529 thank Xiaoqing Zhu and Stefan Holmer for helping with NADA and GCC. 531 This work was partially funded by the European Community under its 532 Seventh Framework Programme through the Reducing Internet Transport 533 Latency (RITE) project (ICT-317700). 535 9. IANA Considerations 537 This memo includes no request to IANA. 539 10. Security Considerations 541 In scenarios where the architecture described in this document is 542 applied across applications, various cheating possibilities arise: 543 e.g., reporting wrong values for the calculated rate, the desired 544 rate, or the priority of a flow. In the worst case, such cheating 545 could either prevent other flows from sending or make them send at a 546 rate that is unreasonably large. The end result would be unfair 547 behavior at the network bottleneck, akin to what could be achieved 548 with any UDP-based application. Hence, since this is no worse than 549 UDP in general, there seems to be no significant harm in using this 550 in the absence of UDP rate limiters. 552 In the case of a single-user system, it should also be in the 553 interest of any application programmer to give the user the best 554 possible experience by using reasonable flow priorities or even 555 letting the user choose them. In a multi-user system, this incentive 556 may not exist, and one could imagine the worst case of an "arms 557 race" situation, where applications end up setting their priorities 558 to the maximum value. If all applications do this, the end result is 559 a fair allocation in which the priority mechanism is implicitly 560 eliminated, and no major harm is done. 562 11. References 564 11.1. Normative References 566 [I-D.ietf-rmcat-nada] 567 Zhu, X., Pan, R., Ramalho, M., Cruz, S., Jones, P., Fu, 568 J., D'Aronco, S., and C. Ganzhorn, "NADA: A Unified 569 Congestion Control Scheme for Real-Time Media", 570 draft-ietf-rmcat-nada-03 (work in progress), 571 September 2016.
573 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 574 Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/ 575 RFC2119, March 1997, 576 <http://www.rfc-editor.org/info/rfc2119>. 578 [RFC3124] Balakrishnan, H. and S. Seshan, "The Congestion Manager", 579 RFC 3124, DOI 10.17487/RFC3124, June 2001, 580 <http://www.rfc-editor.org/info/rfc3124>. 582 [RFC5348] Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP 583 Friendly Rate Control (TFRC): Protocol Specification", 584 RFC 5348, DOI 10.17487/RFC5348, September 2008, 585 <http://www.rfc-editor.org/info/rfc5348>. 587 11.2. Informative References 589 [I-D.ietf-rmcat-eval-test] 590 Sarker, Z., Singh, V., Zhu, X., and M. Ramalho, "Test 591 Cases for Evaluating RMCAT Proposals", 592 draft-ietf-rmcat-eval-test-04 (work in progress), 593 October 2016. 595 [I-D.ietf-rmcat-gcc] 596 Holmer, S., Lundin, H., Carlucci, G., Cicco, L., and S. 597 Mascolo, "A Google Congestion Control Algorithm for Real- 598 Time Communication", draft-ietf-rmcat-gcc-02 (work in 599 progress), July 2016. 601 [I-D.ietf-rmcat-sbd] 602 Hayes, D., Ferlin, S., Welzl, M., and K. Hiorth, "Shared 603 Bottleneck Detection for Coupled Congestion Control for 604 RTP Media", draft-ietf-rmcat-sbd-04 (work in progress), 605 March 2016. 607 [I-D.ietf-rtcweb-transports] 608 Alvestrand, H., "Transports for WebRTC", 609 draft-ietf-rtcweb-transports-11 (work in progress), 610 January 2016. 612 [IETF-93] Islam, S., Welzl, M., and S. Gjessing, "Updates on Coupled 613 Congestion Control for RTP Media", July 2015, 614 . 616 [IETF-94] Islam, S., Welzl, M., and S. Gjessing, "Updates on Coupled 617 Congestion Control for RTP Media", November 2015, 618 . 620 [RFC7478] Holmberg, C., Hakansson, S., and G. Eriksson, "Web Real- 621 Time Communication Use Cases and Requirements", RFC 7478, 622 DOI 10.17487/RFC7478, March 2015, 623 <http://www.rfc-editor.org/info/rfc7478>. 625 [anrw2016] 626 Islam, S. and M. Welzl, "Start Me Up: Determining and 627 Sharing TCP's Initial Congestion Window", ACM, IRTF, ISOC 628 Applied Networking Research Workshop 2016 (ANRW 2016), 629 2016. 631 [fse] Islam, S., Welzl, M., Gjessing, S., and N. Khademi, 632 "Coupled Congestion Control for RTP Media", ACM SIGCOMM 633 Capacity Sharing Workshop (CSWS 2014) and ACM SIGCOMM CCR 634 44(4) 2014; extended version available as a technical 635 report from 636 http://safiquli.at.ifi.uio.no/paper/fse-tech-report.pdf, 637 2014. 639 [fse-noms] 640 Islam, S., Welzl, M., Hayes, D., and S. Gjessing, 641 "Managing Real-Time Media Flows through a Flow State 642 Exchange", IEEE NOMS 2016, Istanbul, Turkey, 2016. 644 [rtcweb-rtp-usage] 645 Perkins, C., Westerlund, M., and J. Ott, "Web Real-Time 646 Communication (WebRTC): Media Transport and Use of RTP", 647 draft-ietf-rtcweb-rtp-usage-26 (work in progress), 648 March 2016. 650 [transport-multiplex] 651 Westerlund, M. and C. Perkins, "Multiple RTP Sessions on a 652 Single Lower-Layer Transport", 653 draft-westerlund-avtcore-transport-multiplexing-07 654 (work in progress), October 2013. 656 Appendix A. Application to GCC 658 Google Congestion Control (GCC) [I-D.ietf-rmcat-gcc] is another 659 congestion control scheme for RTP flows that is under development. 660 GCC is not yet finalised, but at the time of this writing, the rate 661 control of GCC employs two parts: controlling the bandwidth estimate 662 based on delay, and controlling the bandwidth estimate based on loss. 663 Both are designed to estimate the available bandwidth, A_hat. 665 When applying the FSE to GCC, the UPDATE function call described in 666 Section 5.3 gives the FSE GCC's estimate of available bandwidth 667 A_hat.
The recommended algorithm for GCC is the Active FSE in 668 Section 5.3.1. In step 3 (c), when the FSE_R(i) is "sent" to the 669 flow i, this means updating A_hat of flow i with the value of 670 FSE_R(i). 672 Appendix B. Scheduling 674 When connections originate from the same host, it would be possible 675 to use a single sender-side congestion controller which 676 determines the overall allowed sending rate, and then use a local 677 scheduler to assign a proportion of this rate to each RTP session. 678 This way, priorities could also be implemented as a function of the 679 scheduler. The Congestion Manager (CM) [RFC3124] also uses such a 680 scheduling function. 682 Appendix C. Example algorithm - Passive FSE 684 Active algorithms calculate the rates for all the flows in the FG and 685 actively distribute them. In a passive algorithm, UPDATE returns a 686 rate that should be used instead of the rate that the congestion 687 controller has determined. This can make a passive algorithm easier 688 to implement; however, when round-trip times of flows are unequal, 689 shorter-RTT flows may (depending on the congestion control algorithm) 690 update and react to the overall FSE state more often than longer-RTT 691 flows, which can produce unwanted side effects. This problem is more 692 significant when the congestion control convergence depends on the 693 RTT. While the passive algorithm works better for congestion 694 controls with RTT-independent convergence, it can still produce 695 oscillations on short time scales. The algorithm described below is 696 therefore considered highly experimental. Results of a simplified 697 passive FSE algorithm with both NADA and GCC can be found in 698 [fse-noms]. 700 This passive version of the FSE stores the following information in 701 addition to the variables described in Section 5.2: 703 o The desired rate DR. This can be smaller than the calculated rate 704 if the application feeding into the flow has less data to send 705 than the congestion controller would allow. In case of a bulk 706 transfer, DR must be set to CC_R received from the flow's 707 congestion controller. 709 The passive version of the FSE contains one static variable per FG 710 called TLO (Total Leftover Rate -- used to let a flow 'take' 711 bandwidth from application-limited or terminated flows), which is 712 initialized to 0. For the passive version, S_CR is limited to 713 increase or decrease as conservatively as a flow's congestion 714 controller decides in order to prohibit sudden rate jumps. 716 (1) When a flow f starts, it registers itself with SBD and the FSE. 717 FSE_R and DR are initialized with the congestion controller's 718 initial rate. SBD will assign the correct FGI. When a flow is 719 assigned an FGI, it adds its FSE_R to S_CR. 721 (2) When a flow f stops or pauses, it sets its DR to 0 and sets P to 722 -1. 724 (3) Every time the congestion controller of the flow f determines a 725 new sending rate CC_R, assuming the flow's new desired rate 726 new_DR to be "infinity" in case of a bulk data transfer with an 727 unknown maximum rate, the flow calls UPDATE, which carries out 728 the tasks listed below to derive the flow's new sending rate, 729 Rate. A flow's UPDATE function uses a few local (i.e. per-flow) 730 temporary variables, which are all initialized to 0: DELTA, 731 new_S_CR and S_P. 733 (a) For all the flows in its FG (including itself), it 734 calculates the sum of all the calculated rates, new_S_CR.
735 Then it calculates the difference between FSE_R(f) and 736 CC_R, DELTA. 738 for all flows i in FG do 739 new_S_CR = new_S_CR + FSE_R(i) 740 end for 741 DELTA = CC_R - FSE_R(f) 743 (b) It updates S_CR, FSE_R(f) and DR(f). 745 FSE_R(f) = CC_R 746 if DELTA > 0 then // the flow's rate has increased 747 S_CR = S_CR + DELTA 748 else if DELTA < 0 then 749 S_CR = new_S_CR + DELTA 750 end if 751 DR(f) = min(new_DR,FSE_R(f)) 753 (c) It calculates the leftover rate TLO, removes the terminated 754 flows from the FSE and calculates the sum of all the 755 priorities, S_P. 757 for all flows i in FG do 758 if P(i)<0 then 759 delete flow 760 else 761 S_P = S_P + P(i) 762 end if 763 end for 764 if DR(f) < FSE_R(f) then 765 TLO = TLO + ((P(f)/S_P) * S_CR - DR(f)) 766 end if 768 (d) It calculates the sending rate, Rate. 770 Rate = min(new_DR, (P(f)*S_CR)/S_P + TLO) 772 if Rate != new_DR and TLO > 0 then 773 TLO = 0 // f has 'taken' TLO 774 end if 776 (e) It updates DR(f) and FSE_R(f) with Rate. 778 if Rate > DR(f) then 779 DR(f) = Rate 780 end if 781 FSE_R(f) = Rate 783 The goals of the flow algorithm are to achieve prioritization, 784 improve network utilization in the face of application-limited flows, 785 and impose limits on the increase behavior such that the negative 786 impact of multiple flows trying to increase their rate together is 787 minimized. It does that by assigning a flow a sending rate that may 788 not be what the flow's congestion controller expected. It therefore 789 builds on the assumption that no significant inefficiencies arise 790 from temporary application-limited behavior or from quickly jumping 791 to a rate that is higher than the congestion controller intended. 792 How problematic these issues really are depends on the controllers in 793 use and requires careful per-controller experimentation. The coupled 794 congestion control mechanism described here also does not require all 795 controllers to be equal; effects of heterogeneous controllers, or 796 homogeneous controllers being in different states, are also subject 797 to experimentation. 799 This algorithm gives all the leftover rate of application-limited 800 flows to the first flow that updates its sending rate, provided that 801 this flow needs it all (otherwise, its own leftover rate can be taken 802 by the next flow that updates its rate). Other policies could be 803 applied, e.g. to divide the leftover rate of a flow equally among all 804 other flows in the FG.
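For illustration, the passive UPDATE can also be sketched in the non-normative Python style of the earlier examples; it assumes that, beyond the fields used there, each per-flow record additionally carries dr (DR) and each FG record additionally carries tlo (TLO), initialized to 0:

   INFINITY = float("inf")

   # Non-normative sketch of the passive FSE UPDATE, steps 3(a)-(e) above.
   def passive_fse_update(group, flow_id, cc_r, new_dr=INFINITY):
       f = group.flows[flow_id]

       # (a) Sum of the calculated rates of all flows in the FG, and DELTA.
       new_s_cr = sum(flow.fse_r for flow in group.flows.values())
       delta = cc_r - f.fse_r

       # (b) Update S_CR, FSE_R(f) and DR(f).
       f.fse_r = cc_r
       if delta > 0:                  # the flow's rate has increased
           group.s_cr = group.s_cr + delta
       elif delta < 0:
           group.s_cr = new_s_cr + delta
       f.dr = min(new_dr, f.fse_r)

       # (c) Remove terminated flows (P < 0), sum the priorities, and add
       #     this flow's leftover rate to TLO if it is application-limited.
       for i in [i for i, flow in group.flows.items() if flow.priority < 0]:
           del group.flows[i]
       s_p = sum(flow.priority for flow in group.flows.values())
       if f.dr < f.fse_r:
           group.tlo += (f.priority / s_p) * group.s_cr - f.dr

       # (d) The flow's new sending rate; it may 'take' the leftover rate TLO.
       rate = min(new_dr, (f.priority * group.s_cr) / s_p + group.tlo)
       if rate != new_dr and group.tlo > 0:
           group.tlo = 0              # f has 'taken' TLO

       # (e) Update DR(f) and FSE_R(f) with Rate.
       if rate > f.dr:
           f.dr = rate
       f.fse_r = rate
       return rate

This sketch can be checked against the worked example in Appendix C.1 below.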
806 C.1. Example operation (passive) 808 In order to illustrate the operation of the passive coupled 809 congestion control algorithm, this section presents a toy example of 810 two flows that use it. Let us assume that both flows traverse a 811 common 10 Mbit/s bottleneck and use a simplistic congestion 812 controller that starts out with 1 Mbit/s, increases its rate by 1 813 Mbit/s in the absence of congestion and decreases it by 2 Mbit/s in 814 the presence of congestion. For simplicity, flows are assumed to 815 always operate in a round-robin fashion. Rate numbers below without 816 units are assumed to be in Mbit/s. For illustration purposes, the 817 actual sending rate is also shown for every flow in FSE diagrams even 818 though it is not really stored in the FSE. 820 Flow #1 begins. It is a bulk data transfer and considers itself to 821 have top priority. This is the FSE after the flow algorithm's step 822 1: 824 ---------------------------------------- 825 | # | FGI | P | FSE_R | DR | Rate | 826 | | | | | | | 827 | 1 | 1 | 1 | 1 | 1 | 1 | 828 ---------------------------------------- 829 S_CR = 1, TLO = 0 830 Its congestion controller gradually increases its rate. Eventually, 831 at some point, the FSE should look like this: 833 ----------------------------------------- 834 | # | FGI | P | FSE_R | DR | Rate | 835 | | | | | | | 836 | 1 | 1 | 1 | 10 | 10 | 10 | 837 ----------------------------------------- 838 S_CR = 10, TLO = 0 840 Now another flow joins. It is also a bulk data transfer, and has a 841 lower priority (0.5): 843 ------------------------------------------ 844 | # | FGI | P | FSE_R | DR | Rate | 845 | | | | | | | 846 | 1 | 1 | 1 | 10 | 10 | 10 | 847 | 2 | 1 | 0.5 | 1 | 1 | 1 | 848 ------------------------------------------ 849 S_CR = 11, TLO = 0 851 Now assume that the first flow updates its rate to 8, because the 852 total sending rate of 11 exceeds the total capacity. Let us take a 853 closer look at what happens in step 3 of the flow algorithm. 855 CC_R = 8. new_DR = infinity. 856 3 a) new_S_CR = 11; DELTA = 8 - 10 = -2. 857 3 b) FSE_R(f) = 8. DELTA is negative, hence S_CR = 9; 858 DR(f) = 8. 859 3 c) S_P = 1.5. 860 3 d) new sending rate = min(infinity, 1/1.5 * 9 + 0) = 6. 861 3 e) FSE_R(f) = 6. 863 The resulting FSE looks as follows: 864 ------------------------------------------- 865 | # | FGI | P | FSE_R | DR | Rate | 866 | | | | | | | 867 | 1 | 1 | 1 | 6 | 8 | 6 | 868 | 2 | 1 | 0.5 | 1 | 1 | 1 | 869 ------------------------------------------- 870 S_CR = 9, TLO = 0 871 The effect is that flow #1 is sending with 6 Mbit/s instead of the 8 872 Mbit/s that the congestion controller derived. Let us now assume 873 that flow #2 updates its rate. Its congestion controller detects 874 that the network is not fully saturated (the actual total sending 875 rate is 6+1=7) and increases its rate. 877 CC_R=2. new_DR = infinity. 878 3 a) new_S_CR = 7; DELTA = 2 - 1 = 1. 879 3 b) FSE_R(f) = 2. DELTA is positive, hence S_CR = 9 + 1 = 10; 880 DR(f) = 2. 881 3 c) S_P = 1.5. 882 3 d) new sending rate = min(infinity, 0.5/1.5 * 10 + 0) = 3.33. 883 3 e) DR(f) = FSE_R(f) = 3.33. 885 The resulting FSE looks as follows: 886 ------------------------------------------- 887 | # | FGI | P | FSE_R | DR | Rate | 888 | | | | | | | 889 | 1 | 1 | 1 | 6 | 8 | 6 | 890 | 2 | 1 | 0.5 | 3.33 | 3.33 | 3.33 | 891 ------------------------------------------- 892 S_CR = 10, TLO = 0 894 The effect is that flow #2 is now sending with 3.33 Mbit/s, which is 895 close to half of the rate of flow #1 and leads to a total utilization 896 of 6(#1) + 3.33(#2) = 9.33 Mbit/s. Flow #2's rate has thus increased 897 faster than its own congestion controller would have allowed. 898 Now, flow #1 updates its rate. Its congestion controller detects 899 that the network is not fully saturated and increases its rate. 900 Additionally, the application feeding into flow #1 limits the flow's 901 sending rate to at most 2 Mbit/s. 903 CC_R=7. new_DR=2. 904 3 a) new_S_CR = 9.33; DELTA = 1. 905 3 b) FSE_R(f) = 7, DELTA is positive, hence S_CR = 10 + 1 = 11; 906 DR(f) = min(2, 7) = 2. 907 3 c) S_P = 1.5; DR(f) < FSE_R(f), hence TLO = 1/1.5 * 11 - 2 = 5.33. 908 3 d) new sending rate = min(2, 1/1.5 * 11 + 5.33) = 2. 909 3 e) FSE_R(f) = 2.
911 The resulting FSE looks as follows: 912 ------------------------------------------- 913 | # | FGI | P | FSE_R | DR | Rate | 914 | | | | | | | 915 | 1 | 1 | 1 | 2 | 2 | 2 | 916 | 2 | 1 | 0.5 | 3.33 | 3.33 | 3.33 | 917 ------------------------------------------- 918 S_CR = 11, TLO = 5.33 920 Now, the total rate of the two flows is 2 + 3.33 = 5.33 Mbit/s, i.e. 921 the network is significantly underutilized due to the limitation of 922 flow #1. Flow #2 updates its rate. Its congestion controller 923 detects that the network is not fully saturated and increases its 924 rate. 926 CC_R=4.33. new_DR = infinity. 927 3 a) new_S_CR = 5.33; DELTA = 1. 928 3 b) FSE_R(f) = 4.33. DELTA is positive, hence S_CR = 12; 929 DR(f) = 4.33. 930 3 c) S_P = 1.5. 931 3 d) new sending rate: min(infinity, 0.5/1.5 * 12 + 5.33 ) = 9.33. 932 3 e) FSE_R(f) = 9.33, DR(f) = 9.33. 934 The resulting FSE looks as follows: 935 ------------------------------------------- 936 | # | FGI | P | FSE_R | DR | Rate | 937 | | | | | | | 938 | 1 | 1 | 1 | 2 | 2 | 2 | 939 | 2 | 1 | 0.5 | 9.33 | 9.33 | 9.33 | 940 ------------------------------------------- 941 S_CR = 12, TLO = 0 943 Now, the total rate of the two flows is 2 + 9.33 = 11.33 Mbit/s. 944 Finally, flow #1 terminates. It sets P to -1 and DR to 0. Let us 945 assume that it terminated late enough for flow #2 to still experience 946 the network in a congested state, i.e. flow #2 decreases its rate in 947 the next iteration. 949 CC_R = 7.33. new_DR = infinity. 950 3 a) new_S_CR = 11.33; DELTA = -2. 951 3 b) FSE_R(f) = 7.33. DELTA is negative, hence S_CR = 9.33; 952 DR(f) = 7.33. 953 3 c) Flow 1 has P = -1, hence it is deleted from the FSE. 954 S_P = 0.5. 955 3 d) new sending rate: min(infinity, 0.5/0.5*9.33 + 0) = 9.33. 956 3 e) FSE_R(f) = DR(f) = 9.33. 958 The resulting FSE looks as follows: 959 ------------------------------------------- 960 | # | FGI | P | FSE_R | DR | Rate | 961 | | | | | | | 962 | 2 | 1 | 0.5 | 9.33 | 9.33 | 9.33 | 963 ------------------------------------------- 964 S_CR = 9.33, TLO = 0 966 Appendix D. Change log 968 D.1. draft-welzl-rmcat-coupled-cc 970 D.1.1. Changes from -00 to -01 972 o Added change log. 974 o Updated the example algorithm and its operation. 976 D.1.2. Changes from -01 to -02 978 o Included an active version of the algorithm which is simpler. 980 o Replaced "greedy flow" with "bulk data transfer" and "non-greedy" 981 with "application-limited". 983 o Updated new_CR to CC_R, and CR to FSE_R for better understanding. 985 D.1.3. Changes from -02 to -03 987 o Included an active conservative version of the algorithm which 988 reduces queue growth and packet loss; added a reference to a 989 technical report that shows these benefits with simulations. 991 o Moved the passive variant of the algorithm to appendix. 993 D.1.4. Changes from -03 to -04 995 o Extended SBD section. 997 o Added a note about window-based controllers. 999 D.1.5. Changes from -04 to -05 1001 o Added a section about applying the FSE to specific congestion 1002 control algorithms, with a subsection specifying its use with 1003 NADA. 1005 D.2. draft-ietf-rmcat-coupled-cc 1007 D.2.1. Changes from draft-welzl-rmcat-coupled-cc-05 1009 o Moved scheduling section to the appendix. 1011 D.2.2. Changes from -00 to -01 1013 o Included how to apply the algorithm to GCC. 1015 o Updated variable names of NADA to be in line with the latest 1016 version. 1018 o Added a reference to [I-D.ietf-rtcweb-transports] to make a 1019 connection to the prioritization text there. 1021 D.2.3. 
Changes from -01 to -02 1023 o Minor changes. 1025 o Moved references of NADA and GCC from informative to normative. 1027 o Added a reference for the passive variant of the algorithm. 1029 D.2.4. Changes from -02 to -03 1031 o Minor changes. 1033 o Added a section about expected feedback from experiments. 1035 D.2.5. Changes from -03 to -04 1037 o Described the names of variables used in the algorithms. 1039 o Added a diagram to illustrate the interaction between flows and 1040 the FSE. 1042 o Added text on the trade-off of using the configuration based 1043 approach. 1045 o Minor changes to enhance the readability. 1047 D.2.6. Changes from -04 to -05 1049 o Changed several occurrences of "NADA and GCC" to "NADA", including 1050 the abstract. 1052 o Moved the application to GCC to an appendix, and made the GCC 1053 reference informative. 1055 o Provided a few more general recommendations on applying the 1056 coupling algorithm. 1058 Authors' Addresses 1060 Safiqul Islam 1061 University of Oslo 1062 PO Box 1080 Blindern 1063 Oslo, N-0316 1064 Norway 1066 Phone: +47 22 84 08 37 1067 Email: safiquli@ifi.uio.no 1068 Michael Welzl 1069 University of Oslo 1070 PO Box 1080 Blindern 1071 Oslo, N-0316 1072 Norway 1074 Phone: +47 22 85 24 20 1075 Email: michawe@ifi.uio.no 1077 Stein Gjessing 1078 University of Oslo 1079 PO Box 1080 Blindern 1080 Oslo, N-0316 1081 Norway 1083 Phone: +47 22 85 24 44 1084 Email: steing@ifi.uio.no