idnits 2.17.1 draft-ietf-avtcore-rtp-circuit-breakers-08.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- -- The draft header indicates that this document updates RFC3550, but the abstract doesn't seem to mention this, which it should. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year (Using the creation date from RFC3550, updated by this document, for RFC5378 checks: 1998-04-07) -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (December 04, 2014) is 3402 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'RFC5506' is defined on line 866, but no explicit reference was found in the text ** Obsolete normative reference: RFC 3448 (Obsoleted by RFC 5348) == Outdated reference: A later version (-12) exists of draft-ietf-avtcore-rtp-multi-stream-optimisation-04 -- Obsolete informational reference (is this intentional?): RFC 5405 (Obsoleted by RFC 8085) Summary: 1 error (**), 0 flaws (~~), 3 warnings (==), 4 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 AVTCORE Working Group C. S. Perkins 3 Internet-Draft University of Glasgow 4 Updates: 3550 (if approved) V. Singh 5 Intended status: Standards Track Aalto University 6 Expires: June 07, 2015 December 04, 2014 8 Multimedia Congestion Control: Circuit Breakers for Unicast RTP Sessions 9 draft-ietf-avtcore-rtp-circuit-breakers-08 11 Abstract 13 The Real-time Transport Protocol (RTP) is widely used in telephony, 14 video conferencing, and telepresence applications. Such applications 15 are often run on best-effort UDP/IP networks. If congestion control 16 is not implemented in the applications, then network congestion will 17 deteriorate the user's multimedia experience. This document does not 18 propose a congestion control algorithm; instead, it defines a minimal 19 set of RTP "circuit-breakers". Circuit-breakers are conditions under 20 which an RTP sender needs to stop transmitting media data in order to 21 protect the network from excessive congestion. 
It is expected that, 22 in the absence of severe congestion, all RTP applications running on 23 best-effort IP networks will be able to run without triggering these 24 circuit breakers. Any future RTP congestion control specification 25 will be expected to operate within the constraints defined by these 26 circuit breakers. 28 Status of This Memo 30 This Internet-Draft is submitted in full conformance with the 31 provisions of BCP 78 and BCP 79. 33 Internet-Drafts are working documents of the Internet Engineering 34 Task Force (IETF). Note that other groups may also distribute 35 working documents as Internet-Drafts. The list of current Internet- 36 Drafts is at http://datatracker.ietf.org/drafts/current/. 38 Internet-Drafts are draft documents valid for a maximum of six months 39 and may be updated, replaced, or obsoleted by other documents at any 40 time. It is inappropriate to use Internet-Drafts as reference 41 material or to cite them other than as "work in progress." 43 This Internet-Draft will expire on June 07, 2015. 45 Copyright Notice 47 Copyright (c) 2014 IETF Trust and the persons identified as the 48 document authors. All rights reserved. 50 This document is subject to BCP 78 and the IETF Trust's Legal 51 Provisions Relating to IETF Documents 52 (http://trustee.ietf.org/license-info) in effect on the date of 53 publication of this document. Please review these documents 54 carefully, as they describe your rights and restrictions with respect 55 to this document. Code Components extracted from this document must 56 include Simplified BSD License text as described in Section 4.e of 57 the Trust Legal Provisions and are provided without warranty as 58 described in the Simplified BSD License. 60 Table of Contents 62 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 63 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 64 3. Background . . . . . . . . . . . . . . . . . . . . . . . . . 3 65 4. 
RTP Circuit Breakers for Systems Using the RTP/AVP Profile . 6 66 4.1. RTP/AVP Circuit Breaker #1: Media Timeout . . . . . . . . 8 67 4.2. RTP/AVP Circuit Breaker #2: RTCP Timeout . . . . . . . . 8 68 4.3. RTP/AVP Circuit Breaker #3: Congestion . . . . . . . . . 9 69 4.4. RTP/AVP Circuit Breaker #4: Media Usability . . . . . . . 13 70 4.5. Ceasing Transmission . . . . . . . . . . . . . . . . . . 14 71 5. RTP Circuit Breakers for Systems Using the RTP/AVPF Profile . 14 72 6. Impact of RTCP Extended Reports (XR) . . . . . . . . . . . . 15 73 7. Impact of RTCP Reporting Groups . . . . . . . . . . . . . . . 15 74 8. Impact of Explicit Congestion Notification (ECN) . . . . . . 16 75 9. Impact of Bundled Media and Layered Coding . . . . . . . . . 16 76 10. Security Considerations . . . . . . . . . . . . . . . . . . . 16 77 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 17 78 12. Open Issues . . . . . . . . . . . . . . . . . . . . . . . . . 17 79 13. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 17 80 14. References . . . . . . . . . . . . . . . . . . . . . . . . . 17 81 14.1. Normative References . . . . . . . . . . . . . . . . . . 18 82 14.2. Informative References . . . . . . . . . . . . . . . . . 18 83 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 20 85 1. Introduction 87 The Real-time Transport Protocol (RTP) [RFC3550] is widely used in 88 voice-over-IP, video teleconferencing, and telepresence systems. 89 Many of these systems run over best-effort UDP/IP networks, and can 90 suffer from packet loss and increased latency if network congestion 91 occurs. Designing effective RTP congestion control algorithms, to 92 adapt the transmission of RTP-based media to match the available 93 network capacity, while also maintaining the user experience, is a 94 difficult but important problem. 
Many such congestion control and 95 media adaptation algorithms have been proposed, but to date there is 96 no consensus on the correct approach, or even that a single standard 97 algorithm is desirable. 99 This memo does not attempt to propose a new RTP congestion control 100 algorithm. Rather, it proposes a minimal set of "circuit breakers"; 101 conditions under which there is general agreement that an RTP flow is 102 causing serious congestion, and ought to cease transmission. It is 103 expected that future standards-track congestion control algorithms 104 for RTP will operate within the envelope defined by this memo. 106 2. Terminology 108 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 109 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 110 document are to be interpreted as described in RFC 2119 [RFC2119]. 111 This interpretation of these key words applies only when written in 112 ALL CAPS. Mixed- or lower-case uses of these key words are not to be 113 interpreted as carrying special significance in this memo. 115 3. Background 117 We consider congestion control for unicast RTP traffic flows. This 118 is the problem of adapting the transmission of an audio/visual data 119 flow, encapsulated within an RTP transport session, from one sender 120 to one receiver, so that it matches the available network bandwidth. 121 Such adaptation needs to be done in a way that limits the disruption 122 to the user experience caused by both packet loss and excessive rate 123 changes. Congestion control for multicast flows is outside the scope 124 of this memo. Multicast traffic needs different solutions, since the 125 available bandwidth estimator for a group of receivers will differ 126 from that for a single receiver, and because multicast congestion 127 control has to consider issues of fairness across groups of receivers 128 that do not apply to unicast flows. 
130 Congestion control for unicast RTP traffic can be implemented in one 131 of two places in the protocol stack. One approach is to run the RTP 132 traffic over a congestion controlled transport protocol, for example 133 over TCP, and to adapt the media encoding to match the dictates of 134 the transport-layer congestion control algorithm. This is safe for 135 the network, but can be suboptimal for the media quality unless the 136 transport protocol is designed to support real-time media flows. We 137 do not consider this class of applications further in this memo, as 138 their network safety is guaranteed by the underlying transport. 140 Alternatively, RTP flows can be run over a non-congestion controlled 141 transport protocol, for example UDP, performing rate adaptation at 142 the application layer based on RTP Control Protocol (RTCP) feedback. 143 With a well-designed, network-aware, application, this allows highly 144 effective media quality adaptation, but there is potential to disrupt 145 the network's operation if the application does not adapt its sending 146 rate in a timely and effective manner. We consider this class of 147 applications in this memo. 149 Congestion control relies on monitoring the delivery of a media flow, 150 and responding to adapt the transmission of that flow when there are 151 signs that the network path is congested. Network congestion can be 152 detected in one of three ways: 1) a receiver can infer the onset of 153 congestion by observing an increase in one-way delay caused by queue 154 build-up within the network; 2) if Explicit Congestion Notification 155 (ECN) [RFC3168] is supported, the network can signal the presence of 156 congestion by marking packets using ECN Congestion Experienced (CE) 157 marks; or 3) in the extreme case, congestion will cause packet loss 158 that can be detected by observing a gap in the received RTP sequence 159 numbers. 
Once the onset of congestion is observed, the receiver has 160 to send feedback to the sender to indicate that the transmission rate 161 needs to be reduced. How the sender reduces the transmission rate is 162 highly dependent on the media codec being used, and is outside the 163 scope of this memo. 165 There are several ways in which a receiver can send feedback to a 166 media sender within the RTP framework: 168 o The base RTP specification [RFC3550] defines RTCP Reception Report 169 (RR) packets to convey reception quality feedback information, and 170 Sender Report (SR) packets to convey information about the media 171 transmission. RTCP SR packets contain data that can be used to 172 reconstruct media timing at a receiver, along with a count of the 173 total number of octets and packets sent. RTCP RR packets report 174 on the fraction of packets lost in the last reporting interval, 175 the cumulative number of packets lost, the highest sequence number 176 received, and the inter-arrival jitter. The RTCP RR packets also 177 contain timing information that allows the sender to estimate the 178 network round trip time (RTT) to the receivers. RTCP reports are 179 sent periodically, with the reporting interval being determined by 180 the number of SSRCs used in the session and a configured session 181 bandwidth estimate (the number of SSRCs used is usually two in a 182 unicast session, one for each participant, but can be greater if 183 the participants send multiple media streams). The interval 184 between reports sent from each receiver tends to be on the order 185 of a few seconds on average, although it varies with the session 186 bandwidth, and sub-second reporting intervals are possible in high 187 bandwidth sessions, and it is randomised to avoid synchronisation 188 of reports from multiple receivers. RTCP RR packets allow a 189 receiver to report ongoing network congestion to the sender. 
190 However, if a receiver detects the onset of congestion part way 191 through a reporting interval, the base RTP specification contains 192 no provision for sending the RTCP RR packet early, and the 193 receiver has to wait until the next scheduled reporting interval. 195 o The RTCP Extended Reports (XR) [RFC3611] allow reporting of more 196 complex and sophisticated reception quality metrics, but do not 197 change the RTCP timing rules. RTCP extended reports of potential 198 interest for congestion control purposes are the extended packet 199 loss, discard, and burst metrics [RFC3611], [RFC7002], [RFC7097], 200 [RFC7003], [RFC6958]; and the extended delay metrics [RFC6843], 201 [RFC6798]. Other RTCP Extended Reports that could be helpful for 202 congestion control purposes might be developed in future. 204 o Rapid feedback about the occurrence of congestion events can be 205 achieved using the Extended RTP Profile for RTCP-Based Feedback 206 (RTP/AVPF) [RFC4585] (or its secure variant, RTP/SAVPF [RFC5124]) 207 in place of the RTP/AVP profile [RFC3551]. This modifies the RTCP 208 timing rules to allow RTCP reports to be sent early, in some cases 209 immediately, provided the RTCP transmission rate keeps within its 210 bandwidth allocation. It also defines transport-layer feedback 211 messages, including negative acknowledgements (NACKs), that can be 212 used to report on specific congestion events. RTP Codec Control 213 Messages [RFC5104] extend the RTP/AVPF profile with additional 214 feedback messages that can be used to influence the way in which 215 rate adaptation occurs, but do not further change the dynamics of 216 how rapidly feedback can be sent. Use of the RTP/AVPF profile is 217 dependent on signalling. 219 o Finally, Explicit Congestion Notification (ECN) for RTP over UDP 220 [RFC6679] can be used to provide feedback on the number of packets 221 that received an ECN Congestion Experienced (CE) mark. 
This RTCP 222 extension builds on the RTP/AVPF profile to allow rapid congestion 223 feedback when ECN is supported. 225 In addition to these mechanisms for providing feedback, the sender 226 can include an RTP header extension in each packet to record packet 227 transmission times. There are two methods: [RFC5450] represents the 228 transmission time in terms of a time-offset from the RTP timestamp of 229 the packet, while [RFC6051] includes an explicit NTP-format sending 230 timestamp (potentially more accurate, but a higher header overhead). 231 Accurate sending timestamps can be helpful for estimating queuing 232 delays, to get an early indication of the onset of congestion. 234 Taken together, these various mechanisms allow receivers to provide 235 feedback to the senders when congestion events occur, with varying 236 degrees of timeliness and accuracy. The key distinction is between 237 systems that use only the basic RTCP mechanisms, without RTP/AVPF 238 rapid feedback, and those that use the RTP/AVPF extensions to respond 239 to congestion more rapidly. 241 4. RTP Circuit Breakers for Systems Using the RTP/AVP Profile 243 The feedback mechanisms defined in [RFC3550] and available under the 244 RTP/AVP profile [RFC3551] are the minimum that can be assumed for a 245 baseline circuit breaker mechanism that is suitable for all unicast 246 applications of RTP. Accordingly, for an RTP circuit breaker to be 247 useful, it needs to be able to detect that an RTP flow is causing 248 excessive congestion using only basic RTCP features, without needing 249 RTCP XR feedback or the RTP/AVPF profile for rapid RTCP reports. 251 RTCP is a fundamental part of the RTP protocol, and the mechanisms 252 described here rely on the implementation of RTCP. Implementations 253 that claim to support RTP, but that do not implement RTCP, cannot use 254 the circuit breaker mechanisms described in this memo. 
Such 255 implementations SHOULD NOT be used on networks that might be subject 256 to congestion unless equivalent mechanisms are defined using some 257 non-RTCP feedback channel to report congestion and signal circuit 258 breaker conditions. 260 Three potential congestion signals are available from the basic RTCP 261 SR/RR packets and are reported for each synchronisation source (SSRC) 262 in the RTP session: 264 1. The sender can estimate the network round-trip time once per RTCP 265 reporting interval, based on the contents and timing of RTCP SR 266 and RR packets. 268 2. Receivers report a jitter estimate (the statistical variance of 269 the RTP data packet inter-arrival time) calculated over the RTCP 270 reporting interval. Due to the nature of the jitter calculation 271 ([RFC3550], section 6.4.4), the jitter is only meaningful for RTP 272 flows that send a single data packet for each RTP timestamp value 273 (i.e., audio flows, or video flows where each packet comprises 274 one video frame). 276 3. Receivers report the fraction of RTP data packets lost during the 277 RTCP reporting interval, and the cumulative number of RTP packets 278 lost over the entire RTP session. 280 These congestion signals limit the possible circuit breakers, since 281 they give only limited visibility into the behaviour of the network. 283 RTT estimates are widely used in congestion control algorithms, as a 284 proxy for queuing delay measures in delay-based congestion control or 285 to determine connection timeouts. RTT estimates derived from RTCP SR 286 and RR packets sent according to the RTP/AVP timing rules are too 287 infrequent to be useful though, and don't give enough information to 288 distinguish a delay change due to routing updates from queuing delay 289 caused by congestion. Accordingly, we cannot use the RTT estimate 290 alone as an RTP circuit breaker. 
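As an illustration of how a sender can derive the round-trip time from a received RTCP report block, the following is a minimal Python sketch. It assumes the LSR (last SR timestamp) and DLSR (delay since last SR) fields defined in RFC 3550, both expressed in units of 1/65536 seconds, and an arrival time converted to the same 32-bit format. The function names are hypothetical and not part of any specification.

```python
from typing import Optional


def ntp_middle32(ntp_seconds: float) -> int:
    """Return the middle 32 bits of an NTP timestamp (16.16 fixed point)."""
    return int(ntp_seconds * 65536) & 0xFFFFFFFF


def rtt_from_report_block(arrival: int, lsr: int, dlsr: int) -> Optional[float]:
    """Estimate the round-trip time in seconds from one report block.

    arrival: time the report was received, in middle-32-bit NTP format
    lsr:     LSR field copied from the report block
    dlsr:    DLSR field copied from the report block
    Returns None when LSR is zero, meaning the receiver has not yet
    seen an SR from this sender and no estimate is possible.
    """
    if lsr == 0:
        return None
    # Modular subtraction so that wrap-around of the 32-bit clock is handled.
    rtt_units = (arrival - lsr - dlsr) & 0xFFFFFFFF
    return rtt_units / 65536.0
```

Because one such sample is available only once per RTCP reporting interval, the result is better treated as a coarse input to the circuit breaker than as a precise delay measurement.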
292 Increased jitter can be a signal of transient network congestion, but 293 in the highly aggregated form reported in RTCP RR packets, it offers 294 insufficient information to estimate the extent or persistence of 295 congestion. Jitter reports are a useful early warning of potential 296 network congestion, but provide an insufficiently strong signal to be 297 used as a circuit breaker. 299 The remaining congestion signals are the packet loss fraction and the 300 cumulative number of packets lost. If considered carefully, these 301 can be effective indicators that congestion is occurring in networks 302 where packet loss is primarily due to queue overflows, although loss 303 caused by non-congestive packet corruption can distort the result in 304 some networks. TCP congestion control [RFC5681] intentionally tries 305 to fill the router queues, and uses the resulting packet loss as 306 congestion feedback. An RTP flow competing with TCP traffic will 307 therefore expect to see a non-zero packet loss fraction that has to 308 be related to TCP dynamics to estimate available capacity. This 309 behaviour of TCP is reflected in the congestion circuit breaker 310 below, and will affect the design of any RTP congestion control 311 protocol. 313 Two packet loss regimes can be observed: 1) RTCP RR packets show a 314 non-zero packet loss fraction, while the extended highest sequence 315 number received continues to increment; and 2) RR packets show a loss 316 fraction of zero, but the extended highest sequence number received 317 does not increment even though the sender has been transmitting RTP 318 data packets. The former corresponds to the TCP congestion avoidance 319 state, and indicates a congested path that is still delivering data; 320 the latter corresponds to a TCP timeout, and is most likely due to a 321 path failure. A third condition is that data is being sent but no 322 RTCP feedback is received at all, corresponding to a failure of the 323 reverse path. 
We derive circuit breaker conditions for these loss 324 regimes in the following. 326 4.1. RTP/AVP Circuit Breaker #1: Media Timeout 328 If RTP data packets are being sent, but the RTCP SR or RR packets 329 reporting on that SSRC indicate a non-increasing extended highest 330 sequence number received, this is an indication that those RTP data 331 packets are not reaching the receiver. This could be a short-term 332 issue affecting only a few packets, perhaps caused by a slow-to-open 333 firewall or a transient connectivity problem, but if the issue 334 persists, it is a sign of a more persistent and significant problem. 335 Accordingly, if a sender of RTP data packets receives three or more 336 consecutive RTCP SR or RR packets from the same receiver, and those 337 packets correspond to its transmission and have a non-increasing 338 extended highest sequence number received field, then that sender 339 SHOULD cease transmission (see Section 4.5). The extended highest 340 sequence number received field is non-increasing if the sender 341 receives at least three consecutive RTCP SR or RR packets that report 342 the same value for this field, but it has sent RTP data packets that 343 would have caused an increase in the reported value if they had 344 reached the receiver. 346 The reason for waiting for three or more consecutive RTCP packets 347 with a non-increasing extended highest sequence number is to give 348 enough time for transient reception problems to resolve themselves, 349 but to stop problem flows quickly enough to avoid causing serious 350 ongoing network congestion. A single RTCP report showing no 351 reception could be caused by a transient fault, and so is not 352 sufficient reason to cease transmission. 
Waiting for more than three consecutive RTCP reports 353 before stopping a flow might avoid some false positives, but could 354 lead to problematic flows running for a long time period (potentially 355 tens of seconds, depending on the RTCP reporting interval) before 356 being cut off. Equally, an application that sends few packets when 357 the packet loss rate is high runs the risk that the media timeout 358 circuit breaker triggers inadvertently. The chosen timeout interval 359 is a trade-off between these extremes. 361 4.2. RTP/AVP Circuit Breaker #2: RTCP Timeout 363 In addition to media timeouts, as were discussed in Section 4.1, an 364 RTP session has the possibility of an RTCP timeout. This can occur 365 when RTP data packets are being sent, but there are no RTCP reports 366 returned from the receiver. This is either due to a failure of the 367 receiver to send RTCP reports, or a failure of the return path that 368 is preventing those RTCP reports from being delivered. In either 369 case, it is not safe to continue transmission, since the sender has 370 no way of knowing if it is causing congestion. Accordingly, an RTP 371 sender that has not received any RTCP SR or RTCP RR packets reporting 372 on the SSRC it is using for three or more of its RTCP reporting 373 intervals SHOULD cease transmission (see Section 4.5). When 374 calculating the timeout, the deterministic RTCP reporting interval, 375 Td (without the randomization factor, and with a fixed minimum 376 interval Tmin=5 seconds), SHOULD be used. The rationale for this 377 choice of timeout is as described in Section 6.2 of RFC 3550 378 [RFC3550]. 380 The choice of three RTCP reporting intervals as the timeout is made 381 following Section 6.3.5 of RFC 3550 [RFC3550]. 
This specifies that 382 participants in an RTP session will time out and remove an RTP sender 383 from the list of active RTP senders if no RTP data packets have been 384 received from that RTP sender within the last two RTCP reporting 385 intervals. Using a timeout of three RTCP reporting intervals is 386 therefore large enough that the other participants will have timed 387 out the sender if a network problem stops the data packets it is 388 sending from reaching the receivers, even allowing for loss of some 389 RTCP packets. 391 If a sender is transmitting a large number of RTP media streams, such 392 that the corresponding RTCP SR or RR packets are too large to fit 393 into the network MTU, the receiver will generate RTCP SR or RR 394 packets in a round-robin manner. In this case, the sender SHOULD 395 treat receipt of an RTCP SR or RR packet corresponding to any SSRC it 396 sent on the same 5-tuple of source and destination IP address, port, 397 and protocol, as an indication that the receiver and return path are 398 working, preventing the RTCP timeout circuit breaker from triggering. 400 4.3. RTP/AVP Circuit Breaker #3: Congestion 402 If RTP data packets are being sent, and the corresponding RTCP SR or 403 RR packets show non-zero packet loss fraction and increasing extended 404 highest sequence number received, then those RTP data packets are 405 arriving at the receiver, but some degree of congestion is occurring. 406 The RTP/AVP profile [RFC3551] states that: 408 If best-effort service is being used, RTP receivers SHOULD monitor 409 packet loss to ensure that the packet loss rate is within 410 acceptable parameters. Packet loss is considered acceptable if a 411 TCP flow across the same network path and experiencing the same 412 network conditions would achieve an average throughput, measured 413 on a reasonable time scale, that is not less than the RTP flow is 414 achieving. 
This condition can be satisfied by implementing 415 congestion control mechanisms to adapt the transmission rate (or 416 the number of layers subscribed for a layered multicast session), 417 or by arranging for a receiver to leave the session if the loss 418 rate is unacceptably high. 420 The comparison to TCP cannot be specified exactly, but is intended 421 as an "order-of-magnitude" comparison in time scale and 422 throughput. The time scale on which TCP throughput is measured is 423 the round-trip time of the connection. In essence, this 424 requirement states that it is not acceptable to deploy an 425 application (using RTP or any other transport protocol) on the 426 best-effort Internet which consumes bandwidth arbitrarily and does 427 not compete fairly with TCP within an order of magnitude. 429 The phrase "order of magnitude" in the above means within a factor of 430 ten, approximately. In order to implement this, it is necessary to 431 estimate the throughput a TCP connection would achieve over the path. 432 For a long-lived TCP Reno connection, it has been shown that the TCP 433 throughput can be estimated using the following equation [Padhye]:
435                                   s
436    X = --------------------------------------------------------------
437        R*sqrt(2*b*p/3) + (t_RTO * (3*sqrt(3*b*p/8) * p * (1+32*p^2)))
439 where:
441 X is the transmit rate in bytes/second.
443 s is the packet size in bytes. If data packets vary in size, then 444 the average size is to be used.
446 R is the round trip time in seconds.
448 p is the loss event rate, between 0 and 1.0, of the number of loss 449 events as a fraction of the number of packets transmitted.
451 t_RTO is the TCP retransmission timeout value in seconds, generally 452 approximated by setting t_RTO = 4*R.
454 b is the number of packets that are acknowledged by a single TCP 455 acknowledgement; [RFC3448] recommends the use of b=1 since many 456 TCP implementations do not use delayed acknowledgements.
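The full TCP throughput equation above can be evaluated directly. The following is a minimal Python sketch, assuming the defaults mentioned in the definitions (b=1 and t_RTO = 4*R when no timeout value is supplied). The function name and argument names are illustrative only.

```python
import math
from typing import Optional


def tcp_throughput_full(s: float, R: float, p: float,
                        b: int = 1, t_rto: Optional[float] = None) -> float:
    """Estimate TCP throughput X in bytes/second using the full equation.

    s: packet size in bytes (average size if packets vary)
    R: round trip time in seconds
    p: loss event rate, greater than 0 and at most 1.0
    b: packets acknowledged by a single TCP acknowledgement
    t_rto: retransmission timeout in seconds (defaults to 4*R)
    """
    if s <= 0 or R <= 0:
        raise ValueError("s and R must be positive")
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the range (0, 1]")
    if t_rto is None:
        t_rto = 4.0 * R  # common approximation given in the definitions
    denominator = (R * math.sqrt(2.0 * b * p / 3.0)
                   + t_rto * (3.0 * math.sqrt(3.0 * b * p / 8.0)
                              * p * (1.0 + 32.0 * p ** 2)))
    return s / denominator
```

The equation is undefined for p = 0, so the sketch rejects that case. A caller would skip the calculation whenever no loss has been reported.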
458 This is the same approach to estimating TCP throughput that is used in 459 [RFC3448]. Under conditions of low packet loss the second term in 460 the denominator is small, so this formula can be approximated with 461 reasonable accuracy as follows [Mathis]:
463                s
464    X = -----------------
465        R * sqrt(2*b*p/3)
467 It is RECOMMENDED that this simplified throughput equation be used, 468 since the reduction in accuracy is small, and it is much simpler to 469 calculate than the full equation. Measurements have shown that the 470 simplified TCP throughput equation is effective as an RTP circuit 471 breaker for multimedia flows sent to hosts on residential networks 472 using ADSL and cable modem links [Singh]. The data shows that the 473 full TCP throughput equation tends to be more sensitive to packet 474 loss and triggers the RTP circuit breaker earlier than the simplified 475 equation. Implementations that desire this extra sensitivity MAY use 476 the full TCP throughput equation in the RTP circuit breaker. Initial 477 measurements in LTE networks have shown that the extra sensitivity is 478 helpful in that environment, with the full TCP throughput equation 479 giving a more balanced circuit breaker response than the simplified 480 TCP equation [Sarker]; other networks might see similar behaviour. 482 No matter what TCP throughput equation is chosen, two parameters need 483 to be estimated and reported to the sender in order to calculate the 484 throughput: the round trip time, R, and the loss event rate, p (the 485 packet size, s, is known to the sender). The round trip time can be 486 estimated from RTCP SR and RR packets. This is done too infrequently 487 for accurate statistics, but is the best that can be done with the 488 standard RTCP mechanisms. 
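The simplified throughput equation needs only the three inputs named in this section: s, R, and p. The following is a minimal Python sketch of that calculation; the function name is hypothetical, and b defaults to 1 as an assumption carried over from the variable definitions.

```python
import math


def tcp_throughput_simplified(s: float, R: float, p: float, b: int = 1) -> float:
    """Estimate TCP throughput X in bytes/second using the simplified form.

    X = s / (R * sqrt(2*b*p/3))

    s: packet size in bytes, known to the sender
    R: round trip time in seconds, estimated from RTCP SR and RR packets
    p: loss event rate, greater than 0 and at most 1.0
    """
    if s <= 0 or R <= 0:
        raise ValueError("s and R must be positive")
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the range (0, 1]")
    return s / (R * math.sqrt(2.0 * b * p / 3.0))
```

For example, with s = 1200 bytes, R = 0.1 seconds, and p = 0.01, the estimate is roughly 147,000 bytes/second. Because this form omits the positive timeout term in the denominator, it never yields a lower estimate than the full equation for the same inputs, which is consistent with the full equation triggering the circuit breaker earlier.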
490 Report blocks in RTCP SR or RR packets contain the packet loss 491 fraction, rather than the loss event rate, so p cannot be reported 492 (TCP typically treats the loss of multiple packets within a single 493 RTT as one loss event, but RTCP RR packets report the overall 494 fraction of packets lost, and do not report when the packet losses 495 occurred). Using the loss fraction in place of the loss event rate 496 can overestimate the loss. We believe that this overestimate will 497 not be significant, given that we are only interested in order of 498 magnitude comparison ([Floyd] section 3.2.1 shows that the difference 499 is small for steady-state conditions and random loss, but using the 500 loss fraction is more conservative in the case of bursty loss). 502 The congestion circuit breaker is therefore: when a sender receives 503 an RTCP SR or RR packet that contains a report block for an SSRC it 504 is using, the sender MUST check the fraction lost field in the report 505 block to determine if there is a non-zero packet loss rate. If the 506 fraction lost field is zero, then continue sending as normal. If the 507 fraction lost is greater than zero, then estimate the TCP throughput 508 that would be achieved over the path using the chosen TCP throughput 509 equation and the measured values of the round-trip time, R, the loss 510 event rate, p (as approximated by the fraction lost), and the packet 511 size, s. Compare this with the actual sending rate. If the actual 512 sending rate has been more than ten times the TCP throughput estimate 513 for three (or more) consecutive RTCP reporting intervals, then the 514 congestion circuit breaker is triggered. 516 When the congestion circuit breaker is triggered, the sender SHOULD 517 cease transmission (see Section 4.5). 
However, if the sender is able 518 to reduce its sending rate by a factor of (approximately) ten, then 519 it MAY first reduce its sending rate by this factor (or some larger 520 amount) to see if that resolves the congestion. If the sending rate 521 is reduced in this way and the congestion circuit breaker triggers 522 again after the next three RTCP reporting intervals, the sender MUST 523 then cease transmission. An example of such a rate reduction might 524 be a video conferencing system that backs off to sending audio only, 525 before completely dropping the call. If such a reduction in sending 526 rate resolves the congestion problem, the sender MAY gradually 527 increase the rate at which it sends data after a reasonable amount of 528 time has passed, provided it takes care not to cause the problem to 529 recur ("reasonable" is intentionally not defined here). 531 The congestion circuit breaker depends on the fraction of RTP data 532 packets lost in a reporting interval. If the number of packets sent 533 in the reporting interval is too low, this statistic loses meaning, 534 and it is possible that a sampling error can give the appearance of 535 high packet loss rates. Following the guidelines in [RFC5405], an 536 RTP sender that sends not more than one RTP packet per RTT MAY ignore 537 a single trigger of the congestion circuit breaker, on the basis that 538 the packet loss rate estimate is unreliable with so few samples. 539 However, if the congestion circuit breaker triggers again after the 540 following three RTCP reporting intervals (i.e., if there have been 541 six or more consecutive RTCP reporting intervals where the actual 542 sending rate is more than ten times the estimated sending rate 543 derived from the TCP throughput equation), then the sender SHOULD 544 cease transmission (see Section 4.5). 546 The RTCP reporting interval of the media sender does not affect how 547 quickly the congestion circuit breaker can trigger. 
      The timing is based
548   on the RTCP reporting interval of the receiver that generates the SR/
549   RR packets from which the loss rate and RTT estimate are derived
550   (note that RTCP requires all participants in a session to have
551   similar reporting intervals, else the participant timeout rules in
552   [RFC3550] will not work, so this interval is likely similar to that
553   of the sender). If the incoming RTCP SR or RR packets are using a
554   reduced minimum RTCP reporting interval (as specified in Section 6.2
555   of RFC 3550 [RFC3550] or the RTP/AVPF profile [RFC4585]), then that
556   reduced RTCP reporting interval is used when determining if the
557   circuit breaker is triggered.

559   As in Section 4.1 and Section 4.2, we use three reporting intervals
560   to avoid triggering the circuit breaker on transient failures. This
561   circuit breaker is a worst-case condition, and congestion control
562   needs to be performed to keep well within this bound. It is expected
563   that the circuit breaker will only be triggered if the usual
564   congestion control fails for some reason.

566   If there are more media streams than can be reported in a single RTCP
567   SR or RR packet, or if the size of a complete RTCP SR or RR packet
568   exceeds the network MTU, then the receiver will report on a subset of
569   sources in each reporting interval, with the subsets selected round-
570   robin across multiple intervals so that all sources are eventually
571   reported [RFC3550]. When generating such round-robin RTCP reports,
572   priority SHOULD be given to reports on sources that have high packet
573   loss rates, to ensure that senders are aware of network congestion
574   they are causing (this is an update to [RFC3550]).

576 4.4. RTP/AVP Circuit Breaker #4: Media Usability

578   Applications that use RTP are generally tolerant of some amount of
579   packet loss.
      How much packet loss can be tolerated will depend on
580   the application, media codec, and the amount of error correction and
581   packet loss concealment that is applied. There is an upper bound on
582   the amount of loss that can be corrected, however, beyond which the
583   media becomes unusable. Similarly, many applications have some upper
584   bound on the media capture to play-out latency that can be tolerated
585   before the application becomes unusable. The latency bound will depend
586   on the application, but typical values can range from the order of a few
587   hundred milliseconds for voice telephony and interactive conferencing
588   applications, up to several seconds for some video-on-demand systems.

590   As a final circuit breaker, RTP senders SHOULD monitor the reported
591   packet loss and delay to estimate whether the media is likely to be
592   suitable for the intended purpose. If the packet loss rate and/or
593   latency is such that the media has become unusable, and has remained
594   unusable for a significant time period, then the application SHOULD
595   cease transmission. Similarly, receivers SHOULD monitor the quality
596   of the media they receive, and if the quality is unusable for a
597   significant time period, they SHOULD terminate the session. This
598   memo intentionally does not define a bound on the packet loss rate or
599   latency that will result in unusable media, nor does it specify what
600   time period is deemed significant, as these are highly application
601   dependent.

603   Sending media that suffers from such high packet loss or latency that
604   it is unusable at the receiver is both wasteful of resources, and of
605   no benefit to the user of the application. It also is highly likely
606   to be congesting the network, and disrupting other applications. As
607   such, the congestion circuit breaker will almost certainly trigger to
608   stop flows where the media would be unusable due to high packet loss
609   or latency.
      However, in pathological scenarios where the congestion
610   circuit breaker does not stop the flow, it is desirable that the RTP
611   application cease sending useless traffic. The role of the media
612   usability circuit breaker is to protect the network in such cases.

614 4.5. Ceasing Transmission

616   What it means to cease transmission depends on the application, but
617   the intention is that the application will stop sending RTP data
618   packets to a particular destination 3-tuple (transport protocol,
619   destination port, IP address), until the user makes an explicit
620   attempt to restart the call. It is important that a human user is
621   involved in the decision to try to restart the call, since that user
622   will eventually give up if the calls repeatedly trigger the circuit
623   breaker. This will help prevent automatic redial systems
624   from congesting the network. Accordingly, RTP flows halted by the
625   circuit breaker SHOULD NOT be restarted automatically unless the
626   sender has received information that the congestion has dissipated.

628   It is recognised that the RTP implementation in some systems might
629   not be able to determine if a call set-up request was initiated by a
630   human user, or automatically by some scripted higher-level component
631   of the system. These implementations SHOULD rate limit attempts to
632   restart a call to the same destination 3-tuple as used by a previous
633   call that was recently halted by the circuit breaker. The chosen
634   rate limit ought not to exceed the rate at which an annoyed human
635   caller might redial a misbehaving phone.

637 5. RTP Circuit Breakers for Systems Using the RTP/AVPF Profile

639   Use of the Extended RTP Profile for RTCP-based Feedback (RTP/AVPF)
640   [RFC4585] allows receivers to send early RTCP reports in some cases,
641   to inform the sender about particular events in the media stream.
642   There are several use cases for such early RTCP reports, including
643   providing rapid feedback to a sender about the onset of congestion.

645   Receiving rapid feedback about congestion events potentially allows
646   congestion control algorithms to be more responsive, and to better
647   adapt the media transmission to the limitations of the network. It
648   is expected that many RTP congestion control algorithms will adopt
649   the RTP/AVPF profile for this reason, defining new transport layer
650   feedback reports that suit their requirements. Since these reports
651   are not yet defined, and likely very specific to the details of the
652   congestion control algorithm chosen, they cannot be used as part of
653   the generic RTP circuit breaker.

655   Reduced-size RTCP reports sent under the RTP/AVPF early feedback
656   rules that do not contain an RTCP SR or RR packet MUST be ignored by
657   the congestion circuit breaker (they do not contain the information
658   needed by the congestion circuit breaker algorithm), but MUST be
659   counted as received packets for the RTCP timeout circuit breaker.
660   Reduced-size RTCP reports sent under the RTP/AVPF early feedback
661   rules that contain RTCP SR or RR packets MUST be processed by the
662   congestion circuit breaker as if they were sent as regular RTCP
663   reports, and counted towards the circuit breaker conditions specified
664   in Section 4 of this memo. This will potentially make the RTP
665   circuit breaker fire earlier than it would if the RTP/AVPF profile
666   was not used.

668   When using ECN with RTP (see Section 8), early RTCP feedback packets
669   can contain ECN feedback reports. The count of ECN-CE marked packets
670   contained in those ECN feedback reports is counted towards the number
671   of lost packets reported if the ECN Feedback Report is sent in
672   a compound RTCP packet along with an RTCP SR/RR report packet.
673   Reports of ECN-CE packets sent as reduced-size RTCP ECN feedback
674   packets without an RTCP SR/RR packet MUST be ignored.

676   These rules are intended to allow the use of low-overhead RTP/AVPF
677   feedback for generic NACK messages without triggering the RTP circuit
678   breaker. This is expected to make such feedback suitable for RTP
679   congestion control algorithms that need to quickly report loss events
680   in between regular RTCP reports. The reaction to reduced-size RTCP
681   SR/RR packets is to allow such algorithms to send feedback that can
682   trigger the circuit breaker, when desired.

684 6. Impact of RTCP Extended Reports (XR)

686   RTCP Extended Report (XR) blocks provide additional reception quality
687   metrics, but do not change the RTCP timing rules. Some of the RTCP
688   XR blocks provide information that might be useful for congestion
689   control purposes, others provide non-congestion-related metrics.
690   With the exception of RTCP XR ECN Summary Reports (see Section 8),
691   the presence of RTCP XR blocks in a compound RTCP packet does not
692   affect the RTP circuit breaker algorithm. For consistency and ease
693   of implementation, only the reception report blocks contained in RTCP
694   SR packets, RTCP RR packets, or RTCP XR ECN Summary Report packets,
695   are used by the RTP circuit breaker algorithm.

697 7. Impact of RTCP Reporting Groups

699   An optimisation for grouping RTCP reception statistics and other
700   feedback in RTP sessions with large numbers of participants is given
701   in [I-D.ietf-avtcore-rtp-multi-stream-optimisation]. This allows one
702   SSRC to act as a representative that sends reports on behalf of other
703   SSRCs that are co-located in the same endpoint and see identical
704   reception quality. When running the circuit breaker algorithms, an
705   endpoint MUST treat a reception report from the representative of the
706   reporting group as if a reception report was received from all
707   members of that group.

709 8. Impact of Explicit Congestion Notification (ECN)

711   The use of ECN for RTP flows does not affect the media timeout RTP
712   circuit breaker (Section 4.1) or the RTCP timeout circuit breaker
713   (Section 4.2), since these are both connectivity checks that simply
714   determine if any packets are being received.

716   ECN-CE marked packets SHOULD be treated as if they were lost for the
717   purposes of congestion control, when determining the optimal media
718   sending rate for an RTP flow. If an RTP sender has negotiated ECN
719   support for an RTP session, and has successfully initiated ECN use on
720   the path to the receiver [RFC6679], then ECN-CE marked packets SHOULD
721   be treated as if they were lost when calculating if the congestion-
722   based RTP circuit breaker (Section 4.3) has been met. The count of
723   ECN-CE marked RTP packets is returned in RTCP XR ECN summary report
724   packets if support for ECN has been initiated for an RTP session.

726 9. Impact of Bundled Media and Layered Coding

728   The RTP circuit breaker operates on a per-RTP session basis. An RTP
729   sender that participates in several RTP sessions MUST treat each RTP
730   session independently with regard to the RTP circuit breaker.

732   An RTP sender can generate several media streams within a single RTP
733   session, with each stream using a different SSRC. This can happen if
734   bundled media are in use, when using simulcast, or when using layered
735   media coding. By default, each SSRC will be treated independently by
736   the RTP circuit breaker. However, the sender MAY choose to treat the
737   flows (or a subset thereof) as a group, such that a circuit breaker
738   trigger for one flow applies to the group of flows as a whole, and
739   either causes the entire group to cease transmission, or the sending
740   rate of the group to reduce by a factor of ten, depending on the RTP
741   circuit breaker triggered.
      Grouping flows in this way is expected to
742   be especially useful for layered flows sent using multiple SSRCs, as
743   it allows the layered flow to react as a whole, ceasing transmission
744   on the enhancement layers first to reduce sending rate if necessary,
745   rather than treating each layer independently.

747 10. Security Considerations

749   The security considerations of [RFC3550] apply.

751   If the RTP/AVPF profile is used to provide rapid RTCP feedback, the
752   security considerations of [RFC4585] apply. If ECN feedback for RTP
753   over UDP/IP is used, the security considerations of [RFC6679] apply.

755   If non-authenticated RTCP reports are used, an on-path attacker can
756   trivially generate fake RTCP packets that indicate high packet loss
757   rates, causing the circuit breaker to trigger and disrupting an RTP
758   session. This is somewhat more difficult for an off-path attacker,
759   due to the need to guess the randomly chosen RTP SSRC value and the
760   RTP sequence number. This attack can be avoided if RTCP packets are
761   authenticated; authentication options are discussed in [RFC7201].

763   Timely operation of the RTP circuit breaker depends on the choice of
764   RTCP reporting interval. If the receiver has a reporting interval
765   that is overly long, then the responsiveness of the circuit breaker
766   decreases. In the limit, the RTP circuit breaker can be disabled for
767   all practical purposes by configuring an RTCP reporting interval that
768   is many minutes in duration. This issue is not specific to the circuit
769   breaker: long RTCP reporting intervals also prevent reception quality
770   reports, feedback messages, codec control messages, etc., from being
771   used. Implementations SHOULD impose an upper limit on the RTCP
772   reporting interval they are willing to negotiate (based on the
773   session bandwidth and RTCP bandwidth fraction) when using the RTP
774   circuit breaker.
      An upper limit on the reporting interval on the
775   order of 10 seconds is a reasonable bound.

777 11. IANA Considerations

779   There are no actions for IANA.

781 12. Open Issues

783   o Should the number of RTCP reporting intervals needed to trigger
784     the media timeout and congestion circuit breakers scale with the
785     duration of the RTCP reporting interval, so the circuit breaker
786     triggers after a fixed duration, rather than after a fixed number
787     of reporting intervals?

789 13. Acknowledgements

791   The authors would like to thank Bernard Aboba, Harald Alvestrand,
792   Gorry Fairhurst, Kevin Gross, Cullen Jennings, Randell Jesup,
793   Jonathan Lennox, Matt Mathis, Stephen McQuistin, Eric Rescorla,
794   Abheek Saha, and Fabio Verdicchio, for their valuable feedback.

796 14. References
797 14.1. Normative References

799   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
800             Requirement Levels", BCP 14, RFC 2119, March 1997.

802   [RFC3448] Handley, M., Floyd, S., Padhye, J., and J. Widmer, "TCP
803             Friendly Rate Control (TFRC): Protocol Specification", RFC
804             3448, January 2003.

806   [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
807             Jacobson, "RTP: A Transport Protocol for Real-Time
808             Applications", STD 64, RFC 3550, July 2003.

810   [RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
811             Video Conferences with Minimal Control", STD 65, RFC 3551,
812             July 2003.

814   [RFC3611] Friedman, T., Caceres, R., and A. Clark, "RTP Control
815             Protocol Extended Reports (RTCP XR)", RFC 3611, November
816             2003.

818   [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
819             "Extended RTP Profile for Real-time Transport Control
820             Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July
821             2006.

823 14.2. Informative References

825   [Floyd]   Floyd, S., Handley, M., Padhye, J., and J. Widmer,
826             "Equation-Based Congestion Control for Unicast
827             Applications", Proceedings of the ACM SIGCOMM conference,
828             2000, DOI 10.1145/347059.347397, August 2000.

830   [I-D.ietf-avtcore-rtp-multi-stream-optimisation]
831             Lennox, J., Westerlund, M., Wu, W., and C. Perkins,
832             "Sending Multiple Media Streams in a Single RTP Session:
833             Grouping RTCP Reception Statistics and Other Feedback",
834             draft-ietf-avtcore-rtp-multi-stream-optimisation-04 (work
835             in progress), August 2014.

837   [Mathis]  Mathis, M., Semke, J., Mahdavi, J., and T. Ott, "The
838             macroscopic behavior of the TCP congestion avoidance
839             algorithm", ACM SIGCOMM Computer Communication Review
840             27(3), DOI 10.1145/263932.264023, July 1997.

842   [Padhye]  Padhye, J., Firoiu, V., Towsley, D., and J. Kurose,
843             "Modeling TCP Throughput: A Simple Model and its Empirical
844             Validation", Proceedings of the ACM SIGCOMM conference,
845             1998, DOI 10.1145/285237.285291, August 1998.

847   [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
848             of Explicit Congestion Notification (ECN) to IP", RFC
849             3168, September 2001.

851   [RFC5104] Wenger, S., Chandra, U., Westerlund, M., and B. Burman,
852             "Codec Control Messages in the RTP Audio-Visual Profile
853             with Feedback (AVPF)", RFC 5104, February 2008.

855   [RFC5124] Ott, J. and E. Carrara, "Extended Secure RTP Profile for
856             Real-time Transport Control Protocol (RTCP)-Based Feedback
857             (RTP/SAVPF)", RFC 5124, February 2008.

859   [RFC5405] Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines
860             for Application Designers", BCP 145, RFC 5405, November
861             2008.

863   [RFC5450] Singer, D. and H. Desineni, "Transmission Time Offsets in
864             RTP Streams", RFC 5450, March 2009.

866   [RFC5506] Johansson, I. and M. Westerlund, "Support for Reduced-Size
867             Real-Time Transport Control Protocol (RTCP): Opportunities
868             and Consequences", RFC 5506, April 2009.

870   [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
871             Control", RFC 5681, September 2009.

873   [RFC6051] Perkins, C. and T. Schierl, "Rapid Synchronisation of RTP
874             Flows", RFC 6051, November 2010.

876   [RFC6679] Westerlund, M., Johansson, I., Perkins, C., O'Hanlon, P.,
877             and K. Carlberg, "Explicit Congestion Notification (ECN)
878             for RTP over UDP", RFC 6679, August 2012.

880   [RFC6798] Clark, A. and Q. Wu, "RTP Control Protocol (RTCP) Extended
881             Report (XR) Block for Packet Delay Variation Metric
882             Reporting", RFC 6798, November 2012.

884   [RFC6843] Clark, A., Gross, K., and Q. Wu, "RTP Control Protocol
885             (RTCP) Extended Report (XR) Block for Delay Metric
886             Reporting", RFC 6843, January 2013.

888   [RFC6958] Clark, A., Zhang, S., Zhao, J., and Q. Wu, "RTP Control
889             Protocol (RTCP) Extended Report (XR) Block for Burst/Gap
890             Loss Metric Reporting", RFC 6958, May 2013.

892   [RFC7002] Clark, A., Zorn, G., and Q. Wu, "RTP Control Protocol
893             (RTCP) Extended Report (XR) Block for Discard Count Metric
894             Reporting", RFC 7002, September 2013.

896   [RFC7003] Clark, A., Huang, R., and Q. Wu, "RTP Control Protocol
897             (RTCP) Extended Report (XR) Block for Burst/Gap Discard
898             Metric Reporting", RFC 7003, September 2013.

900   [RFC7097] Ott, J., Singh, V., and I. Curcio, "RTP Control Protocol
901             (RTCP) Extended Report (XR) for RLE of Discarded Packets",
902             RFC 7097, January 2014.

904   [RFC7201] Westerlund, M. and C. Perkins, "Options for Securing RTP
905             Sessions", RFC 7201, April 2014.

907   [Sarker]  Sarker, Z., Singh, V., and C.S. Perkins, "An Evaluation of
908             RTP Circuit Breaker Performance on LTE Networks",
909             Proceedings of the IEEE Infocom workshop on Communication
910             and Networking Techniques for Contemporary Video, 2014,
911             April 2014.

913   [Singh]   Singh, V., McQuistin, S., Ellis, M., and C.S. Perkins,
914             "Circuit Breakers for Multimedia Congestion Control",
915             Proceedings of the International Packet Video Workshop,
916             2013, DOI 10.1109/PV.2013.6691439, December 2013.
918 Authors' Addresses

920   Colin Perkins
921   University of Glasgow
922   School of Computing Science
923   Glasgow G12 8QQ
924   United Kingdom

926   Email: csp@csperkins.org

927   Varun Singh
928   Aalto University
929   School of Electrical Engineering
930   Otakaari 5 A
931   Espoo, FIN 02150
932   Finland

934   Email: varun@comnet.tkk.fi
935   URI: http://www.netlab.tkk.fi/~varun/