idnits 2.17.1

draft-ietf-avtcore-rtp-circuit-breakers-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The draft header indicates that this document updates RFC3550, but the
     abstract doesn't seem to mention this, which it should.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

     (Using the creation date from RFC3550, updated by this document, for
     RFC5378 checks: 1998-04-07)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (October 27, 2014) is 3462 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'RFC5506' is defined on line 884, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 3448 (Obsoleted by RFC 5348)

  == Outdated reference: A later version (-12) exists of
     draft-ietf-avtcore-rtp-multi-stream-optimisation-04

  -- Obsolete informational reference (is this intentional?): RFC 5405
     (Obsoleted by RFC 8085)

     Summary: 1 error (**), 0 flaws (~~), 3 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

AVTCORE Working Group                                      C. S. Perkins
Internet-Draft                                     University of Glasgow
Updates: 3550 (if approved)                                     V. Singh
Intended status: Standards Track                        Aalto University
Expires: April 30, 2015                                 October 27, 2014

Multimedia Congestion Control: Circuit Breakers for Unicast RTP Sessions
               draft-ietf-avtcore-rtp-circuit-breakers-07

Abstract

   The Real-time Transport Protocol (RTP) is widely used in telephony,
   video conferencing, and telepresence applications. Such applications
   are often run on best-effort UDP/IP networks. If congestion control
   is not implemented in the applications, then network congestion will
   degrade the user's multimedia experience. This document does not
   propose a congestion control algorithm; instead, it defines a minimal
   set of RTP "circuit breakers". Circuit breakers are conditions under
   which an RTP sender needs to stop transmitting media data in order to
   protect the network from excessive congestion.
   It is expected that, in the absence of severe congestion, all RTP
   applications running on best-effort IP networks will be able to run
   without triggering these circuit breakers. Any future RTP congestion
   control specification will be expected to operate within the
   constraints defined by these circuit breakers.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 30, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Background
   4.  RTP Circuit Breakers for Systems Using the RTP/AVP Profile
     4.1.  RTP/AVP Circuit Breaker #1: Media Timeout
     4.2.  RTP/AVP Circuit Breaker #2: RTCP Timeout
     4.3.  RTP/AVP Circuit Breaker #3: Congestion
     4.4.  RTP/AVP Circuit Breaker #4: Media Usability
     4.5.  Ceasing Transmission
   5.  RTP Circuit Breakers for Systems Using the RTP/AVPF Profile
   6.  Impact of RTCP Extended Reports (XR)
   7.  Impact of RTCP Reporting Groups
   8.  Impact of Explicit Congestion Notification (ECN)
   9.  Impact of Layered Coding
   10. Security Considerations
   11. IANA Considerations
   12. Open Issues
   13. Acknowledgements
   14. References
     14.1.  Normative References
     14.2.  Informative References
   Authors' Addresses

1.  Introduction

   The Real-time Transport Protocol (RTP) [RFC3550] is widely used in
   voice-over-IP, video teleconferencing, and telepresence systems.
   Many of these systems run over best-effort UDP/IP networks, and can
   suffer from packet loss and increased latency if network congestion
   occurs. Designing effective RTP congestion control algorithms, to
   adapt the transmission of RTP-based media to match the available
   network capacity, while also maintaining the user experience, is a
   difficult but important problem.
   Many such congestion control and media adaptation algorithms have
   been proposed, but to date there is no consensus on the correct
   approach, or even that a single standard algorithm is desirable.

   This memo does not attempt to propose a new RTP congestion control
   algorithm. Rather, it proposes a minimal set of "circuit breakers":
   conditions under which there is general agreement that an RTP flow is
   causing serious congestion, and ought to cease transmission. It is
   expected that future standards-track congestion control algorithms
   for RTP will operate within the envelope defined by this memo.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].
   This interpretation of these key words applies only when written in
   ALL CAPS. Mixed- or lower-case uses of these key words are not to be
   interpreted as carrying special significance in this memo.

3.  Background

   We consider congestion control for unicast RTP traffic flows. This
   is the problem of adapting the transmission of an audio/visual data
   flow, encapsulated within an RTP transport session, from one sender
   to one receiver, so that it matches the available network bandwidth.
   Such adaptation needs to be done in a way that limits the disruption
   to the user experience caused by both packet loss and excessive rate
   changes. Congestion control for multicast flows is outside the scope
   of this memo. Multicast traffic needs different solutions, since the
   available bandwidth estimator for a group of receivers will differ
   from that for a single receiver, and because multicast congestion
   control has to consider issues of fairness across groups of receivers
   that do not apply to unicast flows.

   Congestion control for unicast RTP traffic can be implemented in one
   of two places in the protocol stack. One approach is to run the RTP
   traffic over a congestion controlled transport protocol, for example
   over TCP, and to adapt the media encoding to match the dictates of
   the transport-layer congestion control algorithm. This is safe for
   the network, but can be suboptimal for the media quality unless the
   transport protocol is designed to support real-time media flows. We
   do not consider this class of applications further in this memo, as
   their network safety is guaranteed by the underlying transport.

   Alternatively, RTP flows can be run over a non-congestion controlled
   transport protocol, for example UDP, performing rate adaptation at
   the application layer based on RTP Control Protocol (RTCP) feedback.
   With a well-designed, network-aware application, this allows highly
   effective media quality adaptation, but there is potential to disrupt
   the network's operation if the application does not adapt its sending
   rate in a timely and effective manner. We consider this class of
   applications in this memo.

   Congestion control relies on monitoring the delivery of a media flow,
   and responding to adapt the transmission of that flow when there are
   signs that the network path is congested. Network congestion can be
   detected in one of three ways: 1) a receiver can infer the onset of
   congestion by observing an increase in one-way delay caused by queue
   build-up within the network; 2) if Explicit Congestion Notification
   (ECN) [RFC3168] is supported, the network can signal the presence of
   congestion by marking packets using ECN Congestion Experienced (CE)
   marks; or 3) in the extreme case, congestion will cause packet loss
   that can be detected by observing a gap in the received RTP sequence
   numbers.
   Once the onset of congestion is observed, the receiver has
   to send feedback to the sender to indicate that the transmission rate
   needs to be reduced. How the sender reduces the transmission rate is
   highly dependent on the media codec being used, and is outside the
   scope of this memo.

   There are several ways in which a receiver can send feedback to a
   media sender within the RTP framework:

   o  The base RTP specification [RFC3550] defines RTCP Reception Report
      (RR) packets to convey reception quality feedback information, and
      Sender Report (SR) packets to convey information about the media
      transmission. RTCP SR packets contain data that can be used to
      reconstruct media timing at a receiver, along with a count of the
      total number of octets and packets sent. RTCP RR packets report
      on the fraction of packets lost in the last reporting interval,
      the cumulative number of packets lost, the highest sequence number
      received, and the inter-arrival jitter. The RTCP RR packets also
      contain timing information that allows the sender to estimate the
      network round trip time (RTT) to the receivers. RTCP reports are
      sent periodically, with the reporting interval being determined by
      the number of SSRCs used in the session and a configured session
      bandwidth estimate (the number of SSRCs used is usually two in a
      unicast session, one for each participant, but can be greater if
      the participants send multiple media streams). The interval
      between reports sent from each receiver tends to be on the order
      of a few seconds on average. It varies with the session bandwidth
      (sub-second reporting intervals are possible in high-bandwidth
      sessions) and is randomised to avoid synchronisation of reports
      from multiple receivers. RTCP RR packets allow a receiver to
      report ongoing network congestion to the sender.
      However, if a receiver detects the onset of congestion part way
      through a reporting interval, the base RTP specification contains
      no provision for sending the RTCP RR packet early, and the
      receiver has to wait until the next scheduled reporting interval.

   o  The RTCP Extended Reports (XR) [RFC3611] allow reporting of more
      complex and sophisticated reception quality metrics, but do not
      change the RTCP timing rules. RTCP extended reports of potential
      interest for congestion control purposes are the extended packet
      loss, discard, and burst metrics [RFC3611], [RFC7002], [RFC7097],
      [RFC7003], [RFC6958]; and the extended delay metrics [RFC6843],
      [RFC6798]. Other RTCP Extended Reports that could be helpful for
      congestion control purposes might be developed in future.

   o  Rapid feedback about the occurrence of congestion events can be
      achieved using the Extended RTP Profile for RTCP-Based Feedback
      (RTP/AVPF) [RFC4585] (or its secure variant, RTP/SAVPF [RFC5124])
      in place of the RTP/AVP profile [RFC3551]. This modifies the RTCP
      timing rules to allow RTCP reports to be sent early, in some cases
      immediately, provided the RTCP transmission rate keeps within its
      bandwidth allocation. It also defines transport-layer feedback
      messages, including negative acknowledgements (NACKs), that can be
      used to report on specific congestion events. RTP Codec Control
      Messages [RFC5104] extend the RTP/AVPF profile with additional
      feedback messages that can be used to influence the way in which
      rate adaptation occurs, but do not further change the dynamics of
      how rapidly feedback can be sent. Use of the RTP/AVPF profile is
      dependent on signalling.

   o  Finally, Explicit Congestion Notification (ECN) for RTP over UDP
      [RFC6679] can be used to provide feedback on the number of packets
      that received an ECN Congestion Experienced (CE) mark.
      This RTCP extension builds on the RTP/AVPF profile to allow rapid
      congestion feedback when ECN is supported.

   In addition to these mechanisms for providing feedback, the sender
   can include an RTP header extension in each packet to record packet
   transmission times. There are two methods: [RFC5450] represents the
   transmission time in terms of a time-offset from the RTP timestamp of
   the packet, while [RFC6051] includes an explicit NTP-format sending
   timestamp (potentially more accurate, but with a higher header
   overhead). Accurate sending timestamps can be helpful for estimating
   queuing delays, to get an early indication of the onset of
   congestion.

   Taken together, these various mechanisms allow receivers to provide
   feedback to the senders when congestion events occur, with varying
   degrees of timeliness and accuracy. The key distinction is between
   systems that use only the basic RTCP mechanisms, without RTP/AVPF
   rapid feedback, and those that use the RTP/AVPF extensions to respond
   to congestion more rapidly.

4.  RTP Circuit Breakers for Systems Using the RTP/AVP Profile

   The feedback mechanisms defined in [RFC3550] and available under the
   RTP/AVP profile [RFC3551] are the minimum that can be assumed for a
   baseline circuit breaker mechanism that is suitable for all unicast
   applications of RTP. Accordingly, for an RTP circuit breaker to be
   useful, it needs to be able to detect that an RTP flow is causing
   excessive congestion using only basic RTCP features, without needing
   RTCP XR feedback or the RTP/AVPF profile for rapid RTCP reports.

   RTCP is a fundamental part of the RTP protocol, and the mechanisms
   described here rely on the implementation of RTCP. Implementations
   that claim to support RTP, but that do not implement RTCP, cannot use
   the circuit breaker mechanisms described in this memo.
   Such implementations SHOULD NOT be used on networks that might be
   subject to congestion unless equivalent mechanisms are defined using
   some non-RTCP feedback channel to report congestion and signal
   circuit breaker conditions.

   Three potential congestion signals are available from the basic RTCP
   SR/RR packets and are reported for each synchronisation source (SSRC)
   in the RTP session:

   1.  The sender can estimate the network round-trip time once per RTCP
       reporting interval, based on the contents and timing of RTCP SR
       and RR packets.

   2.  Receivers report a jitter estimate (the statistical variance of
       the RTP data packet inter-arrival time) calculated over the RTCP
       reporting interval. Due to the nature of the jitter calculation
       ([RFC3550], section 6.4.4), the jitter is only meaningful for RTP
       flows that send a single data packet for each RTP timestamp value
       (i.e., audio flows, or video flows where each packet comprises
       one video frame).

   3.  Receivers report the fraction of RTP data packets lost during the
       RTCP reporting interval, and the cumulative number of RTP packets
       lost over the entire RTP session.

   These congestion signals limit the possible circuit breakers, since
   they give only limited visibility into the behaviour of the network.

   RTT estimates are widely used in congestion control algorithms, as a
   proxy for queuing delay measures in delay-based congestion control or
   to determine connection timeouts. RTT estimates derived from RTCP SR
   and RR packets sent according to the RTP/AVP timing rules are too
   infrequent to be useful, however, and do not give enough information
   to distinguish a delay change due to routing updates from queuing
   delay caused by congestion. Accordingly, we cannot use the RTT
   estimate alone as an RTP circuit breaker.
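
   As an illustration of the first and third signals listed above, the
   following Python sketch shows how a sender might turn the fields of
   a received report block into a round-trip time estimate and a loss
   fraction, following the arithmetic in Section 6.4.1 of [RFC3550].
   The function and parameter names are hypothetical and do not belong
   to any RTP library. The arrival time is assumed to be expressed as
   the middle 32 bits of an NTP timestamp, the same format as the LSR
   field.

```python
from typing import Optional

NTP_MID32_UNITS_PER_SECOND = 65536.0  # LSR and DLSR count 1/65536 s


def rtt_from_report_block(arrival_mid32: int, lsr: int,
                          dlsr: int) -> Optional[float]:
    """Return an RTT estimate in seconds, or None if it is unavailable.

    arrival_mid32: time the report block arrived at the sender, as the
                   middle 32 bits of an NTP timestamp (assumed input).
    lsr:           "last SR timestamp" field of the report block.
    dlsr:          "delay since last SR" field of the report block.
    """
    if lsr == 0:
        # The receiver has not yet seen an SR from this sender.
        return None
    # All three values are 32-bit counters, so subtract modulo 2^32.
    rtt_units = (arrival_mid32 - lsr - dlsr) & 0xFFFFFFFF
    return rtt_units / NTP_MID32_UNITS_PER_SECOND


def loss_fraction(fraction_lost_field: int) -> float:
    """Convert the 8-bit fixed-point 'fraction lost' field to 0.0..1.0."""
    return (fraction_lost_field & 0xFF) / 256.0
```

   With the field values from the worked example in [RFC3550]
   (arrival 0xb7108000, LSR 0xb7052000, DLSR 0x00054000) the sketch
   yields 6.125 seconds. One such estimate is available per report
   block, which is why the text above treats it as too infrequent to
   act as a circuit breaker on its own.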

   Increased jitter can be a signal of transient network congestion, but
   in the highly aggregated form reported in RTCP RR packets, it offers
   insufficient information to estimate the extent or persistence of
   congestion. Jitter reports are a useful early warning of potential
   network congestion, but provide an insufficiently strong signal to be
   used as a circuit breaker.

   The remaining congestion signals are the packet loss fraction and the
   cumulative number of packets lost. If considered carefully, these
   can be effective indicators that congestion is occurring in networks
   where packet loss is primarily due to queue overflows, although loss
   caused by non-congestive packet corruption can distort the result in
   some networks. TCP congestion control [RFC5681] intentionally tries
   to fill the router queues, and uses the resulting packet loss as
   congestion feedback. An RTP flow competing with TCP traffic will
   therefore expect to see a non-zero packet loss fraction that has to
   be related to TCP dynamics to estimate available capacity. This
   behaviour of TCP is reflected in the congestion circuit breaker
   below, and will affect the design of any RTP congestion control
   protocol.

   Two packet loss regimes can be observed: 1) RTCP RR packets show a
   non-zero packet loss fraction, while the extended highest sequence
   number received continues to increment; and 2) RR packets show a loss
   fraction of zero, but the extended highest sequence number received
   does not increment even though the sender has been transmitting RTP
   data packets. The former corresponds to the TCP congestion avoidance
   state, and indicates a congested path that is still delivering data;
   the latter corresponds to a TCP timeout, and is most likely due to a
   path failure. A third condition is that data is being sent but no
   RTCP feedback is received at all, corresponding to a failure of the
   reverse path.
   We derive circuit breaker conditions for these loss regimes in the
   following.

4.1.  RTP/AVP Circuit Breaker #1: Media Timeout

   If RTP data packets are being sent, but the RTCP SR or RR packets
   reporting on that SSRC indicate a non-increasing extended highest
   sequence number received, this is an indication that those RTP data
   packets are not reaching the receiver. This could be a short-term
   issue affecting only a few packets, perhaps caused by a slow-to-open
   firewall or a transient connectivity problem, but if the issue
   persists, it is a sign of a more persistent and significant problem.
   Accordingly, if a sender of RTP data packets receives three or more
   consecutive RTCP SR or RR packets from the same receiver, and those
   packets correspond to its transmission and have a non-increasing
   extended highest sequence number received field, then that sender
   SHOULD cease transmission (see Section 4.5). The extended highest
   sequence number received field is non-increasing if the sender
   receives at least three consecutive RTCP SR or RR packets that report
   the same value for this field, but it has sent RTP data packets that
   would have caused an increase in the reported value if they had
   reached the receiver.

   The reason for waiting for three or more consecutive RTCP packets
   with a non-increasing extended highest sequence number is to give
   enough time for transient reception problems to resolve themselves,
   but to stop problem flows quickly enough to avoid causing serious
   ongoing network congestion. A single RTCP report showing no
   reception could be caused by a transient fault, and so does not
   cause transmission to cease.
   Waiting for more than three consecutive RTCP reports
   before stopping a flow might avoid some false positives, but could
   lead to problematic flows running for a long time period (potentially
   tens of seconds, depending on the RTCP reporting interval) before
   being cut off. Equally, an application that sends few packets when
   the packet loss rate is high runs the risk that the media timeout
   circuit breaker triggers inadvertently. The chosen timeout interval
   is a trade-off between these extremes.

4.2.  RTP/AVP Circuit Breaker #2: RTCP Timeout

   In addition to media timeouts, as discussed in Section 4.1, an
   RTP session has the possibility of an RTCP timeout. This can occur
   when RTP data packets are being sent, but there are no RTCP reports
   returned from the receiver. This is either due to a failure of the
   receiver to send RTCP reports, or a failure of the return path that
   is preventing those RTCP reports from being delivered. In either
   case, it is not safe to continue transmission, since the sender has
   no way of knowing if it is causing congestion. Accordingly, an RTP
   sender that has not received any RTCP SR or RTCP RR packets reporting
   on the SSRC it is using for three or more of its RTCP reporting
   intervals SHOULD cease transmission (see Section 4.5). When
   calculating the timeout, the deterministic RTCP reporting interval
   (Td, without the randomization factor, and with a fixed minimum
   interval Tmin=5 seconds) SHOULD be used. The rationale for this
   choice of timeout is as described in Section 6.2 of RFC 3550
   [RFC3550].

   The choice of three RTCP reporting intervals as the timeout is made
   following Section 6.3.5 of RFC 3550 [RFC3550].
   This specifies that participants in an RTP session will time out and
   remove an RTP sender from the list of active RTP senders if no RTP
   data packets have been received from that RTP sender within the last
   two RTCP reporting intervals. Using a timeout of three RTCP
   reporting intervals is therefore large enough that the other
   participants will have timed out the sender if a network problem
   stops the data packets it is sending from reaching the receivers,
   even allowing for loss of some RTCP packets.

   If a sender is transmitting a large number of RTP media streams, such
   that the corresponding RTCP SR or RR packets are too large to fit
   into the network MTU, the receiver will generate RTCP SR or RR
   packets in a round-robin manner. In this case, the sender SHOULD
   treat receipt of an RTCP SR or RR packet corresponding to any SSRC it
   sent on the same 5-tuple of source and destination IP address, port,
   and protocol, as an indication that the receiver and return path are
   working, preventing the RTCP timeout circuit breaker from triggering.

4.3.  RTP/AVP Circuit Breaker #3: Congestion

   If RTP data packets are being sent, and the corresponding RTCP SR or
   RR packets show non-zero packet loss fraction and increasing extended
   highest sequence number received, then those RTP data packets are
   arriving at the receiver, but some degree of congestion is occurring.
   The RTP/AVP profile [RFC3551] states that:

      If best-effort service is being used, RTP receivers SHOULD monitor
      packet loss to ensure that the packet loss rate is within
      acceptable parameters. Packet loss is considered acceptable if a
      TCP flow across the same network path and experiencing the same
      network conditions would achieve an average throughput, measured
      on a reasonable time scale, that is not less than the RTP flow is
      achieving.
      This condition can be satisfied by implementing
      congestion control mechanisms to adapt the transmission rate (or
      the number of layers subscribed for a layered multicast session),
      or by arranging for a receiver to leave the session if the loss
      rate is unacceptably high.

      The comparison to TCP cannot be specified exactly, but is intended
      as an "order-of-magnitude" comparison in time scale and
      throughput. The time scale on which TCP throughput is measured is
      the round-trip time of the connection. In essence, this
      requirement states that it is not acceptable to deploy an
      application (using RTP or any other transport protocol) on the
      best-effort Internet which consumes bandwidth arbitrarily and does
      not compete fairly with TCP within an order of magnitude.

   The phrase "order of magnitude" in the above means within a factor of
   ten, approximately. In order to implement this, it is necessary to
   estimate the throughput a TCP connection would achieve over the path.
   For a long-lived TCP Reno connection, it has been shown that the TCP
   throughput can be estimated using the following equation [Padhye]:

                                     s
   X = --------------------------------------------------------------
       R*sqrt(2*b*p/3) + (t_RTO * (3*sqrt(3*b*p/8) * p * (1+32*p^2)))

   where:

   X  is the transmit rate in bytes/second.

   s  is the packet size in bytes. If data packets vary in size, then
      the average size is to be used.

   R  is the round trip time in seconds.

   p  is the loss event rate, between 0 and 1.0, of the number of loss
      events as a fraction of the number of packets transmitted.

   t_RTO  is the TCP retransmission timeout value in seconds, generally
      approximated by setting t_RTO = 4*R.

   b  is the number of packets that are acknowledged by a single TCP
      acknowledgement; [RFC3448] recommends the use of b=1 since many
      TCP implementations do not use delayed acknowledgements.
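
   The equation and parameter definitions above translate directly
   into code. The following Python sketch is one possible rendering,
   not a normative implementation; the function name is hypothetical,
   and the defaults (b=1, t_RTO = 4*R) are the approximations mentioned
   in the parameter list.

```python
import math
from typing import Optional


def tcp_throughput_full(s: float, R: float, p: float, b: int = 1,
                        t_rto: Optional[float] = None) -> float:
    """Estimate TCP Reno throughput X in bytes/second (full equation).

    s: packet size in bytes (average size if packets vary).
    R: round trip time in seconds.
    p: loss event rate, 0 < p <= 1.0.
    b: packets acknowledged by a single TCP acknowledgement.
    t_rto: TCP retransmission timeout in seconds; 4*R if not given.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the range (0, 1]")
    if R <= 0.0 or s <= 0.0:
        raise ValueError("R and s must be positive")
    if t_rto is None:
        t_rto = 4.0 * R
    # First denominator term: steady-state congestion avoidance.
    avoidance = R * math.sqrt(2.0 * b * p / 3.0)
    # Second denominator term: effect of retransmission timeouts.
    timeouts = t_rto * (3.0 * math.sqrt(3.0 * b * p / 8.0)
                        * p * (1.0 + 32.0 * p ** 2))
    return s / (avoidance + timeouts)
```

   For example, with s = 1200 bytes, R = 0.1 seconds, and p = 0.01, the
   sketch gives an estimate of roughly 135 kilobytes/second. The
   equation is undefined for p = 0, so the sketch rejects that input
   rather than returning an unbounded rate.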

   This full throughput equation follows the same approach to estimating
   TCP throughput that is used in [RFC3448]. Under conditions of low
   packet loss the second term in the denominator is small, so this
   formula can be approximated with reasonable accuracy as follows
   [Mathis]:

               s
   X = -----------------
       R * sqrt(2*b*p/3)

   It is RECOMMENDED that this simplified throughput equation be used,
   since the reduction in accuracy is small, and it is much simpler to
   calculate than the full equation. Measurements have shown that the
   simplified TCP throughput equation is effective as an RTP circuit
   breaker for multimedia flows sent to hosts on residential networks
   using ADSL and cable modem links [Singh]. The data shows that the
   full TCP throughput equation tends to be more sensitive to packet
   loss and triggers the RTP circuit breaker earlier than the simplified
   equation. Implementations that desire this extra sensitivity MAY use
   the full TCP throughput equation in the RTP circuit breaker. Initial
   measurements in LTE networks have shown that the extra sensitivity is
   helpful in that environment, with the full TCP throughput equation
   giving a more balanced circuit breaker response than the simplified
   TCP equation [Sarker]; other networks might see similar behaviour.

   No matter what TCP throughput equation is chosen, two parameters need
   to be estimated and reported to the sender in order to calculate the
   throughput: the round trip time, R, and the loss event rate, p (the
   packet size, s, is known to the sender). The round trip time can be
   estimated from RTCP SR and RR packets. This is done too infrequently
   for accurate statistics, but is the best that can be done with the
   standard RTCP mechanisms.
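
   To make the recommended calculation concrete, the following Python
   sketch evaluates the simplified equation and applies the "order of
   magnitude" (factor of ten) comparison to a single set of
   measurements. The function names and the `factor` parameter are
   hypothetical, and the sketch covers only one comparison; it does not
   decide when transmission ceases.

```python
import math


def tcp_throughput_simple(s: float, R: float, p: float,
                          b: int = 1) -> float:
    """Estimate TCP throughput in bytes/second (simplified equation).

    s: packet size in bytes, R: round trip time in seconds,
    p: loss rate in (0, 1], b: packets per TCP acknowledgement.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the range (0, 1]")
    if R <= 0.0 or s <= 0.0:
        raise ValueError("R and s must be positive")
    return s / (R * math.sqrt(2.0 * b * p / 3.0))


def exceeds_tcp_bound(sending_rate: float, s: float, R: float,
                      p: float, factor: float = 10.0) -> bool:
    """True if sending_rate (bytes/second) is more than `factor` times
    the throughput a TCP flow is estimated to achieve on this path."""
    return sending_rate > factor * tcp_throughput_simple(s, R, p)
```

   With s = 1200 bytes, R = 0.1 seconds, and p = 0.01, the simplified
   estimate is roughly 147 kilobytes/second, so a flow sending at
   2,000,000 bytes/second would exceed the factor-of-ten bound while
   one sending at 1,000,000 bytes/second would not.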

   Report blocks in RTCP SR or RR packets contain the packet loss
   fraction, rather than the loss event rate, so p cannot be reported
   (TCP typically treats the loss of multiple packets within a single
   RTT as one loss event, but RTCP RR packets report the overall
   fraction of packets lost, and do not report when the packet losses
   occurred). Using the loss fraction in place of the loss event rate
   can overestimate the loss. We believe that this overestimate will
   not be significant, given that we are only interested in order of
   magnitude comparison ([Floyd] section 3.2.1 shows that the difference
   is small for steady-state conditions and random loss, but using the
   loss fraction is more conservative in the case of bursty loss).

   The congestion circuit breaker is therefore: when a sender receives
   an RTCP SR or RR packet that contains a report block for an SSRC it
   is using, that sender has to check the fraction lost field in that
   report block to determine if there is a non-zero packet loss rate.
   If the fraction lost field is zero, then continue sending as normal.
   If the fraction lost is greater than zero, then estimate the TCP
   throughput using the simplified equation above, and the measured R, p
   (approximated by the fraction lost), and s. Compare this with the
   actual sending rate. If the actual sending rate is more than ten
   times the estimated sending rate derived from the TCP throughput
   equation for three consecutive RTCP reporting intervals, the sender
   SHOULD cease transmission (see Section 4.5).

   Systems that usually send at a high data rate, but that can reduce
   their data rate significantly (i.e., by at least a factor of ten),
   MAY first reduce their sending rate to this lower value to see if
   this resolves the congestion, but MUST then cease transmission if the
   problem does not resolve itself within a further two RTCP reporting
   intervals (see Section 4.5).
   An example of this might be a video
   conferencing system that backs off to sending audio only, before
   completely dropping the call. If such a reduction in sending rate
   resolves the congestion problem, the sender MAY gradually increase
   the rate at which it sends data after a reasonable amount of time has
   passed, provided it takes care not to cause the problem to recur
   ("reasonable" is intentionally not defined here).

   The congestion circuit breaker depends on the fraction of RTP data
   packets lost in a reporting interval. If the number of packets sent
   in the reporting interval is too low, this statistic loses meaning,
   and it is possible that a sampling error can give the appearance of
   high packet loss rates. Following the guidelines in [RFC5405], an
   RTP sender that sends not more than one RTP packet per RTT MAY ignore
   a single trigger of the congestion circuit breaker, on the basis that
   the packet loss rate estimate is unreliable with so few samples.
   However, if the congestion circuit breaker triggers again after the
   following three RTCP reporting intervals (i.e., if there have been
   six or more consecutive RTCP reporting intervals where the actual
   sending rate is more than ten times the estimated sending rate
   derived from the TCP throughput equation), then the sender SHOULD
   cease transmission (see Section 4.5).

   The RTCP reporting interval of the media sender does not affect how
   quickly the congestion circuit breaker can trigger. The timing is
   based on the RTCP reporting interval of the receiver that generates
   the SR/RR packets from which the loss rate and RTT estimate are
   derived (note that RTCP requires all participants in a session to
   have similar reporting intervals, else the participant timeout rules
   in [RFC3550] will not work, so this interval is likely similar to
   that of the sender).
If the incoming RTCP SR or RR packets are using a 551 reduced minimum RTCP reporting interval (as specified in Section 6.2 552 of RFC 3550 [RFC3550] or the RTP/AVPF profile [RFC4585]), then that 553 reduced RTCP reporting interval is used when determining if the 554 circuit breaker is triggered. 556 As in Section 4.1 and Section 4.2, we use three reporting intervals 557 to avoid triggering the circuit breaker on transient failures. This 558 circuit breaker is a worst-case condition, and congestion control 559 needs to be performed to keep well within this bound. It is expected 560 that the circuit breaker will only be triggered if the usual 561 congestion control fails for some reason. 563 If there are more media streams than can be reported in a single RTCP 564 SR or RR packet, or if the size of a complete RTCP SR or RR packet 565 exceeds the network MTU, then the receiver will report on a subset of 566 sources in each reporting interval, with the subsets selected round- 567 robin across multiple intervals so that all sources are eventually 568 reported [RFC3550]. When generating such round-robin RTCP reports, 569 priority SHOULD be given to reports on sources that have high packet 570 loss rates, to ensure that senders are aware of network congestion 571 they are causing (this is an update to [RFC3550]). 573 4.4. RTP/AVP Circuit Breaker #4: Media Usability 575 Applications that use RTP are generally tolerant to some amount of 576 packet loss. How much packet loss can be tolerated will depend on 577 the application, media codec, and the amount of error correction and 578 packet loss concealment that is applied. There is an upper bound on 579 the amount of loss that can be corrected, however, beyond which the media 580 becomes unusable. Similarly, many applications have some upper bound 581 on the media capture to play-out latency that can be tolerated before 582 the application becomes unusable.
The latency bound will depend on 583 the application, but typical values can range from the order of a few 584 hundred milliseconds for voice telephony and interactive conferencing 585 applications, up to several seconds for some video-on-demand systems. 587 As a final circuit breaker, RTP senders SHOULD monitor the reported 588 packet loss and delay to estimate whether the media is likely to be 589 suitable for the intended purpose. If the packet loss rate and/or 590 latency is such that the media has become unusable, and has remained 591 unusable for a significant time period, then the application SHOULD 592 cease transmission. Similarly, receivers SHOULD monitor the quality 593 of the media they receive, and if the quality is unusable for a 594 significant time period, they SHOULD terminate the session. This 595 memo intentionally does not define a bound on the packet loss rate or 596 latency that will result in unusable media, nor does it specify what 597 time period is deemed significant, as these are highly application 598 dependent. 600 Sending media that suffers from such high packet loss or latency that 601 it is unusable at the receiver is both wasteful of resources, and of 602 no benefit to the user of the application. It also is highly likely 603 to be congesting the network, and disrupting other applications. As 604 such, the congestion circuit breaker will almost certainly trigger to 605 stop flows where the media would be unusable due to high packet loss 606 or latency. However, in pathological scenarios where the congestion 607 circuit breaker does not stop the flow, it is desirable that the RTP 608 application cease sending useless traffic. The role of the media 609 usability circuit breaker is to protect the network in such cases. 611 4.5. 
Ceasing Transmission 613 What it means to cease transmission depends on the application, but 614 the intention is that the application will stop sending RTP data 615 packets to a particular destination 3-tuple (transport protocol, 616 destination port, IP address), until the user makes an explicit 617 attempt to restart the call. It is important that a human user is 618 involved in the decision to try to restart the call, since that user 619 will eventually give up if the calls repeatedly trigger the circuit 620 breaker. This will help prevent automatic redial systems 621 from congesting the network. Accordingly, RTP flows halted by the 622 circuit breaker SHOULD NOT be restarted automatically unless the 623 sender has received information that the congestion has dissipated. 625 It is recognised that the RTP implementation in some systems might 626 not be able to determine if a call set-up request was initiated by a 627 human user, or automatically by some scripted higher-level component 628 of the system. These implementations SHOULD rate limit attempts to 629 restart a call to the same destination 3-tuple as used by a previous 630 call that was recently halted by the circuit breaker. The chosen 631 rate limit ought not to exceed the rate at which an annoyed human 632 caller might redial a misbehaving phone. 634 5. RTP Circuit Breakers for Systems Using the RTP/AVPF Profile 636 Use of the Extended RTP Profile for RTCP-based Feedback (RTP/AVPF) 637 [RFC4585] allows receivers to send early RTCP reports in some cases, 638 to inform the sender about particular events in the media stream. 639 There are several use cases for such early RTCP reports, including 640 providing rapid feedback to a sender about the onset of congestion. 642 Receiving rapid feedback about congestion events potentially allows 643 congestion control algorithms to be more responsive, and to better 644 adapt the media transmission to the limitations of the network.
It 645 is expected that many RTP congestion control algorithms will adopt 646 the RTP/AVPF profile for this reason, defining new transport layer 647 feedback reports that suit their requirements. Since these reports 648 are not yet defined, and are likely to be very specific to the details of the 649 congestion control algorithm chosen, they cannot be used as part of 650 the generic RTP circuit breaker. 652 Reduced-size RTCP reports sent under the RTP/AVPF early feedback 653 rules that do not contain an RTCP SR or RR packet MUST be ignored by 654 the congestion circuit breaker (they do not contain the information 655 needed by the congestion circuit breaker algorithm), but MUST be 656 counted as received packets for the RTCP timeout circuit breaker. 657 Reduced-size RTCP reports sent under the RTP/AVPF early feedback 658 rules that contain RTCP SR or RR packets MUST be processed by the 659 congestion circuit breaker as if they were sent as regular RTCP 660 reports, and counted towards the circuit breaker conditions specified 661 in Section 4 of this memo. This will potentially make the RTP 662 circuit breaker fire earlier than it would if the RTP/AVPF profile 663 were not used. 665 When using ECN with RTP (see Section 8), early RTCP feedback packets 666 can contain ECN feedback reports. The count of ECN-CE marked packets 667 contained in those ECN feedback reports is counted towards the number 668 of lost packets reported if the ECN feedback report is sent in 669 a compound RTCP packet along with an RTCP SR/RR report packet. 670 Reports of ECN-CE packets sent as reduced-size RTCP ECN feedback 671 packets without an RTCP SR/RR packet MUST be ignored. 673 These rules are intended to allow the use of low-overhead RTP/AVPF 674 feedback for generic NACK messages without triggering the RTP circuit 675 breaker. This is expected to make such feedback suitable for RTP 676 congestion control algorithms that need to quickly report loss events 677 in between regular RTCP reports.
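The rules in this section for early and reduced-size RTCP packets can be sketched in Python as a small classification step. This is an illustrative sketch under stated assumptions: the function name classify_rtcp and the dictionary keys are hypothetical, and the input is assumed to be the list of RTCP packet type values found in one received (compound or reduced-size) RTCP packet. The packet type values 200 (SR) and 201 (RR) are those assigned in [RFC3550]; 205 is the transport layer feedback type from [RFC4585].

```python
RTCP_SR = 200     # Sender Report
RTCP_RR = 201     # Receiver Report
RTCP_RTPFB = 205  # Transport layer feedback (e.g., generic NACK)


def classify_rtcp(packet_types):
    """Decide which circuit breakers a received RTCP packet feeds.

    packet_types: list of RTCP packet type values contained in one
    received RTCP packet (hypothetical input format).
    """
    has_report = any(pt in (RTCP_SR, RTCP_RR) for pt in packet_types)
    return {
        # Any received RTCP packet counts for the RTCP timeout breaker.
        "counts_for_rtcp_timeout": True,
        # Only packets carrying an SR or RR are processed by the
        # congestion circuit breaker.
        "feeds_congestion_breaker": has_report,
    }
```

For example, a reduced-size packet carrying only a generic NACK keeps the RTCP timeout circuit breaker from firing but is not seen by the congestion circuit breaker, whereas the same feedback sent together with an RR is processed like a regular report.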
The reaction to reduced-size RTCP 678 SR/RR packets is to allow such algorithms to send feedback that can 679 trigger the circuit breaker, when desired. 681 6. Impact of RTCP Extended Reports (XR) 683 RTCP Extended Report (XR) blocks provide additional reception quality 684 metrics, but do not change the RTCP timing rules. Some of the RTCP 685 XR blocks provide information that might be useful for congestion 686 control purposes, while others provide non-congestion-related metrics. 687 With the exception of RTCP XR ECN Summary Reports (see Section 8), 688 the presence of RTCP XR blocks in a compound RTCP packet does not 689 affect the RTP circuit breaker algorithm. For consistency and ease 690 of implementation, only the reception report blocks contained in RTCP 691 SR packets, RTCP RR packets, or RTCP XR ECN Summary Report packets, 692 are used by the RTP circuit breaker algorithm. 694 7. Impact of RTCP Reporting Groups 696 An optimisation for grouping RTCP reception statistics and other 697 feedback in RTP sessions with large numbers of participants is given 698 in [I-D.ietf-avtcore-rtp-multi-stream-optimisation]. This allows one 699 SSRC to act as a representative that sends reports on behalf of other 700 SSRCs that are co-located in the same endpoint and see identical 701 reception quality. When running the circuit breaker algorithms, an 702 endpoint MUST treat a reception report from the representative of the 703 reporting group as if a reception report was received from all 704 members of that group. 706 8. Impact of Explicit Congestion Notification (ECN) 708 The use of ECN for RTP flows does not affect the media timeout RTP 709 circuit breaker (Section 4.1) or the RTCP timeout circuit breaker 710 (Section 4.2), since these are both connectivity checks that simply 711 determine whether any packets are being received.
713 ECN-CE marked packets SHOULD be treated as if they were lost for the 714 purposes of congestion control, when determining the optimal media 715 sending rate for an RTP flow. If an RTP sender has negotiated ECN 716 support for an RTP session, and has successfully initiated ECN use on 717 the path to the receiver [RFC6679], then ECN-CE marked packets SHOULD 718 be treated as if they were lost when calculating if the congestion- 719 based RTP circuit breaker (Section 4.3) has been met. The count of 720 ECN-CE marked RTP packets is returned in RTCP XR ECN summary report 721 packets if support for ECN has been initiated for an RTP session. 723 9. Impact of Layered Coding 725 Layered coding is a method of encoding a single media stream into 726 disparate layers, such that a receiver can decode a subset of the 727 layers to vary the quality of the media. Layered coding is often 728 used to aid congestion control in group communication systems, where 729 a different subset of the layers is sent to each receiver, depending 730 on the available network capacity. 732 Media using layered coding can be transported within RTP in several 733 ways: each layer can be sent as a separate RTP session; each layer 734 can be sent using a separate SSRC within a single RTP session; or 735 each layer can be identified by some payload-specific header field, 736 with all layers being sent by a single SSRC within a single RTP 737 session. The choice depends on the features provided by the RTP 738 payload format for the layered encoding, and on the application 739 requirements. 741 The RTP circuit breaker operates on a per-RTP session basis. If a 742 layered encoding is split across multiple RTP sessions, then each 743 session MUST be treated independently for the RTP circuit breaker.
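As an aside on the ECN rule in Section 8, the following Python sketch shows one way a sender might fold ECN-CE marks into the loss estimate p used by the congestion circuit breaker. It is an illustration only. The function name effective_loss_fraction and its inputs are hypothetical: it assumes the sender has already derived, for one reporting interval, the number of packets lost, the number of packets marked ECN-CE, and the number of packets the receiver expected (for instance from differences between successive cumulative counters).

```python
def effective_loss_fraction(lost, ecn_ce, expected):
    """Loss fraction for one reporting interval, treating ECN-CE
    marked packets as if they were lost.

    lost, ecn_ce, expected: per-interval packet counts (assumed inputs).
    """
    if expected <= 0:
        # Nothing was expected in this interval, so no estimate is possible.
        return 0.0
    # Clamp to 1.0 in case the counts are inconsistent.
    return min(1.0, (lost + ecn_ce) / expected)
```

The resulting value would be used in place of the plain fraction lost when estimating the TCP throughput for the congestion circuit breaker.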
745 Within an RTP session, if an application sends a layered media 746 encoding using a single SSRC, with the layers identified using some 747 payload-specific mechanism, then it MUST apply the RTP circuit 748 breaker to that layered flow as a whole, considering RTCP feedback 749 for the SSRC sending the layered flow and applying the RTP circuit 750 breaker as usual. 752 Within an RTP session, if the layered coding is sent using several 753 SSRC values, the flows for those SSRCs 754 MAY be treated together, so that a circuit breaker trigger for any 755 SSRC in the layered media flow causes the entire layered flow to 756 either cease transmission or reduce its sending rate by a factor of 757 ten. The intent of this is to allow a layered flow to reduce its 758 sending rate by dropping higher layers if the circuit breaker triggers, 759 rather than requiring the layer that triggered the RTP circuit 760 breaker to cease transmission (layers are additive in many layered 761 codecs, so forcing a lower layer to cease transmission while allowing 762 higher layers to continue is pointless). 764 10. Security Considerations 766 The security considerations of [RFC3550] apply. 768 If the RTP/AVPF profile is used to provide rapid RTCP feedback, the 769 security considerations of [RFC4585] apply. If ECN feedback for RTP 770 over UDP/IP is used, the security considerations of [RFC6679] apply. 772 If non-authenticated RTCP reports are used, an on-path attacker can 773 trivially generate fake RTCP packets that indicate high packet loss 774 rates, causing the circuit breaker to trigger and disrupting an RTP 775 session. This is somewhat more difficult for an off-path attacker, 776 due to the need to guess the randomly chosen RTP SSRC value and the 777 RTP sequence number. This attack can be avoided if RTCP packets are 778 authenticated; authentication options are discussed in [RFC7201].
780 Timely operation of the RTP circuit breaker depends on the choice of 781 RTCP reporting interval. If the receiver has a reporting interval 782 that is overly long, then the responsiveness of the circuit breaker 783 decreases. In the limit, the RTP circuit breaker can be disabled for 784 all practical purposes by configuring an RTCP reporting interval that 785 is many minutes in duration. This issue is not specific to the circuit 786 breaker: long RTCP reporting intervals also prevent reception quality 787 reports, feedback messages, codec control messages, etc., from being 788 used. Implementations SHOULD impose an upper limit on the RTCP 789 reporting interval they are willing to negotiate (based on the 790 session bandwidth and RTCP bandwidth fraction) when using the RTP 791 circuit breaker. An upper limit on the reporting interval on the 792 order of 10 seconds is a reasonable bound. 794 11. IANA Considerations 796 There are no actions for IANA. 798 12. Open Issues 800 o Should the number of RTCP reporting intervals needed to trigger 801 the media timeout and congestion circuit breakers scale with the 802 duration of the RTCP reporting interval, so the circuit breaker 803 triggers after a fixed duration, rather than after a fixed number 804 of reporting intervals? 806 13. Acknowledgements 808 The authors would like to thank Bernard Aboba, Harald Alvestrand, 809 Gorry Fairhurst, Kevin Gross, Cullen Jennings, Randell Jesup, 810 Jonathan Lennox, Matt Mathis, Stephen McQuistin, Eric Rescorla, 811 Abheek Saha, and Fabio Verdicchio, for their valuable feedback. 813 14. References 815 14.1. Normative References 817 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 818 Requirement Levels", BCP 14, RFC 2119, March 1997. 820 [RFC3448] Handley, M., Floyd, S., Padhye, J., and J. Widmer, "TCP 821 Friendly Rate Control (TFRC): Protocol Specification", RFC 822 3448, January 2003. 824 [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
825 Jacobson, "RTP: A Transport Protocol for Real-Time 826 Applications", STD 64, RFC 3550, July 2003. 828 [RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and 829 Video Conferences with Minimal Control", STD 65, RFC 3551, 830 July 2003. 832 [RFC3611] Friedman, T., Caceres, R., and A. Clark, "RTP Control 833 Protocol Extended Reports (RTCP XR)", RFC 3611, November 834 2003. 836 [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey, 837 "Extended RTP Profile for Real-time Transport Control 838 Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July 839 2006. 841 14.2. Informative References 843 [Floyd] Floyd, S., Handley, M., Padhye, J., and J. Widmer, 844 "Equation-Based Congestion Control for Unicast 845 Applications", Proceedings of the ACM SIGCOMM conference, 846 2000, DOI 10.1145/347059.347397, August 2000. 848 [I-D.ietf-avtcore-rtp-multi-stream-optimisation] 849 Lennox, J., Westerlund, M., Wu, W., and C. Perkins, 850 "Sending Multiple Media Streams in a Single RTP Session: 851 Grouping RTCP Reception Statistics and Other Feedback", 852 draft-ietf-avtcore-rtp-multi-stream-optimisation-04 (work 853 in progress), August 2014. 855 [Mathis] Mathis, M., Semke, J., Mahdavi, J., and T. Ott, "The 856 macroscopic behavior of the TCP congestion avoidance 857 algorithm", ACM SIGCOMM Computer Communication Review 858 27(3), DOI 10.1145/263932.264023, July 1997. 860 [Padhye] Padhye, J., Firoiu, V., Towsley, D., and J. Kurose, 861 "Modeling TCP Throughput: A Simple Model and its Empirical 862 Validation", Proceedings of the ACM SIGCOMM conference, 863 1998, DOI 10.1145/285237.285291, August 1998. 865 [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition 866 of Explicit Congestion Notification (ECN) to IP", RFC 867 3168, September 2001. 869 [RFC5104] Wenger, S., Chandra, U., Westerlund, M., and B. Burman, 870 "Codec Control Messages in the RTP Audio-Visual Profile 871 with Feedback (AVPF)", RFC 5104, February 2008. 
873 [RFC5124] Ott, J. and E. Carrara, "Extended Secure RTP Profile for 874 Real-time Transport Control Protocol (RTCP)-Based Feedback 875 (RTP/SAVPF)", RFC 5124, February 2008. 877 [RFC5405] Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines 878 for Application Designers", BCP 145, RFC 5405, November 879 2008. 881 [RFC5450] Singer, D. and H. Desineni, "Transmission Time Offsets in 882 RTP Streams", RFC 5450, March 2009. 884 [RFC5506] Johansson, I. and M. Westerlund, "Support for Reduced-Size 885 Real-Time Transport Control Protocol (RTCP): Opportunities 886 and Consequences", RFC 5506, April 2009. 888 [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion 889 Control", RFC 5681, September 2009. 891 [RFC6051] Perkins, C. and T. Schierl, "Rapid Synchronisation of RTP 892 Flows", RFC 6051, November 2010. 894 [RFC6679] Westerlund, M., Johansson, I., Perkins, C., O'Hanlon, P., 895 and K. Carlberg, "Explicit Congestion Notification (ECN) 896 for RTP over UDP", RFC 6679, August 2012. 898 [RFC6798] Clark, A. and Q. Wu, "RTP Control Protocol (RTCP) Extended 899 Report (XR) Block for Packet Delay Variation Metric 900 Reporting", RFC 6798, November 2012. 902 [RFC6843] Clark, A., Gross, K., and Q. Wu, "RTP Control Protocol 903 (RTCP) Extended Report (XR) Block for Delay Metric 904 Reporting", RFC 6843, January 2013. 906 [RFC6958] Clark, A., Zhang, S., Zhao, J., and Q. Wu, "RTP Control 907 Protocol (RTCP) Extended Report (XR) Block for Burst/Gap 908 Loss Metric Reporting", RFC 6958, May 2013. 910 [RFC7002] Clark, A., Zorn, G., and Q. Wu, "RTP Control Protocol 911 (RTCP) Extended Report (XR) Block for Discard Count Metric 912 Reporting", RFC 7002, September 2013. 914 [RFC7003] Clark, A., Huang, R., and Q. Wu, "RTP Control Protocol 915 (RTCP) Extended Report (XR) Block for Burst/Gap Discard 916 Metric Reporting", RFC 7003, September 2013. 918 [RFC7097] Ott, J., Singh, V., and I. 
Curcio, "RTP Control Protocol 919 (RTCP) Extended Report (XR) for RLE of Discarded Packets", 920 RFC 7097, January 2014. 922 [RFC7201] Westerlund, M. and C. Perkins, "Options for Securing RTP 923 Sessions", RFC 7201, April 2014. 925 [Sarker] Sarker, Z., Singh, V., and C.S. Perkins, "An Evaluation of 926 RTP Circuit Breaker Performance on LTE Networks", 927 Proceedings of the IEEE Infocom workshop on Communication 928 and Networking Techniques for Contemporary Video, 2014, 929 April 2014. 931 [Singh] Singh, V., McQuistin, S., Ellis, M., and C.S. Perkins, 932 "Circuit Breakers for Multimedia Congestion Control", 933 Proceedings of the International Packet Video Workshop, 934 2013, DOI 10.1109/PV.2013.6691439, December 2013. 936 Authors' Addresses 938 Colin Perkins 939 University of Glasgow 940 School of Computing Science 941 Glasgow G12 8QQ 942 United Kingdom 944 Email: csp@csperkins.org 945 Varun Singh 946 Aalto University 947 School of Electrical Engineering 948 Otakaari 5 A 949 Espoo, FIN 02150 950 Finland 952 Email: varun@comnet.tkk.fi 953 URI: http://www.netlab.tkk.fi/~varun/