RTCWEB Working Group                                          C. Perkins
Internet-Draft                                     University of Glasgow
Intended status: Standards Track                           M. Westerlund
Expires: November 29, 2014                                      Ericsson
                                                                  J. Ott
                                                        Aalto University
                                                            May 28, 2014

  Web Real-Time Communication (WebRTC): Media Transport and Use of RTP
                     draft-ietf-rtcweb-rtp-usage-15

Abstract

   The Web Real-Time Communication (WebRTC) framework provides support
   for direct interactive rich communication using audio, video, text,
   collaboration, games, etc. between two peers' web-browsers. This
   memo describes the media transport aspects of the WebRTC framework.
   It specifies how the Real-time Transport Protocol (RTP) is used in
   the WebRTC context, and gives requirements for which RTP features,
   profiles, and extensions need to be supported.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 29, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Rationale
   3. Terminology
   4. WebRTC Use of RTP: Core Protocols
      4.1. RTP and RTCP
      4.2. Choice of the RTP Profile
      4.3. Choice of RTP Payload Formats
      4.4. Use of RTP Sessions
      4.5. RTP and RTCP Multiplexing
      4.6. Reduced Size RTCP
      4.7. Symmetric RTP/RTCP
      4.8. Choice of RTP Synchronisation Source (SSRC)
      4.9. Generation of the RTCP Canonical Name (CNAME)
      4.10. Handling of Leap Seconds
   5. WebRTC Use of RTP: Extensions
      5.1. Conferencing Extensions and Topologies
         5.1.1. Full Intra Request (FIR)
         5.1.2. Picture Loss Indication (PLI)
         5.1.3. Slice Loss Indication (SLI)
         5.1.4. Reference Picture Selection Indication (RPSI)
         5.1.5. Temporal-Spatial Trade-off Request (TSTR)
         5.1.6. Temporary Maximum Media Stream Bit Rate Request (TMMBR)
      5.2. Header Extensions
         5.2.1. Rapid Synchronisation
         5.2.2. Client-to-Mixer Audio Level
         5.2.3. Mixer-to-Client Audio Level
   6. WebRTC Use of RTP: Improving Transport Robustness
      6.1. Negative Acknowledgements and RTP Retransmission
      6.2. Forward Error Correction (FEC)
   7. WebRTC Use of RTP: Rate Control and Media Adaptation
      7.1. Boundary Conditions and Circuit Breakers
      7.2. Congestion Control Interoperability and Legacy Systems
   8. WebRTC Use of RTP: Performance Monitoring
   9. WebRTC Use of RTP: Future Extensions
   10. Signalling Considerations
   11. WebRTC API Considerations
   12. RTP Implementation Considerations
      12.1. Configuration and Use of RTP Sessions
         12.1.1. Use of Multiple Media Sources Within an RTP Session
         12.1.2. Use of Multiple RTP Sessions
         12.1.3. Differentiated Treatment of RTP Packet Streams
      12.2. Media Source, RTP Packet Streams, and Participant
            Identification
         12.2.1. Media Source Identification
         12.2.2. SSRC Collision Detection
         12.2.3. Media Synchronisation Context
   13. Security Considerations
   14. IANA Considerations
   15. Acknowledgements
   16. References
      16.1. Normative References
      16.2. Informative References
   Authors' Addresses

1. Introduction

   The Real-time Transport Protocol (RTP) [RFC3550] provides a framework
   for delivery of audio and video teleconferencing data and other
   real-time media applications. Previous work has defined the RTP
   protocol, along with numerous profiles, payload formats, and other
   extensions. When combined with appropriate signalling, these form the
   basis for many teleconferencing systems.

   The Web Real-Time Communication (WebRTC) framework provides the
   protocol building blocks to support direct, interactive, real-time
   communication using audio, video, collaboration, games, etc., between
   two peers' web-browsers. This memo describes how the RTP framework
   is to be used in the WebRTC context.
   It proposes a baseline set of RTP features that are to be implemented
   by all WebRTC-aware end-points, along with suggested extensions for
   enhanced functionality.

   This memo specifies a protocol intended for use within the WebRTC
   framework, but is not restricted to that context. An overview of the
   WebRTC framework is given in [I-D.ietf-rtcweb-overview].

   The structure of this memo is as follows. Section 2 outlines our
   rationale in preparing this memo and choosing these RTP features.
   Section 3 defines terminology. Requirements for core RTP protocols
   are described in Section 4 and suggested RTP extensions are described
   in Section 5. Section 6 outlines mechanisms that can increase
   robustness to network problems, while Section 7 describes congestion
   control and rate adaptation mechanisms. The discussion of mandated
   RTP mechanisms concludes in Section 8 with a review of performance
   monitoring and network management tools that can be used in the
   WebRTC context. Section 9 gives some guidelines for future
   incorporation of other RTP and RTP Control Protocol (RTCP) extensions
   into this framework. Section 10 describes requirements placed on the
   signalling channel. Section 11 discusses the relationship between
   features of the RTP framework and the WebRTC application programming
   interface (API), and Section 12 discusses RTP implementation
   considerations. The memo concludes with security considerations
   (Section 13) and IANA considerations (Section 14).

2. Rationale

   The RTP framework comprises the RTP data transfer protocol, the RTP
   control protocol, and numerous RTP payload formats, profiles, and
   extensions. This range of add-ons has allowed RTP to meet various
   needs that were not envisaged by the original protocol designers, and
   to support many new media encodings, but raises the question of what
   extensions are to be supported by new implementations.
   The development of the WebRTC framework provides an opportunity to
   review the available RTP features and extensions, and to define a
   common baseline feature set for all WebRTC implementations of RTP.
   This builds on the past 20 years of development of RTP to mandate the
   use of extensions that have shown widespread utility, while still
   remaining compatible with the wide installed base of RTP
   implementations where possible.

   RTP and RTCP extensions that are not discussed in this document can
   be implemented by WebRTC end-points if they are beneficial for new
   use cases. However, they are not necessary to address the WebRTC use
   cases and requirements identified in
   [I-D.ietf-rtcweb-use-cases-and-requirements].

   While the baseline set of RTP features and extensions defined in this
   memo is targeted at the requirements of the WebRTC framework, it is
   expected to be broadly useful for other conferencing-related uses of
   RTP. In particular, it is likely that this set of RTP features and
   extensions will be appropriate for other desktop or mobile video
   conferencing systems, or for room-based high-quality telepresence
   applications.

3. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119]. The RFC
   2119 interpretation of these key words applies only when written in
   ALL CAPS. Lower- or mixed-case uses of these key words are not to be
   interpreted as carrying special significance in this memo.

   We define the following additional terms:

   WebRTC MediaStream: The MediaStream concept defined by the W3C in
      the WebRTC API [W3C.WD-mediacapture-streams-20130903].

   Transport-layer Flow: A uni-directional flow of transport packets
      that are identified by having a particular 5-tuple of source IP
      address, source port, destination IP address, destination port,
      and transport protocol used.

   Bi-directional Transport-layer Flow: A bi-directional transport-layer
      flow is a transport-layer flow that is symmetric. That is, the
      transport-layer flow in the reverse direction has a 5-tuple where
      the source and destination address and ports are swapped compared
      to the forward path transport-layer flow, and the transport
      protocol is the same.

   This document uses the terminology from
   [I-D.ietf-avtext-rtp-grouping-taxonomy]. Other terms are used
   according to their definitions from the RTP Specification [RFC3550].
   Especially note the following frequently used terms: RTP Packet
   Stream, RTP Session, and End-point.

4. WebRTC Use of RTP: Core Protocols

   The following sections describe the core features of RTP and RTCP
   that need to be implemented, along with the mandated RTP profiles.
   Also described are the core extensions providing essential features
   that all WebRTC implementations need to implement to function
   effectively on today's networks.

4.1. RTP and RTCP

   The Real-time Transport Protocol (RTP) [RFC3550] is REQUIRED to be
   implemented as the media transport protocol for WebRTC. RTP itself
   comprises two parts: the RTP data transfer protocol, and the RTP
   control protocol (RTCP). RTCP is a fundamental and integral part of
   RTP, and MUST be implemented in all WebRTC applications.
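
   As an informal illustration of the RTP data transfer protocol (not
   part of this specification), the following Python sketch parses the
   12-octet fixed RTP header and the CSRC list laid out in Section 5.1
   of [RFC3550]. The names RtpHeader and parse_rtp_header are
   hypothetical, chosen only for this example; header extensions,
   padding, and SRTP processing are deliberately left out.

```python
import struct
from dataclasses import dataclass
from typing import List


@dataclass
class RtpHeader:
    """Fields of the RTP fixed header plus the optional CSRC list."""
    version: int
    padding: bool
    extension: bool
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrcs: List[int]


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the fixed header and CSRC list of one (unprotected) RTP packet."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the 12-octet fixed header")
    # Network byte order: two flag octets, 16-bit sequence number,
    # 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    version = b0 >> 6
    if version != 2:
        raise ValueError("unsupported RTP version %d" % version)
    csrc_count = b0 & 0x0F
    end = 12 + 4 * csrc_count
    if len(packet) < end:
        raise ValueError("packet truncated inside the CSRC list")
    csrcs = list(struct.unpack("!%dI" % csrc_count, packet[12:end]))
    return RtpHeader(
        version=version,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
        csrcs=csrcs,
    )
```

   A receiver would typically use the ssrc field to look up per-source
   state and the csrcs field to learn which sources a mixer combined.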

   The following RTP and RTCP features are sometimes omitted in limited
   functionality implementations of RTP, but are REQUIRED in all WebRTC
   implementations:

   o  Support for use of multiple simultaneous SSRC values in a single
      RTP session, including support for RTP end-points that send many
      SSRC values simultaneously, following [RFC3550] and
      [I-D.ietf-avtcore-rtp-multi-stream]. The RTCP optimisations for
      multi-SSRC sessions defined in
      [I-D.ietf-avtcore-rtp-multi-stream-optimisation] MAY be supported;
      if supported, the usage MUST be signalled.

   o  Random choice of SSRC on joining a session; collision detection
      and resolution for SSRC values (see also Section 4.8).

   o  Support for reception of RTP data packets containing CSRC lists,
      as generated by RTP mixers, and RTCP packets relating to CSRCs.

   o  Sending correct synchronisation information in the RTCP Sender
      Reports, to allow receivers to implement lip-synchronisation; see
      Section 5.2.1 regarding support for the rapid RTP synchronisation
      extensions.

   o  Support for multiple synchronisation contexts. Participants that
      send multiple simultaneous RTP packet streams SHOULD do so as part
      of a single synchronisation context, using a single RTCP CNAME for
      all streams and allowing receivers to play the streams out in a
      synchronised manner. For compatibility with potential future
      versions of this specification, or for interoperability with
      non-WebRTC devices through a gateway, receivers MUST support
      multiple synchronisation contexts, indicated by the use of
      multiple RTCP CNAMEs in an RTP session. This specification
      requires the usage of a single CNAME when sending RTP Packet
      Streams in some circumstances; see Section 4.9.

   o  Support for sending and receiving RTCP SR, RR, SDES, and BYE
      packet types, with OPTIONAL support for other RTCP packet types
      unless mandated by other parts of this specification.
      Note that additional RTCP Packet types are used by the RTP/SAVPF
      Profile (Section 4.2) and the other RTCP extensions (Section 5).

   o  Support for multiple end-points in a single RTP session, and for
      scaling the RTCP transmission interval according to the number of
      participants in the session; support for randomised RTCP
      transmission intervals to avoid synchronisation of RTCP reports;
      support for RTCP timer reconsideration (Section 6.3.6 of
      [RFC3550]) and reverse reconsideration (Section 6.3.4 of
      [RFC3550]).

   o  Support for configuring the RTCP bandwidth as a fraction of the
      media bandwidth, and for configuring the fraction of the RTCP
      bandwidth allocated to senders, e.g., using the SDP "b=" line
      [RFC4566][RFC3556].

   o  Support for the reduced minimum RTCP reporting interval described
      in Section 6.2 of [RFC3550] is REQUIRED. When using the reduced
      minimum RTCP reporting interval, the fixed (non-reduced) minimum
      interval MUST be used when calculating the participant timeout
      interval (see Sections 6.2 and 6.3.5 of [RFC3550]). The delay
      before sending the initial compound RTCP packet can be set to zero
      (see Section 6.2 of [RFC3550] as updated by
      [I-D.ietf-avtcore-rtp-multi-stream]).

   o  Ignore unknown RTCP packet types and RTP header extensions. This
      is to ensure robust handling of future extensions, middlebox
      behaviours, etc., that can result in unsignalled RTCP packet
      types or RTP header extensions being received. If a compound RTCP
      packet is received that contains a mixture of known and unknown
      RTCP packet types, the known packet types need to be processed as
      usual, with only the unknown packet types being discarded.

   It is known that a significant number of legacy RTP implementations,
   especially those targeted at VoIP-only systems, do not support all of
   the above features, and in some cases do not support RTCP at all.
   Implementers are advised to consider the requirements for graceful
   degradation when interoperating with legacy implementations.

   Other implementation considerations are discussed in Section 12.

4.2. Choice of the RTP Profile

   The complete specification of RTP for a particular application domain
   requires the choice of an RTP Profile. For WebRTC use, the Extended
   Secure RTP Profile for RTCP-Based Feedback (RTP/SAVPF) [RFC5124], as
   extended by [RFC7007], MUST be implemented. The RTP/SAVPF profile is
   the combination of the basic RTP/AVP profile [RFC3551], the RTP
   profile for RTCP-based feedback (RTP/AVPF) [RFC4585], and the secure
   RTP profile (RTP/SAVP) [RFC3711].

   The RTCP-based feedback extensions [RFC4585] are needed for the
   improved RTCP timer model. This allows more flexible transmission of
   RTCP packets in response to events, rather than strictly according to
   bandwidth, and is vital for being able to report congestion signals
   as well as media events. These extensions also allow saving RTCP
   bandwidth, and an end-point will commonly only use the full RTCP
   bandwidth allocation if there are many events that require feedback.
   The timer rules are also needed to make use of the RTP conferencing
   extensions discussed in Section 5.1.

      Note: The enhanced RTCP timer model defined in the RTP/AVPF
      profile is backwards compatible with legacy systems that implement
      only the RTP/AVP or RTP/SAVP profile, given some constraints on
      parameter configuration such as the RTCP bandwidth value and
      "trr-int" (the most important factor for interworking with
      RTP/(S)AVP end-points via a gateway is to set the trr-int
      parameter to a value representing 4 seconds, see Section 6.1 in
      [I-D.ietf-avtcore-rtp-multi-stream]).
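
   As a hedged illustration of the note above, the following Python
   sketch formats a "trr-int" value as an SDP "a=rtcp-fb" attribute. It
   assumes the attribute syntax of Section 4.2 of [RFC4585], in which
   the value is expressed in milliseconds and "*" applies it to all
   payload types. The function name trr_int_attribute and its parameters
   are hypothetical and are not defined by this memo.

```python
def trr_int_attribute(seconds: float, payload_type: str = "*") -> str:
    """Return an SDP rtcp-fb line carrying trr-int in milliseconds.

    Assumption: the caller wants a single attribute line; "*" is used as
    the wildcard payload type unless a specific one is given.
    """
    if seconds < 0:
        raise ValueError("trr-int cannot be negative")
    milliseconds = round(seconds * 1000)
    return f"a=rtcp-fb:{payload_type} trr-int {milliseconds}"


# 4 seconds, the value suggested for interworking with RTP/(S)AVP
# end-points via a gateway.
legacy_interworking_line = trr_int_attribute(4)
```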

   The secure RTP (SRTP) profile extensions [RFC3711] are needed to
   provide media encryption, integrity protection, replay protection and
   a limited form of source authentication. WebRTC implementations MUST
   NOT send packets using the basic RTP/AVP profile or the RTP/AVPF
   profile; they MUST employ the full RTP/SAVPF profile to protect all
   RTP and RTCP packets that are generated (i.e., implementations MUST
   use SRTP and SRTCP). The RTP/SAVPF profile MUST be configured using
   the cipher suites, DTLS-SRTP protection profiles, keying mechanisms,
   and other parameters described in [I-D.ietf-rtcweb-security-arch].

4.3. Choice of RTP Payload Formats

   The set of mandatory to implement codecs and RTP payload formats for
   WebRTC is not specified in this memo; instead, it is defined in
   separate specifications, such as [I-D.ietf-rtcweb-audio].
   Implementations can support any codec for which an RTP payload format
   and associated signalling is defined. Implementations cannot assume
   that the other participants in an RTP session understand any RTP
   payload format, no matter how common; the mapping between RTP payload
   type numbers and specific configurations of particular RTP payload
   formats MUST be agreed before those payload types/formats can be
   used. In an SDP context, this can be done using the "a=rtpmap:" and
   "a=fmtp:" attributes associated with an "m=" line, along with any
   other SDP attributes needed to configure the RTP payload format.

   End-points can signal support for multiple RTP payload formats, or
   multiple configurations of a single RTP payload format, as long as
   each unique RTP payload format configuration uses a different RTP
   payload type number. As outlined in Section 4.8, the RTP payload
   type number is sometimes used to associate an RTP packet stream with
   a signalling context.
   This association is possible provided unique RTP payload type numbers
   are used in each context. For example, an RTP packet stream can be
   associated with an SDP "m=" line by comparing the RTP payload type
   numbers used by the RTP packet stream with payload types signalled in
   the "a=rtpmap:" lines in the media sections of the SDP. This leads
   to the following considerations:

      If RTP packet streams are being associated with signalling
      contexts based on the RTP payload type, then the assignment of RTP
      payload type numbers MUST be unique across signalling contexts.

      If the same RTP payload format configuration is used in multiple
      contexts, then a different RTP payload type number has to be
      assigned in each context to ensure uniqueness.

      If the RTP payload type number is not being used to associate RTP
      packet streams with a signalling context, then the same RTP
      payload type number can be used to indicate the exact same RTP
      payload format configuration in multiple contexts.

   A single RTP payload type number MUST NOT be assigned to different
   RTP payload formats, or different configurations of the same RTP
   payload format, within a single RTP session (note that the "m=" lines
   in an SDP bundle group [I-D.ietf-mmusic-sdp-bundle-negotiation] form
   a single RTP session).

   An end-point that has signalled support for multiple RTP payload
   formats MUST be able to accept data in any of those payload formats
   at any time, unless it has previously signalled limitations on its
   decoding capability. This requirement is constrained if several
   types of media (e.g., audio and video) are sent in the same RTP
   session. In such a case, a source (SSRC) is restricted to switching
   only between the RTP payload formats signalled for the type of media
   that is being sent by that source; see Section 4.4.
   To support rapid rate adaptation by changing codec, RTP does not
   require advance signalling for changes between RTP payload formats
   used by a single SSRC that were signalled during session set-up.

   If performing changes between two RTP payload types that use
   different RTP clock rates, an RTP sender MUST follow the
   recommendations in Section 4.1 of [RFC7160]. RTP receivers MUST
   follow the recommendations in Section 4.3 of [RFC7160] in order to
   support sources that switch between clock rates in an RTP session
   (these recommendations for receivers are backwards compatible with
   the case where senders use only a single clock rate).

4.4. Use of RTP Sessions

   An association amongst a set of end-points communicating using RTP is
   known as an RTP session [RFC3550]. An end-point can be involved in
   several RTP sessions at the same time. In a multimedia session, each
   type of media has typically been carried in a separate RTP session
   (e.g., using one RTP session for the audio, and a separate RTP
   session using a different transport-layer flow for the video).
   WebRTC implementations of RTP are REQUIRED to implement support for
   multimedia sessions in this way, separating each session using
   different transport-layer flows for compatibility with legacy
   systems.

   In modern day networks, however, with the widespread use of network
   address/port translators (NAT/NAPT) and firewalls, it is desirable to
   reduce the number of transport-layer flows used by RTP applications.
   This can be done by sending all the RTP packet streams in a single
   RTP session, which will comprise a single transport-layer flow (this
   will prevent the use of some quality-of-service mechanisms, as
   discussed in Section 12.1.3).
   Implementations are therefore also REQUIRED to support transport of
   all RTP packet streams, independent of media type, in a single RTP
   session using a single transport layer flow, according to
   [I-D.ietf-avtcore-multi-media-rtp-session]. If multiple types of
   media are to be used in a single RTP session, all participants in
   that RTP session MUST agree to this usage. In an SDP context,
   [I-D.ietf-mmusic-sdp-bundle-negotiation] can be used to signal such a
   bundle of RTP packet streams forming a single RTP session.

   Further discussion about the suitability of different RTP session
   structures and multiplexing methods to different scenarios can be
   found in [I-D.ietf-avtcore-multiplex-guidelines].

4.5. RTP and RTCP Multiplexing

   Historically, RTP and RTCP have been run on separate transport layer
   flows (e.g., two UDP ports for each RTP session, one port for RTP and
   one port for RTCP). With the increased use of Network Address/Port
   Translation (NAT/NAPT) this has become problematic, since maintaining
   multiple NAT bindings can be costly. It also complicates firewall
   administration, since multiple ports need to be opened to allow RTP
   traffic. To reduce these costs and session set-up times,
   implementations are REQUIRED to support multiplexing RTP data packets
   and RTCP control packets on a single transport-layer flow [RFC5761].
   Such RTP and RTCP multiplexing MUST be negotiated in the signalling
   channel before it is used. If SDP is used for signalling, this
   negotiation MUST use the attributes defined in [RFC5761]. For
   backwards compatibility, implementations are also REQUIRED to support
   RTP and RTCP sent on separate transport-layer flows.

   Note that the use of RTP and RTCP multiplexed onto a single
   transport-layer flow ensures that there is occasional traffic sent on
   that port, even if there is no active media traffic.
   This can be useful to keep NAT bindings alive [RFC6263].

4.6. Reduced Size RTCP

   RTCP packets are usually sent as compound RTCP packets, and [RFC3550]
   requires that those compound packets start with a Sender Report (SR)
   or Receiver Report (RR) packet. When using frequent RTCP feedback
   messages under the RTP/AVPF Profile [RFC4585] these statistics are
   not needed in every packet, and unnecessarily increase the mean RTCP
   packet size. This can limit the frequency at which RTCP packets can
   be sent within the RTCP bandwidth share.

   To avoid this problem, [RFC5506] specifies how to reduce the mean
   RTCP message size and allow for more frequent feedback. Frequent
   feedback, in turn, is essential to make real-time applications
   quickly aware of changing network conditions, and to allow them to
   adapt their transmission and encoding behaviour. Implementations
   MUST support sending and receiving non-compound RTCP feedback packets
   [RFC5506]. Use of non-compound RTCP packets MUST be negotiated using
   the signalling channel. If SDP is used for signalling, this
   negotiation MUST use the attributes defined in [RFC5506]. For
   backwards compatibility, implementations are also REQUIRED to support
   the use of compound RTCP feedback packets if the remote end-point
   does not agree to the use of non-compound RTCP in the signalling
   exchange.

4.7. Symmetric RTP/RTCP

   To ease traversal of NAT and firewall devices, implementations are
   REQUIRED to implement and use Symmetric RTP [RFC4961]. The reason
   for using symmetric RTP is primarily to avoid issues with NATs and
   Firewalls by ensuring that the send and receive RTP packet streams,
   as well as RTCP, are actually bi-directional transport-layer flows.
   This will keep alive the NAT and firewall pinholes, and help indicate
   consent that the receive direction is a transport-layer flow the
   intended recipient actually wants.
   In addition, it saves resources, specifically ports at the
   end-points, but also in the network as NAT mappings or firewall state
   is not unnecessarily bloated. The amount of per flow QoS state kept
   in the network is also reduced.

4.8. Choice of RTP Synchronisation Source (SSRC)

   Implementations are REQUIRED to support signalled RTP synchronisation
   source (SSRC) identifiers. If SDP is used, this MUST be done using
   the "a=ssrc:" SDP attribute defined in Section 4.1 and Section 5 of
   [RFC5576] and the "previous-ssrc" source attribute defined in
   Section 6.2 of [RFC5576]; other per-SSRC attributes defined in
   [RFC5576] MAY be supported.

   While support for signalled SSRC identifiers is mandated, their use
   in an RTP session is OPTIONAL. Implementations MUST be prepared to
   accept RTP and RTCP packets using SSRCs that have not been explicitly
   signalled ahead of time. Implementations MUST support random SSRC
   assignment, and MUST support SSRC collision detection and resolution,
   according to [RFC3550]. When using signalled SSRC values, collision
   detection MUST be performed as described in Section 5 of [RFC5576].

   It is often desirable to associate an RTP packet stream with a
   non-RTP context. For users of the WebRTC API a mapping between SSRCs
   and MediaStreamTracks is provided per Section 11. For gateways or
   other usages it is possible to associate an RTP packet stream with an
   "m=" line in a session description formatted using SDP. If SSRCs are
   signalled this is straightforward (in SDP the "a=ssrc:" line will be
   at the media level, allowing a direct association with an "m=" line).
530 If SSRCs are not signalled, the RTP payload type numbers used in an 531 RTP packet stream are often sufficient to associate that packet 532 stream with a signalling context (e.g., if RTP payload type numbers 533 are assigned as described in Section 4.3 of this memo, the RTP 534 payload types used by an RTP packet stream can be compared with 535 values in SDP "a=rtpmap:" lines, which are at the media level in SDP, 536 and so map to an "m=" line). 538 4.9. Generation of the RTCP Canonical Name (CNAME) 540 The RTCP Canonical Name (CNAME) provides a persistent transport-level 541 identifier for an RTP end-point. While the Synchronisation Source 542 (SSRC) identifier for an RTP end-point can change if a collision is 543 detected, or when the RTP application is restarted, its RTCP CNAME is 544 meant to stay unchanged for the duration of an RTCPeerConnection 545 [W3C.WD-webrtc-20130910], so that RTP end-points can be uniquely 546 identified and associated with their RTP packet streams within a set 547 of related RTP sessions. 549 Each RTP end-point MUST have at least one RTCP CNAME, and that RTCP 550 CNAME MUST be unique within the RTCPeerConnection. RTCP CNAMEs 551 identify a particular synchronisation context, i.e., all SSRCs 552 associated with a single RTCP CNAME share a common reference clock. 553 If an end-point has SSRCs that are associated with several 554 unsynchronised reference clocks, and hence different synchronisation 555 contexts, it will need to use multiple RTCP CNAMEs, one for each 556 synchronisation context. 558 Taking the discussion in Section 11 into account, a WebRTC end-point 559 MUST NOT use more than one RTCP CNAME in the RTP sessions belonging 560 to a single RTCPeerConnection (that is, an RTCPeerConnection forms a 561 synchronisation context). 
RTP middleboxes MAY generate RTP packet 562 streams associated with more than one RTCP CNAME, to allow them to 563 avoid having to resynchronize media from multiple different end- 564 points that are part of a multi-party RTP session. 566 The RTP specification [RFC3550] includes guidelines for choosing a 567 unique RTCP CNAME, but these are not sufficient in the presence of NAT 568 devices. In addition, long-term persistent identifiers can be 569 problematic from a privacy viewpoint (Section 13). Accordingly, a 570 WebRTC endpoint MUST generate a new, unique, short-term persistent 571 RTCP CNAME for each RTCPeerConnection, following [RFC7022], with a 572 single exception: if explicitly requested at creation, an 573 RTCPeerConnection MAY use the same CNAME as an existing 574 RTCPeerConnection within their common same-origin context. 576 A WebRTC end-point MUST support reception of any CNAME that matches 577 the syntax limitations specified by the RTP specification [RFC3550] 578 and cannot assume that any CNAME will be chosen according to the form 579 suggested above. 581 4.10. Handling of Leap Seconds 583 The guidelines regarding handling of leap seconds to limit their 584 impact on RTP media play-out and synchronization given in [RFC7164] 585 SHOULD be followed. 587 5. WebRTC Use of RTP: Extensions 589 There are a number of RTP extensions that are either needed to obtain 590 full functionality, or extremely useful to improve on the baseline 591 performance, in the WebRTC application context. One set of these 592 extensions is related to conferencing, while others are more generic 593 in nature. The following subsections describe the various RTP 594 extensions mandated or suggested for use within the WebRTC context. 596 5.1. Conferencing Extensions and Topologies 598 RTP is a protocol that inherently supports group communication. 
599 Groups can be implemented by having each endpoint send its RTP packet 600 streams to an RTP middlebox that redistributes the traffic, by using 601 a mesh of unicast RTP packet streams between endpoints, or by using 602 an IP multicast group to distribute the RTP packet streams. These 603 topologies can be implemented in a number of ways as discussed in 604 [I-D.ietf-avtcore-rtp-topologies-update]. 606 While the use of IP multicast groups is popular in IPTV systems, the 607 topologies based on RTP middleboxes are dominant in interactive video 608 conferencing environments. Topologies based on a mesh of unicast 609 transport-layer flows to create a common RTP session have not seen 610 widespread deployment to date. Accordingly, WebRTC implementations 611 are not expected to support topologies based on IP multicast groups 612 or to support mesh-based topologies, such as a point-to-multipoint 613 mesh configured as a single RTP session (Topo-Mesh in the terminology 614 of [I-D.ietf-avtcore-rtp-topologies-update]). However, a point-to- 615 multipoint mesh constructed using several RTP sessions, implemented 616 in the WebRTC context using independent RTCPeerConnections 617 [W3C.WD-webrtc-20130910], can be expected to be utilised by WebRTC 618 applications and needs to be supported. 620 WebRTC implementations of RTP endpoints implemented according to this 621 memo are expected to support all the topologies described in 622 [I-D.ietf-avtcore-rtp-topologies-update] where the RTP endpoints send 623 and receive unicast RTP packet streams to and from some peer device, 624 provided that peer can participate in performing congestion control 625 on the RTP packet streams. The peer device could be another RTP 626 endpoint, or it could be an RTP middlebox that redistributes the RTP 627 packet streams to other RTP endpoints. This limitation means that 628 some of the RTP middlebox-based topologies are not suitable for use 629 in the WebRTC environment. 
Specifically: 631 o Video switching MCUs (Topo-Video-switch-MCU) SHOULD NOT be used, 632 since they make the use of RTCP for congestion control and quality 633 of service reports problematic (see Section 3.8 of 634 [I-D.ietf-avtcore-rtp-topologies-update]). 636 o The Relay-Transport Translator (Topo-PtM-Trn-Translator) topology 637 SHOULD NOT be used because its safe use requires a congestion 638 control algorithm or RTP circuit breaker that handles point to 639 multipoint, which has not yet been standardised. 641 The following topology can be used, however it has some issues worth 642 noting: 644 o Content modifying MCUs with RTCP termination (Topo-RTCP- 645 terminating-MCU) MAY be used. Note that in this RTP Topology, RTP 646 loop detection and identification of active senders is the 647 responsibility of the WebRTC application; since the clients are 648 isolated from each other at the RTP layer, RTP cannot assist with 649 these functions (see section 3.9 of 650 [I-D.ietf-avtcore-rtp-topologies-update]). 652 The RTP extensions described in Section 5.1.1 to Section 5.1.6 are 653 designed to be used with centralised conferencing, where an RTP 654 middlebox (e.g., a conference bridge) receives a participant's RTP 655 packet streams and distributes them to the other participants. These 656 extensions are not necessary for interoperability; an RTP end-point 657 that does not implement these extensions will work correctly, but 658 might offer poor performance. Support for the listed extensions will 659 greatly improve the quality of experience and, to provide a 660 reasonable baseline quality, some of these extensions are mandatory 661 to be supported by WebRTC end-points. 
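The topology guidance given in the list items above can be summarised as a small lookup table. The following is only an illustrative sketch: the dictionary and the function name `topology_guidance` are hypothetical helpers introduced for this example, and the requirement levels are copied from the text of this section rather than from any normative registry.

```python
# Requirement levels for RTP middlebox topologies, as stated in Section 5.1.
# The keys use the topology names from the RTP topologies draft; the table
# itself is a hypothetical illustration, not part of any specification.
TOPOLOGY_GUIDANCE = {
    "Topo-Video-switch-MCU": "SHOULD NOT",      # RTCP-based congestion control is problematic
    "Topo-PtM-Trn-Translator": "SHOULD NOT",    # needs point-to-multipoint congestion control
    "Topo-RTCP-terminating-MCU": "MAY",         # loop detection is left to the application
}


def topology_guidance(name):
    """Return the requirement level for a topology, or 'unspecified'."""
    return TOPOLOGY_GUIDANCE.get(name, "unspecified")
```

A gateway or test harness could consult such a table before selecting a conferencing topology; topologies not listed in this section are reported as "unspecified" rather than guessed.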
663 The RTCP conferencing extensions are defined in Extended RTP Profile 664 for Real-time Transport Control Protocol (RTCP)-Based Feedback (RTP/ 665 AVPF) [RFC4585] and the memo on Codec Control Messages (CCM) in RTP/ 666 AVPF [RFC5104]; they are fully usable by the Secure variant of this 667 profile (RTP/SAVPF) [RFC5124]. 669 5.1.1. Full Intra Request (FIR) 671 The Full Intra Request message is defined in Sections 3.5.1 and 4.3.1 672 of the Codec Control Messages [RFC5104]. It is used to make the 673 mixer request a new Intra picture from a participant in the session. 674 This is used when switching between sources to ensure that the 675 receivers can decode the video or other predictive media encoding 676 with long prediction chains. WebRTC senders MUST understand and 677 react to FIR feedback messages they receive, since this greatly 678 improves the user experience when using centralised mixer-based 679 conferencing. Support for sending FIR messages is OPTIONAL. 681 5.1.2. Picture Loss Indication (PLI) 683 The Picture Loss Indication message is defined in Section 6.3.1 of 684 the RTP/AVPF profile [RFC4585]. It is used by a receiver to tell the 685 sending encoder that it lost the decoder context and would like to 686 have it repaired somehow. This is semantically different from the 687 Full Intra Request above as there could be multiple ways to fulfil 688 the request. WebRTC senders MUST understand and react to PLI 689 feedback messages as a loss tolerance mechanism. Receivers MAY send 690 PLI messages. 692 5.1.3. Slice Loss Indication (SLI) 694 The Slice Loss Indication message is defined in Section 6.3.2 of the 695 RTP/AVPF profile [RFC4585]. It is used by a receiver to tell the 696 encoder that it has detected the loss or corruption of one or more 697 consecutive macro blocks, and would like to have these repaired 698 somehow. 
It is RECOMMENDED that receivers generate SLI feedback 699 messages if slices are lost when using a codec that supports the 700 concept of macro blocks. A sender that receives an SLI feedback 701 message SHOULD attempt to repair the lost slice(s). 703 5.1.4. Reference Picture Selection Indication (RPSI) 705 Reference Picture Selection Indication (RPSI) messages are defined in 706 Section 6.3.3 of the RTP/AVPF profile [RFC4585]. Some video encoding 707 standards allow the use of older reference pictures than the most 708 recent one for predictive coding. If such a codec is in use, and if 709 the encoder has learnt that encoder-decoder synchronisation has been 710 lost, then a known-as-correct reference picture can be used as a base 711 for future coding. The RPSI message allows this to be signalled. 712 Receivers that detect that encoder-decoder synchronisation has been 713 lost SHOULD generate an RPSI feedback message if the codec being used 714 supports reference picture selection. An RTP packet stream sender 715 that receives such an RPSI message SHOULD act on that message to 716 change the reference picture, if it is possible to do so within the 717 available bandwidth constraints, and with the codec being used. 719 5.1.5. Temporal-Spatial Trade-off Request (TSTR) 721 The temporal-spatial trade-off request and notification are defined 722 in Sections 3.5.2 and 4.3.2 of [RFC5104]. This request can be used 723 to ask the video encoder to change the trade-off it makes between 724 temporal and spatial resolution, for example to prefer high spatial 725 image quality but low frame rate. Support for TSTR requests and 726 notifications is OPTIONAL. 728 5.1.6. Temporary Maximum Media Stream Bit Rate Request (TMMBR) 730 The TMMBR feedback message is defined in Sections 3.5.4 and 4.2.1 of 731 the Codec Control Messages [RFC5104]. 
This request and its 732 notification message are used by a media receiver to inform the 733 sending party that there is a current limitation on the amount of 734 bandwidth available to this receiver. There can be various reasons 735 for this: for example, an RTP mixer can use this message to limit the 736 media rate of the sender being forwarded by the mixer (without doing 737 media transcoding) to fit the bottlenecks existing towards the other 738 session participants. WebRTC senders are REQUIRED to implement 739 support for TMMBR messages, and MUST follow bandwidth limitations set 740 by a TMMBR message received for their SSRC. The sending of TMMBR 741 requests is OPTIONAL. 743 5.2. Header Extensions 745 The RTP specification [RFC3550] provides the capability to include 746 RTP header extensions containing in-band data, but the format and 747 semantics of the extensions are poorly specified. The use of header 748 extensions is OPTIONAL in the WebRTC context, but if they are used, 749 they MUST be formatted and signalled following the general mechanism 750 for RTP header extensions defined in [RFC5285], since this gives 751 well-defined semantics to RTP header extensions. 753 As noted in [RFC5285], the requirement from the RTP specification 754 that header extensions are "designed so that the header extension may 755 be ignored" [RFC3550] stands. To be specific, header extensions MUST 756 only be used for data that can safely be ignored by the recipient 757 without affecting interoperability, and MUST NOT be used when the 758 presence of the extension has changed the form or nature of the rest 759 of the packet in a way that is not compatible with the way the stream 760 is signalled (e.g., as defined by the payload type). Valid examples 761 of RTP header extensions might include metadata that is additional to 762 the usual RTP information, but that can safely be ignored without 763 compromising interoperability. 765 5.2.1. 
Rapid Synchronisation 767 Many RTP sessions require synchronisation between audio, video, and 768 other content. This synchronisation is performed by receivers, using 769 information contained in RTCP SR packets, as described in the RTP 770 specification [RFC3550]. This basic mechanism can be slow, however, 771 so it is RECOMMENDED that the rapid RTP synchronisation extensions 772 described in [RFC6051] be implemented in addition to RTCP SR-based 773 synchronisation. The rapid synchronisation extensions use the 774 general RTP header extension mechanism [RFC5285], which requires 775 signalling, but are otherwise backwards compatible. 777 5.2.2. Client-to-Mixer Audio Level 779 The Client-to-Mixer Audio Level extension [RFC6464] is an RTP header 780 extension used by an endpoint to inform a mixer about the level of 781 audio activity in the packet to which the header is attached. This 782 enables an RTP middlebox to make mixing or selection decisions 783 without decoding or detailed inspection of the payload, reducing the 784 complexity in some types of mixers. It can also save decoding 785 resources in receivers, which can choose to decode only the most 786 relevant RTP packet streams based on audio activity levels. 788 The Client-to-Mixer Audio Level [RFC6464] header extension is 789 RECOMMENDED to be implemented. If this header extension is 790 implemented, it is REQUIRED that implementations are capable of 791 encrypting the header extension according to [RFC6904] since the 792 information contained in these header extensions can be considered 793 sensitive. The use of this encryption is RECOMMENDED; however, usage 794 of the encryption can be explicitly disabled through API or 795 signalling. 797 5.2.3. Mixer-to-Client Audio Level 799 The Mixer-to-Client Audio Level header extension [RFC6465] provides 800 an endpoint with the audio level of the different sources mixed into 801 a common source stream by an RTP mixer. 
This enables a user interface 802 to indicate the relative activity level of each session participant, 803 rather than just showing whether a participant is included or not based on the CSRC field. This 804 is a pure optimisation of non-critical functions, and is hence 805 OPTIONAL to implement. If this header extension is implemented, it 806 is REQUIRED that implementations are capable of encrypting the header 807 extension according to [RFC6904] since the information contained in 808 these header extensions can be considered sensitive. It is further 809 RECOMMENDED that this encryption is used, unless the encryption has 810 been explicitly disabled through API or signalling. 812 6. WebRTC Use of RTP: Improving Transport Robustness 814 There are tools that can make RTP packet streams robust against 815 packet loss and reduce the impact of loss on media quality. However, 816 they generally add some overhead compared to a non-robust stream. 817 The overhead needs to be considered, and the aggregate bit-rate MUST 818 be rate controlled to avoid causing network congestion (see 819 Section 7). As a result, improving robustness might require a lower 820 base encoding quality, but has the potential to deliver that quality 821 with fewer errors. The mechanisms described in the following sub- 822 sections can be used to improve tolerance to packet loss. 824 6.1. Negative Acknowledgements and RTP Retransmission 826 As a consequence of supporting the RTP/SAVPF profile, implementations 827 can send negative acknowledgements (NACKs) for RTP data packets 828 [RFC4585]. This feedback can be used to inform a sender of the loss 829 of particular RTP packets, subject to the capacity limitations of the 830 RTCP feedback channel. A sender can use this information to optimise 831 the user experience by adapting the media encoding to compensate for 832 known lost packets. 
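To make the NACK feedback described above more concrete, the following sketch packs a set of lost RTP sequence numbers into the feedback control information (FCI) entries of an RTP/AVPF Generic NACK [RFC4585]. Each entry is 32 bits: a 16-bit packet ID (PID) followed by a 16-bit bitmask of following lost packets (BLP). The function name `build_generic_nack_fci` is a hypothetical helper for illustration, the sketch covers only the FCI and not the surrounding RTCP feedback header, and it makes no attempt to decide whether a given loss is worth reporting.

```python
import struct


def build_generic_nack_fci(lost_seqs):
    """Pack lost RTP sequence numbers into Generic NACK FCI entries.

    BLP bit i (counting the least significant bit as 1) marks packet
    PID+i (modulo 2^16) as lost, so one entry covers up to 17 packets.
    """
    # Simplification: sorting ignores sequence number wrap-around, which
    # can yield more entries than strictly necessary but never wrong ones.
    remaining = sorted({seq & 0xFFFF for seq in lost_seqs})
    entries = []
    while remaining:
        pid = remaining.pop(0)
        blp = 0
        uncovered = []
        for seq in remaining:
            offset = (seq - pid) & 0xFFFF
            if 1 <= offset <= 16:
                blp |= 1 << (offset - 1)
            else:
                uncovered.append(seq)
        remaining = uncovered
        # Network byte order: 16-bit PID, then 16-bit BLP.
        entries.append(struct.pack("!HH", pid, blp))
    return b"".join(entries)
```

For example, reporting the loss of packets 100, 101 and 103 needs a single entry with PID 100 and BLP bits 1 and 3 set, whereas losses more than 16 sequence numbers apart need separate entries.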
834 RTP packet stream senders are REQUIRED to understand the Generic NACK 835 message defined in Section 6.2.1 of [RFC4585], but MAY choose to 836 ignore some or all of this feedback (following Section 4.2 of 837 [RFC4585]). Receivers MAY send NACKs for missing RTP packets. 838 Guidelines on when to send NACKs are provided in [RFC4585]. It is 839 not expected that a receiver will send a NACK for every lost RTP 840 packet; rather, it needs to consider the cost of sending NACK 841 feedback, and the importance of the lost packet, to make an informed 842 decision on whether it is worth telling the sender about a packet 843 loss event. 845 The RTP Retransmission Payload Format [RFC4588] offers the ability to 846 retransmit lost packets based on NACK feedback. Retransmission needs 847 to be used with care in interactive real-time applications to ensure 848 that the retransmitted packet arrives in time to be useful, but can 849 be effective in environments with relatively low network RTT (an RTP 850 sender can estimate the RTT to the receivers using the information in 851 RTCP SR and RR packets, as described at the end of Section 6.4.1 of 852 [RFC3550]). The use of retransmissions can also increase the forward 853 RTP bandwidth, and can potentially cause increased packet loss if 854 the original packet loss was caused by network congestion. Note, 855 however, that retransmission of an important lost packet to repair 856 decoder state can have lower cost than sending a full intra frame. 857 It is not appropriate to blindly retransmit RTP packets in response 858 to a NACK. The importance of lost packets and the likelihood of them 859 arriving in time to be useful need to be considered before RTP 860 retransmission is used. 862 Receivers are REQUIRED to implement support for RTP retransmission 863 packets [RFC4588]. 
Senders MAY send RTP retransmission packets in 864 response to NACKs if the RTP retransmission payload format has been 865 negotiated for the session, and if the sender believes it is useful 866 to send a retransmission of the packet(s) referenced in the NACK. An 867 RTP sender does not need to retransmit every NACKed packet. 869 6.2. Forward Error Correction (FEC) 871 The use of Forward Error Correction (FEC) can provide an effective 872 protection against some degree of packet loss, at the cost of steady 873 bandwidth overhead. There are several FEC schemes that are defined 874 for use with RTP. Some of these schemes are specific to a particular 875 RTP payload format, others operate across RTP packets and can be used 876 with any payload format. It needs to be noted that using redundant 877 encoding or FEC will lead to increased play out delay, which needs to 878 be considered when choosing the redundancy or FEC formats and their 879 respective parameters. 881 If an RTP payload format negotiated for use in an RTCPeerConnection 882 supports redundant transmission or FEC as a standard feature of that 883 payload format, then that support MAY be used in the 884 RTCPeerConnection, subject to any appropriate signalling. 886 There are several block-based FEC schemes that are designed for use 887 with RTP independent of the chosen RTP payload format. At the time 888 of this writing there is no consensus on which, if any, of these FEC 889 schemes is appropriate for use in the WebRTC context. Accordingly, 890 this memo makes no recommendation on the choice of block-based FEC 891 for WebRTC use. 893 7. WebRTC Use of RTP: Rate Control and Media Adaptation 895 WebRTC will be used in heterogeneous network environments using a 896 varied set of link technologies, including both wired and wireless 897 links, to interconnect potentially large groups of users around the 898 world. 
As a result, the network paths between users can have widely 899 varying one-way delays, available bit-rates, load levels, and traffic 900 mixtures. Individual end-points can send one or more RTP packet 901 streams to each participant in a WebRTC conference, and there can be 902 several participants. Each of these RTP packet streams can contain 903 different types of media, and the type of media, bit rate, and number 904 of RTP packet streams as well as transport-layer flows can be highly 905 asymmetric. Non-RTP traffic can share the network paths with RTP 906 transport-layer flows. Since the network environment is not 907 predictable or stable, WebRTC end-points MUST ensure that the RTP 908 traffic they generate can adapt to match changes in the available 909 network capacity. 911 The quality of experience for users of WebRTC implementations is very 912 dependent on effective adaptation of the media to the limitations of 913 the network. End-points have to be designed so they do not transmit 914 significantly more data than the network path can support, except for 915 very short time periods; otherwise high levels of network packet loss 916 or delay spikes will occur, causing media quality degradation. The 917 limiting factor on the capacity of the network path might be the link 918 bandwidth, or it might be competition with other traffic on the link 919 (this can be non-WebRTC traffic, traffic due to other WebRTC flows, 920 or even competition with other WebRTC flows in the same session). 922 An effective media congestion control algorithm is therefore an 923 essential part of the WebRTC framework. However, at the time of this 924 writing, there is no standard congestion control algorithm that can 925 be used for interactive media applications such as WebRTC's flows. 926 Some requirements for congestion control algorithms for 927 RTCPeerConnections are discussed in [I-D.ietf-rmcat-cc-requirements]. 
928 A future version of this memo will mandate the use of a congestion 929 control algorithm that satisfies these requirements. 931 7.1. Boundary Conditions and Circuit Breakers 933 WebRTC implementations MUST implement the RTP circuit breaker 934 algorithm that is described in 935 [I-D.ietf-avtcore-rtp-circuit-breakers]. The RTP circuit breaker is 936 designed to enable applications to recognise and react to situations 937 of extreme network congestion. However, since the RTP circuit 938 breaker might not be triggered until congestion becomes extreme, it 939 cannot be considered a substitute for congestion control, and 940 applications MUST also implement congestion control to allow them to 941 adapt to changes in network capacity. Any future RTP congestion 942 control algorithms are expected to operate within the envelope 943 allowed by the circuit breaker. 945 The session establishment signalling will also necessarily establish 946 boundaries to which the media bit-rate will conform. The choice of 947 media codecs provides upper- and lower-bounds on the supported bit- 948 rates that the application can utilise to provide useful quality, and 949 the packetisation choices that exist. In addition, the signalling 950 channel can establish maximum media bit-rate boundaries using, for 951 example, the SDP "b=AS:" or "b=CT:" lines and the RTP/AVPF Temporary 952 Maximum Media Stream Bit Rate (TMMBR) Requests (see Section 5.1.6 of 953 this memo). Signalled bandwidth limitations, such as SDP "b=AS:" or 954 "b=CT:" lines received from the peer, MUST be followed when sending 955 RTP packet streams. A WebRTC endpoint receiving media SHOULD signal 956 its bandwidth limitations. These limitations have to be based on 957 known bandwidth limitations, for example the capacity of the edge 958 links. 960 7.2. 
Congestion Control Interoperability and Legacy Systems 962 There are legacy RTP implementations that do not implement RTCP, and 963 hence do not provide any congestion feedback. Congestion control 964 cannot be performed with these end-points. WebRTC implementations 965 that need to interwork with such end-points MUST limit their 966 transmission to a low rate, equivalent to a VoIP call using a low 967 bandwidth codec, that is unlikely to cause any significant 968 congestion. 970 When interworking with legacy implementations that support RTCP using 971 the RTP/AVP profile [RFC3551], congestion feedback is provided in 972 RTCP RR packets every few seconds. Implementations that have to 973 interwork with such end-points MUST ensure that they keep within the 974 RTP circuit breaker [I-D.ietf-avtcore-rtp-circuit-breakers] 975 constraints to limit the congestion they can cause. 977 If a legacy end-point supports RTP/AVPF, this enables negotiation of 978 important parameters for frequent reporting, such as the "trr-int" 979 parameter, and the possibility that the end-point supports some 980 useful feedback format for congestion control purposes such as TMMBR 981 [RFC5104]. Implementations that have to interwork with such end- 982 points MUST ensure that they stay within the RTP circuit breaker 983 [I-D.ietf-avtcore-rtp-circuit-breakers] constraints to limit the 984 congestion they can cause, but might find that they can achieve 985 better congestion response depending on the amount of feedback that 986 is available. 988 With proprietary congestion control algorithms, issues can arise when 989 different algorithms and implementations interact in a communication 990 session. If the different implementations have made different 991 choices with regard to the type of adaptation, for example one sender- 992 based and one receiver-based, then one could end up in a situation 993 where one direction is dual controlled while the other direction is 994 not controlled. 
This memo cannot mandate behaviour for proprietary 995 congestion control algorithms, but implementations that use such 996 algorithms ought to be aware of this issue, and try to ensure that 997 effective congestion control is negotiated for media flowing in both 998 directions. If the IETF were to standardise both sender- and 999 receiver-based congestion control algorithms for WebRTC traffic in 1000 the future, the issues of interoperability, control, and ensuring 1001 that both directions of media flow are congestion controlled would 1002 also need to be considered. 1004 8. WebRTC Use of RTP: Performance Monitoring 1006 As described in Section 4.1, implementations are REQUIRED to generate 1007 RTCP Sender Report (SR) and Reception Report (RR) packets relating to 1008 the RTP packet streams they send and receive. These RTCP reports can 1009 be used for performance monitoring purposes, since they include basic 1010 packet loss and jitter statistics. 1012 A large number of additional performance metrics are supported by the 1013 RTCP Extended Reports (XR) framework [RFC3611][RFC6792]. At the time 1014 of this writing, it is not clear what extended metrics are suitable 1015 for use in the WebRTC context, so there is no requirement that 1016 implementations generate RTCP XR packets. However, implementations 1017 that can use detailed performance monitoring data MAY generate RTCP 1018 XR packets as appropriate; the use of such packets SHOULD be 1019 signalled in advance. 1021 9. WebRTC Use of RTP: Future Extensions 1023 It is possible that the core set of RTP protocols and RTP extensions 1024 specified in this memo will prove insufficient for the future needs 1025 of WebRTC applications. 
In this case, future updates to this memo 1026 MUST be made following the Guidelines for Writers of RTP Payload 1027 Format Specifications [RFC2736], How to Write an RTP Payload Format 1028 [I-D.ietf-payload-rtp-howto] and Guidelines for Extending the RTP 1029 Control Protocol [RFC5968], and SHOULD take into account any future 1030 guidelines for extending RTP and related protocols that have been 1031 developed. 1033 Authors of future extensions are urged to consider the wide range of 1034 environments in which RTP is used when recommending extensions, since 1035 extensions that are applicable in some scenarios can be problematic 1036 in others. Where possible, the WebRTC framework will adopt RTP 1037 extensions that are of general utility, to enable easy implementation 1038 of a gateway to other applications using RTP, rather than adopt 1039 mechanisms that are narrowly targeted at specific WebRTC use cases. 1041 10. Signalling Considerations 1043 RTP is built with the assumption that an external signalling channel 1044 exists, and can be used to configure RTP sessions and their features. 1045 The basic configuration of an RTP session consists of the following 1046 parameters: 1048 RTP Profile: The name of the RTP profile to be used in the session. The 1049 RTP/AVP [RFC3551] and RTP/AVPF [RFC4585] profiles can interoperate 1050 on a basic level, as can their secure variants RTP/SAVP [RFC3711] 1051 and RTP/SAVPF [RFC5124]. The secure variants of the profiles do 1052 not directly interoperate with the non-secure variants, due to the 1053 presence of additional header fields for authentication in SRTP 1054 packets and cryptographic transformation of the payload. WebRTC 1055 requires the use of the RTP/SAVPF profile, and this MUST be 1056 signalled. Interworking functions might transform this into the 1057 RTP/SAVP profile for a legacy use case, by indicating to the 1058 WebRTC end-point that the RTP/SAVPF profile is used and configuring a trr- 1059 int value of 4 seconds. 
1061 Transport Information: Source and destination IP address(es) and 1062 ports for RTP and RTCP MUST be signalled for each RTP session. In 1063 WebRTC these transport addresses will be provided by ICE [RFC5245] 1064 that signals candidates and arrives at nominated candidate address 1065 pairs. If RTP and RTCP multiplexing [RFC5761] is to be used, such 1066 that a single port, i.e. transport-layer flow, is used for RTP and 1067 RTCP flows, this MUST be signalled (see Section 4.5). 1069 RTP Payload Types, media formats, and format parameters: The mapping 1070 between media type names (and hence the RTP payload formats to be 1071 used), and the RTP payload type numbers MUST be signalled. Each 1072 media type MAY also have a number of media type parameters that 1073 MUST also be signalled to configure the codec and RTP payload 1074 format (the "a=fmtp:" line from SDP). Section 4.3 of this memo 1075 discusses requirements for uniqueness of payload types. 1077 RTP Extensions: The use of any additional RTP header extensions and 1078 RTCP packet types, including any necessary parameters, MUST be 1079 signalled. This signalling is to ensure that a WebRTC endpoint's 1080 behaviour, especially when sending, of any extensions is 1081 predictable and consistent. For robustness, and for compatibility 1082 with non-WebRTC systems that might be connected to a WebRTC 1083 session via a gateway, implementations are REQUIRED to ignore 1084 unknown RTCP packets and RTP header extensions (see also 1085 Section 4.1). 1087 RTCP Bandwidth: Support for exchanging RTCP Bandwidth values to the 1088 end-points will be necessary. This SHALL be done as described in 1089 "Session Description Protocol (SDP) Bandwidth Modifiers for RTP 1090 Control Protocol (RTCP) Bandwidth" [RFC3556] if using SDP, or 1091 something semantically equivalent. This also ensures that the 1092 end-points have a common view of the RTCP bandwidth. 
A common 1093 RTCP bandwidth is important as too different a view of the 1094 bandwidths can lead to failure to interoperate. 1096 These parameters are often expressed in SDP messages conveyed within 1097 an offer/answer exchange. RTP does not depend on SDP or on the 1098 offer/answer model, but does require all the necessary parameters to 1099 be agreed upon, and provided to the RTP implementation. Note that in 1100 the WebRTC context it will depend on the signalling model and API how 1101 these parameters need to be configured, but they will need to 1102 either be set in the API or explicitly signalled between the peers. 1104 11. WebRTC API Considerations 1106 The WebRTC API [W3C.WD-webrtc-20130910] and the Media Capture and 1107 Streams API [W3C.WD-mediacapture-streams-20130903] define and use 1108 the concept of a MediaStream that consists of zero or more 1109 MediaStreamTracks. A MediaStreamTrack is an individual stream of 1110 media from any type of media source like a microphone or a camera, 1111 but also conceptual sources, like an audio mix or a video composition, 1112 are possible. It needs to be possible to play out the MediaStreamTracks 1113 within a MediaStream in synchronisation. 1115 A MediaStreamTrack's realisation in RTP in the context of an 1116 RTCPeerConnection consists of a source packet stream identified with 1117 an SSRC within an RTP session that is part of the RTCPeerConnection. The 1118 MediaStreamTrack can also result in additional packet streams, and 1119 thus SSRCs, in the same RTP session. These can be dependent packet 1120 streams from scalable encoding of the source stream associated with 1121 the MediaStreamTrack, if such a media encoder is used. They can also 1122 be redundancy packet streams; these are created when applying Forward 1123 Error Correction (Section 6.2) or RTP retransmission (Section 6.1) to 1124 the source packet stream. 1126 It is important to note that the same media source can be feeding 1127 multiple MediaStreamTracks. 
As different sets of constraints or 1128 other parameters can be applied to the MediaStreamTrack, each 1129 MediaStreamTrack instance added to an RTCPeerConnection SHALL result 1130 in an independent source packet stream, with its own set of 1131 associated packet streams, and thus different SSRC(s). Whether 1132 the source stream and the encoding configuration are identical 1133 between different MediaStreamTracks sharing the same media source 1134 depends on the applied constraints and parameters. If the encoding 1135 parameters and constraints are the same, an implementation could 1136 choose to use only one encoded stream to create the different RTP 1137 packet streams. Note that such optimisations would need to take into 1138 account that the constraints for one of the MediaStreamTracks can 1139 change at any moment, meaning that the encoding configurations might no 1140 longer be identical and two different encoder instances would then be 1141 needed. 1143 The same MediaStreamTrack can also be included in multiple 1144 MediaStreams; thus, multiple sets of MediaStreams can implicitly need 1145 to use the same synchronisation base. To ensure that this works in 1146 all cases, and does not force an end-point to disrupt the media by 1147 changing synchronisation base and CNAME during delivery of any 1148 ongoing packet streams, all MediaStreamTracks and their associated 1149 SSRCs originating from the same end-point need to be sent using the 1150 same CNAME within one RTCPeerConnection. This motivates the 1151 strong recommendation in Section 4.9 to only use a single CNAME. 1153 The requirement on using the same CNAME for all SSRCs that 1154 originate from the same end-point does not require a middlebox 1155 that forwards traffic from multiple end-points to only use a 1156 single CNAME. 1158 Different CNAMEs normally need to be used for different 1159 RTCPeerConnection instances, as specified in Section 4.9.
Having two 1160 communication sessions with the same CNAME could enable tracking of a 1161 user or device across different services (see Section 4.4.1 of 1162 [I-D.ietf-rtcweb-security] for details). A web application can 1163 request that the CNAMEs used in different RTCPeerConnections (within 1164 a same-origin context) be the same; this allows for synchronization of 1165 the endpoint's RTP packet streams across the different 1166 RTCPeerConnections. 1168 Note: this doesn't result in a tracking issue, since the creation 1169 of matching CNAMEs depends on existing tracking. 1171 The above will currently force a WebRTC end-point that receives a 1172 MediaStreamTrack on one RTCPeerConnection and adds it as an outgoing 1173 track on any RTCPeerConnection to perform resynchronisation of the stream. 1174 This is because the sending party needs to change the CNAME to the one it 1175 uses, which implies that the sender has to use a local system clock 1176 as timebase for the synchronisation. Thus, the relative relation 1177 between the timebase of the incoming stream and the system sending 1178 out needs to be defined. This relation also needs monitoring for clock 1179 drift and likely adjustments of the synchronisation. The sending 1180 entity is also responsible for congestion control for its sent 1181 streams. In cases of packet loss, the loss of incoming data also 1182 needs to be handled. This leads to the observation that the method 1183 that is least likely to cause issues or interruptions in the outgoing 1184 source packet stream is a model of full decoding, including repair 1185 etc., followed by encoding of the media again into the outgoing 1186 packet stream. Optimisations of this method are clearly possible and 1187 are implementation specific. 1189 A WebRTC end-point MUST support receiving multiple MediaStreamTracks, 1190 where each of the different MediaStreamTracks (and their sets of 1191 associated packet streams) uses different CNAMEs.
However, 1192 MediaStreamTracks that are received with different CNAMEs have no 1193 defined synchronisation. 1195 Note: The motivation for supporting reception of multiple CNAMEs 1196 is to allow for forward compatibility with any future changes that 1197 enable more efficient stream handling when end-points relay/ 1198 forward streams. It also ensures that end-points can interoperate 1199 with certain types of multi-stream middleboxes or end-points that 1200 are not WebRTC. 1202 The binding between the WebRTC MediaStreams, MediaStreamTracks and 1203 the SSRC is done as specified in "Cross Session Stream Identification 1204 in the Session Description Protocol" [I-D.ietf-mmusic-msid]. This 1205 document [I-D.ietf-mmusic-msid] also defines, in section 4.1, how to 1206 map unknown source packet stream SSRCs to MediaStreamTracks and 1207 MediaStreams. The latter is relevant for handling some cases of legacy 1208 interoperability. Commonly, the RTP Payload Type of any incoming packets will 1209 reveal if the packet stream is a source stream or a redundancy or 1210 dependent packet stream. The association to the correct source 1211 packet stream depends on the payload format in use for the packet 1212 stream. 1214 Finally, this specification puts a requirement on the WebRTC API to 1215 realize a method for determining the CSRC list (Section 4.1) as well 1216 as the Mixer-to-Client audio levels (Section 5.2.3) (when supported); 1217 the basic requirements for this are discussed further in 1218 Section 12.2.1. 1220 12. RTP Implementation Considerations 1222 The following discussion provides some guidance on the implementation 1223 of the RTP features described in this memo. The focus is on a WebRTC 1224 end-point implementation perspective, and while some mention is made 1225 of the behaviour of middleboxes, that is not the focus of this memo. 1227 12.1. Configuration and Use of RTP Sessions 1229 A WebRTC end-point will be a simultaneous participant in one or more 1230 RTP sessions.
Each RTP session can convey multiple media sources, 1231 and can include media data from multiple end-points. In the 1232 following, some ways in which WebRTC end-points can configure and use 1233 RTP sessions are outlined. 1235 12.1.1. Use of Multiple Media Sources Within an RTP Session 1237 RTP is a group communication protocol, and every RTP session can 1238 potentially contain multiple RTP packet streams. There are several 1239 reasons why this might be desirable: 1241 Multiple media types: Outside of WebRTC, it is common to use one RTP 1242 session for each type of media source (e.g., one RTP session for 1243 audio sources and one for video sources, each sent over different 1244 transport layer flows). However, to reduce the number of UDP 1245 ports used, the default in WebRTC is to send all types of media in 1246 a single RTP session, as described in Section 4.4, using RTP and 1247 RTCP multiplexing (Section 4.5) to further reduce the number of 1248 UDP ports needed. This RTP session then uses only one bi- 1249 directional transport-layer flow, but will contain multiple RTP 1250 packet streams, each containing a different type of media. A 1251 common example might be an end-point with a camera and microphone 1252 that sends two RTP packet streams, one video and one audio, into a 1253 single RTP session. 1255 Multiple Capture Devices: A WebRTC end-point might have multiple 1256 cameras, microphones, or other media capture devices, and so might 1257 want to generate several RTP packet streams of the same media 1258 type. Alternatively, it might want to send media from a single 1259 capture device in several different formats or quality settings at 1260 once. Both can result in a single end-point sending multiple RTP 1261 packet streams of the same media type into a single RTP session at 1262 the same time. 1264 Associated Repair Data: An end-point might send an RTP packet stream 1265 that is somehow associated with another stream.
For example, it 1266 might send an RTP packet stream that contains FEC or 1267 retransmission data relating to another stream. Some RTP payload 1268 formats send this sort of associated repair data as part of the 1269 source packet stream, while others send it as a separate packet 1270 stream. 1272 Layered or Multiple Description Coding: An end-point can use a 1273 layered media codec, for example H.264 SVC, or a multiple 1274 description codec, that generates multiple RTP packet streams, 1275 each with a distinct RTP SSRC, within a single RTP session. 1277 RTP Mixers, Translators, and Other Middleboxes: An RTP session, in 1278 the WebRTC context, is a point-to-point association between an 1279 end-point and some other peer device, where those devices share a 1280 common SSRC space. The peer device might be another WebRTC end- 1281 point, or it might be an RTP mixer, translator, or some other form 1282 of media processing middlebox. In the latter cases, the middlebox 1283 might send mixed or relayed RTP streams from several participants, 1284 that the WebRTC end-point will need to render. Thus, even though 1285 a WebRTC end-point might only be a member of a single RTP session, 1286 the peer device might be extending that RTP session to incorporate 1287 other end-points. WebRTC is a group communication environment and 1288 end-points need to be capable of receiving, decoding, and playing 1289 out multiple RTP packet streams at once, even in a single RTP 1290 session. 1292 12.1.2. Use of Multiple RTP Sessions 1294 In addition to sending and receiving multiple RTP packet streams 1295 within a single RTP session, a WebRTC end-point might participate in 1296 multiple RTP sessions. 
There are several reasons why a WebRTC end- 1297 point might choose to do this: 1299 To interoperate with legacy devices: The common practice in the non- 1300 WebRTC world is to send different types of media in separate RTP 1301 sessions, for example using one RTP session for audio and another 1302 RTP session, on a separate transport layer flow, for video. All 1303 WebRTC end-points need to support the option of sending different 1304 types of media on different RTP sessions, so they can interwork 1305 with such legacy devices. This is discussed further in 1306 Section 4.4. 1308 To provide enhanced quality of service: Some network-based quality 1309 of service mechanisms operate on the granularity of transport 1310 layer flows. If it is desired to use these mechanisms to provide 1311 differentiated quality of service for some RTP packet streams, 1312 then those RTP packet streams need to be sent in a separate RTP 1313 session using a different transport-layer flow, and with 1314 appropriate quality of service marking. This is discussed further 1315 in Section 12.1.3. 1317 To separate media with different purposes: An end-point might want 1318 to send RTP packet streams that have different purposes on 1319 different RTP sessions, to make it easy for the peer device to 1320 distinguish them. For example, some centralised multiparty 1321 conferencing systems display the active speaker in high 1322 resolution, but show low resolution "thumbnails" of other 1323 participants. Such systems might configure the end-points to send 1324 simulcast high- and low-resolution versions of their video using 1325 separate RTP sessions, to simplify the operation of the RTP 1326 middlebox. In the WebRTC context this is currently possible by 1327 establishing multiple WebRTC MediaStreamTracks that have the same 1328 media source in one (or more) RTCPeerConnection. 
Each 1329 MediaStreamTrack is then configured to deliver a particular media 1330 quality and thus media bit-rate, and will produce an independently 1331 encoded version with the codec parameters agreed specifically in 1332 the context of that RTCPeerConnection. The RTP middlebox can 1333 distinguish packets corresponding to the low- and high-resolution 1334 streams by inspecting their SSRC, RTP payload type, or some other 1335 information contained in RTP payload, RTP header extension or RTCP 1336 packets, but it can be easier to distinguish the RTP packet 1337 streams if they arrive on separate RTP sessions on separate 1338 transport-layer flows. 1340 To directly connect with multiple peers: A multi-party conference 1341 does not need to use an RTP middlebox. Rather, a multi-unicast 1342 mesh can be created, comprising several distinct RTP sessions, 1343 with each participant sending RTP traffic over a separate RTP 1344 session (that is, using an independent RTCPeerConnection object) 1345 to every other participant, as shown in Figure 1. This topology 1346 has the benefit of not requiring an RTP middlebox node that is 1347 trusted to access and manipulate the media data. The downside is 1348 that it increases the used bandwidth at each sender by requiring 1349 one copy of the RTP packet streams for each participant that are 1350 part of the same session beyond the sender itself. 1352 +---+ +---+ 1353 | A |<--->| B | 1354 +---+ +---+ 1355 ^ ^ 1356 \ / 1357 \ / 1358 v v 1359 +---+ 1360 | C | 1361 +---+ 1363 Figure 1: Multi-unicast using several RTP sessions 1365 The multi-unicast topology could also be implemented as a single 1366 RTP session, spanning multiple peer-to-peer transport layer 1367 connections, or as several pairwise RTP sessions, one between each 1368 pair of peers. 
To maintain a coherent mapping between 1369 RTP sessions and RTCPeerConnection objects, it is 1370 recommended that this is implemented as several individual RTP 1371 sessions. The only downside is that end-point A will not learn of 1372 the quality of any transmission happening between B and C, since 1373 it will not see RTCP reports for the RTP session between B and C, 1374 whereas it would if all three participants were part of a single 1375 RTP session. Experience with the Mbone tools (experimental RTP- 1376 based multicast conferencing tools from the late 1990s) has shown 1377 that RTCP reception quality reports for third parties can be 1378 presented to users in a way that helps them understand asymmetric 1379 network problems, and the approach of using separate RTP sessions 1380 prevents this. However, an advantage of using separate RTP 1381 sessions is that it enables using different media bit-rates and 1382 RTP session configurations between the different peers, thus not 1383 forcing B to endure the same quality reductions as C will if there are 1384 limitations in the transport from A to C. It is 1385 believed that these advantages outweigh the limitations in 1386 debugging power. 1388 To indirectly connect with multiple peers: A common scenario in 1389 multi-party conferencing is to create indirect connections to 1390 multiple peers, using an RTP mixer, translator, or some other type 1391 of RTP middlebox. Figure 2 outlines a simple topology that might 1392 be used in a four-person centralised conference. The middlebox 1393 acts to optimise the transmission of RTP packet streams from 1394 certain perspectives, either by only sending some of the received 1395 RTP packet streams to any given receiver, or by providing a 1396 combined RTP packet stream out of a set of contributing streams.
1398 +---+ +-------------+ +---+ 1399 | A |<---->| |<---->| B | 1400 +---+ | RTP mixer, | +---+ 1401 | translator, | 1402 | or other | 1403 +---+ | middlebox | +---+ 1404 | C |<---->| |<---->| D | 1405 +---+ +-------------+ +---+ 1407 Figure 2: RTP mixer with only unicast paths 1409 There are various methods of implementation for the middlebox. If 1410 implemented as a standard RTP mixer or translator, a single RTP 1411 session will extend across the middlebox and encompass all the 1412 end-points in one multi-party session. Other types of middlebox 1413 might use separate RTP sessions between each end-point and the 1414 middlebox. A common aspect is that these RTP middleboxes can use 1415 a number of tools to control the media encoding provided by a 1416 WebRTC end-point. This includes functions like requesting the 1417 breaking of the encoding chain and having the encoder produce a 1418 so-called Intra frame. Another is limiting the bit-rate of a given 1419 stream to better suit the mixer's view of the multiple down-streams. 1420 Others are controlling the most suitable frame-rate, picture 1421 resolution, and the trade-off between frame-rate and spatial quality. 1422 The middlebox has the responsibility to correctly perform 1423 congestion control, identify sources, and manage synchronisation 1424 while providing the application with suitable media optimisations. 1425 The middlebox also has to be a trusted node when it comes to 1426 security, since it manipulates either the RTP header or the media 1427 itself (or both) received from one end-point, before sending it on 1428 towards the end-point(s); thus, it needs to be able to decrypt and 1429 then re-encrypt the RTP packet stream before sending it out. 1431 RTP Mixers can create a situation where an end-point experiences a 1432 situation in-between a session with only two end-points and 1433 multiple RTP sessions. Mixers are expected to not forward RTCP 1434 reports regarding RTP packet streams across themselves.
This is 1435 due to the difference in the RTP packet streams provided to the 1436 different end-points. The original media source lacks information 1437 about a mixer's manipulations prior to sending it to the different 1438 receivers. This scenario also means that an end-point's 1439 feedback or requests go to the mixer. When the mixer can't act 1440 on this by itself, it is forced to go to the original media source 1441 to fulfil the receiver's request. This will not necessarily be 1442 explicitly visible in any RTP and RTCP traffic, but the interactions 1443 and the time to complete them will indicate such dependencies. 1445 Providing source authentication in multi-party scenarios is a 1446 challenge. In the mixer-based topologies, end-point source 1447 authentication is based on, firstly, verifying that media comes 1448 from the mixer by cryptographic verification and, secondly, trust 1449 in the mixer to correctly identify any source towards the end- 1450 point. In RTP sessions where multiple end-points are directly 1451 visible to an end-point, all end-points will have knowledge about 1452 each other's master keys, and can thus inject packets claimed to 1453 come from another end-point in the session. Any node performing 1454 relay can perform non-cryptographic mitigation by preventing 1455 forwarding of packets that have SSRC fields that came from other 1456 end-points before. For cryptographic verification of the source, 1457 SRTP would require additional security mechanisms, for example 1458 TESLA for SRTP [RFC4383], that are not part of the base WebRTC 1459 standards. 1461 To forward media between multiple peers: It is sometimes desirable 1462 for an end-point that receives an RTP packet stream to be able to 1463 forward that RTP packet stream to a third party. There are some 1464 obvious security and privacy implications in supporting this, but 1465 also potential uses.
This is supported in the W3C API by taking 1466 the received and decoded media and using it as a media source that 1467 is re-encoded and transmitted as a new stream. 1469 At the RTP layer, media forwarding acts as a back-to-back RTP 1470 receiver and RTP sender. The receiving side terminates the RTP 1471 session and decodes the media, while the sender side re-encodes 1472 and transmits the media using an entirely separate RTP session. 1473 The original sender will only see a single receiver of the media, 1474 and will not be able to tell that forwarding is happening based on 1475 RTP-layer information since the RTP session that is used to send 1476 the forwarded media is not connected to the RTP session on which 1477 the media was received by the node doing the forwarding. 1479 The end-point that is performing the forwarding is responsible for 1480 producing an RTP packet stream suitable for onwards transmission. 1481 The outgoing RTP session that is used to send the forwarded media 1482 is entirely separate from the RTP session on which the media was 1483 received. This will require media transcoding for congestion 1484 control purposes to produce a suitable bit-rate for the outgoing 1485 RTP session, reducing media quality and forcing the forwarding 1486 end-point to spend resources on the transcoding. The media 1487 transcoding does result in a separation of the two different legs, 1488 removing almost all dependencies, and allowing the forwarding end- 1489 point to optimise its media transcoding operation. The cost is 1490 greatly increased computational complexity on the forwarding node. 1491 Receivers of the forwarded stream will see the forwarding device 1492 as the sender of the stream, and will not be able to tell from the 1493 RTP layer that they are receiving a forwarded stream rather than 1494 an entirely new RTP packet stream generated by the forwarding 1495 device. 1497 12.1.3.
Differentiated Treatment of RTP Packet Streams 1499 There are use cases for differentiated treatment of RTP packet 1500 streams. Such differentiation can happen at several places in the 1501 system. First of all is the prioritization within the end-point 1502 sending the media, which controls both which RTP packet streams 1503 will be sent and their allocation of bit-rate out of the currently 1504 available aggregate as determined by the congestion control. 1506 It is expected that the WebRTC API [W3C.WD-webrtc-20130910] will 1507 allow the application to indicate relative priorities for different 1508 MediaStreamTracks. These priorities can then be used to influence 1509 the local RTP processing, especially when it comes to congestion 1510 control response in how to divide the available bandwidth between the 1511 RTP packet streams. Any changes in relative priority will also need 1512 to be considered for RTP packet streams that are associated with the 1513 main RTP packet streams, such as redundant streams for RTP 1514 retransmission and FEC. The importance of such redundant RTP packet 1515 streams is dependent on the media type and codec used, with regard to 1516 how robust that codec is to packet loss. However, a default policy 1517 might be to use the same priority for a redundant RTP packet stream 1518 as for the source RTP packet stream. 1520 Secondly, the network can prioritize transport-layer flows and sub- 1521 flows, including RTP packet streams. Typically, differential 1522 treatment includes two steps, the first being identifying whether an 1523 IP packet belongs to a class that has to be treated differently, the 1524 second consisting of the actual mechanism to prioritize packets. 1525 The identification step can be done using one of three methods: 1527 DiffServ: The end-point marks a packet with a DiffServ code point to 1528 indicate to the network that the packet belongs to a particular 1529 class.
1531 Flow based: Packets that need to be given a particular treatment are 1532 identified using a combination of IP address and port. 1534 Deep Packet Inspection: A network classifier (DPI) inspects the 1535 packet and tries to determine if the packet represents a 1536 particular application and type that is to be prioritized. 1538 Flow-based differentiation will provide the same treatment to all 1539 packets within a transport-layer flow, i.e., relative prioritization 1540 is not possible. Moreover, if the resources are limited it might not 1541 be possible to provide differential treatment compared to best-effort 1542 for all the RTP packet streams in a WebRTC application. When flow- 1543 based differentiation is available, the WebRTC application needs to 1544 know about it so that it can provide the separation of the RTP packet 1545 streams onto different UDP flows to enable a more granular usage of 1546 flow-based differentiation. That way, it can at least provide different 1547 prioritization of audio and video, if desired by the application. 1549 DiffServ assumes that either the end-point or a classifier can mark 1550 the packets with an appropriate DSCP so that the packets are treated 1551 according to that marking. If the end-point is to mark the traffic, 1552 two requirements arise in the WebRTC context: 1) The WebRTC 1553 application or browser has to know which DSCPs to use and that it can 1554 use them on some set of RTP packet streams. 2) The information needs 1555 to be propagated to the operating system when transmitting the 1556 packet. Details of this process are outside the scope of this memo 1557 and are further discussed in "DSCP and other packet markings for 1558 RTCWeb QoS" [I-D.ietf-tsvwg-rtcweb-qos]. 1560 For packet based marking schemes it might be possible to mark 1561 individual RTP packets differently based on the relative priority of 1562 the RTP payload.
For example, video codecs that have I, P, and B 1563 pictures could give lower priority to any payloads carrying only B frames, 1564 as these are less damaging to lose. However, depending on the QoS 1565 mechanism and what markings are applied, this can result in not 1566 only different packet drop probabilities but also packet reordering; 1567 see [I-D.ietf-tsvwg-rtcweb-qos] for further discussion. As a default 1568 policy, all RTP packets related to an RTP packet stream ought to be 1569 provided with the same prioritization; per-packet prioritization is 1570 outside the scope of this memo, but might be specified elsewhere in 1571 future. 1573 It is also important to consider how RTCP packets associated with a 1574 particular RTP packet stream need to be marked. RTCP compound 1575 packets with Sender Reports (SR) ought to be marked with the same 1576 priority as the RTP packet stream itself, so the RTCP-based round- 1577 trip time (RTT) measurements are done using the same transport-layer 1578 flow priority as the RTP packet stream experiences. RTCP compound 1579 packets containing RR packets ought to be sent with the priority used 1580 by the majority of the RTP packet streams reported on. RTCP packets 1581 containing time-critical feedback packets can use higher priority to 1582 improve the timeliness and likelihood of delivery of such feedback. 1584 12.2. Media Source, RTP Packet Streams, and Participant Identification 1586 12.2.1. Media Source Identification 1588 Each RTP packet stream is identified by a unique synchronisation 1589 source (SSRC) identifier. The SSRC identifier is carried in each of 1590 the RTP packets comprising an RTP packet stream, and is also used to 1591 identify that stream in the corresponding RTCP reports. The SSRC is 1592 chosen as discussed in Section 4.8.
The first stage in 1593 demultiplexing RTP and RTCP packets received on a single transport 1594 layer flow at a WebRTC end-point is to separate the RTP packet 1595 streams based on their SSRC value; once that is done, additional 1596 demultiplexing steps can determine how and where to render the media. 1598 RTP allows a mixer, or other RTP-layer middlebox, to combine encoded 1599 streams from multiple media sources to form a new encoded stream from 1600 a new media source (the mixer). The RTP packets in that new RTP 1601 packet stream can include a Contributing Source (CSRC) list, 1602 indicating which original SSRCs contributed to the combined source 1603 stream. As described in Section 4.1, implementations need to support 1604 reception of RTP data packets containing a CSRC list and RTCP packets 1605 that relate to sources present in the CSRC list. The CSRC list can 1606 change on a packet-by-packet basis, depending on the mixing operation 1607 being performed. Knowledge of what media sources contributed to a 1608 particular RTP packet can be important if the user interface 1609 indicates which participants are active in the session. Changes in 1610 the CSRC list included in packets need to be exposed to the WebRTC 1611 application using some API, if the application is to be able to track 1612 changes in session participation. It is desirable to map CSRC values 1613 back into WebRTC MediaStream identities as they cross this API, to 1614 avoid exposing the SSRC/CSRC name space to JavaScript applications. 1616 If the mixer-to-client audio level extension [RFC6465] is being used 1617 in the session (see Section 5.2.3), the information in the CSRC list 1618 is augmented by audio level information for each contributing source. 1619 It is desirable to expose this information to the WebRTC application 1620 using some API, after mapping the CSRC values to WebRTC MediaStream 1621 identities, so it can be exposed in the user interface. 1623 12.2.2.
SSRC Collision Detection 1625 The RTP standard requires RTP implementations to have support for 1626 detecting and handling SSRC collisions, i.e., resolve the conflict 1627 when two different end-points use the same SSRC value (see section 1628 8.2 of [RFC3550]). This requirement also applies to WebRTC end- 1629 points. There are several scenarios where SSRC collisions can occur: 1631 o In a point-to-point session where each SSRC is associated with 1632 either of the two end-points and where the main media carrying 1633 SSRC identifier will be announced in the signalling channel, a 1634 collision is less likely to occur due to the information about 1635 used SSRCs. If SDP is used, this information is provided by 1636 Source-Specific SDP Attributes [RFC5576]. Still, collisions can 1637 occur if both end-points start using a new SSRC identifier prior 1638 to having signalled it to the peer and received acknowledgement on 1639 the signalling message. The Source-Specific SDP Attributes 1640 [RFC5576] contains a mechanism to signal how the end-point 1641 resolved the SSRC collision. 1643 o SSRC values that have not been signalled could also appear in an 1644 RTP session. This is more likely than it appears, since some RTP 1645 functions use extra SSRCs to provide their functionality. For 1646 example, retransmission data might be transmitted using a separate 1647 RTP packet stream that requires its own SSRC, separate to the SSRC 1648 of the source RTP packet stream [RFC4588]. In those cases, an 1649 end-point can create a new SSRC that strictly doesn't need to be 1650 announced over the signalling channel to function correctly on 1651 both RTP and RTCPeerConnection level. 1653 o Multiple end-points in a multiparty conference can create new 1654 sources and signal those towards the RTP middlebox. In cases 1655 where the SSRC/CSRC are propagated between the different end- 1656 points from the RTP middlebox collisions can occur. 
1658 o An RTP middlebox could connect an end-point's RTCPeerConnection to 1659 another RTCPeerConnection from the same end-point, thus forming a 1660 loop where the end-point will receive its own traffic. While it 1661 is clearly considered a bug, it is important that the end-point is 1662 able to recognise and handle the case when it occurs. This case 1663 becomes even more problematic when media mixers, and so on, are 1664 involved, where the stream received is a different stream but 1665 still contains this client's input. 1667 These SSRC/CSRC collisions can only be handled on RTP level as long 1668 as the same RTP session is extended across multiple 1669 RTCPeerConnections by a RTP middlebox. To resolve the more generic 1670 case where multiple RTCPeerConnections are interconnected, 1671 identification of the media source(s) part of a MediaStreamTrack 1672 being propagated across multiple interconnected RTCPeerConnection 1673 needs to be preserved across these interconnections. 1675 12.2.3. Media Synchronisation Context 1677 When an end-point sends media from more than one media source, it 1678 needs to consider if (and which of) these media sources are to be 1679 synchronized. In RTP/RTCP, synchronisation is provided by having a 1680 set of RTP packet streams be indicated as coming from the same 1681 synchronisation context and logical end-point by using the same RTCP 1682 CNAME identifier. 1684 The next provision is that the internal clocks of all media sources, 1685 i.e., what drives the RTP timestamp, can be correlated to a system 1686 clock that is provided in RTCP Sender Reports encoded in an NTP 1687 format. By correlating all RTP timestamps to a common system clock 1688 for all sources, the timing relation of the different RTP packet 1689 streams, also across multiple RTP sessions can be derived at the 1690 receiver and, if desired, the streams can be synchronized. 
The 1691 requirement is for the media sender to provide the correlation 1692 information; it is up to the receiver whether to use it. 1694 13. Security Considerations 1696 The overall security architecture for WebRTC is described in 1697 [I-D.ietf-rtcweb-security-arch], and security considerations for the 1698 WebRTC framework are described in [I-D.ietf-rtcweb-security]. These 1699 considerations also apply to this memo. 1701 The security considerations of the RTP specification, the RTP/SAVPF 1702 profile, and the various RTP/RTCP extensions and RTP payload formats 1703 that form the complete protocol suite described in this memo apply. 1704 It is not believed there are any new security considerations 1705 resulting from the combination of these various protocol extensions. 1707 The Extended Secure RTP Profile for Real-time Transport Control 1708 Protocol (RTCP)-Based Feedback [RFC5124] (RTP/SAVPF) provides 1709 handling of fundamental issues by offering confidentiality, integrity 1710 and partial source authentication. A mandatory-to-implement media 1711 security solution is created by combining this secured RTP profile and 1712 DTLS-SRTP keying [RFC5764] as defined by Section 5.5 of 1713 [I-D.ietf-rtcweb-security-arch]. 1715 RTCP packets convey a Canonical Name (CNAME) identifier that is used 1716 to associate RTP packet streams that need to be synchronised across 1717 related RTP sessions. Inappropriate choice of CNAME values can be a 1718 privacy concern, since long-term persistent CNAME identifiers can be 1719 used to track users across multiple WebRTC calls. Section 4.9 of 1720 this memo provides guidelines for generation of untraceable CNAME 1721 values that alleviate this risk. 1723 Some potential denial of service attacks exist if the RTCP reporting 1724 interval is configured to an inappropriate value.
This could be done 1725 by configuring the RTCP bandwidth fraction to an excessively large or 1726 small value using the SDP "b=RR:" or "b=RS:" lines [RFC3556], or some 1727 similar mechanism, or by choosing an excessively large or small value 1728 for the RTP/AVPF minimal receiver report interval (if using SDP, this 1729 is the "a=rtcp-fb:... trr-int" parameter) [RFC4585]. The risks are 1730 as follows: 1732 1. the RTCP bandwidth could be configured to make the regular 1733 reporting interval so large that effective congestion control 1734 cannot be maintained, potentially leading to denial of service 1735 due to congestion caused by the media traffic; 1737 2. the RTCP interval could be configured to a very small value, 1738 causing endpoints to generate high rate RTCP traffic, potentially 1739 leading to denial of service due to the non-congestion controlled 1740 RTCP traffic; and 1742 3. RTCP parameters could be configured differently for each 1743 endpoint, with some of the endpoints using a large reporting 1744 interval and some using a smaller interval, leading to denial of 1745 service due to premature participant timeouts due to mismatched 1746 timeout periods which are based on the reporting interval (this 1747 is a particular concern if endpoints use a small but non-zero 1748 value for the RTP/AVPF minimal receiver report interval (trr-int) 1749 [RFC4585], as discussed in Section 6.1 of 1750 [I-D.ietf-avtcore-rtp-multi-stream]). 1752 Premature participant timeout can be avoided by using the fixed (non- 1753 reduced) minimum interval when calculating the participant timeout 1754 (see Section 4.1 of this memo and Section 6.1 of 1755 [I-D.ietf-avtcore-rtp-multi-stream]). 
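The effect of using the fixed minimum interval can be sketched in Python as follows. This is a simplified illustration, not the full RTCP timing algorithm: it assumes the deterministic interval is the larger of the minimum interval and the time needed for all members to report within the RTCP bandwidth, and it ignores the sender/receiver bandwidth split and randomisation. The five-second minimum and the timeout multiplier of five are the [RFC3550] defaults; the function names and the example figures are hypothetical.

```python
FIXED_T_MIN = 5.0        # seconds; fixed (non-reduced) minimum interval of RFC 3550
TIMEOUT_MULTIPLIER = 5   # default participant timeout multiplier of RFC 3550


def deterministic_interval(members: int, avg_rtcp_size: float,
                           rtcp_bandwidth: float, t_min: float) -> float:
    """Simplified deterministic RTCP interval in seconds.

    avg_rtcp_size is in bytes and rtcp_bandwidth in bytes per second.
    """
    return max(t_min, members * avg_rtcp_size / rtcp_bandwidth)


def participant_timeout(members: int, avg_rtcp_size: float,
                        rtcp_bandwidth: float) -> float:
    """Timeout computed from the fixed minimum, never from a reduced one."""
    td = deterministic_interval(members, avg_rtcp_size, rtcp_bandwidth, FIXED_T_MIN)
    return TIMEOUT_MULTIPLIER * td


# Four members, 100-byte average RTCP packets, 2000 bytes/s of RTCP bandwidth.
safe_timeout = participant_timeout(4, 100, 2000)

# For contrast: a timeout derived from a reduced minimum of 0.5 s would expire
# peers that are still reporting at the regular five-second interval.
premature_timeout = TIMEOUT_MULTIPLIER * deterministic_interval(4, 100, 2000, 0.5)
```

With these assumed figures the timeout based on the fixed minimum is 25 seconds, whereas one derived from the reduced minimum would be 2.5 seconds, shorter than a single regular reporting interval of a peer using the default configuration.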
To address the other concerns, 1756 endpoints SHOULD ignore parameters that configure the RTCP reporting 1757 interval to be significantly longer than the default five-second 1758 interval specified in [RFC3550] (unless the media data rate is so low 1759 that the longer reporting interval roughly corresponds to 5% of the 1760 media data rate), or that configure the RTCP reporting interval small 1761 enough that the RTCP bandwidth would exceed the media bandwidth. 1763 The guidelines in [RFC6562] apply when using variable bit rate (VBR) 1764 audio codecs such as Opus (see Section 4.3 for discussion of mandated 1765 audio codecs). The guidelines in [RFC6562] also apply, but are of 1766 lesser importance, when using the client-to-mixer audio level header 1767 extensions (Section 5.2.2) or the mixer-to-client audio level header 1768 extensions (Section 5.2.3). Encryption of the header 1769 extensions is RECOMMENDED, unless there are known reasons, such as RTP 1770 middleboxes or third-party monitoring that will greatly benefit from 1771 the information, and this has been expressed using API or signalling. 1772 If further evidence is produced to show that information leakage 1773 from audio level indications is significant, then use of encryption 1774 needs to be mandated at that time. 1776 14. IANA Considerations 1778 This memo makes no request of IANA. 1780 Note to RFC Editor: this section is to be removed on publication as 1781 an RFC. 1783 15. Acknowledgements 1785 The authors would like to thank Bernard Aboba, Harald Alvestrand, 1786 Cary Bran, Ben Campbell, Charles Eckel, Alex Eleftheriadis, Christian 1787 Groves, Cullen Jennings, Olle Johansson, Suhas Nandakumar, Dan 1788 Romascanu, Jim Spring, Martin Thomson, and the other members of the 1789 IETF RTCWEB working group for their valuable feedback. 1791 16. References 1793 16.1. Normative References 1795 [I-D.ietf-avtcore-multi-media-rtp-session] 1796 Westerlund, M., Perkins, C., and J.
Lennox, "Sending 1797 Multiple Types of Media in a Single RTP Session", draft- 1798 ietf-avtcore-multi-media-rtp-session-05 (work in 1799 progress), February 2014. 1801 [I-D.ietf-avtcore-rtp-circuit-breakers] 1802 Perkins, C. and V. Singh, "Multimedia Congestion Control: 1803 Circuit Breakers for Unicast RTP Sessions", draft-ietf- 1804 avtcore-rtp-circuit-breakers-05 (work in progress), 1805 February 2014. 1807 [I-D.ietf-avtcore-rtp-multi-stream] 1808 Lennox, J., Westerlund, M., Wu, W., and C. Perkins, 1809 "Sending Multiple Media Streams in a Single RTP Session", 1810 draft-ietf-avtcore-rtp-multi-stream-04 (work in progress), 1811 May 2014. 1813 [I-D.ietf-avtcore-rtp-multi-stream-optimisation] 1814 Lennox, J., Westerlund, M., Wu, W., and C. Perkins, 1815 "Sending Multiple Media Streams in a Single RTP Session: 1816 Grouping RTCP Reception Statistics and Other Feedback", 1817 draft-ietf-avtcore-rtp-multi-stream-optimisation-02 (work 1818 in progress), February 2014. 1820 [I-D.ietf-rtcweb-security] 1821 Rescorla, E., "Security Considerations for WebRTC", draft- 1822 ietf-rtcweb-security-06 (work in progress), January 2014. 1824 [I-D.ietf-rtcweb-security-arch] 1825 Rescorla, E., "WebRTC Security Architecture", draft-ietf- 1826 rtcweb-security-arch-09 (work in progress), February 2014. 1828 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1829 Requirement Levels", BCP 14, RFC 2119, March 1997. 1831 [RFC2736] Handley, M. and C. Perkins, "Guidelines for Writers of RTP 1832 Payload Format Specifications", BCP 36, RFC 2736, December 1833 1999. 1835 [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. 1836 Jacobson, "RTP: A Transport Protocol for Real-Time 1837 Applications", STD 64, RFC 3550, July 2003. 1839 [RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and 1840 Video Conferences with Minimal Control", STD 65, RFC 3551, 1841 July 2003. 
1843 [RFC3556] Casner, S., "Session Description Protocol (SDP) Bandwidth 1844 Modifiers for RTP Control Protocol (RTCP) Bandwidth", RFC 1845 3556, July 2003. 1847 [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. 1848 Norrman, "The Secure Real-time Transport Protocol (SRTP)", 1849 RFC 3711, March 2004. 1851 [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session 1852 Description Protocol", RFC 4566, July 2006. 1854 [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey, 1855 "Extended RTP Profile for Real-time Transport Control 1856 Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July 1857 2006. 1859 [RFC4588] Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R. 1860 Hakenberg, "RTP Retransmission Payload Format", RFC 4588, 1861 July 2006. 1863 [RFC4961] Wing, D., "Symmetric RTP / RTP Control Protocol (RTCP)", 1864 BCP 131, RFC 4961, July 2007. 1866 [RFC5104] Wenger, S., Chandra, U., Westerlund, M., and B. Burman, 1867 "Codec Control Messages in the RTP Audio-Visual Profile 1868 with Feedback (AVPF)", RFC 5104, February 2008. 1870 [RFC5124] Ott, J. and E. Carrara, "Extended Secure RTP Profile for 1871 Real-time Transport Control Protocol (RTCP)-Based Feedback 1872 (RTP/SAVPF)", RFC 5124, February 2008. 1874 [RFC5285] Singer, D. and H. Desineni, "A General Mechanism for RTP 1875 Header Extensions", RFC 5285, July 2008. 1877 [RFC5506] Johansson, I. and M. Westerlund, "Support for Reduced-Size 1878 Real-Time Transport Control Protocol (RTCP): Opportunities 1879 and Consequences", RFC 5506, April 2009. 1881 [RFC5761] Perkins, C. and M. Westerlund, "Multiplexing RTP Data and 1882 Control Packets on a Single Port", RFC 5761, April 2010. 1884 [RFC5764] McGrew, D. and E. Rescorla, "Datagram Transport Layer 1885 Security (DTLS) Extension to Establish Keys for the Secure 1886 Real-time Transport Protocol (SRTP)", RFC 5764, May 2010. 1888 [RFC6051] Perkins, C. and T. 
Schierl, "Rapid Synchronisation of RTP 1889 Flows", RFC 6051, November 2010. 1891 [RFC6464] Lennox, J., Ivov, E., and E. Marocco, "A Real-time 1892 Transport Protocol (RTP) Header Extension for Client-to- 1893 Mixer Audio Level Indication", RFC 6464, December 2011. 1895 [RFC6465] Ivov, E., Marocco, E., and J. Lennox, "A Real-time 1896 Transport Protocol (RTP) Header Extension for Mixer-to- 1897 Client Audio Level Indication", RFC 6465, December 2011. 1899 [RFC6562] Perkins, C. and JM. Valin, "Guidelines for the Use of 1900 Variable Bit Rate Audio with Secure RTP", RFC 6562, March 1901 2012. 1903 [RFC6904] Lennox, J., "Encryption of Header Extensions in the Secure 1904 Real-time Transport Protocol (SRTP)", RFC 6904, April 1905 2013. 1907 [RFC7007] Terriberry, T., "Update to Remove DVI4 from the 1908 Recommended Codecs for the RTP Profile for Audio and Video 1909 Conferences with Minimal Control (RTP/AVP)", RFC 7007, 1910 August 2013. 1912 [RFC7022] Begen, A., Perkins, C., Wing, D., and E. Rescorla, 1913 "Guidelines for Choosing RTP Control Protocol (RTCP) 1914 Canonical Names (CNAMEs)", RFC 7022, September 2013. 1916 [RFC7160] Petit-Huguenin, M. and G. Zorn, "Support for Multiple 1917 Clock Rates in an RTP Session", RFC 7160, April 2014. 1919 [RFC7164] Gross, K. and R. Brandenburg, "RTP and Leap Seconds", RFC 1920 7164, March 2014. 1922 16.2. Informative References 1924 [I-D.ietf-avtcore-multiplex-guidelines] 1925 Westerlund, M., Perkins, C., and H. Alvestrand, 1926 "Guidelines for using the Multiplexing Features of RTP to 1927 Support Multiple Media Streams", draft-ietf-avtcore- 1928 multiplex-guidelines-02 (work in progress), January 2014. 1930 [I-D.ietf-avtcore-rtp-topologies-update] 1931 Westerlund, M. and S. Wenger, "RTP Topologies", draft- 1932 ietf-avtcore-rtp-topologies-update-02 (work in progress), 1933 May 2014. 1935 [I-D.ietf-avtext-rtp-grouping-taxonomy] 1936 Lennox, J., Gross, K., Nandakumar, S., and G. 
Salgueiro, 1937 "A Taxonomy of Grouping Semantics and Mechanisms for Real- 1938 Time Transport Protocol (RTP) Sources", draft-ietf-avtext- 1939 rtp-grouping-taxonomy-01 (work in progress), February 1940 2014. 1942 [I-D.ietf-mmusic-msid] 1943 Alvestrand, H., "WebRTC MediaStream Identification in the 1944 Session Description Protocol", draft-ietf-mmusic-msid-05 1945 (work in progress), March 2014. 1947 [I-D.ietf-mmusic-sdp-bundle-negotiation] 1948 Holmberg, C., Alvestrand, H., and C. Jennings, 1949 "Negotiating Media Multiplexing Using the Session 1950 Description Protocol (SDP)", draft-ietf-mmusic-sdp-bundle- 1951 negotiation-07 (work in progress), April 2014. 1953 [I-D.ietf-payload-rtp-howto] 1954 Westerlund, M., "How to Write an RTP Payload Format", 1955 draft-ietf-payload-rtp-howto-13 (work in progress), 1956 January 2014. 1958 [I-D.ietf-rmcat-cc-requirements] 1959 Jesup, R., "Congestion Control Requirements For RMCAT", 1960 draft-ietf-rmcat-cc-requirements-04 (work in progress), 1961 April 2014. 1963 [I-D.ietf-rtcweb-audio] 1964 Valin, J. and C. Bran, "WebRTC Audio Codec and Processing 1965 Requirements", draft-ietf-rtcweb-audio-05 (work in 1966 progress), February 2014. 1968 [I-D.ietf-rtcweb-overview] 1969 Alvestrand, H., "Overview: Real Time Protocols for Brower- 1970 based Applications", draft-ietf-rtcweb-overview-09 (work 1971 in progress), February 2014. 1973 [I-D.ietf-rtcweb-use-cases-and-requirements] 1974 Holmberg, C., Hakansson, S., and G. Eriksson, "Web Real- 1975 Time Communication Use-cases and Requirements", draft- 1976 ietf-rtcweb-use-cases-and-requirements-14 (work in 1977 progress), February 2014. 1979 [I-D.ietf-tsvwg-rtcweb-qos] 1980 Dhesikan, S., Druta, D., Jones, P., and J. Polk, "DSCP and 1981 other packet markings for RTCWeb QoS", draft-ietf-tsvwg- 1982 rtcweb-qos-00 (work in progress), April 2014. 1984 [RFC3611] Friedman, T., Caceres, R., and A. Clark, "RTP Control 1985 Protocol Extended Reports (RTCP XR)", RFC 3611, November 1986 2003. 
1988 [RFC4383] Baugher, M. and E. Carrara, "The Use of Timed Efficient 1989 Stream Loss-Tolerant Authentication (TESLA) in the Secure 1990 Real-time Transport Protocol (SRTP)", RFC 4383, February 1991 2006. 1993 [RFC5245] Rosenberg, J., "Interactive Connectivity Establishment 1994 (ICE): A Protocol for Network Address Translator (NAT) 1995 Traversal for Offer/Answer Protocols", RFC 5245, April 1996 2010. 1998 [RFC5576] Lennox, J., Ott, J., and T. Schierl, "Source-Specific 1999 Media Attributes in the Session Description Protocol 2000 (SDP)", RFC 5576, June 2009. 2002 [RFC5968] Ott, J. and C. Perkins, "Guidelines for Extending the RTP 2003 Control Protocol (RTCP)", RFC 5968, September 2010. 2005 [RFC6263] Marjou, X. and A. Sollaud, "Application Mechanism for 2006 Keeping Alive the NAT Mappings Associated with RTP / RTP 2007 Control Protocol (RTCP) Flows", RFC 6263, June 2011. 2009 [RFC6792] Wu, Q., Hunt, G., and P. Arden, "Guidelines for Use of the 2010 RTP Monitoring Framework", RFC 6792, November 2012. 2012 [W3C.WD-mediacapture-streams-20130903] 2013 Burnett, D., Bergkvist, A., Jennings, C., and A. 2014 Narayanan, "Media Capture and Streams", World Wide Web 2015 Consortium WD WD-mediacapture-streams-20130903, September 2016 2013, . 2019 [W3C.WD-webrtc-20130910] 2020 Bergkvist, A., Burnett, D., Jennings, C., and A. 2021 Narayanan, "WebRTC 1.0: Real-time Communication Between 2022 Browsers", World Wide Web Consortium WD WD- 2023 webrtc-20130910, September 2013, 2024 . 2026 Authors' Addresses 2028 Colin Perkins 2029 University of Glasgow 2030 School of Computing Science 2031 Glasgow G12 8QQ 2032 United Kingdom 2034 Email: csp@csperkins.org 2035 URI: http://csperkins.org/ 2037 Magnus Westerlund 2038 Ericsson 2039 Farogatan 6 2040 SE-164 80 Kista 2041 Sweden 2043 Phone: +46 10 714 82 87 2044 Email: magnus.westerlund@ericsson.com 2045 Joerg Ott 2046 Aalto University 2047 School of Electrical Engineering 2048 Espoo 02150 2049 Finland 2051 Email: jorg.ott@aalto.fi