Network Working Group                                          J. Uberti
Internet-Draft                                                    Google
Intended status: Standards Track                             C. Jennings
Expires: September 30, 2017                                        Cisco
                                                        E. Rescorla, Ed.
                                                                 Mozilla
                                                          March 29, 2017

               Javascript Session Establishment Protocol
                       draft-ietf-rtcweb-jsep-20

Abstract

   This document describes the mechanisms for allowing a Javascript
   application to control the signaling plane of a multimedia session
   via the interface specified in the W3C RTCPeerConnection API, and
   discusses how this relates to existing signaling protocols.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).
   Note that other groups may also distribute working documents as
   Internet-Drafts. The list of current Internet-Drafts is at
   http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 30, 2017.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  General Design of JSEP
     1.2.  Other Approaches Considered
   2.  Terminology
   3.  Semantics and Syntax
     3.1.  Signaling Model
     3.2.  Session Descriptions and State Machine
     3.3.  Session Description Format
     3.4.  Session Description Control
       3.4.1.  RtpTransceivers
       3.4.2.  RtpSenders
       3.4.3.  RtpReceivers
     3.5.  ICE
       3.5.1.  ICE Gathering Overview
       3.5.2.  ICE Candidate Trickling
         3.5.2.1.  ICE Candidate Format
       3.5.3.  ICE Candidate Policy
       3.5.4.  ICE Candidate Pool
     3.6.  Video Size Negotiation
       3.6.1.  Creating an imageattr Attribute
       3.6.2.  Interpreting an imageattr Attribute
     3.7.  Simulcast
     3.8.  Interactions With Forking
       3.8.1.  Sequential Forking
       3.8.2.  Parallel Forking
   4.  Interface
     4.1.  PeerConnection
       4.1.1.  Constructor
       4.1.2.  addTrack
       4.1.3.  removeTrack
       4.1.4.  addTransceiver
       4.1.5.  createDataChannel
       4.1.6.  createOffer
       4.1.7.  createAnswer
       4.1.8.  SessionDescriptionType
         4.1.8.1.  Use of Provisional Answers
         4.1.8.2.  Rollback
       4.1.9.  setLocalDescription
       4.1.10. setRemoteDescription
       4.1.11. currentLocalDescription
       4.1.12. pendingLocalDescription
       4.1.13. currentRemoteDescription
       4.1.14. pendingRemoteDescription
       4.1.15. canTrickleIceCandidates
       4.1.16. setConfiguration
       4.1.17. addIceCandidate
     4.2.  RtpTransceiver
       4.2.1.  stop
       4.2.2.  stopped
       4.2.3.  setDirection
       4.2.4.  direction
       4.2.5.  currentDirection
       4.2.6.  setCodecPreferences
   5.  SDP Interaction Procedures
     5.1.  Requirements Overview
       5.1.1.  Usage Requirements
       5.1.2.  Profile Names and Interoperability
     5.2.  Constructing an Offer
       5.2.1.  Initial Offers
       5.2.2.  Subsequent Offers
       5.2.3.  Options Handling
         5.2.3.1.  IceRestart
         5.2.3.2.  VoiceActivityDetection
     5.3.  Generating an Answer
       5.3.1.  Initial Answers
       5.3.2.  Subsequent Answers
       5.3.3.  Options Handling
         5.3.3.1.  VoiceActivityDetection
     5.4.  Modifying an Offer or Answer
     5.5.  Processing a Local Description
     5.6.  Processing a Remote Description
     5.7.  Parsing a Session Description
       5.7.1.  Session-Level Parsing
       5.7.2.  Media Section Parsing
       5.7.3.  Semantics Verification
     5.8.  Applying a Local Description
     5.9.  Applying a Remote Description
     5.10. Applying an Answer
   6.  Processing RTP/RTCP
   7.  Examples
     7.1.  Simple Example
     7.2.  Detailed Example
     7.3.  Early Transport Warmup Example
   8.  Security Considerations
   9.  IANA Considerations
   10. Acknowledgements
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Appendix A.  Appendix A
   Appendix B.  Change log
   Authors' Addresses

1.  Introduction

   This document describes how the W3C WEBRTC RTCPeerConnection
   interface [W3C.WD-webrtc-20140617] is used to control the setup,
   management and teardown of a multimedia session.

1.1.  General Design of JSEP

   The thinking behind WebRTC call setup has been to fully specify and
   control the media plane, but to leave the signaling plane up to the
   application as much as possible.
   The rationale is that different applications may prefer to use
   different protocols, such as the existing SIP or Jingle call
   signaling protocols, or something custom to the particular
   application, perhaps for a novel use case. In this approach, the key
   information that needs to be exchanged is the multimedia session
   description, which specifies the transport and media configuration
   information necessary to establish the media plane.

   With these considerations in mind, this document describes the
   Javascript Session Establishment Protocol (JSEP), which allows for
   full control of the signaling state machine from Javascript. As
   described above, JSEP assumes a model in which a Javascript
   application executes inside a runtime containing WebRTC APIs (the
   "JSEP implementation"). The JSEP implementation is almost entirely
   divorced from the core signaling flow, which is instead handled by
   the Javascript making use of two interfaces: (1) passing in local and
   remote session descriptions and (2) interacting with the ICE state
   machine. The combination of the JSEP implementation and the
   Javascript application is referred to throughout this document as a
   "JSEP endpoint".

   In this document, the use of JSEP is described as if it always occurs
   between two JSEP endpoints. Note, though, that in many cases it will
   actually be between a JSEP endpoint and some kind of server, such as
   a gateway or MCU. This distinction is invisible to the JSEP
   endpoint; it just follows the instructions it is given via the API.

   JSEP's handling of session descriptions is simple and
   straightforward. Whenever an offer/answer exchange is needed, the
   initiating side creates an offer by calling a createOffer() API. The
   application then uses that offer to set up its local config via the
   setLocalDescription() API.
   The offer is finally sent off to the remote side over its preferred
   signaling mechanism (e.g., WebSockets); upon receipt of that offer,
   the remote party installs it using the setRemoteDescription() API.

   To complete the offer/answer exchange, the remote party uses the
   createAnswer() API to generate an appropriate answer, applies it
   using the setLocalDescription() API, and sends the answer back to the
   initiator over the signaling channel. When the initiator gets that
   answer, it installs it using the setRemoteDescription() API, and
   initial setup is complete. This process can be repeated for
   additional offer/answer exchanges.

   Regarding ICE [RFC5245], JSEP decouples the ICE state machine from
   the overall signaling state machine. The ICE state machine must
   remain in the JSEP implementation, because only the implementation
   has the necessary knowledge of candidates and other transport
   information. Performing this separation also provides additional
   flexibility; in protocols that decouple session descriptions from
   transport, such as Jingle, the session description can be sent
   immediately and the transport information can be sent when available.
   In protocols that don't, such as SIP, the information can be used in
   the aggregated form. Sending transport information separately can
   allow for faster ICE and DTLS startup, since ICE checks can start as
   soon as any transport information is available rather than waiting
   for all of it.

   Through its abstraction of signaling, the JSEP approach does require
   the application to be aware of the signaling process.
   While the application does not need to understand the contents of
   session descriptions to set up a call, the application must call the
   right APIs at the right times, convert the session descriptions and
   ICE information into the defined messages of its chosen signaling
   protocol, and perform the reverse conversion on the messages it
   receives from the other side.

   One way to mitigate this is to provide a Javascript library that
   hides this complexity from the developer; said library would
   implement a given signaling protocol along with its state machine and
   serialization code, presenting a higher-level call-oriented interface
   to the application developer. For example, libraries exist to adapt
   the JSEP API into an API suitable for SIP or XMPP. Thus, JSEP
   provides greater control for the experienced developer without
   forcing any additional complexity on the novice developer.

1.2.  Other Approaches Considered

   One approach that was considered instead of JSEP was to include a
   lightweight signaling protocol. Instead of providing session
   descriptions to the API, the API would produce and consume messages
   from this protocol. While providing a more high-level API, this put
   more control of signaling within the JSEP implementation, forcing it
   to understand and handle concepts like signaling glare. In addition,
   it prevented the application from driving the state machine to a
   desired state, as is needed in the page reload case.

   A second approach that was considered but not chosen was to decouple
   the management of the media control objects from session
   descriptions, instead offering APIs that would control each component
   directly.
   This was rejected based on a feeling that requiring exposure of this
   level of complexity to the application programmer would not be
   beneficial; it would result in an API where even a simple example
   would require a significant amount of code to orchestrate all the
   needed interactions, as well as creating a large API surface that
   needed to be agreed upon and documented. In addition, these API
   points could be called in any order, resulting in a more complex set
   of interactions with the media subsystem than the JSEP approach,
   which specifies how session descriptions are to be evaluated and
   applied.

   One variation on JSEP that was considered was to keep the basic
   session description-oriented API, but to move the mechanism for
   generating offers and answers out of the JSEP implementation.
   Instead of providing createOffer/createAnswer methods within the
   implementation, this approach would expose a getCapabilities API
   which would provide the application with the information it needed in
   order to generate its own session descriptions. This increases the
   amount of work that the application needs to do; it needs to know how
   to generate session descriptions from capabilities, and especially
   how to generate the correct answer from an arbitrary offer and the
   supported capabilities. While this could certainly be addressed by
   using a library like the one mentioned above, it basically forces the
   use of said library even for a simple example. Providing
   createOffer/createAnswer avoids this problem.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.  Semantics and Syntax

3.1.  Signaling Model

   JSEP does not specify a particular signaling model or state machine,
   other than the generic need to exchange session descriptions in the
   fashion described by [RFC3264] (offer/answer) in order for both sides
   of the session to know how to conduct the session. JSEP provides
   mechanisms to create offers and answers, as well as to apply them to
   a session. However, the JSEP implementation is totally decoupled
   from the actual mechanism by which these offers and answers are
   communicated to the remote side, including addressing,
   retransmission, forking, and glare handling. These issues are left
   entirely up to the application; the application has complete control
   over which offers and answers get handed to the implementation, and
   when.

       +-----------+                               +-----------+
       |  Web App  |<--- App-Specific Signaling -->|  Web App  |
       +-----------+                               +-----------+
             ^                                            ^
             |  SDP                                       |  SDP
             V                                            V
       +-----------+                                +-----------+
       |   JSEP    |<----------- Media ------------>|   JSEP    |
       |   Impl.   |                                |   Impl.   |
       +-----------+                                +-----------+

                      Figure 1: JSEP Signaling Model

3.2.  Session Descriptions and State Machine

   In order to establish the media plane, the user agent needs specific
   parameters to indicate what to transmit to the remote side, as well
   as how to handle the media that is received. These parameters are
   determined by the exchange of session descriptions in offers and
   answers, and there are certain details to this process that must be
   handled in the JSEP APIs.

   Whether a session description applies to the local side or the remote
   side affects the meaning of that description. For example, the list
   of codecs sent to a remote party indicates what the local side is
   willing to receive, which, when intersected with the set of codecs
   the remote side supports, specifies what the remote side should send.
   However, not all parameters follow this rule; for example, the
   fingerprints [I-D.ietf-mmusic-4572-update] sent to a remote party are
   calculated based on the local certificate(s) offered; the remote
   party MUST either accept these parameters or reject them altogether,
   with no option to choose different values.

   In addition, various RFCs put different conditions on the format of
   offers versus answers. For example, an offer may propose an
   arbitrary number of m= sections (i.e., media descriptions as
   described in [RFC4566], Section 5.14), but an answer must contain the
   exact same number as the offer.

   Lastly, while the exact media parameters are known only after an
   offer and an answer have been exchanged, it is possible for the
   offerer to receive media after they have sent an offer and before
   they have received an answer. To properly process incoming media in
   this case, the offerer's media handler must be aware of the details
   of the offer before the answer arrives.

   Therefore, in order to handle session descriptions properly, the user
   agent needs:

   1.  To know if a session description pertains to the local or remote
       side.

   2.  To know if a session description is an offer or an answer.

   3.  To allow the offer to be specified independently of the answer.

   JSEP addresses this by adding both setLocalDescription and
   setRemoteDescription methods and having session description objects
   contain a type field indicating the type of session description being
   supplied. This satisfies the requirements listed above for both the
   offerer, who first calls setLocalDescription(sdp [offer]) and then
   later setRemoteDescription(sdp [answer]), as well as for the
   answerer, who first calls setRemoteDescription(sdp [offer]) and then
   later setLocalDescription(sdp [answer]).

   JSEP also allows for an answer to be treated as provisional by the
   application.
   Provisional answers provide a way for an answerer to communicate
   initial session parameters back to the offerer, in order to allow the
   session to begin, while allowing a final answer to be specified
   later. This concept of a final answer is important to the
   offer/answer model; when such an answer is received, any extra
   resources allocated by the caller can be released, now that the exact
   session configuration is known. These "resources" can include things
   like extra ICE components, TURN candidates, or video decoders.
   Provisional answers, on the other hand, do no such deallocation; as a
   result, multiple dissimilar provisional answers, with their own codec
   choices, transport parameters, etc., can be received and applied
   during call setup. Note that the final answer itself may be
   different from any received provisional answers.

   In [RFC3264], the constraint at the signaling level is that only one
   offer can be outstanding for a given session, but at the media stack
   level, a new offer can be generated at any point. For example, when
   using SIP for signaling, if one offer is sent, then cancelled using a
   SIP CANCEL, another offer can be generated even though no answer was
   received for the first offer. To support this, the JSEP media layer
   can provide an offer via the createOffer() method whenever the
   Javascript application needs one for the signaling. The answerer can
   send back zero or more provisional answers, and finally end the
   offer/answer exchange by sending a final answer.
   The state machine for this is as follows:

                       setRemote(OFFER)               setLocal(PRANSWER)
                           /-----\                               /-----\
                           |     |                               |     |
                           v     |                               v     |
            +---------------+    |                +---------------+    |
            |               |----/                |               |----/
            |               | setLocal(PRANSWER)  |               |
            |  Remote-Offer |------------------- >| Local-Pranswer|
            |               |                     |               |
            |               |                     |               |
            +---------------+                     +---------------+
                 ^   |                                   |
                 |   | setLocal(ANSWER)                  |
   setRemote(OFFER)  |                                   |
                 |   V                  setLocal(ANSWER) |
            +---------------+                            |
            |               |                            |
            |               |<---------------------------+
            |    Stable     |
            |               |<---------------------------+
            |               |                            |
            +---------------+          setRemote(ANSWER) |
                 ^   |                                   |
                 |   | setLocal(OFFER)                   |
   setRemote(ANSWER) |                                   |
                 |   V                                   |
            +---------------+                     +---------------+
            |               |                     |               |
            |               | setRemote(PRANSWER) |               |
            |  Local-Offer  |------------------- >|Remote-Pranswer|
            |               |                     |               |
            |               |----\                |               |----\
            +---------------+    |                +---------------+    |
                           ^     |                               ^     |
                           |     |                               |     |
                           \-----/                               \-----/
                       setLocal(OFFER)               setRemote(PRANSWER)

                       Figure 2: JSEP State Machine

   Aside from these state transitions there is no other difference
   between the handling of provisional ("pranswer") and final ("answer")
   answers.

3.3.  Session Description Format

   JSEP's session descriptions use SDP syntax for their internal
   representation. While this format is not optimal for manipulation
   from Javascript, it is widely accepted, and frequently updated with
   new features; any alternate encoding of session descriptions would
   have to keep pace with the changes to SDP, at least until the time
   that this new encoding eclipsed SDP in popularity.

   However, to simplify Javascript processing, and provide for future
   flexibility, the SDP syntax is encapsulated within a
   SessionDescription object, which can be constructed from SDP, and be
   serialized out to SDP. If future specifications agree on a JSON
   format for session descriptions, we could easily enable this object
   to generate and consume that JSON.
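
   As a rough illustration of how a typed description object interacts
   with the state machine in Figure 2, the following sketch models a
   description as an SDP string plus a type field and walks the
   offerer's side of one exchange. It is not the W3C API: the class,
   the TRANSITIONS table, the applyDescription function, and the
   "side:type" key format are all hypothetical names chosen for this
   example, and the placeholder SDP is not a valid session description.

```javascript
// Illustrative sketch only: these names are hypothetical, not the W3C API.

// A description is an SDP blob plus a type field ("offer", "pranswer",
// or "answer"), so it can be serialized for any signaling channel.
class SessionDescription {
  constructor(type, sdp) {
    this.type = type;
    this.sdp = sdp;
  }

  // Serialize to a plain object suitable for JSON.stringify().
  toJSON() {
    return { type: this.type, sdp: this.sdp };
  }

  // Rebuild a description from the parsed JSON form.
  static fromJSON(obj) {
    return new SessionDescription(obj.type, obj.sdp);
  }
}

// Transitions transcribed from Figure 2. Keys are "<side>:<type>".
const TRANSITIONS = {
  "Stable": {
    "local:offer": "Local-Offer",
    "remote:offer": "Remote-Offer",
  },
  "Local-Offer": {
    "local:offer": "Local-Offer",
    "remote:pranswer": "Remote-Pranswer",
    "remote:answer": "Stable",
  },
  "Remote-Pranswer": {
    "remote:pranswer": "Remote-Pranswer",
    "remote:answer": "Stable",
  },
  "Remote-Offer": {
    "remote:offer": "Remote-Offer",
    "local:pranswer": "Local-Pranswer",
    "local:answer": "Stable",
  },
  "Local-Pranswer": {
    "local:pranswer": "Local-Pranswer",
    "local:answer": "Stable",
  },
};

// Return the state reached by applying `description` on `side`
// ("local" or "remote"), or throw if Figure 2 has no such edge.
function applyDescription(state, side, description) {
  const edges = TRANSITIONS[state];
  if (edges === undefined) {
    throw new Error(`unknown state: ${state}`);
  }
  const next = edges[`${side}:${description.type}`];
  if (next === undefined) {
    throw new Error(`set ${side} ${description.type} is invalid in ${state}`);
  }
  return next;
}

// Offerer's view of one exchange that includes a provisional answer.
const offer = new SessionDescription("offer", "v=0\r\n"); // placeholder SDP
const wire = JSON.stringify(offer); // what the application would signal
let state = applyDescription("Stable", "local", offer);
state = applyDescription(state, "remote", new SessionDescription("pranswer", "v=0\r\n"));
state = applyDescription(state, "remote", new SessionDescription("answer", "v=0\r\n"));
console.log(state); // the exchange ends back in the Stable state
```

   Keeping the transitions in a table rather than in nested conditionals
   makes it easy to compare the sketch against the figure edge by edge.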

   Other methods may be added to SessionDescription in the future to
   simplify handling of SessionDescriptions from Javascript. In the
   meantime, Javascript libraries can be used to perform these
   manipulations.

   Note that most applications should be able to treat the
   SessionDescriptions produced and consumed by these various API calls
   as opaque blobs; that is, the application will not need to read or
   change them.

3.4.  Session Description Control

   In order to give the application control over various common session
   parameters, JSEP provides control surfaces which tell the JSEP
   implementation how to generate session descriptions. This avoids the
   need for Javascript to modify session descriptions in most cases.

   Changes to these objects result in changes to the session
   descriptions generated by subsequent createOffer/createAnswer calls.

3.4.1.  RtpTransceivers

   RtpTransceivers allow the application to control the RTP media
   associated with one m= section. Each RtpTransceiver has an RtpSender
   and an RtpReceiver, which an application can use to control the
   sending and receiving of RTP media. The application may also modify
   the RtpTransceiver directly, for instance, by stopping it.

   RtpTransceivers generally have a 1:1 mapping with m= sections,
   although there may be more RtpTransceivers than m= sections when
   RtpTransceivers are created but not yet associated with an m=
   section, or if RtpTransceivers have been stopped and disassociated
   from m= sections. An RtpTransceiver is said to be associated with an
   m= section if its mid property is non-null; otherwise it is said to
   be disassociated. The associated m= section is determined using a
   mapping between transceivers and m= section indices, formed when
   creating an offer or applying a remote offer.
   An RtpTransceiver is never associated with more than one m= section,
   and once a session description is applied, an m= section is always
   associated with exactly one RtpTransceiver.

   RtpTransceivers can be created explicitly by the application or
   implicitly by calling setRemoteDescription with an offer that adds
   new m= sections.

3.4.2.  RtpSenders

   RtpSenders allow the application to control how RTP media is sent.
   An RtpSender is conceptually responsible for the outgoing RTP
   stream(s) described by an m= section. This includes encoding the
   attached MediaStreamTrack, sending RTP media packets, and generating/
   processing RTCP for the outgoing RTP stream(s).

3.4.3.  RtpReceivers

   RtpReceivers allow the application to inspect how RTP media is
   received. An RtpReceiver is conceptually responsible for the
   incoming RTP stream(s) described by an m= section. This includes
   processing received RTP media packets, decoding the incoming
   stream(s) to produce a remote MediaStreamTrack, and generating/
   processing RTCP for the incoming RTP stream(s).

3.5.  ICE

3.5.1.  ICE Gathering Overview

   JSEP gathers ICE candidates as needed by the application. Collection
   of ICE candidates is referred to as a gathering phase, and this is
   triggered either by the addition of a new or recycled m= section to
   the local session description, or new ICE credentials in the
   description, indicating an ICE restart. Use of new ICE credentials
   can be triggered explicitly by the application, or implicitly by the
   JSEP implementation in response to changes in the ICE configuration.

   When the ICE configuration changes in a way that requires a new
   gathering phase, a 'needs-ice-restart' bit is set. When this bit is
   set, calls to the createOffer API will generate new ICE credentials.
   This bit is cleared by a call to the setLocalDescription API with new
   ICE credentials from either an offer or an answer, i.e., from either
   a local- or remote-initiated ICE restart.

   When a new gathering phase starts, the ICE agent will notify the
   application that gathering is occurring through an event. Then, when
   each new ICE candidate becomes available, the ICE agent will supply
   it to the application via an additional event; these candidates will
   also automatically be added to the current and/or pending local
   session description. Finally, when all candidates have been
   gathered, an event will be dispatched to signal that the gathering
   process is complete.

   Note that gathering phases only gather the candidates needed by
   new/recycled/restarting m= sections; other m= sections continue to
   use their existing candidates. Also, when bundling is active,
   candidates are only gathered (and exchanged) for the m= sections
   referenced in BUNDLE-tags, as described in
   [I-D.ietf-mmusic-sdp-bundle-negotiation].

3.5.2.  ICE Candidate Trickling

   Candidate trickling is a technique through which a caller may
   incrementally provide candidates to the callee after the initial
   offer has been dispatched; the semantics of "Trickle ICE" are defined
   in [I-D.ietf-ice-trickle]. This process allows the callee to begin
   acting upon the call and setting up the ICE (and perhaps DTLS)
   connections immediately, without having to wait for the caller to
   gather all possible candidates. This results in faster media setup
   in cases where gathering is not performed prior to initiating the
   call.

   JSEP supports optional candidate trickling by providing APIs, as
   described above, that provide control and feedback on the ICE
   candidate gathering process.
Applications that support candidate 560 trickling can send the initial offer immediately and send individual 561 candidates when they are notified of a new candidate; 562 applications that do not support this feature can simply wait for the 563 indication that gathering is complete, and then create and send their 564 offer, with all the candidates, at that time. 566 Upon receipt of trickled candidates, the receiving application will 567 supply them to its ICE agent. This triggers the ICE agent to start 568 using the new remote candidates for connectivity checks. 570 3.5.2.1. ICE Candidate Format 572 In JSEP, ICE candidates are abstracted by an IceCandidate object, and 573 as with session descriptions, SDP syntax is used for the internal 574 representation. 576 The candidate details are specified in an IceCandidate field, using 577 the same SDP syntax as the "candidate-attribute" field defined in 578 [RFC5245], Section 15.1. For example: 580 candidate:1 1 UDP 1694498815 192.0.2.33 10000 typ host 582 The IceCandidate object contains a field to indicate which ICE ufrag 583 it is associated with, as defined in [RFC5245], Section 15.4. This 584 value is used to determine which session description (and thereby 585 which gathering phase) this IceCandidate belongs to, which helps 586 resolve ambiguities during ICE restarts. If this field is absent in 587 a received IceCandidate (perhaps when communicating with a non-JSEP 588 endpoint), the most recently received session description is assumed. 590 The IceCandidate object also contains fields to indicate which m= 591 section it is associated with, which can be identified in one of two 592 ways, either by an m= section index or by a MID. The m= section index 593 is a zero-based index, with index N referring to the (N+1)th m= section 594 in the session description referenced by this IceCandidate.
The MID 595 is a "media stream identification" value, as defined in [RFC5888], 596 Section 4, which provides a more robust way to identify the m= 597 section in the session description, using the MID of the associated 598 RtpTransceiver object (which may have been locally generated by the 599 answerer when interacting with a non-JSEP endpoint that does not 600 support the MID attribute, as discussed in Section 5.9 below). If 601 the MID field is present in a received IceCandidate, it MUST be used 602 for identification; otherwise, the m= section index is used instead. 604 When creating an IceCandidate object, JSEP implementations MUST 605 populate all of these fields. 607 3.5.3. ICE Candidate Policy 609 Typically, when gathering ICE candidates, the JSEP implementation 610 will gather all possible forms of initial candidates - host, server 611 reflexive, and relay. However, in certain cases, applications may 612 want to have more specific control over the gathering process, due to 613 privacy or related concerns. For example, one may want to only use 614 relay candidates, to leak as little location information as possible 615 (keeping in mind that this choice comes with corresponding 616 operational costs). To accomplish this, JSEP allows the application 617 to restrict which ICE candidates are used in a session. Note that 618 this filtering is applied on top of any restrictions the 619 implementation chooses to enforce regarding which IP addresses are 620 permitted for the application, as discussed in 621 [I-D.ietf-rtcweb-ip-handling]. 623 There may also be cases where the application wants to change which 624 types of candidates are used while the session is active. A prime 625 example is where a callee may initially want to use only relay 626 candidates, to avoid leaking location information to an arbitrary 627 caller, but then change to use all candidates (for lower operational 628 cost) once the user has indicated they want to take the call. 
For 629 this scenario, the JSEP implementation MUST allow the candidate 630 policy to be changed in mid-session, subject to the aforementioned 631 interactions with local policy. 633 To administer the ICE candidate policy, the JSEP implementation will 634 determine the current setting at the start of each gathering phase. 635 Then, during the gathering phase, the implementation MUST NOT expose 636 candidates disallowed by the current policy to the application, use 637 them as the source of connectivity checks, or indirectly expose them 638 via other fields, such as the raddr/rport attributes for other ICE 639 candidates. Later, if a different policy is specified by the 640 application, the application can apply it by kicking off a new 641 gathering phase via an ICE restart. 643 3.5.4. ICE Candidate Pool 645 JSEP applications typically inform the JSEP implementation to begin 646 ICE gathering via the information supplied to setLocalDescription, as 647 this is where the app specifies the number of media streams, and 648 thereby ICE components, for which to gather candidates. However, to 649 accelerate cases where the application knows the number of ICE 650 components to use ahead of time, it may ask the implementation to 651 gather a pool of potential ICE candidates to help ensure rapid media 652 setup. 654 When setLocalDescription is eventually called, and the JSEP 655 implementation goes to gather the needed ICE candidates, it SHOULD 656 start by checking if any candidates are available in the pool. If 657 there are candidates in the pool, they SHOULD be handed to the 658 application immediately via the ICE candidate event. If the pool 659 becomes depleted, either because a larger-than-expected number of ICE 660 components is used, or because the pool has not had enough time to 661 gather candidates, the remaining candidates are gathered as usual. 
662 This only occurs for the first offer/answer exchange, after which the 663 candidate pool is emptied and no longer used. 665 One example of where this concept is useful is an application that 666 expects an incoming call at some point in the future, and wants to 667 minimize the time it takes to establish connectivity, to avoid 668 clipping of initial media. By pre-gathering candidates into the 669 pool, it can exchange and start sending connectivity checks from 670 these candidates almost immediately upon receipt of a call. Note 671 though that by holding on to these pre-gathered candidates, which 672 will be kept alive as long as they may be needed, the application 673 will consume resources on the STUN/TURN servers it is using. 675 3.6. Video Size Negotiation 677 Video size negotiation is the process through which a receiver can 678 use the "a=imageattr" SDP attribute [RFC6236] to indicate what video 679 frame sizes it is capable of receiving. A receiver may have hard 680 limits on what its video decoder can process, or it may have some 681 maximum set by policy. 683 Note that certain codecs support transmission of samples with aspect 684 ratios other than 1.0 (i.e., non-square pixels). JSEP 685 implementations will not transmit non-square pixels, but SHOULD 686 receive and render such video with the correct aspect ratio. 687 However, sample aspect ratio has no impact on the size negotiation 688 described below; all dimensions are measured in pixels, whether 689 square or not. 691 3.6.1. Creating an imageattr Attribute 693 The receiver will first intersect any known local limits (e.g., 694 hardware decoder capabilities, local policy) to determine the 695 absolute minimum and maximum sizes it can receive. If there are no 696 known local limits, the "a=imageattr" attribute SHOULD be omitted.
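As a rough illustration of the intersection step described above, the following TypeScript sketch combines a list of local size limits into a single range. The SizeRange type and the intersectLimits and isEmptyRange names are hypothetical and are not part of JSEP or the W3C API. An empty input list stands for "no known local limits", in which case the attribute would be omitted.

```typescript
// Hypothetical type: inclusive pixel bounds imposed by one local limit
// (for example a hardware decoder or a local policy).
interface SizeRange {
  minX: number;
  maxX: number;
  minY: number;
  maxY: number;
}

// Intersect all known local limits. Returns undefined when there are no
// limits at all, which corresponds to omitting "a=imageattr".
function intersectLimits(limits: SizeRange[]): SizeRange | undefined {
  if (limits.length === 0) {
    return undefined;
  }
  return limits.reduce((acc, cur) => ({
    minX: Math.max(acc.minX, cur.minX),
    maxX: Math.min(acc.maxX, cur.maxX),
    minY: Math.max(acc.minY, cur.minY),
    maxY: Math.min(acc.maxY, cur.maxY),
  }));
}

// True when the intersection permits no resolution at all (the null set).
function isEmptyRange(r: SizeRange): boolean {
  return r.minX > r.maxX || r.minY > r.maxY;
}
```

For instance, intersecting a decoder limit of 48..1920 by 48..1080 with a policy limit of 16..1280 by 16..720 yields 48..1280 by 48..720.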
698 Otherwise, an "a=imageattr" attribute is created with "recv" 699 direction, and the resulting resolution space formed from the 700 aforementioned intersection is used to specify its minimum and 701 maximum x= and y= values. If the intersection is the null set, i.e., 702 the degenerate case of no permitted resolutions, this MUST be 703 represented by x=0 and y=0 values. 705 The rules here express a single set of preferences, and therefore, 706 the "a=imageattr" q= value is not important. It SHOULD be set to 707 1.0. 709 The "a=imageattr" field is payload type specific. When all video 710 codecs supported have the same capabilities, use of a single 711 attribute, with the wildcard payload type (*), is RECOMMENDED. 712 However, when the supported video codecs have different limitations, 713 specific "a=imageattr" attributes MUST be inserted for each payload 714 type. 716 As an example, consider a system with a multiformat video decoder, 717 which is capable of decoding any resolution from 48x48 to 720p. In 718 this case, the implementation would generate this attribute: 720 a=imageattr:* recv [x=[48:1280],y=[48:720],q=1.0] 722 This declaration indicates that the receiver is capable of decoding 723 any image resolution from 48x48 up to 1280x720 pixels. 725 3.6.2. Interpreting an imageattr Attribute 727 [RFC6236] defines "a=imageattr" to be an advisory field. This means 728 that it does not absolutely constrain the video formats that the 729 sender can use, but gives an indication of the preferred values. 731 This specification prescribes more specific behavior. When a sender 732 of a given MediaStreamTrack, which is producing video of a certain 733 resolution, receives an "a=imageattr recv" attribute, it MUST check 734 to see if the original resolution meets the size criteria specified 735 in the attribute, and adapt the resolution accordingly by scaling (if 736 appropriate).
Note that when considering a MediaStreamTrack that is 737 producing rotated video, the unrotated resolution MUST be used. This 738 is required regardless of whether the receiver supports performing 739 receive-side rotation (e.g., through CVO [TS26.114]), as it 740 significantly simplifies the matching logic. 742 For the purposes of resolution negotiation, only size limits are 743 considered. Any other values, e.g. picture or sample aspect ratio, 744 MUST be ignored. 746 When communicating with a non-JSEP endpoint, multiple relevant 747 "a=imageattr recv" attributes may be present in a received m= 748 section. If this occurs, attributes other than the one with the 749 highest "q=" value MUST be ignored. If multiple attributes have the 750 same "q=" value, those that appear after the first such attribute in 751 the m= section MUST be ignored. 753 If an "a=imageattr recv" attribute references a different video 754 payload type than what has been selected for sending the 755 MediaStreamTrack, it MUST be ignored. 757 If the original resolution matches the size limits in the attribute, 758 the track MUST be transmitted untouched. 760 If the original resolution exceeds the size limits in the attribute, 761 the sender SHOULD apply downscaling to the output of the 762 MediaStreamTrack in order to satisfy the limits. Downscaling MUST 763 NOT change the track aspect ratio. 765 If the original resolution is less than the size limits in the 766 attribute, upscaling is needed, but this may not be appropriate in 767 all cases. To address this concern, the application can set an 768 upscaling policy for each sent track. For this case, if upscaling is 769 permitted by policy, the sender SHOULD apply upscaling in order to 770 provide the desired resolution. Otherwise, the sender MUST NOT apply 771 upscaling. The sender SHOULD NOT upscale in other cases, even if the 772 policy permits it. Upscaling MUST NOT change the track aspect ratio. 
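The matching, downscaling, and upscaling rules above can be sketched as a single decision function. This is only an illustration under stated assumptions: the SizeLimits and ScalingDecision types and the decideScaling name are hypothetical, the limits are taken to be one already-selected "a=imageattr recv" range, and scaled sizes are rounded to whole pixels, so the aspect ratio is preserved only up to rounding. When no permitted scaling satisfies the limits, the sketch reports that the track is not to be sent.

```typescript
// Hypothetical types; none of these names come from JSEP or the W3C API.
interface SizeLimits {
  minX: number;
  maxX: number;
  minY: number;
  maxY: number;
}

type ScalingDecision =
  | { action: "none" | "downscale" | "upscale"; width: number; height: number }
  | { action: "do-not-send" };

function fits(w: number, h: number, l: SizeLimits): boolean {
  return w >= l.minX && w <= l.maxX && h >= l.minY && h <= l.maxY;
}

// Scale (w, h) so that one dimension lands exactly on its bound while the
// other follows the original aspect ratio; return the first result that
// satisfies the limits, if any.
function scaleToBound(
  w: number,
  h: number,
  boundX: number,
  boundY: number,
  l: SizeLimits
): { width: number; height: number } | undefined {
  const byWidth = { width: boundX, height: Math.round((h * boundX) / w) };
  const byHeight = { width: Math.round((w * boundY) / h), height: boundY };
  for (const c of [byWidth, byHeight]) {
    if (fits(c.width, c.height, l)) {
      return c;
    }
  }
  return undefined;
}

function decideScaling(
  w: number,
  h: number,
  l: SizeLimits,
  upscaleAllowed: boolean
): ScalingDecision {
  if (fits(w, h, l)) {
    return { action: "none", width: w, height: h };
  }
  const tooBig = w > l.maxX || h > l.maxY;
  const tooSmall = w < l.minX || h < l.minY;
  if (tooBig && !tooSmall) {
    const s = scaleToBound(w, h, l.maxX, l.maxY, l);
    if (s) {
      return { action: "downscale", width: s.width, height: s.height };
    }
  } else if (tooSmall && !tooBig && upscaleAllowed) {
    const s = scaleToBound(w, h, l.minX, l.minY, l);
    if (s) {
      return { action: "upscale", width: s.width, height: s.height };
    }
  }
  // No appropriate and permitted scaling satisfies the limits.
  return { action: "do-not-send" };
}
```

With limits of 48..1280 by 48..720, a 1920x1080 source is downscaled to 1280x720, a 640x480 source is sent untouched, and a 32x24 source is upscaled to 64x48 only if the upscaling policy permits it.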
774 If there is no appropriate and permitted scaling mechanism that 775 allows the received size limits to be satisfied, the sender MUST NOT 776 transmit the track. 778 If the attribute includes a "sar=" (sample aspect ratio) value set to 779 something other than "1.0", indicating the receiver wants to receive 780 non-square pixels, this cannot be satisfied and the sender MUST NOT 781 transmit the track. 783 In the special case of receiving a maximum resolution of [0, 0], as 784 described above, the sender MUST NOT transmit the track. 786 3.7. Simulcast 788 JSEP supports simulcast transmission of a MediaStreamTrack, where 789 multiple encodings of the source media can be transmitted within the 790 context of a single m= section. The current JSEP API is designed to 791 allow applications to send simulcasted media but only to receive a 792 single encoding. This allows for multi-user scenarios where each 793 sending client sends multiple encodings to a server, which then, for 794 each receiving client, chooses the appropriate encoding to forward. 796 Applications request support for simulcast by configuring multiple 797 encodings on an RtpSender, which, upon generation of an offer or 798 answer, are indicated in SDP markings on the corresponding m= 799 section, as described below. Receivers that understand simulcast and 800 are willing to receive it will also include SDP markings to indicate 801 their support, and JSEP endpoints will use these markings to 802 determine whether simulcast is permitted for a given RtpSender. If 803 simulcast support is not negotiated, the RtpSender will only use the 804 first configured encoding. 806 Note that the exact simulcast parameters are up to the sending 807 application. While the aforementioned SDP markings are provided to 808 ensure the remote side can receive and demux multiple simulcast 809 encodings, the specific resolutions and bitrates to be used for each 810 encoding are purely a send-side decision in JSEP. 
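The fallback rule for simulcast stated above (only the first configured encoding is used when simulcast is not negotiated) can be pictured with a small sketch. The EncodingConfig shape and the encodingsToSend name are hypothetical and serve only as an illustration; how a real implementation stores encodings, and how it determines that simulcast was negotiated, are outside this sketch.

```typescript
// Hypothetical shape of one configured encoding. The field names are
// illustrative only and are not taken from JSEP or the W3C API.
interface EncodingConfig {
  id: string;
  maxBitrateBps: number;
}

// Return the encodings an RtpSender would actually use, given whether
// simulcast support was negotiated for its m= section.
function encodingsToSend(
  configured: EncodingConfig[],
  simulcastNegotiated: boolean
): EncodingConfig[] {
  if (simulcastNegotiated) {
    return configured;
  }
  // Without negotiated simulcast, only the first configured encoding is used.
  return configured.slice(0, 1);
}
```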
812 JSEP currently does not provide a mechanism to configure receipt of 813 simulcast. This means that if simulcast is offered by the remote 814 endpoint, the answer generated by a JSEP endpoint will not indicate 815 support for receipt of simulcast, and as such the remote endpoint 816 will only send a single encoding per m= section. 818 In addition, JSEP does not provide a mechanism to handle an incoming 819 offer requesting simulcast from the JSEP endpoint. This means that 820 established simulcast streams will continue to work through a 821 received re-offer, but setting up initial simulcast by way of a 822 received offer requires out-of-band signaling or SDP inspection. 823 Future versions of this specification may add additional APIs to 824 provide direct control. 826 When using JSEP to transmit multiple encodings from an RtpSender, the 827 techniques from [I-D.ietf-mmusic-sdp-simulcast] and 828 [I-D.ietf-mmusic-rid] are used. Specifically, when multiple 829 encodings have been configured for an RtpSender, the m= section for 830 the RtpSender will include an "a=simulcast" attribute, as defined in 831 [I-D.ietf-mmusic-sdp-simulcast], Section 6.2, with a "send" simulcast 832 stream description that lists each desired encoding, and no "recv" 833 simulcast stream description. The m= section will also include an 834 "a=rid" attribute for each encoding, as specified in 835 [I-D.ietf-mmusic-rid], Section 4; the use of RID identifiers allows 836 the individual encodings to be disambiguated even though they are all 837 part of the same m= section. 839 3.8. Interactions With Forking 841 Some call signaling systems allow various types of forking where an 842 SDP Offer may be provided to more than one device. For example, SIP 843 [RFC3261] defines both a "Parallel Search" and "Sequential Search".
844 Although these are primarily signaling level issues that are outside 845 the scope of JSEP, they do have some impact on the configuration of 846 the media plane that is relevant. When forking happens at the 847 signaling layer, the Javascript application responsible for the 848 signaling needs to make the decisions about what media should be sent 849 or received at any point of time, as well as which remote endpoint it 850 should communicate with; JSEP is used to make sure the media engine 851 can make the RTP and media perform as required by the application. 852 The basic operations that the applications can have the media engine 853 do are: 855 o Start exchanging media with a given remote peer, but keep all the 856 resources reserved in the offer. 858 o Start exchanging media with a given remote peer, and free any 859 resources in the offer that are not being used. 861 3.8.1. Sequential Forking 863 Sequential forking involves a call being dispatched to multiple 864 remote callees, where each callee can accept the call, but only one 865 active session ever exists at a time; no mixing of received media is 866 performed. 868 JSEP handles sequential forking well, allowing the application to 869 easily control the policy for selecting the desired remote endpoint. 870 When an answer arrives from one of the callees, the application can 871 choose to apply it either as a provisional answer, leaving open the 872 possibility of using a different answer in the future, or apply it as 873 a final answer, ending the setup flow. 875 In a "first-one-wins" situation, the first answer will be applied as 876 a final answer, and the application will reject any subsequent 877 answers. In SIP parlance, this would be ACK + BYE. 879 In a "last-one-wins" situation, all answers would be applied as 880 provisional answers, and any previous call leg will be terminated. 
881 At some point, the application will end the setup process, perhaps 882 with a timer; at this point, the application could reapply the 883 pending remote description as a final answer. 885 3.8.2. Parallel Forking 887 Parallel forking involves a call being dispatched to multiple remote 888 callees, where each callee can accept the call, and multiple 889 simultaneous active signaling sessions can be established as a 890 result. If multiple callees send media at the same time, the 891 possibilities for handling this are described in Section 3.1 of 892 [RFC3960]. Most SIP devices today only support exchanging media with 893 a single device at a time, and do not try to mix multiple early media 894 audio sources, as that could result in a confusing situation. For 895 example, consider having a European ringback tone mixed together with 896 the North American ringback tone - the resulting sound would not be 897 like either tone, and would confuse the user. If the signaling 898 application wishes to only exchange media with one of the remote 899 endpoints at a time, then from a media engine point of view, this is 900 exactly like the sequential forking case. 902 In the parallel forking case where the Javascript application wishes 903 to simultaneously exchange media with multiple peers, the flow is 904 slightly more complex, but the Javascript application can follow the 905 strategy that [RFC3960] describes using UPDATE. The UPDATE approach 906 allows the signaling to set up a separate media flow for each peer 907 that it wishes to exchange media with. In JSEP, the offer used in 908 the UPDATE would be formed by simply creating a new PeerConnection 909 and making sure that the same local media streams have been added 910 into this new PeerConnection. Then the new PeerConnection object 911 would produce an SDP offer that could be used by the signaling to 912 perform the UPDATE strategy discussed in [RFC3960].
914 As a result of sharing the media streams, the application will end up 915 with N parallel PeerConnection sessions, each with a local and remote 916 description and their own local and remote addresses. The media flow 917 from these sessions can be managed using setDirection (see 918 Section 4.2.3), or the application can choose to play out the media 919 from all sessions mixed together. Of course, if the application 920 wants to only keep a single session, it can simply terminate the 921 sessions that it no longer needs. 923 4. Interface 925 This section details the basic operations that must be present to 926 implement JSEP functionality. The actual API exposed in the W3C API 927 may have somewhat different syntax, but should map easily to these 928 concepts. 930 4.1. PeerConnection 932 4.1.1. Constructor 934 The PeerConnection constructor allows the application to specify 935 global parameters for the media session, such as the STUN/TURN 936 servers and credentials to use when gathering candidates, as well as 937 the initial ICE candidate policy and pool size, and also the bundle 938 policy to use. 940 If an ICE candidate policy is specified, it functions as described in 941 Section 3.5.3, causing the JSEP implementation to only surface the 942 permitted candidates (including any implementation-internal 943 filtering) to the application, and only use those candidates for 944 connectivity checks. The set of available policies is as follows: 946 all: All candidates permitted by implementation policy will be 947 gathered and used. 949 relay: All candidates except relay candidates will be filtered out. 950 This obfuscates the location information that might be ascertained 951 by the remote peer from the received candidates. Depending on how 952 the application deploys and chooses relay servers, this could 953 obfuscate location to a metro or possibly even global level. 
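To make the effect of the two policies concrete, here is a minimal TypeScript sketch that filters candidate lines by policy. It assumes candidates are given in the SDP "candidate-attribute" syntax, where the candidate type follows the "typ" token; the candidateType and filterByPolicy names are hypothetical. The sketch covers only which candidates are surfaced; it does not address the requirement that disallowed candidates not be exposed indirectly (for example through raddr/rport values) or used for connectivity checks.

```typescript
type IceCandidatePolicy = "all" | "relay";

// Extract the candidate type (the token following "typ") from a line in
// candidate-attribute syntax; undefined if no type can be found.
function candidateType(line: string): string | undefined {
  const tokens = line.trim().split(/\s+/);
  const i = tokens.indexOf("typ");
  return i >= 0 && i + 1 < tokens.length ? tokens[i + 1] : undefined;
}

// Keep only the candidates permitted by the given policy.
function filterByPolicy(policy: IceCandidatePolicy, lines: string[]): string[] {
  if (policy === "all") {
    return lines;
  }
  // "relay": everything except relay candidates is filtered out.
  return lines.filter((l) => candidateType(l) === "relay");
}
```

Under the "relay" policy, a host candidate such as "candidate:1 1 UDP 1694498815 192.0.2.33 10000 typ host" would be dropped, while a candidate whose type is "relay" would be kept.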
955 The default ICE candidate policy MUST be set to "all" as this is 956 generally the desired policy, and also typically reduces use of 957 application TURN server resources significantly. 959 If a size is specified for the ICE candidate pool, this indicates the 960 number of ICE components to pre-gather candidates for. Because pre- 961 gathering results in utilizing STUN/TURN server resources for 962 potentially long periods of time, this must only occur upon 963 application request, and therefore the default candidate pool size 964 MUST be zero. 966 The application can specify its preferred policy regarding use of 967 bundle, the multiplexing mechanism defined in 968 [I-D.ietf-mmusic-sdp-bundle-negotiation]. Regardless of policy, the 969 application will always try to negotiate bundle onto a single 970 transport, and will offer a single bundle group across all m= 971 sections; use of this single transport is contingent upon the 972 answerer accepting bundle. However, by specifying a policy from the 973 list below, the application can control exactly how aggressively it 974 will try to bundle media streams together, which affects how it will 975 interoperate with a non-bundle-aware endpoint. When negotiating with 976 a non-bundle-aware endpoint, only the streams not marked as bundle- 977 only streams will be established. 979 The set of available policies is as follows: 981 balanced: The first m= section of each type (audio, video, or 982 application) will contain transport parameters, which will allow 983 an answerer to unbundle that section. The second and any 984 subsequent m= section of each type will be marked bundle-only. 985 The result is that if there are N distinct media types, then 986 candidates will be gathered for N media streams. This policy 987 balances the desire to multiplex with the need to ensure basic audio 988 and video can still be negotiated in legacy cases.
When acting as 989 answerer, if there is no bundle group in the offer, the 990 implementation will reject all but the first m= section of each 991 type. 993 max-compat: All m= sections will contain transport parameters; none 994 will be marked as bundle-only. This policy will allow all streams 995 to be received by non-bundle-aware endpoints, but require separate 996 candidates to be gathered for each media stream. 998 max-bundle: Only the first m= section will contain transport 999 parameters; all streams other than the first will be marked as 1000 bundle-only. This policy aims to minimize candidate gathering and 1001 maximize multiplexing, at the cost of less compatibility with 1002 legacy endpoints. When acting as answerer, the implementation 1003 will reject any m= sections other than the first m= section, 1004 unless they are in the same bundle group as that m= section. 1006 As it provides the best tradeoff between performance and 1007 compatibility with legacy endpoints, the default bundle policy MUST 1008 be set to "balanced". 1010 The application can specify its preferred policy regarding use of 1011 RTP/RTCP multiplexing [RFC5761] using one of the following policies: 1013 negotiate: The JSEP implementation will gather both RTP and RTCP 1014 candidates but also will offer "a=rtcp-mux", thus allowing for 1015 compatibility with either multiplexing or non-multiplexing 1016 endpoints. 1018 require: The JSEP implementation will only gather RTP candidates and 1019 will insert an "a=rtcp-mux-only" indication into any new m= 1020 sections in offers it generates. This halves the number of 1021 candidates that the offerer needs to gather. Applying a 1022 description with an m= section that does not contain an "a=rtcp- 1023 mux" attribute will cause an error to be returned. 1025 The default multiplexing policy MUST be set to "require". 1026 Implementations MAY choose to reject attempts by the application to 1027 set the multiplexing policy to "negotiate". 1029 4.1.2. 
addTrack 1031 The addTrack method adds a MediaStreamTrack to the PeerConnection, 1032 using the MediaStream argument to associate the track with other 1033 tracks in the same MediaStream, so that they can be added to the same 1034 "LS" group when creating an offer or answer. addTrack attempts to 1035 minimize the number of transceivers as follows: If the PeerConnection 1036 is in the "have-remote-offer" state, the track will be attached to 1037 the first compatible transceiver that was created by the most recent 1038 call to setRemoteDescription() and does not have a local track. 1039 Otherwise, a new transceiver will be created, as described in 1040 Section 4.1.4. 1042 4.1.3. removeTrack 1044 The removeTrack method removes a MediaStreamTrack from the 1045 PeerConnection, using the RtpSender argument to indicate which sender 1046 should have its track removed. The sender's track is cleared, and 1047 the sender stops sending. Future calls to createOffer will mark the 1048 m= section associated with the sender as recvonly (if 1049 transceiver.currentDirection is sendrecv) or as inactive (if 1050 transceiver.currentDirection is sendonly). 1052 4.1.4. addTransceiver 1054 The addTransceiver method adds a new RtpTransceiver to the 1055 PeerConnection. If a MediaStreamTrack argument is provided, then the 1056 transceiver will be configured with that media type and the track 1057 will be attached to the transceiver. Otherwise, the application MUST 1058 explicitly specify the type; this mode is useful for creating 1059 recvonly transceivers as well as for creating transceivers to which a 1060 track can be attached at some later point. 1062 At the time of creation, the application can also specify a 1063 transceiver direction attribute, a set of MediaStreams which the 1064 transceiver is associated with (allowing LS group assignments), and a 1065 set of encodings for the media (used for simulcast as described in 1066 Section 3.7). 1068 4.1.5. 
createDataChannel 1070 The createDataChannel method creates a new data channel and attaches 1071 it to the PeerConnection. If no data channel currently exists for 1072 this PeerConnection, then a new offer/answer exchange is required. 1073 All data channels on a given PeerConnection share the same SCTP/DTLS 1074 association and therefore the same m= section, so subsequent creation 1075 of data channels does not have any impact on the JSEP state. 1077 The createDataChannel method also includes a number of arguments 1078 which are used by the PeerConnection (e.g., maxPacketLifetime) but 1079 are not reflected in the SDP and do not affect the JSEP state. 1081 4.1.6. createOffer 1083 The createOffer method generates a blob of SDP that contains a 1084 [RFC3264] offer with the supported configurations for the session, 1085 including descriptions of the media added to this PeerConnection, the 1086 codec/RTP/RTCP options supported by this implementation, and any 1087 candidates that have been gathered by the ICE agent. An options 1088 parameter may be supplied to provide additional control over the 1089 generated offer. This options parameter allows an application to 1090 trigger an ICE restart, for the purpose of reestablishing 1091 connectivity. 1093 In the initial offer, the generated SDP will contain all desired 1094 functionality for the session (functionality that is supported but 1095 not desired by default may be omitted); for each SDP line, the 1096 generation of the SDP will follow the process defined for generating 1097 an initial offer from the document that specifies the given SDP line. 1098 The exact handling of initial offer generation is detailed in 1099 Section 5.2.1 below. 1101 In the event createOffer is called after the session is established, 1102 createOffer will generate an offer to modify the current session 1103 based on any changes that have been made to the session, e.g., adding 1104 or stopping RtpTransceivers, or requesting an ICE restart. 
For each 1105 existing stream, the generation of each SDP line must follow the 1106 process defined for generating an updated offer from the RFC that 1107 specifies the given SDP line. For each new stream, the generation of 1108 the SDP must follow the process of generating an initial offer, as 1109 mentioned above. If no changes have been made, or for SDP lines that 1110 are unaffected by the requested changes, the offer will only contain 1111 the parameters negotiated by the last offer-answer exchange. The 1112 exact handling of subsequent offer generation is detailed in 1113 Section 5.2.2 below. 1115 Session descriptions generated by createOffer must be immediately 1116 usable by setLocalDescription; if a system has limited resources 1117 (e.g., a finite number of decoders), createOffer should return an 1118 offer that reflects the current state of the system, so that 1119 setLocalDescription will succeed when it attempts to acquire those 1120 resources. 1122 Calling this method may do things such as generating new ICE 1123 credentials, but does not result in candidate gathering, or cause 1124 media to start or stop flowing. Specifically, the offer is not 1125 applied, and does not become the pending local description, until 1126 setLocalDescription is called. 1128 4.1.7. createAnswer 1130 The createAnswer method generates a blob of SDP that contains a 1131 [RFC3264] SDP answer with the supported configuration for the session 1132 that is compatible with the parameters supplied in the most recent 1133 call to setRemoteDescription, which MUST have been called prior to 1134 calling createAnswer. Like createOffer, the returned blob contains 1135 descriptions of the media added to this PeerConnection, the 1136 codec/RTP/RTCP options negotiated for this session, and any 1137 candidates that have been gathered by the ICE agent. An options 1138 parameter may be supplied to provide additional control over the 1139 generated answer.
1141 As an answer, the generated SDP will contain a specific configuration 1142 that specifies how the media plane should be established; for each 1143 SDP line, the generation of the SDP must follow the process defined 1144 for generating an answer from the document that specifies the given 1145 SDP line. The exact handling of answer generation is detailed in 1146 Section 5.3 below. 1148 Session descriptions generated by createAnswer must be immediately 1149 usable by setLocalDescription; like createOffer, the returned 1150 description should reflect the current state of the system. 1152 Calling this method may do things such as generating new ICE 1153 credentials, but does not trigger candidate gathering or cause a 1154 media state change. Specifically, the answer is not applied, and 1155 does not become the pending local description, until 1156 setLocalDescription is called. 1158 4.1.8. SessionDescriptionType 1160 Session description objects (RTCSessionDescription) may be of type 1161 "offer", "pranswer", "answer" or "rollback". These types provide 1162 information as to how the description parameter should be parsed, and 1163 how the media state should be changed. 1165 "offer" indicates that a description should be parsed as an offer; 1166 said description may include many possible media configurations. A 1167 description used as an "offer" may be applied anytime the 1168 PeerConnection is in a stable state, or as an update to a previously 1169 supplied but unanswered "offer". 1171 "pranswer" indicates that a description should be parsed as an 1172 answer, but not a final answer, and so should not result in the 1173 freeing of allocated resources. It may result in the start of media 1174 transmission, if the answer does not specify an inactive media 1175 direction. A description used as a "pranswer" may be applied as a 1176 response to an "offer", or an update to a previously sent "pranswer".
1178 "answer" indicates that a description should be parsed as an answer, 1179 the offer-answer exchange should be considered complete, and any 1180 resources (decoders, candidates) that are no longer needed can be 1181 released. A description used as an "answer" may be applied as a 1182 response to an "offer", or an update to a previously sent "pranswer". 1184 The only difference between a provisional and final answer is that 1185 the final answer results in the freeing of any unused resources that 1186 were allocated as a result of the offer. As such, the application 1187 can use some discretion on whether an answer should be applied as 1188 provisional or final, and can change the type of the session 1189 description as needed. For example, in a serial forking scenario, an 1190 application may receive multiple "final" answers, one from each 1191 remote endpoint. The application could choose to accept the initial 1192 answers as provisional answers, and only apply an answer as final 1193 when it receives one that meets its criteria (e.g. a live user 1194 instead of voicemail). 1196 "rollback" is a special session description type implying that the 1197 state machine should be rolled back to the previous stable state, as 1198 described in Section 4.1.8.2. The contents MUST be empty. 1200 4.1.8.1. Use of Provisional Answers 1202 Most applications will not need to create answers using the 1203 "pranswer" type. While it is good practice to send an immediate 1204 response to an offer, in order to warm up the session transport and 1205 prevent media clipping, the preferred handling for a JSEP application 1206 is to create and send a "sendonly" final answer with a null 1207 MediaStreamTrack immediately after receiving the offer, which will 1208 prevent media from being sent by the caller, and allow media to be 1209 sent immediately upon answer by the callee. 
Later, when the callee 1210 actually accepts the call, the application can plug in the real 1211 MediaStreamTrack and create a new "sendrecv" offer to update the 1212 previous offer/answer pair and start bidirectional media flow. While 1213 this could also be done with a "sendonly" pranswer, followed by a 1214 "sendrecv" answer, the initial pranswer leaves the offer-answer 1215 exchange open, which means that the caller cannot send an updated 1216 offer during this time. 1218 As an example, consider a typical JSEP application that wants to set 1219 up audio and video as quickly as possible. When the callee receives 1220 an offer with audio and video MediaStreamTracks, it will send an 1221 immediate answer accepting these tracks as sendonly (meaning that the 1222 caller will not send the callee any media yet, and because the callee 1223 has not yet added its own MediaStreamTracks, the callee will not send 1224 any media either). It will then ask the user to accept the call and 1225 acquire the needed local tracks. Upon acceptance by the user, the 1226 application will plug in the tracks it has acquired, which, because 1227 ICE and DTLS handshaking have likely completed by this point, can 1228 start transmitting immediately. The application will also send a new 1229 offer to the remote side indicating call acceptance and moving the 1230 audio and video to be two-way media. A detailed example flow along 1231 these lines is shown in Section 7.3. 1233 Of course, some applications may not be able to perform this double 1234 offer-answer exchange, particularly ones that are attempting to 1235 gateway to legacy signaling protocols. In these cases, pranswer can 1236 still provide the application with a mechanism to warm up the 1237 transport. 1239 4.1.8.2. Rollback 1241 In certain situations it may be desirable to "undo" a change made to 1242 setLocalDescription or setRemoteDescription. 
Consider a case where a 1243 call is ongoing, and one side wants to change some of the session 1244 parameters; that side generates an updated offer and then calls 1245 setLocalDescription. However, the remote side, either before or 1246 after setRemoteDescription, decides it does not want to accept the 1247 new parameters, and sends a reject message back to the offerer. Now, 1248 the offerer, and possibly the answerer as well, need to return to a 1249 stable state and the previous local/remote description. To support 1250 this, we introduce the concept of "rollback". 1252 A rollback discards any proposed changes to the session, returning 1253 the state machine to the stable state, and setting the pending local 1254 and/or remote description (see Section 4.1.12 and Section 4.1.14) to 1255 null. Any resources or candidates that were allocated by the 1256 abandoned local description are discarded; any media that is received 1257 will be processed according to the previous local and remote 1258 descriptions. Rollback can only be used to cancel proposed changes; 1259 there is no support for rolling back from a stable state to a 1260 previous stable state. Note that this implies that once the answerer 1261 has performed setLocalDescription with his answer, this cannot be 1262 rolled back. 1264 A rollback will disassociate any RtpTransceivers that were associated 1265 with m= sections by the application of the rolled-back session 1266 description (see Section 5.9 and Section 5.8). This means that some 1267 RtpTransceivers that were previously associated will no longer be 1268 associated with any m= section; in such cases, the value of the 1269 RtpTransceiver's mid property MUST be set to null, and the mapping 1270 between the transceiver and its m= section index MUST be discarded. 1271 RtpTransceivers that were created by applying a remote offer that was 1272 subsequently rolled back MUST be stopped and removed from the 1273 PeerConnection. 
However, a RtpTransceiver MUST NOT be removed if a 1274 track was attached to the RtpTransceiver via the addTrack method. 1275 This is so that an application may call addTrack, then call 1276 setRemoteDescription with an offer, then roll back that offer, then 1277 call createOffer and have a m= section for the added track appear in 1278 the generated offer. 1280 A rollback is performed by supplying a session description of type 1281 "rollback" with empty contents to either setLocalDescription or 1282 setRemoteDescription, depending on which was most recently used (i.e. 1283 if the new offer was supplied to setLocalDescription, the rollback 1284 should be done using setLocalDescription as well). 1286 4.1.9. setLocalDescription 1288 The setLocalDescription method instructs the PeerConnection to apply 1289 the supplied session description as its local configuration. The 1290 type field indicates whether the description should be processed as 1291 an offer, provisional answer, or final answer; offers and answers are 1292 checked differently, using the various rules that exist for each SDP 1293 line. 1295 This API changes the local media state; among other things, it sets 1296 up local resources for receiving and decoding media. In order to 1297 successfully handle scenarios where the application wants to offer to 1298 change from one media format to a different, incompatible format, the 1299 PeerConnection must be able to simultaneously support use of both the 1300 current and pending local descriptions (e.g., support the codecs that 1301 exist in either description). This dual processing begins when the 1302 PeerConnection enters the have-local-offer state, and continues until 1303 setRemoteDescription is called with either a final answer, at which 1304 point the PeerConnection can fully adopt the pending local 1305 description, or a rollback, which results in a revert to the current 1306 local description. 
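The interaction between offers, answers and rollback described above can be sketched as a small state table. This is only an illustrative model, not part of the API: the class name SignalingModel, the TRANSITIONS table and the apply() helper are hypothetical, the "have-local-pranswer" and "have-remote-pranswer" state names are assumed from the wider JSEP state machine, and only rollback of an unanswered offer is modeled.

```python
# Hypothetical model of the signaling states discussed in the text.
# Keys are (current state, which setter was called, description type).
TRANSITIONS = {
    ("stable", "local", "offer"): "have-local-offer",
    ("stable", "remote", "offer"): "have-remote-offer",
    # An unanswered offer may be updated by another offer.
    ("have-local-offer", "local", "offer"): "have-local-offer",
    ("have-remote-offer", "remote", "offer"): "have-remote-offer",
    # Provisional and final answers to a local offer.
    ("have-local-offer", "remote", "pranswer"): "have-remote-pranswer",
    ("have-local-offer", "remote", "answer"): "stable",
    ("have-remote-pranswer", "remote", "pranswer"): "have-remote-pranswer",
    ("have-remote-pranswer", "remote", "answer"): "stable",
    # Provisional and final answers to a remote offer.
    ("have-remote-offer", "local", "pranswer"): "have-local-pranswer",
    ("have-remote-offer", "local", "answer"): "stable",
    ("have-local-pranswer", "local", "pranswer"): "have-local-pranswer",
    ("have-local-pranswer", "local", "answer"): "stable",
    # Rollback uses the same setter that applied the offer (simplified).
    ("have-local-offer", "local", "rollback"): "stable",
    ("have-remote-offer", "remote", "rollback"): "stable",
}


class SignalingModel:
    """Toy tracker for current/pending descriptions; not a real PeerConnection."""

    def __init__(self):
        self.state = "stable"
        self.current = {"local": None, "remote": None}
        self.pending = {"local": None, "remote": None}

    def apply(self, side, desc_type, sdp=""):
        key = (self.state, side, desc_type)
        if key not in TRANSITIONS:
            raise ValueError(f"{desc_type!r} via {side} not allowed in {self.state!r}")
        if desc_type == "rollback":
            if sdp:
                raise ValueError("rollback contents MUST be empty")
            # Discard the proposed change; current descriptions are untouched.
            self.pending = {"local": None, "remote": None}
        elif desc_type == "answer":
            other = "remote" if side == "local" else "local"
            # The final answer and the offer it answers become current.
            self.current[side] = sdp
            self.current[other] = self.pending[other]
            self.pending = {"local": None, "remote": None}
        else:
            # Offers and provisional answers are held as pending.
            self.pending[side] = sdp
        self.state = TRANSITIONS[key]
```

In this model, applying a local offer and then a local rollback returns to "stable" with both pending descriptions null, while a rollback attempted from "stable" is rejected, matching the statement that rollback can only cancel proposed changes.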
1308 This API indirectly controls the candidate gathering process. When a 1309 local description is supplied, and the number of transports currently 1310 in use does not match the number of transports needed by the local 1311 description, the PeerConnection will create transports as needed and 1312 begin gathering candidates for each transport, using ones from the 1313 candidate pool if available. 1315 If setRemoteDescription was previously called with an offer, and 1316 setLocalDescription is called with an answer (provisional or final), 1317 and the media directions are compatible, and media is available to 1318 send, this will result in the starting of media transmission. 1320 4.1.10. setRemoteDescription 1322 The setRemoteDescription method instructs the PeerConnection to apply 1323 the supplied session description as the desired remote configuration. 1324 As in setLocalDescription, the type field of the description 1325 indicates how it should be processed. 1327 This API changes the local media state; among other things, it sets 1328 up local resources for sending and encoding media. 1330 If setLocalDescription was previously called with an offer, and 1331 setRemoteDescription is called with an answer (provisional or final), 1332 and the media directions are compatible, and media is available to 1333 send, this will result in the starting of media transmission. 1335 4.1.11. currentLocalDescription 1337 The currentLocalDescription method returns the current negotiated 1338 local description - i.e., the local description from the last 1339 successful offer/answer exchange - in addition to any local 1340 candidates that have been generated by the ICE agent since the local 1341 description was set. 1343 A null object will be returned if an offer/answer exchange has not 1344 yet been completed. 1346 4.1.12. 
pendingLocalDescription 1348 The pendingLocalDescription method returns a copy of the local 1349 description currently in negotiation - i.e., a local offer set 1350 without any corresponding remote answer - in addition to any local 1351 candidates that have been generated by the ICE agent since the local 1352 description was set. 1354 A null object will be returned if the state of the PeerConnection is 1355 "stable" or "have-remote-offer". 1357 4.1.13. currentRemoteDescription 1359 The currentRemoteDescription method returns a copy of the current 1360 negotiated remote description - i.e., the remote description from the 1361 last successful offer/answer exchange - in addition to any remote 1362 candidates that have been supplied via processIceMessage since the 1363 remote description was set. 1365 A null object will be returned if an offer/answer exchange has not 1366 yet been completed. 1368 4.1.14. pendingRemoteDescription 1370 The pendingRemoteDescription method returns a copy of the remote 1371 description currently in negotiation - i.e., a remote offer set 1372 without any corresponding local answer - in addition to any remote 1373 candidates that have been supplied via processIceMessage since the 1374 remote description was set. 1376 A null object will be returned if the state of the PeerConnection is 1377 "stable" or "have-local-offer". 1379 4.1.15. canTrickleIceCandidates 1381 The canTrickleIceCandidates property indicates whether the remote 1382 side supports receiving trickled candidates. There are three 1383 potential values: 1385 null: No SDP has been received from the other side, so it is not 1386 known if it can handle trickle. This is the initial value before 1387 setRemoteDescription() is called. 1389 true: SDP has been received from the other side indicating that it 1390 can support trickle. 1392 false: SDP has been received from the other side indicating that it 1393 cannot support trickle. 
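As a minimal sketch of how an application might act on these three values, the helper below treats only an explicit true as permission to trickle. The function name signaling_plan and its return strings are hypothetical; None stands in for the null value.

```python
from typing import Optional


def signaling_plan(can_trickle_ice_candidates: Optional[bool]) -> str:
    """Pick a signaling strategy from the canTrickleIceCandidates value.

    Only True positively indicates remote trickle support; None (nothing
    received yet) and False are both handled conservatively.
    """
    if can_trickle_ice_candidates is True:
        # Send the description right away, then candidates as gathered.
        return "send-now-then-trickle"
    # Unknown or unsupported: wait so the description carries candidates.
    return "wait-for-gathering"


for value in (None, True, False):
    print(value, "->", signaling_plan(value))
```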
1395 As described in Section 3.5.2, JSEP implementations always provide 1396 candidates to the application individually, consistent with what is 1397 needed for Trickle ICE. However, applications can use the 1398 canTrickleIceCandidates property to determine whether their peer can 1399 actually do Trickle ICE, i.e., whether it is safe to send an initial 1400 offer or answer followed later by candidates as they are gathered. 1401 As "true" is the only value that definitively indicates remote 1402 Trickle ICE support, an application which compares 1403 canTrickleIceCandidates against "true" will by default attempt Half 1404 Trickle on initial offers and Full Trickle on subsequent interactions 1405 with a Trickle ICE-compatible agent. 1407 4.1.16. setConfiguration 1409 The setConfiguration method allows the global configuration of the 1410 PeerConnection, which was initially set by constructor parameters, to 1411 be changed during the session. The effects of this method call 1412 depend on when it is invoked, and differ depending on which specific 1413 parameters are changed: 1415 o Any changes to the STUN/TURN servers to use affect the next 1416 gathering phase. If an ICE gathering phase has already started or 1417 completed, the 'needs-ice-restart' bit mentioned in Section 3.5.1 1418 will be set. This will cause the next call to createOffer to 1419 generate new ICE credentials, for the purpose of forcing an ICE 1420 restart and kicking off a new gathering phase, in which the new 1421 servers will be used. If the ICE candidate pool has a nonzero 1422 size, and a local description has not yet been applied, any 1423 existing candidates will be discarded, and new candidates will be 1424 gathered from the new servers. 1426 o Any change to the ICE candidate policy affects the next gathering 1427 phase. If an ICE gathering phase has already started or 1428 completed, the 'needs-ice-restart' bit will be set. 
Either way, 1429 changes to the policy have no effect on the candidate pool, 1430 because pooled candidates are not surfaced to the application 1431 until a gathering phase occurs, and so any necessary filtering can 1432 still be done on any pooled candidates. 1434 o The ICE candidate pool size MUST NOT be changed after applying a 1435 local description. If a local description has not yet been 1436 applied, any changes to the ICE candidate pool size take effect 1437 immediately; if increased, additional candidates are pre-gathered; 1438 if decreased, the now-superfluous candidates are discarded. 1440 o The bundle and RTCP-multiplexing policies MUST NOT be changed 1441 after the construction of the PeerConnection. 1443 This call may result in a change to the state of the ICE Agent. 1445 4.1.17. addIceCandidate 1447 The addIceCandidate method provides a remote candidate to the ICE 1448 agent, which, if parsed successfully, will be added to the current 1449 and/or pending remote description according to the rules defined for 1450 Trickle ICE. The pair of MID and ufrag is used to determine the m= 1451 section and ICE candidate generation to which the candidate belongs. 1452 If the MID is not present, the m= section index is used to look up 1453 the locally generated MID (see Section 5.9), which is used in place 1454 of a supplied MID. If these values or the candidate string are 1455 invalid, an error is generated. 1457 The purpose of the ufrag is to resolve ambiguities when Trickle ICE 1458 is in progress during an ICE restart. If the ufrag is absent, the 1459 candidate MUST be assumed to belong to the most recently applied 1460 remote description. Connectivity checks will be sent to the new 1461 candidate. 1463 This method can also be used to provide an end-of-candidates 1464 indication to the ICE agent, as defined in [I-D.ietf-ice-trickle].
1465 The MID and ufrag are used as described above to determine the m= 1466 section and ICE generation for which candidate gathering is complete. 1467 If the ufrag is not present, then the end-of-candidates indication 1468 MUST be assumed to apply to the relevant m= section in the most 1469 recently applied remote description. If neither the MID nor the m= 1470 index is present, then the indication MUST be assumed to apply to all 1471 m= sections in the most recently applied remote description. 1473 This call will result in a change to the state of the ICE Agent, and 1474 may result in a change to media state if it results in connectivity 1475 being established. 1477 4.2. RtpTransceiver 1479 4.2.1. stop 1481 The stop method stops an RtpTransceiver. This will cause future 1482 calls to createOffer to generate a zero port for the associated m= 1483 section. See below for more details. 1485 4.2.2. stopped 1487 The stopped property indicates whether the transceiver has been 1488 stopped, either by a call to stop or by applying an answer 1489 that rejects the associated m= section. In either of these cases, it 1490 is set to "true", and otherwise will be set to "false". 1492 A stopped RtpTransceiver does not send any outgoing RTP or RTCP or 1493 process any incoming RTP or RTCP. It cannot be restarted. 1495 4.2.3. setDirection 1497 The setDirection method sets the direction of a transceiver, which 1498 affects the direction property of the associated m= section on future 1499 calls to createOffer and createAnswer. 1501 When creating offers, the transceiver direction is directly reflected 1502 in the output, even for reoffers. When creating answers, the 1503 transceiver direction is intersected with the offered direction, as 1504 explained in Section 5.3 below.
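The intersection performed when creating an answer can be illustrated with a short sketch. It assumes the usual offer/answer reading of the directional attributes, in which the answerer may send only if the offerer is willing to receive, and may receive only if the offerer is willing to send. The function name answer_direction is hypothetical.

```python
# Each direction expressed as (can send, can receive) from its owner's view.
_FLAGS = {
    "sendrecv": (True, True),
    "sendonly": (True, False),
    "recvonly": (False, True),
    "inactive": (False, False),
}
_NAMES = {flags: name for name, flags in _FLAGS.items()}


def answer_direction(transceiver_direction: str, offered_direction: str) -> str:
    """Intersect the local transceiver direction with the offered direction."""
    local_send, local_recv = _FLAGS[transceiver_direction]
    offer_send, offer_recv = _FLAGS[offered_direction]
    # The answerer sends only if the offerer receives, and vice versa.
    send = local_send and offer_recv
    recv = local_recv and offer_send
    return _NAMES[(send, recv)]


print(answer_direction("sendrecv", "sendonly"))  # recvonly
print(answer_direction("sendonly", "sendonly"))  # inactive
```

A "sendrecv" transceiver answering a "sendonly" offer therefore produces "recvonly", since the offerer has declared that it will not receive.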
1506 Note that while setDirection sets the direction property of the 1507 transceiver immediately (Section 4.2.4), this property does not 1508 immediately affect whether the transceiver's RtpSender will send or 1509 its RtpReceiver will receive. The direction in effect is represented 1510 by the currentDirection property, which is only updated when an 1511 answer is applied. 1513 4.2.4. direction 1515 The direction property indicates the last value passed into 1516 setDirection. If setDirection has never been called, it is set to 1517 the direction the transceiver was initialized with. 1519 4.2.5. currentDirection 1521 The currentDirection property indicates the last negotiated direction 1522 for the transceiver's associated m= section. More specifically, it 1523 indicates the [RFC3264] directional attribute of the associated m= 1524 section in the last applied answer, with "send" and "recv" directions 1525 reversed if it was a remote answer. For example, if the directional 1526 attribute for the associated m= section in a remote answer is 1527 "recvonly", currentDirection is set to "sendonly". 1529 If an answer that references this transceiver has not yet been 1530 applied, or if the transceiver is stopped, currentDirection is set to 1531 null. 1533 4.2.6. setCodecPreferences 1535 The setCodecPreferences method sets the codec preferences of a 1536 transceiver, which in turn affect the presence and order of codecs of 1537 the associated m= section on future calls to createOffer and 1538 createAnswer. Note that setCodecPreferences does not directly affect 1539 which codec the implementation decides to send. It only affects 1540 which codecs the implementation indicates that it prefers to receive, 1541 via the offer or answer. Even when a codec is excluded by 1542 setCodecPreferences, it still may be used to send until the next 1543 offer/answer exchange discards it. 
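The effect of codec preferences on the generated media formats can be sketched as a filter-and-reorder step. This is an illustrative assumption about how an implementation might apply the rule, not a normative algorithm; the function name and the use of plain codec-name strings are hypothetical.

```python
from typing import List, Sequence


def apply_codec_preferences(available: Sequence[str],
                            preferences: Sequence[str]) -> List[str]:
    """Return the codecs to list in the m= section, in order.

    available:   codecs that would be present without any preferences.
    preferences: application-supplied order; empty means "no preference".
    """
    if not preferences:
        return list(available)
    allowed = set(available)
    # Preferences can reorder and exclude codecs, but never add one that
    # would not otherwise be present.
    return [codec for codec in preferences if codec in allowed]


print(apply_codec_preferences(["opus", "PCMU", "PCMA"], ["PCMA", "opus", "G722"]))
# ['PCMA', 'opus']
```

Here "PCMU" is excluded because it is absent from the preferences, and "G722" is ignored because it was never available.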
1545 The codec preferences of an RtpTransceiver can cause codecs to be 1546 excluded by subsequent calls to createOffer and createAnswer, in 1547 which case the corresponding media formats in the associated m= 1548 section will be excluded. The codec preferences cannot add media 1549 formats that would otherwise not be present. This includes codecs 1550 that were not negotiated in a previous offer/answer exchange that 1551 included the transceiver. 1553 The codec preferences of an RtpTransceiver can also determine the 1554 order of codecs in subsequent calls to createOffer and createAnswer, 1555 in which case the order of the media formats in the associated m= 1556 section will follow the specified preferences. 1558 5. SDP Interaction Procedures 1560 This section describes the specific procedures to be followed when 1561 creating and parsing SDP objects. 1563 5.1. Requirements Overview 1565 JSEP implementations must comply with the specifications listed below 1566 that govern the creation and processing of offers and answers. 1568 5.1.1. Usage Requirements 1570 All session descriptions handled by JSEP implementations, both local 1571 and remote, MUST indicate support for the following specifications. 1572 If any of these are absent, this omission MUST be treated as an 1573 error. 1575 o ICE, as specified in [RFC5245], MUST be used. Note that the 1576 remote endpoint may use a Lite implementation; implementations 1577 MUST properly handle remote endpoints which do ICE-Lite. 1579 o DTLS [RFC6347] or DTLS-SRTP [RFC5763], MUST be used, as 1580 appropriate for the media type, as specified in 1581 [I-D.ietf-rtcweb-security-arch] 1583 The SDES SRTP keying mechanism from [RFC4568] MUST NOT be used, as 1584 discussed in [I-D.ietf-rtcweb-security-arch]. 1586 5.1.2. 
Profile Names and Interoperability 1588 For media m= sections, JSEP implementations MUST support the 1589 "UDP/TLS/RTP/SAVPF" profile specified in [RFC7850], and MUST indicate 1590 this profile for each media m= line they produce in an offer. For 1591 data m= sections, implementations MUST support the "UDP/DTLS/SCTP" 1592 profile and MUST indicate this profile for each data m= line they 1593 produce in an offer. Because ICE can select either UDP [RFC5245] or 1594 TCP [RFC6544] transport depending on network conditions, this 1595 advertisement is consistent with ICE eventually selecting 1596 either UDP or TCP. 1598 Unfortunately, in an attempt at compatibility, some endpoints 1599 generate other profile strings even when they mean to support one of 1600 these profiles. For instance, an endpoint might generate "RTP/AVP" 1601 but supply "a=fingerprint" and "a=rtcp-fb" attributes, indicating its 1602 willingness to support "(UDP,TCP)/TLS/RTP/SAVPF". In order to 1603 simplify compatibility with such endpoints, JSEP implementations MUST 1604 apply the following rules when processing the media m= sections in 1605 an offer: 1607 o The profile in any "m=" line in any answer MUST exactly match the 1608 profile provided in the offer. 1610 o Any profile matching the following patterns MUST be accepted: 1611 "RTP/[S]AVP[F]" and "(UDP/TCP)/TLS/RTP/SAVP[F]" 1613 o Because DTLS-SRTP is REQUIRED, the choice of SAVP or AVP has no 1614 effect; support for DTLS-SRTP is determined by the presence of one 1615 or more "a=fingerprint" attributes. Note that lack of an 1616 "a=fingerprint" attribute will lead to negotiation failure. 1618 o The use of AVPF or AVP simply controls the timing rules used for 1619 RTCP feedback. If AVPF is provided, or an "a=rtcp-fb" attribute 1620 is present, assume AVPF timing, i.e., a default value of "trr- 1621 int=0". Otherwise, assume that AVPF is being used in an AVP 1622 compatible mode and use a value of "trr-int=4000".
1624 o For data m= sections, implementations MUST support receiving the 1625 "UDP/DTLS/SCTP", "TCP/DTLS/SCTP", or "DTLS/SCTP" (for backwards 1626 compatibility) profiles. 1628 Note that re-offers by JSEP implementations MUST use the correct 1629 profile strings even if the initial offer/answer exchange used an 1630 (incorrect) older profile string. 1632 5.2. Constructing an Offer 1634 When createOffer is called, a new SDP description must be created 1635 that includes the functionality specified in 1636 [I-D.ietf-rtcweb-rtp-usage]. The exact details of this process are 1637 explained below. 1639 5.2.1. Initial Offers 1641 When createOffer is called for the first time, the result is known as 1642 the initial offer. 1644 The first step in generating an initial offer is to generate session- 1645 level attributes, as specified in [RFC4566], Section 5. 1646 Specifically: 1648 o The first SDP line MUST be "v=0", as specified in [RFC4566], 1649 Section 5.1. 1651 o The second SDP line MUST be an "o=" line, as specified in 1652 [RFC4566], Section 5.2. The value of the <username> field SHOULD 1653 be "-". [RFC3264] requires that the <sess-id> be representable as 1654 a 64-bit signed integer. It is RECOMMENDED that the <sess-id> be 1655 generated as a 64-bit quantity with the high bit being set to 1656 zero and the remaining 63 bits being cryptographically random. 1657 The value of the <nettype> <addrtype> <unicast-address> tuple 1658 SHOULD be set to a non-meaningful address, such as IN IP4 0.0.0.0, 1659 to prevent leaking the local address in this field. As mentioned 1660 in [RFC4566], the entire o= line needs to be unique, but selecting 1661 a random number for <sess-id> is sufficient to accomplish this. 1663 o The third SDP line MUST be a "s=" line, as specified in [RFC4566], 1664 Section 5.3; to match the "o=" line, a single dash SHOULD be used 1665 as the session name, e.g. "s=-".
Note that this differs from the 1666 advice in [RFC4566] which proposes a single space, but as both 1667 "o=" and "s=" are meaningless, having the same meaningless value 1668 seems clearer. 1670 o Session Information ("i="), URI ("u="), Email Address ("e="), 1671 Phone Number ("p="), Repeat Times ("r="), and Time Zones ("z=") 1672 lines are not useful in this context and SHOULD NOT be included. 1674 o Encryption Keys ("k=") lines do not provide sufficient security 1675 and MUST NOT be included. 1677 o A "t=" line MUST be added, as specified in [RFC4566], Section 5.9; 1678 both <start-time> and <stop-time> SHOULD be set to zero, e.g. "t=0 1679 0". 1681 o An "a=ice-options" line with the "trickle" option MUST be added, 1682 as specified in [I-D.ietf-ice-trickle], Section 4. 1684 o If WebRTC identity is being used, an "a=identity" line as 1685 described in [I-D.ietf-rtcweb-security-arch], Section 5. 1687 The next step is to generate m= sections, as specified in [RFC4566] 1688 Section 5.14. An m= section is generated for each RtpTransceiver 1689 that has been added to the PeerConnection, excluding any stopped 1690 RtpTransceivers. This is done in the order the RtpTransceivers were 1691 added to the PeerConnection. 1693 For each m= section generated for an RtpTransceiver, establish a 1694 mapping between the transceiver and the index of the generated m= 1695 section. 1697 Each m= section, provided it is not marked as bundle-only, MUST 1698 generate a unique set of ICE credentials and gather its own unique 1699 set of ICE candidates. Bundle-only m= sections MUST NOT contain any 1700 ICE credentials and MUST NOT gather any candidates. 1702 For DTLS, all m= sections MUST use all the certificate(s) that have 1703 been specified for the PeerConnection; as a result, they MUST all 1704 have the same [I-D.ietf-mmusic-4572-update] fingerprint value(s), or 1705 these value(s) MUST be session-level attributes. 1707 Each m= section should be generated as specified in [RFC4566], 1708 Section 5.14.
For the m= line itself, the following rules MUST be 1709 followed: 1711 o The port value is set to the port of the default ICE candidate for 1712 this m= section, but given that no candidates are available yet, 1713 the "dummy" port value of 9 (Discard) MUST be used, as indicated 1714 in [I-D.ietf-ice-trickle], Section 5.1. 1716 o To properly indicate use of DTLS, the <proto> field MUST be set to 1717 "UDP/TLS/RTP/SAVPF", as specified in [RFC5764], Section 8. 1719 o If codec preferences have been set for the associated transceiver, 1720 media formats MUST be generated in the corresponding order, and 1721 MUST exclude any codecs not present in the codec preferences. 1723 o The media formats in the answer MAY include codecs present in the 1724 offer that were discarded in a previous offer/answer exchange. 1725 This is necessary for compatibility with third-party call control 1726 and SIP use cases. 1728 o Unless excluded by the above restrictions, the media formats MUST 1729 include the mandatory audio/video codecs as specified in 1730 [I-D.ietf-rtcweb-audio] (see Section 3) and 1731 [I-D.ietf-rtcweb-video] (see Section 5). 1733 The m= line MUST be followed immediately by a "c=" line, as specified 1734 in [RFC4566], Section 5.7. Again, as no candidates are available 1735 yet, the "c=" line must contain the "dummy" value "IN IP4 0.0.0.0", 1736 as defined in [I-D.ietf-ice-trickle], Section 5.1. 1738 [I-D.ietf-mmusic-sdp-mux-attributes] groups SDP attributes into 1739 different categories. To avoid unnecessary duplication when 1740 bundling, Section 8.1 of [I-D.ietf-mmusic-sdp-bundle-negotiation] 1741 specifies that attributes of category IDENTICAL or TRANSPORT should 1742 not be repeated in bundled m= sections. 1744 The following attributes, which are of a category other than 1745 IDENTICAL or TRANSPORT, MUST be included in each m= section: 1747 o An "a=mid" line, as specified in [RFC5888], Section 4.
All MID 1748 values MUST be generated in a fashion that does not leak user 1749 information, e.g., randomly or using a per-PeerConnection counter, 1750 and SHOULD be 3 bytes or less, to allow them to efficiently fit 1751 into the RTP header extension defined in 1752 [I-D.ietf-mmusic-sdp-bundle-negotiation], Section 14. Note that 1753 this does not set the RtpTransceiver mid property, as that only 1754 occurs when the description is applied. The generated MID value 1755 can be considered a "proposed" MID at this point. 1757 o A direction attribute which is the same as that of the associated 1758 transceiver. 1760 o For each media format on the m= line, "a=rtpmap" and "a=fmtp" 1761 lines, as specified in [RFC4566], Section 6, and [RFC3264], 1762 Section 5.1. 1764 o For each primary codec where RTP retransmission should be used, a 1765 corresponding "a=rtpmap" line indicating "rtx" with the clock rate 1766 of the primary codec and an "a=fmtp" line that references the 1767 payload type of the primary codec, as specified in [RFC4588], 1768 Section 8.1. 1770 o For each supported FEC mechanism, "a=rtpmap" and "a=fmtp" lines, 1771 as specified in [RFC4566], Section 6. The FEC mechanisms that 1772 MUST be supported are specified in [I-D.ietf-rtcweb-fec], 1773 Section 6, and specific usage for each media type is outlined in 1774 Sections 4 and 5. 1776 o If this m= section is for media with configurable durations of 1777 media per packet, e.g., audio, an "a=maxptime" line, indicating 1778 the maximum amount of media, specified in milliseconds, that can 1779 be encapsulated in each packet, as specified in [RFC4566], 1780 Section 6. This value is set to the smallest of the maximum 1781 duration values across all the codecs included in the m= section. 1783 o If this m= section is for video media, and there are known 1784 limitations on the size of images which can be decoded, an 1785 "a=imageattr" line, as specified in Section 3.6. 
1787 o For each supported RTP header extension, an "a=extmap" line, as 1788 specified in [RFC5285], Section 5. The list of header extensions 1789 that SHOULD/MUST be supported is specified in 1790 [I-D.ietf-rtcweb-rtp-usage], Section 5.2. Any header extensions 1791 that require encryption MUST be specified as indicated in 1792 [RFC6904], Section 4. 1794 o For each supported RTCP feedback mechanism, an "a=rtcp-fb" 1795 line, as specified in [RFC4585], Section 4.2. The list of 1796 RTCP feedback mechanisms that SHOULD/MUST be supported is 1797 specified in [I-D.ietf-rtcweb-rtp-usage], Section 5.1. 1799 o If the RtpTransceiver has a sendrecv or sendonly direction: 1801 * For each MediaStream that was associated with the transceiver 1802 when it was created via addTrack or addTransceiver, an "a=msid" 1803 line, as specified in [I-D.ietf-mmusic-msid], Section 2. If a 1804 MediaStreamTrack is attached to the transceiver's RtpSender, 1805 the "a=msid" lines MUST use that track's ID. If no 1806 MediaStreamTrack is attached, a valid ID MUST be generated, in 1807 the same way that the implementation generates IDs for local 1808 tracks. 1810 * If no MediaStream is associated with the transceiver, a single 1811 "a=msid" line with the special value "-" in place of the 1812 MediaStream ID, as specified in [I-D.ietf-mmusic-msid], 1813 Section 3. The track ID MUST be selected as described above. 1815 o If the RtpTransceiver has a sendrecv or sendonly direction, and 1816 the application has specified RID values or has specified more 1817 than one encoding in the RtpSender's parameters, an "a=rid" line 1818 for each encoding specified. The "a=rid" line is specified in 1819 [I-D.ietf-mmusic-rid], and its direction MUST be "send". If the 1820 application has chosen a RID value, it MUST be used as the rid- 1821 identifier; otherwise a RID value MUST be generated by the 1822 implementation.
RID values MUST be generated in a fashion that 1823 does not leak user information, e.g., randomly or using a per- 1824 PeerConnection counter, and SHOULD be 3 bytes or less, to allow 1825 them to efficiently fit into the RTP header extension defined in 1826 [I-D.ietf-avtext-rid], Section 3. If no encodings have been 1827 specified, or only one encoding is specified but without a RID 1828 value, then no "a=rid" lines are generated. 1830 o If the RtpTransceiver has a sendrecv or sendonly direction and 1831 more than one "a=rid" line has been generated, an "a=simulcast" 1832 line, with direction "send", as defined in 1833 [I-D.ietf-mmusic-sdp-simulcast], Section 6.2. The list of RIDs 1834 MUST include all of the RID identifiers used in the "a=rid" lines 1835 for this m= section. 1837 o If the bundle policy for this PeerConnection is set to "max- 1838 bundle", and this is not the first m= section, or the bundle 1839 policy is set to "balanced", and this is not the first m= section 1840 for this media type, an "a=bundle-only" line. 1842 The following attributes, which are of category IDENTICAL or 1843 TRANSPORT, MUST appear only in "m=" sections which either have a 1844 unique address or which are associated with the bundle-tag. (In 1845 initial offers, this means those "m=" sections which do not contain 1846 an "a=bundle-only" attribute.) 1847 o "a=ice-ufrag" and "a=ice-pwd" lines, as specified in [RFC5245], 1848 Section 15.4. 1850 o An "a=fingerprint" line for each of the endpoint's certificates, 1851 as specified in [RFC4572], Section 5; the digest algorithm used 1852 for the fingerprint MUST match that used in the certificate 1853 signature. 1855 o An "a=setup" line, as specified in [RFC4145], Section 4, and 1856 clarified for use in DTLS-SRTP scenarios in [RFC5763], Section 5. 1857 The role value in the offer MUST be "actpass". 1859 o An "a=dtls-id" line, as specified in [I-D.ietf-mmusic-dtls-sdp] 1860 Section 5.2. 
1862 o An "a=rtcp" line, as specified in [RFC3605], Section 2.1, 1863 containing the dummy value "9 IN IP4 0.0.0.0", because no 1864 candidates have yet been gathered. 1866 o An "a=rtcp-mux" line, as specified in [RFC5761], Section 5.1.3. 1868 o If the RTP/RTCP multiplexing policy is "require", an "a=rtcp-mux- 1869 only" line, as specified in [I-D.ietf-mmusic-mux-exclusive], 1870 Section 4. 1872 o An "a=rtcp-rsize" line, as specified in [RFC5506], Section 5. 1874 Lastly, if a data channel has been created, a m= section MUST be 1875 generated for data. The <media> field MUST be set to "application" 1876 and the <proto> field MUST be set to "UDP/DTLS/SCTP" 1877 [I-D.ietf-mmusic-sctp-sdp]. The "fmt" value MUST be set to "webrtc- 1878 datachannel" as specified in [I-D.ietf-mmusic-sctp-sdp], Section 4.1. 1880 Within the data m= section, an "a=mid" line MUST be generated and 1881 included as described above, along with an "a=sctp-port" line 1882 referencing the SCTP port number, as defined in 1883 [I-D.ietf-mmusic-sctp-sdp], Section 5.1, and, if appropriate, an 1884 "a=max-message-size" line, as defined in [I-D.ietf-mmusic-sctp-sdp], 1885 Section 6.1. 1887 As discussed above, the following attributes of category IDENTICAL or 1888 TRANSPORT are included only if the data m= section either has a 1889 unique address or is associated with the bundle-tag (e.g., if it is 1890 the only m= section): 1892 o "a=ice-ufrag" 1894 o "a=ice-pwd" 1895 o "a=fingerprint" 1897 o "a=setup" 1899 o "a=dtls-id" 1901 Once all m= sections have been generated, a session-level "a=group" 1902 attribute MUST be added as specified in [RFC5888]. This attribute 1903 MUST have semantics "BUNDLE", and MUST include the mid identifiers of 1904 each m= section. The effect of this is that the JSEP implementation 1905 offers all m= sections as one bundle group. However, whether the m= 1906 sections are bundle-only or not depends on the bundle policy.
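The bundle rules for initial offers given above (one BUNDLE group naming every mid, with "a=bundle-only" decided by the bundle policy) can be illustrated with a minimal Python sketch. This is not part of the specification. The function name and the (mid, media type) input shape are hypothetical, and treating any policy other than "max-bundle" or "balanced" as never producing "a=bundle-only" is an assumption of this sketch.

```python
def plan_initial_offer_bundle(sections, bundle_policy):
    """Illustrative only.

    sections: list of (mid, media_type) tuples in m= section order.
    Returns the session-level group line and, per m= section, whether
    an "a=bundle-only" line would be generated.
    """
    seen_media_types = set()
    bundle_only = []
    for index, (mid, media_type) in enumerate(sections):
        if bundle_policy == "max-bundle":
            # Every m= section except the first is bundle-only.
            flag = index > 0
        elif bundle_policy == "balanced":
            # Every m= section except the first of its media type.
            flag = media_type in seen_media_types
        else:
            # Assumption of this sketch: other policies never mark
            # an m= section as bundle-only.
            flag = False
        seen_media_types.add(media_type)
        bundle_only.append((mid, flag))
    # All mids are offered as a single bundle group.
    group_line = "a=group:BUNDLE " + " ".join(mid for mid, _ in sections)
    return group_line, bundle_only
```

With one audio and two video m= sections, the "balanced" policy marks only the second video section as bundle-only, while "max-bundle" marks everything after the first m= section.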
1908 The next step is to generate session-level lip sync groups as defined 1909 in [RFC5888], Section 7. For each MediaStream referenced by more 1910 than one RtpTransceiver (by passing those MediaStreams as arguments 1911 to the addTrack and addTransceiver methods), a group of type "LS" 1912 MUST be added that contains the mid values for each RtpTransceiver. 1914 Attributes which SDP permits to either be at the session level or the 1915 media level SHOULD generally be at the media level even if they are 1916 identical. This promotes readability, especially if one of a set of 1917 initially identical attributes is subsequently changed. 1919 Attributes other than the ones specified above MAY be included, 1920 except for the following attributes which are specifically 1921 incompatible with the requirements of [I-D.ietf-rtcweb-rtp-usage], 1922 and MUST NOT be included: 1924 o "a=crypto" 1926 o "a=key-mgmt" 1928 o "a=ice-lite" 1930 Note that when bundle is used, any additional attributes that are 1931 added MUST follow the advice in [I-D.ietf-mmusic-sdp-mux-attributes] 1932 on how those attributes interact with bundle. 1934 Note that these requirements are in some cases stricter than those of 1935 SDP. Implementations MUST be prepared to accept compliant SDP even 1936 if it would not conform to the requirements for generating SDP in 1937 this specification. 1939 5.2.2. Subsequent Offers 1941 When createOffer is called a second (or later) time, or is called 1942 after a local description has already been installed, the processing 1943 is somewhat different than for an initial offer. 
1945 If the initial offer was not applied using setLocalDescription, 1946 meaning the PeerConnection is still in the "stable" state, the steps 1947 for generating an initial offer should be followed, subject to the 1948 following restriction: 1950 o The fields of the "o=" line MUST stay the same except for the 1951 <session-version> field, which MUST increment by one on each call 1952 to createOffer if the offer might differ from the output of the 1953 previous call to createOffer; implementations MAY opt to increment 1954 <session-version> on every call. The value of the generated 1955 <session-version> is independent of the <session-version> of the 1956 current local description; in particular, in the case where the 1957 current version is N, an offer is created and applied with version 1958 N+1, and then that offer is rolled back so that the current 1959 version is again N, the next generated offer will still have 1960 version N+2. 1962 Note that if the application creates an offer by reading 1963 currentLocalDescription instead of calling createOffer, the returned 1964 SDP may be different than when setLocalDescription was originally 1965 called, due to the addition of gathered ICE candidates, but the 1966 <session-version> will not have changed. There are no known 1967 scenarios in which this causes problems, but if this is a concern, 1968 the solution is simply to use createOffer to ensure a unique 1969 <session-version>. 1971 If the initial offer was applied using setLocalDescription, but an 1972 answer from the remote side has not yet been applied, meaning the 1973 PeerConnection is still in the "local-offer" state, an offer is 1974 generated by following the steps in the "stable" state above, along 1975 with these exceptions: 1977 o The "s=" and "t=" lines MUST stay the same.
1979 o If any RtpTransceiver has been added, and there exists an m= 1980 section with a zero port in the current local description or the 1981 current remote description, that m= section MUST be recycled by 1982 generating an m= section for the added RtpTransceiver as if the m= 1983 section were being added to the session description, placed at the 1984 same index as the m= section with a zero port. 1986 o If an RtpTransceiver is stopped and is not associated with an m= 1987 section, an m= section MUST NOT be generated for it. This 1988 prevents adding back RtpTransceivers whose m= sections were 1989 recycled and used for a new RtpTransceiver in a previous offer/ 1990 answer exchange, as described above. 1992 o If an RtpTransceiver has been stopped and is associated with an m= 1993 section, and the m= section is not being recycled as described 1994 above, an m= section MUST be generated for it with the port set to 1995 zero and all "a=msid" lines removed. 1997 o For RtpTransceivers that are not stopped, the "a=msid" line(s) 1998 MUST stay the same if they are present in the current description, 1999 regardless of changes to the transceiver's direction or track. If 2000 no "a=msid" line is present in the current description, "a=msid" 2001 line(s) MUST be generated according to the same rules as for an 2002 initial offer. 2004 o Each "m=" and "c=" line MUST be filled in with the port, protocol, 2005 and address of the default candidate for the m= section, as 2006 described in [RFC5245], Section 4.3. If ICE checking has already 2007 completed for one or more candidate pairs and a candidate pair is 2008 in active use, then that pair MUST be used, even if ICE has not 2009 yet completed. Note that this differs from the guidance in 2010 [RFC5245], Section 9.1.2.2, which only refers to offers created 2011 when ICE has completed. In each case, if no RTP candidates have 2012 yet been gathered, dummy values MUST be used, as described above.
2014 o Each "a=mid" line MUST stay the same. 2016 o Each "a=ice-ufrag" and "a=ice-pwd" line MUST stay the same, unless 2017 the ICE configuration has changed (either changes to the supported 2018 STUN/TURN servers, or the ICE candidate policy), or the 2019 "IceRestart" option (Section 5.2.3.1) was specified. If the m= 2020 section is bundled into another m= section, it still MUST NOT 2021 contain any ICE credentials. 2023 o If the m= section is not bundled into another m= section, its 2024 "a=rtcp" attribute line MUST be filled in with the port and 2025 address of the default RTCP candidate, as indicated in [RFC5761], 2026 Section 5.1.3. If no RTCP candidates have yet been gathered, 2027 dummy values MUST be used, as described in the initial offer 2028 section above. 2030 o If the m= section is not bundled into another m= section, for each 2031 candidate that has been gathered during the most recent gathering 2032 phase (see Section 3.5.1), an "a=candidate" line MUST be added, as 2033 defined in [RFC5245], Section 4.3, paragraph 3. If candidate 2034 gathering for the section has completed, an "a=end-of-candidates" 2035 attribute MUST be added, as described in [I-D.ietf-ice-trickle], 2036 Section 9.3. If the m= section is bundled into another m= 2037 section, both "a=candidate" and "a=end-of-candidates" MUST be 2038 omitted. 2040 o For RtpTransceivers that are still present, the "a=rid" lines MUST 2041 stay the same. 2043 o For RtpTransceivers that are still present, any "a=simulcast" line 2044 MUST stay the same. 2046 o If any RtpTransceiver has been stopped, the port MUST be set to 2047 zero and all "a=msid" lines MUST be removed.
2049 o If any RtpTransceiver has been added, and there exists a m= 2050 section with a zero port in the current local description or the 2051 current remote description, that m= section MUST be recycled by 2052 generating a m= section for the added RtpTransceiver as if the m= 2053 section were being added to the session description, except that 2054 instead of adding it, the generated m= section replaces the m= 2055 section with a zero port. The new m= section MUST contain a new 2056 MID. 2058 If the initial offer was applied using setLocalDescription, and an 2059 answer from the remote side has been applied using 2060 setRemoteDescription, meaning the PeerConnection is in the "remote- 2061 pranswer" or "stable" states, an offer is generated based on the 2062 negotiated session descriptions by following the steps mentioned for 2063 the "local-offer" state above. 2065 In addition, for each non-recycled, non-rejected m= section in the 2066 new offer, the following adjustments are made based on the contents 2067 of the corresponding m= section in the current remote description, if 2068 any: 2070 o The m= line and corresponding "a=rtpmap" and "a=fmtp" lines MUST 2071 only include codecs present in the most recent answer which have 2072 not been excluded by the codec preferences of the associated 2073 transceiver. Note that non-JSEP endpoints are not subject to 2074 these restrictions, and might offer media formats that were not 2075 present in the most recent answer, as specified in [RFC3264], 2076 Section 8. Therefore, JSEP implementations MUST be prepared to 2077 receive such offers. 2079 o Unless codec preferences have been set for the associated 2080 transceiver, the media formats on the m= line MUST be generated in 2081 the same order as in the current local description. 2083 o The RTP header extensions MUST only include those that are present 2084 in the most recent answer.
2086 o The RTCP feedback extensions MUST only include those that are 2087 present in the most recent answer. 2089 o The "a=rtcp" line MUST only be added if the most recent answer did 2090 not include an "a=rtcp-mux" line. 2092 o The "a=rtcp-mux" line MUST only be added if present in the most 2093 recent answer. 2095 o The "a=rtcp-mux-only" line MUST NOT be added. 2097 o The "a=rtcp-rsize" line MUST only be added if present in the most 2098 recent answer. 2100 o An "a=bundle-only" line MUST NOT be added, as indicated in 2101 [I-D.ietf-mmusic-sdp-bundle-negotiation], Section 6. Instead, 2102 JSEP implementations MUST simply omit parameters in the IDENTICAL 2103 and TRANSPORT categories for bundled m= sections, as described in 2104 [I-D.ietf-mmusic-sdp-bundle-negotiation], Section 8.1. 2106 o Note that if media m= sections are bundled into a data m= section, 2107 then certain TRANSPORT and IDENTICAL attributes may appear in the 2108 data m= section even if they would otherwise only be appropriate 2109 for a media m= section (e.g., "a=rtcp-mux"). This cannot happen 2110 in initial offers because in the initial offer JSEP 2111 implementations always list media m= sections (if any) before the 2112 data m= section (if any), and at least one of those media m= 2113 sections will not have the "a=bundle-only" attribute. Therefore, 2114 in initial offers, any "a=bundle-only" m= sections will be bundled 2115 into a preceding non-bundle-only media m= section. 2117 The "a=group:BUNDLE" attribute MUST include the MID identifiers 2118 specified in the bundle group in the most recent answer, minus any m= 2119 sections that have been marked as rejected, plus any newly added or 2120 re-enabled m= sections. In other words, the bundle attribute must 2121 contain all m= sections that were previously bundled, as long as they 2122 are still alive, as well as any new m= sections. 
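The rule above for the "a=group:BUNDLE" attribute in subsequent offers (previously bundled mids, minus rejected m= sections, plus newly added or re-enabled ones) can be sketched in Python as follows. The function name and argument shapes are hypothetical, and the ordering of mids within the group (previous members first, then additions) is an assumption of this sketch rather than something the text above prescribes.

```python
def subsequent_bundle_group(previous_mids, rejected_mids, added_mids):
    """Illustrative only.

    previous_mids: mids in the bundle group of the most recent answer.
    rejected_mids: mids of m= sections now marked as rejected.
    added_mids:    mids of newly added or re-enabled m= sections.
    """
    rejected = set(rejected_mids)
    # Keep every previously bundled m= section that is still alive.
    kept = [mid for mid in previous_mids if mid not in rejected]
    # Append new or re-enabled m= sections that are not already listed.
    extra = [mid for mid in added_mids
             if mid not in kept and mid not in rejected]
    return "a=group:BUNDLE " + " ".join(kept + extra)
```

For example, if the most recent answer bundled a1, v1 and d1, the v1 section has since been rejected, and a new section v2 has been added, the sketch yields a group containing a1, d1 and v2.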
2124 "a=group:LS" attributes are generated in the same way as for initial 2125 offers, with the additional stipulation that any lip sync groups that 2126 were present in the most recent answer MUST continue to exist and 2127 MUST contain any previously existing MID identifiers, as long as the 2128 identified m= sections still exist and are not rejected, and the 2129 group still contains at least two MID identifiers. This ensures that 2130 any synchronized "recvonly" m= sections continue to be synchronized 2131 in the new offer. 2133 5.2.3. Options Handling 2135 The createOffer method takes as a parameter an RTCOfferOptions 2136 object. Special processing is performed when generating a SDP 2137 description if the following options are present. 2139 5.2.3.1. IceRestart 2141 If the "IceRestart" option is specified, with a value of "true", the 2142 offer MUST indicate an ICE restart by generating new ICE ufrag and 2143 pwd attributes, as specified in [RFC5245], Section 9.1.1.1. If this 2144 option is specified on an initial offer, it has no effect (since a 2145 new ICE ufrag and pwd are already generated). Similarly, if the ICE 2146 configuration has changed, this option has no effect, since new ufrag 2147 and pwd attributes will be generated automatically. This option is 2148 primarily useful for reestablishing connectivity in cases where 2149 failures are detected by the application. 2151 5.2.3.2. VoiceActivityDetection 2153 If the "VoiceActivityDetection" option is specified, with a value of 2154 "true", the offer MUST indicate support for silence suppression in 2155 the audio it receives by including comfort noise ("CN") codecs for 2156 each offered audio codec, as specified in [RFC3389], Section 5.1, 2157 except for codecs that have their own internal silence suppression 2158 support. 
For codecs that have their own internal silence suppression 2159 support, the appropriate fmtp parameters for that codec MUST be 2160 specified to indicate that silence suppression for received audio is 2161 desired. For example, when using the Opus codec [RFC6716], the 2162 "usedtx=1" parameter, specified in [RFC7587], would be used in the 2163 offer. This option allows the endpoint to significantly reduce the 2164 amount of audio bandwidth it receives, at the cost of some fidelity, 2165 depending on the quality of the remote VAD algorithm. 2167 If the "VoiceActivityDetection" option is specified, with a value of 2168 "false", the JSEP implementation MUST NOT emit "CN" codecs. For 2169 codecs that have their own internal silence suppression support, the 2170 appropriate fmtp parameters for that codec MUST be specified to 2171 indicate that silence suppression for received audio is not desired. 2172 For example, when using the Opus codec, the "usedtx=0" parameter 2173 would be specified in the offer. 2175 Note that setting the "VoiceActivityDetection" parameter when 2176 generating an offer is a request to receive audio with silence 2177 suppression. It has no impact on whether the local endpoint does 2178 silence suppression for the audio it sends. 2180 The "VoiceActivityDetection" option does not have any impact on the 2181 setting of the "vad" value in the signaling of the client to mixer 2182 audio level header extension described in [RFC6464], Section 4. 2184 5.3. Generating an Answer 2186 When createAnswer is called, a new SDP description must be created 2187 that is compatible with the supplied remote description as well as 2188 the requirements specified in [I-D.ietf-rtcweb-rtp-usage]. The exact 2189 details of this process are explained below. 2191 5.3.1. Initial Answers 2193 When createAnswer is called for the first time after a remote 2194 description has been provided, the result is known as the initial 2195 answer. 
If no remote description has been installed, an answer 2196 cannot be generated, and an error MUST be returned. 2198 Note that the remote description SDP may not have been created by a 2199 JSEP endpoint and may not conform to all the requirements listed in 2200 Section 5.2. For many cases, this is not a problem. However, if any 2201 mandatory SDP attributes are missing, or functionality listed as 2202 mandatory-to-use above is not present, this MUST be treated as an 2203 error, and MUST cause the affected m= sections to be marked as 2204 rejected. 2206 The first step in generating an initial answer is to generate 2207 session-level attributes. The process here is identical to that 2208 indicated in the initial offers section above, except that the 2209 "a=ice-options" line, with the "trickle" option as specified in 2210 [I-D.ietf-ice-trickle], Section 4, is only included if such an option 2211 was present in the offer. 2213 The next step is to generate session-level lip sync groups, as 2214 defined in [RFC5888], Section 7. For each group of type "LS" present 2215 in the offer, select the local RtpTransceivers that are referenced by 2216 the MID values in the specified group, and determine which of them 2217 either reference a common local MediaStream (specified in the calls 2218 to addTrack/addTransceiver used to create them), or have no 2219 MediaStream to reference because they were not created by addTrack/ 2220 addTransceiver. If at least two such RtpTransceivers exist, a group 2221 of type "LS" with the mid values of these RtpTransceivers MUST be 2222 added. Otherwise the offered "LS" group MUST be ignored and no 2223 corresponding group generated in the answer. 2225 As a simple example, consider the following offer of a single audio 2226 and single video track contained in the same MediaStream. SDP lines 2227 not relevant to this example have been removed for clarity. 
As 2228 explained in Section 5.2, a group of type "LS" has been added that 2229 references each track's RtpTransceiver. 2231 a=group:LS a1 v1 2232 m=audio 10000 UDP/TLS/RTP/SAVPF 0 2233 a=mid:a1 2234 a=msid:ms1 mst1a 2235 m=video 10001 UDP/TLS/RTP/SAVPF 96 2236 a=mid:v1 2237 a=msid:ms1 mst1v 2239 If the answerer uses a single MediaStream when it adds its tracks, 2240 both of its transceivers will reference this stream, and so the 2241 subsequent answer will contain a "LS" group identical to that in the 2242 offer, as shown below: 2244 a=group:LS a1 v1 2245 m=audio 20000 UDP/TLS/RTP/SAVPF 0 2246 a=mid:a1 2247 a=msid:ms2 mst2a 2248 m=video 20001 UDP/TLS/RTP/SAVPF 96 2249 a=mid:v1 2250 a=msid:ms2 mst2v 2252 However, if the answerer groups its tracks into separate 2253 MediaStreams, its transceivers will reference different streams, and 2254 so the subsequent answer will not contain a "LS" group. 2256 m=audio 20000 UDP/TLS/RTP/SAVPF 0 2257 a=mid:a1 2258 a=msid:ms2a mst2a 2259 m=video 20001 UDP/TLS/RTP/SAVPF 96 2260 a=mid:v1 2261 a=msid:ms2b mst2v 2263 Finally, if the answerer does not add any tracks, its transceivers 2264 will not reference any MediaStreams, causing the preferences of the 2265 offerer to be maintained, and so the subsequent answer will contain 2266 an identical "LS" group. 2268 a=group:LS a1 v1 2269 m=audio 20000 UDP/TLS/RTP/SAVPF 0 2270 a=mid:a1 2271 a=recvonly 2272 m=video 20001 UDP/TLS/RTP/SAVPF 96 2273 a=mid:v1 2274 a=recvonly 2276 The Section 7.2 example later in this document shows a more involved 2277 case of "LS" group generation. 2279 The next step is to generate m= sections for each m= section that is 2280 present in the remote offer, as specified in [RFC3264], Section 6. 2281 For the purposes of this discussion, any session-level attributes in 2282 the offer that are also valid as media-level attributes are 2283 considered to be present in each m= section. 2285 The next step is to go through each offered m= section. 
Each offered 2286 m= section will have an associated RtpTransceiver, as described in 2287 Section 5.9. If there are more RtpTransceivers than there are m= 2288 sections, the unmatched RtpTransceivers will need to be associated in 2289 a subsequent offer. 2291 For each offered m= section, if any of the following conditions are 2292 true, the corresponding m= section in the answer MUST be marked as 2293 rejected by setting the port in the m= line to zero, as indicated in 2294 [RFC3264], Section 6, and further processing for this m= section can 2295 be skipped: 2297 o The associated RtpTransceiver has been stopped. 2299 o No supported codec is present in the offer. 2301 o The bundle policy is "max-bundle", and this is not the first m= 2302 section or in the same bundle group as the first m= section. 2304 o The bundle policy is "balanced", and this is not the first m= 2305 section for this media type or in the same bundle group as the 2306 first m= section for this media type. 2308 Otherwise, each m= section in the answer should then be generated as 2309 specified in [RFC3264], Section 6.1. For the m= line itself, the 2310 following rules must be followed: 2312 o The port value would normally be set to the port of the default 2313 ICE candidate for this m= section, but given that no candidates 2314 are available yet, the "dummy" port value of 9 (Discard) MUST be 2315 used, as indicated in [I-D.ietf-ice-trickle], Section 5.1. 2317 o The <proto> field MUST be set to exactly match the <proto> field 2318 for the corresponding m= line in the offer. 2320 o If codec preferences have been set for the associated transceiver, 2321 media formats MUST be generated in the corresponding order, and 2322 MUST exclude any codecs not present in the codec preferences or 2323 not present in the offer. Note that non-JSEP endpoints are not 2324 subject to this restriction, and might add media formats in the 2325 answer that are not present in the offer, as specified in 2326 [RFC3264], Section 6.1.
Therefore, JSEP implementations MUST be 2327 prepared to receive such answers. 2329 o Unless excluded by the above restrictions, the media formats MUST 2330 include the mandatory audio/video codecs as specified in 2331 [I-D.ietf-rtcweb-audio] (see Section 3) and 2332 [I-D.ietf-rtcweb-video] (see Section 5). 2334 The m= line MUST be followed immediately by a "c=" line, as specified 2335 in [RFC4566], Section 5.7. Again, as no candidates are available 2336 yet, the "c=" line must contain the "dummy" value "IN IP4 0.0.0.0", 2337 as defined in [I-D.ietf-ice-trickle], Section 5.1. 2339 If the offer supports bundle, all m= sections to be bundled must use 2340 the same ICE credentials and candidates; all m= sections not being 2341 bundled must use unique ICE credentials and candidates. Each m= 2342 section MUST contain the following attributes (which are of attribute 2343 types other than IDENTICAL and TRANSPORT): 2345 o If and only if present in the offer, an "a=mid" line, as specified 2346 in [RFC5888], Section 9.1. The "mid" value MUST match that 2347 specified in the offer. 2349 o A direction attribute, determined by applying the rules regarding 2350 the offered direction specified in [RFC3264], Section 6.1, and 2351 then intersecting with the direction of the associated 2352 RtpTransceiver. For example, in the case where an m= section is 2353 offered as "sendonly", and the local transceiver is set to 2354 "sendrecv", the result in the answer is a "recvonly" direction. 2356 o For each media format on the m= line, "a=rtpmap" and "a=fmtp" 2357 lines, as specified in [RFC4566], Section 6, and [RFC3264], 2358 Section 6.1. 2360 o If "rtx" is present in the offer, for each primary codec where RTP 2361 retransmission should be used, a corresponding "a=rtpmap" line 2362 indicating "rtx" with the clock rate of the primary codec and an 2363 "a=fmtp" line that references the payload type of the primary 2364 codec, as specified in [RFC4588], Section 8.1. 
2366 o For each supported FEC mechanism, "a=rtpmap" and "a=fmtp" lines, 2367 as specified in [RFC4566], Section 6. The FEC mechanisms that 2368 MUST be supported are specified in [I-D.ietf-rtcweb-fec], 2369 Section 6, and specific usage for each media type is outlined in 2370 Sections 4 and 5. 2372 o If this m= section is for media with configurable durations of 2373 media per packet, e.g., audio, an "a=maxptime" line, as described 2374 in Section 5.2. 2376 o If this m= section is for video media, and there are known 2377 limitations on the size of images which can be decoded, an 2378 "a=imageattr" line, as specified in Section 3.6. 2380 o For each supported RTP header extension that is present in the 2381 offer, an "a=extmap" line, as specified in [RFC5285], Section 5. 2382 The list of header extensions that SHOULD/MUST be supported is 2383 specified in [I-D.ietf-rtcweb-rtp-usage], Section 5.2. Any header 2384 extensions that require encryption MUST be specified as indicated 2385 in [RFC6904], Section 4. 2387 o For each supported RTCP feedback mechanism that is present in the 2388 offer, an "a=rtcp-fb" mechanism, as specified in [RFC4585], 2389 Section 4.2. The list of RTCP feedback mechanisms that SHOULD/ 2390 MUST be supported is specified in [I-D.ietf-rtcweb-rtp-usage], 2391 Section 5.1. 2393 o If the RtpTransceiver has a sendrecv or sendonly direction: 2395 * For each MediaStream that was associated with the transceiver 2396 when it was created via addTrack or addTransceiver, an "a=msid" 2397 line, as specified in [I-D.ietf-mmusic-msid], Section 2. If a 2398 MediaStreamTrack is attached to the transceiver's RtpSender, 2399 the "a=msid" lines MUST use that track's ID. If no 2400 MediaStreamTrack is attached, a valid ID MUST be generated, in 2401 the same way that the implementation generates IDs for local 2402 tracks. 
2404 * If no MediaStream is associated with the transceiver, a single 2405 "a=msid" line with the special value "-" in place of the 2406 MediaStream ID, as specified in [I-D.ietf-mmusic-msid], 2407 Section 3. The track ID MUST be selected as described above. 2409 Each m= section which is not bundled into another m= section MUST 2410 contain the following attributes (which are of category IDENTICAL or 2411 TRANSPORT): 2413 o "a=ice-ufrag" and "a=ice-pwd" lines, as specified in [RFC5245], 2414 Section 15.4. 2416 o An "a=fingerprint" line for each of the endpoint's certificates, 2417 as specified in [RFC4572], Section 5; the digest algorithm used 2418 for the fingerprint MUST match that used in the certificate 2419 signature. 2421 o An "a=setup" line, as specified in [RFC4145], Section 4, and 2422 clarified for use in DTLS-SRTP scenarios in [RFC5763], Section 5. 2423 The role value in the answer MUST be "active" or "passive"; the 2424 "active" role is RECOMMENDED. 2426 o An "a=dtls-id" line, as specified in [I-D.ietf-mmusic-dtls-sdp] 2427 Section 5.3. 2429 o If present in the offer, an "a=rtcp-mux" line, as specified in 2430 [RFC5761], Section 5.1.3. Otherwise, an "a=rtcp" line, as 2431 specified in [RFC3605], Section 2.1, containing the dummy value "9 2432 IN IP4 0.0.0.0" (because no candidates have yet been gathered). 2434 o If present in the offer, an "a=rtcp-rsize" line, as specified in 2435 [RFC5506], Section 5. 2437 If a data channel m= section has been offered, a m= section MUST also 2438 be generated for data. The <media> field MUST be set to 2439 "application" and the <proto> and <fmt> fields MUST be set to exactly 2440 match the fields in the offer.
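As a rough illustration of the data m= line rule just stated, the following Python sketch derives the answer's data m= line from the offered one. The function name is hypothetical. The sketch assumes an initial answer generated before any candidates have been gathered, so it uses the dummy port value 9 described earlier for initial answers, and it does not model rejection of the m= section.

```python
def answer_data_m_line(offer_m_line: str) -> str:
    """Illustrative only: build the answer's data m= line from the offer's."""
    if not offer_m_line.startswith("m="):
        raise ValueError("not an m= line")
    parts = offer_m_line[2:].split()
    if len(parts) < 4 or parts[0] != "application":
        raise ValueError("not a data m= line")
    media, _offered_port, proto = parts[0], parts[1], parts[2]
    fmt = " ".join(parts[3:])
    # The media, proto and fmt fields mirror the offer; the port is the
    # dummy value 9 because no candidates are assumed to be available yet.
    return f"m={media} 9 {proto} {fmt}"
```

Given an offered line such as "m=application 54321 UDP/DTLS/SCTP webrtc-datachannel", the sketch returns the same line with the port replaced by 9.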
2442 Within the data m= section, an "a=mid" line MUST be generated and 2443 included as described above, along with an "a=sctp-port" line 2444 referencing the SCTP port number, as defined in 2445 [I-D.ietf-mmusic-sctp-sdp], Section 5.1, and, if appropriate, an 2446 "a=max-message-size" line, as defined in [I-D.ietf-mmusic-sctp-sdp], 2447 Section 6.1. 2449 As discussed above, the following attributes of category IDENTICAL or 2450 TRANSPORT are included only if the data m= section is not bundled 2451 into another m= section: 2453 o "a=ice-ufrag" 2455 o "a=ice-pwd" 2456 o "a=fingerprint" 2458 o "a=setup" 2460 o "a=dtls-id" 2462 Note that if media m= sections are bundled into a data m= section, 2463 then certain TRANSPORT and IDENTICAL attributes may also appear in 2464 the data m= section even if they would otherwise only be appropriate 2465 for a media m= section (e.g., "a=rtcp-mux"). 2467 If "a=group" attributes with semantics of "BUNDLE" are offered, 2468 corresponding session-level "a=group" attributes MUST be added as 2469 specified in [RFC5888]. These attributes MUST have semantics 2470 "BUNDLE", and MUST include all the mid identifiers from the offered 2471 bundle groups that have not been rejected. Note that regardless of 2472 the presence of "a=bundle-only" in the offer, no m= sections in the 2473 answer should have an "a=bundle-only" line. 2475 Attributes that are common between all m= sections MAY be moved to 2476 session-level, if explicitly defined to be valid at session-level. 2478 The attributes prohibited in the creation of offers are also 2479 prohibited in the creation of answers. 2481 5.3.2. Subsequent Answers 2483 When createAnswer is called a second (or later) time, or is called 2484 after a local description has already been installed, the processing 2485 is somewhat different than for an initial answer.
2487 If the initial answer was not applied using setLocalDescription, 2488 meaning the PeerConnection is still in the "have-remote-offer" state, 2489 the steps for generating an initial answer should be followed, 2490 subject to the following restriction: 2492 o The fields of the "o=" line MUST stay the same except for the 2493 <session-version> field, which MUST increment if the session 2494 description changes in any way from the previously generated 2495 answer. 2497 If any session description was previously supplied to 2498 setLocalDescription, an answer is generated by following the steps in 2499 the "have-remote-offer" state above, along with these exceptions: 2501 o The "s=" and "t=" lines MUST stay the same. 2503 o Each "m=" and "c=" line MUST be filled in with the port and address 2504 of the default candidate for the m= section, as described in 2505 [RFC5245], Section 4.3. Note, however, that the m= line protocol 2506 need not match the default candidate, because this protocol value 2507 must instead match what was supplied in the offer, as described 2508 above. 2510 o Unless codec preferences have been set for the associated 2511 transceiver, the media formats on the m= line MUST be generated in 2512 the same order as in the current local description. 2514 o Each "a=ice-ufrag" and "a=ice-pwd" line MUST stay the same, unless 2515 the m= section is restarting, in which case new ICE credentials 2516 must be created as specified in [RFC5245], Section 9.2.1.1. If 2517 the m= section is bundled into another m= section, it still MUST 2518 NOT contain any ICE credentials. 2520 o Each "a=setup" line MUST use an "active" or "passive" role value 2521 consistent with the existing DTLS association, if the association 2522 is being continued by the offerer. 2524 o If the m= section is not bundled into another m= section and RTCP 2525 multiplexing is not active, an "a=rtcp" attribute line MUST be 2526 filled in with the port and address of the default RTCP candidate.
2527 If no RTCP candidates have yet been gathered, dummy values MUST be 2528 used, as described in the initial answer section above. 2530 o If the m= section is not bundled into another m= section, for each 2531 candidate that has been gathered during the most recent gathering 2532 phase (see Section 3.5.1), an "a=candidate" line MUST be added, as 2533 defined in [RFC5245], Section 4.3., paragraph 3. If candidate 2534 gathering for the section has completed, an "a=end-of-candidates" 2535 attribute MUST be added, as described in [I-D.ietf-ice-trickle], 2536 Section 9.3. If the m= section is bundled into another m= 2537 section, both "a=candidate" and "a=end-of-candidates" MUST be 2538 omitted. 2540 o For RtpTransceivers that are not stopped, the "a=msid" line(s) 2541 MUST stay the same, regardless of changes to the transceiver's 2542 direction or track. If no "a=msid" line is present in the current 2543 description, "a=msid" line(s) MUST be generated according to the 2544 same rules as for an initial answer. 2546 5.3.3. Options Handling 2548 The createAnswer method takes as a parameter an RTCAnswerOptions 2549 object. The set of parameters for RTCAnswerOptions is different than 2550 those supported in RTCOfferOptions; the IceRestart option is 2551 unnecessary, as ICE credentials will automatically be changed for all 2552 m= sections where the offerer chose to perform ICE restart. 2554 The following options are supported in RTCAnswerOptions. 2556 5.3.3.1. VoiceActivityDetection 2558 Silence suppression in the answer is handled as described in 2559 Section 5.2.3.2, with one exception: if support for silence 2560 suppression was not indicated in the offer, the 2561 VoiceActivityDetection parameter has no effect, and the answer should 2562 be generated as if VoiceActivityDetection was set to false. 
This is 2563 done on a per-codec basis (e.g., if the offerer somehow offered 2564 support for CN but set "usedtx=0" for Opus, setting 2565 VoiceActivityDetection to true would result in an answer with CN 2566 codecs and "usedtx=0"). 2568 5.4. Modifying an Offer or Answer 2570 The SDP returned from createOffer or createAnswer MUST NOT be changed 2571 before passing it to setLocalDescription. If precise control over 2572 the SDP is needed, the aforementioned createOffer/createAnswer 2573 options or RtpTransceiver APIs MUST be used. 2575 Note that the application MAY modify the SDP to reduce the 2576 capabilities in the offer it sends to the far side (post- 2577 setLocalDescription) or the offer that it installs from the far side 2578 (pre-setRemoteDescription), as long as it remains a valid SDP offer 2579 and specifies a subset of what was in the original offer. This is 2580 safe because the answer is not permitted to expand capabilities, and 2581 therefore will just respond to what is present in the offer. 2583 The application SHOULD NOT modify the SDP in the answer it transmits, 2584 as the answer contains the negotiated capabilities, and this can 2585 cause the two sides to have different ideas about what exactly was 2586 negotiated. 2588 As always, the application is solely responsible for what it sends to 2589 the other party, and all incoming SDP will be processed by the JSEP 2590 implementation to the extent of its capabilities. It is an error to 2591 assume that all SDP is well-formed; however, one should be able to 2592 assume that any implementation of this specification will be able to 2593 process, as a remote offer or answer, unmodified SDP coming from any 2594 other implementation of this specification. 2596 5.5. 
Processing a Local Description 2598 When a SessionDescription is supplied to setLocalDescription, the 2599 following steps MUST be performed: 2601 o First, the type of the SessionDescription is checked against the 2602 current state of the PeerConnection: 2604 * If the type is "offer", the PeerConnection state MUST be either 2605 "stable" or "have-local-offer". 2607 * If the type is "pranswer" or "answer", the PeerConnection state 2608 MUST be either "have-remote-offer" or "have-local-pranswer". 2610 o If the type is not correct for the current state, processing MUST 2611 stop and an error MUST be returned. 2613 o The SessionDescription is then checked to ensure that its contents 2614 are identical to those generated in the last call to createOffer/ 2615 createAnswer, and thus have not been altered, as discussed in 2616 Section 5.4; otherwise, processing MUST stop and an error MUST be 2617 returned. 2619 o Next, the SessionDescription is parsed into a data structure, as 2620 described in Section 5.7 below. If parsing fails for 2621 any reason, processing MUST stop and an error MUST be returned. 2623 o Finally, the parsed SessionDescription is applied as described in 2624 Section 5.8 below. 2626 5.6. Processing a Remote Description 2628 When a SessionDescription is supplied to setRemoteDescription, the 2629 following steps MUST be performed: 2631 o First, the type of the SessionDescription is checked against the 2632 current state of the PeerConnection: 2634 * If the type is "offer", the PeerConnection state MUST be either 2635 "stable" or "have-remote-offer". 2637 * If the type is "pranswer" or "answer", the PeerConnection state 2638 MUST be either "have-local-offer" or "have-remote-pranswer". 2640 o If the type is not correct for the current state, processing MUST 2641 stop and an error MUST be returned. 2643 o Next, the SessionDescription is parsed into a data structure, as 2644 described in Section 5.7 below.
If parsing fails for 2645 any reason, processing MUST stop and an error MUST be returned. 2647 o Finally, the parsed SessionDescription is applied as described in 2648 Section 5.9 below. 2650 5.7. Parsing a Session Description 2652 When a SessionDescription of any type is supplied to setLocal/ 2653 RemoteDescription, the implementation must parse it and reject it if 2654 it is invalid. The exact details of this process are explained 2655 below. 2657 The SDP contained in the session description object consists of a 2658 sequence of text lines, each containing a key-value expression, as 2659 described in [RFC4566], Section 5. The SDP is read, line-by-line, 2660 and converted to a data structure that contains the deserialized 2661 information. However, SDP allows many types of lines, not all of 2662 which are relevant to JSEP applications. For each line, the 2663 implementation will first ensure it is syntactically correct 2664 according to its defining ABNF, check that it conforms to [RFC4566] 2665 and [RFC3264] semantics, and then either parse and store or discard 2666 the provided value, as described below. 2668 If any line is not well-formed, or cannot be parsed as described, the 2669 parser MUST stop with an error and reject the session description, 2670 even if the value is to be discarded. This ensures that 2671 implementations do not accidentally misinterpret ambiguous SDP. 2673 5.7.1. Session-Level Parsing 2675 First, the session-level lines are checked and parsed. These lines 2676 MUST occur in a specific order, and with a specific syntax, as 2677 defined in [RFC4566], Section 5. Note that while the specific line 2678 types (e.g. "v=", "c=") MUST occur in the defined order, lines of the 2679 same type (typically "a=") can occur in any order, and their ordering 2680 is not meaningful. 2682 The following non-attribute lines are not meaningful in the JSEP 2683 context and MAY be discarded once they have been checked.
2685 The "c=" line MUST be checked for syntax but its value is not 2686 used. This supersedes the guidance in [RFC5245], Section 6.1, to 2687 use "ice-mismatch" to indicate mismatches between "c=" and the 2688 candidate lines; because JSEP always uses ICE, "ice-mismatch" is 2689 not useful in this context. 2691 The "i=", "u=", "e=", "p=", "t=", "r=", "z=", and "k=" lines are 2692 not used by this specification; they MUST be checked for syntax 2693 but their values are not used. 2695 The remaining non-attribute lines are processed as follows: 2697 The "v=" line MUST have a version of 0, as specified in [RFC4566], 2698 Section 5.1. 2700 The "o=" line MUST be parsed as specified in [RFC4566], 2701 Section 5.2. 2703 The "b=" line, if present, MUST be parsed as specified in 2704 [RFC4566], Section 5.8, and the bwtype and bandwidth values 2705 stored. 2707 Finally, the attribute lines are processed. Specific processing MUST 2708 be applied for the following session-level attribute ("a=") lines: 2710 o Any "a=group" lines are parsed as specified in [RFC5888], 2711 Section 5, and the group's semantics and mids are stored. 2713 o If present, a single "a=ice-lite" line is parsed as specified in 2714 [RFC5245], Section 15.3, and a value indicating the presence of 2715 ice-lite is stored. 2717 o If present, a single "a=ice-ufrag" line is parsed as specified in 2718 [RFC5245], Section 15.4, and the ufrag value is stored. 2720 o If present, a single "a=ice-pwd" line is parsed as specified in 2721 [RFC5245], Section 15.4, and the password value is stored. 2723 o If present, a single "a=ice-options" line is parsed as specified 2724 in [RFC5245], Section 15.5, and the set of specified options is 2725 stored. 2727 o Any "a=fingerprint" lines are parsed as specified in [RFC4572], 2728 Section 5, and the set of fingerprint and algorithm values is 2729 stored. 2731 o If present, a single "a=setup" line is parsed as specified in 2732 [RFC4145], Section 4, and the setup value is stored. 
2734 o If present, a single "a=dtls-id" line is parsed as specified in 2735 [I-D.ietf-mmusic-dtls-sdp], Section 5, and the dtls-id value is 2736 stored. 2738 o Any "a=identity" lines are parsed and the identity values stored 2739 for subsequent verification, as specified in 2740 [I-D.ietf-rtcweb-security-arch], Section 5. 2742 o Any "a=extmap" lines are parsed as specified in [RFC5285], 2743 Section 5, and their values are stored. 2745 As required by [RFC4566], Section 5.13, unknown attribute lines MUST 2746 be ignored. 2748 Once all the session-level lines have been parsed, processing 2749 continues with the lines in m= sections. 2751 5.7.2. Media Section Parsing 2753 Like the session-level lines, the media section lines MUST occur in 2754 the specific order and with the specific syntax defined in [RFC4566], 2755 Section 5. 2757 The "m=" line itself MUST be parsed as described in [RFC4566], 2758 Section 5.14, and the media, port, proto, and fmt values stored. 2760 Following the "m=" line, specific processing MUST be applied for the 2761 following non-attribute lines: 2763 o As with the "c=" line at the session level, the "c=" line MUST be 2764 parsed according to [RFC4566], Section 5.7, but its value is not 2765 used. 2767 o The "b=" line, if present, MUST be parsed as specified in 2768 [RFC4566], Section 5.8, and the bwtype and bandwidth values 2769 stored. 2771 Specific processing MUST also be applied for the following attribute 2772 lines: 2774 o If present, a single "a=ice-ufrag" line is parsed as specified in 2775 [RFC5245], Section 15.4, and the ufrag value is stored. 2777 o If present, a single "a=ice-pwd" line is parsed as specified in 2778 [RFC5245], Section 15.4, and the password value is stored. 2780 o If present, a single "a=ice-options" line is parsed as specified 2781 in [RFC5245], Section 15.5, and the set of specified options is 2782 stored.
2784 o Any "a=candidate" attributes MUST be parsed as specified in 2785 [RFC5245], Section 15.1, and their values stored. 2787 o Any "a=remote-candidates" attributes MUST be parsed as specified 2788 in [RFC5245], Section 15.2, but their values are ignored. 2790 o If present, a single "a=end-of-candidates" attribute MUST be 2791 parsed as specified in [I-D.ietf-ice-trickle], Section 8.2, and 2792 its presence or absence flagged and stored. 2794 o Any "a=fingerprint" lines are parsed as specified in [RFC4572], 2795 Section 5, and the set of fingerprint and algorithm values is 2796 stored. 2798 If the "m=" proto value indicates use of RTP, as described in 2799 Section 5.1.2 above, the following attribute lines MUST be 2800 processed: 2802 o The "m=" fmt value MUST be parsed as specified in [RFC4566], 2803 Section 5.14, and the individual values stored. 2805 o Any "a=rtpmap" or "a=fmtp" lines MUST be parsed as specified in 2806 [RFC4566], Section 6, and their values stored. 2808 o If present, a single "a=ptime" line MUST be parsed as described in 2809 [RFC4566], Section 6, and its value stored. 2811 o If present, a single "a=maxptime" line MUST be parsed as described 2812 in [RFC4566], Section 6, and its value stored. 2814 o If present, a single direction attribute line (e.g. "a=sendrecv") 2815 MUST be parsed as described in [RFC4566], Section 6, and its value 2816 stored. 2818 o Any "a=ssrc" or "a=ssrc-group" attributes MUST be parsed as 2819 specified in [RFC5576], Sections 4.1-4.2, and their values stored. 2821 o Any "a=extmap" attributes MUST be parsed as specified in 2822 [RFC5285], Section 5, and their values stored. 2824 o Any "a=rtcp-fb" attributes MUST be parsed as specified in 2825 [RFC4585], Section 4.2, and their values stored. 2827 o If present, a single "a=rtcp-mux" attribute MUST be parsed as 2828 specified in [RFC5761], Section 5.1.3, and its presence or absence 2829 flagged and stored.
2831 o If present, a single "a=rtcp-mux-only" attribute MUST be parsed as 2832 specified in [I-D.ietf-mmusic-mux-exclusive], Section 3, and its 2833 presence or absence flagged and stored. 2835 o If present, a single "a=rtcp-rsize" attribute MUST be parsed as 2836 specified in [RFC5506], Section 5, and its presence or absence 2837 flagged and stored. 2839 o If present, a single "a=rtcp" attribute MUST be parsed as 2840 specified in [RFC3605], Section 2.1, but its value is ignored, as 2841 this information is superfluous when using ICE. 2843 o If present, "a=msid" attributes MUST be parsed as specified in 2844 [I-D.ietf-mmusic-msid], Section 3.2, and their values stored. 2846 o Any "a=imageattr" attributes MUST be parsed as specified in 2847 [RFC6236], Section 3, and their values stored. 2849 o Any "a=rid" lines MUST be parsed as specified in 2850 [I-D.ietf-mmusic-rid], Section 10, and their values stored. 2852 o If present, a single "a=simulcast" line MUST be parsed as 2853 specified in [I-D.ietf-mmusic-sdp-simulcast], and its values 2854 stored. 2856 Otherwise, if the "m=" proto value indicates use of SCTP, the 2857 following attribute lines MUST be processed: 2859 o The "m=" fmt value MUST be parsed as specified in 2860 [I-D.ietf-mmusic-sctp-sdp], Section 4.3, and the application 2861 protocol value stored. 2863 o An "a=sctp-port" attribute MUST be present, and it MUST be parsed 2864 as specified in [I-D.ietf-mmusic-sctp-sdp], Section 5.2, and the 2865 value stored. 2867 o If present, a single "a=max-message-size" attribute MUST be parsed 2868 as specified in [I-D.ietf-mmusic-sctp-sdp], Section 6, and the 2869 value stored. Otherwise, use the specified default. 2871 As required by [RFC4566], Section 5.13, unknown attribute lines MUST 2872 be ignored. 2874 5.7.3. Semantics Verification 2876 Assuming parsing completes successfully, the parsed description is 2877 then evaluated to ensure internal consistency as well as proper 2878 support for mandatory features. 
Specifically, the following checks 2879 are performed: 2881 o For each m= section, valid values for each of the mandatory-to-use 2882 features enumerated in Section 5.1.1 MUST be present. These 2883 values MAY either be present at the media level, or inherited from 2884 the session level. 2886 * ICE ufrag and password values, which MUST comply with the size 2887 limits specified in [RFC5245], Section 15.4. 2889 * dtls-id value, which MUST be set according to 2890 [I-D.ietf-mmusic-dtls-sdp] Section 5. If this is a re-offer 2891 and the dtls-id value is different from that presently in use, 2892 the DTLS connection is not being continued and the remote 2893 description MUST be part of an ICE restart, together with new 2894 ufrag and password values. If this is an answer, the dtls-id 2895 value, if present, MUST be the same as in the offer. 2897 * DTLS setup value, which MUST be set according to the rules 2898 specified in [RFC5763], Section 5 and MUST be consistent with 2899 the selected role of the current DTLS connection, if one exists 2900 and is being continued. 2902 * DTLS fingerprint values, where at least one fingerprint MUST be 2903 present. 2905 o All RID values referenced in an "a=simulcast" line MUST exist as 2906 "a=rid" lines. 2908 o Each m= section is also checked to ensure prohibited features are 2909 not used. If this is a local description, the "ice-lite" 2910 attribute MUST NOT be specified. 2912 o If the RTP/RTCP multiplexing policy is "require", each m= section 2913 MUST contain an "a=rtcp-mux" attribute. If an "m=" section 2914 contains an "a=rtcp-mux-only" attribute then that section MUST 2915 also contain an "a=rtcp-mux" attribute. 
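As a rough illustration only, a subset of the checks listed above could be expressed as follows. This is a minimal sketch, not the method of any particular implementation: the function name, the dictionary keys ("ice_ufrag", "rids", "simulcast_rids", and so on) and the error strings are hypothetical, and the dtls-id and DTLS role-consistency checks are deliberately left out. The size limits are the ones given in [RFC5245], Section 15.4 (4 to 256 characters for the ufrag, 22 to 256 for the password).

```python
def verify_m_section(media, session, rtcp_mux_policy="require", is_local=False):
    """Return a list of error strings; an empty list means every check passed.

    `media` and `session` are hypothetical dicts of already-parsed values.
    """
    errors = []

    def inherited(key):
        # Mandatory values may be present at the media level or
        # inherited from the session level.
        value = media.get(key)
        return value if value is not None else session.get(key)

    ufrag = inherited("ice_ufrag")
    pwd = inherited("ice_pwd")
    # Size limits from RFC 5245, Section 15.4.
    if ufrag is None or not 4 <= len(ufrag) <= 256:
        errors.append("ice-ufrag missing or outside 4..256 characters")
    if pwd is None or not 22 <= len(pwd) <= 256:
        errors.append("ice-pwd missing or outside 22..256 characters")

    if inherited("setup") is None:
        errors.append("no DTLS setup value")
    if not inherited("fingerprints"):
        errors.append("no DTLS fingerprint")

    # Every RID referenced by a=simulcast must exist as an a=rid line.
    known_rids = set(media.get("rids", []))
    for rid in media.get("simulcast_rids", []):
        if rid not in known_rids:
            errors.append("a=simulcast references unknown rid " + repr(rid))

    # ice-lite is prohibited in local descriptions.
    if is_local and session.get("ice_lite"):
        errors.append("ice-lite in a local description")

    has_mux = bool(media.get("rtcp_mux"))
    if rtcp_mux_policy == "require" and not has_mux:
        errors.append("rtcp-mux policy is 'require' but a=rtcp-mux is absent")
    if media.get("rtcp_mux_only") and not has_mux:
        errors.append("a=rtcp-mux-only without a=rtcp-mux")

    return errors
```

Collecting the failures in a list rather than raising on the first one is a choice made for this sketch; the text above only requires that processing stop and an error be returned if any check fails.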
2917 If this session description is of type "pranswer" or "answer", the 2918 following additional checks are applied: 2920 o The session description must follow the rules defined in 2921 [RFC3264], Section 6, including the requirement that the number of 2922 m= sections MUST exactly match the number of m= sections in the 2923 associated offer. 2925 o For each m= section, the media type and protocol values MUST 2926 exactly match the media type and protocol values in the 2927 corresponding m= section in the associated offer. 2929 If any of the preceding checks failed, processing MUST stop and an 2930 error MUST be returned. 2932 5.8. Applying a Local Description 2934 The following steps are performed at the media engine level to apply 2935 a local description. If an error is returned, the session MUST be 2936 restored to the state it was in before performing these steps. 2938 Next, m= sections are processed. For each m= section, the following 2939 steps MUST be performed; if any parameters are out of bounds, or 2940 cannot be applied, processing MUST stop and an error MUST be 2941 returned. 2943 o If this m= section is new, begin gathering candidates for it, as 2944 defined in [RFC5245], Section 4.1.1, unless it has been marked as 2945 bundle-only. 2947 o Or, if the ICE ufrag and password values have changed, and it has 2948 not been marked as bundle-only, trigger the ICE agent to start an 2949 ICE restart, and begin gathering new candidates for the m= section 2950 as described in [RFC5245], Section 9.1.1.1. If this description 2951 is an answer, also start checks on that media section as defined 2952 in [RFC5245], Section 9.3.1.1. 
2954 o If the m= section proto value indicates use of RTP: 2956 * If there is no RtpTransceiver associated with this m= section 2957 (which will only happen when applying an offer), find one and 2958 associate it with this m= section according to the following 2959 steps: 2961 + Find the RtpTransceiver that corresponds to this m= section, 2962 using the mapping between transceivers and m= section 2963 indices established when creating the offer. 2965 + Set the value of this RtpTransceiver's mid property to the 2966 MID of the m= section. 2968 * If RTCP mux is indicated, prepare to demux RTP and RTCP from 2969 the RTP ICE component, as specified in [RFC5761], 2970 Section 5.1.3. If RTCP mux is not indicated, but was 2971 previously negotiated, i.e., the RTCP ICE component no longer 2972 exists, this MUST result in an error. 2974 * For each specified RTP header extension, establish a mapping 2975 between the extension ID and URI, as described in section 6 of 2976 [RFC5285]. If any indicated RTP header extension is not 2977 supported, this MUST result in an error. 2979 * If the MID header extension is supported, prepare to demux RTP 2980 streams intended for this m= section based on the MID header 2981 extension, as described in 2982 [I-D.ietf-mmusic-sdp-bundle-negotiation], Section 14. 2984 * For each specified media format, establish a mapping between 2985 the payload type and the actual media format, as described in 2986 [RFC3264], Section 6.1. If any indicated media format is not 2987 supported, this MUST result in an error. 2989 * For each specified "rtx" media format, establish a mapping 2990 between the RTX payload type and its associated primary payload 2991 type, as described in [RFC4588], Sections 8.6 and 8.7. If any 2992 referenced primary payload types are not present, this MUST 2993 result in an error. 2995 * If the directional attribute is of type "sendrecv" or 2996 "recvonly", enable receipt and decoding of media. 
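The "rtx" step above can be sketched as follows. This is an illustrative assumption about how an implementation might build the mapping, not a description of any existing one: the function name is hypothetical, the input is assumed to be the lines of a single RTP m= section, and a primary payload type is treated as "present" when it appears in the fmt list of the "m=" line. The "apt" parameter is the one defined for the "rtx" format in [RFC4588].

```python
import re


def map_rtx_payload_types(section_lines):
    """Return {rtx_payload_type: primary_payload_type} for one m= section.

    Raises ValueError if an "rtx" format has no apt parameter or refers to
    a primary payload type that is not listed on the m= line.
    """
    fmts = set()     # payload types listed on the m= line
    encodings = {}   # payload type -> lower-cased encoding name
    apt = {}         # payload type -> value of its apt= fmtp parameter

    for line in section_lines:
        if line.startswith("m="):
            # m=<media> <port> <proto> <fmt> ...
            fmts = {int(fmt) for fmt in line.split()[3:]}
            continue
        match = re.match(r"a=rtpmap:(\d+) ([^/\s]+)/", line)
        if match:
            encodings[int(match.group(1))] = match.group(2).lower()
            continue
        match = re.match(r"a=fmtp:(\d+) (.+)", line)
        if match:
            for param in match.group(2).split(";"):
                key, _, value = param.strip().partition("=")
                if key == "apt":
                    apt[int(match.group(1))] = int(value)

    mapping = {}
    for payload_type, name in encodings.items():
        if name != "rtx":
            continue
        primary = apt.get(payload_type)
        if primary is None or primary not in fmts:
            raise ValueError(
                "rtx payload type %d references a missing primary payload type"
                % payload_type
            )
        mapping[payload_type] = primary
    return mapping


section = [
    "m=video 9 UDP/TLS/RTP/SAVPF 96 97",
    "a=rtpmap:96 VP8/90000",
    "a=rtpmap:97 rtx/90000",
    "a=fmtp:97 apt=96;rtx-time=3000",
]
rtx_map = map_rtx_payload_types(section)  # {97: 96}
```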
2998 Finally, if this description is of type "pranswer" or "answer", 2999 follow the processing defined in Section 5.10 below. 3001 5.9. Applying a Remote Description 3003 The following steps are performed to apply a remote description. If 3004 an error is returned, the session MUST be restored to the state it 3005 was in before performing these steps. 3007 If the answer contains any "a=ice-options" attributes where "trickle" 3008 is listed as an attribute, update the PeerConnection canTrickle 3009 property to be true. Otherwise, set this property to false. 3011 The following steps MUST be performed for attributes at the session 3012 level; if any parameters are out of bounds, or cannot be applied, 3013 processing MUST stop and an error MUST be returned. 3015 o For any specified "CT" bandwidth value, set this as the limit for 3016 the maximum total bitrate for all m= sections, as specified in 3017 Section 5.8 of [RFC4566]. Within this overall limit, the 3018 implementation can dynamically decide how to best allocate the 3019 available bandwidth between m= sections, respecting any specific 3020 limits that have been specified for individual m= sections. 3022 o For any specified "RR" or "RS" bandwidth values, handle as 3023 specified in [RFC3556], Section 2. 3025 o Any "AS" bandwidth value MUST be ignored, as the meaning of this 3026 construct at the session level is not well defined. 3028 For each m= section, the following steps MUST be performed; if any 3029 parameters are out of bounds, or cannot be applied, processing MUST 3030 stop and an error MUST be returned. 3032 o If the PeerConnection state is "have-local-offer", and the ICE 3033 ufrag or password changed from the previous remote description, 3034 then an ICE restart is needed, as described in Section 9.1.1.1 of 3035 [RFC5245]. If the description is of type "offer", note that an 3036 ICE restart is needed.
If the description is of type "answer" or 3037 "pranswer" and the current local description is also an ICE 3038 restart, then signal the ICE agent to begin checks as described in 3039 Section 9.3.1.1 of [RFC5245]. An answerer MUST change the ufrag 3040 and password in an answer if and only if ICE is restarting, as 3041 described in Section 9.2.1.1 of [RFC5245]. 3043 o If the PeerConnection state is "have-remote-pranswer", and the ICE 3044 ufrag or password changed from the previous provisional answer, 3045 then signal the ICE agent to discard any previous ICE check list 3046 state for the m= section and begin checks as if this were the 3047 first answer. However, such an answer MAY only change the ICE 3048 ufrag or password if the local offer is starting or restarting ICE 3049 for the m= section. 3051 o Configure the ICE components associated with this media section to 3052 use the supplied ICE remote ufrag and password for their 3053 connectivity checks. 3055 o Pair any supplied ICE candidates with any gathered local 3056 candidates, as described in Section 5.7 of [RFC5245] and start 3057 connectivity checks with the appropriate credentials. 3059 o If an "a=end-of-candidates" attribute is present, process the end- 3060 of-candidates indication as described in [I-D.ietf-ice-trickle] 3061 Section 11. 3063 o If the m= section proto value indicates use of RTP: 3065 * If the m= section is being recycled (see Section 5.2.2), 3066 dissociate the currently associated RtpTransceiver by setting 3067 its mid property to null, and discard the mapping between the 3068 transceiver and its m= section index. 
3070 * If the m= section is not associated with any RtpTransceiver 3071 (possibly because it was dissociated in the previous step), 3072 either find an RtpTransceiver or create one according to the 3073 following steps: 3075 + If the m= section is sendrecv or recvonly, and there are 3076 RtpTransceivers of the same type that were added to the 3077 PeerConnection by addTrack and are not associated with any 3078 m= section and are not stopped, find the first (according to 3079 the canonical order described in Section 5.2.1) such 3080 RtpTransceiver. 3082 + If no RtpTransceiver was found in the previous step, create 3083 one with a recvonly direction. 3085 + Associate the found or created RtpTransceiver with the m= 3086 section by setting the value of the RtpTransceiver's mid 3087 property to the MID of the m= section, and establish a 3088 mapping between the transceiver and the index of the m= 3089 section. If the m= section does not include a MID (i.e., 3090 the remote endpoint does not support the MID extension), 3091 generate a value for the RtpTransceiver mid property, 3092 following the guidance for "a=mid" mentioned in 3093 Section 5.2.1. 3095 * For each specified media format that is also supported by the 3096 local implementation, establish a mapping between the specified 3097 payload type and the media format, as described in [RFC3264], 3098 Section 6.1. Specifically, this means that the implementation 3099 records the payload type to be used in outgoing RTP packets 3100 when sending each specified media format, as well as the 3101 relative preference for each format that is indicated in their 3102 ordering. If any indicated media format is not supported by 3103 the local implementation, it MUST be ignored. 3105 * For each specified "rtx" media format, establish a mapping 3106 between the RTX payload type and its associated primary payload 3107 type, as described in [RFC4588], Section 4. 
If any referenced 3108 primary payload types are not present, this MUST result in an 3109 error. 3111 * For each specified fmtp parameter that is supported by the 3112 local implementation, enable it on the associated media 3113 formats. 3115 * For each specified RTP header extension that is also supported 3116 by the local implementation, establish a mapping between the 3117 extension ID and URI, as described in [RFC5285], Section 5. 3118 Specifically, this means that the implementation records the 3119 extension ID to be used in outgoing RTP packets when sending 3120 each specified header extension. If any indicated RTP header 3121 extension is not supported by the local implementation, it MUST 3122 be ignored. 3124 * For each specified RTCP feedback mechanism that is supported by 3125 the local implementation, enable it on the associated media 3126 formats. 3128 * For any specified "TIAS" bandwidth value, set this value as a 3129 constraint on the maximum RTP bitrate to be used when sending 3130 media, as specified in [RFC3890]. If a "TIAS" value is not 3131 present, but an "AS" value is specified, generate a "TIAS" 3132 value using this formula: 3134 TIAS = AS * 1000 * 0.95 - 50 * 40 * 8 3136 The 50 is based on 50 packets per second, the 40 is based on an 3137 estimate of total header size, the 1000 changes the unit from 3138 kbps to bps (as required by TIAS), and the 0.95 is to allocate 3139 5% to RTCP. For example, an "AS" value of 512 (kbps) yields a "TIAS" value of 512 * 1000 * 0.95 - 16000 = 470400 (bps). "TIAS" is used in preference to "AS" because it 3140 provides more accurate control of bandwidth. 3142 * For any "RR" or "RS" bandwidth values, handle as specified in 3143 [RFC3556], Section 2. 3145 * Any specified "CT" bandwidth value MUST be ignored, as the 3146 meaning of this construct at the media level is not well 3147 defined.
3149 * If the m= section is of type audio: 3151 + For each specified "CN" media format, enable DTX for all 3152 supported media formats with the same clockrate, as 3153 described in [RFC3389], Section 5, except for formats that 3154 have their own internal DTX mechanisms. DTX for such 3155 formats (e.g., Opus) is controlled via fmtp parameters, as 3156 discussed in Section 5.2.3.2. 3158 + For each specified "telephone-event" media format, enable 3159 DTMF transmission for all supported media formats with the 3160 same clockrate, as described in [RFC4733], Section 2.5.1.2. 3161 If the application attempts to transmit DTMF when using a 3162 media format that does not have a corresponding telephone- 3163 event format, this MUST result in an error. 3165 + For any specified "ptime" value, configure the available 3166 media formats to use the specified packet size. If the 3167 specified size is not supported for a media format, use the 3168 next closest value instead. 3170 Finally, if this description is of type "pranswer" or "answer", 3171 follow the processing defined in Section 5.10 below. 3173 5.10. Applying an Answer 3175 In addition to the steps mentioned above for processing a local or 3176 remote description, the following steps are performed when processing 3177 a description of type "pranswer" or "answer". 3179 For each m= section, the following steps MUST be performed: 3181 o If the m= section has been rejected (i.e. port is set to zero in 3182 the answer), stop any reception or transmission of media for this 3183 section, and, unless a non-rejected m= section is bundled with 3184 this m= section, discard any associated ICE components, as 3185 described in Section 9.2.1.3 of [RFC5245]. 3187 o If the remote DTLS fingerprint has been changed or the dtls-id has 3188 changed, tear down the DTLS connection. This includes the case 3189 when the PeerConnection state is "have-remote-pranswer".
If a 3190 DTLS connection needs to be torn down but the answer does not 3191 indicate an ICE restart or, in the case of "have-remote-pranswer", 3192 new ICE credentials, an error MUST be generated. If an ICE 3193 restart is performed without a change in dtls-id or fingerprint, 3194 then the same DTLS connection is continued over the new ICE 3195 channel. 3197 o If no valid DTLS connection exists, prepare to start a DTLS 3198 connection, using the specified roles and fingerprints, on any 3199 underlying ICE components, once they are active. 3201 o If the m= section proto value indicates use of RTP: 3203 * If the m= section references any media formats, RTP header 3204 extensions, or RTCP feedback mechanisms that were not present 3205 in the corresponding m= section in the offer, this indicates a 3206 negotiation problem and MUST result in an error. 3208 * If the m= section has RTCP mux enabled, discard the RTCP ICE 3209 component, if one exists, and begin or continue muxing RTCP 3210 over the RTP ICE component, as specified in [RFC5761], 3211 Section 5.1.3. Otherwise, prepare to transmit RTCP over the 3212 RTCP ICE component; if no RTCP ICE component exists, because 3213 RTCP mux was previously enabled, this MUST result in an error. 3215 * If the m= section has reduced-size RTCP enabled, configure the 3216 RTCP transmission for this m= section to use reduced-size RTCP, 3217 as specified in [RFC5506]. 3219 * If the directional attribute in the answer is of type 3220 "sendrecv" or "sendonly", choose the media format to send as 3221 the most preferred media format from the remote description 3222 that is also present in the answer, as described in [RFC3264], 3223 Sections 6.1 and 7, and start transmitting RTP media once the 3224 underlying transport layers have been established. If an SSRC 3225 has not already been chosen for this outgoing RTP stream, 3226 choose a random one. 
If media is already being transmitted, 3227 the same SSRC SHOULD be used unless the clockrate of the new 3228 codec is different, in which case a new SSRC MUST be chosen, as 3229 specified in [RFC7160], Section 3.1. 3231 * The payload type mapping from the remote description is used to 3232 determine payload types for the outgoing RTP streams, including 3233 the payload type for the send media format chosen above. Any 3234 RTP header extensions that were negotiated should be included 3235 in the outgoing RTP streams, using the extension mapping from 3236 the remote description; if the RID header extension has been 3237 negotiated, and RID values are specified, include the RID 3238 header extension in the outgoing RTP streams, as indicated in 3239 [I-D.ietf-mmusic-rid], Section 4. 3241 * If simulcast has been negotiated, send the number of Source RTP 3242 Streams as specified in [I-D.ietf-mmusic-sdp-simulcast], 3243 Section 6.2.2. 3245 * If the send media format chosen above has a corresponding "rtx" 3246 media format, or a FEC mechanism has been negotiated, establish 3247 a Redundancy RTP Stream with a random SSRC for each Source RTP 3248 Stream, and start or continue transmitting RTX/FEC packets as 3249 needed. 3251 * If the send media format chosen above has a corresponding "red" 3252 media format of the same clockrate, allow redundant encoding 3253 using the specified format for resiliency purposes, as 3254 discussed in [I-D.ietf-rtcweb-fec], Section 3.2. Note that 3255 unlike RTX or FEC media formats, the "red" format is 3256 transmitted on the Source RTP Stream, not the Redundancy RTP 3257 Stream. 3259 * Enable the RTCP feedback mechanisms referenced in the media 3260 section for all Source RTP Streams using the specified media 3261 formats. Specifically, begin or continue sending the requested 3262 feedback types and reacting to received feedback, as specified 3263 in [RFC4585], Section 4.2. 
When sending RTCP feedback, follow 3264 the rules and recommendations from 3265 [I-D.ietf-avtcore-rtp-multi-stream], Section 5.4.1 to select 3266 which SSRC to use. 3268 * If the directional attribute is of type "recvonly" or 3269 "inactive", stop transmitting all RTP media, but continue 3270 sending RTCP, as described in [RFC3264], Section 5.1. 3272 o If the m= section proto value indicates use of SCTP: 3274 * If an SCTP association exists, and the remote SCTP port has 3275 changed, discard the existing SCTP association. This includes 3276 the case when the PeerConnection state is "have-remote- 3277 pranswer". 3279 * If no valid SCTP association exists, prepare to initiate an SCTP 3280 association over the associated ICE component and DTLS 3281 connection, using the local SCTP port value from the local 3282 description, and the remote SCTP port value from the remote 3283 description, as described in [I-D.ietf-mmusic-sctp-sdp], 3284 Section 10.2. 3286 If the answer contains valid bundle groups, discard any ICE 3287 components for the m= sections that will be bundled onto the primary 3288 ICE components in each bundle, and begin muxing these m= sections 3289 accordingly, as described in 3290 [I-D.ietf-mmusic-sdp-bundle-negotiation], Section 8.2. 3292 If the description is of type "answer", and there are still remaining 3293 candidates in the ICE candidate pool, discard them. 3295 6. Processing RTP/RTCP 3297 When bundling, associating incoming RTP/RTCP with the proper m= 3298 section is defined in [I-D.ietf-mmusic-sdp-bundle-negotiation]. When 3299 not bundling, the proper m= section is clear from the ICE component 3300 over which the RTP/RTCP is received. 3302 Once the proper m= section(s) are known, RTP/RTCP is delivered to the 3303 RtpTransceiver(s) associated with the m= section(s) and further 3304 processing of the RTP/RTCP is done at the RtpTransceiver level.
This 3305 includes using RID [I-D.ietf-mmusic-rid] to distinguish between 3306 multiple Encoded Streams, as well as determine which Source RTP 3307 stream should be repaired by a given Redundancy RTP stream. 3309 7. Examples 3311 Note that this example section shows several SDP fragments. To 3312 format in 72 columns, some of the lines in SDP have been split into 3313 multiple lines, where leading whitespace indicates that a line is a 3314 continuation of the previous line. In addition, some blank lines 3315 have been added to improve readability but are not valid in SDP. 3317 More examples of SDP for WebRTC call flows can be found in 3318 [I-D.nandakumar-rtcweb-sdp]. 3320 7.1. Simple Example 3322 This section shows a very simple example that sets up a minimal audio 3323 / video call between two JSEP endpoints without using trickle ICE. 3324 The example in the following section provides a more detailed example 3325 of what could happen in a JSEP session. 3327 The code flow below shows Alice's endpoint initiating the session to 3328 Bob's endpoint. The messages from Alice's JS to Bob's JS are assumed 3329 to flow over some signaling protocol via a web server. The JS on 3330 both Alice's side and Bob's side waits for all candidates before 3331 sending the offer or answer, so the offers and answers are complete; 3332 trickle ICE is not used. Both Alice and Bob are using the default 3333 bundle policy of "balanced", and the default RTCP mux policy of 3334 "require". 
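Before the flow itself, the "wait for all candidates" step that both sides perform can be sketched in TypeScript as follows. This is only an illustration under stated assumptions: CandidateEventSketch and GatheringTracker are hypothetical names invented here, not part of JSEP or of any browser API, and the event shape is reduced to the one property the flow relies on, namely that a null candidate marks the end of gathering.

```typescript
// Hypothetical, simplified shape of an ICE candidate event. A null
// candidate marks the end of gathering, as in the flow in this section.
interface CandidateEventSketch {
  candidate: string | null;
}

// Collects candidates and reports when gathering has completed, so the
// caller knows when the complete offer or answer can be read and sent.
class GatheringTracker {
  private readonly seen: string[] = [];
  private done = false;

  // Returns true once the null (end-of-gathering) candidate has arrived.
  handle(ev: CandidateEventSketch): boolean {
    if (this.done) {
      return true; // ignore anything that arrives after completion
    }
    if (ev.candidate === null) {
      this.done = true;
    } else {
      this.seen.push(ev.candidate);
    }
    return this.done;
  }

  get complete(): boolean {
    return this.done;
  }

  get candidates(): readonly string[] {
    return this.seen;
  }
}
```

An application following this pattern would read the local description and hand it to its signaling channel only once handle() returns true.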
3336 // set up local media state 3337 AliceJS->AliceUA: create new PeerConnection 3338 AliceJS->AliceUA: addTrack with two tracks: audio and video 3339 AliceJS->AliceUA: createOffer to get offer 3340 AliceJS->AliceUA: setLocalDescription with offer 3341 AliceUA->AliceJS: multiple onicecandidate events with candidates 3343 // wait for ICE gathering to complete 3344 AliceUA->AliceJS: onicecandidate event with null candidate 3345 AliceJS->AliceUA: get |offer-A1| from pendingLocalDescription 3347 // |offer-A1| is sent over signaling protocol to Bob 3348 AliceJS->WebServer: signaling with |offer-A1| 3349 WebServer->BobJS: signaling with |offer-A1| 3351 // |offer-A1| arrives at Bob 3352 BobJS->BobUA: create a PeerConnection 3353 BobJS->BobUA: setRemoteDescription with |offer-A1| 3354 BobUA->BobJS: ontrack events for audio and video tracks 3356 // Bob accepts call 3357 BobJS->BobUA: addTrack with local tracks 3358 BobJS->BobUA: createAnswer 3359 BobJS->BobUA: setLocalDescription with answer 3360 BobUA->BobJS: multiple onicecandidate events with candidates 3362 // wait for ICE gathering to complete 3363 BobUA->BobJS: onicecandidate event with null candidate 3364 BobJS->BobUA: get |answer-A1| from currentLocalDescription 3366 // |answer-A1| is sent over signaling protocol to Alice 3367 BobJS->WebServer: signaling with |answer-A1| 3368 WebServer->AliceJS: signaling with |answer-A1| 3370 // |answer-A1| arrives at Alice 3371 AliceJS->AliceUA: setRemoteDescription with |answer-A1| 3372 AliceUA->AliceJS: ontrack events for audio and video tracks 3374 // media flows 3375 BobUA->AliceUA: media sent from Bob to Alice 3376 AliceUA->BobUA: media sent from Alice to Bob 3378 The SDP for |offer-A1| looks like: 3380 v=0 3381 o=- 4962303333179871722 1 IN IP4 0.0.0.0 3382 s=- 3383 t=0 0 3384 a=ice-options:trickle 3385 a=group:BUNDLE a1 v1 3386 a=group:LS a1 v1 3388 m=audio 10100 UDP/TLS/RTP/SAVPF 96 0 8 97 98 3389 c=IN IP4 203.0.113.100 3390 a=mid:a1 3391 a=sendrecv 3392 a=rtpmap:96 
opus/48000/2 3393 a=rtpmap:0 PCMU/8000 3394 a=rtpmap:8 PCMA/8000 3395 a=rtpmap:97 telephone-event/8000 3396 a=rtpmap:98 telephone-event/48000 3397 a=maxptime:120 3398 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 3399 a=extmap:2 urn:ietf:params:rtp-hdrext:ssrc-audio-level 3400 a=msid:47017fee-b6c1-4162-929c-a25110252400 3401 f83006c5-a0ff-4e0a-9ed9-d3e6747be7d9 3402 a=ice-ufrag:ETEn 3403 a=ice-pwd:OtSK0WpNtpUjkY4+86js7ZQl 3404 a=fingerprint:sha-256 3405 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04: 3406 BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3407 a=setup:actpass 3408 a=dtls-id:1 3409 a=rtcp:10101 IN IP4 203.0.113.100 3410 a=rtcp-mux 3411 a=rtcp-rsize 3412 a=candidate:1 1 udp 2113929471 203.0.113.100 10100 typ host 3413 a=candidate:1 2 udp 2113929470 203.0.113.100 10101 typ host 3414 a=end-of-candidates 3416 m=video 10102 UDP/TLS/RTP/SAVPF 100 101 3417 c=IN IP4 203.0.113.100 3418 a=mid:v1 3419 a=sendrecv 3420 a=rtpmap:100 VP8/90000 3421 a=rtpmap:101 rtx/90000 3422 a=fmtp:101 apt=100 3423 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 3424 a=rtcp-fb:100 ccm fir 3425 a=rtcp-fb:100 nack 3426 a=rtcp-fb:100 nack pli 3427 a=msid:47017fee-b6c1-4162-929c-a25110252400 3428 f30bdb4a-5db8-49b5-bcdc-e0c9a23172e0 3429 a=ice-ufrag:BGKk 3430 a=ice-pwd:mqyWsAjvtKwTGnvhPztQ9mIf 3431 a=fingerprint:sha-256 3432 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04: 3433 BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3434 a=setup:actpass 3435 a=dtls-id:1 3436 a=rtcp:10103 IN IP4 203.0.113.100 3437 a=rtcp-mux 3438 a=rtcp-rsize 3439 a=candidate:1 1 udp 2113929471 203.0.113.100 10102 typ host 3440 a=candidate:1 2 udp 2113929470 203.0.113.100 10103 typ host 3441 a=end-of-candidates 3443 The SDP for |answer-A1| looks like: 3445 v=0 3446 o=- 6729291447651054566 1 IN IP4 0.0.0.0 3447 s=- 3448 t=0 0 3449 a=ice-options:trickle 3450 a=group:BUNDLE a1 v1 3451 a=group:LS a1 v1 3453 m=audio 10200 UDP/TLS/RTP/SAVPF 96 0 8 97 98 3454 c=IN IP4 203.0.113.200 3455 a=mid:a1 3456 a=sendrecv 3457 
a=rtpmap:96 opus/48000/2 3458 a=rtpmap:0 PCMU/8000 3459 a=rtpmap:8 PCMA/8000 3460 a=rtpmap:97 telephone-event/8000 3461 a=rtpmap:98 telephone-event/48000 3462 a=maxptime:120 3463 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 3464 a=extmap:2 urn:ietf:params:rtp-hdrext:ssrc-audio-level 3465 a=msid:61317484-2ed4-49d7-9eb7-1414322a7aae 3466 5a7b57b8-f043-4bd1-a45d-09d4dfa31226 3467 a=ice-ufrag:6sFv 3468 a=ice-pwd:cOTZKZNVlO9RSGsEGM63JXT2 3469 a=fingerprint:sha-256 3470 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35: 3471 DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08 3472 a=setup:active 3473 a=dtls-id:1 3474 a=rtcp-mux 3475 a=rtcp-rsize 3476 a=candidate:1 1 udp 2113929471 203.0.113.200 10200 typ host 3477 a=end-of-candidates 3479 m=video 10200 UDP/TLS/RTP/SAVPF 100 101 3480 c=IN IP4 203.0.113.200 3481 a=mid:v1 3482 a=sendrecv 3483 a=rtpmap:100 VP8/90000 3484 a=rtpmap:101 rtx/90000 3485 a=fmtp:101 apt=100 3486 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 3487 a=rtcp-fb:100 ccm fir 3488 a=rtcp-fb:100 nack 3489 a=rtcp-fb:100 nack pli 3490 a=msid:61317484-2ed4-49d7-9eb7-1414322a7aae 3491 4ea4d4a1-2fda-4511-a9cc-1b32c2e59552 3493 7.2. Detailed Example 3495 This section shows a more involved example of a session between two 3496 JSEP endpoints. Trickle ICE is used in full trickle mode, with a 3497 bundle policy of "max-bundle", an RTCP mux policy of "require", and a 3498 single TURN server. Initially, both Alice and Bob establish an audio 3499 channel and a data channel. Later, Bob adds two video flows, one for 3500 his video feed, and one for screensharing, both supporting FEC, and 3501 with the video feed configured for simulcast. Alice accepts these 3502 video flows, but does not add video flows of her own, so they are 3503 handled as recvonly. Alice also specifies a maximum video decoder 3504 resolution. 
3506 // set up local media state 3507 AliceJS->AliceUA: create new PeerConnection 3508 AliceJS->AliceUA: addTrack with an audio track 3509 AliceJS->AliceUA: createDataChannel to get data channel 3510 AliceJS->AliceUA: createOffer to get |offer-B1| 3511 AliceJS->AliceUA: setLocalDescription with |offer-B1| 3513 // |offer-B1| is sent over signaling protocol to Bob 3514 AliceJS->WebServer: signaling with |offer-B1| 3515 WebServer->BobJS: signaling with |offer-B1| 3517 // |offer-B1| arrives at Bob 3518 BobJS->BobUA: create a PeerConnection 3519 BobJS->BobUA: setRemoteDescription with |offer-B1| 3520 BobUA->BobJS: ontrack with audio track from Alice 3522 // candidates are sent to Bob 3523 AliceUA->AliceJS: onicecandidate (host) |offer-B1-candidate-1| 3524 AliceJS->WebServer: signaling with |offer-B1-candidate-1| 3525 AliceUA->AliceJS: onicecandidate (srflx) |offer-B1-candidate-2| 3526 AliceJS->WebServer: signaling with |offer-B1-candidate-2| 3527 AliceUA->AliceJS: onicecandidate (relay) |offer-B1-candidate-3| 3528 AliceJS->WebServer: signaling with |offer-B1-candidate-3| 3530 WebServer->BobJS: signaling with |offer-B1-candidate-1| 3531 BobJS->BobUA: addIceCandidate with |offer-B1-candidate-1| 3532 WebServer->BobJS: signaling with |offer-B1-candidate-2| 3533 BobJS->BobUA: addIceCandidate with |offer-B1-candidate-2| 3534 WebServer->BobJS: signaling with |offer-B1-candidate-3| 3535 BobJS->BobUA: addIceCandidate with |offer-B1-candidate-3| 3537 // Bob accepts call 3538 BobJS->BobUA: addTrack with local audio 3539 BobJS->BobUA: createDataChannel to get data channel 3540 BobJS->BobUA: createAnswer to get |answer-B1| 3541 BobJS->BobUA: setLocalDescription with |answer-B1| 3543 // |answer-B1| is sent to Alice 3544 BobJS->WebServer: signaling with |answer-B1| 3545 WebServer->AliceJS: signaling with |answer-B1| 3546 AliceJS->AliceUA: setRemoteDescription with |answer-B1| 3547 AliceUA->AliceJS: ontrack event with audio track from Bob 3549 // candidates are sent to Alice 3550 
BobUA->BobJS: onicecandidate (host) |answer-B1-candidate-1| 3551 BobJS->WebServer: signaling with |answer-B1-candidate-1| 3552 BobUA->BobJS: onicecandidate (srflx) |answer-B1-candidate-2| 3553 BobJS->WebServer: signaling with |answer-B1-candidate-2| 3554 BobUA->BobJS: onicecandidate (relay) |answer-B1-candidate-3| 3555 BobJS->WebServer: signaling with |answer-B1-candidate-3| 3557 WebServer->AliceJS: signaling with |answer-B1-candidate-1| 3558 AliceJS->AliceUA: addIceCandidate with |answer-B1-candidate-1| 3559 WebServer->AliceJS: signaling with |answer-B1-candidate-2| 3560 AliceJS->AliceUA: addIceCandidate with |answer-B1-candidate-2| 3561 WebServer->AliceJS: signaling with |answer-B1-candidate-3| 3562 AliceJS->AliceUA: addIceCandidate with |answer-B1-candidate-3| 3564 // data channel opens 3565 BobUA->BobJS: ondatachannel event 3566 AliceUA->AliceJS: ondatachannel event 3567 BobUA->BobJS: onopen 3568 AliceUA->AliceJS: onopen 3570 // media is flowing between endpoints 3571 BobUA->AliceUA: audio+data sent from Bob to Alice 3572 AliceUA->BobUA: audio+data sent from Alice to Bob 3574 // some time later Bob adds two video streams 3575 // note, no candidates exchanged, because of bundle 3576 BobJS->BobUA: addTrack with first video stream 3577 BobJS->BobUA: addTrack with second video stream 3578 BobJS->BobUA: createOffer to get |offer-B2| 3579 BobJS->BobUA: setLocalDescription with |offer-B2| 3581 // |offer-B2| is sent to Alice 3582 BobJS->WebServer: signaling with |offer-B2| 3583 WebServer->AliceJS: signaling with |offer-B2| 3584 AliceJS->AliceUA: setRemoteDescription with |offer-B2| 3585 AliceUA->AliceJS: ontrack event with first video track 3586 AliceUA->AliceJS: ontrack event with second video track 3587 AliceJS->AliceUA: createAnswer to get |answer-B2| 3588 AliceJS->AliceUA: setLocalDescription with |answer-B2| 3590 // |answer-B2| is sent over signaling protocol to Bob 3591 AliceJS->WebServer: signaling with |answer-B2| 3592 WebServer->BobJS: signaling with 
|answer-B2| 3593 BobJS->BobUA: setRemoteDescription with |answer-B2| 3595 // media is flowing between endpoints 3596 BobUA->AliceUA: audio+video+data sent from Bob to Alice 3597 AliceUA->BobUA: audio+video+data sent from Alice to Bob 3599 The SDP for |offer-B1| looks like: 3601 v=0 3602 o=- 4962303333179871723 1 IN IP4 0.0.0.0 3603 s=- 3604 t=0 0 3605 a=ice-options:trickle 3606 a=group:BUNDLE a1 d1 3608 m=audio 9 UDP/TLS/RTP/SAVPF 96 0 8 97 98 3609 c=IN IP4 0.0.0.0 3610 a=mid:a1 3611 a=sendrecv 3612 a=rtpmap:96 opus/48000/2 3613 a=rtpmap:0 PCMU/8000 3614 a=rtpmap:8 PCMA/8000 3615 a=rtpmap:97 telephone-event/8000 3616 a=rtpmap:98 telephone-event/48000 3617 a=maxptime:120 3618 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 3619 a=extmap:2 urn:ietf:params:rtp-hdrext:ssrc-audio-level 3620 a=msid:57017fee-b6c1-4162-929c-a25110252400 3621 e83006c5-a0ff-4e0a-9ed9-d3e6747be7d9 3622 a=ice-ufrag:ATEn 3623 a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl 3624 a=fingerprint:sha-256 3625 29:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04: 3626 BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3627 a=setup:actpass 3628 a=dtls-id:1 3629 a=rtcp-mux 3630 a=rtcp-mux-only 3631 a=rtcp-rsize 3633 m=application 0 UDP/DTLS/SCTP webrtc-datachannel 3634 c=IN IP4 0.0.0.0 3635 a=mid:d1 3636 a=sctp-port:5000 3637 a=max-message-size:65536 3638 a=bundle-only 3640 |offer-B1-candidate-1| looks like: 3642 ufrag ATEn 3643 index 0 3644 mid a1 3645 attr candidate:1 1 udp 2113929471 203.0.113.100 10100 typ host 3646 |offer-B1-candidate-2| looks like: 3648 ufrag ATEn 3649 index 0 3650 mid a1 3651 attr candidate:1 1 udp 1845494015 198.51.100.100 11100 typ srflx 3652 raddr 203.0.113.100 rport 10100 3654 |offer-B1-candidate-3| looks like: 3656 ufrag ATEn 3657 index 0 3658 mid a1 3659 attr candidate:1 1 udp 255 192.0.2.100 12100 typ relay 3660 raddr 198.51.100.100 rport 11100 3662 The SDP for |answer-B1| looks like: 3664 v=0 3665 o=- 7729291447651054566 1 IN IP4 0.0.0.0 3666 s=- 3667 t=0 0 3668 a=ice-options:trickle 3669 
a=group:BUNDLE a1 d1 3671 m=audio 9 UDP/TLS/RTP/SAVPF 96 0 8 97 98 3672 c=IN IP4 0.0.0.0 3673 a=mid:a1 3674 a=sendrecv 3675 a=rtpmap:96 opus/48000/2 3676 a=rtpmap:0 PCMU/8000 3677 a=rtpmap:8 PCMA/8000 3678 a=rtpmap:97 telephone-event/8000 3679 a=rtpmap:98 telephone-event/48000 3680 a=maxptime:120 3681 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 3682 a=extmap:2 urn:ietf:params:rtp-hdrext:ssrc-audio-level 3683 a=msid:71317484-2ed4-49d7-9eb7-1414322a7aae 3684 6a7b57b8-f043-4bd1-a45d-09d4dfa31226 3685 a=ice-ufrag:7sFv 3686 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2 3687 a=fingerprint:sha-256 3688 7B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35: 3689 DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08 3690 a=setup:active 3691 a=dtls-id:1 3692 a=rtcp-mux 3693 a=rtcp-mux-only 3694 a=rtcp-rsize 3696 m=application 9 UDP/DTLS/SCTP webrtc-datachannel 3697 c=IN IP4 0.0.0.0 3698 a=mid:d1 3699 a=sctp-port:5000 3700 a=max-message-size:65536 3702 |answer-B1-candidate-1| looks like: 3704 ufrag 7sFv 3705 index 0 3706 mid a1 3707 attr candidate:1 1 udp 2113929471 203.0.113.200 10200 typ host 3708 |answer-B1-candidate-2| looks like: 3710 ufrag 7sFv 3711 index 0 3712 mid a1 3713 attr candidate:1 1 udp 1845494015 198.51.100.200 11200 typ srflx 3714 raddr 203.0.113.200 rport 10200 3716 |answer-B1-candidate-3| looks like: 3718 ufrag 7sFv 3719 index 0 3720 mid a1 3721 attr candidate:1 1 udp 255 192.0.2.200 12200 typ relay 3722 raddr 198.51.100.200 rport 11200 3724 The SDP for |offer-B2| is shown below. In addition to the new m= 3725 sections for video, both of which are offering FEC, and one of which 3726 is offering simulcast, note the increment of the version number in 3727 the o= line, changes to the c= line, indicating the local candidate 3728 that was selected, and the inclusion of gathered candidates as 3729 a=candidate lines. 
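The relationship described above between a selected candidate and the c= line and m= port can be illustrated with a small TypeScript sketch. It parses a candidate attribute using the token order visible in the examples (foundation, component, transport, priority, address, port, "typ", type, then optional raddr/rport pairs) and derives the c= line and port from the result. The names CandidateSketch, parseCandidateSketch, and defaultTransportLines are hypothetical, and the sketch assumes IPv4 addresses only, as in the examples in this section.

```typescript
// Hypothetical result type for this sketch; not part of JSEP or any API.
interface CandidateSketch {
  foundation: string;
  component: number;
  transport: string;
  priority: number;
  address: string;
  port: number;
  type: string;
  relatedAddress?: string;
  relatedPort?: number;
}

// Parses "candidate:..." or "a=candidate:..." into its fields.
function parseCandidateSketch(line: string): CandidateSketch {
  const body = line.trim().replace(/^a=/, "");
  const prefix = "candidate:";
  if (!body.startsWith(prefix)) {
    throw new Error("not a candidate attribute");
  }
  const tokens = body.slice(prefix.length).split(/\s+/);
  const [foundation = "", component = "", transport = "", priority = "",
    address = "", port = "", typ = "", type = ""] = tokens;
  if (typ !== "typ" || type === "") {
    throw new Error("malformed candidate attribute");
  }
  const parsed: CandidateSketch = {
    foundation,
    component: Number(component),
    transport,
    priority: Number(priority),
    address,
    port: Number(port),
    type,
  };
  // Optional key/value pairs after the type; only raddr and rport are
  // handled here, anything else is skipped.
  for (let i = 8; i + 1 < tokens.length; i += 2) {
    const key = tokens[i];
    const value = tokens[i + 1];
    if (value === undefined) {
      break;
    }
    if (key === "raddr") {
      parsed.relatedAddress = value;
    } else if (key === "rport") {
      parsed.relatedPort = Number(value);
    }
  }
  return parsed;
}

// Derives the m= port and c= line for a selected candidate (IPv4 assumed).
function defaultTransportLines(c: CandidateSketch): { port: number; connection: string } {
  return { port: c.port, connection: `c=IN IP4 ${c.address}` };
}
```

Applied to Bob's relay candidate, this yields the port and address that appear in the m= and c= lines of |offer-B2|.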
3731 v=0 3732 o=- 7729291447651054566 2 IN IP4 0.0.0.0 3733 s=- 3734 t=0 0 3735 a=ice-options:trickle 3736 a=group:BUNDLE a1 d1 v1 v2 3737 a=group:LS a1 v1 3739 m=audio 12200 UDP/TLS/RTP/SAVPF 96 0 8 97 98 3740 c=IN IP4 192.0.2.200 3741 a=mid:a1 3742 a=sendrecv 3743 a=rtpmap:96 opus/48000/2 3744 a=rtpmap:0 PCMU/8000 3745 a=rtpmap:8 PCMA/8000 3746 a=rtpmap:97 telephone-event/8000 3747 a=rtpmap:98 telephone-event/48000 3748 a=maxptime:120 3749 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 3750 a=extmap:2 urn:ietf:params:rtp-hdrext:ssrc-audio-level 3751 a=msid:71317484-2ed4-49d7-9eb7-1414322a7aae 3752 6a7b57b8-f043-4bd1-a45d-09d4dfa31226 3753 a=ice-ufrag:7sFv 3754 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2 3755 a=fingerprint:sha-256 3756 7B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35: 3757 DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08 3758 a=setup:actpass 3759 a=dtls-id:1 3760 a=rtcp-mux 3761 a=rtcp-mux-only 3762 a=rtcp-rsize 3763 a=candidate:1 1 udp 2113929471 203.0.113.200 10200 typ host 3764 a=candidate:1 1 udp 1845494015 198.51.100.200 11200 typ srflx 3765 raddr 203.0.113.200 rport 10200 3766 a=candidate:1 1 udp 255 192.0.2.200 12200 typ relay 3767 raddr 198.51.100.200 rport 11200 3768 a=end-of-candidates 3770 m=application 12200 UDP/DTLS/SCTP webrtc-datachannel 3771 c=IN IP4 192.0.2.200 3772 a=mid:d1 3773 a=sctp-port:5000 3774 a=max-message-size:65536 3776 m=video 12200 UDP/TLS/RTP/SAVPF 100 101 102 3777 c=IN IP4 192.0.2.200 3778 a=mid:v1 3779 a=sendrecv 3780 a=rtpmap:100 VP8/90000 3781 a=rtpmap:101 rtx/90000 3782 a=fmtp:101 apt=100 3783 a=rtpmap:102 flexfec/90000 3784 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 3785 a=rtcp-fb:100 ccm fir 3786 a=rtcp-fb:100 nack 3787 a=rtcp-fb:100 nack pli 3788 a=msid:71317484-2ed4-49d7-9eb7-1414322a7aae 3789 5ea4d4a1-2fda-4511-a9cc-1b32c2e59552 3790 a=rid:1 send 3791 a=rid:2 send 3792 a=rid:3 send 3793 a=simulcast:send 1;2;3 3795 m=video 12200 UDP/TLS/RTP/SAVPF 100 101 102 3796 c=IN IP4 192.0.2.200 3797 a=mid:v2 3798 a=sendrecv 
3799 a=rtpmap:100 VP8/90000 3800 a=rtpmap:101 rtx/90000 3801 a=fmtp:101 apt=100 3802 a=rtpmap:102 flexfec/90000 3803 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 3804 a=rtcp-fb:100 ccm fir 3805 a=rtcp-fb:100 nack 3806 a=rtcp-fb:100 nack pli 3807 a=msid:81317484-2ed4-49d7-9eb7-1414322a7aae 3808 6ea4d4a1-2fda-4511-a9cc-1b32c2e59552 3810 The SDP for |answer-B2| is shown below. In addition to the 3811 acceptance of the video m= sections, the use of a=recvonly to 3812 indicate one-way video, and the use of a=imageattr to limit the 3813 received resolution, note the use of setup:passive to maintain the 3814 existing DTLS roles. 3816 v=0 3817 o=- 4962303333179871723 2 IN IP4 0.0.0.0 3818 s=- 3819 t=0 0 3820 a=ice-options:trickle 3821 a=group:BUNDLE a1 d1 v1 v2 3822 a=group:LS a1 v1 3824 m=audio 12100 UDP/TLS/RTP/SAVPF 96 0 8 97 98 3825 c=IN IP4 192.0.2.100 3826 a=mid:a1 3827 a=sendrecv 3828 a=rtpmap:96 opus/48000/2 3829 a=rtpmap:0 PCMU/8000 3830 a=rtpmap:8 PCMA/8000 3831 a=rtpmap:97 telephone-event/8000 3832 a=rtpmap:98 telephone-event/48000 3833 a=maxptime:120 3834 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 3835 a=extmap:2 urn:ietf:params:rtp-hdrext:ssrc-audio-level 3836 a=msid:57017fee-b6c1-4162-929c-a25110252400 3837 e83006c5-a0ff-4e0a-9ed9-d3e6747be7d9 3838 a=ice-ufrag:ATEn 3839 a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl 3840 a=fingerprint:sha-256 3841 29:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04: 3842 BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3843 a=setup:passive 3844 a=dtls-id:1 3845 a=rtcp-mux 3846 a=rtcp-mux-only 3847 a=rtcp-rsize 3848 a=candidate:1 1 udp 2113929471 203.0.113.100 10100 typ host 3849 a=candidate:1 1 udp 1845494015 198.51.100.100 11100 typ srflx 3850 raddr 203.0.113.100 rport 10100 3851 a=candidate:1 1 udp 255 192.0.2.100 12100 typ relay 3852 raddr 198.51.100.100 rport 11100 3853 a=end-of-candidates 3855 m=application 12100 UDP/DTLS/SCTP webrtc-datachannel 3856 c=IN IP4 192.0.2.100 3857 a=mid:d1 3858 a=sctp-port:5000 3859 
a=max-message-size:65536 3861 m=video 12100 UDP/TLS/RTP/SAVPF 100 101 3862 c=IN IP4 192.0.2.100 3863 a=mid:v1 3864 a=recvonly 3865 a=rtpmap:100 VP8/90000 3866 a=rtpmap:101 rtx/90000 3867 a=fmtp:101 apt=100 3868 a=imageattr:100 recv [x=[48:1920],y=[48:1080],q=1.0] 3869 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 3870 a=rtcp-fb:100 ccm fir 3871 a=rtcp-fb:100 nack 3872 a=rtcp-fb:100 nack pli 3874 m=video 12100 UDP/TLS/RTP/SAVPF 100 101 3875 c=IN IP4 192.0.2.100 3876 a=mid:v2 3877 a=recvonly 3878 a=rtpmap:100 VP8/90000 3879 a=rtpmap:101 rtx/90000 3880 a=fmtp:101 apt=100 3881 a=imageattr:100 recv [x=[48:1920],y=[48:1080],q=1.0] 3882 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 3883 a=rtcp-fb:100 ccm fir 3884 a=rtcp-fb:100 nack 3885 a=rtcp-fb:100 nack pli 3887 7.3. Early Transport Warmup Example 3889 This example demonstrates the early warmup technique described in 3890 Section 4.1.8.1. Here, Alice's endpoint sends an offer to Bob's 3891 endpoint to start an audio/video call. Bob immediately responds with 3892 an answer that accepts the audio/video m= sections, but marks them as 3893 sendonly (from his perspective), meaning that Alice will not yet send 3894 media. This allows the JSEP implementation to start negotiating ICE 3895 and DTLS immediately. Bob's endpoint then prompts him to answer the 3896 call, and when he does, his endpoint sends a second offer which 3897 enables the audio and video m= sections, and thereby bidirectional 3898 media transmission. The advantage of such a flow is that as soon as 3899 the first answer is received, the implementation can proceed with ICE 3900 and DTLS negotiation and establish the session transport. If the 3901 transport setup completes before the second offer is sent, then media 3902 can be transmitted by the callee immediately upon 3903 answering the call, minimizing perceived post-dial-delay. The second 3904 offer/answer exchange can also change the preferred codecs or other 3905 session parameters.
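The effect of the sendonly answer described above can be sketched in TypeScript. The sketch mirrors a direction attribute from the remote side's perspective to the local side's and then checks whether the local side may send. The names DirectionSketch, mirrorDirection, and maySend are hypothetical, and the mirrored value is only the most the local side could do; an endpoint's own configuration may restrict it further.

```typescript
// The four SDP direction attributes used in the examples.
type DirectionSketch = "sendrecv" | "sendonly" | "recvonly" | "inactive";

// Translates a direction stated from the remote side's perspective into
// the corresponding direction from the local side's perspective.
function mirrorDirection(remote: DirectionSketch): DirectionSketch {
  switch (remote) {
    case "sendonly":
      return "recvonly";
    case "recvonly":
      return "sendonly";
    default:
      return remote; // sendrecv and inactive are symmetric
  }
}

// True if the given local direction permits sending media.
function maySend(local: DirectionSketch): boolean {
  return local === "sendrecv" || local === "sendonly";
}
```

With Bob's first answer marked sendonly, maySend(mirrorDirection("sendonly")) is false for Alice, so she does not yet send media; once the second exchange negotiates sendrecv, the same check is true.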
3907 This example also makes use of the "relay" ICE candidate policy 3908 described in Section 3.5.3 to minimize the ICE gathering and checking 3909 needed. 3911 // set up local media state 3912 AliceJS->AliceUA: create new PeerConnection with "relay" ICE policy 3913 AliceJS->AliceUA: addTrack with two tracks: audio and video 3914 AliceJS->AliceUA: createOffer to get |offer-C1| 3915 AliceJS->AliceUA: setLocalDescription with |offer-C1| 3917 // |offer-C1| is sent over signaling protocol to Bob 3918 AliceJS->WebServer: signaling with |offer-C1| 3919 WebServer->BobJS: signaling with |offer-C1| 3921 // |offer-C1| arrives at Bob 3922 BobJS->BobUA: create new PeerConnection with "relay" ICE policy 3923 BobJS->BobUA: setRemoteDescription with |offer-C1| 3924 BobUA->BobJS: ontrack events for audio and video 3926 // a relay candidate is sent to Bob 3927 AliceUA->AliceJS: onicecandidate (relay) |offer-C1-candidate-1| 3928 AliceJS->WebServer: signaling with |offer-C1-candidate-1| 3930 WebServer->BobJS: signaling with |offer-C1-candidate-1| 3931 BobJS->BobUA: addIceCandidate with |offer-C1-candidate-1| 3933 // Bob prepares an early answer to warmup the transport 3934 BobJS->BobUA: addTransceiver with null audio and video tracks 3935 BobJS->BobUA: transceiver.setDirection(sendonly) for both 3936 BobJS->BobUA: createAnswer 3937 BobJS->BobUA: setLocalDescription with answer 3938 // |answer-C1| is sent over signaling protocol to Alice 3939 BobJS->WebServer: signaling with |answer-C1| 3940 WebServer->AliceJS: signaling with |answer-C1| 3942 // |answer-C1| (sendonly) arrives at Alice 3943 AliceJS->AliceUA: setRemoteDescription with |answer-C1| 3944 AliceUA->AliceJS: ontrack events for audio and video 3946 // a relay candidate is sent to Alice 3947 BobUA->BobJS: onicecandidate (relay) |answer-C1-candidate-1| 3948 BobJS->WebServer: signaling with |answer-C1-candidate-1| 3950 WebServer->AliceJS: signaling with |answer-C1-candidate-1| 3951 AliceJS->AliceUA: addIceCandidate with
|answer-C1-candidate-1| 3953 // ICE and DTLS establish while call is ringing 3955 // Bob accepts call, starts media, and sends new offer 3956 BobJS->BobUA: transceiver.setTrack with audio and video tracks 3957 BobUA->AliceUA: media sent from Bob to Alice 3958 BobJS->BobUA: transceiver.setDirection(sendrecv) for both 3959 transceivers 3960 BobJS->BobUA: createOffer 3961 BobJS->BobUA: setLocalDescription with offer 3963 // |offer-C2| is sent over signaling protocol to Alice 3964 BobJS->WebServer: signaling with |offer-C2| 3965 WebServer->AliceJS: signaling with |offer-C2| 3967 // |offer-C2| (sendrecv) arrives at Alice 3968 AliceJS->AliceUA: setRemoteDescription with |offer-C2| 3969 AliceJS->AliceUA: createAnswer 3970 AliceJS->AliceUA: setLocalDescription with |answer-C2| 3971 AliceUA->BobUA: media sent from Alice to Bob 3973 // |answer-C2| is sent over signaling protocol to Bob 3974 AliceJS->WebServer: signaling with |answer-C2| 3975 WebServer->BobJS: signaling with |answer-C2| 3976 BobJS->BobUA: setRemoteDescription with |answer-C2| 3978 The SDP for |offer-C1| looks like: 3980 v=0 3981 o=- 1070771854436052752 1 IN IP4 0.0.0.0 3982 s=- 3983 t=0 0 3984 a=ice-options:trickle 3985 a=group:BUNDLE a1 v1 3986 a=group:LS a1 v1 3988 m=audio 9 UDP/TLS/RTP/SAVPF 96 0 8 97 98 3989 c=IN IP4 0.0.0.0 3990 a=mid:a1 3991 a=sendrecv 3992 a=rtpmap:96 opus/48000/2 3993 a=rtpmap:0 PCMU/8000 3994 a=rtpmap:8 PCMA/8000 3995 a=rtpmap:97 telephone-event/8000 3996 a=rtpmap:98 telephone-event/48000 3997 a=maxptime:120 3998 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 3999 a=extmap:2 urn:ietf:params:rtp-hdrext:ssrc-audio-level 4000 a=msid:bbce3ba6-abfc-ac63-d00a-e15b286f8fce 4001 e80098db-7159-3c06-229a-00df2a9b3dbc 4002 a=ice-ufrag:4ZcD 4003 a=ice-pwd:ZaaG6OG7tCn4J/lehAGz+HHD 4004 a=fingerprint:sha-256 4005 C4:68:F8:77:6A:44:F1:98:6D:7C:9F:47:EB:E3:34:A4: 4006 0A:AA:2D:49:08:28:70:2E:1F:AE:18:7D:4E:3E:66:BF 4007 a=setup:actpass 4008 a=dtls-id:1 4009 a=rtcp-mux 4010 a=rtcp-mux-only 4011
a=rtcp-rsize 4013 m=video 0 UDP/TLS/RTP/SAVPF 100 101 4014 c=IN IP4 0.0.0.0 4015 a=mid:v1 4016 a=sendrecv 4017 a=rtpmap:100 VP8/90000 4018 a=rtpmap:101 rtx/90000 4019 a=fmtp:101 apt=100 4020 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 4021 a=rtcp-fb:100 ccm fir 4022 a=rtcp-fb:100 nack 4023 a=rtcp-fb:100 nack pli 4024 a=msid:bbce3ba6-abfc-ac63-d00a-e15b286f8fce 4025 ac701365-eb06-42df-cc93-7f22bc308789 4026 a=bundle-only 4027 |offer-C1-candidate-1| looks like: 4029 ufrag 4ZcD 4030 index 0 4031 mid a1 4032 attr candidate:1 1 udp 255 192.0.2.100 12100 typ relay 4033 raddr 0.0.0.0 rport 0 4035 The SDP for |answer-C1| looks like: 4037 v=0 4038 o=- 6386516489780559513 1 IN IP4 0.0.0.0 4039 s=- 4040 t=0 0 4041 a=ice-options:trickle 4042 a=group:BUNDLE a1 v1 4043 a=group:LS a1 v1 4045 m=audio 9 UDP/TLS/RTP/SAVPF 96 0 8 97 98 4046 c=IN IP4 0.0.0.0 4047 a=mid:a1 4048 a=sendonly 4049 a=rtpmap:96 opus/48000/2 4050 a=rtpmap:0 PCMU/8000 4051 a=rtpmap:8 PCMA/8000 4052 a=rtpmap:97 telephone-event/8000 4053 a=rtpmap:98 telephone-event/48000 4054 a=maxptime:120 4055 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 4056 a=extmap:2 urn:ietf:params:rtp-hdrext:ssrc-audio-level 4057 a=msid:751f239e-4ae0-c549-aa3d-890de772998b 4058 04b5a445-82cc-c9e8-9ffe-c24d0ef4b0ff 4059 a=ice-ufrag:TpaA 4060 a=ice-pwd:t2Ouhc67y8JcCaYZxUUTgKw/ 4061 a=fingerprint:sha-256 4062 A2:F3:A5:6D:4C:8C:1E:B2:62:10:4A:F6:70:61:C4:FC: 4063 3C:E0:01:D6:F3:24:80:74:DA:7C:3E:50:18:7B:CE:4D 4064 a=setup:active 4065 a=dtls-id:1 4066 a=rtcp-mux 4067 a=rtcp-mux-only 4068 a=rtcp-rsize 4070 m=video 9 UDP/TLS/RTP/SAVPF 100 101 4071 c=IN IP4 0.0.0.0 4072 a=mid:v1 4073 a=sendonly 4074 a=rtpmap:100 VP8/90000 4075 a=rtpmap:101 rtx/90000 4076 a=fmtp:101 apt=100 4077 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 4078 a=rtcp-fb:100 ccm fir 4079 a=rtcp-fb:100 nack 4080 a=rtcp-fb:100 nack pli 4081 a=msid:751f239e-4ae0-c549-aa3d-890de772998b 4082 39292672-c102-d075-f580-5826f31ca958 4084 |answer-C1-candidate-1| looks like: 4086 
ufrag TpaA 4087 index 0 4088 mid a1 4089 attr candidate:1 1 udp 255 192.0.2.200 12200 typ relay 4090 raddr 0.0.0.0 rport 0 4092 The SDP for |offer-C2| looks like: 4094 v=0 4095 o=- 6386516489780559513 2 IN IP4 0.0.0.0 4096 s=- 4097 t=0 0 4098 a=ice-options:trickle 4099 a=group:BUNDLE a1 v1 4100 a=group:LS a1 v1 4102 m=audio 12200 UDP/TLS/RTP/SAVPF 96 0 8 97 98 4103 c=IN IP4 192.0.2.200 4104 a=mid:a1 4105 a=sendrecv 4106 a=rtpmap:96 opus/48000/2 4107 a=rtpmap:0 PCMU/8000 4108 a=rtpmap:8 PCMA/8000 4109 a=rtpmap:97 telephone-event/8000 4110 a=rtpmap:98 telephone-event/48000 4111 a=maxptime:120 4112 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 4113 a=extmap:2 urn:ietf:params:rtp-hdrext:ssrc-audio-level 4114 a=msid:751f239e-4ae0-c549-aa3d-890de772998b 4115 04b5a445-82cc-c9e8-9ffe-c24d0ef4b0ff 4116 a=ice-ufrag:TpaA 4117 a=ice-pwd:t2Ouhc67y8JcCaYZxUUTgKw/ 4118 a=fingerprint:sha-256 4119 A2:F3:A5:6D:4C:8C:1E:B2:62:10:4A:F6:70:61:C4:FC: 4120 3C:E0:01:D6:F3:24:80:74:DA:7C:3E:50:18:7B:CE:4D 4121 a=setup:actpass 4122 a=dtls-id:1 4123 a=rtcp-mux 4124 a=rtcp-mux-only 4125 a=rtcp-rsize 4126 a=candidate:1 1 udp 255 192.0.2.200 12200 typ relay 4127 raddr 0.0.0.0 rport 0 4128 a=end-of-candidates 4129 m=video 12200 UDP/TLS/RTP/SAVPF 100 101 4130 c=IN IP4 192.0.2.200 4131 a=mid:v1 4132 a=sendrecv 4133 a=rtpmap:100 VP8/90000 4134 a=rtpmap:101 rtx/90000 4135 a=fmtp:101 apt=100 4136 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 4137 a=rtcp-fb:100 ccm fir 4138 a=rtcp-fb:100 nack 4139 a=rtcp-fb:100 nack pli 4140 a=msid:751f239e-4ae0-c549-aa3d-890de772998b 4141 39292672-c102-d075-f580-5826f31ca958 4143 The SDP for |answer-C2| looks like: 4145 v=0 4146 o=- 1070771854436052752 2 IN IP4 0.0.0.0 4147 s=- 4148 t=0 0 4149 a=ice-options:trickle 4150 a=group:BUNDLE a1 v1 4151 a=group:LS a1 v1 4153 m=audio 12100 UDP/TLS/RTP/SAVPF 96 0 8 97 98 4154 c=IN IP4 192.0.2.100 4155 a=mid:a1 4156 a=sendrecv 4157 a=rtpmap:96 opus/48000/2 4158 a=rtpmap:0 PCMU/8000 4159 a=rtpmap:8 PCMA/8000 4160 a=rtpmap:97 
telephone-event/8000 4161 a=rtpmap:98 telephone-event/48000 4162 a=maxptime:120 4163 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 4164 a=extmap:2 urn:ietf:params:rtp-hdrext:ssrc-audio-level 4165 a=msid:bbce3ba6-abfc-ac63-d00a-e15b286f8fce 4166 e80098db-7159-3c06-229a-00df2a9b3dbc 4167 a=ice-ufrag:4ZcD 4168 a=ice-pwd:ZaaG6OG7tCn4J/lehAGz+HHD 4169 a=fingerprint:sha-256 4170 C4:68:F8:77:6A:44:F1:98:6D:7C:9F:47:EB:E3:34:A4: 4171 0A:AA:2D:49:08:28:70:2E:1F:AE:18:7D:4E:3E:66:BF 4172 a=setup:passive 4173 a=dtls-id:1 4174 a=rtcp-mux 4175 a=rtcp-mux-only 4176 a=rtcp-rsize 4177 a=candidate:1 1 udp 255 192.0.2.100 12100 typ relay 4178 raddr 0.0.0.0 rport 0 4179 a=end-of-candidates 4181 m=video 12100 UDP/TLS/RTP/SAVPF 100 101 4182 c=IN IP4 192.0.2.100 4183 a=mid:v1 4184 a=sendrecv 4185 a=rtpmap:100 VP8/90000 4186 a=rtpmap:101 rtx/90000 4187 a=fmtp:101 apt=100 4188 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 4189 a=rtcp-fb:100 ccm fir 4190 a=rtcp-fb:100 nack 4191 a=rtcp-fb:100 nack pli 4192 a=msid:bbce3ba6-abfc-ac63-d00a-e15b286f8fce 4193 ac701365-eb06-42df-cc93-7f22bc308789 4195 8. Security Considerations 4197 The IETF has published separate documents 4198 [I-D.ietf-rtcweb-security-arch] [I-D.ietf-rtcweb-security] describing 4199 the security architecture for WebRTC as a whole. The remainder of 4200 this section describes security considerations for this document. 4202 While formally the JSEP interface is an API, it is better to think of 4203 it as an Internet protocol, with the JS being untrustworthy from the 4204 perspective of the endpoint. Thus, the threat model of [RFC3552] 4205 applies. In particular, JS can call the API in any order and with 4206 any inputs, including malicious ones. This is particularly relevant 4207 when we consider the SDP which is passed to setLocalDescription(). 4208 While correct API usage requires that the application pass in SDP 4209 which was derived from createOffer() or createAnswer(), there is no 4210 guarantee that applications do so.
The JSEP implementation MUST be 4211 prepared for the JS to pass in bogus data instead. 4213 Conversely, the application programmer MUST recognize that the JS 4214 does not have complete control of endpoint behavior. One case that 4215 bears particular mention is that editing ICE candidates out of the 4216 SDP or suppressing trickled candidates does not have the expected 4217 behavior: implementations will still perform checks from those 4218 candidates even if they are not sent to the other side. Thus, for 4219 instance, it is not possible to prevent the remote peer from learning 4220 your public IP address by removing server reflexive candidates. 4222 Applications which wish to conceal their public IP address should 4223 instead configure the ICE agent to use only relay candidates. 4225 9. IANA Considerations 4227 This document requires no actions from IANA. 4229 10. Acknowledgements 4231 Harald Alvestrand, Taylor Brandstetter, Suhas Nandakumar, and Peter 4232 Thatcher provided significant text for this draft. Bernard Aboba, 4233 Adam Bergkvist, Dan Burnett, Ben Campbell, Alissa Cooper, Richard 4234 Ejzak, Stefan Hakansson, Ted Hardie, Christer Holmberg, Andrew Hutton, 4235 Randell Jesup, Matthew Kaufman, Anant Narayanan, Adam Roach, Neil 4236 Stratford, Martin Thomson, Sean Turner, and Magnus Westerlund all 4237 provided valuable feedback on this proposal. 4239 11. References 4241 11.1. Normative References 4243 [I-D.ietf-avtcore-rtp-multi-stream] 4244 Lennox, J., Westerlund, M., Wu, Q., and C. Perkins, 4245 "Sending Multiple RTP Streams in a Single RTP Session", 4246 draft-ietf-avtcore-rtp-multi-stream-11 (work in progress), 4247 December 2015. 4249 [I-D.ietf-avtext-rid] 4250 Roach, A., Nandakumar, S., and P. Thatcher, "RTP Stream 4251 Identifier (RID) Source Description (SDES)", draft-ietf- 4252 avtext-rid-00 (work in progress), February 2016. 4254 [I-D.ietf-ice-trickle] 4255 Ivov, E., Rescorla, E., Uberti, J., and P.
Saint-Andre, 4256 "Trickle ICE: Incremental Provisioning of Candidates for 4257 the Interactive Connectivity Establishment (ICE) 4258 Protocol". 4260 [I-D.ietf-mmusic-4572-update] 4261 Holmberg, C., "Updates to RFC 4572", draft-ietf-mmusic- 4262 4572-update-05 (work in progress), June 2016. 4264 [I-D.ietf-mmusic-dtls-sdp] 4265 Holmberg, C. and R. Shpount, "Using the SDP Offer/Answer 4266 Mechanism for DTLS", draft-ietf-mmusic-dtls-sdp-14 (work 4267 in progress), July 2016. 4269 [I-D.ietf-mmusic-msid] 4270 Alvestrand, H., "Cross Session Stream Identification in 4271 the Session Description Protocol", draft-ietf-mmusic- 4272 msid-01 (work in progress), August 2013. 4274 [I-D.ietf-mmusic-mux-exclusive] 4275 Holmberg, C., "Indicating Exclusive Support of RTP/RTCP 4276 Multiplexing using SDP", draft-ietf-mmusic-mux- 4277 exclusive-08 (work in progress), June 2016. 4279 [I-D.ietf-mmusic-rid] 4280 Thatcher, P., Zanaty, M., Nandakumar, S., Burman, B., 4281 Roach, A., and B. Campen, "RTP Payload Format 4282 Constraints", draft-ietf-mmusic-rid-04 (work in progress), 4283 February 2016. 4285 [I-D.ietf-mmusic-sctp-sdp] 4286 Loreto, S. and G. Camarillo, "Stream Control Transmission 4287 Protocol (SCTP)-Based Media Transport in the Session 4288 Description Protocol (SDP)", draft-ietf-mmusic-sctp-sdp-04 4289 (work in progress), June 2013. 4291 [I-D.ietf-mmusic-sdp-bundle-negotiation] 4292 Holmberg, C., Alvestrand, H., and C. Jennings, 4293 "Multiplexing Negotiation Using Session Description 4294 Protocol (SDP) Port Numbers", draft-ietf-mmusic-sdp- 4295 bundle-negotiation-04 (work in progress), June 2013. 4297 [I-D.ietf-mmusic-sdp-mux-attributes] 4298 Nandakumar, S., "A Framework for SDP Attributes when 4299 Multiplexing", draft-ietf-mmusic-sdp-mux-attributes-01 4300 (work in progress), February 2014. 4302 [I-D.ietf-mmusic-sdp-simulcast] 4303 Burman, B., Westerlund, M., Nandakumar, S., and M. 
Zanaty, 4304 "Using Simulcast in SDP and RTP Sessions", draft-ietf- 4305 mmusic-sdp-simulcast-04 (work in progress), February 2016. 4307 [I-D.ietf-rtcweb-audio] 4308 Valin, J. and C. Bran, "WebRTC Audio Codec and Processing 4309 Requirements", draft-ietf-rtcweb-audio-02 (work in 4310 progress), August 2013. 4312 [I-D.ietf-rtcweb-fec] 4313 Uberti, J., "WebRTC Forward Error Correction 4314 Requirements", draft-ietf-rtcweb-fec-00 (work in 4315 progress), February 2015. 4317 [I-D.ietf-rtcweb-rtp-usage] 4318 Perkins, C., Westerlund, M., and J. Ott, "Web Real-Time 4319 Communication (WebRTC): Media Transport and Use of RTP", 4320 draft-ietf-rtcweb-rtp-usage-09 (work in progress), 4321 September 2013. 4323 [I-D.ietf-rtcweb-security] 4324 Rescorla, E., "Security Considerations for WebRTC", draft- 4325 ietf-rtcweb-security-06 (work in progress), January 2014. 4327 [I-D.ietf-rtcweb-security-arch] 4328 Rescorla, E., "WebRTC Security Architecture", draft-ietf- 4329 rtcweb-security-arch-09 (work in progress), February 2014. 4331 [I-D.ietf-rtcweb-video] 4332 Roach, A., "WebRTC Video Processing and Codec 4333 Requirements", draft-ietf-rtcweb-video-00 (work in 4334 progress), July 2014. 4336 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 4337 Requirement Levels", BCP 14, RFC 2119, March 1997. 4339 [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, 4340 A., Peterson, J., Sparks, R., Handley, M., and E. 4341 Schooler, "SIP: Session Initiation Protocol", RFC 3261, 4342 June 2002. 4344 [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model 4345 with Session Description Protocol (SDP)", RFC 3264, June 4346 2002. 4348 [RFC3552] Rescorla, E. and B. Korver, "Guidelines for Writing RFC 4349 Text on Security Considerations", BCP 72, RFC 3552, July 4350 2003. 4352 [RFC3605] Huitema, C., "Real Time Control Protocol (RTCP) attribute 4353 in Session Description Protocol (SDP)", RFC 3605, October 4354 2003. 
4356 [RFC3890] Westerlund, M., "A Transport Independent Bandwidth 4357 Modifier for the Session Description Protocol (SDP)", RFC 4358 3890, DOI 10.17487/RFC3890, September 2004, 4359 . 4361 [RFC4145] Yon, D. and G. Camarillo, "TCP-Based Media Transport in 4362 the Session Description Protocol (SDP)", RFC 4145, 4363 September 2005. 4365 [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session 4366 Description Protocol", RFC 4566, July 2006. 4368 [RFC4572] Lennox, J., "Connection-Oriented Media Transport over the 4369 Transport Layer Security (TLS) Protocol in the Session 4370 Description Protocol (SDP)", RFC 4572, July 2006. 4372 [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey, 4373 "Extended RTP Profile for Real-time Transport Control 4374 Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July 4375 2006. 4377 [RFC5245] Rosenberg, J., "Interactive Connectivity Establishment 4378 (ICE): A Protocol for Network Address Translator (NAT) 4379 Traversal for Offer/Answer Protocols", RFC 5245, April 4380 2010. 4382 [RFC5285] Singer, D. and H. Desineni, "A General Mechanism for RTP 4383 Header Extensions", RFC 5285, July 2008. 4385 [RFC5761] Perkins, C. and M. Westerlund, "Multiplexing RTP Data and 4386 Control Packets on a Single Port", RFC 5761, April 2010. 4388 [RFC5888] Camarillo, G. and H. Schulzrinne, "The Session Description 4389 Protocol (SDP) Grouping Framework", RFC 5888, June 2010. 4391 [RFC6236] Johansson, I. and K. Jung, "Negotiation of Generic Image 4392 Attributes in the Session Description Protocol (SDP)", RFC 4393 6236, May 2011. 4395 [RFC6347] Rescorla, E. and N. Modadugu, "Datagram Transport Layer 4396 Security Version 1.2", RFC 6347, January 2012. 4398 [RFC6716] Valin, JM., Vos, K., and T. Terriberry, "Definition of the 4399 Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716, 4400 September 2012, . 
4402 [RFC6904] Lennox, J., "Encryption of Header Extensions in the Secure 4403 Real-time Transport Protocol (SRTP)", RFC 6904, April 4404 2013. 4406 [RFC7160] Petit-Huguenin, M. and G. Zorn, Ed., "Support for Multiple 4407 Clock Rates in an RTP Session", RFC 7160, DOI 10.17487/ 4408 RFC7160, April 2014, 4409 . 4411 [RFC7587] Spittka, J., Vos, K., and JM. Valin, "RTP Payload Format 4412 for the Opus Speech and Audio Codec", RFC 7587, DOI 4413 10.17487/RFC7587, June 2015, 4414 . 4416 [RFC7850] Nandakumar, S., "Registering Values of the SDP 'proto' 4417 Field for Transporting RTP Media over TCP under Various 4418 RTP Profiles", RFC 7850, DOI 10.17487/RFC7850, April 2016, 4419 . 4421 [RFC7941] Westerlund, M., Burman, B., Even, R., and M. Zanaty, "RTP 4422 Header Extension for the RTP Control Protocol (RTCP) 4423 Source Description Items", RFC 7941, DOI 10.17487/RFC7941, 4424 August 2016, . 4426 11.2. Informative References 4428 [I-D.ietf-avtext-lrr] 4429 Lennox, J., Hong, D., Uberti, J., Homer, S., and M. 4430 Flodman, "The Layer Refresh Request (LRR) RTCP Feedback 4431 Message", draft-ietf-avtext-lrr-03 (work in progress), 4432 July 2016. 4434 [I-D.ietf-rtcweb-ip-handling] 4435 Uberti, J. and G. Shieh, "WebRTC IP Address Handling 4436 Recommendations", draft-ietf-rtcweb-ip-handling-01 (work 4437 in progress), March 2016. 4439 [I-D.nandakumar-rtcweb-sdp] 4440 Nandakumar, S. and C. Jennings, "SDP for the WebRTC", 4441 draft-nandakumar-rtcweb-sdp-02 (work in progress), July 4442 2013. 4444 [RFC3389] Zopf, R., "Real-time Transport Protocol (RTP) Payload for 4445 Comfort Noise (CN)", RFC 3389, September 2002. 4447 [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. 4448 Jacobson, "RTP: A Transport Protocol for Real-Time 4449 Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550, 4450 July 2003, . 4452 [RFC3556] Casner, S., "Session Description Protocol (SDP) Bandwidth 4453 Modifiers for RTP Control Protocol (RTCP) Bandwidth", RFC 4454 3556, July 2003. 
4456 [RFC3611] Friedman, T., Caceres, R., and A. Clark, "RTP Control 4457 Protocol Extended Reports (RTCP XR)", RFC 3611, DOI 4458 10.17487/RFC3611, November 2003, 4459 . 4461 [RFC3960] Camarillo, G. and H. Schulzrinne, "Early Media and Ringing 4462 Tone Generation in the Session Initiation Protocol (SIP)", 4463 RFC 3960, December 2004. 4465 [RFC4568] Andreasen, F., Baugher, M., and D. Wing, "Session 4466 Description Protocol (SDP) Security Descriptions for Media 4467 Streams", RFC 4568, July 2006. 4469 [RFC4588] Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R. 4470 Hakenberg, "RTP Retransmission Payload Format", RFC 4588, 4471 July 2006. 4473 [RFC4733] Schulzrinne, H. and T. Taylor, "RTP Payload for DTMF 4474 Digits, Telephony Tones, and Telephony Signals", RFC 4733, 4475 DOI 10.17487/RFC4733, December 2006, 4476 . 4478 [RFC5104] Wenger, S., Chandra, U., Westerlund, M., and B. Burman, 4479 "Codec Control Messages in the RTP Audio-Visual Profile 4480 with Feedback (AVPF)", RFC 5104, DOI 10.17487/RFC5104, 4481 February 2008, . 4483 [RFC5506] Johansson, I. and M. Westerlund, "Support for Reduced-Size 4484 Real-Time Transport Control Protocol (RTCP): Opportunities 4485 and Consequences", RFC 5506, April 2009. 4487 [RFC5576] Lennox, J., Ott, J., and T. Schierl, "Source-Specific 4488 Media Attributes in the Session Description Protocol 4489 (SDP)", RFC 5576, June 2009. 4491 [RFC5763] Fischl, J., Tschofenig, H., and E. Rescorla, "Framework 4492 for Establishing a Secure Real-time Transport Protocol 4493 (SRTP) Security Context Using Datagram Transport Layer 4494 Security (DTLS)", RFC 5763, May 2010. 4496 [RFC5764] McGrew, D. and E. Rescorla, "Datagram Transport Layer 4497 Security (DTLS) Extension to Establish Keys for the Secure 4498 Real-time Transport Protocol (SRTP)", RFC 5764, May 2010. 4500 [RFC6464] Lennox, J., Ed., Ivov, E., and E. 
Marocco, "A Real-time 4501 Transport Protocol (RTP) Header Extension for Client-to- 4502 Mixer Audio Level Indication", RFC 6464, DOI 10.17487/ 4503 RFC6464, December 2011, 4504 . 4506 [RFC6544] Rosenberg, J., Keranen, A., Lowekamp, B., and A. Roach, 4507 "TCP Candidates with Interactive Connectivity 4508 Establishment (ICE)", RFC 6544, DOI 10.17487/RFC6544, 4509 March 2012, . 4511 [RFC7656] Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and 4512 B. Burman, Ed., "A Taxonomy of Semantics and Mechanisms 4513 for Real-Time Transport Protocol (RTP) Sources", RFC 7656, 4514 DOI 10.17487/RFC7656, November 2015, 4515 . 4517 [TS26.114] 4518 3GPP TS 26.114 V12.8.0, "3rd Generation Partnership 4519 Project; Technical Specification Group Services and System 4520 Aspects; IP Multimedia Subsystem (IMS); Multimedia 4521 Telephony; Media handling and interaction (Release 12)", 4522 December 2014, . 4524 [W3C.WD-webrtc-20140617] 4525 Bergkvist, A., Burnett, D., Narayanan, A., and C. 4526 Jennings, "WebRTC 1.0: Real-time Communication Between 4527 Browsers", World Wide Web Consortium WD WD-webrtc- 4528 20140617, June 2014, 4529 . 4531 Appendix A. 
Appendix A
4533 For the syntax validation performed in Section 5.7, the following 4534 list of ABNF definitions is used:
4536 +------------------------+------------------------------------------+
4537 | Attribute              | Reference                                |
4538 +------------------------+------------------------------------------+
4539 | ptime                  | [RFC4566] Section 9                      |
4540 | maxptime               | [RFC4566] Section 9                      |
4541 | rtpmap                 | [RFC4566] Section 9                      |
4542 | recvonly               | [RFC4566] Section 9                      |
4543 | sendrecv               | [RFC4566] Section 9                      |
4544 | sendonly               | [RFC4566] Section 9                      |
4545 | inactive               | [RFC4566] Section 9                      |
4546 | framerate              | [RFC4566] Section 9                      |
4547 | fmtp                   | [RFC4566] Section 9                      |
4548 | quality                | [RFC4566] Section 9                      |
4549 | rtcp                   | [RFC3605] Section 2.1                    |
4550 | setup                  | [RFC4145] Sections 3, 4, and 5           |
4551 | connection             | [RFC4145] Sections 3, 4, and 5           |
4552 | fingerprint            | [RFC4572] Section 5                      |
4553 | rtcp-fb                | [RFC4585] Section 4.2                    |
4554 | candidate              | [RFC5245] Section 15.1                   |
4555 | remote-candidates      | [RFC5245] Section 15.2                   |
4556 | ice-lite               | [RFC5245] Section 15.3                   |
4557 | ice-ufrag              | [RFC5245] Section 15.4                   |
4558 | ice-pwd                | [RFC5245] Section 15.4                   |
4559 | ice-options            | [RFC5245] Section 15.5                   |
4560 | extmap                 | [RFC5285] Section 7                      |
4561 | mid                    | [RFC5888] Section 4 and 5                |
4562 | group                  | [RFC5888] Section 4 and 5                |
4563 | imageattr              | [RFC6236] Section 3.1                    |
4564 | extmap (encrypt        | [RFC6904] Section 4                      |
4565 | option)                |                                          |
4566 | msid                   | [I-D.ietf-mmusic-msid] Section 2         |
4567 | rid                    | [I-D.ietf-mmusic-rid] Section 10         |
4568 | simulcast              | [I-D.ietf-mmusic-sdp-simulcast] Section  |
4569 |                        | 6.1                                      |
4570 | dtls-id                | [I-D.ietf-mmusic-dtls-sdp] Section 4     |
4571 +------------------------+------------------------------------------+
4573 Table 1: SDP ABNF References
4575 Appendix B. Change log 4577 Note: This section will be removed by RFC Editor before publication. 4579 Changes in draft-20: 4581 o Remove Appendix-B.
4583 Changes in draft-19: 4585 o Examples are now machine-generated for correctness, and use IETF- 4586 approved example IP addresses. 4588 o Add early transport warmup example, and add missing attributes to 4589 existing examples. 4591 o Only send "a=rtcp-mux-only" and "a=bundle-only" on new m= 4592 sections. 4594 o Update references. 4596 o Add coverage of a=identity. 4598 o Explain the lipsync group algorithm more thoroughly. 4600 o Remove unnecessary list of MTI specs. 4602 o Allow codecs which weren't offered to appear in answers and which 4603 weren't selected to appear in subsequent offers. 4605 o Codec preferences now are applied on both initial and subsequent 4606 offers and answers. 4608 o Clarify a=msid handling for recvonly m= sections. 4610 o Clarify behavior of attributes for bundle-only data channels. 4612 o Allow media attributes to appear in data m= sections when all the 4613 media m= sections are bundle-only. 4615 o Use consistent terminology for JSEP implementations. 4617 o Describe how to handle failed API calls. 4619 o Some cleanup on routing rules. 4621 Changes in draft-18: 4623 o Update demux algorithm and move it to an appendix in preparation 4624 for merging it into BUNDLE. 4626 o Clarify why we can't handle an incoming offer to send simulcast. 4628 o Expand IceCandidate object text. 4630 o Further document use of ICE candidate pool. 4632 o Document removeTrack. 4634 o Update requirements to only accept the last generated offer/answer 4635 as an argument to setLocalDescription. 4637 o Allow round pixels. 4639 o Fix code around default timing when AVPF is not specified. 4641 o Clean up terminology around m= line and m=section. 4643 o Provide a more realistic example for minimum decoder capabilities. 4645 o Document behavior when rtcp-mux policy is require but rtcp-mux 4646 attribute not provided. 4648 o Expanded discussion of RtpSender and RtpReceiver. 4650 o Add RtpTransceiver.currentDirection and document setDirection. 
4652 o Require imageattr x=0, y=0 to indicate that there are no valid 4653 resolutions. 4655 o Require a privacy-preserving MID/RID construction. 4657 o Require support for RFC 3556 bandwidth modifiers. 4659 o Update maxptime description. 4661 o Note that endpoints may encounter extra codecs in answers and 4662 subsequent offers from non-JSEP peers. 4664 o Update references. 4666 Changes in draft-17: 4668 o Split createOffer and createAnswer sections to clearly indicate 4669 attributes which always appear and which only appear when not 4670 bundled into another m= section. 4672 o Add descriptions of RtpTransceiver methods. 4674 o Describe how to process RTCP feedback attributes. 4676 o Clarify transceiver directions and their interaction with RFC 3264. 4678 o Describe setCodecPreferences. 4680 o Update RTP demux algorithm. Include RTCP. 4682 o Update requirements for when a=rtcp is included, limiting to cases 4683 where it is needed for backward compatibility. 4685 o Clarify SAR handling. 4687 o Updated addTrack matching algorithm. 4689 o Remove a=ssrc requirements. 4691 o Handle a=setup in reoffers. 4693 o Discuss how RTX/FEC should be handled. 4695 o Discuss how telephone-event should be handled. 4697 o Discuss how CN/DTX should be handled. 4699 o Add missing references to ABNF table. 4701 Changes in draft-16: 4703 o Update addIceCandidate to indicate ICE generation and allow per-m= 4704 section end-of-candidates. 4706 o Update fingerprint handling to use draft-ietf-mmusic-4572-update. 4708 o Update text around SDP processing of RTP header extensions and 4709 payload formats. 4711 o Add sections on simulcast, addTransceiver, and createDataChannel. 4713 o Clarify text to ensure that the session ID is a positive 63-bit 4714 integer. 4716 o Clarify SDP processing for direction indication. 4718 o Describe SDP processing for rtcp-mux-only. 4720 o Specify handling of the SDP session version in the o= line.
4722 o Require that, when doing a re-offer, the capabilities of the new 4723 session be mostly a subset of the previously 4724 negotiated session. 4726 o Clarified ICE restart interaction with bundle-only. 4728 o Remove support for changing SDP before calling 4729 setLocalDescription. 4731 o Specify algorithm for demuxing RTP based on MID, PT, and SSRC. 4733 o Clarify rules for rejecting m= lines when bundle policy is 4734 balanced or max-bundle. 4736 Changes in draft-15: 4738 o Clarify text around codecs offered in subsequent transactions to 4739 refer to what's been negotiated. 4741 o Rewrite LS handling text to indicate edge cases and that we're 4742 living with them. 4744 o Require that answerer reject m= lines when there are no codecs in 4745 common. 4747 o Enforce max-bundle on offer processing. 4749 o Fix TIAS formula to handle bits vs. kilobits. 4751 o Describe addTrack algorithm. 4753 o Clean up references. 4755 Changes in draft-14: 4757 o Added discussion of RtpTransceivers + RtpSenders + RtpReceivers, 4758 and how they interact with createOffer/createAnswer. 4760 o Removed obsolete OfferToReceiveX options. 4762 o Explained how addIceCandidate can be used for end-of-candidates. 4764 Changes in draft-13: 4766 o Clarified which SDP lines can be ignored. 4768 o Clarified how to handle various received attributes. 4770 o Revised how attributes should be generated for bundled m= lines. 4772 o Remove unused references. 4774 o Remove text advocating use of unilateral PTs. 4776 o Trigger an ICE restart even if the ICE candidate policy is being 4777 made more strict. 4779 o Remove the 'public' ICE candidate policy. 4781 o Move open issues into GitHub issues. 4783 o Split local/remote description accessors into current/pending. 4785 o Clarify a=imageattr handling. 4787 o Add more detail on VoiceActivityDetection handling. 4789 o Reference draft-shieh-rtcweb-ip-handling. 4791 o Make it clear when an ICE restart should occur.
4793 o Resolve changes needed in references. 4795 o Remove MSID semantics. 4797 o ice-options are now at session level. 4799 o Default RTCP mux policy is now 'require'. 4801 Changes in draft-12: 4803 o Filled in sections on applying local and remote descriptions. 4805 o Discussed downscaling and upscaling to fulfill imageattr 4806 requirements. 4808 o Updated what SDP can be modified by the application. 4810 o Updated to latest datachannel SDP. 4812 o Allowed multiple fingerprint lines. 4814 o Switched back to IPv4 for dummy candidates. 4816 o Added additional clarity on ICE default candidates. 4818 Changes in draft-11: 4820 o Clarified handling of RTP CNAMEs. 4822 o Updated what SDP lines should be processed or ignored. 4824 o Specified how a=imageattr should be used. 4826 Changes in draft-10: 4828 o Described video size negotiation with imageattr. 4830 o Clarified rejection of sections that do not have mux-only. 4832 o Add handling of LS groups. 4834 Changes in draft-09: 4836 o Don't return null for {local,remote}Description after close(). 4838 o Changed TCP/TLS to UDP/DTLS in RTP profile names. 4840 o Separate out bundle and mux policy. 4842 o Added specific references to FEC mechanisms. 4844 o Added canTrickle mechanism. 4846 o Added section on subsequent answers and answer options. 4848 o Added text defining set{Local,Remote}Description behavior. 4850 Changes in draft-08: 4852 o Added new example section and removed old examples in appendix. 4854 o Fixed field handling. 4856 o Added text describing a=rtcp attribute. 4858 o Reworked handling of OfferToReceiveAudio and OfferToReceiveVideo 4859 per discussion at IETF 90. 4861 o Reworked trickle ICE handling and its impact on m= and c= lines 4862 per discussion at interim. 4864 o Added max-bundle-and-rtcp-mux policy. 4866 o Added description of maxptime handling. 4868 o Updated ICE candidate pool default to 0. 4870 o Resolved open issues around AppID/receiver-ID.
4872 o Reworked and expanded how changes to the ICE configuration are 4873 handled. 4875 o Some reference updates. 4877 o Editorial clarification. 4879 Changes in draft-07: 4881 o Expanded discussion of VAD and Opus DTX. 4883 o Added a security considerations section. 4885 o Rewrote the section on modifying SDP to require implementations to 4886 clearly indicate whether any given modification is allowed. 4888 o Clarified impact of IceRestart on CreateOffer in local-offer 4889 state. 4891 o Guidance on whether attributes should be defined at the media 4892 level or the session level. 4894 o Renamed "default" bundle policy to "balanced". 4896 o Removed default ICE candidate pool size and clarify how it works. 4898 o Defined a canonical order for assignment of MSTs to m= lines. 4900 o Removed discussion of rehydration. 4902 o Added Eric Rescorla as a draft editor. 4904 o Cleaned up references. 4906 o Editorial cleanup 4908 Changes in draft-06: 4910 o Reworked handling of m= line recycling. 4912 o Added handling of BUNDLE and bundle-only. 4914 o Clarified handling of rollback. 4916 o Added text describing the ICE Candidate Pool and its behavior. 4918 o Allowed OfferToReceiveX to create multiple recvonly m= sections. 4920 Changes in draft-05: 4922 o Fixed several issues identified in the createOffer/Answer sections 4923 during document review. 4925 o Updated references. 4927 Changes in draft-04: 4929 o Filled in sections on createOffer and createAnswer. 4931 o Added SDP examples. 4933 o Fixed references. 4935 Changes in draft-03: 4937 o Added text describing relationship to W3C specification 4939 Changes in draft-02: 4941 o Converted from nroff 4943 o Removed comparisons to old approaches abandoned by the working 4944 group 4946 o Removed stuff that has moved to W3C specification 4948 o Align SDP handling with W3C draft 4950 o Clarified section on forking. 4952 Changes in draft-01: 4954 o Added diagrams for architecture and state machine. 
4956 o Added sections on forking and rehydration. 4958 o Clarified meaning of "pranswer" and "answer". 4960 o Reworked how ICE restarts and media directions are controlled. 4962 o Added list of parameters that can be changed in a description. 4964 o Updated suggested API and examples to match latest thinking. 4966 o Suggested API and examples have been moved to an appendix. 4968 Changes in draft -00: 4970 o Migrated from draft-uberti-rtcweb-jsep-02. 4972 Authors' Addresses 4974 Justin Uberti 4975 Google 4976 747 6th St S 4977 Kirkland, WA 98033 4978 USA 4980 Email: justin@uberti.name 4982 Cullen Jennings 4983 Cisco 4984 400 3rd Avenue SW 4985 Calgary, AB T2P 4H2 4986 Canada 4988 Email: fluffy@iii.ca 4990 Eric Rescorla (editor) 4991 Mozilla 4992 331 Evelyn Ave 4993 Mountain View, CA 94041 4994 USA 4996 Email: ekr@rtfm.com