idnits 2.17.1 draft-ietf-rtcweb-jsep-17.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- == There are 25 instances of lines with non-RFC6890-compliant IPv4 addresses in the document. If these are example addresses, they should be changed. == There are 10 instances of lines with private range IPv4 addresses in the document. If these are generic example addresses, they should be changed to use any of the ranges defined in RFC 6890 (or successor): 192.0.2.x, 198.51.100.x or 203.0.113.x. -- The document has examples using IPv4 documentation addresses according to RFC6890, but does not use any IPv6 documentation addresses. Maybe there should be IPv6 examples, too? Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (October 21, 2016) is 2745 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Looks like a reference, but probably isn't: '0' on line 766 == Unused Reference: 'I-D.ietf-rtcweb-audio' is defined on line 3853, but no explicit reference was found in the text == Unused Reference: 'I-D.ietf-rtcweb-video' is defined on line 3877, but no explicit reference was found in the text == Outdated reference: A later version (-09) exists of draft-ietf-avtext-rid-00 == Outdated reference: A later version (-13) exists of draft-ietf-mmusic-4572-update-05 == Outdated reference: A later version (-32) exists of draft-ietf-mmusic-dtls-sdp-14 == Outdated reference: A later version (-17) exists of draft-ietf-mmusic-msid-01 == Outdated reference: A later version (-12) exists of draft-ietf-mmusic-mux-exclusive-08 == Outdated reference: A later version (-15) exists of draft-ietf-mmusic-rid-04 == Outdated reference: A later version (-26) exists of draft-ietf-mmusic-sctp-sdp-04 == Outdated reference: A later version (-54) exists of draft-ietf-mmusic-sdp-bundle-negotiation-04 == Outdated reference: A later version (-19) exists of draft-ietf-mmusic-sdp-mux-attributes-01 == Outdated reference: A later version (-14) exists of draft-ietf-mmusic-sdp-simulcast-04 == Outdated reference: A later version (-11) exists of draft-ietf-rtcweb-audio-02 == Outdated reference: A later version (-10) exists of draft-ietf-rtcweb-fec-00 == Outdated reference: A later version (-26) exists of draft-ietf-rtcweb-rtp-usage-09 == Outdated reference: A later version (-12) exists of draft-ietf-rtcweb-security-06 == Outdated reference: A later version (-20) exists of draft-ietf-rtcweb-security-arch-09 == Outdated reference: A later version (-06) exists of draft-ietf-rtcweb-video-00 -- No information found for draft-nandakumar-mmusic-proto-iana-registration 
- is the name correct? -- Possible downref: Normative reference to a draft: ref. 'I-D.nandakumar-mmusic-proto-iana-registration' ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866) ** Obsolete normative reference: RFC 4572 (Obsoleted by RFC 8122) ** Obsolete normative reference: RFC 5245 (Obsoleted by RFC 8445, RFC 8839) ** Obsolete normative reference: RFC 5285 (Obsoleted by RFC 8285) ** Obsolete normative reference: RFC 6347 (Obsoleted by RFC 9147) == Outdated reference: A later version (-12) exists of draft-ietf-rtcweb-ip-handling-01 == Outdated reference: A later version (-08) exists of draft-nandakumar-rtcweb-sdp-02 Summary: 5 errors (**), 0 flaws (~~), 23 warnings (==), 5 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group J. Uberti 3 Internet-Draft Google 4 Intended status: Standards Track C. Jennings 5 Expires: April 24, 2017 Cisco 6 E. Rescorla, Ed. 7 Mozilla 8 October 21, 2016 10 Javascript Session Establishment Protocol 11 draft-ietf-rtcweb-jsep-17 13 Abstract 15 This document describes the mechanisms for allowing a Javascript 16 application to control the signaling plane of a multimedia session 17 via the interface specified in the W3C RTCPeerConnection API, and 18 discusses how this relates to existing signaling protocols. 20 Status of This Memo 22 This Internet-Draft is submitted in full conformance with the 23 provisions of BCP 78 and BCP 79. 25 Internet-Drafts are working documents of the Internet Engineering 26 Task Force (IETF). Note that other groups may also distribute 27 working documents as Internet-Drafts. The list of current Internet- 28 Drafts is at http://datatracker.ietf.org/drafts/current/. 30 Internet-Drafts are draft documents valid for a maximum of six months 31 and may be updated, replaced, or obsoleted by other documents at any 32 time. 
It is inappropriate to use Internet-Drafts as reference 33 material or to cite them other than as "work in progress." 35 This Internet-Draft will expire on April 24, 2017. 37 Copyright Notice 39 Copyright (c) 2016 IETF Trust and the persons identified as the 40 document authors. All rights reserved. 42 This document is subject to BCP 78 and the IETF Trust's Legal 43 Provisions Relating to IETF Documents 44 (http://trustee.ietf.org/license-info) in effect on the date of 45 publication of this document. Please review these documents 46 carefully, as they describe your rights and restrictions with respect 47 to this document. Code Components extracted from this document must 48 include Simplified BSD License text as described in Section 4.e of 49 the Trust Legal Provisions and are provided without warranty as 50 described in the Simplified BSD License. 52 Table of Contents
54 1. Introduction
55 1.1. General Design of JSEP
56 1.2. Other Approaches Considered
57 2. Terminology
58 3. Semantics and Syntax
59 3.1. Signaling Model
60 3.2. Session Descriptions and State Machine
61 3.3. Session Description Format
62 3.4. Session Description Control
63 3.4.1. RtpTransceivers
64 3.4.2. RtpSenders
65 3.4.3. RtpReceivers
66 3.5. ICE
67 3.5.1. ICE Gathering Overview
68 3.5.2. ICE Candidate Trickling
69 3.5.2.1. ICE Candidate Format
70 3.5.3. ICE Candidate Policy
71 3.5.4. ICE Candidate Pool
72 3.6. Video Size Negotiation
73 3.6.1. Creating an imageattr Attribute
74 3.6.2. Interpreting an imageattr Attribute
75 3.7. Simulcast
76 3.8. Interactions With Forking
77 3.8.1. Sequential Forking
78 3.8.2. Parallel Forking
79 4. Interface
80 4.1. PeerConnection
81 4.1.1. Constructor
82 4.1.2. addTrack
83 4.1.3. addTransceiver
84 4.1.4. createDataChannel
85 4.1.5. createOffer
86 4.1.6. createAnswer
87 4.1.7. SessionDescriptionType
88 4.1.7.1. Use of Provisional Answers
89 4.1.7.2. Rollback
90 4.1.8. setLocalDescription
91 4.1.9. setRemoteDescription
92 4.1.10. currentLocalDescription
93 4.1.11. pendingLocalDescription
94 4.1.12. currentRemoteDescription
95 4.1.13. pendingRemoteDescription
96 4.1.14. canTrickleIceCandidates
97 4.1.15. setConfiguration
98 4.1.16. addIceCandidate
99 4.2. RtpTransceiver
100 4.2.1. stop
101 4.2.2. stopped
102 4.2.3. setDirection
103 4.2.4. setCodecPreferences
104 5. SDP Interaction Procedures
105 5.1. Requirements Overview
106 5.1.1. Implementation Requirements
107 5.1.2. Usage Requirements
108 5.1.3. Profile Names and Interoperability
109 5.2. Constructing an Offer
110 5.2.1. Initial Offers
111 5.2.2. Subsequent Offers
112 5.2.3. Options Handling
113 5.2.3.1. IceRestart
114 5.2.3.2. VoiceActivityDetection
115 5.3. Generating an Answer
116 5.3.1. Initial Answers
117 5.3.2. Subsequent Answers
118 5.3.3. Options Handling
119 5.3.3.1. VoiceActivityDetection
120 5.4. Modifying an Offer or Answer
121 5.5. Processing a Local Description
122 5.6. Processing a Remote Description
123 5.7. Parsing a Session Description
124 5.7.1. Session-Level Parsing
125 5.7.2. Media Section Parsing
126 5.7.3. Semantics Verification
127 5.8. Applying a Local Description
128 5.9. Applying a Remote Description
129 5.10. Applying an Answer
130 6. Processing RTP/RTCP packets
131 7. Examples
132 7.1. Simple Example
133 7.2. Normal Examples
134 8. Security Considerations
135 9. IANA Considerations
136 10. Acknowledgements
137 11. References
138 11.1. Normative References
139 11.2. Informative References
140 Appendix A. Appendix A
141 Appendix B. Change log
142 Authors' Addresses
144 1. Introduction 146 This document describes how the W3C WEBRTC RTCPeerConnection 147 interface [W3C.WD-webrtc-20140617] is used to control the setup, 148 management and teardown of a multimedia session. 150 1.1. General Design of JSEP 152 The thinking behind WebRTC call setup has been to fully specify and 153 control the media plane, but to leave the signaling plane up to the 154 application as much as possible. The rationale is that different 155 applications may prefer to use different protocols, such as the 156 existing SIP or Jingle call signaling protocols, or something custom 157 to the particular application, perhaps for a novel use case. In this 158 approach, the key information that needs to be exchanged is the 159 multimedia session description, which specifies the 160 transport and media configuration information necessary to establish 161 the media plane.
163 With these considerations in mind, this document describes the 164 Javascript Session Establishment Protocol (JSEP) that allows for full 165 control of the signaling state machine from Javascript. JSEP removes 166 the browser almost entirely from the core signaling flow, which is 167 instead handled by the Javascript making use of two interfaces: (1) 168 passing in local and remote session descriptions and (2) interacting 169 with the ICE state machine. 171 In this document, the use of JSEP is described as if it always occurs 172 between two browsers. Note, though, that in many cases it will actually be 173 between a browser and some kind of server, such as a gateway or MCU. 174 This distinction is invisible to the browser; it just follows the 175 instructions it is given via the API. 177 JSEP's handling of session descriptions is simple and 178 straightforward. Whenever an offer/answer exchange is needed, the 179 initiating side creates an offer by calling the createOffer() API. The 180 application optionally modifies that offer, and then uses it to set 181 up its local config via the setLocalDescription() API. The offer is 182 then sent off to the remote side over its preferred signaling 183 mechanism (e.g., WebSockets); upon receipt of that offer, the remote 184 party installs it using the setRemoteDescription() API. 186 To complete the offer/answer exchange, the remote party uses the 187 createAnswer() API to generate an appropriate answer, applies it 188 using the setLocalDescription() API, and sends the answer back to the 189 initiator over the signaling channel. When the initiator gets that 190 answer, it installs it using the setRemoteDescription() API, and 191 initial setup is complete. This process can be repeated for 192 additional offer/answer exchanges.
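The exchange described above can be sketched in Javascript as follows. This is a minimal illustration, not normative text: `pc` is assumed to behave like an RTCPeerConnection, and `send` is a hypothetical application-defined function that delivers a message over whatever signaling channel the application has chosen. The function names `sendOffer`, `handleOffer`, and `handleAnswer` are likewise invented for this example.

```javascript
// Sketch of one JSEP offer/answer round trip.
// `pc` is expected to behave like an RTCPeerConnection.
// `send` is a hypothetical, application-defined signaling function.

// Initiating side: create the offer, apply it locally, then signal it.
async function sendOffer(pc, send) {
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  send({ type: offer.type, sdp: offer.sdp });
}

// Remote side: install the received offer, then generate, apply,
// and signal the answer.
async function handleOffer(pc, offer, send) {
  await pc.setRemoteDescription(offer);
  const answer = await pc.createAnswer();
  await pc.setLocalDescription(answer);
  send({ type: answer.type, sdp: answer.sdp });
}

// Initiating side again: installing the answer completes the exchange.
async function handleAnswer(pc, answer) {
  await pc.setRemoteDescription(answer);
}
```

The application never inspects the SDP here; it only moves descriptions between the browser and the signaling channel, in the order the text prescribes.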
194 Regarding ICE [RFC5245], JSEP decouples the ICE state machine from 195 the overall signaling state machine, as the ICE state machine must 196 remain in the browser, because only the browser has the necessary 197 knowledge of candidates and other transport info. Performing this 198 separation also provides additional flexibility; in protocols that 199 decouple session descriptions from transport, such as Jingle, the 200 session description can be sent immediately and the transport 201 information can be sent when available. In protocols that don't, 202 such as SIP, the information can be used in the aggregated form. 203 Sending transport information separately can allow for faster ICE and 204 DTLS startup, since ICE checks can start as soon as any transport 205 information is available rather than waiting for all of it. 207 Through its abstraction of signaling, the JSEP approach does require 208 the application to be aware of the signaling process. While the 209 application does not need to understand the contents of session 210 descriptions to set up a call, the application must call the right 211 APIs at the right times, convert the session descriptions and ICE 212 information into the defined messages of its chosen signaling 213 protocol, and perform the reverse conversion on the messages it 214 receives from the other side. 216 One way to mitigate this is to provide a Javascript library that 217 hides this complexity from the developer; said library would 218 implement a given signaling protocol along with its state machine and 219 serialization code, presenting a higher-level call-oriented interface 220 to the application developer. For example, libraries exist to adapt 221 the JSEP API into an API suitable for SIP or XMPP. Thus, JSEP 222 provides greater control for the experienced developer without 223 forcing any additional complexity on the novice developer. 225 1.2.
Other Approaches Considered 227 One approach that was considered instead of JSEP was to include a 228 lightweight signaling protocol. Instead of providing session 229 descriptions to the API, the API would produce and consume messages 230 from this protocol. While providing a more high-level API, this put 231 more control of signaling within the browser, forcing the browser to 232 have to understand and handle concepts like signaling glare. In 233 addition, it prevented the application from driving the state machine 234 to a desired state, as is needed in the page reload case. 236 A second approach that was considered but not chosen was to decouple 237 the management of the media control objects from session 238 descriptions, instead offering APIs that would control each component 239 directly. This was rejected based on a feeling that requiring 240 exposure of this level of complexity to the application programmer 241 would not be beneficial; it would result in an API where even a 242 simple example would require a significant amount of code to 243 orchestrate all the needed interactions, as well as creating a large 244 API surface that needed to be agreed upon and documented. In 245 addition, these API points could be called in any order, resulting in 246 a more complex set of interactions with the media subsystem than the 247 JSEP approach, which specifies how session descriptions are to be 248 evaluated and applied. 250 One variation on JSEP that was considered was to keep the basic 251 session description-oriented API, but to move the mechanism for 252 generating offers and answers out of the browser. Instead of 253 providing createOffer/createAnswer methods within the browser, this 254 approach would instead expose a getCapabilities API which would 255 provide the application with the information it needed in order to 256 generate its own session descriptions. 
This increases the amount of 257 work that the application needs to do; it needs to know how to 258 generate session descriptions from capabilities, and especially how 259 to generate the correct answer from an arbitrary offer and the 260 supported capabilities. While this could certainly be addressed by 261 using a library like the one mentioned above, it basically forces the 262 use of said library even for a simple example. Providing 263 createOffer/createAnswer avoids this problem, but still allows 264 applications to generate their own offers/answers (to a large extent) 265 if they choose, using the description generated by createOffer as an 266 indication of the browser's capabilities. 268 2. Terminology 270 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 271 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 272 document are to be interpreted as described in [RFC2119]. 274 3. Semantics and Syntax 276 3.1. Signaling Model 278 JSEP does not specify a particular signaling model or state machine, 279 other than the generic need to exchange session descriptions in the 280 fashion described by [RFC3264] (offer/answer) in order for both sides 281 of the session to know how to conduct the session. JSEP provides 282 mechanisms to create offers and answers, as well as to apply them to 283 a session. However, the browser is totally decoupled from the actual 284 mechanism by which these offers and answers are communicated to the 285 remote side, including addressing, retransmission, forking, and glare 286 handling. These issues are left entirely up to the application; the 287 application has complete control over which offers and answers get 288 handed to the browser, and when.
290        +-----------+                               +-----------+
291        |  Web App  |<--- App-Specific Signaling -->|  Web App  |
292        +-----------+                               +-----------+
293              ^                                            ^
294              |  SDP                                       |  SDP
295              V                                            V
296        +-----------+                                +-----------+
297        |  Browser  |<----------- Media ------------>|  Browser  |
298        +-----------+                                +-----------+
300 Figure 1: JSEP Signaling Model 302 3.2. Session Descriptions and State Machine 304 In order to establish the media plane, the user agent needs specific 305 parameters to indicate what to transmit to the remote side, as well 306 as how to handle the media that is received. These parameters are 307 determined by the exchange of session descriptions in offers and 308 answers, and there are certain details to this process that must be 309 handled in the JSEP APIs. 311 Whether a session description applies to the local side or the remote 312 side affects the meaning of that description. For example, the list 313 of codecs sent to a remote party indicates what the local side is 314 willing to receive, which, when intersected with the set of codecs 315 the remote side supports, specifies what the remote side should send. 316 However, not all parameters follow this rule; for example, the DTLS- 317 SRTP parameters [RFC5763] sent to a remote party indicate what 318 certificate the local side will use in DTLS setup, and thereby what 319 the remote party should expect to receive; the remote party will have 320 to accept these parameters, with no option to choose different 321 values. 323 In addition, various RFCs put different conditions on the format of 324 offers versus answers. For example, an offer may propose an 325 arbitrary number of media streams (i.e. m= sections), but an answer 326 must contain the exact same number as the offer. 328 Lastly, while the exact media parameters are known only after an 329 offer and an answer have been exchanged, it is possible for the 330 offerer to receive media after they have sent an offer and before 331 they have received an answer.
To properly process incoming media in 332 this case, the offerer's media handler must be aware of the details 333 of the offer before the answer arrives. 335 Therefore, in order to handle session descriptions properly, the user 336 agent needs: 338 1. To know if a session description pertains to the local or remote 339 side. 341 2. To know if a session description is an offer or an answer. 343 3. To allow the offer to be specified independently of the answer. 345 JSEP addresses this by adding both setLocalDescription and 346 setRemoteDescription methods and having session description objects 347 contain a type field indicating the type of session description being 348 supplied. This satisfies the requirements listed above for both the 349 offerer, who first calls setLocalDescription(sdp [offer]) and then 350 later setRemoteDescription(sdp [answer]), as well as for the 351 answerer, who first calls setRemoteDescription(sdp [offer]) and then 352 later setLocalDescription(sdp [answer]). 354 JSEP also allows for an answer to be treated as provisional by the 355 application. Provisional answers provide a way for an answerer to 356 communicate initial session parameters back to the offerer, in order 357 to allow the session to begin, while allowing a final answer to be 358 specified later. This concept of a final answer is important to the 359 offer/answer model; when such an answer is received, any extra 360 resources allocated by the caller can be released, now that the exact 361 session configuration is known. These "resources" can include things 362 like extra ICE components, TURN candidates, or video decoders. 363 Provisional answers, on the other hand, trigger no such 364 deallocation; as a result, multiple dissimilar provisional answers can be 365 received and applied during call setup.
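The call ordering above, together with provisional answers, amounts to a small state machine (drawn in Figure 2). As a rough sketch only, the transitions in that figure can be written as a lookup table. The lowercase state names mirror the figure's labels, while `TRANSITIONS` and `nextState` are names invented for this example; rollback is not modeled.

```javascript
// Transition table for the JSEP signaling state machine of Figure 2.
// Keys have the form "<setLocal|setRemote>:<description type>".
// State names are lowercase versions of the labels in the figure.
const TRANSITIONS = {
  "stable": {
    "setLocal:offer": "local-offer",
    "setRemote:offer": "remote-offer",
  },
  "local-offer": {
    "setLocal:offer": "local-offer",
    "setRemote:pranswer": "remote-pranswer",
    "setRemote:answer": "stable",
  },
  "remote-pranswer": {
    "setRemote:pranswer": "remote-pranswer",
    "setRemote:answer": "stable",
  },
  "remote-offer": {
    "setRemote:offer": "remote-offer",
    "setLocal:pranswer": "local-pranswer",
    "setLocal:answer": "stable",
  },
  "local-pranswer": {
    "setLocal:pranswer": "local-pranswer",
    "setLocal:answer": "stable",
  },
};

// Returns the state reached by applying `op` with a description of
// the given `type`, or throws if the figure has no such transition.
function nextState(state, op, type) {
  const next = (TRANSITIONS[state] || {})[`${op}:${type}`];
  if (next === undefined) {
    throw new Error(`${op}(${type}) is not valid in state ${state}`);
  }
  return next;
}
```

For the offerer, `nextState("stable", "setLocal", "offer")` gives `"local-offer"`, and a later `setRemote` with an answer returns to `"stable"`; applying a local answer while in `"stable"` throws, since no offer is outstanding.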
367 In [RFC3264], the constraint at the signaling level is that only one 368 offer can be outstanding for a given session, but at the media stack 369 level, a new offer can be generated at any point. For example, when 370 using SIP for signaling, if one offer is sent, then cancelled using a 371 SIP CANCEL, another offer can be generated even though no answer was 372 received for the first offer. To support this, the JSEP media layer 373 can provide an offer via the createOffer() method whenever the 374 Javascript application needs one for the signaling. The answerer can 375 send back zero or more provisional answers, and finally end the 376 offer-answer exchange by sending a final answer. The state machine 377 for this is as follows:
379                        setRemote(OFFER)               setLocal(PRANSWER)
380                            /-----\                               /-----\
381                            |     |                               |     |
382                            v     |                               v     |
383             +---------------+    |                +---------------+    |
384             |               |----/                |               |----/
385             |               | setLocal(PRANSWER)  |               |
386             |  Remote-Offer |------------------- >| Local-Pranswer|
387             |               |                     |               |
388             |               |                     |               |
389             +---------------+                     +---------------+
390                  ^   |                                   |
391                  |   | setLocal(ANSWER)                  |
392    setRemote(OFFER)  |                                   |
393                  |   V                  setLocal(ANSWER) |
394             +---------------+                            |
395             |               |                            |
396             |               |<---------------------------+
397             |    Stable     |
398             |               |<---------------------------+
399             |               |                            |
400             +---------------+          setRemote(ANSWER) |
401                  ^   |                                   |
402                  |   | setLocal(OFFER)                   |
403   setRemote(ANSWER)  |                                   |
404                  |   V                                   |
405             +---------------+                     +---------------+
406             |               |                     |               |
407             |               | setRemote(PRANSWER) |               |
408             |  Local-Offer  |------------------- >|Remote-Pranswer|
409             |               |                     |               |
410             |               |----\                |               |----\
411             +---------------+    |                +---------------+    |
412                            ^     |                               ^     |
413                            |     |                               |     |
414                            \-----/                               \-----/
415                        setLocal(OFFER)               setRemote(PRANSWER)
417 Figure 2: JSEP State Machine 419 Aside from these state transitions there is no other difference 420 between the handling of provisional ("pranswer") and final ("answer") 421 answers. 423 3.3.
Session Description Format 425 In the WebRTC specification, session descriptions are formatted as 426 SDP messages. While this format is not optimal for manipulation from 427 Javascript, it is widely accepted, and frequently updated with new 428 features. Any alternate encoding of session descriptions would have 429 to keep pace with the changes to SDP, at least until the time that 430 this new encoding eclipsed SDP in popularity. As a result, JSEP 431 currently uses SDP as the internal representation for its session 432 descriptions. 434 However, to simplify Javascript processing, and provide for future 435 flexibility, the SDP syntax is encapsulated within a 436 SessionDescription object, which can be constructed from SDP, and be 437 serialized out to SDP. If future specifications agree on a JSON 438 format for session descriptions, we could easily enable this object 439 to generate and consume that JSON. 441 Other methods may be added to SessionDescription in the future to 442 simplify handling of SessionDescriptions from Javascript. In the 443 meantime, Javascript libraries can be used to perform these 444 manipulations. 446 Note that most applications should be able to treat the 447 SessionDescriptions produced and consumed by these various API calls 448 as opaque blobs; that is, the application will not need to read or 449 change them. 451 3.4. Session Description Control 453 In order to give the application control over various common session 454 parameters, JSEP provides control surfaces which tell the browser how 455 to generate session descriptions. This avoids the need for 456 Javascript to modify session descriptions in most cases. 458 Changes to these objects result in changes to the session 459 descriptions generated by subsequent createOffer/Answer calls. 461 3.4.1. RtpTransceivers 463 RtpTransceivers allow the application to control the RTP media 464 associated with one m= section. 
Each RtpTransceiver has an RtpSender 465 and an RtpReceiver, which an application can use to control the 466 sending and receiving of RTP media. The application may also modify 467 the RtpTransceiver directly, for instance, by stopping it. 469 RtpTransceivers generally have a 1:1 mapping with m= sections, 470 although there may be more RtpTransceivers than m= sections when 471 RtpTransceivers are created but not yet associated with a m= section, 472 or if RtpTransceivers have been stopped and disassociated from m= 473 sections. An RtpTransceiver is never associated with more than one 474 m= section, and once a session description is applied, a m= section 475 is always associated with exactly one RtpTransceiver. 477 RtpTransceivers can be created explicitly by the application or 478 implicitly by calling setRemoteDescription with an offer that adds 479 new m= sections. 481 3.4.2. RtpSenders 483 RtpSenders allow the application to control how RTP media is sent. 485 3.4.3. RtpReceivers 487 RtpReceivers allow the application to control how RTP media is 488 received. 490 3.5. ICE 492 3.5.1. ICE Gathering Overview 494 JSEP gathers ICE candidates as needed by the application. Collection 495 of ICE candidates is referred to as a gathering phase, and this is 496 triggered either by the addition of a new or recycled m= line to the 497 local session description, or by new ICE credentials in the description, 498 indicating an ICE restart. Use of new ICE credentials can be 499 triggered explicitly by the application, or implicitly by the browser 500 in response to changes in the ICE configuration. 502 When the ICE configuration changes in a way that requires a new 503 gathering phase, a 'needs-ice-restart' bit is set. When this bit is 504 set, calls to the createOffer API will generate new ICE credentials.
505 This bit is cleared by a call to the setLocalDescription API with new 506 ICE credentials from either an offer or an answer, i.e., from either 507 a local- or remote-initiated ICE restart. 509 When a new gathering phase starts, the ICE Agent will notify the 510 application that gathering is occurring through an event. Then, when 511 each new ICE candidate becomes available, the ICE Agent will supply 512 it to the application via an additional event; these candidates will 513 also automatically be added to the current and/or pending local 514 session description. Finally, when all candidates have been 515 gathered, an event will be dispatched to signal that the gathering 516 process is complete. 518 Note that gathering phases only gather the candidates needed by 519 new/recycled/restarting m= lines; other m= lines continue to use 520 their existing candidates. Also, when bundling is active, candidates 521 are only gathered (and exchanged) for the m= lines referenced in 522 BUNDLE-tags, as described in 523 [I-D.ietf-mmusic-sdp-bundle-negotiation]. 525 3.5.2. ICE Candidate Trickling 527 Candidate trickling is a technique through which a caller may 528 incrementally provide candidates to the callee after the initial 529 offer has been dispatched; the semantics of "Trickle ICE" are defined 530 in [I-D.ietf-ice-trickle]. This process allows the callee to begin 531 acting upon the call and setting up the ICE (and perhaps DTLS) 532 connections immediately, without having to wait for the caller to 533 gather all possible candidates. This results in faster media setup 534 in cases where gathering is not performed prior to initiating the 535 call. 537 JSEP supports optional candidate trickling by providing APIs, as 538 described above, that provide control and feedback on the ICE 539 candidate gathering process. 
Applications that support candidate 540 trickling can send the initial offer immediately and send individual 541 candidates when they are notified of a new candidate; 542 applications that do not support this feature can simply wait for the 543 indication that gathering is complete, and then create and send their 544 offer, with all the candidates, at this time. 546 Upon receipt of trickled candidates, the receiving application will 547 supply them to its ICE Agent. This triggers the ICE Agent to start 548 using the new remote candidates for connectivity checks. 550 3.5.2.1. ICE Candidate Format 552 As with session descriptions, the syntax of the IceCandidate object 553 provides some abstraction, but can be easily converted to and from 554 the SDP candidate lines. 556 The candidate lines are the only SDP information that is contained 557 within IceCandidate, as they represent the only information needed 558 that is not present in the initial offer (i.e., for trickle 559 candidates). This information is carried with the same syntax as the 560 "candidate-attribute" field defined for ICE. For example:
562 candidate:1 1 UDP 1694498815 192.0.2.33 10000 typ host
563 The IceCandidate object also contains fields to indicate which m= 564 line it should be associated with. The m= line can be identified in 565 one of two ways; either by a m= line index, or a MID. The m= line 566 index is a zero-based index, with index N referring to the N+1th m= 567 line in the SDP sent by the entity which sent the IceCandidate. The 568 MID uses the "media stream identification" attribute, as defined in 569 [RFC5888], Section 4, to identify the m= line. JSEP implementations 570 creating an ICE Candidate object MUST populate both of these fields, 571 using the MID of the associated RtpTransceiver object (which may be 572 locally generated by the answerer when interacting with a non-JSEP 573 remote endpoint that does not support the MID attribute, as discussed 574 in Section 5.9 below).
Implementations receiving an ICE Candidate 575 object MUST use the MID if present, or the m= line index, if not (the 576 non-JSEP remote endpoint case). 578 3.5.3. ICE Candidate Policy 580 Typically, when gathering ICE candidates, the browser will gather all 581 possible forms of initial candidates - host, server reflexive, and 582 relay. However, in certain cases, applications may want to have more 583 specific control over the gathering process, due to privacy or 584 related concerns. For example, one may want to suppress the use of 585 host candidates, to avoid exposing information about the local 586 network, or go as far as only using relay candidates, to leak as 587 little location information as possible (note that these choices come 588 with corresponding operational costs). To accomplish this, the 589 browser MUST allow the application to restrict which ICE candidates 590 are used in a session. Note that this filtering is applied on top of 591 any restrictions the browser chooses to enforce regarding which IP 592 addresses are permitted for the application, as discussed in 593 [I-D.ietf-rtcweb-ip-handling]. 595 There may also be cases where the application wants to change which 596 types of candidates are used while the session is active. A prime 597 example is where a callee may initially want to use only relay 598 candidates, to avoid leaking location information to an arbitrary 599 caller, but then change to use all candidates (for lower operational 600 cost) once the user has indicated they want to take the call. For 601 this scenario, the browser MUST allow the candidate policy to be 602 changed in mid-session, subject to the aforementioned interactions 603 with local policy. 605 To administer the ICE candidate policy, the browser will determine 606 the current setting at the start of each gathering phase. 
Then, 607 during the gathering phase, the browser MUST NOT expose candidates 608 disallowed by the current policy to the application, use them as the 609 source of connectivity checks, or indirectly expose them via other 610 fields, such as the raddr/rport attributes for other ICE candidates. 612 Later, if a different policy is specified by the application, the 613 application can apply it by kicking off a new gathering phase via an 614 ICE restart. 616 3.5.4. ICE Candidate Pool 618 JSEP applications typically inform the browser to begin ICE gathering 619 via the information supplied to setLocalDescription, as this is where 620 the app specifies the number of media streams, and thereby ICE 621 components, for which to gather candidates. However, to accelerate 622 cases where the application knows the number of ICE components to use 623 ahead of time, it may ask the browser to gather a pool of potential 624 ICE candidates to help ensure rapid media setup. 626 When setLocalDescription is eventually called, and the browser goes 627 to gather the needed ICE candidates, it SHOULD start by checking if 628 any candidates are available in the pool. If there are candidates in 629 the pool, they SHOULD be handed to the application immediately via 630 the ICE candidate event. If the pool becomes depleted, either 631 because a larger-than-expected number of ICE components is used, or 632 because the pool has not had enough time to gather candidates, the 633 remaining candidates are gathered as usual. 635 One example of where this concept is useful is an application that 636 expects an incoming call at some point in the future, and wants to 637 minimize the time it takes to establish connectivity, to avoid 638 clipping of initial media. By pre-gathering candidates into the 639 pool, it can exchange and start sending connectivity checks from 640 these candidates almost immediately upon receipt of a call. 
Note 641 though that by holding on to these pre-gathered candidates, which 642 will be kept alive as long as they may be needed, the application 643 will consume resources on the STUN/TURN servers it is using. 645 3.6. Video Size Negotiation 647 Video size negotiation is the process through which a receiver can 648 use the "a=imageattr" SDP attribute [RFC6236] to indicate what video 649 frame sizes it is capable of receiving. A receiver may have hard 650 limits on what its video decoder can process, or it may wish to 651 constrain what it receives due to application preferences, e.g. a 652 specific size for the window in which the video will be displayed. 654 Note that certain codecs support transmission of samples with aspect 655 ratios other than 1.0 (i.e., non-square pixels). JSEP 656 implementations will not transmit non-square pixels, but SHOULD 657 receive and render such video with the correct aspect ratio. 658 However, sample aspect ratio has no impact on the size negotiation 659 described below; all dimensions assume square pixels. 661 3.6.1. Creating an imageattr Attribute 663 In order to determine the limits on what video resolution a receiver 664 wants to receive, it will intersect its decoder hard limits with any 665 mandatory constraints that have been applied to the associated 666 MediaStreamTrack. If the decoder limits are unknown, e.g. when using 667 a software decoder, the mandatory constraints are used directly. For 668 the answerer, these mandatory constraints can be applied to the 669 remote MediaStreamTracks that are created by a setRemoteDescription 670 call, and will affect the output of the ensuing createAnswer call. 671 Any constraints set after setLocalDescription is used to set the 672 answer will result in a new offer-answer exchange. For the offerer, 673 because it does not know about any remote MediaStreamTracks until it 674 receives the answer, the offer can only reflect decoder hard limits. 
675 If the offerer wishes to set mandatory constraints on video 676 resolution, it must do so after receiving the answer, and the result 677 will be a new offer-answer exchange to communicate them. 679 If there are no known decoder limits or mandatory constraints, the 680 "a=imageattr" attribute SHOULD be omitted. 682 Otherwise, an "a=imageattr" attribute is created with "recv" 683 direction, and the resulting resolution space formed by intersecting 684 the decoder limits and constraints is used to specify its minimum and 685 maximum x= and y= values. If the intersection is the null set, i.e., 686 there are no resolutions that are permitted by both the decoder and 687 the mandatory constraints, this SHOULD be represented by x=0 and y=0 688 values. 690 The rules here express a single set of preferences, and therefore, 691 the "a=imageattr" q= value is not important. It SHOULD be set to 692 1.0. 694 The "a=imageattr" field is payload type specific. When all video 695 codecs supported have the same capabilities, use of a single 696 attribute, with the wildcard payload type (*), is RECOMMENDED. 697 However, when the supported video codecs have differing capabilities, 698 specific "a=imageattr" attributes MUST be inserted for each payload 699 type. 701 As an example, consider a system with an HD-capable, multiformat video 702 decoder, where the application has constrained the received track to 703 at most 360p. In this case, the implementation would generate this 704 attribute: 706 a=imageattr:* recv [x=[16:640],y=[16:360],q=1.0] 707 This declaration indicates that the receiver is capable of decoding 708 any image resolution from 16x16 up to 640x360 pixels. 710 3.6.2. Interpreting an imageattr Attribute 712 [RFC6236] defines "a=imageattr" to be an advisory field. This means 713 that it does not absolutely constrain the video formats that the 714 sender can use, but gives an indication of the preferred values. 716 This specification prescribes more specific behavior.
When a sender 717 of a given MediaStreamTrack, which is producing video of a certain 718 resolution, receives an "a=imageattr recv" attribute, it MUST check 719 to see if the original resolution meets the size criteria specified 720 in the attribute, and adapt the resolution accordingly by scaling (if 721 appropriate). Note that when considering a MediaStreamTrack that is 722 producing rotated video, the unrotated resolution MUST be used. This 723 is required regardless of whether the receiver supports performing 724 receive-side rotation (e.g., through CVO), as it significantly 725 simplifies the matching logic. 727 For the purposes of resolution negotiation, only size limits are 728 considered. Any other values, e.g. picture or sample aspect ratio, 729 MUST be ignored. 731 When communicating with a non-JSEP endpoint, multiple relevant 732 "a=imageattr recv" attributes may be received. If this occurs, 733 attributes other than the one with the highest "q=" value MUST be 734 ignored. 736 If an "a=imageattr recv" attribute references a different video codec 737 than what has been selected for the MediaStreamTrack, it MUST be 738 ignored. 740 If the original resolution matches the size limits in the attribute, 741 the track MUST be transmitted untouched. 743 If the original resolution exceeds the size limits in the attribute, 744 the sender SHOULD apply downscaling to the output of the 745 MediaStreamTrack in order to satisfy the limits. Downscaling MUST 746 NOT change the track aspect ratio. 748 If the original resolution is less than the size limits in the 749 attribute, upscaling is needed, but this may not be appropriate in 750 all cases. To address this concern, the application can set an 751 upscaling policy for each sent track. For this case, if upscaling is 752 permitted by policy, the sender SHOULD apply upscaling in order to 753 provide the desired resolution. Otherwise, the sender MUST NOT apply 754 upscaling. 
The sender SHOULD NOT upscale in other cases, even if the 755 policy permits it. Upscaling MUST NOT change the track aspect ratio. 757 If there is no appropriate and permitted scaling mechanism that 758 allows the received size limits to be satisfied, the sender MUST NOT 759 transmit the track. 761 If the attribute includes a "sar=" (sample aspect ratio) value set to 762 something other than "1.0", indicating the receiver wants to receive 763 non-square pixels, this cannot be satisfied and the sender MUST NOT 764 transmit the track. 766 In the special case of receiving a maximum resolution of [0, 0], as 767 described above, the sender MUST NOT transmit the track. 769 3.7. Simulcast 771 JSEP supports simulcast of a MediaStreamTrack, where multiple 772 encodings of the source media can be transmitted within the context 773 of a single m= section. The current JSEP API is designed to allow 774 applications to send simulcasted media but only to receive a single 775 encoding. This allows for multi-user scenarios where each sending 776 client sends multiple encodings to a server, which then, for each 777 receiving client, chooses the appropriate encoding to forward. 779 Applications request support for simulcast by configuring multiple 780 encodings on an RTPSender, which, upon generation of an offer or 781 answer, are indicated in SDP markings on the corresponding m= 782 section, as described below. Receivers that understand simulcast and 783 are willing to receive it will also include SDP markings to indicate 784 their support, and JSEP endpoints will use these markings to 785 determine whether simulcast is permitted for a given RTPSender. If 786 simulcast support is not negotiated, the RTPSender will only use the 787 first configured encoding. 789 Note that the exact simulcast parameters are up to the sending 790 application. 
While the aforementioned SDP markings are provided to 791 ensure the remote side can receive and demux multiple simulcast 792 encodings, the specific resolutions and bitrates to be used for each 793 encoding are purely a send-side decision in JSEP. 795 JSEP currently does not provide an API to configure receipt of 796 simulcast. This means that if simulcast is offered by the remote 797 endpoint, the answer generated by a JSEP endpoint will not indicate 798 support for receipt of simulcast, and as such the remote endpoint 799 will only send a single encoding per m= section. In addition, when 800 the JSEP endpoint is the answerer, the permitted encodings for the 801 RTPSender must be consistent with the offer, but this information is 802 currently not surfaced through any API. This means that established 803 simulcast streams will continue to work through a received re-offer, 804 but setting up initial simulcast by way of a received offer requires 805 out-of-band signaling or SDP inspection. Future versions of this 806 specification may add additional APIs to provide this control. 808 When using JSEP to transmit multiple encodings from an RTPSender, the 809 techniques from [I-D.ietf-mmusic-sdp-simulcast] and 810 [I-D.ietf-mmusic-rid] are used. Specifically, when multiple 811 encodings have been configured for an RTPSender, the m= section for 812 the RTPSender will include an "a=simulcast" attribute, as defined in 813 [I-D.ietf-mmusic-sdp-simulcast], Section 6.2, with a "send" simulcast 814 stream description that lists each desired encoding, and no "recv" 815 simulcast stream description. The m= section will also include an 816 "a=rid" attribute for each encoding, as specified in 817 [I-D.ietf-mmusic-rid], Section 4; the use of RID identifiers allows 818 the individual encodings to be disambiguated even though they are all 819 part of the same m= section. 821 3.8.
Interactions With Forking 823 Some call signaling systems allow various types of forking where an 824 SDP Offer may be provided to more than one device. For example, SIP 825 [RFC3261] defines both a "Parallel Search" and "Sequential Search". 826 Although these are primarily signaling level issues that are outside 827 the scope of JSEP, they do have some impact on the configuration of 828 the media plane that is relevant. When forking happens at the 829 signaling layer, the JavaScript application responsible for the 830 signaling needs to make the decisions about what media should be sent 831 or received at any point in time, as well as which remote endpoint it 832 should communicate with; JSEP is used to make sure the media engine 833 can make the RTP and media perform as required by the application. 834 The basic operations that the applications can have the media engine 835 do are: 837 o Start exchanging media with a given remote peer, but keep all the 838 resources reserved in the offer. 840 o Start exchanging media with a given remote peer, and free any 841 resources in the offer that are not being used. 843 3.8.1. Sequential Forking 845 Sequential forking involves a call being dispatched to multiple 846 remote callees, where each callee can accept the call, but only one 847 active session ever exists at a time; no mixing of received media is 848 performed. 850 JSEP handles sequential forking well, allowing the application to 851 easily control the policy for selecting the desired remote endpoint. 852 When an answer arrives from one of the callees, the application can 853 choose to apply it either as a provisional answer, leaving open the 854 possibility of using a different answer in the future, or apply it as 855 a final answer, ending the setup flow. 857 In a "first-one-wins" situation, the first answer will be applied as 858 a final answer, and the application will reject any subsequent 859 answers. In SIP parlance, this would be ACK + BYE.
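The "first-one-wins" policy described above can be sketched in plain TypeScript with no browser dependency. This is only an illustration under stated assumptions: the FirstOneWinsPolicy class, its method names, and the callee identifiers are hypothetical and are not defined by JSEP. A real application would apply the winning answer through setRemoteDescription and reject the other call legs through its own signaling.

```typescript
// Hypothetical model of a "first-one-wins" sequential forking policy.
// None of these names are defined by JSEP; they are illustrative only.
type AnswerDecision =
  | { action: "apply"; type: "answer" } // apply as a final answer
  | { action: "reject" };               // reject this call leg via signaling

class FirstOneWinsPolicy {
  private winner: string | null = null;

  // Called once for every answer that the signaling layer delivers.
  onAnswer(calleeId: string): AnswerDecision {
    if (this.winner === null) {
      this.winner = calleeId;
      return { action: "apply", type: "answer" };
    }
    return { action: "reject" };
  }

  // The callee whose answer was applied, or null if none has arrived yet.
  selected(): string | null {
    return this.winner;
  }
}

// Three callees answer in sequence; only the first answer is applied.
const policy = new FirstOneWinsPolicy();
const decisions: AnswerDecision[] = ["callee-a", "callee-b", "callee-c"].map(
  (id) => policy.onAnswer(id)
);
console.log(decisions.map((d) => d.action).join(",")); // prints "apply,reject,reject"
```

Keeping the decision logic separate from the media engine calls mirrors the division of labor in this section: the application chooses the remote endpoint, and JSEP only carries out the choice.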
861 In a "last-one-wins" situation, all answers would be applied as 862 provisional answers, and any previous call leg will be terminated. 863 At some point, the application will end the setup process, perhaps 864 with a timer; at this point, the application could reapply the 865 pending remote description as a final answer. 867 3.8.2. Parallel Forking 869 Parallel forking involves a call being dispatched to multiple remote 870 callees, where each callee can accept the call, and multiple 871 simultaneous active signaling sessions can be established as a 872 result. If multiple callees send media at the same time, the 873 possibilities for handling this are described in Section 3.1 of 874 [RFC3960]. Most SIP devices today only support exchanging media with 875 a single device at a time, and do not try to mix multiple early media 876 audio sources, as that could result in a confusing situation. For 877 example, consider having a European ringback tone mixed together with 878 the North American ringback tone - the resulting sound would not be 879 like either tone, and would confuse the user. If the signaling 880 application wishes to only exchange media with one of the remote 881 endpoints at a time, then from a media engine point of view, this is 882 exactly like the sequential forking case. 884 In the parallel forking case where the JavaScript application wishes 885 to simultaneously exchange media with multiple peers, the flow is 886 slightly more complex, but the JavaScript application can follow the 887 strategy that [RFC3960] describes using UPDATE. The UPDATE approach 888 allows the signaling to set up a separate media flow for each peer 889 that it wishes to exchange media with. In JSEP, the offer used in 890 the UPDATE would be formed by simply creating a new PeerConnection 891 and making sure that the same local media streams have been added 892 into this new PeerConnection.
Then the new PeerConnection object 893 would produce an SDP offer that could be used by the signaling to 894 perform the UPDATE strategy discussed in [RFC3960]. 896 As a result of sharing the media streams, the application will end up 897 with N parallel PeerConnection sessions, each with a local and remote 898 description and its own local and remote addresses. The media flow 899 from these sessions can be managed by specifying SDP direction 900 attributes in the descriptions, or the application can choose to play 901 out the media from all sessions mixed together. Of course, if the 902 application wants to only keep a single session, it can simply 903 terminate the sessions that it no longer needs. 905 4. Interface 907 This section details the basic operations that must be present to 908 implement JSEP functionality. The actual API exposed in the W3C API 909 may have somewhat different syntax, but should map easily to these 910 concepts. 912 4.1. PeerConnection 914 4.1.1. Constructor 916 The PeerConnection constructor allows the application to specify 917 global parameters for the media session, such as the STUN/TURN 918 servers and credentials to use when gathering candidates, as well as 919 the initial ICE candidate policy and pool size, and also the bundle 920 policy to use. 922 If an ICE candidate policy is specified, it functions as described in 923 Section 3.5.3, causing the browser to only surface the permitted 924 candidates (including any internal browser filtering) to the 925 application, and only use those candidates for connectivity checks. 926 The set of available policies is as follows: 928 all: All candidates permitted by browser policy will be gathered and 929 used. 931 relay: All candidates except relay candidates will be filtered out. 932 This obfuscates the location information that might be ascertained 933 by the remote peer from the received candidates.
Depending on how 934 the application deploys its relay servers, this could obfuscate 935 location to a metro or possibly even global level. 937 The default ICE candidate policy MUST be set to "all" as this is 938 generally the desired policy, and also typically reduces use of 939 application TURN server resources significantly. 941 If a size is specified for the ICE candidate pool, this indicates the 942 number of ICE components to pre-gather candidates for. Because pre- 943 gathering results in utilizing STUN/TURN server resources for 944 potentially long periods of time, this must only occur upon 945 application request, and therefore the default candidate pool size 946 MUST be zero. 948 The application can specify its preferred policy regarding use of 949 bundle, the multiplexing mechanism defined in 950 [I-D.ietf-mmusic-sdp-bundle-negotiation]. Regardless of policy, the 951 application will always try to negotiate bundle onto a single 952 transport, and will offer a single bundle group across all media 953 sections; use of this single transport is contingent upon the answerer 954 accepting bundle. However, by specifying a policy from the list 955 below, the application can control exactly how aggressively it will 956 try to bundle media streams together, which affects how it will 957 interoperate with a non-bundle-aware endpoint. When negotiating with 958 a non-bundle-aware endpoint, only the streams not marked as bundle- 959 only streams will be established. 961 The set of available policies is as follows: 963 balanced: The first media section of each type (audio, video, or 964 application) will contain transport parameters, which will allow 965 an answerer to unbundle that section. The second and any 966 subsequent media section of each type will be marked bundle-only. 967 The result is that if there are N distinct media types, then 968 candidates will be gathered for N media streams.
This policy 969 balances desire to multiplex with the need to ensure basic audio 970 and video can still be negotiated in legacy cases. When acting as 971 answerer, if there is no bundle group in the offer, the 972 implementation will reject all but the first m= section of each 973 type. 975 max-compat: All media sections will contain transport parameters; 976 none will be marked as bundle-only. This policy will allow all 977 streams to be received by non-bundle-aware endpoints, but require 978 separate candidates to be gathered for each media stream. 980 max-bundle: Only the first media section will contain transport 981 parameters; all streams other than the first will be marked as 982 bundle-only. This policy aims to minimize candidate gathering and 983 maximize multiplexing, at the cost of less compatibility with 984 legacy endpoints. When acting as answerer, the implementation 985 will reject any m= sections other than the first m= section, 986 unless they are in the same bundle group as that m= section. 988 As it provides the best tradeoff between performance and 989 compatibility with legacy endpoints, the default bundle policy MUST 990 be set to "balanced". 992 The application can specify its preferred policy regarding use of 993 RTP/RTCP multiplexing [RFC5761] using one of the following policies: 995 negotiate: The browser will gather both RTP and RTCP candidates but 996 also will offer "a=rtcp-mux", thus allowing for compatibility with 997 either multiplexing or non-multiplexing endpoints. 999 require: The browser will only gather RTP candidates. This halves 1000 the number of candidates that the offerer needs to gather. When 1001 acting as answerer, the implementation will reject any m= section 1002 that does not contain an "a=rtcp-mux" attribute. 1004 The default multiplexing policy MUST be set to "require". 1005 Implementations MAY choose to reject attempts by the application to 1006 set the multiplexing policy to "negotiate". 1008 4.1.2. 
addTrack 1010 The addTrack method adds a MediaStreamTrack to the PeerConnection, 1011 using the MediaStream argument to associate the track with other 1012 tracks in the same MediaStream, so that they can be added to the same 1013 "LS" group when creating an offer or answer. addTrack attempts to 1014 minimize the number of transceivers as follows: If the PeerConnection 1015 is in the "have-remote-offer" state, the track will be attached to 1016 the first compatible transceiver that was created by the most recent 1017 call to setRemoteDescription() and does not have a local track. 1018 Otherwise, a new transceiver will be created, as described in 1019 Section 4.1.3. 1021 4.1.3. addTransceiver 1023 The addTransceiver method adds a new RTPTransceiver to the 1024 PeerConnection. If a MediaStreamTrack argument is provided, then the 1025 transceiver will be configured with that media type and the track 1026 will be attached to the transceiver. Otherwise, the application MUST 1027 explicitly specify the type; this mode is useful for creating 1028 recvonly transceivers as well as for creating transceivers to which a 1029 track can be attached at some later point. 1031 At the time of creation, the application can also specify a 1032 transceiver direction attribute, a set of MediaStreams which the 1033 transceiver is associated with (allowing LS group assignments), and a 1034 set of encodings for the media (used for simulcast as described in 1035 Section 3.7). 1037 4.1.4. createDataChannel 1039 The createDataChannel method creates a new data channel and attaches 1040 it to the PeerConnection. If no data channel currently exists for 1041 this PeerConnection, then a new offer/answer exchange is required. 1042 All data channels on a given PeerConnection share the same SCTP/DTLS 1043 association and therefore the same m= section, so subsequent creation 1044 of data channels does not have any impact on the JSEP state. 
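The rule that only the first data channel on a PeerConnection requires a new offer/answer exchange can be modeled as simple bookkeeping. The sketch below is hypothetical: DataChannelNegotiationModel and its methods are illustrative names, not part of JSEP or the W3C API, and the model assumes that channels are never closed.

```typescript
// Hypothetical bookkeeping model; not a real PeerConnection implementation.
class DataChannelNegotiationModel {
  private labels: string[] = [];
  // Set once the first channel has requested the shared data m= section.
  private dataSectionRequested = false;

  // Returns true when creating this channel requires a new offer/answer
  // exchange, i.e. only when no data channel existed before.
  createDataChannel(label: string): boolean {
    this.labels.push(label);
    if (this.dataSectionRequested) {
      return false; // shares the existing SCTP/DTLS association
    }
    this.dataSectionRequested = true;
    return true;
  }

  // Number of channels created so far on this model.
  channelCount(): number {
    return this.labels.length;
  }
}

// Only the first of three channels triggers negotiation.
const pc = new DataChannelNegotiationModel();
const needsNegotiation: boolean[] = ["chat", "file-transfer", "telemetry"].map(
  (label) => pc.createDataChannel(label)
);
console.log(needsNegotiation.join(",")); // prints "true,false,false"
```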
1046 The createDataChannel method also includes a number of arguments 1047 which are used by the PeerConnection (e.g., maxPacketLifetime) but 1048 are not reflected in the SDP and do not affect the JSEP state. 1050 4.1.5. createOffer 1052 The createOffer method generates a blob of SDP that contains a 1053 [RFC3264] offer with the supported configurations for the session, 1054 including descriptions of the media added to this PeerConnection, the 1055 codec/RTP/RTCP options supported by this implementation, and any 1056 candidates that have been gathered by the ICE Agent. An options 1057 parameter may be supplied to provide additional control over the 1058 generated offer. This options parameter allows an application to 1059 trigger an ICE restart, for the purpose of reestablishing 1060 connectivity. 1062 In the initial offer, the generated SDP will contain all desired 1063 functionality for the session (functionality that is supported but 1064 not desired by default may be omitted); for each SDP line, the 1065 generation of the SDP will follow the process defined for generating 1066 an initial offer from the document that specifies the given SDP line. 1067 The exact handling of initial offer generation is detailed in 1068 Section 5.2.1 below. 1070 In the event createOffer is called after the session is established, 1071 createOffer will generate an offer to modify the current session 1072 based on any changes that have been made to the session, e.g., adding 1073 or stopping RtpTransceivers, or requesting an ICE restart. For each 1074 existing stream, the generation of each SDP line must follow the 1075 process defined for generating an updated offer from the RFC that 1076 specifies the given SDP line. For each new stream, the generation of 1077 the SDP must follow the process of generating an initial offer, as 1078 mentioned above. 
If no changes have been made, or for SDP lines that 1079 are unaffected by the requested changes, the offer will only contain 1080 the parameters negotiated by the last offer-answer exchange. The 1081 exact handling of subsequent offer generation is detailed in 1082 Section 5.2.2 below. 1084 Session descriptions generated by createOffer must be immediately 1085 usable by setLocalDescription; if a system has limited resources 1086 (e.g. a finite number of decoders), createOffer should return an 1087 offer that reflects the current state of the system, so that 1088 setLocalDescription will succeed when it attempts to acquire those 1089 resources. Because this method may need to inspect the system state 1090 to determine the currently available resources, it may be implemented 1091 as an async operation. 1093 Calling this method may do things such as generate new ICE 1094 credentials, but does not result in candidate gathering, or cause 1095 media to start or stop flowing. 1097 4.1.6. createAnswer 1099 The createAnswer method generates a blob of SDP that contains a 1100 [RFC3264] SDP answer with the supported configuration for the session 1101 that is compatible with the parameters supplied in the most recent 1102 call to setRemoteDescription, which MUST have been called prior to 1103 calling createAnswer. Like createOffer, the returned blob contains 1104 descriptions of the media added to this PeerConnection, the 1105 codec/RTP/RTCP options negotiated for this session, and any 1106 candidates that have been gathered by the ICE Agent. An options 1107 parameter may be supplied to provide additional control over the 1108 generated answer. 1110 As an answer, the generated SDP will contain a specific configuration 1111 that specifies how the media plane should be established; for each 1112 SDP line, the generation of the SDP must follow the process defined 1113 for generating an answer from the document that specifies the given 1114 SDP line.
The exact handling of answer generation is detailed in 1115 Section 5.3 below. 1117 Session descriptions generated by createAnswer must be immediately 1118 usable by setLocalDescription; like createOffer, the returned 1119 description should reflect the current state of the system. Because 1120 this method may need to inspect the system state to determine the 1121 currently available resources, it may need to be implemented as an 1122 async operation. 1124 Calling this method may do things such as generate new ICE 1125 credentials, but does not trigger candidate gathering or change media 1126 state. 1128 4.1.7. SessionDescriptionType 1130 Session description objects (RTCSessionDescription) may be of type 1131 "offer", "pranswer", "answer" or "rollback". These types provide 1132 information as to how the description parameter should be parsed, and 1133 how the media state should be changed. 1135 "offer" indicates that a description should be parsed as an offer; 1136 said description may include many possible media configurations. A 1137 description used as an "offer" may be applied anytime the 1138 PeerConnection is in a stable state, or as an update to a previously 1139 supplied but unanswered "offer". 1141 "pranswer" indicates that a description should be parsed as an 1142 answer, but not a final answer, and so should not result in the 1143 freeing of allocated resources. It may result in the start of media 1144 transmission, if the answer does not specify an inactive media 1145 direction. A description used as a "pranswer" may be applied as a 1146 response to an "offer", or an update to a previously sent "pranswer". 1148 "answer" indicates that a description should be parsed as an answer, 1149 the offer-answer exchange should be considered complete, and any 1150 resources (decoders, candidates) that are no longer needed can be 1151 released.
A description used as an "answer" may be applied as a 1152 response to an "offer", or an update to a previously sent "pranswer". 1154 The only difference between a provisional and final answer is that 1155 the final answer results in the freeing of any unused resources that 1156 were allocated as a result of the offer. As such, the application 1157 can use some discretion on whether an answer should be applied as 1158 provisional or final, and can change the type of the session 1159 description as needed. For example, in a sequential forking scenario, an 1160 application may receive multiple "final" answers, one from each 1161 remote endpoint. The application could choose to accept the initial 1162 answers as provisional answers, and only apply an answer as final 1163 when it receives one that meets its criteria (e.g. a live user 1164 instead of voicemail). 1166 "rollback" is a special session description type implying that the 1167 state machine should be rolled back to the previous stable state, as 1168 described in Section 4.1.7.2. The contents MUST be empty. 1170 4.1.7.1. Use of Provisional Answers 1172 Most web applications will not need to create answers using the 1173 "pranswer" type. While it is good practice to send an immediate 1174 response to an "offer", in order to warm up the session transport and 1175 prevent media clipping, the preferred handling for a web application 1176 would be to create and send an "inactive" final answer immediately 1177 after receiving the offer. Later, when the called user actually 1178 accepts the call, the application can create a new "sendrecv" offer 1179 to update the previous offer/answer pair and start the media flow. 1180 While this could also be done with an inactive "pranswer", followed 1181 by a sendrecv "answer", the initial "pranswer" leaves the offer- 1182 answer exchange open, which means that neither side can send an 1183 updated offer during this time.
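The difference between answering with a "pranswer" and with a final "answer" can be illustrated with a small state-transition sketch built from the rules for "offer", "pranswer", and "answer" given in Section 4.1.7. This is a simplified model and not the normative state machine: the applyDescription function is hypothetical, rollback is omitted, and the state names other than "stable" and "have-remote-offer" are assumed from the W3C API rather than taken from the text above.

```typescript
// State names other than "stable" and "have-remote-offer" are assumed from
// the W3C API and do not appear in the surrounding text.
type SignalingState =
  | "stable"
  | "have-local-offer"
  | "have-remote-offer"
  | "have-local-pranswer"
  | "have-remote-pranswer";
type DescriptionType = "offer" | "pranswer" | "answer";
type Side = "local" | "remote";

// Hypothetical helper: returns the next state, or null if the description
// cannot be applied in the given state.
function applyDescription(
  state: SignalingState,
  side: Side,
  type: DescriptionType
): SignalingState | null {
  const ownOffer: SignalingState =
    side === "local" ? "have-local-offer" : "have-remote-offer";
  const peerOffer: SignalingState =
    side === "local" ? "have-remote-offer" : "have-local-offer";
  const ownPranswer: SignalingState =
    side === "local" ? "have-local-pranswer" : "have-remote-pranswer";

  if (type === "offer") {
    // Allowed from stable, or as an update to an unanswered offer.
    return state === "stable" || state === ownOffer ? ownOffer : null;
  }
  // "pranswer" and "answer" respond to the other side's offer, or update
  // a previously applied pranswer from the same side.
  if (state !== peerOffer && state !== ownPranswer) {
    return null;
  }
  return type === "pranswer" ? ownPranswer : "stable";
}

// Offerer's view: a remote pranswer leaves the exchange open, so a new
// local offer cannot be applied ...
const afterOffer = applyDescription("stable", "local", "offer");
const afterPranswer = applyDescription("have-local-offer", "remote", "pranswer");
const newOfferDuringPranswer = applyDescription("have-remote-pranswer", "local", "offer");
// ... while a final (possibly inactive) answer returns to stable, after
// which an updated offer is allowed again.
const afterAnswer = applyDescription("have-local-offer", "remote", "answer");
const newOfferAfterAnswer = applyDescription("stable", "local", "offer");
```

In this model the inactive final answer returns both sides to "stable", which is what allows the later "sendrecv" offer described above.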
1185 As an example, consider a typical web application that will set up a 1186 data channel, an audio channel, and a video channel. When an 1187 endpoint receives an offer with these channels, it could send an 1188 answer accepting the data channel for two-way data, and accepting the 1189 audio and video tracks as inactive or receive-only. It could then 1190 ask the user to accept the call, acquire the local media streams, and 1191 send a new offer to the remote side moving the audio and video to be 1192 two-way media. By the time the human has accepted the call and 1193 triggered the new offer, it is likely that the ICE and DTLS 1194 handshaking for all the channels will already have finished. 1196 Of course, some applications may not be able to perform this double 1197 offer-answer exchange, particularly ones that are attempting to 1198 gateway to legacy signaling protocols. In these cases, "pranswer" 1199 can still provide the application with a mechanism to warm up the 1200 transport. 1202 4.1.7.2. Rollback 1204 In certain situations it may be desirable to "undo" a change made to 1205 setLocalDescription or setRemoteDescription. Consider a case where a 1206 call is ongoing, and one side wants to change some of the session 1207 parameters; that side generates an updated offer and then calls 1208 setLocalDescription. However, the remote side, either before or 1209 after setRemoteDescription, decides it does not want to accept the 1210 new parameters, and sends a reject message back to the offerer. Now, 1211 the offerer, and possibly the answerer as well, need to return to a 1212 stable state and the previous local/remote description. To support 1213 this, we introduce the concept of "rollback". 1215 A rollback discards any proposed changes to the session, returning 1216 the state machine to the stable state, and setting the pending local 1217 and/or remote description back to null. 
Any resources or candidates 1218 that were allocated by the abandoned local description are discarded; 1219 any media that is received will be processed according to the 1220 previous local and remote descriptions. Rollback can only be used to 1221 cancel proposed changes; there is no support for rolling back from a 1222 stable state to a previous stable state. Note that this implies that 1223 once the answerer has performed setLocalDescription with its answer, 1224 this cannot be rolled back. 1226 A rollback will disassociate any RtpTransceivers that were associated 1227 with m= sections by the application of the rolled-back session 1228 description (see Section 5.9 and Section 5.8). This means that some 1229 RtpTransceivers that were previously associated will no longer be 1230 associated with any m= section; in such cases, the value of the 1231 RtpTransceiver's mid attribute MUST be set to null. RtpTransceivers 1232 that were created by applying a remote offer that was subsequently 1233 rolled back MUST be removed. However, an RtpTransceiver MUST NOT be 1234 removed if the RtpTransceiver's RtpSender was activated by the 1235 addTrack method. This is so that an application may call addTrack, 1236 then call setRemoteDescription with an offer, then roll back that 1237 offer, then call createOffer and have an m= section for the added 1238 track appear in the generated offer. 1240 A rollback is performed by supplying a session description of type 1241 "rollback" with empty contents to either setLocalDescription or 1242 setRemoteDescription, depending on which was most recently used (i.e., 1243 if the new offer was supplied to setLocalDescription, the rollback 1244 should be done using setLocalDescription as well). 1246 4.1.8. setLocalDescription 1248 The setLocalDescription method instructs the PeerConnection to apply 1249 the supplied session description as its local configuration.
The 1250 type field indicates whether the description should be processed as 1251 an offer, provisional answer, or final answer; offers and answers are 1252 checked differently, using the various rules that exist for each SDP 1253 line. 1255 This API changes the local media state; among other things, it sets 1256 up local resources for receiving and decoding media. In order to 1257 successfully handle scenarios where the application wants to offer to 1258 change from one media format to a different, incompatible format, the 1259 PeerConnection must be able to simultaneously support use of both the 1260 current and pending local descriptions (e.g. support codecs that 1261 exist in both descriptions) until a final answer is received, at 1262 which point the PeerConnection can fully adopt the pending local 1263 description, or roll back to the current description if the remote 1264 side denied the change. 1266 This API indirectly controls the candidate gathering process. When a 1267 local description is supplied, and the number of transports currently 1268 in use does not match the number of transports needed by the local 1269 description, the PeerConnection will create transports as needed and 1270 begin gathering candidates for them. 1272 If setRemoteDescription was previously called with an offer, and 1273 setLocalDescription is called with an answer (provisional or final), 1274 and the media directions are compatible, and media are available to 1275 send, this will result in the starting of media transmission. 1277 4.1.9. setRemoteDescription 1279 The setRemoteDescription method instructs the PeerConnection to apply 1280 the supplied session description as the desired remote configuration. 1281 As in setLocalDescription, the type field of the description 1282 indicates how it should be processed. 1284 This API changes the local media state; among other things, it sets 1285 up local resources for sending and encoding media. 
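The interaction of setLocalDescription, setRemoteDescription, and rollback with the current and pending descriptions can be sketched as a small state model. This is a simplified illustration under stated assumptions, not the normative state machine: `NegotiationModel` and its helpers are hypothetical names, the two "pranswer" state names are assumed to follow the same naming pattern as "have-local-offer" and "have-remote-offer", rollback is accepted through either method from any non-stable state, and media, ICE, and SDP validation are ignored entirely.

```typescript
type SdpType = "offer" | "pranswer" | "answer" | "rollback";
type Side = "local" | "remote";
type SignalingState =
  | "stable"
  | "have-local-offer"
  | "have-remote-offer"
  | "have-local-pranswer"
  | "have-remote-pranswer";

interface Description {
  type: SdpType;
  sdp: string;
}

function offerState(side: Side): SignalingState {
  return side === "local" ? "have-local-offer" : "have-remote-offer";
}

function pranswerState(side: Side): SignalingState {
  return side === "local" ? "have-local-pranswer" : "have-remote-pranswer";
}

// Hypothetical model class; it tracks descriptions only, not media or ICE.
class NegotiationModel {
  state: SignalingState = "stable";
  current: Record<Side, Description | null> = { local: null, remote: null };
  pending: Record<Side, Description | null> = { local: null, remote: null };

  setLocalDescription(desc: Description): void {
    this.apply("local", desc);
  }

  setRemoteDescription(desc: Description): void {
    this.apply("remote", desc);
  }

  private apply(side: Side, desc: Description): void {
    const other: Side = side === "local" ? "remote" : "local";
    if (desc.type === "rollback") {
      // Rollback only cancels a proposed change and carries no SDP.
      if (this.state === "stable" || desc.sdp !== "") {
        throw new Error("invalid rollback");
      }
      this.pending = { local: null, remote: null };
      this.state = "stable";
      return;
    }
    if (desc.type === "offer") {
      // A new offer, or an update to an unanswered offer from this side.
      if (this.state !== "stable" && this.state !== offerState(side)) {
        throw new Error(`cannot apply ${side} offer in state ${this.state}`);
      }
      this.pending[side] = desc;
      this.state = offerState(side);
      return;
    }
    // "pranswer" or "answer": responds to the other side's offer, or
    // updates a pranswer previously applied on this side.
    if (this.state !== offerState(other) && this.state !== pranswerState(side)) {
      throw new Error(`cannot apply ${side} ${desc.type} in state ${this.state}`);
    }
    if (desc.type === "pranswer") {
      this.pending[side] = desc;
      this.state = pranswerState(side);
      return;
    }
    // Final answer: the pending offer and this answer become current.
    this.current[side] = desc;
    this.current[other] = this.pending[other];
    this.pending = { local: null, remote: null };
    this.state = "stable";
  }
}

const pc = new NegotiationModel();
pc.setLocalDescription({ type: "offer", sdp: "local offer" });
pc.setLocalDescription({ type: "rollback", sdp: "" }); // back to stable
pc.setRemoteDescription({ type: "offer", sdp: "remote offer" });
pc.setLocalDescription({ type: "pranswer", sdp: "provisional" });
pc.setLocalDescription({ type: "answer", sdp: "final" });
console.log(pc.state);
```

In this model, the pending descriptions are non-null only while an exchange is open, and the current descriptions change only when a final answer is applied. This mirrors the null-return rules given for the current and pending description accessors.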
1287 If setLocalDescription was previously called with an offer, and 1288 setRemoteDescription is called with an answer (provisional or final), 1289 and the media directions are compatible, and media are available to 1290 send, this will result in the starting of media transmission. 1292 4.1.10. currentLocalDescription 1294 The currentLocalDescription method returns a copy of the current 1295 negotiated local description - i.e., the local description from the 1296 last successful offer/answer exchange - in addition to any local 1297 candidates that have been generated by the ICE Agent since the local 1298 description was set. 1300 A null object will be returned if an offer/answer exchange has not 1301 yet been completed. 1303 4.1.11. pendingLocalDescription 1305 The pendingLocalDescription method returns a copy of the local 1306 description currently in negotiation - i.e., a local offer set 1307 without any corresponding remote answer - in addition to any local 1308 candidates that have been generated by the ICE Agent since the local 1309 description was set. 1311 A null object will be returned if the state of the PeerConnection is 1312 "stable" or "have-remote-offer". 1314 4.1.12. currentRemoteDescription 1316 The currentRemoteDescription method returns a copy of the current 1317 negotiated remote description - i.e., the remote description from the 1318 last successful offer/answer exchange - in addition to any remote 1319 candidates that have been supplied via processIceMessage since the 1320 remote description was set. 1322 A null object will be returned if an offer/answer exchange has not 1323 yet been completed. 1325 4.1.13. pendingRemoteDescription 1327 The pendingRemoteDescription method returns a copy of the remote 1328 description currently in negotiation - i.e., a remote offer set 1329 without any corresponding local answer - in addition to any remote 1330 candidates that have been supplied via processIceMessage since the 1331 remote description was set. 
1333 A null object will be returned if the state of the PeerConnection is 1334 "stable" or "have-local-offer". 1336 4.1.14. canTrickleIceCandidates 1338 The canTrickleIceCandidates property indicates whether the remote 1339 side supports receiving trickled candidates. There are three 1340 potential values: 1342 null: No SDP has been received from the other side, so it is not 1343 known if it can handle trickle. This is the initial value before 1344 setRemoteDescription() is called. 1346 true: SDP has been received from the other side indicating that it 1347 can support trickle. 1349 false: SDP has been received from the other side indicating that it 1350 cannot support trickle. 1352 As described in Section 3.5.2, JSEP implementations always provide 1353 candidates to the application individually, consistent with what is 1354 needed for Trickle ICE. However, applications can use the 1355 canTrickleIceCandidates property to determine whether their peer can 1356 actually do Trickle ICE, i.e., whether it is safe to send an initial 1357 offer or answer followed later by candidates as they are gathered. 1358 As "true" is the only value that definitively indicates remote 1359 Trickle ICE support, an application which compares 1360 canTrickleIceCandidates against "true" will by default attempt Half 1361 Trickle on initial offers and Full Trickle on subsequent interactions 1362 with a Trickle ICE-compatible agent. 1364 4.1.15. setConfiguration 1366 The setConfiguration method allows the global configuration of the 1367 PeerConnection, which was initially set by constructor parameters, to 1368 be changed during the session. The effects of this method call 1369 depend on when it is invoked, and differ depending on which specific 1370 parameters are changed: 1372 o Any changes to the STUN/TURN servers to use affect the next 1373 gathering phase. 
If an ICE gathering phase has already started or 1374 completed, the 'needs-ice-restart' bit mentioned in Section 3.5.1 1375 will be set. This will cause the next call to createOffer to 1376 generate new ICE credentials, for the purpose of forcing an ICE 1377 restart and kicking off a new gathering phase, in which the new 1378 servers will be used. If the ICE candidate pool has a nonzero 1379 size, any existing candidates will be discarded, and new 1380 candidates will be gathered from the new servers. 1382 o Any change to the ICE candidate policy affects the next gathering 1383 phase. If an ICE gathering phase has already started or 1384 completed, the 'needs-ice-restart' bit will be set. Either way, 1385 changes to the policy have no effect on the candidate pool, 1386 because pooled candidates are not surfaced to the application 1387 until a gathering phase occurs, and so any necessary filtering can 1388 still be done on any pooled candidates. 1390 o Any changes to the ICE candidate pool size take effect 1391 immediately; if increased, additional candidates are pre-gathered; 1392 if decreased, the now-superfluous candidates are discarded. 1394 o The bundle and RTCP-multiplexing policies MUST NOT be changed 1395 after the construction of the PeerConnection. 1397 This call may result in a change to the state of the ICE Agent, and 1398 may result in a change to media state if it results in connectivity 1399 being established. 1401 4.1.16. addIceCandidate 1403 The addIceCandidate method provides a remote candidate to the ICE 1404 Agent, which, if parsed successfully, will be added to the current 1405 and/or pending remote description according to the rules defined for 1406 Trickle ICE. The pair of MID and ufrag is used to determine the m= 1407 section and ICE candidate generation to which the candidate belongs. 
1408 If the MID is not present, the m= line index is used to look up the 1409 locally generated MID (see Section 5.9), which is used in place of a 1410 supplied MID. If these values or the candidate string are invalid, 1411 an error is generated. 1413 The purpose of the ufrag is to resolve ambiguities when trickle ICE 1414 is in progress during an ICE restart. If the ufrag is absent, the 1415 candidate MUST be assumed to belong to the most recently applied 1416 remote description. Connectivity checks will be sent to the new 1417 candidate. 1419 This method can also be used to provide an end-of-candidates 1420 indication to the ICE Agent, as defined in [I-D.ietf-ice-trickle]. 1421 The MID and ufrag are used as described above to determine the m= 1422 section and ICE generation for which candidate gathering is complete. 1423 If the ufrag is not present, then the end-of-candidates indication 1424 MUST be assumed to apply to the relevant m= section in the most 1425 recently applied remote description. If neither the MID nor the m= 1426 index is present, then the indication MUST be assumed to apply to all 1427 m= sections in the most recently applied remote description. 1429 This call will result in a change to the state of the ICE Agent, and 1430 may result in a change to media state if it results in connectivity 1431 being established. 1433 4.2. RtpTransceiver 1435 4.2.1. stop 1437 The stop method stops an RtpTransceiver. This will cause future 1438 calls to createOffer to generate a zero port for the associated m= 1439 section. See below for more details. 1441 4.2.2. stopped 1443 The stopped method returns "true" if the transceiver has been 1444 stopped, either by a call to the stop method or by applying an answer 1445 that rejects the associated m= section, and "false" otherwise. 1447 A stopped RtpTransceiver does not send any outgoing RTP or RTCP or 1448 process any incoming RTP or RTCP. It cannot be restarted. 1450 4.2.3.
setDirection 1452 The setDirection method sets the direction of a transceiver, which 1453 affects the direction attribute of the associated m= section on 1454 future calls to createOffer and createAnswer. 1456 When creating offers, the transceiver direction is directly reflected 1457 in the output, even for reoffers. When creating answers, the 1458 transceiver direction is intersected with the offered direction, as 1459 explained in Section 5.3 below. 1461 4.2.4. setCodecPreferences 1463 The setCodecPreferences method sets the codec preferences of a 1464 transceiver, which in turn affect the presence and order of codecs in 1465 the associated m= section on future calls to createOffer and 1466 createAnswer. Note that setCodecPreferences does not directly affect 1467 which codec the implementation decides to send. It only affects which 1468 codecs the implementation indicates that it prefers to receive, via 1469 the offer or answer. Even when a codec is excluded by 1470 setCodecPreferences, it still may be used to send until the next 1471 offer/answer exchange discards it. 1473 The codec preferences of an RtpTransceiver can cause codecs to be 1474 excluded by subsequent calls to createOffer and createAnswer, in 1475 which case the corresponding media formats in the associated m= 1476 section will be excluded. The codec preferences cannot add media 1477 formats that would otherwise not be present. This includes codecs 1478 that were not negotiated in a previous offer/answer exchange that 1479 included the transceiver. 1481 The codec preferences of an RtpTransceiver can also determine the 1482 order of codecs in subsequent calls to createOffer and createAnswer, 1483 in which case the order of the media formats in the associated m= 1484 section will match. However, the codec preferences cannot change the 1485 order of the media formats after an answer containing the transceiver 1486 has been applied.
At this point, codecs can only be removed, not 1487 reordered. 1489 5. SDP Interaction Procedures 1491 This section describes the specific procedures to be followed when 1492 creating and parsing SDP objects. 1494 5.1. Requirements Overview 1496 JSEP implementations must comply with the specifications listed below 1497 that govern the creation and processing of offers and answers. 1499 The first set of specifications is the "mandatory-to-implement" set. 1500 All implementations must support these behaviors, but may not use all 1501 of them if the remote side, which may not be a JSEP endpoint, does 1502 not support them. 1504 The second set of specifications is the "mandatory-to-use" set. The 1505 local JSEP endpoint and any remote endpoint must indicate support for 1506 these specifications in their session descriptions. 1508 5.1.1. Implementation Requirements 1510 This list of mandatory-to-implement specifications is derived from 1511 the requirements outlined in [I-D.ietf-rtcweb-rtp-usage]. 1513 R-1 [RFC4566] is the base SDP specification and MUST be 1514 implemented. 1516 R-2 [RFC5764] MUST be supported for signaling the UDP/TLS/RTP/SAVPF 1517 [RFC5764], TCP/DTLS/RTP/SAVPF 1518 [I-D.nandakumar-mmusic-proto-iana-registration], "UDP/DTLS/ 1519 SCTP" [I-D.ietf-mmusic-sctp-sdp], and "TCP/DTLS/SCTP" 1520 [I-D.ietf-mmusic-sctp-sdp] RTP profiles. 1522 R-3 [RFC5245] MUST be implemented for signaling the ICE credentials 1523 and candidate lines corresponding to each media stream. The 1524 ICE implementation MUST be a Full implementation, not a Lite 1525 implementation. 1527 R-4 [RFC5763] MUST be implemented to signal DTLS certificate 1528 fingerprints. 1530 R-5 [RFC4568] MUST NOT be implemented to signal SDES SRTP keying 1531 information. 1533 R-6 The [RFC5888] grouping framework MUST be implemented for 1534 signaling grouping information, and MUST be used to identify m= 1535 lines via the a=mid attribute. 
1537 R-7 [I-D.ietf-mmusic-msid] MUST be supported, in order to signal 1538 associations between RTP objects and W3C MediaStreams and 1539 MediaStreamTracks in a standard way. 1541 R-8 The bundle mechanism in 1542 [I-D.ietf-mmusic-sdp-bundle-negotiation] MUST be supported to 1543 signal the ability to multiplex RTP streams on a single UDP 1544 port, in order to avoid excessive use of port number resources. 1546 R-9 The SDP attributes of "sendonly", "recvonly", "inactive", and 1547 "sendrecv" from [RFC4566] MUST be implemented to signal 1548 information about media direction. 1550 R-10 [RFC5576] MUST be implemented to signal RTP SSRC values and 1551 grouping semantics. 1553 R-11 [RFC4585] MUST be implemented to signal RTCP based feedback. 1555 R-12 [RFC5761] MUST be implemented to signal multiplexing of RTP and 1556 RTCP. 1558 R-13 [RFC5506] MUST be implemented to signal reduced-size RTCP 1559 messages. 1561 R-14 [RFC4588] MUST be implemented to signal RTX payload type 1562 associations. 1564 R-15 [RFC3556] with bandwidth modifiers MAY be supported for 1565 specifying RTCP bandwidth as a fraction of the media bandwidth, 1566 RTCP fraction allocated to the senders and setting maximum 1567 media bit-rate boundaries. 1569 R-16 TODO: any others? 1571 As required by [RFC4566], Section 5.13, JSEP implementations MUST 1572 ignore unknown attribute (a=) lines. 1574 5.1.2. Usage Requirements 1576 All session descriptions handled by JSEP endpoints, both local and 1577 remote, MUST indicate support for the following specifications. If 1578 any of these are absent, this omission MUST be treated as an error. 1580 R-1 ICE, as specified in [RFC5245], MUST be used. Note that the 1581 remote endpoint may use a Lite implementation; implementations 1582 MUST properly handle remote endpoints which do ICE-Lite. 1584 R-2 DTLS [RFC6347] or DTLS-SRTP [RFC5763], MUST be used, as 1585 appropriate for the media type, as specified in 1586 [I-D.ietf-rtcweb-security-arch] 1588 5.1.3. 
Profile Names and Interoperability 1590 For media m= sections, JSEP endpoints MUST support both the 1591 "UDP/TLS/RTP/SAVPF" and "TCP/DTLS/RTP/SAVPF" profiles and MUST indicate one of 1592 these two profiles for each media m= line they produce in an offer. 1593 For data m= sections, JSEP endpoints MUST support both the 1594 "UDP/DTLS/SCTP" and "TCP/DTLS/SCTP" profiles and MUST indicate one of these two 1595 profiles for each data m= line they produce in an offer. Because ICE 1596 can select either TCP or UDP transport depending on network 1597 conditions, both advertisements are consistent with ICE eventually 1598 selecting either UDP or TCP. 1600 Unfortunately, in an attempt at compatibility, some endpoints 1601 generate other profile strings even when they mean to support one of 1602 these profiles. For instance, an endpoint might generate "RTP/AVP" 1603 but supply "a=fingerprint" and "a=rtcp-fb" attributes, indicating its 1604 willingness to support "(UDP,TCP)/TLS/RTP/SAVPF". In order to 1605 simplify compatibility with such endpoints, JSEP endpoints MUST 1606 apply the following rules when processing the media m= sections in 1607 an offer: 1609 o The profile in any "m=" line in any answer MUST exactly match the 1610 profile provided in the offer. 1612 o Any profile matching the following patterns MUST be accepted: 1613 "RTP/[S]AVP[F]" and "(UDP/TCP)/TLS/RTP/SAVP[F]". 1615 o Because DTLS-SRTP is REQUIRED, the choice of SAVP or AVP has no 1616 effect; support for DTLS-SRTP is determined by the presence of one 1617 or more "a=fingerprint" attributes. Note that lack of an 1618 "a=fingerprint" attribute will lead to negotiation failure. 1620 o The use of AVPF or AVP simply controls the timing rules used for 1621 RTCP feedback. If AVPF is provided, or an "a=rtcp-fb" attribute 1622 is present, assume AVPF timing, i.e., a default value of 1623 "trr-int=0".
Otherwise, assume that AVPF is being used in an 1624 AVP-compatible mode and use AVP timing, i.e., "trr-int=4". 1626 o For data m= sections, JSEP endpoints MUST support receiving the 1627 "UDP/DTLS/SCTP", "TCP/DTLS/SCTP", or "DTLS/SCTP" (for backwards 1628 compatibility) profiles. 1630 Note that re-offers by JSEP endpoints MUST use the correct profile 1631 strings even if the initial offer/answer exchange used an (incorrect) 1632 older profile string. 1634 5.2. Constructing an Offer 1636 When createOffer is called, a new SDP description must be created 1637 that includes the functionality specified in 1638 [I-D.ietf-rtcweb-rtp-usage]. The exact details of this process are 1639 explained below. 1641 5.2.1. Initial Offers 1643 When createOffer is called for the first time, the result is known as 1644 the initial offer. 1646 The first step in generating an initial offer is to generate session- 1647 level attributes, as specified in [RFC4566], Section 5. 1648 Specifically: 1650 o The first SDP line MUST be "v=0", as specified in [RFC4566], 1651 Section 5.1. 1653 o The second SDP line MUST be an "o=" line, as specified in 1654 [RFC4566], Section 5.2. The value of the <username> field SHOULD 1655 be "-". [RFC3264] requires that the <sess-id> be representable as 1656 a 64-bit signed integer. It is RECOMMENDED that the <sess-id> be 1657 generated as a 64-bit quantity with the high bit being set to 1658 zero and the remaining 63 bits being cryptographically random. 1659 The value of the <nettype> <addrtype> <unicast-address> tuple 1660 SHOULD be set to a non-meaningful address, such as IN IP4 0.0.0.0, 1661 to prevent leaking the local address in this field. As mentioned 1662 in [RFC4566], the entire o= line needs to be unique, but selecting 1663 a random number for <sess-id> is sufficient to accomplish this. 1665 o The third SDP line MUST be an "s=" line, as specified in [RFC4566], 1666 Section 5.3; to match the "o=" line, a single dash SHOULD be used 1667 as the session name, e.g. "s=-".
Note that this differs from the 1668 advice in [RFC4566] which proposes a single space, but as both 1669 "o=" and "s=" are meaningless, having the same meaningless value 1670 seems clearer. 1672 o Session Information ("i="), URI ("u="), Email Address ("e="), 1673 Phone Number ("p="), Bandwidth ("b="), Repeat Times ("r="), and 1674 Time Zones ("z=") lines are not useful in this context and SHOULD 1675 NOT be included. 1677 o Encryption Keys ("k=") lines do not provide sufficient security 1678 and MUST NOT be included. 1680 o A "t=" line MUST be added, as specified in [RFC4566], Section 5.9; 1681 both <start-time> and <stop-time> SHOULD be set to zero, e.g. 1682 "t=0 0". 1684 o An "a=ice-options" line with the "trickle" option MUST be added, 1685 as specified in [I-D.ietf-ice-trickle], Section 4. 1687 The next step is to generate m= sections, as specified in [RFC4566], 1688 Section 5.14. An m= section is generated for each RtpTransceiver 1689 that has been added to the PeerConnection. This is done in the order 1690 in which the associated RtpTransceivers were added to the 1691 PeerConnection and excludes RtpTransceivers that are stopped and not 1692 associated with an m= section (either due to an m= section being 1693 recycled or an RtpTransceiver having been stopped before being 1694 associated with an m= section). 1696 Each m= section, provided it is not marked as bundle-only, MUST 1697 generate a unique set of ICE credentials and gather its own unique 1698 set of ICE candidates. Bundle-only m= sections MUST NOT contain any 1699 ICE credentials and MUST NOT gather any candidates. 1701 For DTLS, all m= sections MUST use all the certificate(s) that have 1702 been specified for the PeerConnection; as a result, they MUST all 1703 have the same [I-D.ietf-mmusic-4572-update] fingerprint value(s), or 1704 these value(s) MUST be session-level attributes. 1706 Each m= section should be generated as specified in [RFC4566], 1707 Section 5.14.
For the m= line itself, the following rules MUST be 1708 followed: 1710 o The port value is set to the port of the default ICE candidate for 1711 this m= section, but given that no candidates have yet been 1712 gathered, the "dummy" port value of 9 (Discard) MUST be used, as 1713 indicated in [I-D.ietf-ice-trickle], Section 5.1. 1715 o To properly indicate use of DTLS, the <proto> field MUST be set to 1716 "UDP/TLS/RTP/SAVPF", as specified in [RFC5764], Section 8, if the 1717 default candidate uses UDP transport, or "TCP/DTLS/RTP/SAVPF", as 1718 specified in [I-D.nandakumar-mmusic-proto-iana-registration], if 1719 the default candidate uses TCP transport. 1721 o If codec preferences have been set for the associated transceiver, 1722 media formats MUST be generated in the corresponding order, and 1723 MUST exclude any codecs not present in the codec preferences. 1725 o Unless excluded by the above restrictions, the media formats MUST 1726 include the mandatory audio/video codecs as specified in 1727 [I-D.ietf-rtcweb-audio] (see Section 3) and 1728 [I-D.ietf-rtcweb-video] (see Section 5). 1730 The m= line MUST be followed immediately by a "c=" line, as specified 1731 in [RFC4566], Section 5.7. Again, as no candidates have yet been 1732 gathered, the "c=" line must contain the "dummy" value 1733 "IN IP4 0.0.0.0", as defined in [I-D.ietf-ice-trickle], Section 5.1. 1735 [I-D.ietf-mmusic-sdp-mux-attributes] groups SDP attributes into 1736 different categories. To avoid unnecessary duplication when 1737 bundling, Section 8.1 of [I-D.ietf-mmusic-sdp-bundle-negotiation] 1738 specifies that attributes of category IDENTICAL or TRANSPORT should 1739 not be repeated in bundled m= sections. 1741 The following attributes, which are of a category other than 1742 IDENTICAL or TRANSPORT, MUST be included in each m= section: 1744 o An "a=mid" line, as specified in [RFC5888], Section 4.
When 1745 generating mid values, it is RECOMMENDED that the values be 3 1746 bytes or less, to allow them to efficiently fit into the RTP 1747 header extension defined in 1748 [I-D.ietf-mmusic-sdp-bundle-negotiation], Section 11. 1750 o A direction attribute which is the same as that of the associated 1751 transceiver. 1753 o For each media format on the m= line, "a=rtpmap" and "a=fmtp" 1754 lines, as specified in [RFC4566], Section 6, and [RFC3264], 1755 Section 5.1. 1757 o If this m= section is for media with configurable frame sizes, 1758 e.g. audio, an "a=maxptime" line, indicating the smallest of the 1759 maximum supported frame sizes out of all codecs included above, as 1760 specified in [RFC4566], Section 6. 1762 o If this m= section is for video media, and there are known 1763 limitations on the size of images which can be decoded, an 1764 "a=imageattr" line, as specified in Section 3.6. 1766 o For each primary codec where RTP retransmission should be used, a 1767 corresponding "a=rtpmap" line indicating "rtx" with the clock rate 1768 of the primary codec and an "a=fmtp" line that references the 1769 payload type of the primary codec, as specified in [RFC4588], 1770 Section 8.1. 1772 o For each supported FEC mechanism, "a=rtpmap" and "a=fmtp" lines, 1773 as specified in [RFC4566], Section 6. The FEC mechanisms that 1774 MUST be supported are specified in [I-D.ietf-rtcweb-fec], 1775 Section 6, and specific usage for each media type is outlined in 1776 Sections 4 and 5. 1778 o For each supported RTP header extension, an "a=extmap" line, as 1779 specified in [RFC5285], Section 5. The list of header extensions 1780 that SHOULD/MUST be supported is specified in 1781 [I-D.ietf-rtcweb-rtp-usage], Section 5.2. Any header extensions 1782 that require encryption MUST be specified as indicated in 1783 [RFC6904], Section 4. 1785 o For each supported RTCP feedback mechanism, an "a=rtcp-fb" 1786 line, as specified in [RFC4585], Section 4.2.
The list of 1787 RTCP feedback mechanisms that SHOULD/MUST be supported is 1788 specified in [I-D.ietf-rtcweb-rtp-usage], Section 5.1. 1790 o If the bundle policy for this PeerConnection is set to 1791 "max-bundle", and this is not the first m= section, or the bundle 1792 policy is set to "balanced", and this is not the first m= section 1793 for this media type, an "a=bundle-only" line. 1795 o If the RtpTransceiver has a sendrecv or sendonly direction: 1797 * An "a=msid" line, as specified in [I-D.ietf-mmusic-msid], 1798 Section 2. 1800 o If the RtpTransceiver has a sendrecv or sendonly direction, and 1801 the application has specified RID values or has specified more 1802 than one encoding in the RtpSender's parameters, an "a=rid" line 1803 for each encoding specified. The "a=rid" line is specified in 1804 [I-D.ietf-mmusic-rid], and its direction MUST be "send". If the 1805 application has chosen a RID value, it MUST be used as the 1806 rid-identifier; otherwise a RID value MUST be generated by the 1807 implementation. When generating RID values, it is RECOMMENDED 1808 that the values be 3 bytes or less, to allow them to efficiently 1809 fit into the RTP header extension defined in 1810 [I-D.ietf-avtext-rid], Section 11. If no encodings have been 1811 specified, or only one encoding is specified but without a RID 1812 value, then no "a=rid" lines are generated. 1814 o If the RtpTransceiver has a sendrecv or sendonly direction and 1815 more than one "a=rid" line has been generated, an "a=simulcast" 1816 line, with direction "send", as defined in 1817 [I-D.ietf-mmusic-sdp-simulcast], Section 6.2. The list of RIDs 1818 MUST include all of the RID identifiers used in the "a=rid" lines 1819 for this m= section. 1821 The following attributes, which are of category IDENTICAL or 1822 TRANSPORT, MUST appear only in "m=" sections which either have a 1823 unique address or which are associated with the bundle-tag.
(In 1824 initial offers, this means those "m=" sections which do not contain 1825 an "a=bundle-only" attribute.) 1827 o "a=ice-ufrag" and "a=ice-pwd" lines, as specified in [RFC5245], 1828 Section 15.4. 1830 o An "a=fingerprint" line for each of the endpoint's certificates, 1831 as specified in [RFC4572], Section 5; the digest algorithm used 1832 for the fingerprint MUST match that used in the certificate 1833 signature. 1835 o An "a=setup" line, as specified in [RFC4145], Section 4, and 1836 clarified for use in DTLS-SRTP scenarios in [RFC5763], Section 5. 1837 The role value in the offer MUST be "actpass". 1839 o An "a=dtls-id" line, as specified in [I-D.ietf-mmusic-dtls-sdp], 1840 Section 5.2. 1842 o An "a=rtcp" line, as specified in [RFC3605], Section 2.1, 1843 containing the dummy value "9 IN IP4 0.0.0.0", because no 1844 candidates have yet been gathered. 1846 o An "a=rtcp-mux" line, as specified in [RFC5761], Section 5.1.1. 1848 o An "a=rtcp-rsize" line, as specified in [RFC5506], Section 5. 1850 Lastly, if a data channel has been created, an m= section MUST be 1851 generated for data. The <media> field MUST be set to "application" 1852 and the <proto> field MUST be set to "UDP/DTLS/SCTP" if the default 1853 candidate uses UDP transport, or "TCP/DTLS/SCTP" if the default 1854 candidate uses TCP transport [I-D.ietf-mmusic-sctp-sdp]. The "fmt" 1855 value MUST be set to "webrtc-datachannel" as specified in 1856 [I-D.ietf-mmusic-sctp-sdp], Section 4.1. 1858 Within the data m= section, the "a=mid", "a=ice-ufrag", "a=ice-pwd", 1859 "a=fingerprint", "a=dtls-id", and "a=setup" lines MUST be included as 1860 mentioned above, along with an "a=fmtp:webrtc-datachannel" line and 1861 an "a=sctp-port" line referencing the SCTP port number as defined in 1862 [I-D.ietf-mmusic-sctp-sdp], Section 4.1. 1864 Once all m= sections have been generated, a session-level "a=group" 1865 attribute MUST be added as specified in [RFC5888].
This attribute MUST have semantics "BUNDLE", and MUST include the mid identifiers of each m= section. The effect of this is that the browser offers all m= sections as one bundle group. However, whether the m= sections are bundle-only or not depends on the bundle policy.

The next step is to generate session-level lip sync groups as defined in [RFC5888], Section 7. For each MediaStream referenced by more than one RtpTransceiver (by passing those MediaStreams as arguments to the addTrack and addTransceiver methods), a group of type "LS" MUST be added that contains the mid values for each RtpTransceiver.

Attributes which SDP permits to be either at the session level or the media level SHOULD generally be at the media level even if they are identical. This promotes readability, especially if one of a set of initially identical attributes is subsequently changed.

Attributes other than the ones specified above MAY be included, except for the following attributes which are specifically incompatible with the requirements of [I-D.ietf-rtcweb-rtp-usage], and MUST NOT be included:

   o  "a=crypto"

   o  "a=key-mgmt"

   o  "a=ice-lite"

Note that when bundle is used, any additional attributes that are added MUST follow the advice in [I-D.ietf-mmusic-sdp-mux-attributes] on how those attributes interact with bundle.

Note that these requirements are in some cases stricter than those of SDP. Implementations MUST be prepared to accept compliant SDP even if it would not conform to the requirements for generating SDP in this specification.

5.2.2.  Subsequent Offers

When createOffer is called a second (or later) time, or is called after a local description has already been installed, the processing is somewhat different than for an initial offer.

If the initial offer was not applied using setLocalDescription, meaning the PeerConnection is still in the "stable" state, the steps for generating an initial offer should be followed, subject to the following restriction:

   o  The fields of the "o=" line MUST stay the same except for the <session-version> field, which MUST increment by one on each call to createOffer if the offer might differ from the output of the previous call to createOffer; implementations MAY opt to increment <session-version> on every call. The value of the generated <session-version> is independent of the <session-version> of the current local description; in particular, in the case where the current version is N, an offer is created with version N+1, and then that offer is rolled back so that the current version is again N, the next generated offer will still have version N+2.

Note that if the application creates an offer by reading currentLocalDescription instead of calling createOffer, the returned SDP may be different than when setLocalDescription was originally called, due to the addition of gathered ICE candidates, but the <session-version> will not have changed. There are no known scenarios in which this causes problems, but if this is a concern, the solution is simply to use createOffer to ensure a unique <session-version>.

If the initial offer was applied using setLocalDescription, but an answer from the remote side has not yet been applied, meaning the PeerConnection is still in the "have-local-offer" state, an offer is generated by following the steps in the "stable" state above, along with these exceptions:

   o  The "s=" and "t=" lines MUST stay the same.

   o  If any RtpTransceiver has been added, and there exists an m= section with a zero port in the current local description or the current remote description, that m= section MUST be recycled by generating an m= section for the added RtpTransceiver as if the m= section were being added to the session description, placed at the same index as the m= section with a zero port.

   o  If an RtpTransceiver is stopped and is not associated with an m= section, an m= section MUST NOT be generated for it. This prevents adding back RtpTransceivers whose m= sections were recycled and used for a new RtpTransceiver in a previous offer/answer exchange, as described above.

   o  If an RtpTransceiver has been stopped and is associated with an m= section, and the m= section is not being recycled as described above, an m= section MUST be generated for it with the port set to zero and the "a=msid" line removed.

   o  For RtpTransceivers that are not stopped, the "a=msid" line MUST stay the same if they are present in the current description.

   o  Each "m=" and "c=" line MUST be filled in with the port, protocol, and address of the default candidate for the m= section, as described in [RFC5245], Section 4.3. If ICE checking has already completed for one or more candidate pairs and a candidate pair is in active use, then that pair MUST be used, even if ICE has not yet completed. Note that this differs from the guidance in [RFC5245], Section 9.1.2.2, which only refers to offers created when ICE has completed. In each case, if no RTP candidates have yet been gathered, dummy values MUST be used, as described above.

   o  Each "a=mid" line MUST stay the same.

   o  Each "a=ice-ufrag" and "a=ice-pwd" line MUST stay the same, unless the ICE configuration has changed (either changes to the supported STUN/TURN servers, or the ICE candidate policy), or the "IceRestart" option (Section 5.2.3.1) was specified. If the m= section is bundled into another m= section, it still MUST NOT contain any ICE credentials.

   o  If the m= section is not bundled into another m= section, an "a=rtcp" attribute line MUST be added with the port and address of the default RTCP candidate, as indicated in [RFC5761], Section 5.1.3.

   o  If the m= section is not bundled into another m= section, for each candidate that has been gathered during the most recent gathering phase (see Section 3.5.1), an "a=candidate" line MUST be added, as defined in [RFC5245], Section 4.3, paragraph 3. If candidate gathering for the section has completed, an "a=end-of-candidates" attribute MUST be added, as described in [I-D.ietf-ice-trickle], Section 9.3. If the m= section is bundled into another m= section, both "a=candidate" and "a=end-of-candidates" MUST be omitted.

   o  For RtpTransceivers that are still present, the "a=msid" line MUST stay the same.

   o  For RtpTransceivers that are still present, the "a=rid" lines MUST stay the same.

   o  For RtpTransceivers that are still present, any "a=simulcast" line MUST stay the same.

   o  If any RtpTransceiver has been stopped, the port MUST be set to zero and the "a=msid" line MUST be removed.

   o  If any RtpTransceiver has been added, and there exists an m= section with a zero port in the current local description or the current remote description, that m= section MUST be recycled by generating an m= section for the added RtpTransceiver as if the m= section were being added to the session description, except that instead of adding it, the generated m= section replaces the m= section with a zero port.
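
The recycling rule in the list above can be illustrated with a minimal sketch. The MSection class and the place_new_section function are hypothetical names introduced only for this illustration, and the sketch assumes a single merged list of m= sections rather than separate local and remote descriptions:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MSection:
    """Hypothetical, simplified stand-in for an m= section."""
    mid: str
    port: int  # a port of 0 marks a rejected or stopped section


def place_new_section(sections: List[MSection], new: MSection) -> List[MSection]:
    """Return a copy of `sections` in which `new` reuses the first
    zero-port slot, or is appended when no such slot exists."""
    result = list(sections)
    for index, section in enumerate(result):
        if section.port == 0:
            result[index] = new  # recycle: same index, new contents
            return result
    result.append(new)
    return result


current = [MSection("a1", 9), MSection("v1", 0), MSection("d1", 9)]
recycled = place_new_section(current, MSection("v2", 9))
appended = place_new_section([MSection("a1", 9)], MSection("v2", 9))
print([s.mid for s in recycled])
print([s.mid for s in appended])
```

Keeping the index stable is the point of the sketch: the number and order of m= sections never shrink, so a new RtpTransceiver takes over a dead slot instead of growing the description.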

If the initial offer was applied using setLocalDescription, and an answer from the remote side has been applied using setRemoteDescription, meaning the PeerConnection is in the "have-remote-pranswer" or "stable" state, an offer is generated based on the negotiated session descriptions by following the steps mentioned for the "have-local-offer" state above.

In addition, for each non-recycled, non-rejected m= section in the new offer, the following adjustments are made based on the contents of the corresponding m= section in the current remote description:

   o  The m= line and corresponding "a=rtpmap" and "a=fmtp" lines MUST only include codecs present in the most recent answer which have not been excluded by the codec preferences of the associated transceiver.

   o  The media formats on the m= line MUST be generated in the same order as in the current local description.

   o  The RTP header extensions MUST only include those that are present in the most recent answer.

   o  The RTCP feedback extensions MUST only include those that are present in the most recent answer.

   o  The "a=rtcp" line MUST only be added if the most recent answer did not include an "a=rtcp-mux" line.

   o  The "a=rtcp-mux" line MUST only be added if present in the most recent answer.

   o  The "a=rtcp-mux-only" line MUST only be added if present in the most recent answer.

   o  The "a=rtcp-rsize" line MUST only be added if present in the most recent answer.

The "a=group:BUNDLE" attribute MUST include the mid identifiers specified in the bundle group in the most recent answer, minus any m= sections that have been marked as rejected, plus any newly added or re-enabled m= sections. In other words, the bundle attribute must contain all m= sections that were previously bundled, as long as they are still alive, as well as any new m= sections.
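
The bundle group computation described above amounts to simple list arithmetic. The following is a sketch under stated assumptions: the function names are hypothetical, mids are plain strings, and the caller already knows which mids were rejected and which were added or re-enabled:

```python
from typing import Iterable, List


def next_bundle_mids(answered: Iterable[str],
                     rejected: Iterable[str],
                     added: Iterable[str]) -> List[str]:
    """Mids from the most recent answer's bundle group, minus rejected
    sections, plus newly added or re-enabled sections."""
    rejected_set = set(rejected)
    mids = [mid for mid in answered if mid not in rejected_set]
    for mid in added:
        if mid not in mids:  # avoid listing a mid twice
            mids.append(mid)
    return mids


def bundle_group_line(mids: List[str]) -> str:
    """Format the session-level group attribute for the given mids."""
    return "a=group:BUNDLE " + " ".join(mids)


line = bundle_group_line(next_bundle_mids(["a1", "v1", "d1"], ["v1"], ["v2"]))
print(line)
```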

The "LS" groups are generated in the same way as with initial offers.

5.2.3.  Options Handling

The createOffer method takes as a parameter an RTCOfferOptions object. Special processing is performed when generating an SDP description if the following options are present.

5.2.3.1.  IceRestart

If the "IceRestart" option is specified, with a value of "true", the offer MUST indicate an ICE restart by generating new ICE ufrag and pwd attributes, as specified in [RFC5245], Section 9.1.1.1. If this option is specified on an initial offer, it has no effect (since a new ICE ufrag and pwd are already generated). Similarly, if the ICE configuration has changed, this option has no effect, since new ufrag and pwd attributes will be generated automatically. This option is primarily useful for reestablishing connectivity in cases where failures are detected by the application.

5.2.3.2.  VoiceActivityDetection

If the "VoiceActivityDetection" option is specified, with a value of "true", the offer MUST indicate support for silence suppression in the audio it receives by including comfort noise ("CN") codecs for each offered audio codec, as specified in [RFC3389], Section 5.1, except for codecs that have their own internal silence suppression support. For codecs that have their own internal silence suppression support, the appropriate fmtp parameters for that codec MUST be specified to indicate that silence suppression for received audio is desired. For example, when using the Opus codec, the "usedtx=1" parameter would be specified in the offer. This option allows the endpoint to significantly reduce the amount of audio bandwidth it receives, at the cost of some fidelity, depending on the quality of the remote VAD algorithm.
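
As an illustration of the "true" case above, the sketch below derives the extra offer entries from a list of audio codecs. Everything in it is an assumption made for the example: the function name, the representation of codecs as (name, clock rate) pairs, the choice to emit one "CN" entry per distinct clock rate, and the table that lists Opus as the only codec with internal silence suppression:

```python
from typing import Dict, List, Tuple

# Assumption: codecs with internal silence suppression and the fmtp
# parameter that requests it. Only Opus is named in the text above.
INTERNAL_DTX_PARAM = {"opus": "usedtx=1"}


def entries_for_vad_enabled(
        codecs: List[Tuple[str, int]]) -> Tuple[List[str], Dict[str, str]]:
    """Return (CN rtpmap-style entries, fmtp parameters by codec name)
    for an offer generated with VoiceActivityDetection set to true."""
    cn_entries: List[str] = []
    fmtp: Dict[str, str] = {}
    for name, clock_rate in codecs:
        param = INTERNAL_DTX_PARAM.get(name.lower())
        if param is not None:
            fmtp[name] = param  # codec handles silence suppression itself
            continue
        entry = f"CN/{clock_rate}"
        if entry not in cn_entries:
            cn_entries.append(entry)
    return cn_entries, fmtp


cn, fmtp = entries_for_vad_enabled(
    [("opus", 48000), ("PCMU", 8000), ("PCMA", 8000)])
print(cn, fmtp)
```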

If the "VoiceActivityDetection" option is specified, with a value of "false", the browser MUST NOT emit "CN" codecs. For codecs that have their own internal silence suppression support, the appropriate fmtp parameters for that codec MUST be specified to indicate that silence suppression for received audio is not desired. For example, when using the Opus codec, the "usedtx=0" parameter would be specified in the offer.

Note that setting the "VoiceActivityDetection" parameter when generating an offer is a request to receive audio with silence suppression. It has no impact on whether the local endpoint does silence suppression for the audio it sends.

The "VoiceActivityDetection" option does not have any impact on the setting of the "vad" value in the signaling of the client to mixer audio level header extension described in [RFC6464], Section 4.

5.3.  Generating an Answer

When createAnswer is called, a new SDP description must be created that is compatible with the supplied remote description as well as the requirements specified in [I-D.ietf-rtcweb-rtp-usage]. The exact details of this process are explained below.

5.3.1.  Initial Answers

When createAnswer is called for the first time after a remote description has been provided, the result is known as the initial answer. If no remote description has been installed, an answer cannot be generated, and an error MUST be returned.

Note that the remote description SDP may not have been created by a JSEP endpoint and may not conform to all the requirements listed in Section 5.2. For many cases, this is not a problem. However, if any mandatory SDP attributes are missing, or functionality listed as mandatory-to-use above is not present, this MUST be treated as an error, and MUST cause the affected m= sections to be marked as rejected.

The first step in generating an initial answer is to generate session-level attributes. The process here is identical to that indicated in the Initial Offers section above, except that the "a=ice-options" line, with the "trickle" option as specified in [I-D.ietf-ice-trickle], Section 4, is only included if such an option was present in the offer.

The next step is to generate session-level lip sync groups as defined in [RFC5888], Section 7. For each group of type "LS" present in the offer, determine which of the local RtpTransceivers identified by that group's mid values reference a common local MediaStream (as specified in the addTrack and addTransceiver methods). If at least two such RtpTransceivers exist, a group of type "LS" with the mid values of these RtpTransceivers MUST be added. Otherwise, this indicates a difference of opinion between the offerer and answerer regarding lip sync status, and as such, the offered group MUST be ignored and no corresponding "LS" group generated.

The next step is to generate m= sections for each m= section that is present in the remote offer, as specified in [RFC3264], Section 6. For the purposes of this discussion, any session-level attributes in the offer that are also valid as media-level attributes SHALL be considered to be present in each m= section.

The next step is to go through each offered m= section. Each offered m= section will have an associated RtpTransceiver, as described in Section 5.9. If there are more RtpTransceivers than there are m= sections, the unmatched RtpTransceivers will need to be associated in a subsequent offer.
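
The lip sync group rule described earlier in this section can be sketched as follows. This is an illustration only: answer_ls_group is a hypothetical name, MediaStreams are represented by string identifiers, and the tie-break used when the offered mids map to more than one shared stream (pick the stream shared by the most mids) is an assumption that the text above does not specify:

```python
from typing import Dict, List, Optional, Set


def answer_ls_group(offered_mids: List[str],
                    local_streams: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return the mids for the answer's "LS" group, or None when the
    offered group has to be ignored.

    `local_streams` maps a mid to the ids of the local MediaStreams
    referenced by the RtpTransceiver associated with that mid."""
    mids_by_stream: Dict[str, List[str]] = {}
    for mid in offered_mids:
        for stream_id in sorted(local_streams.get(mid, set())):
            mids_by_stream.setdefault(stream_id, []).append(mid)
    # Assumption: prefer the stream shared by the most offered mids.
    best = max(mids_by_stream.values(), key=len, default=[])
    return best if len(best) >= 2 else None


shared = answer_ls_group(["a1", "v1", "v2"],
                         {"a1": {"s1"}, "v1": {"s1"}, "v2": {"s2"}})
ignored = answer_ls_group(["a1", "v1"], {"a1": {"s1"}, "v1": {"s2"}})
print(shared, ignored)
```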

For each offered m= section, if any of the following conditions are true, the corresponding m= section in the answer MUST be marked as rejected by setting the port in the m= line to zero, as indicated in [RFC3264], Section 6, and further processing for this m= section can be skipped:

   o  The associated RtpTransceiver has been stopped.

   o  No supported codec is present in the offer.

   o  The bundle policy is "max-bundle", and this is not the first m= section or in the same bundle group as the first m= section.

   o  The bundle policy is "balanced", and this is not the first m= section for this media type or in the same bundle group as the first m= section for this media type.

   o  The RTP/RTCP multiplexing policy is "require" and the m= section doesn't contain an "a=rtcp-mux" attribute.

Otherwise, each m= section in the answer should then be generated as specified in [RFC3264], Section 6.1. For the m= line itself, the following rules must be followed:

   o  The port value would normally be set to the port of the default ICE candidate for this m= section, but given that no candidates have yet been gathered, the "dummy" port value of 9 (Discard) MUST be used, as indicated in [I-D.ietf-ice-trickle], Section 5.1.

   o  The <proto> field MUST be set to exactly match the <proto> field for the corresponding m= line in the offer.

   o  If codec preferences have been set for the associated transceiver, media formats MUST be generated in the corresponding order, and MUST exclude any codecs not present in the codec preferences or not present in the offer.

   o  Unless excluded by the above restrictions, the media formats MUST include the mandatory audio/video codecs as specified in [I-D.ietf-rtcweb-audio] (see Section 3) and [I-D.ietf-rtcweb-video] (see Section 5).

The m= line MUST be followed immediately by a "c=" line, as specified in [RFC4566], Section 5.7.
Again, as no candidates have yet been gathered, the "c=" line must contain the "dummy" value "IN IP4 0.0.0.0", as defined in [I-D.ietf-ice-trickle], Section 5.1.

If the offer supports bundle, all m= sections to be bundled must use the same ICE credentials and candidates; all m= sections not being bundled must use unique ICE credentials and candidates. Each m= section MUST contain the following attributes (which are of attribute types other than IDENTICAL and TRANSPORT):

   o  If and only if present in the offer, an "a=mid" line, as specified in [RFC5888], Section 9.1. The "mid" value MUST match that specified in the offer.

   o  A direction attribute, determined by applying the rules regarding the offered direction specified in [RFC3264], Section 6.1, and then intersecting with the direction of the associated RtpTransceiver. For example, in the case where an m= section is offered as "sendonly", and the local transceiver is set to "sendrecv", the result in the answer is a "recvonly" direction.

   o  For each media format on the m= line, "a=rtpmap" and "a=fmtp" lines, as specified in [RFC4566], Section 6, and [RFC3264], Section 6.1.

   o  If this m= section is for media with configurable frame sizes, e.g., audio, an "a=maxptime" line, indicating the smallest of the maximum supported frame sizes out of all codecs included above, as specified in [RFC4566], Section 6.

   o  If this m= section is for video media, and there are known limitations on the size of images which can be decoded, an "a=imageattr" line, as specified in Section 3.6.

   o  If "rtx" is present in the offer, for each primary codec where RTP retransmission should be used, a corresponding "a=rtpmap" line indicating "rtx" with the clock rate of the primary codec and an "a=fmtp" line that references the payload type of the primary codec, as specified in [RFC4588], Section 8.1.

   o  For each supported FEC mechanism, "a=rtpmap" and "a=fmtp" lines, as specified in [RFC4566], Section 6. The FEC mechanisms that MUST be supported are specified in [I-D.ietf-rtcweb-fec], Section 6, and specific usage for each media type is outlined in Sections 4 and 5.

   o  For each supported RTP header extension that is present in the offer, an "a=extmap" line, as specified in [RFC5285], Section 5. The list of header extensions that SHOULD/MUST be supported is specified in [I-D.ietf-rtcweb-rtp-usage], Section 5.2. Any header extensions that require encryption MUST be specified as indicated in [RFC6904], Section 4.

   o  For each supported RTCP feedback mechanism that is present in the offer, an "a=rtcp-fb" line, as specified in [RFC4585], Section 4.2. The list of RTCP feedback mechanisms that SHOULD/MUST be supported is specified in [I-D.ietf-rtcweb-rtp-usage], Section 5.1.

   o  If the RtpTransceiver has a sendrecv or sendonly direction:

      *  An "a=msid" line, as specified in [I-D.ietf-mmusic-msid], Section 2.

Each m= section which is not bundled into another m= section MUST contain the following attributes (which are of category IDENTICAL or TRANSPORT):

   o  "a=ice-ufrag" and "a=ice-pwd" lines, as specified in [RFC5245], Section 15.4.

   o  An "a=fingerprint" line for each of the endpoint's certificates, as specified in [RFC4572], Section 5; the digest algorithm used for the fingerprint MUST match that used in the certificate signature.

   o  An "a=setup" line, as specified in [RFC4145], Section 4, and clarified for use in DTLS-SRTP scenarios in [RFC5763], Section 5. The role value in the answer MUST be "active" or "passive"; the "active" role is RECOMMENDED. The role value MUST be consistent with the existing DTLS connection, if one exists and is being continued.

   o  An "a=dtls-id" line, as specified in [I-D.ietf-mmusic-dtls-sdp], Section 5.3.

   o  If present in the offer, an "a=rtcp-mux" line, as specified in [RFC5761], Section 5.1.1. Otherwise, an "a=rtcp" line, as specified in [RFC3605], Section 2.1, containing the dummy value "9 IN IP4 0.0.0.0" (because no candidates have yet been gathered).

   o  If present in the offer, an "a=rtcp-rsize" line, as specified in [RFC5506], Section 5.

If a data channel m= section has been offered, an m= section MUST also be generated for data. The <media> field MUST be set to "application" and the <proto> and "fmt" fields MUST be set to exactly match the fields in the offer.

Within the data m= section, the "a=mid", "a=ice-ufrag", "a=ice-pwd", "a=candidate", "a=fingerprint", "a=dtls-id", and "a=setup" lines MUST be included under the conditions described above, along with an "a=fmtp:webrtc-datachannel" line and an "a=sctp-port" line referencing the SCTP port number as defined in [I-D.ietf-mmusic-sctp-sdp], Section 4.1.

If "a=group" attributes with semantics of "BUNDLE" are offered, corresponding session-level "a=group" attributes MUST be added as specified in [RFC5888]. These attributes MUST have semantics "BUNDLE", and MUST include all the mid identifiers from the offered bundle groups that have not been rejected. Note that regardless of the presence of "a=bundle-only" in the offer, no m= sections in the answer should have an "a=bundle-only" line.

Attributes that are common between all m= sections MAY be moved to session-level, if explicitly defined to be valid at session-level.

The attributes prohibited in the creation of offers are also prohibited in the creation of answers.

5.3.2.  Subsequent Answers

When createAnswer is called a second (or later) time, or is called after a local description has already been installed, the processing is somewhat different than for an initial answer.

If the initial answer was not applied using setLocalDescription, meaning the PeerConnection is still in the "have-remote-offer" state, the steps for generating an initial answer should be followed, subject to the following restriction:

   o  The fields of the "o=" line MUST stay the same except for the <session-version> field, which MUST increment if the session description changes in any way from the previously generated answer.

If any session description was previously supplied to setLocalDescription, an answer is generated by following the steps in the "have-remote-offer" state above, along with these exceptions:

   o  The "s=" and "t=" lines MUST stay the same.

   o  Each "m=" and "c=" line MUST be filled in with the port and address of the default candidate for the m= section, as described in [RFC5245], Section 4.3. Note, however, that the m= line protocol need not match the default candidate, because this protocol value must instead match what was supplied in the offer, as described above.

   o  The media formats on the m= line MUST be generated in the same order as in the current local description.

   o  Each "a=ice-ufrag" and "a=ice-pwd" line MUST stay the same, unless the m= section is restarting, in which case new ICE credentials must be created as specified in [RFC5245], Section 9.2.1.1. If the m= section is bundled into another m= section, it still MUST NOT contain any ICE credentials.

   o  If the m= section is not bundled into another m= section and RTCP multiplexing is not active, an "a=rtcp" attribute line MUST be filled in with the port and address of the default RTCP candidate.
      If no RTCP candidates have yet been gathered, dummy values MUST be used, as described in the initial answer section above.

   o  If the m= section is not bundled into another m= section, for each candidate that has been gathered during the most recent gathering phase (see Section 3.5.1), an "a=candidate" line MUST be added, as defined in [RFC5245], Section 4.3, paragraph 3. If candidate gathering for the section has completed, an "a=end-of-candidates" attribute MUST be added, as described in [I-D.ietf-ice-trickle], Section 9.3. If the m= section is bundled into another m= section, both "a=candidate" and "a=end-of-candidates" MUST be omitted.

   o  For RtpTransceivers that are not stopped, the "a=msid" line MUST stay the same.

5.3.3.  Options Handling

The createAnswer method takes as a parameter an RTCAnswerOptions object. The set of parameters for RTCAnswerOptions is different than those supported in RTCOfferOptions; the IceRestart option is unnecessary, as ICE credentials will automatically be changed for all m= lines where the offerer chose to perform ICE restart.

The following options are supported in RTCAnswerOptions.

5.3.3.1.  VoiceActivityDetection

Silence suppression in the answer is handled as described in Section 5.2.3.2, with one exception: if support for silence suppression was not indicated in the offer, the VoiceActivityDetection parameter has no effect, and the answer should be generated as if VoiceActivityDetection was set to false. This is done on a per-codec basis (e.g., if the offerer somehow offered support for CN but set "usedtx=0" for Opus, setting VoiceActivityDetection to true would result in an answer with CN codecs and "usedtx=0").

5.4.  Modifying an Offer or Answer

The SDP returned from createOffer or createAnswer MUST NOT be changed before passing it to setLocalDescription.
If precise control over the SDP is needed, the aforementioned createOffer/createAnswer options or RtpSender APIs MUST be used.

Note that the application MAY modify the SDP to reduce the capabilities in the offer it sends to the far side (post-setLocalDescription) or the offer that it installs from the far side (pre-setRemoteDescription), as long as it remains a valid SDP offer and specifies a subset of what was in the original offer. This is safe because the answer is not permitted to expand capabilities, and therefore will just respond to what is present in the offer.

The application SHOULD NOT modify the SDP in the answer it transmits, as the answer contains the negotiated capabilities, and this can cause the two sides to have different ideas about what exactly was negotiated.

As always, the application is solely responsible for what it sends to the other party, and all incoming SDP will be processed by the browser to the extent of its capabilities. It is an error to assume that all SDP is well-formed; however, one should be able to assume that any implementation of this specification will be able to process, as a remote offer or answer, unmodified SDP coming from any other implementation of this specification.

5.5.  Processing a Local Description

When a SessionDescription is supplied to setLocalDescription, the following steps MUST be performed:

   o  First, the type of the SessionDescription is checked against the current state of the PeerConnection:

      *  If the type is "offer", the PeerConnection state MUST be either "stable" or "have-local-offer".

      *  If the type is "pranswer" or "answer", the PeerConnection state MUST be either "have-remote-offer" or "have-local-pranswer".

   o  If the type is not correct for the current state, processing MUST stop and an error MUST be returned.

   o  Next, the SessionDescription is parsed into a data structure, as described in Section 5.7 below. If parsing fails for any reason, processing MUST stop and an error MUST be returned.

   o  Finally, the parsed SessionDescription is applied as described in Section 5.8 below.

5.6.  Processing a Remote Description

When a SessionDescription is supplied to setRemoteDescription, the following steps MUST be performed:

   o  First, the type of the SessionDescription is checked against the current state of the PeerConnection:

      *  If the type is "offer", the PeerConnection state MUST be either "stable" or "have-remote-offer".

      *  If the type is "pranswer" or "answer", the PeerConnection state MUST be either "have-local-offer" or "have-remote-pranswer".

   o  If the type is not correct for the current state, processing MUST stop and an error MUST be returned.

   o  Next, the SessionDescription is parsed into a data structure, as described in Section 5.7 below. If parsing fails for any reason, processing MUST stop and an error MUST be returned.

   o  Finally, the parsed SessionDescription is applied as described in Section 5.9 below.

5.7.  Parsing a Session Description

When a SessionDescription of any type is supplied to setLocal/RemoteDescription, the implementation must parse it and reject it if it is invalid. The exact details of this process are explained below.

The SDP contained in the session description object consists of a sequence of text lines, each containing a key-value expression, as described in [RFC4566], Section 5. The SDP is read, line-by-line, and converted to a data structure that contains the deserialized information. However, SDP allows many types of lines, not all of which are relevant to JSEP applications.
For each line, the implementation will first ensure it is syntactically correct according to its defining ABNF, check that it conforms to [RFC4566] and [RFC3264] semantics, and then either parse and store or discard the provided value, as described below.

If any line is not well-formed, or cannot be parsed as described, the parser MUST stop with an error and reject the session description, even if the value is to be discarded. This ensures that implementations do not accidentally misinterpret ambiguous SDP.

5.7.1.  Session-Level Parsing

First, the session-level lines are checked and parsed. These lines MUST occur in a specific order, and with a specific syntax, as defined in [RFC4566], Section 5. Note that while the specific line types (e.g., "v=", "c=") MUST occur in the defined order, lines of the same type (typically "a=") can occur in any order, and their ordering is not meaningful.

The following non-attribute lines are not meaningful in the JSEP context and MAY be discarded once they have been checked.

      The "c=" line MUST be checked for syntax but its value is not used. This supersedes the guidance in [RFC5245], Section 6.1, to use "ice-mismatch" to indicate mismatches between "c=" and the candidate lines; because JSEP always uses ICE, "ice-mismatch" is not useful in this context.

      The "i=", "u=", "e=", "p=", "t=", "r=", "z=", and "k=" lines are not used by this specification; they MUST be checked for syntax but their values are not used.

The remaining non-attribute lines are processed as follows:

      The "v=" line MUST have a version of 0, as specified in [RFC4566], Section 5.1.

      The "o=" line MUST be parsed as specified in [RFC4566], Section 5.2.

      The "b=" line, if present, MUST be parsed as specified in [RFC4566], Section 5.8, and the bwtype and bandwidth values stored.
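
The non-attribute checks above can be sketched as a small parser. This is a deliberately partial illustration, not a complete validator: parse_session_level and the RANK table are hypothetical names, the ordering check only verifies that line types never move backwards in the [RFC4566] sequence (treating "t=" and "r=" as one rank), and per-type syntax is verified only for "v=", "o=", and "b=":

```python
import re

LINE_RE = re.compile(r"^([a-z])=(.*)$")
# Assumption: simplified ordering ranks for session-level line types.
RANK = {"v": 0, "o": 1, "s": 2, "i": 3, "u": 4, "e": 5, "p": 6,
        "c": 7, "b": 8, "t": 9, "r": 9, "z": 10, "k": 11, "a": 12}
ORIGIN_FIELDS = ("username", "sess-id", "sess-version",
                 "nettype", "addrtype", "unicast-address")


def parse_session_level(lines):
    """Check the session-level part of an SDP and return stored values.
    Raises ValueError on the first malformed or misplaced line."""
    parsed = {"origin": None, "bandwidth": []}
    last_rank = -1
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            raise ValueError(f"malformed line: {line!r}")
        kind, value = match.groups()
        if kind == "m":
            break  # media sections are parsed separately
        if kind not in RANK:
            raise ValueError(f"unknown line type: {kind!r}")
        if RANK[kind] < last_rank:
            raise ValueError(f"line out of order: {line!r}")
        last_rank = RANK[kind]
        if kind == "v" and value != "0":
            raise ValueError("version MUST be 0")
        if kind == "o":
            fields = value.split(" ")
            if len(fields) != len(ORIGIN_FIELDS):
                raise ValueError("o= line needs six fields")
            parsed["origin"] = dict(zip(ORIGIN_FIELDS, fields))
        if kind == "b":
            bwtype, sep, bandwidth = value.partition(":")
            if not sep or not bandwidth.isdigit():
                raise ValueError("malformed b= line")
            parsed["bandwidth"].append((bwtype, int(bandwidth)))
    return parsed


session = parse_session_level([
    "v=0",
    "o=- 4962303333179871722 1 IN IP4 0.0.0.0",
    "s=-",
    "b=AS:512",
    "t=0 0",
    "a=group:BUNDLE a1 v1",
    "m=audio 9 UDP/TLS/RTP/SAVPF 111",
])
print(session["origin"]["sess-version"], session["bandwidth"])
```

Lines whose values are not used, such as "c=" and "t=", still pass through the well-formedness and ordering checks, which mirrors the requirement to reject a description even when the offending value would be discarded.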
2538 Finally, the attribute lines are processed. Specific processing MUST 2539 be applied for the following session-level attribute ("a=") lines: 2541 o Any "a=group" lines are parsed as specified in [RFC5888], 2542 Section 5, and the group's semantics and mids are stored. 2544 o If present, a single "a=ice-lite" line is parsed as specified in 2545 [RFC5245], Section 15.3, and a value indicating the presence of 2546 ice-lite is stored. 2548 o If present, a single "a=ice-ufrag" line is parsed as specified in 2549 [RFC5245], Section 15.4, and the ufrag value is stored. 2551 o If present, a single "a=ice-pwd" line is parsed as specified in 2552 [RFC5245], Section 15.4, and the password value is stored. 2554 o If present, a single "a=ice-options" line is parsed as specified 2555 in [RFC5245], Section 15.5, and the set of specified options is 2556 stored. 2558 o Any "a=fingerprint" lines are parsed as specified in [RFC4572], 2559 Section 5, and the set of fingerprint and algorithm values is 2560 stored. 2562 o If present, a single "a=setup" line is parsed as specified in 2563 [RFC4145], Section 4, and the setup value is stored. 2565 o If present, a single "a=dtls-id" line is parsed as specified in 2566 [I-D.ietf-mmusic-dtls-sdp] Section 5, and the dtls-id value is 2567 stored. 2569 o Any "a=extmap" lines are parsed as specified in [RFC5285], 2570 Section 5, and their values are stored. 2572 Once all the session-level lines have been parsed, processing 2573 continues with the lines in media sections. 2575 5.7.2. Media Section Parsing 2577 Like the session-level lines, the media section lines MUST occur in 2578 the specific order and with the specific syntax defined in [RFC4566], 2579 Section 5. 2581 The "m=" line itself MUST be parsed as described in [RFC4566], 2582 Section 5.14, and the media, port, proto, and fmt values stored.
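As a non-normative illustration, splitting an "m=" line into its media, port, proto, and fmt values could look like the following. This is a minimal sketch under stated assumptions: MediaLine and parse_m_line are hypothetical names introduced for this example, and the optional "/<number of ports>" suffix on the port field is accepted but not stored.

```python
from typing import List, NamedTuple


class MediaLine(NamedTuple):
    """Hypothetical container for the values stored from an "m=" line."""
    media: str
    port: int
    proto: str
    fmts: List[str]


def parse_m_line(line):
    """Parse 'm=<media> <port> <proto> <fmt> ...' into a MediaLine."""
    if not line.startswith("m="):
        raise ValueError("not an m= line: %r" % line)
    fields = line[2:].split(" ")
    # At least media, port, proto, and one fmt are needed; an empty field
    # means doubled or trailing spaces, which this sketch treats as malformed.
    if len(fields) < 4 or "" in fields:
        raise ValueError("malformed m= line: %r" % line)
    media, port_field, proto = fields[0], fields[1], fields[2]
    port_text = port_field.split("/", 1)[0]  # drop optional "/<number of ports>"
    if not port_text.isdigit() or int(port_text) > 65535:
        raise ValueError("invalid port in m= line: %r" % port_field)
    return MediaLine(media, int(port_text), proto, fields[3:])
```

For example, the audio line from |offer-A1| in Section 7.1, "m=audio 56500 UDP/TLS/RTP/SAVPF 96 0 8 97 98", yields the media "audio", the port 56500, the proto "UDP/TLS/RTP/SAVPF", and five fmt values kept as strings, since their interpretation depends on the proto value.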
2584 Following the "m=" line, specific processing MUST be applied for the 2585 following non-attribute lines: 2587 o As with the "c=" line at the session level, the "c=" line MUST be 2588 parsed according to [RFC4566], Section 5.7, but its value is not 2589 used. 2591 o The "b=" line, if present, MUST be parsed as specified in 2592 [RFC4566], Section 5.8, and the bwtype and bandwidth values 2593 stored. 2595 Specific processing MUST also be applied for the following attribute 2596 lines: 2598 o If present, a single "a=ice-ufrag" line is parsed as specified in 2599 [RFC5245], Section 15.4, and the ufrag value is stored. 2601 o If present, a single "a=ice-pwd" line is parsed as specified in 2602 [RFC5245], Section 15.4, and the password value is stored. 2604 o If present, a single "a=ice-options" line is parsed as specified 2605 in [RFC5245], Section 15.5, and the set of specified options is 2606 stored. 2608 o Any "a=candidate" attributes MUST be parsed as specified in 2609 [RFC5245], Section 15.1, and their values stored. 2611 o Any "a=remote-candidates" attributes MUST be parsed as specified 2612 in [RFC5245], Section 15.2, but their values are ignored. 2614 o If present, a single "a=end-of-candidates" attribute MUST be 2615 parsed as specified in [I-D.ietf-ice-trickle], Section 8.2, and 2616 its presence or absence flagged and stored. 2618 o Any "a=fingerprint" lines are parsed as specified in [RFC4572], 2619 Section 5, and the set of fingerprint and algorithm values is 2620 stored. 2622 If the "m=" proto value indicates use of RTP, as described in the 2623 Section 5.1.3 section above, the following attribute lines MUST be 2624 processed: 2626 o The "m=" fmt value MUST be parsed as specified in [RFC4566], 2627 Section 5.14, and the individual values stored. 2629 o Any "a=rtpmap" or "a=fmtp" lines MUST be parsed as specified in 2630 [RFC4566], Section 6, and their values stored. 
2632 o If present, a single "a=ptime" line MUST be parsed as described in 2633 [RFC4566], Section 6, and its value stored. 2635 o If present, a single "a=maxptime" line MUST be parsed as described 2636 in [RFC4566], Section 6, and its value stored. 2638 o If present, a single direction attribute line (e.g. "a=sendrecv") 2639 MUST be parsed as described in [RFC4566], Section 6, and its value 2640 stored. 2642 o Any "a=ssrc" or "a=ssrc-group" attributes MUST be parsed as 2643 specified in [RFC5576], Sections 4.1-4.2, and their values stored. 2645 o Any "a=extmap" attributes MUST be parsed as specified in 2646 [RFC5285], Section 5, and their values stored. 2648 o Any "a=rtcp-fb" attributes MUST be parsed as specified in 2649 [RFC4585], Section 4.2, and their values stored. 2651 o If present, a single "a=rtcp-mux" attribute MUST be parsed as 2652 specified in [RFC5761], Section 5.1.1, and its presence or absence 2653 flagged and stored. 2655 o If present, a single "a=rtcp-mux-only" attribute MUST be parsed as 2656 specified in [I-D.ietf-mmusic-mux-exclusive], Section 3, and its 2657 presence or absence flagged and stored. 2659 o If present, a single "a=rtcp-rsize" attribute MUST be parsed as 2660 specified in [RFC5506], Section 5, and its presence or absence 2661 flagged and stored. 2663 o If present, a single "a=rtcp" attribute MUST be parsed as 2664 specified in [RFC3605], Section 2.1, but its value is ignored, as 2665 this information is superfluous when using ICE. 2667 o If present, a single "a=msid" attribute MUST be parsed as 2668 specified in [I-D.ietf-mmusic-msid], Section 3.2, and its value 2669 stored. 2671 o Any "a=imageattr" attributes MUST be parsed as specified in 2672 [RFC6236], Section 3, and their values stored. 2674 o Any "a=rid" lines MUST be parsed as specified in 2675 [I-D.ietf-mmusic-rid], Section 10, and their values stored.
2677 o If present, a single "a=simulcast" line MUST be parsed as 2678 specified in [I-D.ietf-mmusic-sdp-simulcast], and its values 2679 stored. 2681 Otherwise, if the "m=" proto value indicates use of SCTP, the 2682 following attribute lines MUST be processed: 2684 o The "m=" fmt value MUST be parsed as specified in 2685 [I-D.ietf-mmusic-sctp-sdp], Section 4.3, and the application 2686 protocol value stored. 2688 o An "a=sctp-port" attribute MUST be present, and it MUST be parsed 2689 as specified in [I-D.ietf-mmusic-sctp-sdp], Section 5.2, and the 2690 value stored. 2692 o If present, a single "a=max-message-size" attribute MUST be parsed 2693 as specified in [I-D.ietf-mmusic-sctp-sdp], Section 6, and the 2694 value stored. Otherwise, use the specified default. 2696 5.7.3. Semantics Verification 2698 Assuming parsing completes successfully, the parsed description is 2699 then evaluated to ensure internal consistency as well as proper 2700 support for mandatory features. Specifically, the following checks 2701 are performed: 2703 o For each m= section, valid values for each of the mandatory-to-use 2704 features enumerated in Section 5.1.2 MUST be present. These 2705 values MAY either be present at the media level, or inherited from 2706 the session level. 2708 * ICE ufrag and password values, which MUST comply with the size 2709 limits specified in [RFC5245], Section 15.4. 2711 * dtls-id value, which MUST be set according to 2712 [I-D.ietf-mmusic-dtls-sdp] Section 5. If this is a re-offer 2713 and the dtls-id value is different from that presently in use, 2714 the DTLS connection is not being continued and the remote 2715 description MUST be part of an ICE restart, together with new 2716 ufrag and password values. If this is an answer, the dtls-id 2717 value, if present, MUST be the same as in the offer. 
2719 * DTLS setup value, which MUST be set according to the rules 2720 specified in [RFC5763], Section 5 and MUST be consistent with 2721 the selected role of the current DTLS connection, if one exists 2722 and is being continued. 2724 * DTLS fingerprint values, where at least one fingerprint MUST be 2725 present. 2727 o All RID values referenced in an "a=simulcast" line MUST exist as 2728 "a=rid" lines. 2730 o Each m= section is also checked to ensure prohibited features are 2731 not used. If this is a local description, the "ice-lite" 2732 attribute MUST NOT be specified. 2734 If this session description is of type "pranswer" or "answer", the 2735 following additional checks are applied: 2737 o The session description must follow the rules defined in 2738 [RFC3264], Section 6, including the requirement that the number of 2739 m= sections MUST exactly match the number of m= sections in the 2740 associated offer. 2742 o For each m= section, the media type and protocol values MUST 2743 exactly match the media type and protocol values in the 2744 corresponding m= section in the associated offer. 2746 5.8. Applying a Local Description 2748 The following steps are performed at the media engine level to apply 2749 a local description. 2751 First, the parsed parameters are checked to ensure that they have not 2752 been altered after their generation in createOffer/createAnswer, as 2753 discussed in Section 5.4; otherwise, processing MUST stop and an 2754 error MUST be returned. 2756 Next, media sections are processed. For each media section, the 2757 following steps MUST be performed; if any parameters are out of 2758 bounds, or cannot be applied, processing MUST stop and an error MUST 2759 be returned. 2761 o If this media section is new, begin gathering candidates for it, 2762 as defined in [RFC5245], Section 4.1.1, unless it has been marked 2763 as bundle-only. 
2765 o Or, if the ICE ufrag and password values have changed, and it has 2766 not been marked as bundle-only, trigger the ICE Agent to start an 2767 ICE restart, and begin gathering new candidates for the media 2768 section as described in [RFC5245], Section 9.1.1.1. If this 2769 description is an answer, also start checks on that media section 2770 as defined in [RFC5245], Section 9.3.1.1. 2772 o If the media section proto value indicates use of RTP: 2774 * If there is no RtpTransceiver associated with this m= section 2775 (which should only happen when applying an offer), find one and 2776 associate it with this m= section according to the following 2777 steps: 2779 + Find the RtpTransceiver that corresponds to the m= section 2780 with the same MID in the created offer. 2782 + Set the value of the RtpTransceiver's mid attribute to the 2783 MID of the m= section. 2785 * If RTCP mux is indicated, prepare to demux RTP and RTCP from 2786 the RTP ICE component, as specified in [RFC5761], 2787 Section 5.1.1. If RTCP mux is not indicated, but was indicated 2788 in a previous description, this MUST result in an error. 2790 * For each specified RTP header extension, establish a mapping 2791 between the extension ID and URI, as described in section 6 of 2792 [RFC5285]. If any indicated RTP header extension is not 2793 supported, this MUST result in an error. 2795 * If the MID header extension is supported, prepare to demux RTP 2796 data intended for this media section based on the MID header 2797 extension, as described in [I-D.ietf-mmusic-msid], Section 3.2. 2799 * For each specified media format, establish a mapping between 2800 the payload type and the actual media format, as described in 2801 [RFC3264], Section 6.1. If any indicated media format is not 2802 supported, this MUST result in an error. 
2804 * For each specified "rtx" media format, establish a mapping 2805 between the RTX payload type and its associated primary payload 2806 type, as described in [RFC4588], Sections 8.6 and 8.7. If any 2807 referenced primary payload types are not present, this MUST 2808 result in an error. 2810 * If the directional attribute is of type "sendrecv" or 2811 "recvonly", enable receipt and decoding of media. 2813 Finally, if this description is of type "pranswer" or "answer", 2814 follow the processing defined in the Section 5.10 section below. 2816 5.9. Applying a Remote Description 2818 If the answer contains any "a=ice-options" attributes where "trickle" 2819 is listed as an attribute, update the PeerConnection canTrickle 2820 property to be true. Otherwise, set this property to false. 2822 The following steps are performed at the media engine level to apply 2823 a remote description. 2825 The following steps MUST be performed for attributes at the session 2826 level; if any parameters are out of bounds, or cannot be applied, 2827 processing MUST stop and an error MUST be returned. 2829 o For any specified "CT" bandwidth value, set this as the limit for 2830 the maximum total bitrate for all m= sections, as specified in 2831 Section 5.8 of [RFC4566]. The implementation can decide how to 2832 allocate the available bandwidth between m= sections to 2833 simultaneously meet any limits on individual m= sections, as well 2834 as this overall session limit. 2836 o For any specified "RR" or "RS" bandwidth values, handle as 2837 specified in [RFC3556], Section 2. 2839 o Any "AS" bandwidth value MUST be ignored, as the meaning of this 2840 construct at the session level is not well defined. 2842 For each media section, the following steps MUST be performed; if any 2843 parameters are out of bounds, or cannot be applied, processing MUST 2844 stop and an error MUST be returned. 
2846 o If the ICE ufrag or password changed from the previous remote 2847 description, then an ICE restart is needed, as described in 2848 Section 9.1.1.1 of [RFC5245]. If the description is of type 2849 "offer", mark that an ICE restart is needed. If the description 2850 is of type "answer" and the current local description is also an 2851 ICE restart, then signal the ICE agent to begin checks as 2852 described in Section 9.3.1.1 of [RFC5245]. An answer MUST change 2853 the ufrag and password if and only if ICE is 2854 restarting, as described in Section 9.2.1.1 of [RFC5245]. 2856 o Configure the ICE components associated with this media section to 2857 use the supplied ICE remote ufrag and password for their 2858 connectivity checks. 2860 o Pair any supplied ICE candidates with any gathered local 2861 candidates, as described in Section 5.7 of [RFC5245], and start 2862 connectivity checks with the appropriate credentials. 2864 o If an "a=end-of-candidates" attribute is present, process the end- 2865 of-candidates indication as described in [I-D.ietf-ice-trickle] 2866 Section 11. 2868 o If the media section proto value indicates use of RTP: 2870 * If the m= section is being recycled (see Section 5.2.2), 2871 dissociate the currently associated RtpTransceiver by setting 2872 its mid attribute to null. 2874 * If the m= section is not associated with any RtpTransceiver 2875 (possibly because it was dissociated in the previous step), 2876 either find an RtpTransceiver or create one according to the 2877 following steps: 2879 + If the m= section is sendrecv or recvonly, and there are 2880 RtpTransceivers of the same type that were added to the 2881 PeerConnection by addTrack and are not associated with any 2882 m= section and are not stopped, find the first (according to 2883 the canonical order described in Section 5.2.1) such 2884 RtpTransceiver. 2886 + If no RtpTransceiver was found in the previous step, create 2887 one with a recvonly direction.
2889 + Associate the found or created RtpTransceiver with the m= 2890 section by setting the value of the RtpTransceiver's mid 2891 attribute to the MID of the m= section. If the m= section 2892 does not include a MID (i.e., the remote side does not 2893 support the MID extension), generate a value for the 2894 RtpTransceiver mid attribute, following the guidance for 2895 "a=mid" mentioned in Section 5.2.1. 2897 * For each specified media format that is also supported by the 2898 local implementation, establish a mapping between the specified 2899 payload type and the media format, as described in [RFC3264], 2900 Section 6.1. Specifically, this means that the implementation 2901 records the payload type to be used in outgoing RTP packets 2902 when sending each specified media format, as well as the 2903 relative preference for each format that is indicated in their 2904 ordering. If any indicated media format is not supported by 2905 the local implementation, it MUST be ignored. 2907 * For each specified "rtx" media format, establish a mapping 2908 between the RTX payload type and its associated primary payload 2909 type, as described in [RFC4588], Section 4. If any referenced 2910 primary payload types are not present, this MUST result in an 2911 error. 2913 * For each specified fmtp parameter that is supported by the 2914 local implementation, enable them on the associated media 2915 formats. 2917 * For each specified RTP header extension that is also supported 2918 by the local implementation, establish a mapping between the 2919 extension ID and URI, as described in [RFC5285], Section 5. 2920 Specifically, this means that the implementation records the 2921 extension ID to be used in outgoing RTP packets when sending 2922 each specified header extension. If any indicated RTP header 2923 extension is not supported by the local implementation, it MUST 2924 be ignored. 
2926 * For each specified RTCP feedback mechanism that is supported by 2927 the local implementation, enable them on the associated media 2928 formats. 2930 * For any specified "TIAS" bandwidth value, set this value as a 2931 constraint on the maximum RTP bitrate to be used when sending 2932 media, as specified in [RFC3890]. If a "TIAS" value is not 2933 present, but an "AS" value is specified, generate a "TIAS" 2934 value using this formula: 2936 TIAS = AS * 1000 * 0.95 - 50 * 40 * 8 2938 The 50 is based on 50 packets per second, the 40 is based on an 2939 estimate of total header size, the 1000 changes the unit from 2940 kbps to bps (as required by TIAS), and the 0.95 is to allocate 2941 5% to RTCP. If more accurate control of bandwidth is needed, 2942 "TIAS" should be used instead of "AS". 2944 * For any "RR" or "RS" bandwidth values, handle as specified in 2945 [RFC3556], Section 2. 2947 * Any specified "CT" bandwidth value MUST be ignored, as the 2948 meaning of this construct at the media level is not well 2949 defined. 2951 * If the media section is of type audio: 2953 + For each specified "CN" media format, enable DTX for all 2954 supported media formats with the same clockrate, as 2955 described in [RFC3389], Section 5, except for formats that 2956 have their own internal DTX mechanisms. DTX for such 2957 formats (e.g., Opus) is controlled via fmtp parameters, as 2958 discussed in Section 5.2.3.2. 2960 + For each specified "telephone-event" media format, enable 2961 DTMF transmission for all supported media formats with the 2962 same clockrate, as described in [RFC4733], Section 2.5.1.2. 2963 If the application attempts to transmit DTMF when using a 2964 media format that does not have a corresponding telephone- 2965 event format, this MUST result in an error. 2967 + For any specified "ptime" value, configure the available 2968 media formats to use the specified packet size. 
If the 2969 specified size is not supported for a media format, use the 2970 next closest value instead. 2972 Finally, if this description is of type "pranswer" or "answer", 2973 follow the processing defined in the Section 5.10 section below. 2975 5.10. Applying an Answer 2977 In addition to the steps mentioned above for processing a local or 2978 remote description, the following steps are performed when processing 2979 a description of type "pranswer" or "answer". 2981 For each media section, the following steps MUST be performed: 2983 o If the media section has been rejected (i.e. port is set to zero 2984 in the answer), stop any reception or transmission of media for 2985 this section, and discard any associated ICE components, as 2986 described in Section 9.2.1.3 of [RFC5245]. 2988 o If the remote DTLS fingerprint has been changed or the dtls-id has 2989 changed, tear down the DTLS connection. If a DTLS connection 2990 needs to be torn down but the answer does not indicate an ICE 2991 restart, an error MUST be generated. If an ICE restart is 2992 performed without a change in dtls-id or fingerprint, then the 2993 same DTLS connection is continued over the new ICE channel. 2995 o If no valid DTLS connection exists, prepare to start a DTLS 2996 connection, using the specified roles and fingerprints, on any 2997 underlying ICE components, once they are active. 2999 o If the media section proto value indicates use of RTP: 3001 * If the media section references any media formats, RTP header 3002 extensions, or RTCP feedback mechanisms that were not present 3003 in the corresponding media section in the offer, this indicates 3004 a negotiation problem and MUST result in an error. 3006 * If the media section has RTCP mux enabled, discard any RTCP 3007 component, and begin or continue muxing RTCP over the RTP 3008 component, as specified in [RFC5761], Section 5.1.3. 
3009 Otherwise, prepare to transmit RTCP over the RTCP component; if 3010 no RTCP component exists, because RTCP mux was previously 3011 enabled, this MUST result in an error. 3013 * If the media section has reduced-size RTCP enabled, configure 3014 the RTCP transmission for this media section to use reduced- 3015 size RTCP, as specified in [RFC5506]. 3017 * If the directional attribute in the answer is of type 3018 "sendrecv" or "sendonly", choose the media format to send as 3019 the most preferred media format from the remote description 3020 that is also present in the answer, as described in [RFC3264], 3021 Sections 6.1 and 7, and start transmitting RTP media once the 3022 underlying transport layers have been established. If a SSRC 3023 has not already been chosen for this outgoing RTP stream, 3024 choose a random one. 3026 * The payload type mapping from the remote description is used to 3027 determine payload types for the outgoing RTP streams, including 3028 the payload type for the send media format chosen above. Any 3029 RTP header extensions that were negotiated should be included 3030 in the outgoing RTP streams, using the extension mapping from 3031 the remote description; if the RID header extension has been 3032 negotiated, and RID values are specified, include the RID 3033 header extension in the outgoing RTP streams, as indicated in 3034 [I-D.ietf-mmusic-rid], Section 4. 3036 * If simulcast has been negotiated, send the number of Source RTP 3037 Streams as specified in [I-D.ietf-mmusic-sdp-simulcast], 3038 Section 6.2.2. 3040 * If the send media format chosen above has a corresponding "rtx" 3041 media format, or a FEC mechanism has been negotiated, establish 3042 a Redundancy RTP Stream with a random SSRC for each Source RTP 3043 Stream, and start or continue transmitting RTX/FEC packets as 3044 needed. 
3046 * If the send media format chosen above has a corresponding "red" 3047 media format of the same clockrate, allow redundant encoding 3048 using the specified format for resiliency purposes, as 3049 discussed in [I-D.ietf-rtcweb-fec], Section 3.2. Note that 3050 unlike RTX or FEC media formats, the "red" format is 3051 transmitted on the Source RTP Stream, not the Redundancy RTP 3052 Stream. 3054 * Enable the RTCP feedback mechanisms referenced in the media 3055 section for all Source RTP Streams using the specified media 3056 formats. Specifically, begin or continue sending the requested 3057 feedback types and reacting to received feedback, as specified 3058 in [RFC4585], Section 4.2. When sending RTCP feedback, use the 3059 SSRC of an outgoing Source RTP Stream as the RTCP sender SSRC; 3060 if no outgoing Source RTP Stream exists, choose a random one. 3062 * If the directional attribute is of type "recvonly" or 3063 "inactive", stop transmitting all RTP media, but continue 3064 sending RTCP, as described in [RFC3264], Section 5.1. 3066 o If the media section proto value indicates use of SCTP: 3068 * If no SCTP association yet exists, prepare to initiate a SCTP 3069 association over the associated ICE component and DTLS 3070 connection, using the local SCTP port value from the local 3071 description, and the remote SCTP port value from the remote 3072 description, as described in [I-D.ietf-mmusic-sctp-sdp], 3073 Section 10.2. 3075 If the answer contains valid bundle groups, discard any ICE 3076 components for the m= sections that will be bundled onto the primary 3077 ICE components in each bundle, and begin muxing these m= sections 3078 accordingly, as described in 3079 [I-D.ietf-mmusic-sdp-bundle-negotiation], Section 8.2. 3081 6. Processing RTP/RTCP packets 3083 Note: The following algorithm does not yet have WG consensus but is 3084 included here as something concrete for the working group to discuss. 
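As a non-normative illustration, the RTP routing steps specified in this section (MID lookup first, then the incoming SSRC table, then the payload type table) can be sketched as follows. This is a minimal sketch under stated assumptions: RtpRouter, add_receiver, and route are hypothetical names, receivers are represented by arbitrary objects, and add_receiver is assumed to be called in the order the m= sections appear in the local description, so that the earliest m= section wins when a payload type is shared. RTCP handling and the outgoing SSRC table are omitted.

```python
class RtpRouter:
    """Hypothetical per-transport routing tables for incoming RTP packets."""

    def __init__(self):
        self.mid_table = {}   # MID -> receiver
        self.ssrc_table = {}  # incoming SSRC -> receiver
        self.pt_table = {}    # payload type -> receiver

    def add_receiver(self, receiver, mid=None, ssrcs=(), payload_types=()):
        """Register a receiver; call in local-description m= section order."""
        if mid is not None:
            self.mid_table[mid] = receiver
        for ssrc in ssrcs:
            self.ssrc_table[ssrc] = receiver
        for payload_type in payload_types:
            # Keep the first registration so the earliest m= section wins.
            self.pt_table.setdefault(payload_type, receiver)

    def route(self, ssrc, payload_type, mid=None):
        """Return the receiver for a packet, or None if it is dropped."""
        if mid is not None:
            receiver = self.mid_table.get(mid)
            if receiver is None:
                return None  # unknown MID: drop the packet
            # Known MID: bind (or re-bind) this SSRC to the MID's receiver.
            self.ssrc_table[ssrc] = receiver
        if ssrc in self.ssrc_table:
            return self.ssrc_table[ssrc]
        if payload_type in self.pt_table:
            receiver = self.pt_table[payload_type]
            self.ssrc_table[ssrc] = receiver  # remember the SSRC for later packets
            return receiver
        return None  # no table matched: drop the packet
```

Once an SSRC has been bound through a MID or a payload type, later packets with that SSRC are delivered through the SSRC table even if they carry no MID.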
3086 When an RTP packet is received by a transport and passes SRTP 3087 authentication, that packet needs to be routed to the correct 3088 RtpReceiver. For each transport, the following steps MUST be 3089 followed to prepare to route packets: 3091 Construct a table mapping MID to RtpReceiver for each RtpReceiver 3092 configured to receive from this transport. 3094 Construct a table mapping incoming SSRC to RtpReceiver for each 3095 RtpReceiver configured to receive from this transport and for each 3096 SSRC that RtpReceiver is configured to receive. Some of the SSRCs 3097 may be present in the m= section corresponding to that RtpReceiver 3098 in the remote description. 3100 Construct a table mapping outgoing SSRC to RtpSender for each 3101 RtpSender configured to transmit from this transport and for each 3102 SSRC that RtpSender is configured to use when sending. 3104 Construct a table mapping payload type to RtpReceiver for each 3105 RtpReceiver configured to receive from this transport and for each 3106 payload type that RtpReceiver is configured to receive. The 3107 payload types of a given RtpReceiver are found in the m= section 3108 corresponding to that RtpReceiver in the local description. If 3109 any payload type could map to more than one RtpReceiver, map to 3110 the RtpReceiver whose m= section appears earliest in the local 3111 description. 3113 As RtpTransceivers (and, thus, RtpReceivers) are added, removed, 3114 stopped, or reconfigured, the tables above must also be updated. 3116 For each RTP packet received, the following steps MUST be followed to 3117 route the packet: 3119 If the packet has a MID and that MID is not in the table mapping 3120 MID to RtpReceiver, drop the packet and stop. 3122 If the packet has a MID and that MID is in the table mapping MID 3123 to RtpReceiver, update the incoming SSRC mapping table to include 3124 an entry that maps the packet's SSRC to the RtpReceiver for that 3125 MID. 
3127 If the packet's SSRC is in the incoming SSRC mapping table, 3128 deliver the packet to the associated RtpReceiver and stop. 3130 If the packet's payload type is in the payload type table, update 3131 the incoming SSRC mapping table to include an entry that maps 3132 the packet's SSRC to the RtpReceiver for that payload type. In 3133 addition, deliver the packet to the associated RtpReceiver and 3134 stop. 3136 Otherwise, drop the packet. 3138 For each RTCP packet received (including each RTCP packet that is 3139 part of a compound RTCP packet), the following type-specific handling 3140 MUST be performed to route the packet: 3142 If the packet is of type SR, and the sender SSRC for the packet is 3143 found in the incoming SSRC table, deliver a copy of the packet to 3144 the RtpReceiver associated with that SSRC. In addition, for each 3145 report block in the report whose SSRC is found in the outgoing 3146 SSRC table, deliver a copy of the RTCP packet to the RtpSender 3147 associated with that SSRC. 3149 If the packet is of type RR, for each report block in the packet 3150 whose SSRC is found in the outgoing SSRC table, deliver a copy of 3151 the RTCP packet to the RtpSender associated with that SSRC. 3153 If the packet is of type SDES, and the sender SSRC for the packet 3154 is found in the incoming SSRC table, deliver the packet to the 3155 RtpReceiver associated with that SSRC. In addition, for each 3156 chunk in the packet that contains a MID that is in the table 3157 mapping MID to RtpReceiver, update the incoming SSRC mapping table 3158 to include an entry that maps the SSRC for that chunk to the 3159 RtpReceiver associated with that MID. (This case can occur when 3160 RTCP for a source is received before any RTP packets.) 3162 If the packet is of type BYE, for each SSRC indicated in the 3163 packet that is found in the incoming SSRC table, deliver a copy of 3164 the packet to the RtpReceiver associated with that SSRC.
3166 If the packet is of type RTPFB or PSFB, as defined in [RFC4585], 3167 and the media source SSRC for the packet is found in the outgoing 3168 SSRC table, deliver the packet to the RtpSender associated with 3169 that SSRC. 3171 After packets are routed to the RtpReceiver, further processing of 3172 the RTP packets is done at the RtpReceiver level. This includes 3173 using [I-D.ietf-mmusic-rid] to distinguish between multiple Encoded 3174 Streams, as well as determine which Source RTP stream should be 3175 repaired by a given Redundancy RTP stream. If the RTP packet's PT 3176 does not match any codec in use by the RtpReceiver, the packet will 3177 be dropped. 3179 7. Examples 3181 Note that this example section shows several SDP fragments. To 3182 format in 72 columns, some of the lines in SDP have been split into 3183 multiple lines, where leading whitespace indicates that a line is a 3184 continuation of the previous line. In addition, some blank lines 3185 have been added to improve readability but are not valid in SDP. 3187 More examples of SDP for WebRTC call flows can be found in 3188 [I-D.nandakumar-rtcweb-sdp]. 3190 7.1. Simple Example 3192 This section shows a very simple example that sets up a minimal audio 3193 / video call between two browsers and does not use trickle ICE. The 3194 example in the following section provides a more realistic example of 3195 what would happen in a normal browser to browser connection. 3197 The flow shows Alice's browser initiating the session to Bob's 3198 browser. The messages from Alice's JS to Bob's JS are assumed to 3199 flow over some signaling protocol via a web server. The JS on both 3200 Alice's side and Bob's side waits for all candidates before sending 3201 the offer or answer, so the offers and answers are complete. Trickle 3202 ICE is not used. Both Alice and Bob are using the default policy of 3203 balanced. 
3205 // set up local media state 3206 AliceJS->AliceUA: create new PeerConnection 3207 AliceJS->AliceUA: addTrack with two tracks: audio and video 3208 AliceJS->AliceUA: createOffer to get offer 3209 AliceJS->AliceUA: setLocalDescription with offer 3210 AliceUA->AliceJS: multiple onicecandidate events with candidates 3212 // wait for ICE gathering to complete 3213 AliceUA->AliceJS: onicecandidate event with null candidate 3214 AliceJS->AliceUA: get |offer-A1| from pendingLocalDescription 3216 // |offer-A1| is sent over signaling protocol to Bob 3217 AliceJS->WebServer: signaling with |offer-A1| 3218 WebServer->BobJS: signaling with |offer-A1| 3220 // |offer-A1| arrives at Bob 3221 BobJS->BobUA: create a PeerConnection 3222 BobJS->BobUA: setRemoteDescription with |offer-A1| 3223 BobUA->BobJS: onaddstream event with remoteStream 3225 // Bob accepts call 3226 BobJS->BobUA: addTrack with local tracks 3227 BobJS->BobUA: createAnswer 3228 BobJS->BobUA: setLocalDescription with answer 3229 BobUA->BobJS: multiple onicecandidate events with candidates 3231 // wait for ICE gathering to complete 3232 BobUA->BobJS: onicecandidate event with null candidate 3233 BobJS->BobUA: get |answer-A1| from currentLocalDescription 3235 // |answer-A1| is sent over signaling protocol to Alice 3236 BobJS->WebServer: signaling with |answer-A1| 3237 WebServer->AliceJS: signaling with |answer-A1| 3239 // |answer-A1| arrives at Alice 3240 AliceJS->AliceUA: setRemoteDescription with |answer-A1| 3241 AliceUA->AliceJS: onaddstream event with remoteStream 3243 // media flows 3244 BobUA->AliceUA: media sent from Bob to Alice 3245 AliceUA->BobUA: media sent from Alice to Bob 3247 The SDP for |offer-A1| looks like: 3249 v=0 3250 o=- 4962303333179871722 1 IN IP4 0.0.0.0 3251 s=- 3252 t=0 0 3253 a=group:BUNDLE a1 v1 3254 a=ice-options:trickle 3255 m=audio 56500 UDP/TLS/RTP/SAVPF 96 0 8 97 98 3256 c=IN IP4 192.0.2.1 3257 a=mid:a1 3258 a=rtcp:56501 IN IP4 192.0.2.1 3259 
a=msid:47017fee-b6c1-4162-929c-a25110252400 3260 f83006c5-a0ff-4e0a-9ed9-d3e6747be7d9 3261 a=sendrecv 3262 a=rtpmap:96 opus/48000/2 3263 a=rtpmap:0 PCMU/8000 3264 a=rtpmap:8 PCMA/8000 3265 a=rtpmap:97 telephone-event/8000 3266 a=rtpmap:98 telephone-event/48000 3267 a=maxptime:120 3268 a=ice-ufrag:ETEn1v9DoTMB9J4r 3269 a=ice-pwd:OtSK0WpNtpUjkY4+86js7ZQl 3270 a=fingerprint:sha-256 3271 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 3272 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3273 a=setup:actpass 3274 a=rtcp-mux 3275 a=rtcp-rsize 3276 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level 3277 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 3278 a=candidate:3348148302 1 udp 2113937151 192.0.2.1 56500 3279 typ host 3280 a=candidate:3348148302 2 udp 2113937151 192.0.2.1 56501 3281 typ host 3282 a=end-of-candidates 3284 m=video 56502 UDP/TLS/RTP/SAVPF 100 101 3285 c=IN IP4 192.0.2.1 3286 a=rtcp:56503 IN IP4 192.0.2.1 3287 a=mid:v1 3288 a=msid:61317484-2ed4-49d7-9eb7-1414322a7aae 3289 f30bdb4a-5db8-49b5-bcdc-e0c9a23172e0 3290 a=sendrecv 3291 a=rtpmap:100 VP8/90000 3292 a=rtpmap:101 rtx/90000 3293 a=fmtp:101 apt=100 3294 a=ice-ufrag:BGKkWnG5GmiUpdIV 3295 a=ice-pwd:mqyWsAjvtKwTGnvhPztQ9mIf 3296 a=fingerprint:sha-256 3297 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 3298 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3300 a=setup:actpass 3301 a=rtcp-mux 3302 a=rtcp-rsize 3303 a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:mid 3304 a=rtcp-fb:100 ccm fir 3305 a=rtcp-fb:100 nack 3306 a=rtcp-fb:100 nack pli 3307 a=candidate:3348148302 1 udp 2113937151 192.0.2.1 56502 3308 typ host 3309 a=candidate:3348148302 2 udp 2113937151 192.0.2.1 56503 3310 typ host 3311 a=end-of-candidates 3313 The SDP for |answer-A1| looks like: 3315 v=0 3316 o=- 6729291447651054566 1 IN IP4 0.0.0.0 3317 s=- 3318 t=0 0 3319 a=group:BUNDLE a1 v1 3320 m=audio 20000 UDP/TLS/RTP/SAVPF 96 0 8 97 98 3321 c=IN IP4 192.0.2.2 3322 a=mid:a1 3323 a=rtcp:20000 IN IP4 192.0.2.2 3324 
a=msid:PI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1 3325 PI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1a0 3326 a=sendrecv 3327 a=rtpmap:96 opus/48000/2 3328 a=rtpmap:0 PCMU/8000 3329 a=rtpmap:8 PCMA/8000 3330 a=rtpmap:97 telephone-event/8000 3331 a=rtpmap:98 telephone-event/48000 3332 a=maxptime:120 3333 a=ice-ufrag:6sFvz2gdLkEwjZEr 3334 a=ice-pwd:cOTZKZNVlO9RSGsEGM63JXT2 3335 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35 3336 :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08 3337 a=setup:active 3338 a=rtcp-mux 3339 a=rtcp-rsize 3340 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level 3341 a=candidate:2299743422 1 udp 2113937151 192.0.2.2 20000 3342 typ host 3343 a=end-of-candidates 3345 m=video 20000 UDP/TLS/RTP/SAVPF 100 101 3346 c=IN IP4 192.0.2.2 3347 a=rtcp:20001 IN IP4 192.0.2.2 3348 a=mid:v1 3349 a=msid:PI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1 3350 PI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1v0 3351 a=sendrecv 3352 a=rtpmap:100 VP8/90000 3353 a=rtpmap:101 rtx/90000 3354 a=fmtp:101 apt=100 3355 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35 3356 :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08 3357 a=setup:active 3358 a=rtcp-mux 3359 a=rtcp-rsize 3360 a=rtcp-fb:100 ccm fir 3361 a=rtcp-fb:100 nack 3362 a=rtcp-fb:100 nack pli 3364 7.2. Normal Examples 3366 This section shows a typical example of a session between two 3367 browsers setting up an audio channel and a data channel. Trickle ICE 3368 is used in full trickle mode with a bundle policy of max-bundle, an 3369 RTCP mux policy of require, and a single TURN server. Later, two 3370 video flows, one for the presenter and one for screen sharing, are 3371 added to the session. This example shows Alice's browser initiating 3372 the session to Bob's browser. The messages from Alice's JS to Bob's 3373 JS are assumed to flow over some signaling protocol via a web server.
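In this example the trickled candidates are exchanged as candidate-attribute strings, for instance "candidate:109270923 1 udp 2122194687 192.168.1.2 51556 typ host". As a rough illustration of what such a string carries, the sketch below splits one into named fields following the field order of the candidate attribute in [RFC5245] Section 15.1. The function name and the dictionary keys are hypothetical, chosen for this sketch only, and the parser does no validation beyond checking the basic shape.

```python
def parse_candidate(line):
    """Split a candidate-attribute string into named fields.

    Field order follows the candidate attribute of RFC 5245, Section 15.1:
    foundation, component-id, transport, priority, address, port,
    "typ" <type>, then optional name/value pairs such as raddr and rport.
    """
    text = line.strip()
    for prefix in ("a=candidate:", "candidate:"):
        if text.startswith(prefix):
            text = text[len(prefix):]
            break
    else:
        raise ValueError("not a candidate line")
    # split() also absorbs the line wrapping used in the printed examples.
    tokens = text.split()
    if len(tokens) < 8 or tokens[6] != "typ":
        raise ValueError("malformed candidate line")
    parsed = {
        "foundation": tokens[0],
        "component": int(tokens[1]),
        "transport": tokens[2].lower(),
        "priority": int(tokens[3]),
        "address": tokens[4],
        "port": int(tokens[5]),
        "type": tokens[7],
    }
    # Remaining tokens come in name/value pairs (e.g. "raddr <addr>").
    extras = tokens[8:]
    for name, value in zip(extras[0::2], extras[1::2]):
        parsed[name] = int(value) if name == "rport" else value
    return parsed
```

For a server reflexive candidate, the raddr and rport entries give the host address and port from which the reflexive address was derived.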
3375 // set up local media state 3376 AliceJS->AliceUA: create new PeerConnection 3377 AliceJS->AliceUA: addTrack with an audio track 3378 AliceJS->AliceUA: createDataChannel to get data channel 3379 AliceJS->AliceUA: createOffer to get |offer-B1| 3380 AliceJS->AliceUA: setLocalDescription with |offer-B1| 3382 // |offer-B1| is sent over signaling protocol to Bob 3383 AliceJS->WebServer: signaling with |offer-B1| 3384 WebServer->BobJS: signaling with |offer-B1| 3386 // |offer-B1| arrives at Bob 3387 BobJS->BobUA: create a PeerConnection 3388 BobJS->BobUA: setRemoteDescription with |offer-B1| 3389 BobUA->BobJS: onaddstream with audio track from Alice 3391 // candidates are sent to Bob 3392 AliceUA->AliceJS: onicecandidate event with |candidate-B1| (host) 3393 AliceJS->WebServer: signaling with |candidate-B1| 3394 AliceUA->AliceJS: onicecandidate event with |candidate-B2| (srflx) 3395 AliceJS->WebServer: signaling with |candidate-B2| 3397 WebServer->BobJS: signaling with |candidate-B1| 3398 BobJS->BobUA: addIceCandidate with |candidate-B1| 3399 WebServer->BobJS: signaling with |candidate-B2| 3400 BobJS->BobUA: addIceCandidate with |candidate-B2| 3402 // Bob accepts call 3403 BobJS->BobUA: addTrack with local audio 3404 BobJS->BobUA: createDataChannel to get data channel 3405 BobJS->BobUA: createAnswer to get |answer-B1| 3406 BobJS->BobUA: setLocalDescription with |answer-B1| 3408 // |answer-B1| is sent to Alice 3409 BobJS->WebServer: signaling with |answer-B1| 3410 WebServer->AliceJS: signaling with |answer-B1| 3411 AliceJS->AliceUA: setRemoteDescription with |answer-B1| 3412 AliceUA->AliceJS: onaddstream event with audio track from Bob 3414 // candidates are sent to Alice 3415 BobUA->BobJS: onicecandidate event with |candidate-B3| (host) 3416 BobJS->WebServer: signaling with |candidate-B3| 3417 BobUA->BobJS: onicecandidate event with |candidate-B4| (srflx) 3418 BobJS->WebServer: signaling with |candidate-B4| 3420 WebServer->AliceJS: signaling with |candidate-B3| 3421 
AliceJS->AliceUA: addIceCandidate with |candidate-B3| 3422 WebServer->AliceJS: signaling with |candidate-B4| 3423 AliceJS->AliceUA: addIceCandidate with |candidate-B4| 3425 // data channel opens 3426 BobUA->BobJS: ondatachannel event 3427 AliceUA->AliceJS: ondatachannel event 3428 BobUA->BobJS: onopen 3429 AliceUA->AliceJS: onopen 3431 // media is flowing between browsers 3432 BobUA->AliceUA: audio+data sent from Bob to Alice 3433 AliceUA->BobUA: audio+data sent from Alice to Bob 3435 // some time later Bob adds two video streams 3436 // note, no candidates exchanged, because of bundle 3437 BobJS->BobUA: addTrack with first video stream 3438 BobJS->BobUA: addTrack with second video stream 3439 BobJS->BobUA: createOffer to get |offer-B2| 3440 BobJS->BobUA: setLocalDescription with |offer-B2| 3442 // |offer-B2| is sent to Alice 3443 BobJS->WebServer: signaling with |offer-B2| 3444 WebServer->AliceJS: signaling with |offer-B2| 3445 AliceJS->AliceUA: setRemoteDescription with |offer-B2| 3446 AliceUA->AliceJS: onaddstream event with first video stream 3447 AliceUA->AliceJS: onaddstream event with second video stream 3448 AliceJS->AliceUA: createAnswer to get |answer-B2| 3449 AliceJS->AliceUA: setLocalDescription with |answer-B2| 3451 // |answer-B2| is sent over signaling protocol to Bob 3452 AliceJS->WebServer: signaling with |answer-B2| 3453 WebServer->BobJS: signaling with |answer-B2| 3454 BobJS->BobUA: setRemoteDescription with |answer-B2| 3456 // media is flowing between browsers 3457 BobUA->AliceUA: audio+video+data sent from Bob to Alice 3458 AliceUA->BobUA: audio+video+data sent from Alice to Bob 3460 The SDP for |offer-B1| looks like: 3462 v=0 3463 o=- 4962303333179871723 1 IN IP4 0.0.0.0 3464 s=- 3465 t=0 0 3466 a=group:BUNDLE a1 d1 3467 a=ice-options:trickle 3468 m=audio 9 UDP/TLS/RTP/SAVPF 96 0 8 97 98 3469 c=IN IP4 0.0.0.0 3470 a=rtcp:9 IN IP4 0.0.0.0 3471 a=mid:a1 3472 a=msid:57017fee-b6c1-4162-929c-a25110252400 3473 e83006c5-a0ff-4e0a-9ed9-d3e6747be7d9 
3474 a=sendrecv 3475 a=rtpmap:96 opus/48000/2 3476 a=rtpmap:0 PCMU/8000 3477 a=rtpmap:8 PCMA/8000 3478 a=rtpmap:97 telephone-event/8000 3479 a=rtpmap:98 telephone-event/48000 3480 a=maxptime:120 3481 a=ice-ufrag:ATEn1v9DoTMB9J4r 3482 a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl 3483 a=fingerprint:sha-256 3484 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 3485 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3486 a=setup:actpass 3487 a=rtcp-mux 3488 a=rtcp-rsize 3489 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level 3490 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 3492 m=application 0 UDP/DTLS/SCTP webrtc-datachannel 3493 c=IN IP4 0.0.0.0 3494 a=bundle-only 3495 a=mid:d1 3496 a=fmtp:webrtc-datachannel max-message-size=65536 3497 a=sctp-port 5000 3498 a=fingerprint:sha-256 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 3499 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3500 a=setup:actpass 3502 The SDP for |candidate-B1| looks like: 3504 candidate:109270923 1 udp 2122194687 192.168.1.2 51556 typ host 3506 The SDP for |candidate-B2| looks like: 3508 candidate:4036177503 1 udp 1685987071 11.22.33.44 52546 typ srflx 3509 raddr 192.168.1.2 rport 51556 3511 The SDP for |answer-B1| looks like: 3513 v=0 3514 o=- 7729291447651054566 1 IN IP4 0.0.0.0 3515 s=- 3516 t=0 0 3517 a=group:BUNDLE a1 d1 3518 a=ice-options:trickle 3519 m=audio 9 UDP/TLS/RTP/SAVPF 96 0 8 97 98 3520 c=IN IP4 0.0.0.0 3521 a=rtcp:9 IN IP4 0.0.0.0 3522 a=mid:a1 3523 a=msid:QI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1 3524 QI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1a0 3525 a=sendrecv 3526 a=rtpmap:96 opus/48000/2 3527 a=rtpmap:0 PCMU/8000 3528 a=rtpmap:8 PCMA/8000 3529 a=rtpmap:97 telephone-event/8000 3530 a=rtpmap:98 telephone-event/48000 3531 a=maxptime:120 3532 a=ice-ufrag:7sFvz2gdLkEwjZEr 3533 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2 3534 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35 3535 :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08 3536 a=setup:active 3537 a=rtcp-mux 3538 a=rtcp-rsize 3539 
a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level 3540 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 3542 m=application 9 UDP/DTLS/SCTP webrtc-datachannel 3543 c=IN IP4 0.0.0.0 3544 a=mid:d1 3545 a=fmtp:webrtc-datachannel max-message-size=65536 3546 a=sctp-port 5000 3547 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35 3548 :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08 3549 a=setup:active 3551 The SDP for |candidate-B3| looks like: 3553 candidate:109270924 1 udp 2122194687 192.168.2.3 61665 typ host 3555 The SDP for |candidate-B4| looks like: 3557 candidate:4036177504 1 udp 1685987071 55.66.77.88 64532 typ srflx 3558 raddr 192.168.2.3 rport 61665 3560 The SDP for |offer-B2| looks like: (note the increment of the version 3561 number in the o= line, and the c= and a=rtcp lines, which indicate 3562 the local candidate that was selected) 3564 v=0 3565 o=- 7729291447651054566 2 IN IP4 0.0.0.0 3566 s=- 3567 t=0 0 3568 a=group:BUNDLE a1 d1 v1 v2 3569 a=ice-options:trickle 3570 m=audio 64532 UDP/TLS/RTP/SAVPF 96 0 8 97 98 3571 c=IN IP4 55.66.77.88 3572 a=rtcp:64532 IN IP4 55.66.77.88 3573 a=mid:a1 3574 a=msid:QI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1 3575 QI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1a0 3576 a=sendrecv 3577 a=rtpmap:96 opus/48000/2 3578 a=rtpmap:0 PCMU/8000 3579 a=rtpmap:8 PCMA/8000 3580 a=rtpmap:97 telephone-event/8000 3581 a=rtpmap:98 telephone-event/48000 3582 a=maxptime:120 3583 a=ice-ufrag:7sFvz2gdLkEwjZEr 3584 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2 3585 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35 3586 :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08 3587 a=setup:actpass 3588 a=rtcp-mux 3589 a=rtcp-rsize 3590 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level 3591 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 3592 a=candidate:109270924 1 udp 2122194687 192.168.2.3 61665 typ host 3593 a=candidate:4036177504 1 udp 1685987071 55.66.77.88 64532 typ srflx 3594 raddr 192.168.2.3 rport 61665 3595 
a=candidate:3671762467 1 udp 41819903 66.77.88.99 50416 typ relay 3596 raddr 55.66.77.88 rport 64532 3598 a=end-of-candidates 3600 m=application 64532 UDP/DTLS/SCTP webrtc-datachannel 3601 c=IN IP4 55.66.77.88 3602 a=mid:d1 3603 a=fmtp:webrtc-datachannel max-message-size=65536 3604 a=sctp-port 5000 3605 a=ice-ufrag:7sFvz2gdLkEwjZEr 3606 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2 3607 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35 3608 :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08 3609 a=setup:actpass 3610 a=candidate:109270924 1 udp 2122194687 192.168.2.3 61665 typ host 3611 a=candidate:4036177504 1 udp 1685987071 55.66.77.88 64532 typ srflx 3612 raddr 192.168.2.3 rport 61665 3613 a=candidate:3671762467 1 udp 41819903 66.77.88.99 50416 typ relay 3614 raddr 55.66.77.88 rport 64532 3615 a=end-of-candidates 3617 m=video 0 UDP/TLS/RTP/SAVPF 100 101 3618 c=IN IP4 55.66.77.88 3619 a=bundle-only 3620 a=rtcp:64532 IN IP4 55.66.77.88 3621 a=mid:v1 3622 a=msid:61317484-2ed4-49d7-9eb7-1414322a7aae 3623 f30bdb4a-5db8-49b5-bcdc-e0c9a23172e0 3624 a=sendrecv 3625 a=rtpmap:100 VP8/90000 3626 a=rtpmap:101 rtx/90000 3627 a=fmtp:101 apt=100 3628 a=fingerprint:sha-256 3629 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 3630 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3631 a=setup:actpass 3632 a=rtcp-mux 3633 a=rtcp-rsize 3634 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 3635 a=rtcp-fb:100 ccm fir 3636 a=rtcp-fb:100 nack 3637 a=rtcp-fb:100 nack pli 3639 m=video 0 UDP/TLS/RTP/SAVPF 100 101 3640 c=IN IP4 55.66.77.88 3641 a=bundle-only 3642 a=rtcp:64532 IN IP4 55.66.77.88 3643 a=mid:v2 3644 a=msid:71317484-2ed4-49d7-9eb7-1414322a7aae 3645 f30bdb4a-5db8-49b5-bcdc-e0c9a23172e0 3647 a=sendrecv 3648 a=rtpmap:100 VP8/90000 3649 a=rtpmap:101 rtx/90000 3650 a=fmtp:101 apt=100 3651 a=fingerprint:sha-256 3652 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 3653 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3654 a=setup:actpass 3655 a=rtcp-mux 3656 a=rtcp-rsize 3657
a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 3658 a=rtcp-fb:100 ccm fir 3659 a=rtcp-fb:100 nack 3660 a=rtcp-fb:100 nack pli 3662 The SDP for |answer-B2| looks like: (note the use of setup:passive to 3663 maintain the existing DTLS roles, and the use of a=recvonly to 3664 indicate that the video streams are one-way) 3666 v=0 3667 o=- 4962303333179871723 2 IN IP4 0.0.0.0 3668 s=- 3669 t=0 0 3670 a=group:BUNDLE a1 d1 v1 v2 3671 a=ice-options:trickle 3672 m=audio 52546 UDP/TLS/RTP/SAVPF 96 0 8 97 98 3673 c=IN IP4 11.22.33.44 3674 a=rtcp:52546 IN IP4 11.22.33.44 3675 a=mid:a1 3676 a=msid:57017fee-b6c1-4162-929c-a25110252400 3677 e83006c5-a0ff-4e0a-9ed9-d3e6747be7d9 3678 a=sendrecv 3679 a=rtpmap:96 opus/48000/2 3680 a=rtpmap:0 PCMU/8000 3681 a=rtpmap:8 PCMA/8000 3682 a=rtpmap:97 telephone-event/8000 3683 a=rtpmap:98 telephone-event/48000 3684 a=maxptime:120 3685 a=ice-ufrag:ATEn1v9DoTMB9J4r 3686 a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl 3687 a=fingerprint:sha-256 3688 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 3689 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3690 a=setup:passive 3691 a=rtcp-mux 3692 a=rtcp-rsize 3693 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level 3694 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 3695 a=candidate:109270923 1 udp 2122194687 192.168.1.2 51556 typ host 3696 a=candidate:4036177503 1 udp 1685987071 11.22.33.44 52546 typ srflx 3697 raddr 192.168.1.2 rport 51556 3698 a=candidate:3671762466 1 udp 41819903 22.33.44.55 61405 typ relay 3699 raddr 11.22.33.44 rport 52546 3700 a=end-of-candidates 3702 m=application 52546 UDP/DTLS/SCTP webrtc-datachannel 3703 c=IN IP4 11.22.33.44 3704 a=mid:d1 3705 a=fmtp:webrtc-datachannel max-message-size=65536 3706 a=sctp-port 5000 3707 a=fingerprint:sha-256 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 3708 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3709 a=setup:passive 3711 m=video 52546 UDP/TLS/RTP/SAVPF 100 101 3712 c=IN IP4 11.22.33.44 3713 a=rtcp:52546 IN IP4 11.22.33.44 3714 a=mid:v1 
3715 a=recvonly 3716 a=rtpmap:100 VP8/90000 3717 a=rtpmap:101 rtx/90000 3718 a=fmtp:101 apt=100 3719 a=fingerprint:sha-256 3720 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 3721 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3722 a=setup:passive 3723 a=rtcp-mux 3724 a=rtcp-rsize 3725 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 3726 a=rtcp-fb:100 ccm fir 3727 a=rtcp-fb:100 nack 3728 a=rtcp-fb:100 nack pli 3730 m=video 52546 UDP/TLS/RTP/SAVPF 100 101 3731 c=IN IP4 11.22.33.44 3732 a=rtcp:52546 IN IP4 11.22.33.44 3733 a=mid:v2 3734 a=recvonly 3735 a=rtpmap:100 VP8/90000 3736 a=rtpmap:101 rtx/90000 3737 a=fmtp:101 apt=100 3738 a=fingerprint:sha-256 3739 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 3740 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3742 a=setup:passive 3743 a=rtcp-mux 3744 a=rtcp-rsize 3745 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 3746 a=rtcp-fb:100 ccm fir 3747 a=rtcp-fb:100 nack 3748 a=rtcp-fb:100 nack pli 3750 8. Security Considerations 3752 The IETF has published separate documents 3753 [I-D.ietf-rtcweb-security-arch] [I-D.ietf-rtcweb-security] describing 3754 the security architecture for WebRTC as a whole. The remainder of 3755 this section describes security considerations for this document. 3757 While formally the JSEP interface is an API, it is better to think of 3758 it as an Internet protocol, with the JS being untrustworthy from the 3759 perspective of the browser. Thus, the threat model of [RFC3552] 3760 applies. In particular, JS can call the API in any order and with 3761 any inputs, including malicious ones. This is particularly relevant 3762 when we consider the SDP which is passed to setLocalDescription(). 3763 While correct API usage requires that the application pass in SDP 3764 which was derived from createOffer() or createAnswer(), there is no 3765 guarantee that applications do so. The browser MUST be prepared for 3766 the JS to pass in bogus data instead.
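One defensive posture a browser could take toward untrusted input to setLocalDescription() is sketched below. This is an illustration only: it assumes the implementation remembers the last description it generated for each type and refuses anything else. The class and method names are hypothetical and are not part of the JSEP API, and a real implementation would still have to parse and validate the SDP itself.

```python
class LocalDescriptionGuard:
    """Remembers generated descriptions and rejects anything else.

    Hypothetical helper: models only the "was this SDP produced by
    createOffer()/createAnswer()?" check, not SDP parsing or validation.
    """

    def __init__(self):
        # Maps a description type ("offer" or "answer") to the SDP last
        # generated for that type.
        self._generated = {}

    def record_generated(self, sdp_type, sdp):
        """Called when createOffer()/createAnswer() produces a description."""
        self._generated[sdp_type] = sdp

    def check_local(self, sdp_type, sdp):
        """Raise ValueError unless sdp is exactly what was generated."""
        expected = self._generated.get(sdp_type)
        if expected is None:
            raise ValueError("no %s was generated by this connection" % sdp_type)
        if sdp != expected:
            raise ValueError("%s SDP does not match the generated description" % sdp_type)
```

The point of the sketch is that the check lives on the browser side of the API boundary, so it holds no matter what order or arguments the JS uses.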
3768 Conversely, the application programmer MUST recognize that the JS 3769 does not have complete control of browser behavior. One case that 3770 bears particular mention is that editing ICE candidates out of the 3771 SDP or suppressing trickled candidates does not have the expected 3772 behavior: implementations will still perform checks from those 3773 candidates even if they are not sent to the other side. Thus, for 3774 instance, it is not possible to prevent the remote peer from learning 3775 your public IP address by removing server reflexive candidates. 3776 Applications which wish to conceal their public IP address should 3777 instead configure the ICE agent to use only relay candidates. 3779 9. IANA Considerations 3781 This document requires no actions from IANA. 3783 10. Acknowledgements 3785 Significant text incorporated in the draft, as well as review, was 3786 provided by Peter Thatcher, Taylor Brandstetter, Harald Alvestrand, 3787 and Suhas Nandakumar. Dan Burnett, Neil Stratford, Anant Narayanan, 3788 Andrew Hutton, Richard Ejzak, Adam Bergkvist, and Matthew Kaufman all 3789 provided valuable feedback on this proposal. 3791 11. References 3793 11.1. Normative References 3795 [I-D.ietf-avtext-rid] 3796 Roach, A., Nandakumar, S., and P. Thatcher, "RTP Stream 3797 Identifier (RID) Source Description (SDES)", draft-ietf- 3798 avtext-rid-00 (work in progress), February 2016. 3800 [I-D.ietf-ice-trickle] 3801 Ivov, E., Rescorla, E., Uberti, J., and P. Saint-Andre, 3802 "Trickle ICE: Incremental Provisioning of Candidates for 3803 the Interactive Connectivity Establishment (ICE) 3804 Protocol". 3806 [I-D.ietf-mmusic-4572-update] 3807 Holmberg, C., "Updates to RFC 4572", draft-ietf-mmusic- 3808 4572-update-05 (work in progress), June 2016. 3810 [I-D.ietf-mmusic-dtls-sdp] 3811 Holmberg, C. and R. Shpount, "Using the SDP Offer/Answer 3812 Mechanism for DTLS", draft-ietf-mmusic-dtls-sdp-14 (work 3813 in progress), July 2016.
3815 [I-D.ietf-mmusic-msid] 3816 Alvestrand, H., "Cross Session Stream Identification in 3817 the Session Description Protocol", draft-ietf-mmusic- 3818 msid-01 (work in progress), August 2013. 3820 [I-D.ietf-mmusic-mux-exclusive] 3821 Holmberg, C., "Indicating Exclusive Support of RTP/RTCP 3822 Multiplexing using SDP", draft-ietf-mmusic-mux- 3823 exclusive-08 (work in progress), June 2016. 3825 [I-D.ietf-mmusic-rid] 3826 Thatcher, P., Zanaty, M., Nandakumar, S., Burman, B., 3827 Roach, A., and B. Campen, "RTP Payload Format 3828 Constraints", draft-ietf-mmusic-rid-04 (work in progress), 3829 February 2016. 3831 [I-D.ietf-mmusic-sctp-sdp] 3832 Loreto, S. and G. Camarillo, "Stream Control Transmission 3833 Protocol (SCTP)-Based Media Transport in the Session 3834 Description Protocol (SDP)", draft-ietf-mmusic-sctp-sdp-04 3835 (work in progress), June 2013. 3837 [I-D.ietf-mmusic-sdp-bundle-negotiation] 3838 Holmberg, C., Alvestrand, H., and C. Jennings, 3839 "Multiplexing Negotiation Using Session Description 3840 Protocol (SDP) Port Numbers", draft-ietf-mmusic-sdp- 3841 bundle-negotiation-04 (work in progress), June 2013. 3843 [I-D.ietf-mmusic-sdp-mux-attributes] 3844 Nandakumar, S., "A Framework for SDP Attributes when 3845 Multiplexing", draft-ietf-mmusic-sdp-mux-attributes-01 3846 (work in progress), February 2014. 3848 [I-D.ietf-mmusic-sdp-simulcast] 3849 Burman, B., Westerlund, M., Nandakumar, S., and M. Zanaty, 3850 "Using Simulcast in SDP and RTP Sessions", draft-ietf- 3851 mmusic-sdp-simulcast-04 (work in progress), February 2016. 3853 [I-D.ietf-rtcweb-audio] 3854 Valin, J. and C. Bran, "WebRTC Audio Codec and Processing 3855 Requirements", draft-ietf-rtcweb-audio-02 (work in 3856 progress), August 2013. 3858 [I-D.ietf-rtcweb-fec] 3859 Uberti, J., "WebRTC Forward Error Correction 3860 Requirements", draft-ietf-rtcweb-fec-00 (work in 3861 progress), February 2015. 3863 [I-D.ietf-rtcweb-rtp-usage] 3864 Perkins, C., Westerlund, M., and J. 
Ott, "Web Real-Time 3865 Communication (WebRTC): Media Transport and Use of RTP", 3866 draft-ietf-rtcweb-rtp-usage-09 (work in progress), 3867 September 2013. 3869 [I-D.ietf-rtcweb-security] 3870 Rescorla, E., "Security Considerations for WebRTC", draft- 3871 ietf-rtcweb-security-06 (work in progress), January 2014. 3873 [I-D.ietf-rtcweb-security-arch] 3874 Rescorla, E., "WebRTC Security Architecture", draft-ietf- 3875 rtcweb-security-arch-09 (work in progress), February 2014. 3877 [I-D.ietf-rtcweb-video] 3878 Roach, A., "WebRTC Video Processing and Codec 3879 Requirements", draft-ietf-rtcweb-video-00 (work in 3880 progress), July 2014. 3882 [I-D.nandakumar-mmusic-proto-iana-registration] 3883 Nandakumar, S., "IANA registration of SDP 'proto' 3884 attribute for transporting RTP Media over TCP under 3885 various RTP profiles.", September 2014. 3887 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 3888 Requirement Levels", BCP 14, RFC 2119, March 1997. 3890 [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, 3891 A., Peterson, J., Sparks, R., Handley, M., and E. 3892 Schooler, "SIP: Session Initiation Protocol", RFC 3261, 3893 June 2002. 3895 [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model 3896 with Session Description Protocol (SDP)", RFC 3264, June 3897 2002. 3899 [RFC3552] Rescorla, E. and B. Korver, "Guidelines for Writing RFC 3900 Text on Security Considerations", BCP 72, RFC 3552, July 3901 2003. 3903 [RFC3605] Huitema, C., "Real Time Control Protocol (RTCP) attribute 3904 in Session Description Protocol (SDP)", RFC 3605, October 3905 2003. 3907 [RFC3890] Westerlund, M., "A Transport Independent Bandwidth 3908 Modifier for the Session Description Protocol (SDP)", 3909 RFC 3890, DOI 10.17487/RFC3890, September 2004, 3910 . 3912 [RFC4145] Yon, D. and G. Camarillo, "TCP-Based Media Transport in 3913 the Session Description Protocol (SDP)", RFC 4145, 3914 September 2005. 3916 [RFC4566] Handley, M., Jacobson, V., and C. 
Perkins, "SDP: Session 3917 Description Protocol", RFC 4566, July 2006. 3919 [RFC4572] Lennox, J., "Connection-Oriented Media Transport over the 3920 Transport Layer Security (TLS) Protocol in the Session 3921 Description Protocol (SDP)", RFC 4572, July 2006. 3923 [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey, 3924 "Extended RTP Profile for Real-time Transport Control 3925 Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July 3926 2006. 3928 [RFC5245] Rosenberg, J., "Interactive Connectivity Establishment 3929 (ICE): A Protocol for Network Address Translator (NAT) 3930 Traversal for Offer/Answer Protocols", RFC 5245, April 3931 2010. 3933 [RFC5285] Singer, D. and H. Desineni, "A General Mechanism for RTP 3934 Header Extensions", RFC 5285, July 2008. 3936 [RFC5761] Perkins, C. and M. Westerlund, "Multiplexing RTP Data and 3937 Control Packets on a Single Port", RFC 5761, April 2010. 3939 [RFC5888] Camarillo, G. and H. Schulzrinne, "The Session Description 3940 Protocol (SDP) Grouping Framework", RFC 5888, June 2010. 3942 [RFC6236] Johansson, I. and K. Jung, "Negotiation of Generic Image 3943 Attributes in the Session Description Protocol (SDP)", 3944 RFC 6236, May 2011. 3946 [RFC6347] Rescorla, E. and N. Modadugu, "Datagram Transport Layer 3947 Security Version 1.2", RFC 6347, January 2012. 3949 [RFC6904] Lennox, J., "Encryption of Header Extensions in the Secure 3950 Real-time Transport Protocol (SRTP)", RFC 6904, April 3951 2013. 3953 11.2. Informative References 3955 [I-D.ietf-rtcweb-ip-handling] 3956 Uberti, J. and G. Shieh, "WebRTC IP Address Handling 3957 Recommendations", draft-ietf-rtcweb-ip-handling-01 (work 3958 in progress), March 2016. 3960 [I-D.nandakumar-rtcweb-sdp] 3961 Nandakumar, S. and C. Jennings, "SDP for the WebRTC", 3962 draft-nandakumar-rtcweb-sdp-02 (work in progress), July 3963 2013. 3965 [RFC3389] Zopf, R., "Real-time Transport Protocol (RTP) Payload for 3966 Comfort Noise (CN)", RFC 3389, September 2002. 
3968 [RFC3556] Casner, S., "Session Description Protocol (SDP) Bandwidth 3969 Modifiers for RTP Control Protocol (RTCP) Bandwidth", 3970 RFC 3556, July 2003. 3972 [RFC3960] Camarillo, G. and H. Schulzrinne, "Early Media and Ringing 3973 Tone Generation in the Session Initiation Protocol (SIP)", 3974 RFC 3960, December 2004. 3976 [RFC4568] Andreasen, F., Baugher, M., and D. Wing, "Session 3977 Description Protocol (SDP) Security Descriptions for Media 3978 Streams", RFC 4568, July 2006. 3980 [RFC4588] Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R. 3981 Hakenberg, "RTP Retransmission Payload Format", RFC 4588, 3982 July 2006. 3984 [RFC4733] Schulzrinne, H. and T. Taylor, "RTP Payload for DTMF 3985 Digits, Telephony Tones, and Telephony Signals", RFC 4733, 3986 DOI 10.17487/RFC4733, December 2006, 3987 . 3989 [RFC5506] Johansson, I. and M. Westerlund, "Support for Reduced-Size 3990 Real-Time Transport Control Protocol (RTCP): Opportunities 3991 and Consequences", RFC 5506, April 2009. 3993 [RFC5576] Lennox, J., Ott, J., and T. Schierl, "Source-Specific 3994 Media Attributes in the Session Description Protocol 3995 (SDP)", RFC 5576, June 2009. 3997 [RFC5763] Fischl, J., Tschofenig, H., and E. Rescorla, "Framework 3998 for Establishing a Secure Real-time Transport Protocol 3999 (SRTP) Security Context Using Datagram Transport Layer 4000 Security (DTLS)", RFC 5763, May 2010. 4002 [RFC5764] McGrew, D. and E. Rescorla, "Datagram Transport Layer 4003 Security (DTLS) Extension to Establish Keys for the Secure 4004 Real-time Transport Protocol (SRTP)", RFC 5764, May 2010. 4006 [RFC6464] Lennox, J., Ed., Ivov, E., and E. Marocco, "A Real-time 4007 Transport Protocol (RTP) Header Extension for Client-to- 4008 Mixer Audio Level Indication", RFC 6464, 4009 DOI 10.17487/RFC6464, December 2011, 4010 . 4012 [W3C.WD-webrtc-20140617] 4013 Bergkvist, A., Burnett, D., Narayanan, A., and C. 
4014 Jennings, "WebRTC 1.0: Real-time Communication Between 4015 Browsers", World Wide Web Consortium WD WD-webrtc-20140617, June 2014. 4019 Appendix A. Appendix A 4021 For the syntax validation performed in Section 5.7, the following 4022 list of ABNF definitions is used:
4024 +-----------------------+-------------------------------------------+
4025 | Attribute             | Reference                                 |
4026 +-----------------------+-------------------------------------------+
4027 | ptime                 | [RFC4566] Section 9                       |
4028 | maxptime              | [RFC4566] Section 9                       |
4029 | rtpmap                | [RFC4566] Section 9                       |
4030 | recvonly              | [RFC4566] Section 9                       |
4031 | sendrecv              | [RFC4566] Section 9                       |
4032 | sendonly              | [RFC4566] Section 9                       |
4033 | inactive              | [RFC4566] Section 9                       |
4034 | framerate             | [RFC4566] Section 9                       |
4035 | fmtp                  | [RFC4566] Section 9                       |
4036 | quality               | [RFC4566] Section 9                       |
4037 | rtcp                  | [RFC3605] Section 2.1                     |
4038 | setup                 | [RFC4145] Sections 3, 4, and 5            |
4039 | connection            | [RFC4145] Sections 3, 4, and 5            |
4040 | fingerprint           | [RFC4572] Section 5                       |
4041 | rtcp-fb               | [RFC4585] Section 4.2                     |
4042 | candidate             | [RFC5245] Section 15.1                    |
4043 | remote-candidates     | [RFC5245] Section 15.2                    |
4044 | ice-lite              | [RFC5245] Section 15.3                    |
4045 | ice-ufrag             | [RFC5245] Section 15.4                    |
4046 | ice-pwd               | [RFC5245] Section 15.4                    |
4047 | ice-options           | [RFC5245] Section 15.5                    |
4048 | extmap                | [RFC5285] Section 7                       |
4049 | mid                   | [RFC5888] Sections 4 and 5                |
4050 | group                 | [RFC5888] Sections 4 and 5                |
4051 | imageattr             | [RFC6236] Section 3.1                     |
4052 | extmap (encrypt       | [RFC6904] Section 4                       |
4053 | option)               |                                           |
4054 | msid                  | [I-D.ietf-mmusic-msid] Section 2          |
4055 | rid                   | [I-D.ietf-mmusic-rid] Section 10          |
4056 | simulcast             | [I-D.ietf-mmusic-sdp-simulcast] Section   |
4057 |                       | 6.1                                       |
4058 | dtls-id               | [I-D.ietf-mmusic-dtls-sdp] Section 4      |
4059 +-----------------------+-------------------------------------------+
4061 Table 1: SDP ABNF References 4063 Appendix B.
Change log 4065 Note: This section will be removed by the RFC Editor before publication. 4067 Changes in draft-17: 4069 o Split createOffer and createAnswer sections to clearly indicate 4070 attributes which always appear and which only appear when not 4071 bundled into another m= section. 4073 o Add descriptions of RtpTransceiver methods. 4075 o Describe how to process RTCP feedback attributes. 4077 o Clarify transceiver directions and their interaction with RFC 3264. 4079 o Describe setCodecPreferences. 4081 o Update RTP demux algorithm. Include RTCP. 4083 o Update requirements for when a=rtcp is included, limiting to cases 4084 where it is needed for backward compatibility. 4086 o Clarify SAR handling. 4088 o Update addTrack matching algorithm. 4090 o Remove a=ssrc requirements. 4092 o Handle a=setup in reoffers. 4094 o Discuss how RTX/FEC should be handled. 4096 o Discuss how telephone-event should be handled. 4098 o Discuss how CN/DTX should be handled. 4100 o Add missing references to ABNF table. 4102 Changes in draft-16: 4104 o Update addIceCandidate to indicate ICE generation and allow per-m= 4105 section end-of-candidates. 4107 o Update fingerprint handling to use draft-ietf-mmusic-4572-update. 4109 o Update text around SDP processing of RTP header extensions and 4110 payload formats. 4112 o Add sections on simulcast, addTransceiver, and createDataChannel. 4114 o Clarify text to ensure that the session ID is a positive 63-bit 4115 integer. 4117 o Clarify SDP processing for direction indication. 4119 o Describe SDP processing for rtcp-mux-only. 4121 o Specify handling of the SDP session version in the o= line. 4123 o Require that, when doing a re-offer, the capabilities of the new 4124 session be mostly a subset of those of the previously 4125 negotiated session. 4127 o Clarified ICE restart interaction with bundle-only. 4129 o Remove support for changing SDP before calling 4130 setLocalDescription. 4132 o Specify algorithm for demuxing RTP based on MID, PT, and SSRC.
4134 o Clarify rules for rejecting m= lines when bundle policy is 4135 balanced or max-bundle. 4137 Changes in draft-15: 4139 o Clarify text around codecs offered in subsequent transactions to 4140 refer to what's been negotiated. 4142 o Rewrite LS handling text to indicate edge cases and that we're 4143 living with them. 4145 o Require that answerer reject m= lines when there are no codecs in 4146 common. 4148 o Enforce max-bundle on offer processing. 4150 o Fix TIAS formula to handle bits vs. kilobits. 4152 o Describe addTrack algorithm. 4154 o Clean up references. 4156 Changes in draft-14: 4158 o Added discussion of RtpTransceivers + RtpSenders + RtpReceivers, 4159 and how they interact with createOffer/createAnswer. 4161 o Removed obsolete OfferToReceiveX options. 4163 o Explained how addIceCandidate can be used for end-of-candidates. 4165 Changes in draft-13: 4167 o Clarified which SDP lines can be ignored. 4169 o Clarified how to handle various received attributes. 4171 o Revised how attributes should be generated for bundled m= lines. 4173 o Remove unused references. 4175 o Remove text advocating use of unilateral PTs. 4177 o Trigger an ICE restart even if the ICE candidate policy is being 4178 made more strict. 4180 o Remove the 'public' ICE candidate policy. 4182 o Move open issues/TODOs into GitHub issues. 4184 o Split local/remote description accessors into current/pending. 4186 o Clarify a=imageattr handling. 4188 o Add more detail on VoiceActivityDetection handling. 4190 o Reference draft-shieh-rtcweb-ip-handling. 4192 o Make it clear when an ICE restart should occur. 4194 o Resolve reference TODOs. 4196 o Remove MSID semantics. 4198 o ice-options are now at session level. 4200 o Default RTCP mux policy is now 'require'. 4202 Changes in draft-12: 4204 o Filled in sections on applying local and remote descriptions. 4206 o Discussed downscaling and upscaling to fulfill imageattr 4207 requirements. 
   o  Updated what SDP can be modified by the application.

   o  Updated to latest datachannel SDP.

   o  Allowed multiple fingerprint lines.

   o  Switched back to IPv4 for dummy candidates.

   o  Added additional clarity on ICE default candidates.

   Changes in draft-11:

   o  Clarified handling of RTP CNAMEs.

   o  Updated what SDP lines should be processed or ignored.

   o  Specified how a=imageattr should be used.

   Changes in draft-10:

   o  TODO

   Changes in draft-09:

   o  Don't return null for {local,remote}Description after close().

   o  Changed TCP/TLS to UDP/DTLS in RTP profile names.

   o  Separate out bundle and mux policy.

   o  Added specific references to FEC mechanisms.

   o  Added canTrickle mechanism.

   o  Added section on subsequent answers and answer options.

   o  Added text defining set{Local,Remote}Description behavior.

   Changes in draft-08:

   o  Added new example section and removed old examples in appendix.

   o  Fixed field handling.

   o  Added text describing a=rtcp attribute.

   o  Reworked handling of OfferToReceiveAudio and OfferToReceiveVideo
      per discussion at IETF 90.

   o  Reworked trickle ICE handling and its impact on m= and c= lines
      per discussion at interim.

   o  Added max-bundle-and-rtcp-mux policy.

   o  Added description of maxptime handling.

   o  Updated ICE candidate pool default to 0.

   o  Resolved open issues around AppID/receiver-ID.

   o  Reworked and expanded how changes to the ICE configuration are
      handled.

   o  Some reference updates.

   o  Editorial clarification.

   Changes in draft-07:

   o  Expanded discussion of VAD and Opus DTX.

   o  Added a security considerations section.

   o  Rewrote the section on modifying SDP to require implementations to
      clearly indicate whether any given modification is allowed.

   o  Clarified impact of IceRestart on CreateOffer in local-offer
      state.

   o  Guidance on whether attributes should be defined at the media
      level or the session level.

   o  Renamed "default" bundle policy to "balanced".

   o  Removed default ICE candidate pool size and clarified how it works.

   o  Defined a canonical order for assignment of MSTs to m= lines.

   o  Removed discussion of rehydration.

   o  Added Eric Rescorla as a draft editor.

   o  Cleaned up references.

   o  Editorial cleanup.

   Changes in draft-06:

   o  Reworked handling of m= line recycling.

   o  Added handling of BUNDLE and bundle-only.

   o  Clarified handling of rollback.

   o  Added text describing the ICE Candidate Pool and its behavior.

   o  Allowed OfferToReceiveX to create multiple recvonly m= sections.

   Changes in draft-05:

   o  Fixed several issues identified in the createOffer/Answer sections
      during document review.

   o  Updated references.

   Changes in draft-04:

   o  Filled in sections on createOffer and createAnswer.

   o  Added SDP examples.

   o  Fixed references.

   Changes in draft-03:

   o  Added text describing relationship to W3C specification.

   Changes in draft-02:

   o  Converted from nroff.

   o  Removed comparisons to old approaches abandoned by the working
      group.

   o  Removed stuff that has moved to W3C specification.

   o  Aligned SDP handling with W3C draft.

   o  Clarified section on forking.

   Changes in draft-01:

   o  Added diagrams for architecture and state machine.

   o  Added sections on forking and rehydration.

   o  Clarified meaning of "pranswer" and "answer".

   o  Reworked how ICE restarts and media directions are controlled.

   o  Added list of parameters that can be changed in a description.

   o  Updated suggested API and examples to match latest thinking.

   o  Suggested API and examples have been moved to an appendix.

   Changes in draft-00:

   o  Migrated from draft-uberti-rtcweb-jsep-02.

Authors' Addresses

   Justin Uberti
   Google
   747 6th St S
   Kirkland, WA 98033
   USA

   Email: justin@uberti.name

   Cullen Jennings
   Cisco
   400 3rd Avenue SW
   Calgary, AB T2P 4H2
   Canada

   Email: fluffy@iii.ca

   Eric Rescorla (editor)
   Mozilla
   331 Evelyn Ave
   Mountain View, CA 94041
   USA

   Email: ekr@rtfm.com