idnits 2.17.1 draft-ietf-rtcweb-jsep-12.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- == There are 44 instances of lines with non-RFC6890-compliant IPv4 addresses in the document. If these are example addresses, they should be changed. == There are 20 instances of lines with private range IPv4 addresses in the document. If these are generic example addresses, they should be changed to use any of the ranges defined in RFC 6890 (or successor): 192.0.2.x, 198.51.100.x or 203.0.113.x. -- The document has examples using IPv4 documentation addresses according to RFC6890, but does not use any IPv6 documentation addresses. Maybe there should be IPv6 examples, too? Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (October 18, 2015) is 3112 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Looks like a reference, but probably isn't: '0' on line 666 == Missing Reference: 'RFC1918' is mentioned on line 777, but not defined == Missing Reference: 'RFC4787' is mentioned on line 780, but not defined == Unused Reference: 'I-D.ietf-rtcweb-data-protocol' is defined on line 3329, but no explicit reference was found in the text == Unused Reference: 'RFC5124' is defined on line 3399, but no explicit reference was found in the text == Unused Reference: 'RFC7022' is defined on line 3428, but no explicit reference was found in the text == Outdated reference: A later version (-17) exists of draft-ietf-mmusic-msid-01 == Outdated reference: A later version (-26) exists of draft-ietf-mmusic-sctp-sdp-04 == Outdated reference: A later version (-54) exists of draft-ietf-mmusic-sdp-bundle-negotiation-04 == Outdated reference: A later version (-19) exists of draft-ietf-mmusic-sdp-mux-attributes-01 == Outdated reference: A later version (-02) exists of draft-ietf-mmusic-trickle-ice-00 == Outdated reference: A later version (-11) exists of draft-ietf-rtcweb-audio-02 == Outdated reference: A later version (-09) exists of draft-ietf-rtcweb-data-protocol-04 == Outdated reference: A later version (-10) exists of draft-ietf-rtcweb-fec-00 == Outdated reference: A later version (-26) exists of draft-ietf-rtcweb-rtp-usage-09 == Outdated reference: A later version (-12) exists of draft-ietf-rtcweb-security-06 == Outdated reference: A later version (-20) exists of draft-ietf-rtcweb-security-arch-09 == Outdated reference: A later version (-06) exists of draft-ietf-rtcweb-video-00 -- No information found for draft-nandakumar-mmusic-proto-iana-registration - is the name correct? -- Possible downref: Normative reference to a draft: ref. 
'I-D.nandakumar-mmusic-proto-iana-registration' ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866) ** Obsolete normative reference: RFC 4572 (Obsoleted by RFC 8122) ** Obsolete normative reference: RFC 5245 (Obsoleted by RFC 8445, RFC 8839) ** Obsolete normative reference: RFC 5285 (Obsoleted by RFC 8285) ** Obsolete normative reference: RFC 6347 (Obsoleted by RFC 9147) == Outdated reference: A later version (-08) exists of draft-nandakumar-rtcweb-sdp-02 Summary: 5 errors (**), 0 flaws (~~), 21 warnings (==), 5 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group J. Uberti 3 Internet-Draft Google 4 Intended status: Standards Track C. Jennings 5 Expires: April 20, 2016 Cisco 6 E. Rescorla, Ed. 7 Mozilla 8 October 18, 2015 10 Javascript Session Establishment Protocol 11 draft-ietf-rtcweb-jsep-12 13 Abstract 15 This document describes the mechanisms for allowing a Javascript 16 application to control the signaling plane of a multimedia session 17 via the interface specified in the W3C RTCPeerConnection API, and 18 discusses how this relates to existing signaling protocols. 20 Status of This Memo 22 This Internet-Draft is submitted in full conformance with the 23 provisions of BCP 78 and BCP 79. 25 Internet-Drafts are working documents of the Internet Engineering 26 Task Force (IETF). Note that other groups may also distribute 27 working documents as Internet-Drafts. The list of current Internet- 28 Drafts is at http://datatracker.ietf.org/drafts/current/. 30 Internet-Drafts are draft documents valid for a maximum of six months 31 and may be updated, replaced, or obsoleted by other documents at any 32 time. It is inappropriate to use Internet-Drafts as reference 33 material or to cite them other than as "work in progress." 35 This Internet-Draft will expire on April 20, 2016. 
37 Copyright Notice 39 Copyright (c) 2015 IETF Trust and the persons identified as the 40 document authors. All rights reserved. 42 This document is subject to BCP 78 and the IETF Trust's Legal 43 Provisions Relating to IETF Documents 44 (http://trustee.ietf.org/license-info) in effect on the date of 45 publication of this document. Please review these documents 46 carefully, as they describe your rights and restrictions with respect 47 to this document. Code Components extracted from this document must 48 include Simplified BSD License text as described in Section 4.e of 49 the Trust Legal Provisions and are provided without warranty as 50 described in the Simplified BSD License. 52 Table of Contents 54 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 55 1.1. General Design of JSEP . . . . . . . . . . . . . . . . . 3 56 1.2. Other Approaches Considered . . . . . . . . . . . . . . . 5 57 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6 58 3. Semantics and Syntax . . . . . . . . . . . . . . . . . . . . 6 59 3.1. Signaling Model . . . . . . . . . . . . . . . . . . . . . 6 60 3.2. Session Descriptions and State Machine . . . . . . . . . 7 61 3.3. Session Description Format . . . . . . . . . . . . . . . 10 62 3.4. ICE . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 63 3.4.1. ICE Gathering Overview . . . . . . . . . . . . . . . 10 64 3.4.2. ICE Candidate Trickling . . . . . . . . . . . . . . . 11 65 3.4.2.1. ICE Candidate Format . . . . . . . . . . . . . . 11 66 3.4.3. ICE Candidate Policy . . . . . . . . . . . . . . . . 12 67 3.4.4. ICE Candidate Pool . . . . . . . . . . . . . . . . . 13 68 3.5. Video Size Negotiation . . . . . . . . . . . . . . . . . 13 69 3.5.1. Creating an imageattr Attribute . . . . . . . . . . . 13 70 3.5.2. Interpreting an imageattr Attribute . . . . . . . . . 14 71 3.6. Interactions With Forking . . . . . . . . . . . . . . . . 15 72 3.6.1. Sequential Forking . . . . . . . . . . . . . . . . . 
15 73 3.6.2. Parallel Forking . . . . . . . . . . . . . . . . . . 16 74 4. Interface . . . . . . . . . . . . . . . . . . . . . . . . . . 17 75 4.1. Methods . . . . . . . . . . . . . . . . . . . . . . . . . 17 76 4.1.1. Constructor . . . . . . . . . . . . . . . . . . . . . 17 77 4.1.2. createOffer . . . . . . . . . . . . . . . . . . . . . 19 78 4.1.3. createAnswer . . . . . . . . . . . . . . . . . . . . 20 79 4.1.4. SessionDescriptionType . . . . . . . . . . . . . . . 21 80 4.1.4.1. Use of Provisional Answers . . . . . . . . . . . 22 81 4.1.4.2. Rollback . . . . . . . . . . . . . . . . . . . . 22 82 4.1.5. setLocalDescription . . . . . . . . . . . . . . . . . 23 83 4.1.6. setRemoteDescription . . . . . . . . . . . . . . . . 24 84 4.1.7. localDescription . . . . . . . . . . . . . . . . . . 24 85 4.1.8. remoteDescription . . . . . . . . . . . . . . . . . . 24 86 4.1.9. canTrickleIceCandidates . . . . . . . . . . . . . . . 24 87 4.1.10. setConfiguration . . . . . . . . . . . . . . . . . . 25 88 4.1.11. addIceCandidate . . . . . . . . . . . . . . . . . . . 26 89 5. SDP Interaction Procedures . . . . . . . . . . . . . . . . . 26 90 5.1. Requirements Overview . . . . . . . . . . . . . . . . . . 26 91 5.1.1. Implementation Requirements . . . . . . . . . . . . . 26 92 5.1.2. Usage Requirements . . . . . . . . . . . . . . . . . 28 93 5.1.3. Profile Names and Interoperability . . . . . . . . . 28 94 5.2. Constructing an Offer . . . . . . . . . . . . . . . . . . 29 95 5.2.1. Initial Offers . . . . . . . . . . . . . . . . . . . 29 96 5.2.2. Subsequent Offers . . . . . . . . . . . . . . . . . . 34 97 5.2.3. Options Handling . . . . . . . . . . . . . . . . . . 37 98 5.2.3.1. OfferToReceiveAudio . . . . . . . . . . . . . . . 37 99 5.2.3.2. OfferToReceiveVideo . . . . . . . . . . . . . . . 38 100 5.2.3.3. IceRestart . . . . . . . . . . . . . . . . . . . 38 101 5.2.3.4. VoiceActivityDetection . . . . . . . . . . . . . 38 102 5.3. Generating an Answer . . . . . . . . . . . . . . . . 
. . 39 103 5.3.1. Initial Answers . . . . . . . . . . . . . . . . . . . 39 104 5.3.2. Subsequent Answers . . . . . . . . . . . . . . . . . 43 105 5.3.3. Options Handling . . . . . . . . . . . . . . . . . . 44 106 5.3.3.1. VoiceActivityDetection . . . . . . . . . . . . . 45 107 5.4. Processing a Local Description . . . . . . . . . . . . . 45 108 5.5. Processing a Remote Description . . . . . . . . . . . . . 45 109 5.6. Parsing a Session Description . . . . . . . . . . . . . . 46 110 5.6.1. Session-Level Parsing . . . . . . . . . . . . . . . . 46 111 5.6.2. Media Section Parsing . . . . . . . . . . . . . . . . 48 112 5.6.3. Semantics Verification . . . . . . . . . . . . . . . 49 113 5.7. Applying a Local Description . . . . . . . . . . . . . . 50 114 5.8. Applying a Remote Description . . . . . . . . . . . . . . 51 115 5.9. Applying an Answer . . . . . . . . . . . . . . . . . . . 53 116 6. Configurable SDP Parameters . . . . . . . . . . . . . . . . . 54 117 7. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 55 118 7.1. Simple Example . . . . . . . . . . . . . . . . . . . . . 56 119 7.2. Normal Examples . . . . . . . . . . . . . . . . . . . . . 60 120 8. Security Considerations . . . . . . . . . . . . . . . . . . . 71 121 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 71 122 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 71 123 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 72 124 11.1. Normative References . . . . . . . . . . . . . . . . . . 72 125 11.2. Informative References . . . . . . . . . . . . . . . . . 75 126 Appendix A. Change log . . . . . . . . . . . . . . . . . . . . . 76 127 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 79 129 1. Introduction 131 This document describes how the W3C WEBRTC RTCPeerConnection 132 interface[W3C.WD-webrtc-20140617] is used to control the setup, 133 management and teardown of a multimedia session. 135 1.1. 
General Design of JSEP 137 The thinking behind WebRTC call setup has been to fully specify and 138 control the media plane, but to leave the signaling plane up to the 139 application as much as possible. The rationale is that different 140 applications may prefer to use different protocols, such as the 141 existing SIP or Jingle call signaling protocols, or something custom 142 to the particular application, perhaps for a novel use case. In this 143 approach, the key information that needs to be exchanged is the 144 multimedia session description, which specifies the 145 transport and media configuration information necessary to establish 146 the media plane. 148 With these considerations in mind, this document describes the 149 Javascript Session Establishment Protocol (JSEP) that allows for full 150 control of the signaling state machine from Javascript. JSEP removes 151 the browser almost entirely from the core signaling flow, which is 152 instead handled by the Javascript making use of two interfaces: (1) 153 passing in local and remote session descriptions and (2) interacting 154 with the ICE state machine. 156 In this document, the use of JSEP is described as if it always occurs 157 between two browsers. Note, though, that in many cases it will actually be 158 between a browser and some kind of server, such as a gateway or MCU. 159 This distinction is invisible to the browser; it just follows the 160 instructions it is given via the API. 162 JSEP's handling of session descriptions is simple and 163 straightforward. Whenever an offer/answer exchange is needed, the 164 initiating side creates an offer by calling a createOffer() API. The 165 application optionally modifies that offer, and then uses it to set 166 up its local config via the setLocalDescription() API.
The offer is 167 then sent off to the remote side over its preferred signaling 168 mechanism (e.g., WebSockets); upon receipt of that offer, the remote 169 party installs it using the setRemoteDescription() API. 171 To complete the offer/answer exchange, the remote party uses the 172 createAnswer() API to generate an appropriate answer, applies it 173 using the setLocalDescription() API, and sends the answer back to the 174 initiator over the signaling channel. When the initiator gets that 175 answer, it installs it using the setRemoteDescription() API, and 176 initial setup is complete. This process can be repeated for 177 additional offer/answer exchanges. 179 Regarding ICE [RFC5245], JSEP decouples the ICE state machine from 180 the overall signaling state machine, as the ICE state machine must 181 remain in the browser, because only the browser has the necessary 182 knowledge of candidates and other transport info. Performing this 183 separation also provides additional flexibility; in protocols that 184 decouple session descriptions from transport, such as Jingle, the 185 session description can be sent immediately and the transport 186 information can be sent when available. In protocols that don't, 187 such as SIP, the information can be used in the aggregated form. 188 Sending transport information separately can allow for faster ICE and 189 DTLS startup, since ICE checks can start as soon as any transport 190 information is available rather than waiting for all of it. 192 Through its abstraction of signaling, the JSEP approach does require 193 the application to be aware of the signaling process. 
While the 194 application does not need to understand the contents of session 195 descriptions to set up a call, the application must call the right 196 APIs at the right times, convert the session descriptions and ICE 197 information into the defined messages of its chosen signaling 198 protocol, and perform the reverse conversion on the messages it 199 receives from the other side. 201 One way to mitigate this is to provide a Javascript library that 202 hides this complexity from the developer; said library would 203 implement a given signaling protocol along with its state machine and 204 serialization code, presenting a higher-level, call-oriented interface 205 to the application developer. For example, libraries exist to adapt 206 the JSEP API into an API suitable for SIP or XMPP. Thus, JSEP 207 provides greater control for the experienced developer without 208 forcing any additional complexity on the novice developer. 210 1.2. Other Approaches Considered 212 One approach that was considered instead of JSEP was to include a 213 lightweight signaling protocol. Instead of providing session 214 descriptions to the API, the API would produce and consume messages 215 from this protocol. While providing a more high-level API, this put 216 more control of signaling within the browser, forcing the browser to 217 understand and handle concepts like signaling glare. In 218 addition, it prevented the application from driving the state machine 219 to a desired state, as is needed in the page reload case. 221 A second approach that was considered but not chosen was to decouple 222 the management of the media control objects from session 223 descriptions, instead offering APIs that would control each component 224 directly.
This was rejected based on a feeling that requiring 225 exposure of this level of complexity to the application programmer 226 would not be beneficial; it would result in an API where even a 227 simple example would require a significant amount of code to 228 orchestrate all the needed interactions, as well as creating a large 229 API surface that needed to be agreed upon and documented. In 230 addition, these API points could be called in any order, resulting in 231 a more complex set of interactions with the media subsystem than the 232 JSEP approach, which specifies how session descriptions are to be 233 evaluated and applied. 235 One variation on JSEP that was considered was to keep the basic 236 session description-oriented API, but to move the mechanism for 237 generating offers and answers out of the browser. Instead of 238 providing createOffer/createAnswer methods within the browser, this 239 approach would instead expose a getCapabilities API which would 240 provide the application with the information it needed in order to 241 generate its own session descriptions. This increases the amount of 242 work that the application needs to do; it needs to know how to 243 generate session descriptions from capabilities, and especially how 244 to generate the correct answer from an arbitrary offer and the 245 supported capabilities. While this could certainly be addressed by 246 using a library like the one mentioned above, it basically forces the 247 use of said library even for a simple example. Providing 248 createOffer/createAnswer avoids this problem, but still allows 249 applications to generate their own offers/answers (to a large extent) 250 if they choose, using the description generated by createOffer as an 251 indication of the browser's capabilities. 253 2. 
Terminology 255 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 256 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 257 document are to be interpreted as described in [RFC2119]. 259 3. Semantics and Syntax 261 3.1. Signaling Model 263 JSEP does not specify a particular signaling model or state machine, 264 other than the generic need to exchange session descriptions in the 265 fashion described by [RFC3264] (offer/answer) in order for both sides 266 of the session to know how to conduct the session. JSEP provides 267 mechanisms to create offers and answers, as well as to apply them to 268 a session. However, the browser is totally decoupled from the actual 269 mechanism by which these offers and answers are communicated to the 270 remote side, including addressing, retransmission, forking, and glare 271 handling. These issues are left entirely up to the application; the 272 application has complete control over which offers and answers get 273 handed to the browser, and when. 275 +-----------+ +-----------+ 276 | Web App |<--- App-Specific Signaling -->| Web App | 277 +-----------+ +-----------+ 278 ^ ^ 279 | SDP | SDP 280 V V 281 +-----------+ +-----------+ 282 | Browser |<----------- Media ------------>| Browser | 283 +-----------+ +-----------+ 285 Figure 1: JSEP Signaling Model 287 3.2. Session Descriptions and State Machine 289 In order to establish the media plane, the user agent needs specific 290 parameters to indicate what to transmit to the remote side, as well 291 as how to handle the media that is received. These parameters are 292 determined by the exchange of session descriptions in offers and 293 answers, and there are certain details to this process that must be 294 handled in the JSEP APIs. 296 Whether a session description applies to the local side or the remote 297 side affects the meaning of that description. 
For example, the list 298 of codecs sent to a remote party indicates what the local side is 299 willing to receive, which, when intersected with the set of codecs 300 the remote side supports, specifies what the remote side should send. 301 However, not all parameters follow this rule; for example, the DTLS- 302 SRTP parameters [RFC5763] sent to a remote party indicate what 303 certificate the local side will use in DTLS setup, and thereby what 304 the remote party should expect to receive; the remote party will have 305 to accept these parameters, with no option to choose different 306 values. 308 In addition, various RFCs put different conditions on the format of 309 offers versus answers. For example, an offer may propose an 310 arbitrary number of media streams (i.e. m= sections), but an answer 311 must contain the exact same number as the offer. 313 Lastly, while the exact media parameters are known only after an 314 offer and an answer have been exchanged, it is possible for the 315 offerer to receive media after they have sent an offer and before 316 they have received an answer. To properly process incoming media in 317 this case, the offerer's media handler must be aware of the details 318 of the offer before the answer arrives. 320 Therefore, in order to handle session descriptions properly, the user 321 agent needs: 323 1. To know if a session description pertains to the local or remote 324 side. 326 2. To know if a session description is an offer or an answer. 328 3. To allow the offer to be specified independently of the answer. 330 JSEP addresses this by adding both setLocalDescription and 331 setRemoteDescription methods and having session description objects 332 contain a type field indicating the type of session description being 333 supplied.
This satisfies the requirements listed above for both the 334 offerer, who first calls setLocalDescription(sdp [offer]) and then 335 later setRemoteDescription(sdp [answer]), as well as for the 336 answerer, who first calls setRemoteDescription(sdp [offer]) and then 337 later setLocalDescription(sdp [answer]). 339 JSEP also allows for an answer to be treated as provisional by the 340 application. Provisional answers provide a way for an answerer to 341 communicate initial session parameters back to the offerer, in order 342 to allow the session to begin, while allowing a final answer to be 343 specified later. This concept of a final answer is important to the 344 offer/answer model; when such an answer is received, any extra 345 resources allocated by the caller can be released, now that the exact 346 session configuration is known. These "resources" can include things 347 like extra ICE components, TURN candidates, or video decoders. 348 Provisional answers, on the other hand, do no such deallocation; 349 as a result, multiple dissimilar provisional answers can be 350 received and applied during call setup. 352 In [RFC3264], the constraint at the signaling level is that only one 353 offer can be outstanding for a given session, but at the media stack 354 level, a new offer can be generated at any point. For example, when 355 using SIP for signaling, if one offer is sent, then cancelled using a 356 SIP CANCEL, another offer can be generated even though no answer was 357 received for the first offer. To support this, the JSEP media layer 358 can provide an offer via the createOffer() method whenever the 359 Javascript application needs one for the signaling. The answerer can 360 send back zero or more provisional answers, and finally end the 361 offer-answer exchange by sending a final answer.
The state machine 362 for this is as follows: 364 setRemote(OFFER) setLocal(PRANSWER) 365 /-----\ /-----\ 366 | | | | 367 v | v | 368 +---------------+ | +---------------+ | 369 | |----/ | |----/ 370 | | setLocal(PRANSWER) | | 371 | Remote-Offer |------------------- >| Local-Pranswer| 372 | | | | 373 | | | | 374 +---------------+ +---------------+ 375 ^ | | 376 | | setLocal(ANSWER) | 377 setRemote(OFFER) | | 378 | V setLocal(ANSWER) | 379 +---------------+ | 380 | | | 381 | |<---------------------------+ 382 | Stable | 383 | |<---------------------------+ 384 | | | 385 +---------------+ setRemote(ANSWER) | 386 ^ | | 387 | | setLocal(OFFER) | 388 setRemote(ANSWER) | | 389 | V | 390 +---------------+ +---------------+ 391 | | | | 392 | | setRemote(PRANSWER) | | 393 | Local-Offer |------------------- >|Remote-Pranswer| 394 | | | | 395 | |----\ | |----\ 396 +---------------+ | +---------------+ | 397 ^ | ^ | 398 | | | | 399 \-----/ \-----/ 400 setLocal(OFFER) setRemote(PRANSWER) 402 Figure 2: JSEP State Machine 404 Aside from these state transitions there is no other difference 405 between the handling of provisional ("pranswer") and final ("answer") 406 answers. 408 3.3. Session Description Format 410 In the WebRTC specification, session descriptions are formatted as 411 SDP messages. While this format is not optimal for manipulation from 412 Javascript, it is widely accepted, and frequently updated with new 413 features. Any alternate encoding of session descriptions would have 414 to keep pace with the changes to SDP, at least until the time that 415 this new encoding eclipsed SDP in popularity. As a result, JSEP 416 currently uses SDP as the internal representation for its session 417 descriptions. 419 However, to simplify Javascript processing, and provide for future 420 flexibility, the SDP syntax is encapsulated within a 421 SessionDescription object, which can be constructed from SDP, and be 422 serialized out to SDP. 
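As an illustration only, the transitions in Figure 2 can be written as a lookup table keyed by the current state and by the combination of setter (local or remote) and description type. The names TRANSITIONS and nextState are hypothetical and not part of any API, and rollback is deliberately left out of this sketch.

```javascript
// Hypothetical sketch of the Figure 2 state machine; not part of any API.
// Keys are "<setter>:<type>", e.g. "local:offer" stands for setLocal(OFFER).
const TRANSITIONS = {
  "Stable": {
    "local:offer": "Local-Offer",
    "remote:offer": "Remote-Offer",
  },
  "Local-Offer": {
    "local:offer": "Local-Offer",
    "remote:pranswer": "Remote-Pranswer",
    "remote:answer": "Stable",
  },
  "Remote-Pranswer": {
    "remote:pranswer": "Remote-Pranswer",
    "remote:answer": "Stable",
  },
  "Remote-Offer": {
    "remote:offer": "Remote-Offer",
    "local:pranswer": "Local-Pranswer",
    "local:answer": "Stable",
  },
  "Local-Pranswer": {
    "local:pranswer": "Local-Pranswer",
    "local:answer": "Stable",
  },
};

// Returns the next state, or throws if Figure 2 has no such transition.
function nextState(state, setter, type) {
  const next = (TRANSITIONS[state] || {})[`${setter}:${type}`];
  if (next === undefined) {
    throw new Error(`invalid transition: ${setter}(${type}) in ${state}`);
  }
  return next;
}

// Offerer: setLocal(offer), then setRemote(answer).
let offerer = nextState("Stable", "local", "offer");
offerer = nextState(offerer, "remote", "answer");

// Answerer: setRemote(offer), one provisional answer, then setLocal(answer).
let answerer = nextState("Stable", "remote", "offer");
answerer = nextState(answerer, "local", "pranswer");
answerer = nextState(answerer, "local", "answer");
```

Both call sequences end back in the Stable state, which matches the offerer and answerer orderings described before the figure.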
If future specifications agree on a JSON 423 format for session descriptions, we could easily enable this object 424 to generate and consume that JSON. 426 Other methods may be added to SessionDescription in the future to 427 simplify handling of SessionDescriptions from Javascript. In the 428 meantime, Javascript libraries can be used to perform these 429 manipulations. 431 Note that most applications should be able to treat the 432 SessionDescriptions produced and consumed by these various API calls 433 as opaque blobs; that is, the application will not need to read or 434 change them. The W3C WebRTC API specification will provide 435 appropriate APIs to allow the application to control various session 436 parameters, which will provide the necessary information to the 437 browser about what sort of SessionDescription to produce. 439 3.4. ICE 441 3.4.1. ICE Gathering Overview 443 JSEP gathers ICE candidates as needed by the application. Collection 444 of ICE candidates is referred to as a gathering phase, and this is 445 triggered either by the addition of a new or recycled m= line to the 446 local session description, or new ICE credentials in the description, 447 indicating an ICE restart. Use of new ICE credentials can be 448 triggered explicitly by the application, or implicitly by the browser 449 in response to changes in the ICE configuration. 451 When a new gathering phase starts, the ICE Agent will notify the 452 application that gathering is occurring through an event. Then, when 453 each new ICE candidate becomes available, the ICE Agent will supply 454 it to the application via an additional event; these candidates will 455 also automatically be added to the local session description. 457 Finally, when all candidates have been gathered, an event will be 458 dispatched to signal that the gathering process is complete. 
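The three notifications described above (gathering started, candidate available, gathering complete) might be handled along the following lines. This is a sketch under assumptions: the event names "icegatheringstatechange" and "icecandidate", and the convention that a null candidate marks the end of gathering, are taken from the W3C RTCPeerConnection API rather than from this document, and watchGathering and its handler names are hypothetical.

```javascript
// Sketch only: routes the gathering notifications of a peer-connection-like
// object to three application callbacks. Event names are assumptions taken
// from the W3C RTCPeerConnection API; this document does not name them.
function watchGathering(pc, handlers) {
  pc.addEventListener("icegatheringstatechange", () => {
    // A new gathering phase has started.
    if (pc.iceGatheringState === "gathering") {
      handlers.onStart();
    }
  });
  pc.addEventListener("icecandidate", (event) => {
    if (event.candidate) {
      // A new local candidate became available.
      handlers.onCandidate(event.candidate);
    } else {
      // A null candidate signals that gathering is complete.
      handlers.onComplete();
    }
  });
}
```

An application would pass its own peer connection object and callbacks, for example forwarding each candidate to the remote side over its signaling channel in onCandidate.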
460 Note that gathering phases only gather the candidates needed by 461 new/recycled/restarting m= lines; other m= lines continue to use 462 their existing candidates. 464 3.4.2. ICE Candidate Trickling 466 Candidate trickling is a technique through which a caller may 467 incrementally provide candidates to the callee after the initial 468 offer has been dispatched; the semantics of "Trickle ICE" are defined 469 in [I-D.ietf-mmusic-trickle-ice]. This process allows the callee to 470 begin acting upon the call and setting up the ICE (and perhaps DTLS) 471 connections immediately, without having to wait for the caller to 472 gather all possible candidates. This results in faster media setup 473 in cases where gathering is not performed prior to initiating the 474 call. 476 JSEP supports optional candidate trickling by providing APIs, as 477 described above, that provide control and feedback on the ICE 478 candidate gathering process. Applications that support candidate 479 trickling can send the initial offer immediately and send individual 480 candidates when they are notified of a new candidate; 481 applications that do not support this feature can simply wait for the 482 indication that gathering is complete, and then create and send their 483 offer, with all the candidates, at this time. 485 Upon receipt of trickled candidates, the receiving application will 486 supply them to its ICE Agent. This triggers the ICE Agent to start 487 using the new remote candidates for connectivity checks. 489 3.4.2.1. ICE Candidate Format 491 As with session descriptions, the syntax of the IceCandidate object 492 provides some abstraction, but can be easily converted to and from 493 the SDP candidate lines. 495 The candidate lines are the only SDP information that is contained 496 within IceCandidate, as they represent the only information needed 497 that is not present in the initial offer (i.e., for trickle 498 candidates).
This information is carried with the same syntax as the 499 "candidate-attribute" field defined for ICE. For example: 501 candidate:1 1 UDP 1694498815 192.0.2.33 10000 typ host 503 The IceCandidate object also contains fields to indicate which m= 504 line it should be associated with. The m= line can be identified in 505 one of two ways; either by a m= line index, or a MID. The m= line 506 index is a zero-based index, with index N referring to the N+1th m= 507 line in the SDP sent by the entity which sent the IceCandidate. The 508 MID uses the "media stream identification" attribute, as defined in 509 [RFC5888], Section 4, to identify the m= line. JSEP implementations 510 creating an ICE Candidate object MUST populate both of these fields. 511 Implementations receiving an ICE Candidate object MUST use the MID if 512 present, or the m= line index, if not (as it could have come from a 513 non-JSEP endpoint). 515 3.4.3. ICE Candidate Policy 517 Typically, when gathering ICE candidates, the browser will gather all 518 possible forms of initial candidates - host, server reflexive, and 519 relay. However, in certain cases, applications may want to have more 520 specific control over the gathering process, due to privacy or 521 related concerns. For example, one may want to suppress the use of 522 host candidates, to avoid exposing information about the local 523 network, or go as far as only using relay candidates, to leak as 524 little location information as possible (note that these choices come 525 with corresponding operational costs). To accomplish this, the 526 browser MUST allow the application to restrict which ICE candidates 527 are used in a session. In addition, administrators may also wish to 528 control the set of ICE candidates, and so the browser SHOULD also 529 allow control via local policy, with the most restrictive policy 530 prevailing. 
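To make the "most restrictive policy prevails" rule concrete, the sketch below combines an application policy with a local policy and filters candidate lines by their "typ" value. The policy names ("all", "no-host", "relay"), the helper functions, and the second and third sample candidate lines are illustrative assumptions; only the first candidate line is taken from the example above.

```javascript
// Hypothetical policy names, ordered from least to most restrictive.
const POLICY_ORDER = ["all", "no-host", "relay"];

// Candidate types ("typ" values) that each hypothetical policy permits.
const ALLOWED_TYPES = {
  "all": ["host", "srflx", "relay"],
  "no-host": ["srflx", "relay"],
  "relay": ["relay"],
};

// Position of a policy in the ordering; rejects unknown names.
function policyRank(policy) {
  const rank = POLICY_ORDER.indexOf(policy);
  if (rank < 0) throw new Error(`unknown policy: ${policy}`);
  return rank;
}

// The most restrictive of the application and local policies prevails.
function effectivePolicy(appPolicy, localPolicy) {
  return POLICY_ORDER[Math.max(policyRank(appPolicy), policyRank(localPolicy))];
}

// Extracts the candidate type from a candidate-attribute line.
function candidateType(line) {
  const match = /\styp\s+(\S+)/.exec(line);
  return match ? match[1] : null;
}

// Keeps only the candidate lines permitted by the effective policy.
function filterCandidates(lines, appPolicy, localPolicy) {
  const allowed = ALLOWED_TYPES[effectivePolicy(appPolicy, localPolicy)];
  return lines.filter((line) => allowed.includes(candidateType(line)));
}

// Sample lines; the srflx and relay entries are made up for illustration.
const sampleCandidates = [
  "candidate:1 1 UDP 1694498815 192.0.2.33 10000 typ host",
  "candidate:2 1 UDP 1694498559 198.51.100.7 20000 typ srflx raddr 192.0.2.33 rport 10000",
  "candidate:3 1 UDP 16777215 203.0.113.5 30000 typ relay raddr 198.51.100.7 rport 20000",
];
const relayOnly = filterCandidates(sampleCandidates, "all", "relay");
```

Here the application asks for all candidates but the local policy allows only relay candidates, so relayOnly contains just the relay line.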
532 There may also be cases where the application wants to change which 533 types of candidates are used while the session is active. A prime 534 example is where a callee may initially want to use only relay 535 candidates, to avoid leaking location information to an arbitrary 536 caller, but then change to use all candidates (for lower operational 537 cost) once the user has indicated they want to take the call. For 538 this scenario, the browser MUST allow the candidate policy to be 539 changed in mid-session, subject to the aforementioned interactions 540 with local policy. 542 To administer the ICE candidate policy, the browser will determine 543 the current setting at the start of each gathering phase. Then, 544 during the gathering phase, the browser MUST NOT expose candidates 545 disallowed by the current policy to the application, use them as the 546 source of connectivity checks, or indirectly expose them via other 547 fields, such as the raddr/rport attributes for other ICE candidates. 548 Later, if a different policy is specified by the application, the 549 application can apply it by kicking off a new gathering phase via an 550 ICE restart. 552 3.4.4. ICE Candidate Pool 554 JSEP applications typically inform the browser to begin ICE gathering 555 via the information supplied to setLocalDescription, as this is where 556 the app specifies the number of media streams, and thereby ICE 557 components, for which to gather candidates. However, to accelerate 558 cases where the application knows the number of ICE components to use 559 ahead of time, it may ask the browser to gather a pool of potential 560 ICE candidates to help ensure rapid media setup. 562 When setLocalDescription is eventually called, and the browser goes 563 to gather the needed ICE candidates, it SHOULD start by checking if 564 any candidates are available in the pool. 
If there are candidates in 565 the pool, they SHOULD be handed to the application immediately via 566 the ICE candidate event. If the pool becomes depleted, either 567 because a larger-than-expected number of ICE components is used, or 568 because the pool has not had enough time to gather candidates, the 569 remaining candidates are gathered as usual. 571 One example of where this concept is useful is an application that 572 expects an incoming call at some point in the future, and wants to 573 minimize the time it takes to establish connectivity, to avoid 574 clipping of initial media. By pre-gathering candidates into the 575 pool, it can exchange and start sending connectivity checks from 576 these candidates almost immediately upon receipt of a call. Note 577 though that by holding on to these pre-gathered candidates, which 578 will be kept alive as long as they may be needed, the application 579 will consume resources on the STUN/TURN servers it is using. 581 3.5. Video Size Negotiation 583 Video size negotiation is the process through which a receiver can 584 use the "a=imageattr" SDP attribute [RFC6236] to indicate what video 585 frame sizes it is capable of receiving. A receiver may have hard 586 limits on what its video decoder can process, or it may wish to 587 constrain what it receives due to application preferences, e.g. a 588 specific size for the window in which the video will be displayed. 590 3.5.1. Creating an imageattr Attribute 592 In order to determine the limits on what video resolution a receiver 593 wants to receive, it will intersect its decoder hard limits with any 594 mandatory constraints that have been applied to the associated 595 MediaStreamTrack. If the decoder limits are unknown, e.g. when using 596 a software decoder, the mandatory constraints are used directly. 
For 597 the answerer, these mandatory constraints can be applied to the 598 remote MediaStreamTracks that are created by a setRemoteDescription 599 call, and will affect the output of the ensuing createAnswer call. 601 Any constraints set after setLocalDescription is used to set the 602 answer will result in a new offer-answer exchange. For the offerer, 603 because it does not know about any remote MediaStreamTracks until it 604 receives the answer, the offer can only reflect decoder hard limits. 605 If the offerer wishes to set mandatory constraints on video 606 resolution, it must do so after receiving the answer, and the result 607 will be a new offer-answer to communicate them. 609 If there are no known decoder limits or mandatory constraints, the 610 "a=imageattr" attribute SHOULD be omitted. 612 Otherwise, an "a=imageattr" attribute is created with "recv" 613 direction, and the resulting resolution space formed by intersecting 614 the decoder limits and constraints is used to specify its minimum and 615 maximum x= and y= values. If the intersection is the null set, i.e., 616 there are no resolutions that are permitted by both the decoder and 617 the mandatory constraints, this SHOULD be represented by x=0 and y=0 618 values. 620 The rules here express a single set of preferences, and therefore, 621 the "a=imageattr" q= value is not important. It SHOULD be set to 622 1.0. 624 The "a=imageattr" field is payload type specific. When all video 625 codecs supported have the same capabilities, use of a single 626 attribute, with the wildcard payload type (*), is RECOMMENDED. 627 However, when the supported video codecs have differing capabilities, 628 specific "a=imageattr" attributes MUST be inserted for each payload 629 type. 631 As an example, consider a system with a HD-capable, multiformat video 632 decoder, where the application has constrained the received track to 633 at most 360p. 
In this case, the implementation would generate this 634 attribute: 636 a=imageattr:* recv [x=[16:640],y=[16:360],q=1.0] 638 3.5.2. Interpreting an imageattr Attribute 640 [RFC6236] defines "a=imageattr" to be an advisory field. This means 641 that it does not absolutely constrain the video formats that the 642 sender can use, but gives an indication of the preferred values. 644 This specification prescribes more specific behavior. When a sender 645 of a given MediaStreamTrack, which is producing video of a certain 646 resolution, receives an "a=imageattr recv" attribute, it MUST first 647 check to see if the original resolution meets the criteria specified 648 in the attribute, and transmit it untouched if so. If the original 649 resolution is too large for the attribute criteria, the sender SHOULD 650 apply downscaling to the output of the MediaStreamTrack in order to 651 satisfy the criteria. 653 If the receiver requires a minimum resolution which is greater than 654 the native resolution of the video, upscaling is needed, but this may 655 not be appropriate in all cases. To address this concern, the 656 application can set an upscaling policy for each sent track. For 657 this case, if upscaling is permitted by policy, the sender SHOULD 658 apply upscaling in order to provide the desired resolution. 659 Otherwise, the sender MUST NOT apply upscaling. The sender SHOULD 660 NOT upscale in other cases, even if the policy permits it. 662 If there is no appropriate and permitted scaling mechanism that 663 allows the received criteria to be satisfied, the sender MUST NOT 664 transmit the track. 666 In the special case of receiving a maximum resolution of [0, 0], as 667 described above, the sender MUST NOT transmit the track. 669 3.6. Interactions With Forking 671 Some call signaling systems allow various types of forking where an 672 SDP Offer may be provided to more than one device.
For example, SIP 673 [RFC3261] defines both a "Parallel Search" and "Sequential Search". 674 Although these are primarily signaling level issues that are outside 675 the scope of JSEP, they do have some impact on the configuration of 676 the media plane that is relevant. When forking happens at the 677 signaling layer, the Javascript application responsible for the 678 signaling needs to make the decisions about what media should be sent 679 or received at any point of time, as well as which remote endpoint it 680 should communicate with; JSEP is used to make sure the media engine 681 can make the RTP and media perform as required by the application. 682 The basic operations that the applications can have the media engine 683 do are: 685 o Start exchanging media with a given remote peer, but keep all the 686 resources reserved in the offer. 688 o Start exchanging media with a given remote peer, and free any 689 resources in the offer that are not being used. 691 3.6.1. Sequential Forking 693 Sequential forking involves a call being dispatched to multiple 694 remote callees, where each callee can accept the call, but only one 695 active session ever exists at a time; no mixing of received media is 696 performed. 698 JSEP handles sequential forking well, allowing the application to 699 easily control the policy for selecting the desired remote endpoint. 700 When an answer arrives from one of the callees, the application can 701 choose to apply it either as a provisional answer, leaving open the 702 possibility of using a different answer in the future, or apply it as 703 a final answer, ending the setup flow. 705 In a "first-one-wins" situation, the first answer will be applied as 706 a final answer, and the application will reject any subsequent 707 answers. In SIP parlance, this would be ACK + BYE. 709 In a "last-one-wins" situation, all answers would be applied as 710 provisional answers, and any previous call leg will be terminated. 
711 At some point, the application will end the setup process, perhaps 712 with a timer; at this point, the application could reapply the 713 existing remote description as a final answer. 715 3.6.2. Parallel Forking 717 Parallel forking involves a call being dispatched to multiple remote 718 callees, where each callee can accept the call, and multiple 719 simultaneous active signaling sessions can be established as a 720 result. If multiple callees send media at the same time, the 721 possibilities for handling this are described in Section 3.1 of 722 [RFC3960]. Most SIP devices today only support exchanging media with 723 a single device at a time, and do not try to mix multiple early media 724 audio sources, as that could result in a confusing situation. For 725 example, consider having a European ringback tone mixed together with 726 the North American ringback tone - the resulting sound would not be 727 like either tone, and would confuse the user. If the signaling 728 application wishes to only exchange media with one of the remote 729 endpoints at a time, then from a media engine point of view, this is 730 exactly like the sequential forking case. 732 In the parallel forking case where the Javascript application wishes 733 to simultaneously exchange media with multiple peers, the flow is 734 slightly more complex, but the Javascript application can follow the 735 strategy that [RFC3960] describes using UPDATE. The UPDATE approach 736 allows the signaling to set up a separate media flow for each peer 737 that it wishes to exchange media with. In JSEP, this offer used in 738 the UPDATE would be formed by simply creating a new PeerConnection 739 and making sure that the same local media streams have been added 740 into this new PeerConnection. Then the new PeerConnection object 741 would produce a SDP offer that could be used by the signaling to 742 perform the UPDATE strategy discussed in [RFC3960]. 
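The per-peer strategy described above can be sketched as follows, under stated assumptions. ConnectionLike is a hypothetical, synchronous stand-in used only for illustration; it is not the W3C PeerConnection interface, whose offer generation may be asynchronous. The class name ParallelForkSessions and its methods are likewise invented for this sketch.

```typescript
// Hypothetical minimal connection abstraction (NOT the W3C API).
interface ConnectionLike {
  addStream(streamId: string): void;
  createOffer(): string;
}

// Keeps one connection ("leg") per remote peer, each fed the same local streams.
class ParallelForkSessions<C extends ConnectionLike> {
  private readonly legs = new Map<string, C>();
  private readonly makeConnection: () => C;
  private readonly localStreamIds: string[];

  constructor(makeConnection: () => C, localStreamIds: string[]) {
    this.makeConnection = makeConnection;
    this.localStreamIds = localStreamIds.slice();
  }

  // Create a new connection for a forked peer, add the shared local streams,
  // and return the offer that the signaling layer would carry (e.g. in UPDATE).
  offerForPeer(peerId: string): string {
    if (this.legs.has(peerId)) {
      throw new Error("leg for " + peerId + " already exists");
    }
    const connection = this.makeConnection();
    this.localStreamIds.forEach((id) => connection.addStream(id));
    this.legs.set(peerId, connection);
    return connection.createOffer();
  }

  // Keep a single session; return the dropped connections so that the
  // caller can terminate them.
  keepOnly(peerId: string): C[] {
    const droppedIds: string[] = [];
    const dropped: C[] = [];
    this.legs.forEach((connection, id) => {
      if (id !== peerId) {
        droppedIds.push(id);
        dropped.push(connection);
      }
    });
    droppedIds.forEach((id) => this.legs.delete(id));
    return dropped;
  }

  get size(): number {
    return this.legs.size;
  }
}
```

The factory argument keeps the sketch independent of any particular connection implementation; an application would supply whatever constructs its real PeerConnection objects.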
744 As a result of sharing the media streams, the application will end up 745 with N parallel PeerConnection sessions, each with a local and remote 746 description and their own local and remote addresses. The media flow 747 from these sessions can be managed by specifying SDP direction 748 attributes in the descriptions, or the application can choose to play 749 out the media from all sessions mixed together. Of course, if the 750 application wants to only keep a single session, it can simply 751 terminate the sessions that it no longer needs. 753 4. Interface 755 This section details the basic operations that must be present to 756 implement JSEP functionality. The actual API exposed in the W3C API 757 may have somewhat different syntax, but should map easily to these 758 concepts. 760 4.1. Methods 762 4.1.1. Constructor 764 The PeerConnection constructor allows the application to specify 765 global parameters for the media session, such as the STUN/TURN 766 servers and credentials to use when gathering candidates, as well as 767 the initial ICE candidate policy and pool size, and also the BUNDLE 768 policy to use. 770 If an ICE candidate policy is specified, it functions as described in 771 Section 3.4.3, causing the browser to only surface the permitted 772 candidates to the application, and only use those candidates for 773 connectivity checks. The set of available policies is as follows: 775 all: All candidates will be gathered and used. 777 public: Candidates with private IP addresses [RFC1918] will be 778 filtered out. This prevents exposure of internal network details, 779 at the cost of requiring relay usage even for intranet calls, if 780 the NAT does not allow hairpinning as described in [RFC4787], 781 section 6. 783 relay: All candidates except relay candidates will be filtered out. 784 This obfuscates the location information that might be ascertained 785 by the remote peer from the received candidates. 
Depending on how 786 the application deploys its relay servers, this could obfuscate 787 location to a metro or possibly even global level. 789 Although it can be overridden by local policy, the default ICE 790 candidate policy MUST be set to allow all candidates, as this 791 minimizes use of application STUN/TURN server resources. 793 If a size is specified for the ICE candidate pool, this indicates the 794 number of ICE components to pre-gather candidates for. Because pre- 795 gathering results in utilizing STUN/TURN server resources for 796 potentially long periods of time, this must only occur upon 797 application request, and therefore the default candidate pool size 798 MUST be zero. 800 The application can specify its preferred policy regarding use of 801 BUNDLE, the multiplexing mechanism defined in 802 [I-D.ietf-mmusic-sdp-bundle-negotiation]. Regardless of policy, the 803 application will always try to negotiate BUNDLE onto a single 804 transport, and will offer a single BUNDLE group across all media 805 sections; use of this single transport is contingent upon the answerer 806 accepting BUNDLE. However, by specifying a policy from the list 807 below, the application can control exactly how aggressively it will 808 try to BUNDLE media streams together, which affects how it will 809 interoperate with a non-BUNDLE-aware endpoint. When negotiating with 810 a non-BUNDLE-aware endpoint, only the streams not marked as bundle- 811 only streams will be established. The set of available policies is 812 as follows: 814 balanced: The first media section of each type (audio, video, or 815 application) will contain transport parameters, which will allow 816 an answerer to unbundle that section. The second and any 817 subsequent media section of each type will be marked bundle-only. 818 The result is that if there are N distinct media types, then 819 candidates will be gathered for N media streams.
This policy 820 balances desire to multiplex with the need to ensure basic audio 821 and video can still be negotiated in legacy cases. 823 max-compat: All media sections will contain transport parameters; 824 none will be marked as bundle-only. This policy will allow all 825 streams to be received by non-BUNDLE-aware endpoints, but require 826 separate candidates to be gathered for each media stream. 828 max-bundle: Only the first media section will contain transport 829 parameters; all streams other than the first will be marked as 830 bundle-only. This policy aims to minimize candidate gathering and 831 maximize multiplexing, at the cost of less compatibility with 832 legacy endpoints. 834 As it provides the best tradeoff between performance and 835 compatibility with legacy endpoints, the default BUNDLE policy MUST 836 be set to "balanced". 838 The application can specify its preferred policy regarding use of 839 RTP/RTCP multiplexing [RFC5761] using one of the following policies: 841 negotiate: The browser will gather both RTP and RTCP candidates but 842 also will offer "a=rtcp-mux", thus allowing for compatibility with 843 either multiplexing or non-multiplexing endpoints. 845 require: The browser will only gather RTP candidates. This halves 846 the number of candidates that the offerer needs to gather. When 847 acting as answerer, the browser will reject any m= section that 848 does not provide an "a=rtcp-mux" attribute. 850 4.1.2. createOffer 852 The createOffer method generates a blob of SDP that contains a 853 [RFC3264] offer with the supported configurations for the session, 854 including descriptions of the local MediaStreams attached to this 855 PeerConnection, the codec/RTP/RTCP options supported by this 856 implementation, and any candidates that have been gathered by the ICE 857 Agent. An options parameter may be supplied to provide additional 858 control over the generated offer. 
This options parameter should 859 allow for the following manipulations to be performed: 861 o To indicate support for a media type even if no MediaStreamTracks 862 of that type have been added to the session (e.g., an audio call 863 that wants to receive video.) 865 o To trigger an ICE restart, for the purpose of reestablishing 866 connectivity. 868 In the initial offer, the generated SDP will contain all desired 869 functionality for the session (functionality that is supported but 870 not desired by default may be omitted); for each SDP line, the 871 generation of the SDP will follow the process defined for generating 872 an initial offer from the document that specifies the given SDP line. 873 The exact handling of initial offer generation is detailed in 874 Section 5.2.1 below. 876 In the event createOffer is called after the session is established, 877 createOffer will generate an offer to modify the current session 878 based on any changes that have been made to the session, e.g. adding 879 or removing MediaStreams, or requesting an ICE restart. For each 880 existing stream, the generation of each SDP line must follow the 881 process defined for generating an updated offer from the RFC that 882 specifies the given SDP line. For each new stream, the generation of 883 the SDP must follow the process of generating an initial offer, as 884 mentioned above. If no changes have been made, or for SDP lines that 885 are unaffected by the requested changes, the offer will only contain 886 the parameters negotiated by the last offer-answer exchange. The 887 exact handling of subsequent offer generation is detailed in 888 Section 5.2.2. below. 890 Session descriptions generated by createOffer must be immediately 891 usable by setLocalDescription; if a system has limited resources 892 (e.g. 
a finite number of decoders), createOffer should return an 893 offer that reflects the current state of the system, so that 894 setLocalDescription will succeed when it attempts to acquire those 895 resources. Because this method may need to inspect the system state 896 to determine the currently available resources, it may be implemented 897 as an async operation. 899 Calling this method may do things such as generate new ICE 900 credentials, but does not result in candidate gathering, or cause 901 media to start or stop flowing. 903 4.1.3. createAnswer 905 The createAnswer method generates a blob of SDP that contains a 906 [RFC3264] SDP answer with the supported configuration for the session 907 that is compatible with the parameters supplied in the most recent 908 call to setRemoteDescription, which MUST have been called prior to 909 calling createAnswer. Like createOffer, the returned blob contains 910 descriptions of the local MediaStreams attached to this 911 PeerConnection, the codec/RTP/RTCP options negotiated for this 912 session, and any candidates that have been gathered by the ICE Agent. 913 An options parameter may be supplied to provide additional control 914 over the generated answer. 916 As an answer, the generated SDP will contain a specific configuration 917 that specifies how the media plane should be established; for each 918 SDP line, the generation of the SDP must follow the process defined 919 for generating an answer from the document that specifies the given 920 SDP line. The exact handling of answer generation is detailed in 921 Section 5.3. below. 923 Session descriptions generated by createAnswer must be immediately 924 usable by setLocalDescription; like createOffer, the returned 925 description should reflect the current state of the system. Because 926 this method may need to inspect the system state to determine the 927 currently available resources, it may need to be implemented as an 928 async operation. 
930 Calling this method may do things such as generate new ICE 931 credentials, but does not trigger candidate gathering or change media 932 state. 934 4.1.4. SessionDescriptionType 936 Session description objects (RTCSessionDescription) may be of type 937 "offer", "pranswer", "answer" or "rollback". These types provide 938 information as to how the description parameter should be parsed, and 939 how the media state should be changed. 941 "offer" indicates that a description should be parsed as an offer; 942 said description may include many possible media configurations. A 943 description used as an "offer" may be applied anytime the 944 PeerConnection is in a stable state, or as an update to a previously 945 supplied but unanswered "offer". 947 "pranswer" indicates that a description should be parsed as an 948 answer, but not a final answer, and so should not result in the 949 freeing of allocated resources. It may result in the start of media 950 transmission, if the answer does not specify an inactive media 951 direction. A description used as a "pranswer" may be applied as a 952 response to an "offer", or an update to a previously sent "pranswer". 954 "answer" indicates that a description should be parsed as an answer, 955 the offer-answer exchange should be considered complete, and any 956 resources (decoders, candidates) that are no longer needed can be 957 released. A description used as an "answer" may be applied as a 958 response to an "offer", or an update to a previously sent "pranswer". 960 The only difference between a provisional and final answer is that 961 the final answer results in the freeing of any unused resources that 962 were allocated as a result of the offer. As such, the application 963 can use some discretion on whether an answer should be applied as 964 provisional or final, and can change the type of the session 965 description as needed. 
For example, in a serial forking scenario, an 966 application may receive multiple "final" answers, one from each 967 remote endpoint. The application could choose to accept the initial 968 answers as provisional answers, and only apply an answer as final 969 when it receives one that meets its criteria (e.g. a live user 970 instead of voicemail). 972 "rollback" is a special session description type implying that the 973 state machine should be rolled back to the previous state, as 974 described in Section 4.1.4.2. The contents MUST be empty. 976 4.1.4.1. Use of Provisional Answers 978 Most web applications will not need to create answers using the 979 "pranswer" type. While it is good practice to send an immediate 980 response to an "offer", in order to warm up the session transport and 981 prevent media clipping, the preferred handling for a web application 982 would be to create and send an "inactive" final answer immediately 983 after receiving the offer. Later, when the called user actually 984 accepts the call, the application can create a new "sendrecv" offer 985 to update the previous offer/answer pair and start the media flow. 986 While this could also be done with an inactive "pranswer", followed 987 by a sendrecv "answer", the initial "pranswer" leaves the offer- 988 answer exchange open, which means that neither side can send an 989 updated offer during this time. 991 As an example, consider a typical web application that will set up a 992 data channel, an audio channel, and a video channel. When an 993 endpoint receives an offer with these channels, it could send an 994 answer accepting the data channel for two-way data, and accepting the 995 audio and video tracks as inactive or receive-only. It could then 996 ask the user to accept the call, acquire the local media streams, and 997 send a new offer to the remote side moving the audio and video to be 998 two-way media. 
By the time the human has accepted the call and 999 triggered the new offer, it is likely that the ICE and DTLS 1000 handshaking for all the channels will already have finished. 1002 Of course, some applications may not be able to perform this double 1003 offer-answer exchange, particularly ones that are attempting to 1004 gateway to legacy signaling protocols. In these cases, "pranswer" 1005 can still provide the application with a mechanism to warm up the 1006 transport. 1008 4.1.4.2. Rollback 1010 In certain situations it may be desirable to "undo" a change made to 1011 setLocalDescription or setRemoteDescription. Consider a case where a 1012 call is ongoing, and one side wants to change some of the session 1013 parameters; that side generates an updated offer and then calls 1014 setLocalDescription. However, the remote side, either before or 1015 after setRemoteDescription, decides it does not want to accept the 1016 new parameters, and sends a reject message back to the offerer. Now, 1017 the offerer, and possibly the answerer as well, need to return to a 1018 stable state and the previous local/remote description. To support 1019 this, we introduce the concept of "rollback". 1021 A rollback discards any proposed changes to the session, returning 1022 the state machine to the stable state, and setting the modified local 1023 and/or remote description back to their previous values. Any 1024 resources or candidates that were allocated by the abandoned local 1025 description are discarded; any media that is received will be 1026 processed according to the previous local and remote descriptions. 1027 Rollback can only be used to cancel proposed changes; there is no 1028 support for rolling back from a stable state to a previous stable 1029 state. Note that this implies that once the answerer has performed 1030 setLocalDescription with his answer, this cannot be rolled back. 
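The description types of Section 4.1.4 and the rollback behavior described above can be pictured as a small state machine. The following is a minimal sketch under stated assumptions: the state names other than "stable" are those used by the W3C API for its signaling state, the nextState helper is hypothetical, and rollback is modeled only from the two offer states (rollback out of the provisional-answer states is left out as a simplification, and is reported as an error here). The sketch also encodes the rule from this section that a rollback goes through the same method that applied the offer.

```typescript
type SdpType = "offer" | "pranswer" | "answer" | "rollback";
type Side = "local" | "remote"; // setLocalDescription vs. setRemoteDescription
type SignalingState =
  | "stable"
  | "have-local-offer"
  | "have-remote-offer"
  | "have-local-pranswer"
  | "have-remote-pranswer";

// Allowed transitions, keyed by "<side>:<type>". Anything absent is an error.
const TRANSITIONS: Record<SignalingState, Partial<Record<string, SignalingState>>> = {
  "stable": {
    "local:offer": "have-local-offer",
    "remote:offer": "have-remote-offer",
  },
  "have-local-offer": {
    "local:offer": "have-local-offer", // update to an unanswered offer
    "remote:pranswer": "have-remote-pranswer",
    "remote:answer": "stable",
    "local:rollback": "stable",
  },
  "have-remote-offer": {
    "remote:offer": "have-remote-offer",
    "local:pranswer": "have-local-pranswer",
    "local:answer": "stable",
    "remote:rollback": "stable",
  },
  "have-local-pranswer": {
    "local:pranswer": "have-local-pranswer",
    "local:answer": "stable",
  },
  "have-remote-pranswer": {
    "remote:pranswer": "have-remote-pranswer",
    "remote:answer": "stable",
  },
};

// Return the state reached by applying a description, or throw if the
// combination is not permitted in the current state.
function nextState(state: SignalingState, side: Side, type: SdpType): SignalingState {
  const next = TRANSITIONS[state][side + ":" + type];
  if (next === undefined) {
    throw new Error("cannot apply " + side + " " + type + " in state " + state);
  }
  return next;
}
```

Because "stable" has no rollback entry, the table reflects that there is no rolling back from one stable state to a previous one.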
1032 A rollback is performed by supplying a session description of type 1033 "rollback" with empty contents to either setLocalDescription or 1034 setRemoteDescription, depending on which was most recently used (i.e. 1035 if the new offer was supplied to setLocalDescription, the rollback 1036 should be done using setLocalDescription as well). 1038 4.1.5. setLocalDescription 1040 The setLocalDescription method instructs the PeerConnection to apply 1041 the supplied session description as its local configuration. The 1042 type field indicates whether the description should be processed as 1043 an offer, provisional answer, or final answer; offers and answers are 1044 checked differently, using the various rules that exist for each SDP 1045 line. 1047 This API changes the local media state; among other things, it sets 1048 up local resources for receiving and decoding media. In order to 1049 successfully handle scenarios where the application wants to offer to 1050 change from one media format to a different, incompatible format, the 1051 PeerConnection must be able to simultaneously support use of both the 1052 old and new local descriptions (e.g. support codecs that exist in 1053 both descriptions) until a final answer is received, at which point 1054 the PeerConnection can fully adopt the new local description, or roll 1055 back to the old description if the remote side denied the change. 1057 This API indirectly controls the candidate gathering process. When a 1058 local description is supplied, and the number of transports currently 1059 in use does not match the number of transports needed by the local 1060 description, the PeerConnection will create transports as needed and 1061 begin gathering candidates for them. 
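To make the link between the local description and gathering concrete, the following sketch estimates how many ICE components an initial offer would need candidates for, given the BUNDLE and RTP/RTCP multiplexing policies of Section 4.1.1. It is an illustration under stated assumptions, not a normative procedure: the function names are hypothetical, and the sketch assumes that an "application" (data) section needs a single component regardless of the RTCP multiplexing policy, since it carries no RTCP.

```typescript
type BundlePolicy = "balanced" | "max-compat" | "max-bundle";
type RtcpMuxPolicy = "negotiate" | "require";
type MediaKind = "audio" | "video" | "application";

// For each media section, report whether it would be marked bundle-only.
function bundleOnlyFlags(sections: MediaKind[], policy: BundlePolicy): boolean[] {
  return sections.map((kind, index) => {
    if (policy === "max-compat") return false; // every section has transport parameters
    if (policy === "max-bundle") return index > 0; // only the first section does
    // balanced: only the first section of each type has transport parameters
    return sections.indexOf(kind) !== index;
  });
}

// Count the ICE components needing candidates for an initial offer.
function componentsToGather(
  sections: MediaKind[],
  bundle: BundlePolicy,
  rtcpMux: RtcpMuxPolicy
): number {
  const flags = bundleOnlyFlags(sections, bundle);
  let total = 0;
  sections.forEach((kind, index) => {
    if (flags[index]) return; // bundle-only sections gather nothing of their own
    // Assumption: data sections use one component; RTP sections use two
    // (RTP and RTCP) unless RTCP multiplexing is required.
    total += kind === "application" || rtcpMux === "require" ? 1 : 2;
  });
  return total;
}
```

For example, with sections audio, video, video, and application, the "balanced" policy marks only the second video section as bundle-only.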
1063 If setRemoteDescription was previously called with an offer, and 1064 setLocalDescription is called with an answer (provisional or final), 1065 and the media directions are compatible, and media are available to 1066 send, this will result in the starting of media transmission. 1068 4.1.6. setRemoteDescription 1070 The setRemoteDescription method instructs the PeerConnection to apply 1071 the supplied session description as the desired remote configuration. 1072 As in setLocalDescription, the type field of the description 1073 indicates how it should be processed. 1075 This API changes the local media state; among other things, it sets 1076 up local resources for sending and encoding media. 1078 If setLocalDescription was previously called with an offer, and 1079 setRemoteDescription is called with an answer (provisional or final), 1080 and the media directions are compatible, and media are available to 1081 send, this will result in the starting of media transmission. 1083 4.1.7. localDescription 1085 The localDescription method returns a copy of the current local 1086 configuration, i.e. what was most recently passed to 1087 setLocalDescription, plus any local candidates that have been 1088 generated by the ICE Agent. 1090 [[OPEN ISSUE: Do we need to expose accessors for both the current and 1091 proposed local description? https://github.com/rtcweb-wg/jsep/ 1092 issues/16]] 1094 A null object will be returned if the local description has not yet 1095 been established. 1097 4.1.8. remoteDescription 1099 The remoteDescription method returns a copy of the current remote 1100 configuration, i.e. what was most recently passed to 1101 setRemoteDescription, plus any remote candidates that have been 1102 supplied via addIceCandidate. 1104 [[OPEN ISSUE: Do we need to expose accessors for both the current and 1105 proposed remote description?
https://github.com/rtcweb-wg/jsep/ 1106 issues/16]] 1108 A null object will be returned if the remote description has not yet 1109 been established. 1111 4.1.9. canTrickleIceCandidates 1113 The canTrickleIceCandidates property indicates whether the remote 1114 side supports receiving trickled candidates. There are three 1115 potential values: 1117 null: No SDP has been received from the other side, so it is not 1118 known if it can handle trickle. This is the initial value before 1119 setRemoteDescription() is called. 1121 true: SDP has been received from the other side indicating that it 1122 can support trickle. 1124 false: SDP has been received from the other side indicating that it 1125 cannot support trickle. 1127 As described in Section 3.4.2, JSEP implementations always provide 1128 candidates to the application individually, consistent with what is 1129 needed for Trickle ICE. However, applications can use the 1130 canTrickleIceCandidates property to determine whether their peer can 1131 actually do Trickle ICE, i.e., whether it is safe to send an initial 1132 offer or answer followed later by candidates as they are gathered. 1133 As "true" is the only value that definitively indicates remote 1134 Trickle ICE support, an application which compares 1135 canTrickleIceCandidates against "true" will by default attempt Half 1136 Trickle on initial offers and Full Trickle on subsequent interactions 1137 with a Trickle ICE-compatible agent. 1139 4.1.10. setConfiguration 1141 The setConfiguration method allows the global configuration of the 1142 PeerConnection, which was initially set by constructor parameters, to 1143 be changed during the session. The effects of this method call 1144 depend on when it is invoked, and differ depending on which specific 1145 parameters are changed: 1147 o Any changes to the STUN/TURN servers to use affect the next 1148 gathering phase. 
If gathering has already occurred, this will 1149 cause the next call to createOffer to generate new ICE 1150 credentials, for the purpose of forcing an ICE restart and kicking 1151 off a new gathering phase, in which the new servers will be used. 1152 If the ICE candidate pool has a nonzero size, any existing 1153 candidates will be discarded, and new candidates will be gathered 1154 from the new servers. 1156 o Any changes to the ICE candidate policy also affect the next 1157 gathering phase, in similar fashion to the server changes 1158 described above. Note though that changes to the policy have no 1159 effect on the candidate pool, because pooled candidates are not 1160 surfaced to the application until a gathering phase occurs, and so 1161 any necessary filtering can still be done on any pooled 1162 candidates. 1164 o Any changes to the ICE candidate pool size take effect 1165 immediately; if increased, additional candidates are pre-gathered; 1166 if decreased, the now-superfluous candidates are discarded. 1168 o The BUNDLE and RTCP-multiplexing policies MUST NOT be changed 1169 after the construction of the PeerConnection. 1171 This call may result in a change to the state of the ICE Agent, and 1172 may result in a change to media state if it results in connectivity 1173 being established. 1175 4.1.11. addIceCandidate 1177 The addIceCandidate method provides a remote candidate to the ICE 1178 Agent, which, if parsed successfully, will be added to the remote 1179 description according to the rules defined for Trickle ICE. 1180 Connectivity checks will be sent to the new candidate. 1182 This call will result in a change to the state of the ICE Agent, and 1183 may result in a change to media state if it results in connectivity 1184 being established. 1186 5. SDP Interaction Procedures 1188 This section describes the specific procedures to be followed when 1189 creating and parsing SDP objects. 1191 5.1. 
Requirements Overview 1193 JSEP implementations must comply with the specifications listed below 1194 that govern the creation and processing of offers and answers. 1196 The first set of specifications is the "mandatory-to-implement" set. 1197 All implementations must support these behaviors, but may not use all 1198 of them if the remote side, which may not be a JSEP endpoint, does 1199 not support them. 1201 The second set of specifications is the "mandatory-to-use" set. The 1202 local JSEP endpoint and any remote endpoint must indicate support for 1203 these specifications in their session descriptions. 1205 5.1.1. Implementation Requirements 1207 This list of mandatory-to-implement specifications is derived from 1208 the requirements outlined in [I-D.ietf-rtcweb-rtp-usage]. 1210 R-1 [RFC4566] is the base SDP specification and MUST be 1211 implemented. 1213 R-2 [RFC5764] MUST be supported for signaling the UDP/TLS/RTP/SAVPF 1214 [RFC5764] and TCP/DTLS/RTP/SAVPF 1215 [I-D.nandakumar-mmusic-proto-iana-registration] RTP profiles. 1217 R-3 [RFC5245] MUST be implemented for signaling the ICE credentials 1218 and candidate lines corresponding to each media stream. The 1219 ICE implementation MUST be a Full implementation, not a Lite 1220 implementation. 1222 R-4 [RFC5763] MUST be implemented to signal DTLS certificate 1223 fingerprints. 1225 R-5 [RFC4568] MUST NOT be implemented to signal SDES SRTP keying 1226 information. 1228 R-6 The [RFC5888] grouping framework MUST be implemented for 1229 signaling grouping information, and MUST be used to identify m= 1230 lines via the a=mid attribute. 1232 R-7 [I-D.ietf-mmusic-msid] MUST be supported, in order to signal 1233 associations between RTP objects and W3C MediaStreams and 1234 MediaStreamTracks in a standard way. 
1236 R-8 The bundle mechanism in 1237 [I-D.ietf-mmusic-sdp-bundle-negotiation] MUST be supported to 1238 signal the ability to multiplex RTP streams on a single UDP 1239 port, in order to avoid excessive use of port number resources. 1241 R-9 The SDP attributes of "sendonly", "recvonly", "inactive", and 1242 "sendrecv" from [RFC4566] MUST be implemented to signal 1243 information about media direction. 1245 R-10 [RFC5576] MUST be implemented to signal RTP SSRC values and 1246 grouping semantics. 1248 R-11 [RFC4585] MUST be implemented to signal RTCP based feedback. 1250 R-12 [RFC5761] MUST be implemented to signal multiplexing of RTP and 1251 RTCP. 1253 R-13 [RFC5506] MUST be implemented to signal reduced-size RTCP 1254 messages. 1256 R-14 [RFC4588] MUST be implemented to signal RTX payload type 1257 associations. 1259 R-15 [RFC3556] with bandwidth modifiers MAY be supported for 1260 specifying RTCP bandwidth as a fraction of the media bandwidth, 1261 RTCP fraction allocated to the senders and setting maximum 1262 media bit-rate boundaries. 1264 R-16 TODO: any others? 1266 As required by [RFC4566], Section 5.13, JSEP implementations MUST 1267 ignore unknown attribute (a=) lines. 1269 5.1.2. Usage Requirements 1271 All session descriptions handled by JSEP endpoints, both local and 1272 remote, MUST indicate support for the following specifications. If 1273 any of these are absent, this omission MUST be treated as an error. 1275 R-1 ICE, as specified in [RFC5245], MUST be used. Note that the 1276 remote endpoint may use a Lite implementation; implementations 1277 MUST properly handle remote endpoints which do ICE-Lite. 1279 R-2 DTLS [RFC6347] or DTLS-SRTP [RFC5763], MUST be used, as 1280 appropriate for the media type, as specified in 1281 [I-D.ietf-rtcweb-security-arch] 1283 5.1.3. 
Profile Names and Interoperability 1285 For media m= sections, JSEP endpoints MUST support both the "UDP/TLS/ 1286 RTP/SAVPF" and "TCP/DTLS/RTP/SAVPF" profiles and MUST indicate one of 1287 these two profiles for each media m= line they produce in an offer. 1288 For data m= sections, JSEP endpoints must support both the "UDP/DTLS/ 1289 SCTP" and "TCP/DTLS/SCTP" profiles and MUST indicate one of these two 1290 profiles for each data m= line they produce in an offer. Because ICE 1291 can select either TCP or UDP transport depending on network 1292 conditions, both advertisements are consistent with ICE eventually 1293 selecting either UDP or TCP. 1295 Unfortunately, in an attempt at compatibility, some endpoints 1296 generate other profile strings even when they mean to support one of 1297 these profiles. For instance, an endpoint might generate "RTP/AVP" 1298 but supply "a=fingerprint" and "a=rtcp-fb" attributes, indicating its 1299 willingness to support "(UDP,TCP)/TLS/RTP/SAVPF". In order to 1300 simplify compatibility with such endpoints, JSEP endpoints MUST 1301 apply the following rules when processing the media m= sections in 1302 an offer: 1304 o The profile in any "m=" line in any answer MUST exactly match the 1305 profile provided in the offer. 1307 o Any profile matching the following patterns MUST be accepted: 1308 "RTP/[S]AVP[F]" and "(UDP/TCP)/TLS/RTP/SAVP[F]" 1310 o Because DTLS-SRTP is REQUIRED, the choice of SAVP or AVP has no 1311 effect; support for DTLS-SRTP is determined by the presence of one 1312 or more "a=fingerprint" attributes. Note that lack of an 1313 "a=fingerprint" attribute will lead to negotiation failure. 1315 o The use of AVPF or AVP simply controls the timing rules used for 1316 RTCP feedback. If AVPF is provided, or an "a=rtcp-fb" attribute 1317 is present, assume AVPF timing, i.e. a default value of "trr- 1318 int=0".
Otherwise, assume that AVPF is being used in an AVP 1319 compatible mode and use AVP timing, i.e., "trr-int=4". 1321 o For data m= sections, JSEP endpoints MUST support receiving the 1322 "UDP/DTLS/SCTP", "TCP/DTLS/SCTP", or "DTLS/SCTP" (for backwards 1323 compatibility) profiles. 1325 Note that re-offers by JSEP endpoints MUST use the correct profile 1326 strings even if the initial offer/answer exchange used an (incorrect) 1327 older profile string. 1329 5.2. Constructing an Offer 1331 When createOffer is called, a new SDP description must be created 1332 that includes the functionality specified in 1333 [I-D.ietf-rtcweb-rtp-usage]. The exact details of this process are 1334 explained below. 1336 5.2.1. Initial Offers 1338 When createOffer is called for the first time, the result is known as 1339 the initial offer. 1341 The first step in generating an initial offer is to generate session- 1342 level attributes, as specified in [RFC4566], Section 5. 1343 Specifically: 1345 o The first SDP line MUST be "v=0", as specified in [RFC4566], 1346 Section 5.1. 1348 o The second SDP line MUST be an "o=" line, as specified in 1349 [RFC4566], Section 5.2. The value of the <username> field SHOULD 1350 be "-". The value of the <sess-id> field SHOULD be a 1351 cryptographically random number. To ensure uniqueness, this 1352 number SHOULD be at least 64 bits long. The value of the <sess-version> field SHOULD be zero. The value of the <nettype> 1354 <addrtype> <unicast-address> tuple SHOULD be set to a non- 1355 meaningful address, such as IN IP4 0.0.0.0, to prevent leaking the 1356 local address in this field. As mentioned in [RFC4566], the 1357 entire o= line needs to be unique, but selecting a random number 1358 for <sess-id> is sufficient to accomplish this. 1360 o The third SDP line MUST be a "s=" line, as specified in [RFC4566], 1361 Section 5.3; to match the "o=" line, a single dash SHOULD be used 1362 as the session name, e.g. "s=-".
Note that this differs from the 1363 advice in [RFC4566] which proposes a single space, but as both 1364 "o=" and "s=" are meaningless, having the same meaningless value 1365 seems clearer. 1367 o Session Information ("i="), URI ("u="), Email Address ("e="), 1368 Phone Number ("p="), Bandwidth ("b="), Repeat Times ("r="), and 1369 Time Zones ("z=") lines are not useful in this context and SHOULD 1370 NOT be included. 1372 o Encryption Keys ("k=") lines do not provide sufficient security 1373 and MUST NOT be included. 1375 o A "t=" line MUST be added, as specified in [RFC4566], Section 5.9; 1376 both <start-time> and <stop-time> SHOULD be set to zero, e.g. "t=0 1377 0". 1379 o An "a=msid-semantic:WMS" line MUST be added, as specified in 1380 [I-D.ietf-mmusic-msid], Section 4. 1382 The next step is to generate m= sections, as specified in [RFC4566], 1383 Section 5.14, for each MediaStreamTrack that has been added to the 1384 PeerConnection via the addStream method. (Note that this method 1385 takes a MediaStream, which can contain multiple MediaStreamTracks, 1386 and therefore multiple m= sections can be generated even if addStream 1387 is only called once.) m= sections MUST be sorted first by the order in 1388 which the MediaStreams were added to the PeerConnection, and then by 1389 the alphabetical ordering of the media type for the MediaStreamTrack. 1390 For example, if a MediaStream containing both an audio and a video 1391 MediaStreamTrack is added to a PeerConnection, the resultant m=audio 1392 section will precede the m=video section. If a second MediaStream 1393 containing an audio MediaStreamTrack was added, it would follow the 1394 m=video section. 1396 Each m= section, provided it is not being bundled into another m= 1397 section, MUST generate a unique set of ICE credentials and gather its 1398 own unique set of ICE candidates. Otherwise, it MUST use the same 1399 ICE credentials and candidates as the m= section into which it is 1400 being bundled.
Note that this means that for offers, any m= sections 1401 which are not bundle-only MUST have unique ICE credentials and 1402 candidates, since it is possible that the answerer will accept them 1403 without bundling them. 1405 For DTLS, all m= sections MUST use the certificate for the identity 1406 that has been specified for the PeerConnection; as a result, they 1407 MUST all have the same [RFC4572] fingerprint value, or this value 1408 MUST be a session-level attribute. 1410 Each m= section should be generated as specified in [RFC4566], 1411 Section 5.14. For the m= line itself, the following rules MUST be 1412 followed: 1414 o The port value would normally be set to the port of the default ICE candidate for 1415 this m= section, but given that no candidates have yet been 1416 gathered, the "dummy" port value of 9 (Discard) MUST be used, as 1417 indicated in [I-D.ietf-mmusic-trickle-ice], Section 5.1. 1419 o To properly indicate use of DTLS, the <proto> field MUST be set to 1420 "UDP/TLS/RTP/SAVPF", as specified in [RFC5764], Section 8, if the 1421 default candidate uses UDP transport, or "TCP/DTLS/RTP/SAVPF", as 1422 specified in [I-D.nandakumar-mmusic-proto-iana-registration], if the 1423 default candidate uses TCP transport. 1425 The m= line MUST be followed immediately by a "c=" line, as specified 1426 in [RFC4566], Section 5.7. Again, as no candidates have yet been 1427 gathered, the "c=" line must contain the "dummy" value "IN IP4 1428 0.0.0.0", as defined in [I-D.ietf-mmusic-trickle-ice], Section 5.1. 1429 [[NOTE: This has not yet changed in the trickle ICE draft.]] 1431 Each m= section MUST include the following attribute lines: 1433 o An "a=mid" line, as specified in [RFC5888], Section 4. When 1434 generating mid values, it is RECOMMENDED that the values be 3 1435 bytes or fewer, to allow them to efficiently fit into the RTP 1436 header extension defined in 1437 [I-D.ietf-mmusic-sdp-bundle-negotiation], Section 11.
1439 o An "a=rtcp" line, as specified in [RFC3605], Section 2.1, 1440 containing the dummy value "9 IN IP4 0.0.0.0", because no 1441 candidates have yet been gathered. 1443 o An "a=msid" line, as specified in [I-D.ietf-mmusic-msid], 1444 Section 2. 1446 o An "a=sendrecv" line, as specified in [RFC3264], Section 5.1. 1448 o For each supported codec, "a=rtpmap" and "a=fmtp" lines, as 1449 specified in [RFC4566], Section 6. The audio and video codecs 1450 that MUST be supported are specified in [I-D.ietf-rtcweb-audio] 1451 (see Section 3) and [I-D.ietf-rtcweb-video] (see Section 5). 1453 o If this m= section is for media with configurable frame sizes, 1454 e.g. audio, an "a=maxptime" line, indicating the smallest of the 1455 maximum supported frame sizes out of all codecs included above, as 1456 specified in [RFC4566], Section 6. 1458 o If this m= section is for video media, and there are known 1459 limitations on the size of images which can be decoded, an 1460 "a=imageattr" line, as specified in Section 3.5. 1462 o For each primary codec where RTP retransmission should be used, a 1463 corresponding "a=rtpmap" line indicating "rtx" with the clock rate 1464 of the primary codec and an "a=fmtp" line that references the 1465 payload type of the primary codec, as specified in [RFC4588], 1466 Section 8.1. 1468 o For each supported FEC mechanism, "a=rtpmap" and "a=fmtp" lines, 1469 as specified in [RFC4566], Section 6. The FEC mechanisms that 1470 MUST be supported are specified in [I-D.ietf-rtcweb-fec], 1471 Section 6, and specific usage for each media type is outlined in 1472 Sections 4 and 5. 1474 o "a=ice-ufrag" and "a=ice-pwd" lines, as specified in [RFC5245], 1475 Section 15.4. 1477 o An "a=ice-options" line, with the "trickle" option, as specified 1478 in [I-D.ietf-mmusic-trickle-ice], Section 4. 
1480 o An "a=fingerprint" line for each of the endpoint's certificates, 1481 as specified in [RFC4572], Section 5; the digest algorithm used 1482 for the fingerprint MUST match that used in the certificate 1483 signature. 1485 o An "a=setup" line, as specified in [RFC4145], Section 4, and 1486 clarified for use in DTLS-SRTP scenarios in [RFC5763], Section 5. 1487 The role value in the offer MUST be "actpass". 1489 o An "a=rtcp-mux" line, as specified in [RFC5761], Section 5.1.1. 1491 o An "a=rtcp-rsize" line, as specified in [RFC5506], Section 5. 1493 o For each supported RTP header extension, an "a=extmap" line, as 1494 specified in [RFC5285], Section 5. The list of header extensions 1495 that SHOULD/MUST be supported is specified in 1496 [I-D.ietf-rtcweb-rtp-usage], Section 5.2. Any header extensions 1497 that require encryption MUST be specified as indicated in 1498 [RFC6904], Section 4. 1500 o For each supported RTCP feedback mechanism, an "a=rtcp-fb" 1501 mechanism, as specified in [RFC4585], Section 4.2. The list of 1502 RTCP feedback mechanisms that SHOULD/MUST be supported is 1503 specified in [I-D.ietf-rtcweb-rtp-usage], Section 5.1. 1505 o An "a=ssrc" line, as specified in [RFC5576], Section 4.1, 1506 indicating the SSRC to be used for sending media, along with the 1507 mandatory "cname" source attribute, as specified in Section 6.1, 1508 indicating the CNAME for the source. The CNAME MUST be generated 1509 in accordance with Section 4.9 of [I-D.ietf-rtcweb-rtp-usage]. 1511 o If RTX is supported for this media type, another "a=ssrc" line 1512 with the RTX SSRC, and an "a=ssrc-group" line, as specified in 1513 [RFC5576], section 4.2, with semantics set to "FID" and including 1514 the primary and RTX SSRCs. 1516 o If FEC is supported for this media type, another "a=ssrc" line 1517 with the FEC SSRC, and an "a=ssrc-group" line with semantics set 1518 to "FEC-FR" and including the primary and FEC SSRCs, as specified 1519 in [RFC5956], section 4.3. 
For simplicity, if both RTX and FEC 1520 are supported, the FEC SSRC MUST be the same as the RTX SSRC. 1522 o If the BUNDLE policy for this PeerConnection is set to "max- 1523 bundle", and this is not the first m= section, or the BUNDLE 1524 policy is set to "balanced", and this is not the first m= section 1525 for this media type, an "a=bundle-only" line. 1527 Lastly, if a data channel has been created, an m= section MUST be 1528 generated for data. The <media> field MUST be set to "application" 1529 and the <proto> field MUST be set to "UDP/DTLS/SCTP" if the default 1530 candidate uses UDP transport, or "TCP/DTLS/SCTP" if the default 1531 candidate uses TCP transport [I-D.ietf-mmusic-sctp-sdp]. The "fmt" 1532 value MUST be set to "webrtc-datachannel" as specified in 1533 [I-D.ietf-mmusic-sctp-sdp], Section 4.1. 1535 Within the data m= section, the "a=mid", "a=ice-ufrag", "a=ice- 1536 pwd", "a=ice-options", "a=candidate", "a=fingerprint", and 1537 "a=setup" lines MUST be included as mentioned above, along with an 1538 "a=fmtp:webrtc-datachannel" line and an "a=sctp-port" line 1539 referencing the SCTP port number as defined in 1540 [I-D.ietf-mmusic-sctp-sdp], Section 4.1. 1542 Once all m= sections have been generated, a session-level "a=group" 1543 attribute MUST be added as specified in [RFC5888]. This attribute 1544 MUST have semantics "BUNDLE", and MUST include the mid identifiers of 1545 each m= section. The effect of this is that the browser offers all 1546 m= sections as one BUNDLE group. However, whether the m= sections 1547 are bundle-only or not depends on the BUNDLE policy. 1549 The next step is to generate session-level lip sync groups as defined 1550 in [RFC5888], Section 7. For each MediaStream with more than one 1551 MediaStreamTrack, a group of type "LS" MUST be added that contains 1552 the mid values for each MediaStreamTrack in that MediaStream.
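
As an illustration only, the two grouping steps described above (one BUNDLE group covering every m= section, plus one LS group per MediaStream that has more than one MediaStreamTrack) could be sketched in Python as follows. The function name, the dictionary layout, and the sample mid values are assumptions made for this example and are not defined by JSEP.

```python
def build_group_lines(mids, streams):
    """Return session-level a=group lines for an initial offer.

    mids:    mid values of all m= sections, in m= section order
    streams: mapping of a (hypothetical) MediaStream id to the list of
             mid values of the m= sections that carry its tracks
    """
    # One BUNDLE group that lists every m= section.
    lines = ["a=group:BUNDLE " + " ".join(mids)]
    # One LS group per MediaStream that has more than one track.
    for stream_mids in streams.values():
        if len(stream_mids) > 1:
            lines.append("a=group:LS " + " ".join(stream_mids))
    return lines


# Hypothetical session: stream1 has audio and video, stream2 audio only.
group_lines = build_group_lines(
    ["a1", "v1", "a2"],
    {"stream1": ["a1", "v1"], "stream2": ["a2"]},
)
print("\n".join(group_lines))
```

In this sketch, stream2 produces no LS group because it contains a single track; its m= section still appears in the BUNDLE group.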
1554 Attributes which SDP permits to either be at the session level or the 1555 media level SHOULD generally be at the media level even if they are 1556 identical. This promotes readability, especially if one of a set of 1557 initially identical attributes is subsequently changed. 1559 Attributes other than the ones specified above MAY be included, 1560 except for the following attributes which are specifically 1561 incompatible with the requirements of [I-D.ietf-rtcweb-rtp-usage], 1562 and MUST NOT be included: 1564 o "a=crypto" 1566 o "a=key-mgmt" 1568 o "a=ice-lite" 1570 Note that when BUNDLE is used, any additional attributes that are 1571 added MUST follow the advice in [I-D.ietf-mmusic-sdp-mux-attributes] 1572 on how those attributes interact with BUNDLE. 1574 Note that these requirements are in some cases stricter than those of 1575 SDP. Implementations MUST be prepared to accept compliant SDP even 1576 if it would not conform to the requirements for generating SDP in 1577 this specification. 1579 5.2.2. Subsequent Offers 1581 When createOffer is called a second (or later) time, or is called 1582 after a local description has already been installed, the processing 1583 is somewhat different from that for an initial offer. 1585 If the initial offer was not applied using setLocalDescription, 1586 meaning the PeerConnection is still in the "stable" state, the steps 1587 for generating an initial offer should be followed, subject to the 1588 following restriction: 1590 o The fields of the "o=" line MUST stay the same except for the 1591 <sess-version> field, which MUST increment if the session 1592 description changes in any way, including the addition of ICE 1593 candidates.
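
A minimal sketch of this restriction, assuming the "o=" line is handled as a plain string: the session version is the third space-separated field of the "o=" line in [RFC4566], so a subsequent offer can reuse the original line and increment only that field. The helper names below are hypothetical, and the initial values follow the recommendations in Section 5.2.1.

```python
import secrets


def initial_origin_line():
    """Build the o= line for an initial offer."""
    sess_id = secrets.randbits(64)  # cryptographically random, up to 64 bits
    # Username "-", session version 0, and a non-meaningful address.
    return f"o=- {sess_id} 0 IN IP4 0.0.0.0"


def bump_session_version(origin_line):
    """Return the o= line with only the session version incremented."""
    fields = origin_line.split(" ")
    # fields[0] is "o=" plus the username, fields[1] the session id,
    # fields[2] the session version; the rest is the address tuple.
    fields[2] = str(int(fields[2]) + 1)
    return " ".join(fields)


first = initial_origin_line()
second = bump_session_version(first)
print(first)
print(second)
```

Every field other than the version is carried over unchanged, so the two lines still identify the same session.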
1595 If the initial offer was applied using setLocalDescription, but an 1596 answer from the remote side has not yet been applied, meaning the 1597 PeerConnection is still in the "local-offer" state, an offer is 1598 generated by following the steps in the "stable" state above, along 1599 with these exceptions: 1601 o The "s=" and "t=" lines MUST stay the same. 1603 o Each "m=" and "c=" line MUST be filled in with the port, protocol, 1604 and address of the default candidate for the m= section, as 1605 described in [RFC5245], Section 4.3. If ICE checking has already 1606 completed for one or more candidate pairs and a candidate pair is 1607 in active use, then that pair MUST be used, even if ICE has not 1608 yet completed. Note that this differs from the guidance in 1609 [RFC5245], Section 9.1.2.2, which only refers to offers created 1610 when ICE has completed. Each "a=rtcp" attribute line MUST also be 1611 filled in with the port and address of the appropriate default 1612 candidate, either the default RTP or RTCP candidate, depending on 1613 whether RTCP multiplexing is currently active or not. Note that 1614 if RTCP multiplexing is being offered, but not yet active, the 1615 default RTCP candidate MUST be used, as indicated in [RFC5761], 1616 Section 5.1.3. In each case, if no candidates of the desired type 1617 have yet been gathered, dummy values MUST be used, as described 1618 above. 1620 o Each "a=mid" line MUST stay the same. 1622 o Each "a=ice-ufrag" and "a=ice-pwd" line MUST stay the same, unless 1623 the ICE configuration has changed (either changes to the supported 1624 STUN/TURN servers, or the ICE candidate policy), or the 1625 "IceRestart" option (Section 5.2.3.3) was specified. 1627 o Within each m= section, for each candidate that has been gathered 1628 during the most recent gathering phase (see Section 3.4.1), an 1629 "a=candidate" line MUST be added, as specified in [RFC5245], 1630 Section 4.3, paragraph 3.
If candidate gathering for the section 1631 has completed, an "a=end-of-candidates" attribute MUST be added, 1632 as described in [I-D.ietf-mmusic-trickle-ice], Section 9.3. 1634 o For MediaStreamTracks that are still present, the "a=msid", 1635 "a=ssrc", and "a=ssrc-group" lines MUST stay the same. 1637 o If any MediaStreamTracks have been removed, either through the 1638 removeStream method or by removing them from an added MediaStream, 1639 their m= sections MUST be marked as recvonly by changing the value 1640 of the [RFC3264] directional attribute to "a=recvonly". The 1641 "a=msid", "a=ssrc", and "a=ssrc-group" lines MUST be removed from 1642 the associated m= sections. 1644 o If any MediaStreamTracks have been added, and there exist m= 1645 sections of the appropriate media type with no associated 1646 MediaStreamTracks (i.e. as described in the preceding paragraph), 1647 those m= sections MUST be recycled by adding the new 1648 MediaStreamTrack to the m= section. This is done by adding the 1649 necessary "a=msid", "a=ssrc", and "a=ssrc-group" lines to the 1650 recycled m= section, and removing the "a=recvonly" attribute. 1652 If the initial offer was applied using setLocalDescription, and an 1653 answer from the remote side has been applied using 1654 setRemoteDescription, meaning the PeerConnection is in the "remote- 1655 pranswer" or "stable" states, an offer is generated based on the 1656 negotiated session descriptions by following the steps mentioned for 1657 the "local-offer" state above, along with these exceptions: [OPEN 1658 ISSUE: should this be permitted in the remote-pranswer state? 1659 https://github.com/rtcweb-wg/jsep/issues/143] 1661 o If a m= section exists in the current local description, but does 1662 not have an associated local MediaStreamTrack (possibly because 1663 said MediaStreamTrack was removed since the last exchange), a m= 1664 section MUST still be generated in the new offer, as indicated in 1665 [RFC3264], Section 8. 
The disposition of this section will depend 1666 on the state of the remote MediaStreamTrack associated with this 1667 m= section. If one exists, and it is still in the "live" state, 1668 the new m= section MUST be marked as "a=recvonly", with no 1669 "a=msid" or related attributes present. If no remote 1670 MediaStreamTrack exists, or it is in the "ended" state, the m= 1671 section MUST be marked as rejected, by setting the port to zero, 1672 as indicated in [RFC3264], Section 8.2. 1674 o If any MediaStreamTracks have been added, and there exist recvonly 1675 m= sections of the appropriate media type with no associated 1676 MediaStreamTracks, or rejected m= sections of any media type, 1677 those m= sections MUST be recycled, and a local MediaStreamTrack 1678 associated with these recycled m= sections until all such existing 1679 m= sections have been used. This includes any recvonly or 1680 rejected m= sections created by the preceding paragraph. 1682 In addition, for each non-recycled, non-rejected m= section in the 1683 new offer, the following adjustments are made based on the contents 1684 of the corresponding m= section in the current remote description: 1686 o The m= line and corresponding "a=rtpmap" and "a=fmtp" lines MUST 1687 only include codecs present in the remote description. 1689 o The RTP header extensions MUST only include those that are present 1690 in the remote description. 1692 o The RTCP feedback extensions MUST only include those that are 1693 present in the remote description. 1695 o The "a=rtcp-mux" line MUST only be added if present in the remote 1696 description. 1698 o The "a=rtcp-rsize" line MUST only be added if present in the 1699 remote description. 1701 The "a=group:BUNDLE" attribute MUST include the mid identifiers 1702 specified in the BUNDLE group in the most recent answer, minus any m= 1703 sections that have been marked as rejected, plus any newly added or 1704 re-enabled m= sections. 
In other words, the BUNDLE attribute must 1705 contain all m= sections that were previously bundled, as long as they 1706 are still alive, as well as any new m= sections. 1708 The "LS" groups are generated in the same way as with initial offers. 1710 5.2.3. Options Handling 1712 The createOffer method takes as a parameter an RTCOfferOptions 1713 object. Special processing is performed when generating a SDP 1714 description if the following options are present. 1716 5.2.3.1. OfferToReceiveAudio 1718 If the "OfferToReceiveAudio" option is specified, with an integer 1719 value of N, and M audio MediaStreamTracks have been added to the 1720 PeerConnection, the offer MUST include N non-rejected m= sections 1721 with media type "audio", even if N is greater than M. This allows 1722 the offerer to receive audio, including multiple independent streams, 1723 even when not sending it; accordingly, the directional attribute on 1724 the N-M audio m= sections without associated MediaStreamTracks MUST 1725 be set to recvonly. 1727 If N is set to a value less than M, the offer MUST mark the m= 1728 sections associated with the M-N most recently added (since the last 1729 setLocalDescription) MediaStreamTracks as sendonly. This allows the 1730 offerer to indicate that it does not want to receive audio on some or 1731 all of its newly created streams. For m= sections that have 1732 previously been negotiated, this setting has no effect. [TODO: refer 1733 to RTCRtpSender in the future] 1735 For backwards compatibility with pre-standard versions of this 1736 specification, a value of "true" is interpreted as equivalent to N=1, 1737 and "false" as N=0. 1739 5.2.3.2. OfferToReceiveVideo 1741 If the "OfferToReceiveVideo" option is specified, with an integer 1742 value of N, and M video MediaStreamTracks have been added to the 1743 PeerConnection, the offer MUST include N non-rejected m= sections 1744 with media type "video", even if N is greater than M. 
This allows 1745 the offerer to receive video, including multiple independent streams, 1746 even when not sending it; accordingly, the directional attribute on 1747 the N-M video m= sections without associated MediaStreamTracks MUST 1748 be set to recvonly. 1750 If N is set to a value less than M, the offer MUST mark the m= 1751 sections associated with the M-N most recently added (since the last 1752 setLocalDescription) MediaStreamTracks as sendonly. This allows the 1753 offerer to indicate that it does not want to receive video on some or 1754 all of its newly created streams. For m= sections that have 1755 previously been negotiated, this setting has no effect. [TODO: refer 1756 to RTCRtpSender in the future] 1758 For backwards compatibility with pre-standard versions of this 1759 specification, a value of "true" is interpreted as equivalent to N=1, 1760 and "false" as N=0. 1762 5.2.3.3. IceRestart 1764 If the "IceRestart" option is specified, with a value of "true", the 1765 offer MUST indicate an ICE restart by generating new ICE ufrag and 1766 pwd attributes, as specified in [RFC5245], Section 9.1.1.1. If this 1767 option is specified on an initial offer, it has no effect (since a 1768 new ICE ufrag and pwd are already generated). Similarly, if the ICE 1769 configuration has changed, this option has no effect, since new ufrag 1770 and pwd attributes will be generated automatically. This option is 1771 primarily useful for reestablishing connectivity in cases where 1772 failures are detected by the application. 1774 5.2.3.4. VoiceActivityDetection 1776 If the "VoiceActivityDetection" option is specified, with a value of 1777 "true", the offer MUST indicate support for silence suppression in 1778 the audio it receives by including comfort noise ("CN") codecs for 1779 each offered audio codec, as specified in [RFC3389], Section 5.1, 1780 except for codecs that have their own internal silence suppression 1781 support. 
For codecs that have their own internal silence suppression 1782 support, the appropriate fmtp parameters for that codec MUST be 1783 specified to indicate that silence suppression for received audio is 1784 desired. For example, when using the Opus codec, the "usedtx=1" 1785 parameter would be specified in the offer. This option allows the 1786 endpoint to significantly reduce the amount of audio bandwidth it 1787 receives, at the cost of some fidelity, depending on the quality of 1788 the remote VAD algorithm. 1790 5.3. Generating an Answer 1792 When createAnswer is called, a new SDP description must be created 1793 that is compatible with the supplied remote description as well as 1794 the requirements specified in [I-D.ietf-rtcweb-rtp-usage]. The exact 1795 details of this process are explained below. 1797 5.3.1. Initial Answers 1799 When createAnswer is called for the first time after a remote 1800 description has been provided, the result is known as the initial 1801 answer. If no remote description has been installed, an answer 1802 cannot be generated, and an error MUST be returned. 1804 Note that the remote description SDP may not have been created by a 1805 JSEP endpoint and may not conform to all the requirements listed in 1806 Section 5.2. For many cases, this is not a problem. However, if any 1807 mandatory SDP attributes are missing, or functionality listed as 1808 mandatory-to-use above is not present, this MUST be treated as an 1809 error, and MUST cause the affected m= sections to be marked as 1810 rejected. 1812 The first step in generating an initial answer is to generate 1813 session-level attributes. The process here is identical to that 1814 indicated in the Initial Offers section above. 1816 The next step is to generate lip sync groups as defined in [RFC5888], 1817 Section 7. 
For each MediaStream with more than one MediaStreamTrack, 1818 a group of type "LS" MUST be added that contains the mid values for 1819 each MediaStreamTrack in that MediaStream. In some cases this may 1820 result in adding a mid to a given LS group that was not in that LS 1821 group in the associated offer. Although this is not allowed by 1822 [RFC5888], it is allowed when implementing this specification. 1823 [[OPEN ISSUE: This is still under discussion. See: 1824 https://github.com/rtcweb-wg/jsep/issues/162.]] 1826 The next step is to generate m= sections for each m= section that is 1827 present in the remote offer, as specified in [RFC3264], Section 6. 1828 For the purposes of this discussion, any session-level attributes in 1829 the offer that are also valid as media-level attributes SHALL be 1830 considered to be present in each m= section. 1832 The next step is to go through each offered m= section. If there is 1833 a local MediaStreamTrack of the same type which has been added to the 1834 PeerConnection via addStream and not yet associated with a m= 1835 section, and the specific m= section is either sendrecv or recvonly, 1836 the MediaStreamTrack will be associated with the m= section at this 1837 time. MediaStreamTracks are assigned to m= sections using the 1838 canonical order described in Section 5.2.1. If there are more m= 1839 sections of a certain type than MediaStreamTracks, some m= sections 1840 will not have an associated MediaStreamTrack. If there are more 1841 MediaStreamTracks of a certain type than compatible m= sections, only 1842 the first N MediaStreamTracks will be able to be associated in the 1843 constructed answer. The remainder will need to be associated in a 1844 subsequent offer. 
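
The association step above can be sketched as a simple first-fit pass over the offered m= sections. This is an illustrative Python sketch under stated assumptions, not an implementation of any browser's behavior: m= sections and tracks are reduced to tuples, the track list is assumed to already be in the canonical order of Section 5.2.1, and the function name is hypothetical.

```python
def associate_tracks(offered_sections, local_tracks):
    """Pair local tracks with offered m= sections.

    offered_sections: list of (kind, direction) tuples, in offer order
    local_tracks:     list of (kind, track_id) tuples, in canonical order
    Returns (assignments, leftover): assignments[i] is the track id for
    offered section i, or None; leftover lists tracks that must wait
    for a subsequent offer.
    """
    pending = list(local_tracks)
    assignments = []
    for kind, direction in offered_sections:
        chosen = None
        # Only sections on which the offerer can receive accept a local track.
        if direction in ("sendrecv", "recvonly"):
            for track in pending:
                if track[0] == kind:
                    chosen = track
                    break
        if chosen is not None:
            pending.remove(chosen)
            assignments.append(chosen[1])
        else:
            assignments.append(None)
    return assignments, pending


# Hypothetical offer with one audio and two video m= sections.
offer = [("audio", "sendrecv"), ("video", "sendonly"), ("video", "recvonly")]
tracks = [("audio", "a0"), ("audio", "a1"), ("video", "v0")]
assignments, leftover = associate_tracks(offer, tracks)
print(assignments, leftover)
```

Here the sendonly video section receives no local track, the second audio track finds no compatible m= section, and it therefore remains in the leftover list until a subsequent offer.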

For each offered m= section, if the associated remote
MediaStreamTrack has been stopped, and is therefore in state "ended",
and no local MediaStreamTrack has been associated, the corresponding
m= section in the answer MUST be marked as rejected by setting the
port in the m= line to zero, as indicated in [RFC3264], Section 6,
and further processing for this m= section can be skipped.

Provided that is not the case, each m= section in the answer should
then be generated as specified in [RFC3264], Section 6.1. For the m=
line itself, the following rules must be followed:

o  The port value would normally be set to the port of the default
   ICE candidate for this m= section, but given that no candidates
   have yet been gathered, the "dummy" port value of 9 (Discard) MUST
   be used, as indicated in [I-D.ietf-mmusic-trickle-ice],
   Section 5.1.

o  The "proto" field MUST be set to exactly match the "proto" field
   for the corresponding m= line in the offer.

The m= line MUST be followed immediately by a "c=" line, as specified
in [RFC4566], Section 5.7. Again, as no candidates have yet been
gathered, the "c=" line must contain the "dummy" value "IN IP4
0.0.0.0", as defined in [I-D.ietf-mmusic-trickle-ice], Section 5.1.

If the offer supports BUNDLE, all m= sections to be BUNDLEd must use
the same ICE credentials and candidates; all m= sections not being
BUNDLEd must use unique ICE credentials and candidates. Each m=
section MUST include the following:

o  If present in the offer, an "a=mid" line, as specified in
   [RFC5888], Section 9.1. The "mid" value MUST match that specified
   in the offer.

o  An "a=rtcp" line, as specified in [RFC3605], Section 2.1,
   containing the dummy value "9 IN IP4 0.0.0.0", because no
   candidates have yet been gathered.

o  If a local MediaStreamTrack has been associated, an "a=msid" line,
   as specified in [I-D.ietf-mmusic-msid], Section 2.

o  Depending on the directionality of the offer, the disposition of
   any associated remote MediaStreamTrack, and the presence of an
   associated local MediaStreamTrack, the appropriate directionality
   attribute, as specified in [RFC3264], Section 6.1. If the offer
   was sendrecv, and the remote MediaStreamTrack is still "live", and
   there is a local MediaStreamTrack that has been associated, the
   directionality MUST be set as sendrecv. If the offer was
   sendonly, and the remote MediaStreamTrack is still "live", the
   directionality MUST be set as recvonly. If the offer was
   recvonly, and a local MediaStreamTrack has been associated, the
   directionality MUST be set as sendonly. If the offer was
   inactive, the directionality MUST be set as inactive.

o  For each supported codec that is present in the offer, "a=rtpmap"
   and "a=fmtp" lines, as specified in [RFC4566], Section 6, and
   [RFC3264], Section 6.1. The audio and video codecs that MUST be
   supported are specified in [I-D.ietf-rtcweb-audio] (see Section 3)
   and [I-D.ietf-rtcweb-video] (see Section 5). Note that for
   simplicity, the answerer MAY use different payload types for
   codecs than the offerer, as it is not prohibited by Section 6.1.

o  If this m= section is for media with configurable frame sizes,
   e.g. audio, an "a=maxptime" line, indicating the smallest of the
   maximum supported frame sizes out of all codecs included above, as
   specified in [RFC4566], Section 6.

o  If this m= section is for video media, and there are known
   limitations on the size of images which can be decoded, an
   "a=imageattr" line, as specified in Section 3.5.

o  If "rtx" is present in the offer, for each primary codec where RTP
   retransmission should be used, a corresponding "a=rtpmap" line
   indicating "rtx" with the clock rate of the primary codec and an
   "a=fmtp" line that references the payload type of the primary
   codec, as specified in [RFC4588], Section 8.1.

o  For each supported FEC mechanism, "a=rtpmap" and "a=fmtp" lines,
   as specified in [RFC4566], Section 6. The FEC mechanisms that
   MUST be supported are specified in [I-D.ietf-rtcweb-fec],
   Section 6, and specific usage for each media type is outlined in
   Sections 4 and 5.

o  "a=ice-ufrag" and "a=ice-pwd" lines, as specified in [RFC5245],
   Section 15.4.

o  If the "trickle" ICE option is present in the offer, an
   "a=ice-options" line, with the "trickle" option, as specified in
   [I-D.ietf-mmusic-trickle-ice], Section 4.

o  An "a=fingerprint" line for each of the endpoint's certificates,
   as specified in [RFC4572], Section 5; the digest algorithm used
   for the fingerprint MUST match that used in the certificate
   signature.

o  An "a=setup" line, as specified in [RFC4145], Section 4, and
   clarified for use in DTLS-SRTP scenarios in [RFC5763], Section 5.
   The role value in the answer MUST be "active" or "passive"; the
   "active" role is RECOMMENDED.

o  If present in the offer, an "a=rtcp-mux" line, as specified in
   [RFC5761], Section 5.1.1. If the "require" RTCP multiplexing
   policy is set and no "a=rtcp-mux" line is present in the offer,
   then the m= section MUST be marked as rejected by setting the port
   in the m= line to zero, as indicated in [RFC3264], Section 6.

o  If present in the offer, an "a=rtcp-rsize" line, as specified in
   [RFC5506], Section 5.

o  For each supported RTP header extension that is present in the
   offer, an "a=extmap" line, as specified in [RFC5285], Section 5.
   The list of header extensions that SHOULD/MUST be supported is
   specified in [I-D.ietf-rtcweb-rtp-usage], Section 5.2. Any header
   extensions that require encryption MUST be specified as indicated
   in [RFC6904], Section 4.

o  For each supported RTCP feedback mechanism that is present in the
   offer, an "a=rtcp-fb" line, as specified in [RFC4585],
   Section 4.2. The list of RTCP feedback mechanisms that SHOULD/MUST
   be supported is specified in [I-D.ietf-rtcweb-rtp-usage],
   Section 5.1.

o  If a local MediaStreamTrack has been associated, an "a=ssrc" line,
   as specified in [RFC5576], Section 4.1, indicating the SSRC to be
   used for sending media, along with the mandatory "cname" source
   attribute, as specified in [RFC5576], Section 6.1, indicating the
   CNAME for the source. The CNAME MUST be generated in accordance
   with Section 4.9 of [I-D.ietf-rtcweb-rtp-usage].

o  If a local MediaStreamTrack has been associated, and RTX has been
   negotiated for this m= section, another "a=ssrc" line with the RTX
   SSRC, and an "a=ssrc-group" line, as specified in [RFC5576],
   Section 4.2, with semantics set to "FID" and including the primary
   and RTX SSRCs.

o  If a local MediaStreamTrack has been associated, and FEC has been
   negotiated for this m= section, another "a=ssrc" line with the FEC
   SSRC, and an "a=ssrc-group" line with semantics set to "FEC-FR"
   and including the primary and FEC SSRCs, as specified in
   [RFC5956], Section 4.3. For simplicity, if both RTX and FEC are
   supported, the FEC SSRC MUST be the same as the RTX SSRC.

If a data channel m= section has been offered, an m= section MUST
also be generated for data. The "media" field MUST be set to
"application" and the "proto" and "fmt" fields MUST be set to exactly
match the fields in the offer.
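
The directionality rules given earlier in this section for RTP m=
sections can be sketched as follows. This is a non-normative Python
illustration, and the function name answer_direction is hypothetical.
The four cases stated in the text are covered directly. Combinations
that the text does not enumerate (for example, a sendrecv offer with
no associated local MediaStreamTrack) are resolved here by an
assumption: the answerer sends only if the offer can receive and a
local track is associated, and receives only if the offer sends and
the remote track is still "live".

```python
def answer_direction(offer_direction: str,
                     remote_track_live: bool,
                     has_local_track: bool) -> str:
    """Pick the direction attribute for one m= section of the answer."""
    # The answerer receives only what the offerer is willing to send.
    will_receive = (offer_direction in ("sendrecv", "sendonly")
                    and remote_track_live)
    # The answerer sends only what the offerer is willing to receive.
    will_send = (offer_direction in ("sendrecv", "recvonly")
                 and has_local_track)

    if will_send and will_receive:
        return "sendrecv"
    if will_receive:
        return "recvonly"
    if will_send:
        return "sendonly"
    # Assumption: anything else, including an inactive offer, stays
    # inactive. Rejection of the m= section is handled separately.
    return "inactive"
```

Note that the case of an "ended" remote track with no local track is
not a direction at all: as described above, such an m= section is
rejected by setting its port to zero before this step is reached.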

Within the data m= section, the "a=mid", "a=ice-ufrag", "a=ice-pwd",
"a=ice-options", "a=candidate", "a=fingerprint", and
"a=setup" lines MUST be included as mentioned above, along with an
"a=fmtp:webrtc-datachannel" line and an "a=sctp-port" line
referencing the SCTP port number as defined in
[I-D.ietf-mmusic-sctp-sdp], Section 4.1.

If "a=group" attributes with semantics of "BUNDLE" are offered,
corresponding session-level "a=group" attributes MUST be added as
specified in [RFC5888]. These attributes MUST have semantics
"BUNDLE", and MUST include all the mid identifiers from the offered
BUNDLE groups that have not been rejected. Note that regardless of
the presence of "a=bundle-only" in the offer, no m= sections in the
answer should have an "a=bundle-only" line.

Attributes that are common between all m= sections MAY be moved to
session-level, if explicitly defined to be valid at session-level.

The attributes prohibited in the creation of offers are also
prohibited in the creation of answers.

5.3.2. Subsequent Answers

When createAnswer is called a second (or later) time, or is called
after a local description has already been installed, the processing
is somewhat different than for an initial answer.

If the initial answer was not applied using setLocalDescription,
meaning the PeerConnection is still in the "have-remote-offer" state,
the steps for generating an initial answer should be followed,
subject to the following restriction:

o  The fields of the "o=" line MUST stay the same except for the
   "session-version" field, which MUST increment if the session
   description changes in any way from the previously generated
   answer.
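
The "o=" line restriction above can be illustrated with a short,
non-normative Python sketch. It assumes the six-field "o=" layout of
[RFC4566], Section 5.2, in which the third field carries the version
of the session description. The function name bump_session_version
and the sample values used with it are hypothetical.

```python
def bump_session_version(o_line: str, description_changed: bool) -> str:
    """Return the o= line for a regenerated answer.

    All fields are copied unchanged, except that the version field is
    incremented when the new description differs from the previous one.
    """
    prefix, _, value = o_line.partition("=")
    fields = value.split(" ")
    if prefix != "o" or len(fields) != 6:
        raise ValueError("not a well-formed o= line: %r" % o_line)
    if description_changed:
        # Third field: username, session id, then the version.
        fields[2] = str(int(fields[2]) + 1)
    return "o=" + " ".join(fields)
```

For instance, an illustrative line such as "o=- 20518 1 IN IP4 0.0.0.0"
would become "o=- 20518 2 IN IP4 0.0.0.0" if the regenerated answer
differs, and would be returned unchanged otherwise.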

If any session description was previously supplied to
setLocalDescription, an answer is generated by following the steps in
the "have-remote-offer" state above, along with these exceptions:

o  The "s=" and "t=" lines MUST stay the same.

o  Each "m=" and "c=" line MUST be filled in with the port and address
   of the default candidate for the m= section, as described in
   [RFC5245], Section 4.3. Note, however, that the m= line protocol
   need not match the default candidate, because this protocol value
   must instead match what was supplied in the offer, as described
   above. Each "a=rtcp" attribute line MUST also be filled in with
   the port and address of the appropriate default candidate, either
   the default RTP or RTCP candidate, depending on whether RTCP
   multiplexing is enabled in the answer. In each case, if no
   candidates of the desired type have yet been gathered, dummy
   values MUST be used, as described in the initial answer section
   above.

o  Each "a=ice-ufrag" and "a=ice-pwd" line MUST stay the same.

o  Within each m= section, for each candidate that has been gathered
   during the most recent gathering phase (see Section 3.4.1), an
   "a=candidate" line MUST be added, as specified in [RFC5245],
   Section 4.3, paragraph 3. If candidate gathering for the section
   has completed, an "a=end-of-candidates" attribute MUST be added,
   as described in [I-D.ietf-mmusic-trickle-ice], Section 9.3.

o  For MediaStreamTracks that are still present, the "a=msid",
   "a=ssrc", and "a=ssrc-group" lines MUST stay the same.

5.3.3. Options Handling

The createAnswer method takes as a parameter an RTCAnswerOptions
object.
The set of parameters for RTCAnswerOptions differs from
those supported in RTCOfferOptions; the OfferToReceiveAudio,
OfferToReceiveVideo, and IceRestart options mentioned in
Section 5.2.3 are meaningless in the context of generating an answer,
as there is no need to generate extra m= lines in an answer, and ICE
credentials will automatically be changed for all m= lines where the
offerer chose to perform ICE restart.

The following options are supported in RTCAnswerOptions.

5.3.3.1. VoiceActivityDetection

Silence suppression in the answer is handled as described in
Section 5.2.3.4.

5.4. Processing a Local Description

When a SessionDescription is supplied to setLocalDescription, the
following steps MUST be performed:

o  First, the type of the SessionDescription is checked against the
   current state of the PeerConnection:

   *  If the type is "offer", the PeerConnection state MUST be either
      "stable" or "have-local-offer".

   *  If the type is "pranswer" or "answer", the PeerConnection state
      MUST be either "have-remote-offer" or "have-local-pranswer".

o  If the type is not correct for the current state, processing MUST
   stop and an error MUST be returned.

o  Next, the SessionDescription is parsed into a data structure, as
   described in Section 5.6 below. If parsing fails for any reason,
   processing MUST stop and an error MUST be returned.

o  Finally, the parsed SessionDescription is applied as described in
   Section 5.7 below.

5.5. Processing a Remote Description

When a SessionDescription is supplied to setRemoteDescription, the
following steps MUST be performed:

o  First, the type of the SessionDescription is checked against the
   current state of the PeerConnection:

   *  If the type is "offer", the PeerConnection state MUST be either
      "stable" or "have-remote-offer".

   *  If the type is "pranswer" or "answer", the PeerConnection state
      MUST be either "have-local-offer" or "have-remote-pranswer".

o  If the type is not correct for the current state, processing MUST
   stop and an error MUST be returned.

o  Next, the SessionDescription is parsed into a data structure, as
   described in Section 5.6 below. If parsing fails for any reason,
   processing MUST stop and an error MUST be returned.

o  Finally, the parsed SessionDescription is applied as described in
   Section 5.8 below.

5.6. Parsing a Session Description

[The behavior described herein is a draft version, and needs more
discussion to resolve various open issues.]

When a SessionDescription of any type is supplied to
setLocal/RemoteDescription, the implementation must parse it and
reject it if it is invalid. The exact details of this process are
explained below.

The SDP contained in the session description object consists of a
sequence of text lines, each containing a key-value expression, as
described in [RFC4566], Section 5. The SDP is read, line-by-line,
and converted to a data structure that contains the deserialized
information. However, SDP allows many types of lines, not all of
which are relevant to JSEP applications. For each line, the
implementation will first ensure it is syntactically correct
according to its defining ABNF [TODO: reference], check that it
conforms to [RFC4566] and [RFC3264] semantics, and then either parse
and store or discard the provided value, as described below. [TODO:
ensure that every line is listed below.] If the line is not
well-formed, or cannot be parsed as described, the parser MUST stop
with an error and reject the session description. This ensures that
implementations do not accidentally misinterpret ambiguous SDP.

5.6.1. Session-Level Parsing

First, the session-level lines are checked and parsed. These lines
MUST occur in a specific order, and with a specific syntax, as
defined in [RFC4566], Section 5. Note that while the specific line
types (e.g. "v=", "c=") MUST occur in the defined order, lines of the
same type (typically "a=") can occur in any order, and their ordering
is not meaningful.

For non-attribute (non-"a=") lines, their sequencing, syntax, and
semantics are checked, as mentioned above. The following lines are
not meaningful in the JSEP context and MAY be discarded once they
have been checked.

   The "c=" line MUST be checked for syntax but its value is not
   used. This supersedes the guidance in [RFC5245], Section 6.1, to
   use "ice-mismatch" to indicate mismatches between "c=" and the
   candidate lines; because JSEP always uses ICE, "ice-mismatch" is
   not useful in this context.

   TODO

The remaining lines are processed as follows:

   The "b=" line, if present, MUST be parsed as specified in
   [RFC4566], Section 5.8, and the bwtype and bandwidth values
   stored.

[OPEN ISSUE: is this WG consensus? Are there other non-a= lines
that we need to do more than just syntactical validation, e.g.
v=?]

Specific processing MUST be applied for the following session-level
attribute ("a=") lines:

o  Any "a=group" lines are parsed as specified in [RFC5888],
   Section 5, and the group's semantics and mids are stored.

o  If present, a single "a=ice-lite" line is parsed as specified in
   [RFC5245], Section 15.3, and a value indicating the presence of
   ice-lite is stored.

o  If present, a single "a=ice-ufrag" line is parsed as specified in
   [RFC5245], Section 15.4, and the ufrag value is stored.

o  If present, a single "a=ice-pwd" line is parsed as specified in
   [RFC5245], Section 15.4, and the password value is stored.

o  If present, a single "a=ice-options" line is parsed as specified
   in [RFC5245], Section 15.5, and the set of specified options is
   stored.

o  Any "a=fingerprint" lines are parsed as specified in [RFC4572],
   Section 5, and the set of fingerprint and algorithm values is
   stored.

o  If present, a single "a=setup" line is parsed as specified in
   [RFC4145], Section 4, and the setup value is stored.

o  Any "a=extmap" lines are parsed as specified in [RFC5285],
   Section 5, and their values are stored.

o  TODO: msid-semantic, identity, rtcp-rsize, rtcp-mux, and any other
   attributes valid at session level.

Once all the session-level lines have been parsed, processing
continues with the lines in media sections.

5.6.2. Media Section Parsing

Like the session-level lines, the media section lines MUST occur in
the specific order and with the specific syntax defined in [RFC4566],
Section 5.

The "m=" line itself MUST be parsed as described in [RFC4566],
Section 5.14, and the media, port, proto, and fmt values stored.

Following the "m=" line, specific processing MUST be applied for the
following non-attribute lines:

o  As with the "c=" line at the session level, the "c=" line MUST be
   parsed according to [RFC4566], Section 5.7, but its value is not
   used.

o  The "b=" line, if present, MUST be parsed as specified in
   [RFC4566], Section 5.8, and the bwtype and bandwidth values
   stored.

Specific processing MUST also be applied for the following attribute
lines:

o  If present, a single "a=ice-ufrag" line is parsed as specified in
   [RFC5245], Section 15.4, and the ufrag value is stored.

o  If present, a single "a=ice-pwd" line is parsed as specified in
   [RFC5245], Section 15.4, and the password value is stored.

o  If present, a single "a=ice-options" line is parsed as specified
   in [RFC5245], Section 15.5, and the set of specified options is
   stored.

o  Any "a=fingerprint" lines are parsed as specified in [RFC4572],
   Section 5, and the set of fingerprint and algorithm values is
   stored.

o  If present, a single "a=setup" line is parsed as specified in
   [RFC4145], Section 4, and the setup value is stored.

If the "m=" proto value indicates use of RTP, as described in
Section 5.1.3 above, the following attribute lines MUST be
processed:

o  The "m=" fmt value MUST be parsed as specified in [RFC4566],
   Section 5.14, and the individual values stored.

o  Any "a=rtpmap" or "a=fmtp" lines MUST be parsed as specified in
   [RFC4566], Section 6, and their values stored.

o  If present, a single "a=ptime" line MUST be parsed as described in
   [RFC4566], Section 6, and its value stored.

o  If present, a single direction attribute line (e.g. "a=sendrecv")
   MUST be parsed as described in [RFC4566], Section 6, and its value
   stored.

o  Any "a=ssrc" or "a=ssrc-group" attributes MUST be parsed as
   specified in [RFC5576], Sections 4.1-4.2, and their values stored.

o  Any "a=extmap" attributes MUST be parsed as specified in
   [RFC5285], Section 5, and their values stored.

o  Any "a=rtcp-fb" attributes MUST be parsed as specified in
   [RFC4585], Section 4.2, and their values stored.

o  If present, a single "a=rtcp-mux" line MUST be parsed as specified
   in [RFC5761], Section 5.1.1, and its presence or absence flagged
   and stored.

o  TODO: a=rtcp-rsize, a=rtcp, a=msid, a=candidate,
   a=end-of-candidates

Otherwise, if the "m=" proto value indicates use of SCTP, the
following attribute lines MUST be processed:

o  The "m=" fmt value MUST be parsed as specified in
   [I-D.ietf-mmusic-sctp-sdp], Section 4.3, and the application
   protocol value stored.

o  An "a=sctp-port" attribute MUST be present, and it MUST be parsed
   as specified in [I-D.ietf-mmusic-sctp-sdp], Section 5.2, and the
   value stored.

o  TODO: max message size

5.6.3. Semantics Verification

Assuming parsing completes successfully, the parsed description is
then evaluated to ensure internal consistency as well as proper
support for mandatory features. Specifically, the following checks
are performed:

o  For each m= section, valid values for each of the mandatory-to-use
   features enumerated in Section 5.1.2 MUST be present. These
   values MAY either be present at the media level, or inherited from
   the session level.

   *  ICE ufrag and password values, which MUST comply with the size
      limits specified in [RFC5245], Section 15.4.

   *  DTLS setup value, which MUST be set according to the rules
      specified in [RFC5763], Section 5, and MUST be consistent with
      the selected role of the current DTLS connection, if one
      exists. [TODO: may need revision, i.e., use of actpass]

   *  DTLS fingerprint values, where at least one fingerprint MUST be
      present.

o  Each m= section is also checked to ensure prohibited features are
   not used. If this is a local description, the "ice-lite"
   attribute MUST NOT be specified.

If this session description is of type "pranswer" or "answer", the
following additional checks are applied:

o  The session description must follow the rules defined in
   [RFC3264], Section 6, including the requirement that the number of
   m= sections MUST exactly match the number of m= sections in the
   associated offer.

o  For each m= section, the media type and protocol values MUST
   exactly match the media type and protocol values in the
   corresponding m= section in the associated offer.

5.7. Applying a Local Description

The following steps are performed at the media engine level to apply
a local description.

First, the parsed parameters are checked to ensure that any
modifications performed fall within those explicitly permitted by
Section 6; otherwise, processing MUST stop and an error MUST be
returned.

Next, media sections are processed. For each media section, the
following steps MUST be performed; if any parameters are out of
bounds, or cannot be applied, processing MUST stop and an error MUST
be returned.

o  If this media section is new, begin gathering candidates for it,
   as defined in [RFC5245], Section 4.1.1, unless it has been marked
   as bundle-only.

o  Or, if the ICE ufrag and password values have changed, trigger the
   ICE Agent to start an ICE restart and begin gathering new
   candidates for the media section, as defined in [RFC5245],
   Section 9.1.1.1, unless it has been marked as bundle-only.

o  If the media section proto value indicates use of RTP:

   *  If RTCP mux is indicated, prepare to demux RTP and RTCP from
      the RTP ICE component, as specified in [RFC5761],
      Section 5.1.1. If RTCP mux is not indicated, but was indicated
      in a previous description, this MUST result in an error.

   *  For each specified RTP header extension, establish a mapping
      between the extension ID and URI. If any indicated RTP header
      extension is unknown, this MUST result in an error. [TODO:
      ref]

   *  If the MID header extension is supported, prepare to demux RTP
      data intended for this media section based on the MID header
      extension. [TODO: ref]

   *  For each specified payload type, establish a mapping between
      the payload type ID and the actual media format. [TODO: ref]
      If any indicated payload type is unknown, this MUST result in
      an error.

   *  For each specified "rtx" media format, establish a mapping
      between the RTX payload type and its associated primary payload
      type.
      [TODO: ref] If any referenced primary payload types are
      not present, this MUST result in an error.

   *  If the directional attribute is of type "sendrecv" or
      "recvonly", enable receipt and decoding of media.

Finally, if this description is of type "pranswer" or "answer",
follow the processing defined in Section 5.9 below.

5.8. Applying a Remote Description

The following steps are performed at the media engine level to apply
a remote description.

For each media section, the following steps MUST be performed; if any
parameters are out of bounds, or cannot be applied, processing MUST
stop and an error MUST be returned.

o  If the description is of type "offer", and the ICE ufrag or
   password changed from the previous remote description, [TODO:
   ref], mark that an ICE restart is needed.

o  Configure the ICE components associated with this media section to
   use the supplied ICE remote ufrag and password for their
   connectivity checks.

o  Pair any supplied ICE candidates with any gathered local
   candidates, as described in [TODO: ref] and start connectivity
   checks with the appropriate credentials.

o  If the media section proto value indicates use of RTP:

   *  [TODO: header extensions]

   *  For each specified payload type that is also supported by the
      local implementation, establish a mapping between the payload
      type ID and the actual media format. [TODO: ref] If any
      indicated payload type is unknown, it MUST be ignored. [TODO:
      should fail on answers]

   *  For each specified "rtx" media format, establish a mapping
      between the RTX payload type and its associated primary payload
      type. [TODO: ref] If any referenced primary payload types are
      not present, this MUST result in an error.

   *  For each specified fmtp parameter that is supported by the
      local implementation, enable it on the associated payload
      types.

   *  For each specified RTCP feedback mechanism that is supported by
      the local implementation, enable it on the associated payload
      types.

   *  For any specified "TIAS" bandwidth value, set this value as the
      maximum RTP bitrate to be used when sending media. If a "TIAS"
      value is not present, but an "AS" value is, generate a TIAS
      value using this formula: [TODO: convert AS to TIAS].

   *  [TODO: handling of CN, telephone-event, "red"]

   *  If the media section is of type audio:

      +  For any specified "ptime" value, configure the available
         payload types to use the specified packet size. If the
         specified size is not supported for a payload type, use the
         next closest value instead.

Finally, if this description is of type "pranswer" or "answer",
follow the processing defined in Section 5.9 below.

5.9. Applying an Answer

In addition to the steps mentioned above for processing a local or
remote description, the following steps are performed when processing
a description of type "pranswer" or "answer".

For each media section, the following steps MUST be performed:

o  If the media section has been rejected (i.e. port is set to zero
   in the answer), stop any reception or transmission of media for
   this section, and discard any associated ICE components. [TODO:
   ref]

o  If the remote DTLS fingerprint has been changed, tear down the
   existing DTLS connection.

o  If no valid DTLS connection exists, prepare to start a DTLS
   connection, using the specified roles and fingerprints, on any
   underlying ICE components, once they are active.

o  If the media section proto value indicates use of RTP:

   *  If the media section has RTCP mux enabled, discard any RTCP
      component, and begin or continue muxing RTCP over the RTP
      component, as specified in [RFC5761], Section 5.1.3.
      Otherwise, transmit RTCP over the RTCP component; if no RTCP
      component exists, because RTCP mux was previously enabled, this
      MUST result in an error.

   *  If the media section has reduced-size RTCP enabled, configure
      the RTCP transmission for this media section to use
      reduced-size RTCP, as specified in [TODO: ref].

   *  If the directional attribute in the answer is of type
      "sendrecv" or "sendonly", prepare to start transmitting media
      using the specified primary SSRC and one of the selected
      payload types, once the underlying transport layers have been
      established. Otherwise, stop transmitting RTP media, although
      RTCP should still be sent. [TODO: ref]

o  If the media section proto value indicates use of SCTP:

   *  If no SCTP association yet exists, prepare to initiate an SCTP
      association over the associated ICE component and DTLS
      connection, using the local SCTP port value from the local
      description, and the remote SCTP port value from the remote
      description. [TODO: ref]

If the answer contains valid BUNDLE groups, discard any ICE
components for the m= sections that will be bundled onto the primary
ICE components in each BUNDLE, and begin muxing these m= sections
accordingly. [TODO: ref]

If the answer contains any "a=ice-options" attributes where "trickle"
is listed as an option, update the PeerConnection canTrickle
property to be true. Otherwise, set this property to false.

6. Configurable SDP Parameters

It is possible to change elements in the SDP returned from
createOffer before passing it to setLocalDescription. When an
implementation receives modified SDP it MUST either:

o  Accept the changes and adjust its behavior to match the SDP.

o  Reject the changes and return an error via the error callback.

Changes MUST NOT be silently ignored.

The following elements of the session description MUST NOT be changed
between the createOffer and the setLocalDescription (or between the
createAnswer and the setLocalDescription), since they reflect
transport attributes that are solely under browser control, and the
browser MUST NOT honor an attempt to change them:

o  The number, type and port number of m= lines.

o  The generated ICE credentials (a=ice-ufrag and a=ice-pwd).

o  The set of ICE candidates and their parameters (a=candidate).

o  The DTLS fingerprint(s) (a=fingerprint).

o  The contents of BUNDLE groups, bundle-only parameters, or
   "a=rtcp-mux" parameters.

The following modifications, if done by the application to a
description between createOffer/createAnswer and the
setLocalDescription, MUST be honored by the browser:

o  Remove or reorder codecs (m=)

The following parameters may be controlled by options passed into
createOffer/createAnswer. As an open issue, these changes may also
be performed by manipulating the SDP returned from
createOffer/createAnswer, as indicated above, as long as the
capabilities of the endpoint are not exceeded (e.g. asking for a
resolution greater than what the endpoint can encode):

o  [[OPEN ISSUE: This is a placeholder for other modifications, which
   we may continue adding as use cases appear.]]

Implementations MAY choose to either honor or reject any elements not
listed in the above two categories, but must do so explicitly as
described at the beginning of this section. Note that future
standards may add new SDP elements to the list of elements which must
be accepted or rejected, but due to version skew, applications must
be prepared for implementations to accept changes which must be
rejected and vice versa.
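
One way an implementation might detect changes to the elements that
MUST NOT be modified is sketched below. This is a non-normative
Python illustration, not the procedure of any particular browser. The
names PROTECTED_PREFIXES, protected_view and
check_local_modification are hypothetical. As a simplification, the
sketch compares the protected lines in document order, so it is
stricter than necessary: it would also flag a pure reordering of
protected attribute lines.

```python
# Attribute lines that correspond to the list of protected elements.
PROTECTED_PREFIXES = (
    "a=ice-ufrag:", "a=ice-pwd:", "a=candidate:", "a=fingerprint:",
    "a=group:BUNDLE", "a=bundle-only", "a=rtcp-mux",
)


def protected_view(sdp: str) -> list:
    """Extract only the parts of an SDP blob that must not change."""
    view = []
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            # Keep the media type and port. The format list is left
            # out so that removing or reordering codecs is permitted.
            media, port = line[2:].split(" ")[:2]
            view.append(("m", media, port))
        elif line.startswith(PROTECTED_PREFIXES):
            view.append(("a", line))
    return view


def check_local_modification(created: str, supplied: str) -> None:
    """Raise ValueError if a protected element was altered.

    created is the SDP returned by createOffer or createAnswer, and
    supplied is the SDP later handed to setLocalDescription.
    """
    if protected_view(created) != protected_view(supplied):
        raise ValueError("modification of a browser-controlled element")
```

Raising an error, rather than quietly restoring the original value,
reflects the requirement that changes are either accepted or
rejected explicitly.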
2572 The application can also modify the SDP to reduce the capabilities in
2573 the offer it sends to the far side or the offer that it installs from
2574 the far side in any way the application sees fit, as long as it is a
2575 valid SDP offer and specifies a subset of what was in the original
2576 offer. This is safe because the answer is not permitted to expand
2577 capabilities and therefore will just respond to what is actually in
2578 the offer.

2580 As always, the application is solely responsible for what it sends to
2581 the other party, and all incoming SDP will be processed by the
2582 browser to the extent of its capabilities. It is an error to assume
2583 that all SDP is well-formed; however, one should be able to assume
2584 that any implementation of this specification will be able to
2585 process, as a remote offer or answer, unmodified SDP coming from any
2586 other implementation of this specification.

2588 7. Examples

2590 Note that this example section shows several SDP fragments. To
2591 fit within 72 columns, some of the lines in SDP have been split into
2592 multiple lines, where leading whitespace indicates that a line is a
2593 continuation of the previous line. In addition, some blank lines
2594 have been added to improve readability but are not valid in SDP.

2596 More examples of SDP for WebRTC call flows can be found in
2597 [I-D.nandakumar-rtcweb-sdp].

2599 7.1. Simple Example

2601 This section shows a very simple example that sets up a minimal
2602 audio/video call between two browsers and does not use trickle ICE.
2603 The following section provides a more realistic example of
2604 what would happen in a normal browser-to-browser connection.

2606 The flow shows Alice's browser initiating the session to Bob's
2607 browser. The messages from Alice's JS to Bob's JS are assumed to
2608 flow over some signaling protocol via a web server.
The JS on both 2609 Alice's side and Bob's side waits for all candidates before sending 2610 the offer or answer, so the offers and answers are complete. Trickle 2611 ICE is not used. Both Alice and Bob are using the default policy of 2612 balanced. 2614 // set up local media state 2615 AliceJS->AliceUA: create new PeerConnection 2616 AliceJS->AliceUA: addStream with stream containing audio and video 2617 AliceJS->AliceUA: createOffer to get offer 2618 AliceJS->AliceUA: setLocalDescription with offer 2619 AliceUA->AliceJS: multiple onicecandidate events with candidates 2621 // wait for ICE gathering to complete 2622 AliceUA->AliceJS: onicecandidate event with null candidate 2623 AliceJS->AliceUA: get |offer-A1| from value of localDescription 2625 // |offer-A1| is sent over signaling protocol to Bob 2626 AliceJS->WebServer: signaling with |offer-A1| 2627 WebServer->BobJS: signaling with |offer-A1| 2629 // |offer-A1| arrives at Bob 2630 BobJS->BobUA: create a PeerConnection 2631 BobJS->BobUA: setRemoteDescription with |offer-A1| 2632 BobUA->BobJS: onaddstream event with remoteStream 2634 // Bob accepts call 2635 BobJS->BobUA: addStream with local media 2636 BobJS->BobUA: createAnswer 2637 BobJS->BobUA: setLocalDescription with answer 2638 BobUA->BobJS: multiple onicecandidate events with candidates 2640 // wait for ICE gathering to complete 2641 BobUA->BobJS: onicecandidate event with null candidate 2642 BobJS->BobUA: get |answer-A1| from value of localDescription 2644 // |answer-A1| is sent over signaling protocol to Alice 2645 BobJS->WebServer: signaling with |answer-A1| 2646 WebServer->AliceJS: signaling with |answer-A1| 2648 // |answer-A1| arrives at Alice 2649 AliceJS->AliceUA: setRemoteDescription with |answer-A1| 2650 AliceUA->AliceJS: onaddstream event with remoteStream 2652 // media flows 2653 BobUA->AliceUA: media sent from Bob to Alice 2654 AliceUA->BobUA: media sent from Alice to Bob 2656 The SDP for |offer-A1| looks like: 2658 v=0 2659 o=- 
4962303333179871722 1 IN IP4 0.0.0.0 2660 s=- 2661 t=0 0 2662 a=msid-semantic:WMS 2663 a=group:BUNDLE a1 v1 2664 m=audio 56500 UDP/TLS/RTP/SAVPF 96 0 8 97 98 2665 c=IN IP4 192.0.2.1 2666 a=mid:a1 2667 a=rtcp:56501 IN IP4 192.0.2.1 2668 a=msid:47017fee-b6c1-4162-929c-a25110252400 2669 f83006c5-a0ff-4e0a-9ed9-d3e6747be7d9 2670 a=sendrecv 2671 a=rtpmap:96 opus/48000/2 2672 a=rtpmap:0 PCMU/8000 2673 a=rtpmap:8 PCMA/8000 2674 a=rtpmap:97 telephone-event/8000 2675 a=rtpmap:98 telephone-event/48000 2676 a=maxptime:120 2677 a=ice-ufrag:ETEn1v9DoTMB9J4r 2678 a=ice-pwd:OtSK0WpNtpUjkY4+86js7ZQl 2679 a=ice-options:trickle 2680 a=fingerprint:sha-256 2681 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 2682 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 2683 a=setup:actpass 2684 a=rtcp-mux 2685 a=rtcp-rsize 2686 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level 2687 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 2688 a=ssrc:1732846380 cname:EocUG1f0fcg/yvY7 2689 a=candidate:3348148302 1 udp 2113937151 192.0.2.1 56500 2690 typ host 2691 a=candidate:3348148302 2 udp 2113937151 192.0.2.1 56501 2692 typ host 2693 a=end-of-candidates 2695 m=video 56502 UDP/TLS/RTP/SAVPF 100 101 2696 c=IN IP4 192.0.2.1 2697 a=rtcp:56503 IN IP4 192.0.2.1 2698 a=mid:v1 2699 a=msid:61317484-2ed4-49d7-9eb7-1414322a7aae 2700 f30bdb4a-5db8-49b5-bcdc-e0c9a23172e0 2701 a=sendrecv 2702 a=rtpmap:100 VP8/90000 2703 a=rtpmap:101 rtx/90000 2704 a=fmtp:101 apt=100 2705 a=ice-ufrag:BGKkWnG5GmiUpdIV 2706 a=ice-pwd:mqyWsAjvtKwTGnvhPztQ9mIf 2707 a=ice-options:trickle 2708 a=fingerprint:sha-256 2709 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 2711 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 2712 a=setup:actpass 2713 a=rtcp-mux 2714 a=rtcp-rsize 2715 a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:mid 2716 a=rtcp-fb:100 ccm fir 2717 a=rtcp-fb:100 nack 2718 a=rtcp-fb:100 nack pli 2719 a=ssrc:1366781083 cname:EocUG1f0fcg/yvY7 2720 a=ssrc:1366781084 cname:EocUG1f0fcg/yvY7 2721 a=ssrc-group:FID 1366781083 
1366781084
2722 a=candidate:3348148302 1 udp 2113937151 192.0.2.1 56502
2723     typ host
2724 a=candidate:3348148302 2 udp 2113937151 192.0.2.1 56503
2725     typ host
2726 a=end-of-candidates

2728 The SDP for |answer-A1| looks like:

2730 v=0
2731 o=- 6729291447651054566 1 IN IP4 0.0.0.0
2732 s=-
2733 t=0 0
2734 a=msid-semantic:WMS
2735 m=audio 20000 UDP/TLS/RTP/SAVPF 96 0 8 97 98
2736 c=IN IP4 192.0.2.2
2737 a=mid:a1
2738 a=rtcp:20000 IN IP4 192.0.2.2
2739 a=msid:PI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1
2740     PI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1a0
2741 a=sendrecv
2742 a=rtpmap:96 opus/48000/2
2743 a=rtpmap:0 PCMU/8000
2744 a=rtpmap:8 PCMA/8000
2745 a=rtpmap:97 telephone-event/8000
2746 a=rtpmap:98 telephone-event/48000
2747 a=maxptime:120
2748 a=ice-ufrag:6sFvz2gdLkEwjZEr
2749 a=ice-pwd:cOTZKZNVlO9RSGsEGM63JXT2
2750 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35
2751     :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08
2752 a=setup:active
2753 a=rtcp-mux
2754 a=rtcp-rsize
2755 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
2756 a=ssrc:3429951804 cname:Q/NWs1ao1HmN4Xa5
2757 a=candidate:2299743422 1 udp 2113937151 192.0.2.2 20000
2758     typ host

2760 a=end-of-candidates

2762 m=video 20001 UDP/TLS/RTP/SAVPF 100 101
2763 c=IN IP4 192.0.2.2
2764 a=rtcp:20001 IN IP4 192.0.2.2
2765 a=mid:v1
2766 a=msid:PI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1
2767     PI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1v0
2768 a=sendrecv
2769 a=rtpmap:100 VP8/90000
2770 a=rtpmap:101 rtx/90000
2771 a=fmtp:101 apt=100
2772 a=ice-ufrag:6sFvz2gdLkEwjZEr
2773 a=ice-pwd:cOTZKZNVlO9RSGsEGM63JXT2
2774 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35
2775     :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08
2776 a=setup:active
2777 a=rtcp-mux
2778 a=rtcp-rsize
2779 a=rtcp-fb:100 ccm fir
2780 a=rtcp-fb:100 nack
2781 a=rtcp-fb:100 nack pli
2782 a=ssrc:3229706345 cname:Q/NWs1ao1HmN4Xa5
2783 a=ssrc:3229706346 cname:Q/NWs1ao1HmN4Xa5
2784 a=ssrc-group:FID 3229706345 3229706346
2785
a=candidate:2299743422 1 udp 2113937151 192.0.2.2 20001 2786 typ host 2787 a=end-of-candidates 2789 7.2. Normal Examples 2791 This section shows a typical example of a session between two 2792 browsers setting up an audio channel and a data channel. Trickle ICE 2793 is used in full trickle mode with a bundle policy of max-bundle, an 2794 RTCP mux policy of require, and a single TURN server. Later, two 2795 video flows, one for the presenter and one for screen sharing, are 2796 added to the session. This example shows Alice's browser initiating 2797 the session to Bob's browser. The messages from Alice's JS to Bob's 2798 JS are assumed to flow over some signaling protocol via a web server. 2800 // set up local media state 2801 AliceJS->AliceUA: create new PeerConnection 2802 AliceJS->AliceUA: addStream that contains audio track 2803 AliceJS->AliceUA: createDataChannel to get data channel 2804 AliceJS->AliceUA: createOffer to get |offer-B1| 2805 AliceJS->AliceUA: setLocalDescription with |offer-B1| 2807 // |offer-B1| is sent over signaling protocol to Bob 2808 AliceJS->WebServer: signaling with |offer-B1| 2809 WebServer->BobJS: signaling with |offer-B1| 2811 // |offer-B1| arrives at Bob 2812 BobJS->BobUA: create a PeerConnection 2813 BobJS->BobUA: setRemoteDescription with |offer-B1| 2814 BobUA->BobJS: onaddstream with audio track from Alice 2816 // candidates are sent to Bob 2817 AliceUA->AliceJS: onicecandidate event with |candidate-B1| (host) 2818 AliceJS->WebServer: signaling with |candidate-B1| 2819 AliceUA->AliceJS: onicecandidate event with |candidate-B2| (srflx) 2820 AliceJS->WebServer: signaling with |candidate-B2| 2821 AliceUA->AliceJS: onicecandidate event with |candidate-B3| (relay) 2822 AliceJS->WebServer: signaling with |candidate-B3| 2824 WebServer->BobJS: signaling with |candidate-B1| 2825 BobJS->BobUA: addIceCandidate with |candidate-B1| 2826 WebServer->BobJS: signaling with |candidate-B2| 2827 BobJS->BobUA: addIceCandidate with |candidate-B2| 2828 
WebServer->BobJS: signaling with |candidate-B3| 2829 BobJS->BobUA: addIceCandidate with |candidate-B3| 2831 // Bob accepts call 2832 BobJS->BobUA: addStream with local audio stream 2833 BobJS->BobUA: createDataChannel to get data channel 2834 BobJS->BobUA: createAnswer to get |answer-B1| 2835 BobJS->BobUA: setLocalDescription with |answer-B1| 2837 // |answer-B1| is sent to Alice 2838 BobJS->WebServer: signaling with |answer-B1| 2839 WebServer->AliceJS: signaling with |answer-B1| 2840 AliceJS->AliceUA: setRemoteDescription with |answer-B1| 2841 AliceUA->AliceJS: onaddstream event with audio track from Bob 2843 // candidates are sent to Alice 2844 BobUA->BobJS: onicecandidate event with |candidate-B4| (host) 2845 BobJS->WebServer: signaling with |candidate-B4| 2846 BobUA->BobJS: onicecandidate event with |candidate-B5| (srflx) 2847 BobJS->WebServer: signaling with |candidate-B5| 2848 BobUA->BobJS: onicecandidate event with |candidate-B6| (relay) 2849 BobJS->WebServer: signaling with |candidate-B6| 2851 WebServer->AliceJS: signaling with |candidate-B4| 2852 AliceJS->AliceUA: addIceCandidate with |candidate-B4| 2853 WebServer->AliceJS: signaling with |candidate-B5| 2854 AliceJS->AliceUA: addIceCandidate with |candidate-B5| 2855 WebServer->AliceJS: signaling with |candidate-B6| 2856 AliceJS->AliceUA: addIceCandidate with |candidate-B6| 2858 // data channel opens 2859 BobUA->BobJS: ondatachannel event 2860 AliceUA->AliceJS: ondatachannel event 2861 BobUA->BobJS: onopen 2862 AliceUA->AliceJS: onopen 2864 // media is flowing between browsers 2865 BobUA->AliceUA: audio+data sent from Bob to Alice 2866 AliceUA->BobUA: audio+data sent from Alice to Bob 2868 // some time later Bob adds two video streams 2869 // note, no candidates exchanged, because of BUNDLE 2870 BobJS->BobUA: addStream with first video stream 2871 BobJS->BobUA: addStream with second video stream 2872 BobJS->BobUA: createOffer to get |offer-B2| 2873 BobJS->BobUA: setLocalDescription with |offer-B2| 2875 // 
|offer-B2| is sent to Alice 2876 BobJS->WebServer: signaling with |offer-B2| 2877 WebServer->AliceJS: signaling with |offer-B2| 2878 AliceJS->AliceUA: setRemoteDescription with |offer-B2| 2879 AliceUA->AliceJS: onaddstream event with first video stream 2880 AliceUA->AliceJS: onaddstream event with second video stream 2881 AliceJS->AliceUA: createAnswer to get |answer-B2| 2882 AliceJS->AliceUA: setLocalDescription with |answer-B2| 2884 // |answer-B2| is sent over signaling protocol to Bob 2885 AliceJS->WebServer: signaling with |answer-B2| 2886 WebServer->BobJS: signaling with |answer-B2| 2887 BobJS->BobUA: setRemoteDescription with |answer-B2| 2889 // media is flowing between browsers 2890 BobUA->AliceUA: audio+video+data sent from Bob to Alice 2891 AliceUA->BobUA: audio+video+data sent from Alice to Bob 2893 The SDP for |offer-B1| looks like: 2895 v=0 2896 o=- 4962303333179871723 1 IN IP4 0.0.0.0 2897 s=- 2898 t=0 0 2899 a=msid-semantic:WMS 2900 a=group:BUNDLE a1 d1 2901 m=audio 9 UDP/TLS/RTP/SAVPF 96 0 8 97 98 2902 c=IN IP4 0.0.0.0 2903 a=rtcp:9 IN IP4 0.0.0.0 2904 a=mid:a1 2905 a=msid:57017fee-b6c1-4162-929c-a25110252400 2906 e83006c5-a0ff-4e0a-9ed9-d3e6747be7d9 2907 a=sendrecv 2908 a=rtpmap:96 opus/48000/2 2909 a=rtpmap:0 PCMU/8000 2910 a=rtpmap:8 PCMA/8000 2911 a=rtpmap:97 telephone-event/8000 2912 a=rtpmap:98 telephone-event/48000 2913 a=maxptime:120 2914 a=ice-ufrag:ATEn1v9DoTMB9J4r 2915 a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl 2916 a=ice-options:trickle 2917 a=fingerprint:sha-256 2918 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 2919 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 2920 a=setup:actpass 2921 a=rtcp-mux 2922 a=rtcp-rsize 2923 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level 2924 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 2925 a=ssrc:1732846380 cname:FocUG1f0fcg/yvY7 2927 m=application 9 UDP/DTLS/SCTP webrtc-datachannel 2928 c=IN IP4 0.0.0.0 2929 a=mid:d1 2930 a=fmtp:webrtc-datachannel max-message-size=65536 2931 a=sctp-port 5000 2932 
a=ice-ufrag:ATEn1v9DoTMB9J4r 2933 a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl 2934 a=ice-options:trickle 2935 a=fingerprint:sha-256 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 2936 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 2937 a=setup:actpass 2939 The SDP for |candidate-B1| looks like: 2941 candidate:109270923 1 udp 2122194687 192.168.1.2 51556 typ host 2942 The SDP for |candidate-B2| looks like: 2944 candidate:4036177503 1 udp 1685987071 11.22.33.44 52546 typ srflx 2945 raddr 192.168.1.2 rport 51556 2947 The SDP for |candidate-B3| looks like: 2949 candidate:3671762466 1 udp 41819903 22.33.44.55 61405 typ relay 2950 raddr 11.22.33.44 rport 52546 2952 The SDP for |answer-B1| looks like: 2954 v=0 2955 o=- 7729291447651054566 1 IN IP4 0.0.0.0 2956 s=- 2957 t=0 0 2958 a=msid-semantic:WMS 2959 a=group:BUNDLE a1 d1 2960 m=audio 9 UDP/TLS/RTP/SAVPF 96 0 8 97 98 2961 c=IN IP4 0.0.0.0 2962 a=rtcp:9 IN IP4 0.0.0.0 2963 a=mid:a1 2964 a=msid:QI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1 2965 QI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1a0 2966 a=sendrecv 2967 a=rtpmap:96 opus/48000/2 2968 a=rtpmap:0 PCMU/8000 2969 a=rtpmap:8 PCMA/8000 2970 a=rtpmap:97 telephone-event/8000 2971 a=rtpmap:98 telephone-event/48000 2972 a=maxptime:120 2973 a=ice-ufrag:7sFvz2gdLkEwjZEr 2974 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2 2975 a=ice-options:trickle 2976 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35 2977 :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08 2978 a=setup:active 2979 a=rtcp-mux 2980 a=rtcp-rsize 2981 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level 2982 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 2983 a=ssrc:4429951804 cname:Q/NWs1ao1HmN4Xa5 2985 m=application 9 UDP/DTLS/SCTP webrtc-datachannel 2986 c=IN IP4 0.0.0.0 2987 a=mid:d1 2988 a=fmtp:webrtc-datachannel max-message-size=65536 2989 a=sctp-port 5000 2990 a=ice-ufrag:7sFvz2gdLkEwjZEr 2991 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2 2992 a=ice-options:trickle 2993 a=fingerprint:sha-256 
6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35 2994 :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08 2995 a=setup:active 2997 The SDP for |candidate-B4| looks like: 2999 candidate:109270924 1 udp 2122194687 192.168.2.3 61665 typ host 3001 The SDP for |candidate-B5| looks like: 3003 candidate:4036177504 1 udp 1685987071 55.66.77.88 64532 typ srflx 3004 raddr 192.168.2.3 rport 61665 3006 The SDP for |candidate-B6| looks like: 3008 candidate:3671762467 1 udp 41819903 66.77.88.99 50416 typ relay 3009 raddr 55.66.77.88 rport 64532 3011 The SDP for |offer-B2| looks like: (note the increment of the version 3012 number in the o= line, and the c= and a=rtcp lines, which indicate 3013 the local candidate that was selected) 3015 v=0 3016 o=- 7729291447651054566 2 IN IP4 0.0.0.0 3017 s=- 3018 t=0 0 3019 a=msid-semantic:WMS 3020 a=group:BUNDLE a1 d1 v1 v2 3021 m=audio 64532 UDP/TLS/RTP/SAVPF 96 0 8 97 98 3022 c=IN IP4 55.66.77.88 3023 a=rtcp:64532 IN IP4 55.66.77.88 3024 a=mid:a1 3025 a=msid:QI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1 3026 QI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1a0 3027 a=sendrecv 3028 a=rtpmap:96 opus/48000/2 3029 a=rtpmap:0 PCMU/8000 3030 a=rtpmap:8 PCMA/8000 3031 a=rtpmap:97 telephone-event/8000 3032 a=rtpmap:98 telephone-event/48000 3033 a=maxptime:120 3034 a=ice-ufrag:7sFvz2gdLkEwjZEr 3035 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2 3036 a=ice-options:trickle 3037 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35 3038 :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08 3039 a=setup:actpass 3040 a=rtcp-mux 3041 a=rtcp-rsize 3042 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level 3043 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 3044 a=ssrc:4429951804 cname:Q/NWs1ao1HmN4Xa5 3045 a=candidate:109270924 1 udp 2122194687 192.168.2.3 61665 typ host 3046 a=candidate:4036177504 1 udp 1685987071 55.66.77.88 64532 typ srflx 3047 raddr 192.168.2.3 rport 61665 3048 a=candidate:3671762467 1 udp 41819903 66.77.88.99 50416 typ relay 3049 raddr 55.66.77.88 rport 64532 
3050 a=end-of-candidates 3051 m=application 64532 UDP/DTLS/SCTP webrtc-datachannel 3052 c=IN IP4 55.66.77.88 3053 a=mid:d1 3054 a=fmtp:webrtc-datachannel max-message-size=65536 3055 a=sctp-port 5000 3056 a=ice-ufrag:7sFvz2gdLkEwjZEr 3057 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2 3058 a=ice-options:trickle 3059 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35 3060 :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08 3061 a=setup:actpass 3062 a=candidate:109270924 1 udp 2122194687 192.168.2.3 61665 typ host 3063 a=candidate:4036177504 1 udp 1685987071 55.66.77.88 64532 typ srflx 3064 raddr 192.168.2.3 rport 61665 3065 a=candidate:3671762467 1 udp 41819903 66.77.88.99 50416 typ relay 3066 raddr 55.66.77.88 rport 64532 3067 a=end-of-candidates 3069 m=video 64532 UDP/TLS/RTP/SAVPF 100 101 3070 c=IN IP4 55.66.77.88 3071 a=rtcp:64532 IN IP4 55.66.77.88 3072 a=mid:v1 3073 a=msid:61317484-2ed4-49d7-9eb7-1414322a7aae 3074 f30bdb4a-5db8-49b5-bcdc-e0c9a23172e0 3075 a=sendrecv 3076 a=rtpmap:100 VP8/90000 3077 a=rtpmap:101 rtx/90000 3078 a=fmtp:101 apt=100 3079 a=ice-ufrag:7sFvz2gdLkEwjZEr 3080 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2 3081 a=ice-options:trickle 3082 a=fingerprint:sha-256 3083 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 3084 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3085 a=setup:actpass 3086 a=rtcp-mux 3087 a=rtcp-rsize 3088 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 3089 a=rtcp-fb:100 ccm fir 3090 a=rtcp-fb:100 nack 3091 a=rtcp-fb:100 nack pli 3092 a=ssrc:1366781083 cname:Q/NWs1ao1HmN4Xa5 3093 a=ssrc:1366781084 cname:Q/NWs1ao1HmN4Xa5 3094 a=ssrc-group:FID 1366781083 1366781084 3095 a=candidate:109270924 1 udp 2122194687 192.168.2.3 61665 typ host 3096 a=candidate:4036177504 1 udp 1685987071 55.66.77.88 64532 typ srflx 3097 raddr 192.168.2.3 rport 61665 3098 a=candidate:3671762467 1 udp 41819903 66.77.88.99 50416 typ relay 3099 raddr 55.66.77.88 rport 64532 3100 a=end-of-candidates 3102 m=video 64532 UDP/TLS/RTP/SAVPF 100 101 3103 c=IN IP4 
55.66.77.88
3104 a=rtcp:64532 IN IP4 55.66.77.88
3105 a=mid:v2
3106 a=msid:71317484-2ed4-49d7-9eb7-1414322a7aae
3107     f30bdb4a-5db8-49b5-bcdc-e0c9a23172e0
3108 a=sendrecv
3109 a=rtpmap:100 VP8/90000
3110 a=rtpmap:101 rtx/90000
3111 a=fmtp:101 apt=100
3112 a=ice-ufrag:7sFvz2gdLkEwjZEr
3113 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2
3114 a=ice-options:trickle
3115 a=fingerprint:sha-256
3116     19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04
3117     :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2
3118 a=setup:actpass
3119 a=rtcp-mux
3120 a=rtcp-rsize
3121 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid
3122 a=rtcp-fb:100 ccm fir
3123 a=rtcp-fb:100 nack
3124 a=rtcp-fb:100 nack pli
3125 a=ssrc:2366781083 cname:Q/NWs1ao1HmN4Xa5
3126 a=ssrc:2366781084 cname:Q/NWs1ao1HmN4Xa5
3127 a=ssrc-group:FID 2366781083 2366781084
3128 a=candidate:109270924 1 udp 2122194687 192.168.2.3 61665 typ host
3129 a=candidate:4036177504 1 udp 1685987071 55.66.77.88 64532 typ srflx
3130     raddr 192.168.2.3 rport 61665
3131 a=candidate:3671762467 1 udp 41819903 66.77.88.99 50416 typ relay
3132     raddr 55.66.77.88 rport 64532
3133 a=end-of-candidates

3135 The SDP for |answer-B2| looks like: (note the use of setup:passive to
3136 maintain the existing DTLS roles, and the use of a=recvonly to
3137 indicate that the video streams are one-way)

3139 v=0
3140 o=- 4962303333179871723 2 IN IP4 0.0.0.0
3141 s=-
3142 t=0 0
3143 a=msid-semantic:WMS
3144 a=group:BUNDLE a1 d1 v1 v2
3145 m=audio 52546 UDP/TLS/RTP/SAVPF 96 0 8 97 98
3146 c=IN IP4 11.22.33.44
3147 a=rtcp:52546 IN IP4 11.22.33.44
3148 a=mid:a1
3149 a=msid:57017fee-b6c1-4162-929c-a25110252400
3150     e83006c5-a0ff-4e0a-9ed9-d3e6747be7d9
3151 a=sendrecv
3152 a=rtpmap:96 opus/48000/2
3153 a=rtpmap:0 PCMU/8000
3154 a=rtpmap:8 PCMA/8000
3155 a=rtpmap:97 telephone-event/8000
3156 a=rtpmap:98 telephone-event/48000
3157 a=maxptime:120
3158 a=ice-ufrag:ATEn1v9DoTMB9J4r
3159 a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl
3160 a=ice-options:trickle
3161 a=fingerprint:sha-256
3162
19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 3163 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3164 a=setup:passive 3165 a=rtcp-mux 3166 a=rtcp-rsize 3167 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level 3168 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 3169 a=ssrc:1732846380 cname:FocUG1f0fcg/yvY7 3170 a=candidate:109270923 1 udp 2122194687 192.168.1.2 51556 typ host 3171 a=candidate:4036177503 1 udp 1685987071 11.22.33.44 52546 typ srflx 3172 raddr 192.168.1.2 rport 51556 3173 a=candidate:3671762466 1 udp 41819903 22.33.44.55 61405 typ relay 3174 raddr 11.22.33.44 rport 52546 3175 a=end-of-candidates 3177 m=application 52546 UDP/DTLS/SCTP webrtc-datachannel 3178 c=IN IP4 11.22.33.44 3179 a=mid:d1 3180 a=fmtp:webrtc-datachannel max-message-size=65536 3181 a=sctp-port 5000 3182 a=ice-ufrag:ATEn1v9DoTMB9J4r 3183 a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl 3184 a=ice-options:trickle 3185 a=fingerprint:sha-256 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 3186 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3187 a=setup:passive 3188 a=candidate:109270923 1 udp 2122194687 192.168.1.2 51556 typ host 3189 a=candidate:4036177503 1 udp 1685987071 11.22.33.44 52546 typ srflx 3190 raddr 192.168.1.2 rport 51556 3191 a=candidate:3671762466 1 udp 41819903 22.33.44.55 61405 typ relay 3192 raddr 11.22.33.44 rport 52546 3193 a=end-of-candidates 3194 m=video 52546 UDP/TLS/RTP/SAVPF 100 101 3195 c=IN IP4 11.22.33.44 3196 a=rtcp:52546 IN IP4 11.22.33.44 3197 a=mid:v1 3198 a=recvonly 3199 a=rtpmap:100 VP8/90000 3200 a=rtpmap:101 rtx/90000 3201 a=fmtp:101 apt=100 3202 a=ice-ufrag:ATEn1v9DoTMB9J4r 3203 a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl 3204 a=ice-options:trickle 3205 a=fingerprint:sha-256 3206 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 3207 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3208 a=setup:passive 3209 a=rtcp-mux 3210 a=rtcp-rsize 3211 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 3212 a=rtcp-fb:100 ccm fir 3213 a=rtcp-fb:100 nack 3214 a=rtcp-fb:100 nack pli 
3215 a=candidate:109270923 1 udp 2122194687 192.168.1.2 51556 typ host
3216 a=candidate:4036177503 1 udp 1685987071 11.22.33.44 52546 typ srflx
3217     raddr 192.168.1.2 rport 51556
3218 a=candidate:3671762466 1 udp 41819903 22.33.44.55 61405 typ relay
3219     raddr 11.22.33.44 rport 52546
3220 a=end-of-candidates

3222 m=video 52546 UDP/TLS/RTP/SAVPF 100 101
3223 c=IN IP4 11.22.33.44
3224 a=rtcp:52546 IN IP4 11.22.33.44
3225 a=mid:v2
3226 a=recvonly
3227 a=rtpmap:100 VP8/90000
3228 a=rtpmap:101 rtx/90000
3229 a=fmtp:101 apt=100
3230 a=ice-ufrag:ATEn1v9DoTMB9J4r
3231 a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl
3232 a=ice-options:trickle
3233 a=fingerprint:sha-256
3234     19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04
3235     :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2
3236 a=setup:passive
3237 a=rtcp-mux
3238 a=rtcp-rsize
3239 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid
3240 a=rtcp-fb:100 ccm fir
3241 a=rtcp-fb:100 nack
3242 a=rtcp-fb:100 nack pli
3243 a=candidate:109270923 1 udp 2122194687 192.168.1.2 51556 typ host
3244 a=candidate:4036177503 1 udp 1685987071 11.22.33.44 52546 typ srflx
3245     raddr 192.168.1.2 rport 51556
3246 a=candidate:3671762466 1 udp 41819903 22.33.44.55 61405 typ relay
3247     raddr 11.22.33.44 rport 52546
3248 a=end-of-candidates

3250 8. Security Considerations

3252 The IETF has published separate documents
3253 [I-D.ietf-rtcweb-security-arch] [I-D.ietf-rtcweb-security] describing
3254 the security architecture for WebRTC as a whole. The remainder of
3255 this section describes security considerations for this document.

3257 While formally the JSEP interface is an API, it is better to think of
3258 it as an Internet protocol, with the JS being untrustworthy from the
3259 perspective of the browser. Thus, the threat model of [RFC3552]
3260 applies. In particular, JS can call the API in any order and with
3261 any inputs, including malicious ones. This is particularly relevant
3262 when we consider the SDP which is passed to setLocalDescription().
3263 While correct API usage requires that the application pass in SDP
3264 which was derived from createOffer() or createAnswer() (perhaps
3265 suitably modified as described in Section 6), there is no guarantee
3266 that applications do so. The browser MUST be prepared for the JS to
3267 pass in bogus data instead.

3269 Conversely, the application programmer MUST recognize that the JS
3270 does not have complete control of browser behavior. One case that
3271 bears particular mention is that editing ICE candidates out of the
3272 SDP or suppressing trickled candidates does not have the expected
3273 effect: implementations will still perform checks from those
3274 candidates even if they are not sent to the other side. Thus, for
3275 instance, it is not possible to prevent the remote peer from learning
3276 your public IP address by removing server reflexive candidates.
3277 Applications which wish to conceal their public IP address should
3278 instead configure the ICE agent to use only relay candidates.

3280 9. IANA Considerations

3282 This document requires no actions from IANA.

3284 10. Acknowledgements

3286 Significant text incorporated in the draft, as well as review, was
3287 provided by Harald Alvestrand and Suhas Nandakumar. Dan Burnett,
3288 Neil Stratford, Eric Rescorla, Anant Narayanan, Andrew Hutton,
3289 Richard Ejzak, Adam Bergkvist and Matthew Kaufman all provided
3290 valuable feedback on this proposal.

3292 11. References

3294 11.1. Normative References

3296 [I-D.ietf-mmusic-msid]
3297 Alvestrand, H., "Cross Session Stream Identification in
3298 the Session Description Protocol",
3299 draft-ietf-mmusic-msid-01 (work in progress), August 2013.

3301 [I-D.ietf-mmusic-sctp-sdp]
3302 Loreto, S. and G. Camarillo, "Stream Control Transmission
3303 Protocol (SCTP)-Based Media Transport in the Session
3304 Description Protocol (SDP)", draft-ietf-mmusic-sctp-sdp-04
3305 (work in progress), June 2013.
3307 [I-D.ietf-mmusic-sdp-bundle-negotiation] 3308 Holmberg, C., Alvestrand, H., and C. Jennings, 3309 "Multiplexing Negotiation Using Session Description 3310 Protocol (SDP) Port Numbers", draft-ietf-mmusic-sdp- 3311 bundle-negotiation-04 (work in progress), June 2013. 3313 [I-D.ietf-mmusic-sdp-mux-attributes] 3314 Nandakumar, S., "A Framework for SDP Attributes when 3315 Multiplexing", draft-ietf-mmusic-sdp-mux-attributes-01 3316 (work in progress), February 2014. 3318 [I-D.ietf-mmusic-trickle-ice] 3319 Ivov, E., Rescorla, E., and J. Uberti, "Trickle ICE: 3320 Incremental Provisioning of Candidates for the Interactive 3321 Connectivity Establishment (ICE) Protocol", draft-ietf- 3322 mmusic-trickle-ice-00 (work in progress), March 2013. 3324 [I-D.ietf-rtcweb-audio] 3325 Valin, J. and C. Bran, "WebRTC Audio Codec and Processing 3326 Requirements", draft-ietf-rtcweb-audio-02 (work in 3327 progress), August 2013. 3329 [I-D.ietf-rtcweb-data-protocol] 3330 Jesup, R., Loreto, S., and M. Tuexen, "WebRTC Data Channel 3331 Protocol", draft-ietf-rtcweb-data-protocol-04 (work in 3332 progress), February 2013. 3334 [I-D.ietf-rtcweb-fec] 3335 Uberti, J., "WebRTC Forward Error Correction 3336 Requirements", draft-ietf-rtcweb-fec-00 (work in 3337 progress), February 2015. 3339 [I-D.ietf-rtcweb-rtp-usage] 3340 Perkins, C., Westerlund, M., and J. Ott, "Web Real-Time 3341 Communication (WebRTC): Media Transport and Use of RTP", 3342 draft-ietf-rtcweb-rtp-usage-09 (work in progress), 3343 September 2013. 3345 [I-D.ietf-rtcweb-security] 3346 Rescorla, E., "Security Considerations for WebRTC", draft- 3347 ietf-rtcweb-security-06 (work in progress), January 2014. 3349 [I-D.ietf-rtcweb-security-arch] 3350 Rescorla, E., "WebRTC Security Architecture", draft-ietf- 3351 rtcweb-security-arch-09 (work in progress), February 2014. 3353 [I-D.ietf-rtcweb-video] 3354 Roach, A., "WebRTC Video Processing and Codec 3355 Requirements", draft-ietf-rtcweb-video-00 (work in 3356 progress), July 2014. 
3358 [I-D.nandakumar-mmusic-proto-iana-registration] 3359 Nandakumar, S., "IANA registration of SDP 'proto' 3360 attribute for transporting RTP Media over TCP under 3361 various RTP profiles.", September 2014. 3363 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 3364 Requirement Levels", BCP 14, RFC 2119, March 1997. 3366 [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, 3367 A., Peterson, J., Sparks, R., Handley, M., and E. 3368 Schooler, "SIP: Session Initiation Protocol", RFC 3261, 3369 June 2002. 3371 [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model 3372 with Session Description Protocol (SDP)", RFC 3264, June 3373 2002. 3375 [RFC3552] Rescorla, E. and B. Korver, "Guidelines for Writing RFC 3376 Text on Security Considerations", BCP 72, RFC 3552, July 3377 2003. 3379 [RFC3605] Huitema, C., "Real Time Control Protocol (RTCP) attribute 3380 in Session Description Protocol (SDP)", RFC 3605, October 3381 2003. 3383 [RFC4145] Yon, D. and G. Camarillo, "TCP-Based Media Transport in 3384 the Session Description Protocol (SDP)", RFC 4145, 3385 September 2005. 3387 [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session 3388 Description Protocol", RFC 4566, July 2006. 3390 [RFC4572] Lennox, J., "Connection-Oriented Media Transport over the 3391 Transport Layer Security (TLS) Protocol in the Session 3392 Description Protocol (SDP)", RFC 4572, July 2006. 3394 [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey, 3395 "Extended RTP Profile for Real-time Transport Control 3396 Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July 3397 2006. 3399 [RFC5124] Ott, J. and E. Carrara, "Extended Secure RTP Profile for 3400 Real-time Transport Control Protocol (RTCP)-Based Feedback 3401 (RTP/SAVPF)", RFC 5124, February 2008. 
3403 [RFC5245] Rosenberg, J., "Interactive Connectivity Establishment 3404 (ICE): A Protocol for Network Address Translator (NAT) 3405 Traversal for Offer/Answer Protocols", RFC 5245, April 3406 2010. 3408 [RFC5285] Singer, D. and H. Desineni, "A General Mechanism for RTP 3409 Header Extensions", RFC 5285, July 2008. 3411 [RFC5761] Perkins, C. and M. Westerlund, "Multiplexing RTP Data and 3412 Control Packets on a Single Port", RFC 5761, April 2010. 3414 [RFC5888] Camarillo, G. and H. Schulzrinne, "The Session Description 3415 Protocol (SDP) Grouping Framework", RFC 5888, June 2010. 3417 [RFC6236] Johansson, I. and K. Jung, "Negotiation of Generic Image 3418 Attributes in the Session Description Protocol (SDP)", RFC 3419 6236, May 2011. 3421 [RFC6347] Rescorla, E. and N. Modadugu, "Datagram Transport Layer 3422 Security Version 1.2", RFC 6347, January 2012. 3424 [RFC6904] Lennox, J., "Encryption of Header Extensions in the Secure 3425 Real-time Transport Protocol (SRTP)", RFC 6904, April 3426 2013. 3428 [RFC7022] Begen, A., Perkins, C., Wing, D., and E. Rescorla, 3429 "Guidelines for Choosing RTP Control Protocol (RTCP) 3430 Canonical Names (CNAMEs)", RFC 7022, September 2013. 3432 11.2. Informative References 3434 [I-D.nandakumar-rtcweb-sdp] 3435 Nandakumar, S. and C. Jennings, "SDP for the WebRTC", 3436 draft-nandakumar-rtcweb-sdp-02 (work in progress), July 3437 2013. 3439 [RFC3389] Zopf, R., "Real-time Transport Protocol (RTP) Payload for 3440 Comfort Noise (CN)", RFC 3389, September 2002. 3442 [RFC3556] Casner, S., "Session Description Protocol (SDP) Bandwidth 3443 Modifiers for RTP Control Protocol (RTCP) Bandwidth", RFC 3444 3556, July 2003. 3446 [RFC3960] Camarillo, G. and H. Schulzrinne, "Early Media and Ringing 3447 Tone Generation in the Session Initiation Protocol (SIP)", 3448 RFC 3960, December 2004. 3450 [RFC4568] Andreasen, F., Baugher, M., and D. 
                Wing, "Session
3451                Description Protocol (SDP) Security Descriptions for Media
3452                Streams", RFC 4568, July 2006.

3454     [RFC4588]  Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R.
3455                Hakenberg, "RTP Retransmission Payload Format", RFC 4588,
3456                July 2006.

3458     [RFC5506]  Johansson, I. and M. Westerlund, "Support for Reduced-Size
3459                Real-Time Transport Control Protocol (RTCP): Opportunities
3460                and Consequences", RFC 5506, April 2009.

3462     [RFC5576]  Lennox, J., Ott, J., and T. Schierl, "Source-Specific
3463                Media Attributes in the Session Description Protocol
3464                (SDP)", RFC 5576, June 2009.

3466     [RFC5763]  Fischl, J., Tschofenig, H., and E. Rescorla, "Framework
3467                for Establishing a Secure Real-time Transport Protocol
3468                (SRTP) Security Context Using Datagram Transport Layer
3469                Security (DTLS)", RFC 5763, May 2010.

3471     [RFC5764]  McGrew, D. and E. Rescorla, "Datagram Transport Layer
3472                Security (DTLS) Extension to Establish Keys for the Secure
3473                Real-time Transport Protocol (SRTP)", RFC 5764, May 2010.

3475     [RFC5956]  Begen, A., "Forward Error Correction Grouping Semantics in
3476                the Session Description Protocol", RFC 5956, September
3477                2010.

3479     [W3C.WD-webrtc-20140617]
3480                Bergkvist, A., Burnett, D., Narayanan, A., and C.
3481                Jennings, "WebRTC 1.0: Real-time Communication Between
3482                Browsers", World Wide Web Consortium WD WD-webrtc-
3483                20140617, June 2014,
3484                .

3486  Appendix A.  Change log

3488     Note: This section will be removed by RFC Editor before publication.

3490     Changes in draft-12:

3492     o  Filled in sections on applying local and remote descriptions.

3494     o  Discussed downscaling and upscaling to fulfill imageattr
3495        requirements.

3497     o  Updated what SDP can be modified by the application.

3499     o  Updated to latest datachannel SDP.

3501     o  Allowed multiple fingerprint lines.

3503     o  Switched back to IPv4 for dummy candidates.

3505     o  Added additional clarity on ICE default candidates.
3507     Changes in draft-11:

3509     o  Clarified handling of RTP CNAMEs.

3511     o  Updated what SDP lines should be processed or ignored.

3513     o  Specified how a=imageattr should be used.

3515     Changes in draft-10:

3517     o  TODO

3519     Changes in draft-09:

3521     o  Don't return null for {local,remote}Description after close().

3523     o  Changed TCP/TLS to UDP/DTLS in RTP profile names.

3525     o  Separated out bundle and mux policy.

3527     o  Added specific references to FEC mechanisms.

3529     o  Added canTrickle mechanism.

3531     o  Added section on subsequent answers and answer options.

3533     o  Added text defining set{Local,Remote}Description behavior.

3535     Changes in draft-08:

3537     o  Added new example section and removed old examples in appendix.

3539     o  Fixed field handling.

3541     o  Added text describing a=rtcp attribute.

3543     o  Reworked handling of OfferToReceiveAudio and OfferToReceiveVideo
3544        per discussion at IETF 90.

3546     o  Reworked trickle ICE handling and its impact on m= and c= lines
3547        per discussion at interim.

3549     o  Added max-bundle-and-rtcp-mux policy.

3551     o  Added description of maxptime handling.

3553     o  Updated ICE candidate pool default to 0.

3555     o  Resolved open issues around AppID/receiver-ID.

3557     o  Reworked and expanded how changes to the ICE configuration are
3558        handled.

3560     o  Some reference updates.

3562     o  Editorial clarification.

3564     Changes in draft-07:

3566     o  Expanded discussion of VAD and Opus DTX.

3568     o  Added a security considerations section.

3570     o  Rewrote the section on modifying SDP to require implementations to
3571        clearly indicate whether any given modification is allowed.

3573     o  Clarified impact of IceRestart on CreateOffer in local-offer
3574        state.

3576     o  Added guidance on whether attributes should be defined at the media
3577        level or the session level.

3579     o  Renamed "default" bundle policy to "balanced".

3581     o  Removed default ICE candidate pool size and clarified how it works.

3583     o  Defined a canonical order for assignment of MSTs to m= lines.
3585     o  Removed discussion of rehydration.

3587     o  Added Eric Rescorla as a draft editor.

3589     o  Cleaned up references.

3591     o  Editorial cleanup.

3593     Changes in draft-06:

3595     o  Reworked handling of m= line recycling.

3597     o  Added handling of BUNDLE and bundle-only.

3599     o  Clarified handling of rollback.

3601     o  Added text describing the ICE Candidate Pool and its behavior.

3603     o  Allowed OfferToReceiveX to create multiple recvonly m= sections.

3605     Changes in draft-05:

3607     o  Fixed several issues identified in the createOffer/Answer sections
3608        during document review.

3610     o  Updated references.

3612     Changes in draft-04:

3614     o  Filled in sections on createOffer and createAnswer.

3616     o  Added SDP examples.

3618     o  Fixed references.

3620     Changes in draft-03:

3622     o  Added text describing relationship to W3C specification.

3623     Changes in draft-02:

3625     o  Converted from nroff.

3627     o  Removed comparisons to old approaches abandoned by the working
3628        group.

3630     o  Removed material that has moved to the W3C specification.

3632     o  Aligned SDP handling with W3C draft.

3634     o  Clarified section on forking.

3636     Changes in draft-01:

3638     o  Added diagrams for architecture and state machine.

3640     o  Added sections on forking and rehydration.

3642     o  Clarified meaning of "pranswer" and "answer".

3644     o  Reworked how ICE restarts and media directions are controlled.

3646     o  Added list of parameters that can be changed in a description.

3648     o  Updated suggested API and examples to match latest thinking.

3650     o  Suggested API and examples have been moved to an appendix.

3652     Changes in draft-00:

3654     o  Migrated from draft-uberti-rtcweb-jsep-02.
3656  Authors' Addresses

3658     Justin Uberti
3659     Google
3660     747 6th Ave S
3661     Kirkland, WA  98033
3662     USA

3664     Email: justin@uberti.name

3665     Cullen Jennings
3666     Cisco
3667     170 West Tasman Drive
3668     San Jose, CA  95134
3669     USA

3671     Email: fluffy@iii.ca

3673     Eric Rescorla (editor)
3674     Mozilla
3675     331 Evelyn Ave
3676     Mountain View, CA  94041
3677     USA

3679     Email: ekr@rtfm.com