Network Working Group                                          J. Uberti
Internet-Draft                                                    Google
Intended status: Standards Track                             C. Jennings
Expires: September 10, 2016                                        Cisco
                                                        E. Rescorla, Ed.
                                                                 Mozilla
                                                           March 9, 2016

               Javascript Session Establishment Protocol
                       draft-ietf-rtcweb-jsep-13

Abstract

   This document describes the mechanisms for allowing a Javascript
   application to control the signaling plane of a multimedia session
   via the interface specified in the W3C RTCPeerConnection API, and
   discusses how this relates to existing signaling protocols.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 10, 2016.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  General Design of JSEP
     1.2.  Other Approaches Considered
   2.  Terminology
   3.  Semantics and Syntax
     3.1.  Signaling Model
     3.2.  Session Descriptions and State Machine
     3.3.  Session Description Format
     3.4.  ICE
       3.4.1.  ICE Gathering Overview
       3.4.2.  ICE Candidate Trickling
         3.4.2.1.  ICE Candidate Format
       3.4.3.  ICE Candidate Policy
       3.4.4.  ICE Candidate Pool
     3.5.  Video Size Negotiation
       3.5.1.  Creating an imageattr Attribute
       3.5.2.  Interpreting an imageattr Attribute
     3.6.  Interactions With Forking
       3.6.1.  Sequential Forking
       3.6.2.  Parallel Forking
   4.  Interface
     4.1.  Methods
       4.1.1.  Constructor
       4.1.2.  createOffer
       4.1.3.  createAnswer
       4.1.4.  SessionDescriptionType
         4.1.4.1.  Use of Provisional Answers
         4.1.4.2.  Rollback
       4.1.5.  setLocalDescription
       4.1.6.  setRemoteDescription
       4.1.7.  currentLocalDescription
       4.1.8.  pendingLocalDescription
       4.1.9.  currentRemoteDescription
       4.1.10. pendingRemoteDescription
       4.1.11. canTrickleIceCandidates
       4.1.12. setConfiguration
       4.1.13. addIceCandidate
   5.  SDP Interaction Procedures
     5.1.  Requirements Overview
       5.1.1.  Implementation Requirements
       5.1.2.  Usage Requirements
       5.1.3.  Profile Names and Interoperability
     5.2.  Constructing an Offer
       5.2.1.  Initial Offers
       5.2.2.  Subsequent Offers
       5.2.3.  Options Handling
         5.2.3.1.  OfferToReceiveAudio
         5.2.3.2.  OfferToReceiveVideo
         5.2.3.3.  IceRestart
         5.2.3.4.  VoiceActivityDetection
     5.3.  Generating an Answer
       5.3.1.  Initial Answers
       5.3.2.  Subsequent Answers
       5.3.3.  Options Handling
         5.3.3.1.  VoiceActivityDetection
     5.4.  Processing a Local Description
     5.5.  Processing a Remote Description
     5.6.  Parsing a Session Description
       5.6.1.  Session-Level Parsing
       5.6.2.  Media Section Parsing
       5.6.3.  Semantics Verification
     5.7.  Applying a Local Description
     5.8.  Applying a Remote Description
     5.9.  Applying an Answer
   6.  Configurable SDP Parameters
   7.  Examples
     7.1.  Simple Example
     7.2.  Normal Examples
   8.  Security Considerations
   9.  IANA Considerations
   10. Acknowledgements
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Appendix A.  Change log
   Authors' Addresses

1.  Introduction

   This document describes how the W3C WebRTC RTCPeerConnection
   interface [W3C.WD-webrtc-20140617] is used to control the setup,
   management and teardown of a multimedia session.

1.1.  General Design of JSEP

   The thinking behind WebRTC call setup has been to fully specify and
   control the media plane, but to leave the signaling plane up to the
   application as much as possible. The rationale is that different
   applications may prefer to use different protocols, such as the
   existing SIP or Jingle call signaling protocols, or something custom
   to the particular application, perhaps for a novel use case. In this
   approach, the key information that needs to be exchanged is the
   multimedia session description, which specifies the transport and
   media configuration information necessary to establish the media
   plane.

   With these considerations in mind, this document describes the
   Javascript Session Establishment Protocol (JSEP) that allows for full
   control of the signaling state machine from Javascript. JSEP removes
   the browser almost entirely from the core signaling flow, which is
   instead handled by the Javascript making use of two interfaces: (1)
   passing in local and remote session descriptions and (2) interacting
   with the ICE state machine.

   In this document, the use of JSEP is described as if it always occurs
   between two browsers. Note, though, that in many cases it will
   actually be between a browser and some kind of server, such as a
   gateway or MCU. This distinction is invisible to the browser; it
   just follows the instructions it is given via the API.

   JSEP's handling of session descriptions is simple and
   straightforward. Whenever an offer/answer exchange is needed, the
   initiating side creates an offer by calling a createOffer() API. The
   application optionally modifies that offer, and then uses it to set
   up its local config via the setLocalDescription() API. The offer is
   then sent off to the remote side over its preferred signaling
   mechanism (e.g., WebSockets); upon receipt of that offer, the remote
   party installs it using the setRemoteDescription() API.

   To complete the offer/answer exchange, the remote party uses the
   createAnswer() API to generate an appropriate answer, applies it
   using the setLocalDescription() API, and sends the answer back to the
   initiator over the signaling channel. When the initiator gets that
   answer, it installs it using the setRemoteDescription() API, and
   initial setup is complete. This process can be repeated for
   additional offer/answer exchanges.
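
   The exchange described above can be sketched in TypeScript as
   follows. This is a minimal illustration, not a definitive
   implementation: the JsepPeer interface is a simplified stand-in for
   the four RTCPeerConnection methods named in the text (using it with
   a real browser object may need type adjustments), and Desc, Send,
   startCall, handleOffer and handleAnswer are hypothetical names
   introduced only for this example. The signaling channel is left
   abstract, as JSEP does not define one.

```typescript
// Simplified stand-ins (assumptions) for a session description and for
// the JSEP methods named in the text.
interface Desc { type: "offer" | "answer"; sdp: string; }
interface JsepPeer {
  createOffer(): Promise<Desc>;
  createAnswer(): Promise<Desc>;
  setLocalDescription(d: Desc): Promise<void>;
  setRemoteDescription(d: Desc): Promise<void>;
}
// Hypothetical application-specific signaling channel (e.g., a WebSocket).
type Send = (d: Desc) => void;

// Initiator: create the offer, install it locally, then signal it.
async function startCall(pc: JsepPeer, send: Send): Promise<void> {
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  send(offer);
}

// Remote party: install the offer, generate and install the answer,
// then signal the answer back to the initiator.
async function handleOffer(pc: JsepPeer, offer: Desc, send: Send): Promise<void> {
  await pc.setRemoteDescription(offer);
  const answer = await pc.createAnswer();
  await pc.setLocalDescription(answer);
  send(answer);
}

// Initiator: installing the answer completes the exchange.
async function handleAnswer(pc: JsepPeer, answer: Desc): Promise<void> {
  await pc.setRemoteDescription(answer);
}
```

   Keeping the signaling callback separate from the JSEP calls mirrors
   the split in the text: the application decides how descriptions
   travel, while the browser only creates and applies them.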

   Regarding ICE [RFC5245], JSEP decouples the ICE state machine from
   the overall signaling state machine, as the ICE state machine must
   remain in the browser, because only the browser has the necessary
   knowledge of candidates and other transport info. Performing this
   separation also provides additional flexibility; in protocols that
   decouple session descriptions from transport, such as Jingle, the
   session description can be sent immediately and the transport
   information can be sent when available. In protocols that don't,
   such as SIP, the information can be used in the aggregated form.
   Sending transport information separately can allow for faster ICE and
   DTLS startup, since ICE checks can start as soon as any transport
   information is available rather than waiting for all of it.

   Through its abstraction of signaling, the JSEP approach does require
   the application to be aware of the signaling process. While the
   application does not need to understand the contents of session
   descriptions to set up a call, the application must call the right
   APIs at the right times, convert the session descriptions and ICE
   information into the defined messages of its chosen signaling
   protocol, and perform the reverse conversion on the messages it
   receives from the other side.

   One way to mitigate this is to provide a Javascript library that
   hides this complexity from the developer; said library would
   implement a given signaling protocol along with its state machine and
   serialization code, presenting a higher level call-oriented interface
   to the application developer. For example, libraries exist to adapt
   the JSEP API into an API suitable for SIP or XMPP. Thus, JSEP
   provides greater control for the experienced developer without
   forcing any additional complexity on the novice developer.

1.2.  Other Approaches Considered

   One approach that was considered instead of JSEP was to include a
   lightweight signaling protocol. Instead of providing session
   descriptions to the API, the API would produce and consume messages
   from this protocol. While providing a more high-level API, this put
   more control of signaling within the browser, forcing the browser to
   understand and handle concepts like signaling glare. In addition, it
   prevented the application from driving the state machine to a
   desired state, as is needed in the page reload case.

   A second approach that was considered but not chosen was to decouple
   the management of the media control objects from session
   descriptions, instead offering APIs that would control each component
   directly. This was rejected based on a feeling that requiring
   exposure of this level of complexity to the application programmer
   would not be beneficial; it would result in an API where even a
   simple example would require a significant amount of code to
   orchestrate all the needed interactions, as well as creating a large
   API surface that needed to be agreed upon and documented. In
   addition, these API points could be called in any order, resulting in
   a more complex set of interactions with the media subsystem than the
   JSEP approach, which specifies how session descriptions are to be
   evaluated and applied.

   One variation on JSEP that was considered was to keep the basic
   session description-oriented API, but to move the mechanism for
   generating offers and answers out of the browser. Instead of
   providing createOffer/createAnswer methods within the browser, this
   approach would instead expose a getCapabilities API which would
   provide the application with the information it needed in order to
   generate its own session descriptions.
   This increases the amount of
   work that the application needs to do; it needs to know how to
   generate session descriptions from capabilities, and especially how
   to generate the correct answer from an arbitrary offer and the
   supported capabilities. While this could certainly be addressed by
   using a library like the one mentioned above, it basically forces the
   use of said library even for a simple example. Providing
   createOffer/createAnswer avoids this problem, but still allows
   applications to generate their own offers/answers (to a large extent)
   if they choose, using the description generated by createOffer as an
   indication of the browser's capabilities.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.  Semantics and Syntax

3.1.  Signaling Model

   JSEP does not specify a particular signaling model or state machine,
   other than the generic need to exchange session descriptions in the
   fashion described by [RFC3264] (offer/answer) in order for both sides
   of the session to know how to conduct the session. JSEP provides
   mechanisms to create offers and answers, as well as to apply them to
   a session. However, the browser is totally decoupled from the actual
   mechanism by which these offers and answers are communicated to the
   remote side, including addressing, retransmission, forking, and glare
   handling. These issues are left entirely up to the application; the
   application has complete control over which offers and answers get
   handed to the browser, and when.

       +-----------+                               +-----------+
       |  Web App  |<--- App-Specific Signaling -->|  Web App  |
       +-----------+                               +-----------+
             ^                                            ^
             |  SDP                                       |  SDP
             V                                            V
       +-----------+                                +-----------+
       |  Browser  |<----------- Media ------------>|  Browser  |
       +-----------+                                +-----------+

                      Figure 1: JSEP Signaling Model

3.2.  Session Descriptions and State Machine

   In order to establish the media plane, the user agent needs specific
   parameters to indicate what to transmit to the remote side, as well
   as how to handle the media that is received. These parameters are
   determined by the exchange of session descriptions in offers and
   answers, and there are certain details to this process that must be
   handled in the JSEP APIs.

   Whether a session description applies to the local side or the remote
   side affects the meaning of that description. For example, the list
   of codecs sent to a remote party indicates what the local side is
   willing to receive, which, when intersected with the set of codecs
   the remote side supports, specifies what the remote side should send.
   However, not all parameters follow this rule; for example, the
   DTLS-SRTP parameters [RFC5763] sent to a remote party indicate what
   certificate the local side will use in DTLS setup, and thereby what
   the remote party should expect to receive; the remote party will have
   to accept these parameters, with no option to choose different
   values.

   In addition, various RFCs put different conditions on the format of
   offers versus answers. For example, an offer may propose an
   arbitrary number of media streams (i.e., m= sections), but an answer
   must contain the exact same number as the offer.

   Lastly, while the exact media parameters are only known after an
   offer and an answer have been exchanged, it is possible for the
   offerer to receive media after they have sent an offer and before
   they have received an answer.
   To properly process incoming media in
   this case, the offerer's media handler must be aware of the details
   of the offer before the answer arrives.

   Therefore, in order to handle session descriptions properly, the user
   agent needs:

   1.  To know if a session description pertains to the local or remote
       side.

   2.  To know if a session description is an offer or an answer.

   3.  To allow the offer to be specified independently of the answer.

   JSEP addresses this by adding both setLocalDescription and
   setRemoteDescription methods and having session description objects
   contain a type field indicating the type of session description being
   supplied. This satisfies the requirements listed above for both the
   offerer, who first calls setLocalDescription(sdp [offer]) and then
   later setRemoteDescription(sdp [answer]), as well as for the
   answerer, who first calls setRemoteDescription(sdp [offer]) and then
   later setLocalDescription(sdp [answer]).

   JSEP also allows for an answer to be treated as provisional by the
   application. Provisional answers provide a way for an answerer to
   communicate initial session parameters back to the offerer, in order
   to allow the session to begin, while allowing a final answer to be
   specified later. This concept of a final answer is important to the
   offer/answer model; when such an answer is received, any extra
   resources allocated by the caller can be released, now that the exact
   session configuration is known. These "resources" can include things
   like extra ICE components, TURN candidates, or video decoders.
   Provisional answers, on the other hand, trigger no such deallocation;
   as a result, multiple dissimilar provisional answers can be received
   and applied during call setup.
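
   The call ordering above, including repeated provisional answers, can
   be modelled as a small transition table. The sketch below is an
   illustration under stated assumptions: the state names and arrow
   labels are taken from Figure 2 in this section, while the
   TypeScript names (State, Side, DescType, transitions,
   applyDescription) are hypothetical and not part of any API.

```typescript
type State =
  | "Stable" | "Local-Offer" | "Remote-Offer"
  | "Local-Pranswer" | "Remote-Pranswer";
type Side = "Local" | "Remote";
type DescType = "OFFER" | "PRANSWER" | "ANSWER";

// Keys are "set<Side>(<TYPE>)", matching the arrow labels in Figure 2.
const transitions: Record<State, Partial<Record<string, State>>> = {
  "Stable": {
    "setLocal(OFFER)": "Local-Offer",
    "setRemote(OFFER)": "Remote-Offer",
  },
  "Local-Offer": {
    "setLocal(OFFER)": "Local-Offer",
    "setRemote(PRANSWER)": "Remote-Pranswer",
    "setRemote(ANSWER)": "Stable",
  },
  "Remote-Offer": {
    "setRemote(OFFER)": "Remote-Offer",
    "setLocal(PRANSWER)": "Local-Pranswer",
    "setLocal(ANSWER)": "Stable",
  },
  "Local-Pranswer": {
    "setLocal(PRANSWER)": "Local-Pranswer",
    "setLocal(ANSWER)": "Stable",
  },
  "Remote-Pranswer": {
    "setRemote(PRANSWER)": "Remote-Pranswer",
    "setRemote(ANSWER)": "Stable",
  },
};

// Returns the state reached by applying a description, or throws if the
// table has no arrow for that call in the current state.
function applyDescription(state: State, side: Side, type: DescType): State {
  const call = `set${side}(${type})`;
  const next = transitions[state][call];
  if (next === undefined) {
    throw new Error(`${call} is not allowed in state ${state}`);
  }
  return next;
}
```

   For example, an offerer that receives two provisional answers and
   then a final answer moves from Stable to Local-Offer, stays in
   Remote-Pranswer across both provisional answers, and returns to
   Stable on the final answer.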

   In [RFC3264], the constraint at the signaling level is that only one
   offer can be outstanding for a given session, but at the media stack
   level, a new offer can be generated at any point. For example, when
   using SIP for signaling, if one offer is sent, then cancelled using a
   SIP CANCEL, another offer can be generated even though no answer was
   received for the first offer. To support this, the JSEP media layer
   can provide an offer via the createOffer() method whenever the
   Javascript application needs one for the signaling. The answerer can
   send back zero or more provisional answers, and finally end the
   offer-answer exchange by sending a final answer. The state machine
   for this is as follows:

                   setRemote(OFFER)               setLocal(PRANSWER)
                       /-----\                               /-----\
                       |     |                               |     |
                       v     |                               v     |
        +---------------+    |                +---------------+    |
        |               |----/                |               |----/
        |               | setLocal(PRANSWER)  |               |
        |  Remote-Offer |------------------- >| Local-Pranswer|
        |               |                     |               |
        |               |                     |               |
        +---------------+                     +---------------+
             ^   |                                   |
             |   | setLocal(ANSWER)                  |
   setRemote(OFFER)                                  |
             |   V                  setLocal(ANSWER) |
        +---------------+                            |
        |               |                            |
        |               |<---------------------------+
        |    Stable     |
        |               |<---------------------------+
        |               |                            |
        +---------------+          setRemote(ANSWER) |
             ^   |                                   |
             |   | setLocal(OFFER)                   |
   setRemote(ANSWER)                                 |
             |   V                                   |
        +---------------+                     +---------------+
        |               |                     |               |
        |               | setRemote(PRANSWER) |               |
        |  Local-Offer  |------------------- >|Remote-Pranswer|
        |               |                     |               |
        |               |----\                |               |----\
        +---------------+    |                +---------------+    |
                       ^     |                               ^     |
                       |     |                               |     |
                       \-----/                               \-----/
                   setLocal(OFFER)               setRemote(PRANSWER)

                       Figure 2: JSEP State Machine

   Aside from these state transitions there is no other difference
   between the handling of provisional ("pranswer") and final ("answer")
   answers.

3.3.  Session Description Format

   In the WebRTC specification, session descriptions are formatted as
   SDP messages. While this format is not optimal for manipulation from
   Javascript, it is widely accepted, and frequently updated with new
   features. Any alternate encoding of session descriptions would have
   to keep pace with the changes to SDP, at least until the time that
   this new encoding eclipsed SDP in popularity. As a result, JSEP
   currently uses SDP as the internal representation for its session
   descriptions.

   However, to simplify Javascript processing, and provide for future
   flexibility, the SDP syntax is encapsulated within a
   SessionDescription object, which can be constructed from SDP, and be
   serialized out to SDP. If future specifications agree on a JSON
   format for session descriptions, we could easily enable this object
   to generate and consume that JSON.

   Other methods may be added to SessionDescription in the future to
   simplify handling of SessionDescriptions from Javascript. In the
   meantime, Javascript libraries can be used to perform these
   manipulations.

   Note that most applications should be able to treat the
   SessionDescriptions produced and consumed by these various API calls
   as opaque blobs; that is, the application will not need to read or
   change them. The W3C WebRTC API specification will provide
   appropriate APIs to allow the application to control various session
   parameters, which will provide the necessary information to the
   browser about what sort of SessionDescription to produce.

3.4.  ICE

3.4.1.  ICE Gathering Overview

   JSEP gathers ICE candidates as needed by the application. Collection
   of ICE candidates is referred to as a gathering phase, and this is
   triggered either by the addition of a new or recycled m= line to the
   local session description, or new ICE credentials in the description,
   indicating an ICE restart.
   Use of new ICE credentials can be
   triggered explicitly by the application, or implicitly by the browser
   in response to changes in the ICE configuration.

   When the ICE configuration changes in a way that requires a new
   gathering phase, a 'needs-ice-restart' bit is set. When this bit is
   set, calls to the createOffer API will generate new ICE credentials.
   This bit is cleared by a call to the setLocalDescription API with new
   ICE credentials from either an offer or an answer, i.e., from either
   a local- or remote-initiated ICE restart.

   When a new gathering phase starts, the ICE Agent will notify the
   application that gathering is occurring through an event. Then, when
   each new ICE candidate becomes available, the ICE Agent will supply
   it to the application via an additional event; these candidates will
   also automatically be added to the current and/or pending local
   session description. Finally, when all candidates have been
   gathered, an event will be dispatched to signal that the gathering
   process is complete.

   Note that gathering phases only gather the candidates needed by
   new/recycled/restarting m= lines; other m= lines continue to use
   their existing candidates. Also, when bundling is active, candidates
   are only gathered (and exchanged) for the m= lines referenced in
   BUNDLE-tags, as described in
   [I-D.ietf-mmusic-sdp-bundle-negotiation].

3.4.2.  ICE Candidate Trickling

   Candidate trickling is a technique through which a caller may
   incrementally provide candidates to the callee after the initial
   offer has been dispatched; the semantics of "Trickle ICE" are defined
   in [I-D.ietf-ice-trickle]. This process allows the callee to begin
   acting upon the call and setting up the ICE (and perhaps DTLS)
   connections immediately, without having to wait for the caller to
   gather all possible candidates.
   This results in faster media setup
   in cases where gathering is not performed prior to initiating the
   call.

   JSEP supports optional candidate trickling by providing APIs, as
   described above, that provide control and feedback on the ICE
   candidate gathering process. Applications that support candidate
   trickling can send the initial offer immediately and send individual
   candidates when they are notified of a new candidate; applications
   that do not support this feature can simply wait for the indication
   that gathering is complete, and then create and send their offer,
   with all the candidates, at this time.

   Upon receipt of trickled candidates, the receiving application will
   supply them to its ICE Agent. This triggers the ICE Agent to start
   using the new remote candidates for connectivity checks.

3.4.2.1.  ICE Candidate Format

   As with session descriptions, the syntax of the IceCandidate object
   provides some abstraction, but can be easily converted to and from
   the SDP candidate lines.

   The candidate lines are the only SDP information that is contained
   within IceCandidate, as they represent the only information needed
   that is not present in the initial offer (i.e., for trickle
   candidates). This information is carried with the same syntax as the
   "candidate-attribute" field defined for ICE. For example:

   candidate:1 1 UDP 1694498815 192.0.2.33 10000 typ host

   The IceCandidate object also contains fields to indicate which m=
   line it should be associated with. The m= line can be identified in
   one of two ways: either by an m= line index or by a MID. The m= line
   index is a zero-based index, with index N referring to the (N+1)th m=
   line in the SDP sent by the entity which sent the IceCandidate. The
   MID uses the "media stream identification" attribute, as defined in
   [RFC5888], Section 4, to identify the m= line.
   JSEP implementations
   creating an ICE Candidate object MUST populate both of these fields.
   Implementations receiving an ICE Candidate object MUST use the MID if
   present, or the m= line index, if not (as it could have come from a
   non-JSEP endpoint).

3.4.3.  ICE Candidate Policy

   Typically, when gathering ICE candidates, the browser will gather all
   possible forms of initial candidates - host, server reflexive, and
   relay. However, in certain cases, applications may want to have more
   specific control over the gathering process, due to privacy or
   related concerns. For example, one may want to suppress the use of
   host candidates, to avoid exposing information about the local
   network, or go as far as only using relay candidates, to leak as
   little location information as possible (note that these choices come
   with corresponding operational costs). To accomplish this, the
   browser MUST allow the application to restrict which ICE candidates
   are used in a session. Note that this filtering is applied on top of
   any restrictions the browser chooses to enforce regarding which IP
   addresses are permitted for the application, as discussed in
   [I-D.shieh-rtcweb-ip-handling].

   There may also be cases where the application wants to change which
   types of candidates are used while the session is active. A prime
   example is where a callee may initially want to use only relay
   candidates, to avoid leaking location information to an arbitrary
   caller, but then change to use all candidates (for lower operational
   cost) once the user has indicated they want to take the call. For
   this scenario, the browser MUST allow the candidate policy to be
   changed in mid-session, subject to the aforementioned interactions
   with local policy.

   To administer the ICE candidate policy, the browser will determine
   the current setting at the start of each gathering phase.
Then, 556 during the gathering phase, the browser MUST NOT expose candidates 557 disallowed by the current policy to the application, use them as the 558 source of connectivity checks, or indirectly expose them via other 559 fields, such as the raddr/rport attributes for other ICE candidates. 560 Later, if a different policy is specified by the application, the 561 application can apply it by kicking off a new gathering phase via an 562 ICE restart. 564 3.4.4. ICE Candidate Pool 566 JSEP applications typically inform the browser to begin ICE gathering 567 via the information supplied to setLocalDescription, as this is where 568 the app specifies the number of media streams, and thereby ICE 569 components, for which to gather candidates. However, to accelerate 570 cases where the application knows the number of ICE components to use 571 ahead of time, it may ask the browser to gather a pool of potential 572 ICE candidates to help ensure rapid media setup. 574 When setLocalDescription is eventually called, and the browser goes 575 to gather the needed ICE candidates, it SHOULD start by checking if 576 any candidates are available in the pool. If there are candidates in 577 the pool, they SHOULD be handed to the application immediately via 578 the ICE candidate event. If the pool becomes depleted, either 579 because a larger-than-expected number of ICE components is used, or 580 because the pool has not had enough time to gather candidates, the 581 remaining candidates are gathered as usual. 583 One example of where this concept is useful is an application that 584 expects an incoming call at some point in the future, and wants to 585 minimize the time it takes to establish connectivity, to avoid 586 clipping of initial media. By pre-gathering candidates into the 587 pool, it can exchange and start sending connectivity checks from 588 these candidates almost immediately upon receipt of a call. 
Note 589 though that by holding on to these pre-gathered candidates, which 590 will be kept alive as long as they may be needed, the application 591 will consume resources on the STUN/TURN servers it is using. 593 3.5. Video Size Negotiation 595 Video size negotiation is the process through which a receiver can 596 use the "a=imageattr" SDP attribute [RFC6236] to indicate what video 597 frame sizes it is capable of receiving. A receiver may have hard 598 limits on what its video decoder can process, or it may wish to 599 constrain what it receives due to application preferences, e.g. a 600 specific size for the window in which the video will be displayed. 602 3.5.1. Creating an imageattr Attribute 604 In order to determine the limits on what video resolution a receiver 605 wants to receive, it will intersect its decoder hard limits with any 606 mandatory constraints that have been applied to the associated 607 MediaStreamTrack. If the decoder limits are unknown, e.g. when using 608 a software decoder, the mandatory constraints are used directly. For 609 the answerer, these mandatory constraints can be applied to the 610 remote MediaStreamTracks that are created by a setRemoteDescription 611 call, and will affect the output of the ensuing createAnswer call. 612 Any constraints set after setLocalDescription is used to set the 613 answer will result in a new offer-answer exchange. For the offerer, 614 because it does not know about any remote MediaStreamTracks until it 615 receives the answer, the offer can only reflect decoder hard limits. 616 If the offerer wishes to set mandatory constraints on video 617 resolution, it must do so after receiving the answer, and the result 618 will be a new offer-answer to communicate them. 620 If there are no known decoder limits or mandatory constraints, the 621 "a=imageattr" attribute SHOULD be omitted. 
623 Otherwise, an "a=imageattr" attribute is created with "recv" 624 direction, and the resulting resolution space formed by intersecting 625 the decoder limits and constraints is used to specify its minimum and 626 maximum x= and y= values. If the intersection is the null set, i.e., 627 there are no resolutions that are permitted by both the decoder and 628 the mandatory constraints, this SHOULD be represented by x=0 and y=0 629 values. 631 The rules here express a single set of preferences, and therefore, 632 the "a=imageattr" q= value is not important. It SHOULD be set to 633 1.0. 635 The "a=imageattr" field is payload type specific. When all video 636 codecs supported have the same capabilities, use of a single 637 attribute, with the wildcard payload type (*), is RECOMMENDED. 638 However, when the supported video codecs have differing capabilities, 639 specific "a=imageattr" attributes MUST be inserted for each payload 640 type. 642 As an example, consider a system with an HD-capable, multiformat video 643 decoder, where the application has constrained the received track to 644 at most 360p. In this case, the implementation would generate this 645 attribute: 647 a=imageattr:* recv [x=[16:640],y=[16:360],q=1.0] 649 This declaration indicates that the receiver is capable of decoding 650 any image resolution from 16x16 up to 640x360 pixels. 652 3.5.2. Interpreting an imageattr Attribute 654 [RFC6236] defines "a=imageattr" to be an advisory field. This means 655 that it does not absolutely constrain the video formats that the 656 sender can use, but gives an indication of the preferred values. 658 This specification prescribes more specific behavior. When a sender 659 of a given MediaStreamTrack, which is producing video of a certain 660 resolution, receives an "a=imageattr recv" attribute, it MUST check 661 to see if the original resolution meets the size criteria specified 662 in the attribute, and adapt the resolution accordingly by scaling (if 663 appropriate).
Note that when considering a MediaStreamTrack that is 664 producing rotated video, the unrotated resolution MUST be used. This 665 is required regardless of whether the receiver supports performing 666 receive-side rotation (e.g., through CVO), as it significantly 667 simplifies the matching logic. 669 For an "a=imageattr recv" attribute, only size limits are considered. 670 Any other values, e.g. aspect ratio, MUST be ignored. 672 When communicating with a non-JSEP endpoint, multiple relevant 673 "a=imageattr recv" attributes may be received. If this occurs, 674 attributes other than the one with the highest "q=" value MUST be 675 ignored. 677 If an "a=imageattr recv" attribute references a different video codec 678 than what has been selected for the MediaStreamTrack, it MUST be 679 ignored. 681 If the original resolution matches the size limits in the attribute, 682 the track MUST be transmitted untouched. 684 If the original resolution exceeds the size limits in the attribute, 685 the sender SHOULD apply downscaling to the output of the 686 MediaStreamTrack in order to satisfy the limits. Downscaling MUST 687 NOT change the track aspect ratio. 689 If the original resolution is less than the size limits in the 690 attribute, upscaling is needed, but this may not be appropriate in 691 all cases. To address this concern, the application can set an 692 upscaling policy for each sent track. For this case, if upscaling is 693 permitted by policy, the sender SHOULD apply upscaling in order to 694 provide the desired resolution. Otherwise, the sender MUST NOT apply 695 upscaling. The sender SHOULD NOT upscale in other cases, even if the 696 policy permits it. Upscaling MUST NOT change the track aspect ratio. 698 If there is no appropriate and permitted scaling mechanism that 699 allows the received size limits to be satisfied, the sender MUST NOT 700 transmit the track. 
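The sender-side rules above can be sketched as a single decision function. This is only an illustration under stated assumptions, not an API defined by this document: the type names (Range, SizeLimits, Size, Decision) and the function decideScaling are hypothetical, the limits are assumed to be the x= and y= ranges of the single applicable "a=imageattr recv" attribute, the original dimensions are assumed to be positive, and rounding to whole pixels is assumed to be an acceptable approximation of keeping the aspect ratio.

```typescript
// Hypothetical types for illustration; none of these names come from JSEP.
interface Range { min: number; max: number; }
interface SizeLimits { x: Range; y: Range; }      // x=/y= ranges of "a=imageattr recv"
interface Size { width: number; height: number; } // unrotated resolution of the track
type Decision = { action: "send"; size: Size } | { action: "drop" };

function fits(s: Size, l: SizeLimits): boolean {
  return s.width >= l.x.min && s.width <= l.x.max &&
         s.height >= l.y.min && s.height <= l.y.max;
}

function decideScaling(original: Size, limits: SizeLimits,
                       upscaleAllowed: boolean): Decision {
  // A maximum resolution of [0, 0] means the track is not transmitted.
  if (limits.x.max === 0 && limits.y.max === 0) return { action: "drop" };
  // A matching resolution is transmitted untouched.
  if (fits(original, limits)) return { action: "send", size: original };

  const tooBig = original.width > limits.x.max || original.height > limits.y.max;
  // Below the limits: upscale only if the per-track policy permits it.
  if (!tooBig && !upscaleAllowed) return { action: "drop" };

  // One uniform factor for both axes, so the aspect ratio is kept
  // (up to whole-pixel rounding, which is an assumption of this sketch).
  const factor = tooBig
    ? Math.min(limits.x.max / original.width, limits.y.max / original.height)
    : Math.max(limits.x.min / original.width, limits.y.min / original.height);
  const round = tooBig ? Math.floor : Math.ceil;
  const scaled: Size = {
    width: round(original.width * factor),
    height: round(original.height * factor),
  };
  // If no permitted scaling satisfies the limits, the track is not sent.
  return fits(scaled, limits) ? { action: "send", size: scaled } : { action: "drop" };
}
```

With the limits from the earlier example (x=[16:640], y=[16:360]), a 1280x720 track would be halved to 640x360, while a 320x180 track already matches and would be sent unchanged.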
702 In the special case of receiving a maximum resolution of [0, 0], as 703 described above, the sender MUST NOT transmit the track. 705 3.6. Interactions With Forking 707 Some call signaling systems allow various types of forking where an 708 SDP Offer may be provided to more than one device. For example, SIP 709 [RFC3261] defines both a "Parallel Search" and "Sequential Search". 710 Although these are primarily signaling-level issues that are outside 711 the scope of JSEP, they do have some impact on the configuration of 712 the media plane that is relevant. When forking happens at the 713 signaling layer, the JavaScript application responsible for the 714 signaling needs to make the decisions about what media should be sent 715 or received at any point in time, as well as which remote endpoint it 716 should communicate with; JSEP is used to make sure the media engine 717 can make the RTP and media perform as required by the application. 718 The basic operations that the applications can have the media engine 719 do are: 721 o Start exchanging media with a given remote peer, but keep all the 722 resources reserved in the offer. 724 o Start exchanging media with a given remote peer, and free any 725 resources in the offer that are not being used. 727 3.6.1. Sequential Forking 729 Sequential forking involves a call being dispatched to multiple 730 remote callees, where each callee can accept the call, but only one 731 active session ever exists at a time; no mixing of received media is 732 performed. 734 JSEP handles sequential forking well, allowing the application to 735 easily control the policy for selecting the desired remote endpoint. 736 When an answer arrives from one of the callees, the application can 737 choose to apply it either as a provisional answer, leaving open the 738 possibility of using a different answer in the future, or apply it as 739 a final answer, ending the setup flow.
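As a rough illustration of this choice, the sketch below tracks how an application might dispose of each answer that arrives during sequential forking, covering the "first-one-wins" and "last-one-wins" situations discussed in this section. All names here (ForkPolicy, AnswerAction, ForkState, newForkState, onAnswer, endSetup) are hypothetical and are not part of JSEP; the sketch only decides what to do and leaves the actual setRemoteDescription calls and signaling to the application.

```typescript
// Hypothetical names for illustration; not defined by JSEP.
type ForkPolicy = "first-one-wins" | "last-one-wins";
type AnswerAction = "apply-final" | "apply-provisional" | "reject";

interface ForkState {
  policy: ForkPolicy;
  answersApplied: number; // answers applied so far (provisional or final)
  finished: boolean;      // true once the setup flow has ended
}

function newForkState(policy: ForkPolicy): ForkState {
  return { policy, answersApplied: 0, finished: false };
}

// Decide how to treat an answer that just arrived from one callee.
function onAnswer(state: ForkState): AnswerAction {
  if (state.finished) return "reject"; // setup is over: refuse later answers
  state.answersApplied += 1;
  if (state.policy === "first-one-wins") {
    state.finished = true;             // the first answer ends the setup flow
    return "apply-final";
  }
  return "apply-provisional";          // last-one-wins: keep the offer open
}

// Called when the application ends setup (for example, on a timer).
// Returns true if the pending remote description should be reapplied
// as a final answer.
function endSetup(state: ForkState): boolean {
  const promote = !state.finished && state.answersApplied > 0;
  state.finished = true;
  return promote;
}
```

Keeping the decision separate from the PeerConnection calls is a design choice of this sketch; it lets the same policy logic be exercised without a media engine.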
741 In a "first-one-wins" situation, the first answer will be applied as 742 a final answer, and the application will reject any subsequent 743 answers. In SIP parlance, this would be ACK + BYE. 745 In a "last-one-wins" situation, all answers will be applied as 746 provisional answers, and any previous call leg will be terminated. 747 At some point, the application will end the setup process, perhaps 748 with a timer; at this point, the application could reapply the 749 pending remote description as a final answer. 751 3.6.2. Parallel Forking 753 Parallel forking involves a call being dispatched to multiple remote 754 callees, where each callee can accept the call, and multiple 755 simultaneous active signaling sessions can be established as a 756 result. If multiple callees send media at the same time, the 757 possibilities for handling this are described in Section 3.1 of 758 [RFC3960]. Most SIP devices today only support exchanging media with 759 a single device at a time, and do not try to mix multiple early media 760 audio sources, as that could result in a confusing situation. For 761 example, consider having a European ringback tone mixed together with 762 the North American ringback tone - the resulting sound would not be 763 like either tone, and would confuse the user. If the signaling 764 application wishes to only exchange media with one of the remote 765 endpoints at a time, then from a media engine point of view, this is 766 exactly like the sequential forking case. 768 In the parallel forking case where the JavaScript application wishes 769 to simultaneously exchange media with multiple peers, the flow is 770 slightly more complex, but the JavaScript application can follow the 771 strategy that [RFC3960] describes using UPDATE. The UPDATE approach 772 allows the signaling to set up a separate media flow for each peer 773 that it wishes to exchange media with.
In JSEP, the offer used in 774 the UPDATE would be formed by simply creating a new PeerConnection 775 and making sure that the same local media streams have been added 776 into this new PeerConnection. Then the new PeerConnection object 777 would produce an SDP offer that could be used by the signaling to 778 perform the UPDATE strategy discussed in [RFC3960]. 780 As a result of sharing the media streams, the application will end up 781 with N parallel PeerConnection sessions, each with its own local and remote 782 description and its own local and remote addresses. The media flow 783 from these sessions can be managed by specifying SDP direction 784 attributes in the descriptions, or the application can choose to play 785 out the media from all sessions mixed together. Of course, if the 786 application wants to only keep a single session, it can simply 787 terminate the sessions that it no longer needs. 789 4. Interface 791 This section details the basic operations that must be present to 792 implement JSEP functionality. The actual API exposed in the W3C API 793 may have somewhat different syntax, but should map easily to these 794 concepts. 796 4.1. Methods 798 4.1.1. Constructor 800 The PeerConnection constructor allows the application to specify 801 global parameters for the media session, such as the STUN/TURN 802 servers and credentials to use when gathering candidates, as well as 803 the initial ICE candidate policy and pool size, and also the bundle 804 policy to use. 806 If an ICE candidate policy is specified, it functions as described in 807 Section 3.4.3, causing the browser to only surface the permitted 808 candidates (including any internal browser filtering) to the 809 application, and only use those candidates for connectivity checks. 810 The set of available policies is as follows: 812 all: All candidates permitted by browser policy will be gathered and 813 used. 815 relay: All candidates except relay candidates will be filtered out.
816 This obfuscates the location information that might be ascertained 817 by the remote peer from the received candidates. Depending on how 818 the application deploys its relay servers, this could obfuscate 819 location to a metro or possibly even global level. 821 The default ICE candidate policy MUST be set to "all" as this is 822 generally the desired policy, and also typically reduces use of 823 application TURN server resources significantly. 825 If a size is specified for the ICE candidate pool, this indicates the 826 number of ICE components to pre-gather candidates for. Because pre- 827 gathering results in utilizing STUN/TURN server resources for 828 potentially long periods of time, this must only occur upon 829 application request, and therefore the default candidate pool size 830 MUST be zero. 832 The application can specify its preferred policy regarding use of 833 bundle, the multiplexing mechanism defined in 834 [I-D.ietf-mmusic-sdp-bundle-negotiation]. Regardless of policy, the 835 application will always try to negotiate bundle onto a single 836 transport, and will offer a single bundle group across all media 837 sections; use of this single transport is contingent upon the answerer 838 accepting bundle. However, by specifying a policy from the list 839 below, the application can control exactly how aggressively it will 840 try to bundle media streams together, which affects how it will 841 interoperate with a non-bundle-aware endpoint. When negotiating with 842 a non-bundle-aware endpoint, only the streams not marked as bundle- 843 only streams will be established. 845 The set of available policies is as follows: 847 balanced: The first media section of each type (audio, video, or 848 application) will contain transport parameters, which will allow 849 an answerer to unbundle that section. The second and any 850 subsequent media sections of each type will be marked bundle-only.
851 The result is that if there are N distinct media types, then 852 candidates will be gathered for N media streams. This policy 853 balances the desire to multiplex with the need to ensure basic audio 854 and video can still be negotiated in legacy cases. 856 max-compat: All media sections will contain transport parameters; 857 none will be marked as bundle-only. This policy will allow all 858 streams to be received by non-bundle-aware endpoints, but requires 859 separate candidates to be gathered for each media stream. 861 max-bundle: Only the first media section will contain transport 862 parameters; all streams other than the first will be marked as 863 bundle-only. This policy aims to minimize candidate gathering and 864 maximize multiplexing, at the cost of less compatibility with 865 legacy endpoints. 867 As it provides the best tradeoff between performance and 868 compatibility with legacy endpoints, the default bundle policy MUST 869 be set to "balanced". 871 The application can specify its preferred policy regarding use of 872 RTP/RTCP multiplexing [RFC5761] using one of the following policies: 874 negotiate: The browser will gather both RTP and RTCP candidates but 875 will also offer "a=rtcp-mux", thus allowing for compatibility with 876 either multiplexing or non-multiplexing endpoints. 878 require: The browser will only gather RTP candidates. This halves 879 the number of candidates that the offerer needs to gather. When 880 acting as answerer, the browser will reject any m= section that 881 does not provide an "a=rtcp-mux" attribute. 883 The default multiplexing policy MUST be set to "require". 884 Implementations MAY choose to reject attempts by the application to 885 set the multiplexing policy to "negotiate". 887 4.1.2.
createOffer 889 The createOffer method generates a blob of SDP that contains a 890 [RFC3264] offer with the supported configurations for the session, 891 including descriptions of the local MediaStreams attached to this 892 PeerConnection, the codec/RTP/RTCP options supported by this 893 implementation, and any candidates that have been gathered by the ICE 894 Agent. An options parameter may be supplied to provide additional 895 control over the generated offer. This options parameter should 896 allow for the following manipulations to be performed: 898 o To indicate support for a media type even if no MediaStreamTracks 899 of that type have been added to the session (e.g., an audio call 900 that wants to receive video.) 902 o To trigger an ICE restart, for the purpose of reestablishing 903 connectivity. 905 In the initial offer, the generated SDP will contain all desired 906 functionality for the session (functionality that is supported but 907 not desired by default may be omitted); for each SDP line, the 908 generation of the SDP will follow the process defined for generating 909 an initial offer from the document that specifies the given SDP line. 910 The exact handling of initial offer generation is detailed in 911 Section 5.2.1 below. 913 In the event createOffer is called after the session is established, 914 createOffer will generate an offer to modify the current session 915 based on any changes that have been made to the session, e.g. adding 916 or removing MediaStreams, or requesting an ICE restart. For each 917 existing stream, the generation of each SDP line must follow the 918 process defined for generating an updated offer from the RFC that 919 specifies the given SDP line. For each new stream, the generation of 920 the SDP must follow the process of generating an initial offer, as 921 mentioned above. 
If no changes have been made, or for SDP lines that 922 are unaffected by the requested changes, the offer will only contain 923 the parameters negotiated by the last offer-answer exchange. The 924 exact handling of subsequent offer generation is detailed in 925 Section 5.2.2. below. 927 Session descriptions generated by createOffer must be immediately 928 usable by setLocalDescription; if a system has limited resources 929 (e.g. a finite number of decoders), createOffer should return an 930 offer that reflects the current state of the system, so that 931 setLocalDescription will succeed when it attempts to acquire those 932 resources. Because this method may need to inspect the system state 933 to determine the currently available resources, it may be implemented 934 as an async operation. 936 Calling this method may do things such as generate new ICE 937 credentials, but does not result in candidate gathering, or cause 938 media to start or stop flowing. 940 4.1.3. createAnswer 942 The createAnswer method generates a blob of SDP that contains a 943 [RFC3264] SDP answer with the supported configuration for the session 944 that is compatible with the parameters supplied in the most recent 945 call to setRemoteDescription, which MUST have been called prior to 946 calling createAnswer. Like createOffer, the returned blob contains 947 descriptions of the local MediaStreams attached to this 948 PeerConnection, the codec/RTP/RTCP options negotiated for this 949 session, and any candidates that have been gathered by the ICE Agent. 950 An options parameter may be supplied to provide additional control 951 over the generated answer. 953 As an answer, the generated SDP will contain a specific configuration 954 that specifies how the media plane should be established; for each 955 SDP line, the generation of the SDP must follow the process defined 956 for generating an answer from the document that specifies the given 957 SDP line. 
The exact handling of answer generation is detailed in 958 Section 5.3. below. 960 Session descriptions generated by createAnswer must be immediately 961 usable by setLocalDescription; like createOffer, the returned 962 description should reflect the current state of the system. Because 963 this method may need to inspect the system state to determine the 964 currently available resources, it may need to be implemented as an 965 async operation. 967 Calling this method may do things such as generate new ICE 968 credentials, but does not trigger candidate gathering or change media 969 state. 971 4.1.4. SessionDescriptionType 973 Session description objects (RTCSessionDescription) may be of type 974 "offer", "pranswer", "answer" or "rollback". These types provide 975 information as to how the description parameter should be parsed, and 976 how the media state should be changed. 978 "offer" indicates that a description should be parsed as an offer; 979 said description may include many possible media configurations. A 980 description used as an "offer" may be applied anytime the 981 PeerConnection is in a stable state, or as an update to a previously 982 supplied but unanswered "offer". 984 "pranswer" indicates that a description should be parsed as an 985 answer, but not a final answer, and so should not result in the 986 freeing of allocated resources. It may result in the start of media 987 transmission, if the answer does not specify an inactive media 988 direction. A description used as a "pranswer" may be applied as a 989 response to an "offer", or an update to a previously sent "pranswer". 991 "answer" indicates that a description should be parsed as an answer, 992 the offer-answer exchange should be considered complete, and any 993 resources (decoders, candidates) that are no longer needed can be 994 released. A description used as an "answer" may be applied as a 995 response to an "offer", or an update to a previously sent "pranswer". 
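The applicability rules for "offer", "pranswer", and "answer" given above can be read as a small transition table. The sketch below covers only descriptions applied through setLocalDescription and leaves out "rollback". The state name "have-local-pranswer" is borrowed from the W3C API's signaling states as an assumption; it is not introduced in this section. The function name nextStateAfterSetLocal is hypothetical.

```typescript
// State names follow the W3C signaling states; "have-local-pranswer" is an
// assumption borrowed from that API. "rollback" is deliberately omitted.
type SignalingState =
  | "stable"
  | "have-local-offer"
  | "have-remote-offer"
  | "have-local-pranswer";
type DescriptionType = "offer" | "pranswer" | "answer";

// Returns the state reached by applying a local description of the given
// type, or null if that type may not be applied in the current state.
function nextStateAfterSetLocal(
  state: SignalingState,
  type: DescriptionType,
): SignalingState | null {
  switch (type) {
    case "offer":
      // Allowed when stable, or as an update to an unanswered offer.
      return state === "stable" || state === "have-local-offer"
        ? "have-local-offer"
        : null;
    case "pranswer":
      // A response to an offer, or an update to an earlier pranswer.
      return state === "have-remote-offer" || state === "have-local-pranswer"
        ? "have-local-pranswer"
        : null;
    case "answer":
      // Completes the exchange; unused resources can then be released.
      return state === "have-remote-offer" || state === "have-local-pranswer"
        ? "stable"
        : null;
  }
}
```

A mirror-image table, with local and remote swapped, would describe setRemoteDescription under the same assumptions.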
997 The only difference between a provisional and final answer is that 998 the final answer results in the freeing of any unused resources that 999 were allocated as a result of the offer. As such, the application 1000 can use some discretion on whether an answer should be applied as 1001 provisional or final, and can change the type of the session 1002 description as needed. For example, in a serial forking scenario, an 1003 application may receive multiple "final" answers, one from each 1004 remote endpoint. The application could choose to accept the initial 1005 answers as provisional answers, and only apply an answer as final 1006 when it receives one that meets its criteria (e.g. a live user 1007 instead of voicemail). 1009 "rollback" is a special session description type implying that the 1010 state machine should be rolled back to the previous state, as 1011 described in Section 4.1.4.2. The contents MUST be empty. 1013 4.1.4.1. Use of Provisional Answers 1015 Most web applications will not need to create answers using the 1016 "pranswer" type. While it is good practice to send an immediate 1017 response to an "offer", in order to warm up the session transport and 1018 prevent media clipping, the preferred handling for a web application 1019 would be to create and send an "inactive" final answer immediately 1020 after receiving the offer. Later, when the called user actually 1021 accepts the call, the application can create a new "sendrecv" offer 1022 to update the previous offer/answer pair and start the media flow. 1023 While this could also be done with an inactive "pranswer", followed 1024 by a sendrecv "answer", the initial "pranswer" leaves the offer- 1025 answer exchange open, which means that neither side can send an 1026 updated offer during this time. 1028 As an example, consider a typical web application that will set up a 1029 data channel, an audio channel, and a video channel. 
When an 1030 endpoint receives an offer with these channels, it could send an 1031 answer accepting the data channel for two-way data, and accepting the 1032 audio and video tracks as inactive or receive-only. It could then 1033 ask the user to accept the call, acquire the local media streams, and 1034 send a new offer to the remote side moving the audio and video to be 1035 two-way media. By the time the human has accepted the call and 1036 triggered the new offer, it is likely that the ICE and DTLS 1037 handshaking for all the channels will already have finished. 1039 Of course, some applications may not be able to perform this double 1040 offer-answer exchange, particularly ones that are attempting to 1041 gateway to legacy signaling protocols. In these cases, "pranswer" 1042 can still provide the application with a mechanism to warm up the 1043 transport. 1045 4.1.4.2. Rollback 1047 In certain situations it may be desirable to "undo" a change made to 1048 setLocalDescription or setRemoteDescription. Consider a case where a 1049 call is ongoing, and one side wants to change some of the session 1050 parameters; that side generates an updated offer and then calls 1051 setLocalDescription. However, the remote side, either before or 1052 after setRemoteDescription, decides it does not want to accept the 1053 new parameters, and sends a reject message back to the offerer. Now, 1054 the offerer, and possibly the answerer as well, need to return to a 1055 stable state and the previous local/remote description. To support 1056 this, we introduce the concept of "rollback". 1058 A rollback discards any proposed changes to the session, returning 1059 the state machine to the stable state, and setting the pending local 1060 and/or remote description back to null. Any resources or candidates 1061 that were allocated by the abandoned local description are discarded; 1062 any media that is received will be processed according to the 1063 previous local and remote descriptions. 
Rollback can only be used to 1064 cancel proposed changes; there is no support for rolling back from a 1065 stable state to a previous stable state. Note that this implies that 1066 once the answerer has performed setLocalDescription with its answer, 1067 this cannot be rolled back. 1069 A rollback is performed by supplying a session description of type 1070 "rollback" with empty contents to either setLocalDescription or 1071 setRemoteDescription, depending on which was most recently used (i.e., 1072 if the new offer was supplied to setLocalDescription, the rollback 1073 should be done using setLocalDescription as well). 1075 4.1.5. setLocalDescription 1077 The setLocalDescription method instructs the PeerConnection to apply 1078 the supplied session description as its local configuration. The 1079 type field indicates whether the description should be processed as 1080 an offer, provisional answer, or final answer; offers and answers are 1081 checked differently, using the various rules that exist for each SDP 1082 line. 1084 This API changes the local media state; among other things, it sets 1085 up local resources for receiving and decoding media. In order to 1086 successfully handle scenarios where the application wants to offer to 1087 change from one media format to a different, incompatible format, the 1088 PeerConnection must be able to simultaneously support use of both the 1089 current and pending local descriptions (e.g., support the codecs that 1090 exist in either description) until a final answer is received, at 1091 which point the PeerConnection can fully adopt the pending local 1092 description, or roll back to the current description if the remote 1093 side denied the change. 1095 This API indirectly controls the candidate gathering process.
When a 1096 local description is supplied, and the number of transports currently 1097 in use does not match the number of transports needed by the local 1098 description, the PeerConnection will create transports as needed and 1099 begin gathering candidates for them. 1101 If setRemoteDescription was previously called with an offer, and 1102 setLocalDescription is called with an answer (provisional or final), 1103 and the media directions are compatible, and media are available to 1104 send, this will result in the starting of media transmission. 1106 4.1.6. setRemoteDescription 1108 The setRemoteDescription method instructs the PeerConnection to apply 1109 the supplied session description as the desired remote configuration. 1110 As in setLocalDescription, the type field of the description 1111 indicates how it should be processed. 1113 This API changes the local media state; among other things, it sets 1114 up local resources for sending and encoding media. 1116 If setLocalDescription was previously called with an offer, and 1117 setRemoteDescription is called with an answer (provisional or final), 1118 and the media directions are compatible, and media are available to 1119 send, this will result in the starting of media transmission. 1121 4.1.7. currentLocalDescription 1123 The currentLocalDescription method returns a copy of the current 1124 negotiated local description - i.e., the local description from the 1125 last successful offer/answer exchange - in addition to any local 1126 candidates that have been generated by the ICE Agent since the local 1127 description was set. 1129 A null object will be returned if an offer/answer exchange has not 1130 yet been completed. 1132 4.1.8. 
pendingLocalDescription 1134 The pendingLocalDescription method returns a copy of the local 1135 description currently in negotiation - i.e., a local offer set 1136 without any corresponding remote answer - in addition to any local 1137 candidates that have been generated by the ICE Agent since the local 1138 description was set. 1140 A null object will be returned if the state of the PeerConnection is 1141 "stable" or "have-remote-offer". 1143 4.1.9. currentRemoteDescription 1145 The currentRemoteDescription method returns a copy of the current 1146 negotiated remote description - i.e., the remote description from the 1147 last successful offer/answer exchange - in addition to any remote 1148 candidates that have been supplied via processIceMessage since the 1149 remote description was set. 1151 A null object will be returned if an offer/answer exchange has not 1152 yet been completed. 1154 4.1.10. pendingRemoteDescription 1156 The pendingRemoteDescription method returns a copy of the remote 1157 description currently in negotiation - i.e., a remote offer set 1158 without any corresponding local answer - in addition to any remote 1159 candidates that have been supplied via processIceMessage since the 1160 remote description was set. 1162 A null object will be returned if the state of the PeerConnection is 1163 "stable" or "have-local-offer". 1165 4.1.11. canTrickleIceCandidates 1167 The canTrickleIceCandidates property indicates whether the remote 1168 side supports receiving trickled candidates. There are three 1169 potential values: 1171 null: No SDP has been received from the other side, so it is not 1172 known if it can handle trickle. This is the initial value before 1173 setRemoteDescription() is called. 1175 true: SDP has been received from the other side indicating that it 1176 can support trickle. 1178 false: SDP has been received from the other side indicating that it 1179 cannot support trickle. 
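A minimal sketch of how an application might act on these three values follows. It assumes the application treats only "true" as permission to trickle, and otherwise holds candidates until gathering completes so that they can be sent together with the description. The names CanTrickle, CandidateHandler, and makeCandidateHandler are hypothetical, and candidates are represented simply as strings such as the candidate line shown in Section 3.4.2.1.

```typescript
// Hypothetical helper; not part of JSEP or the W3C API.
type CanTrickle = boolean | null; // the three values listed above

interface CandidateHandler {
  onCandidate(candidate: string): void;
  // Returns the candidates held back so far and clears the buffer.
  onGatheringComplete(): string[];
}

function makeCandidateHandler(
  canTrickle: CanTrickle,
  send: (candidate: string) => void,
): CandidateHandler {
  const held: string[] = [];
  return {
    onCandidate(candidate: string): void {
      if (canTrickle === true) {
        send(candidate);      // remote side is known to support trickle
      } else {
        held.push(candidate); // null or false: wait for complete gathering
      }
    },
    onGatheringComplete(): string[] {
      return held.splice(0);  // hand back everything that was buffered
    },
  };
}
```

Comparing strictly against true means that the unknown (null) case is handled in the same conservative way as the false case.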
As described in Section 3.4.2, JSEP implementations always provide candidates to the application individually, consistent with what is needed for Trickle ICE. However, applications can use the canTrickleIceCandidates property to determine whether their peer can actually do Trickle ICE, i.e., whether it is safe to send an initial offer or answer followed later by candidates as they are gathered. As "true" is the only value that definitively indicates remote Trickle ICE support, an application which compares canTrickleIceCandidates against "true" will by default attempt Half Trickle on initial offers and Full Trickle on subsequent interactions with a Trickle ICE-compatible agent.

4.1.12. setConfiguration

The setConfiguration method allows the global configuration of the PeerConnection, which was initially set by constructor parameters, to be changed during the session. The effects of this method call depend on when it is invoked, and differ depending on which specific parameters are changed:

o Any changes to the STUN/TURN servers to use affect the next gathering phase. If an ICE gathering phase has already started or completed, the 'needs-ice-restart' bit mentioned in Section 3.4.1 will be set. This will cause the next call to createOffer to generate new ICE credentials, for the purpose of forcing an ICE restart and kicking off a new gathering phase, in which the new servers will be used. If the ICE candidate pool has a nonzero size, any existing candidates will be discarded, and new candidates will be gathered from the new servers.

o Any change to the ICE candidate policy affects the next gathering phase. If an ICE gathering phase has already started or completed, the 'needs-ice-restart' bit will be set.
Either way, changes to the policy have no effect on the candidate pool, because pooled candidates are not surfaced to the application until a gathering phase occurs, and so any necessary filtering can still be done on any pooled candidates.

o Any changes to the ICE candidate pool size take effect immediately; if increased, additional candidates are pre-gathered; if decreased, the now-superfluous candidates are discarded.

o The bundle and RTCP-multiplexing policies MUST NOT be changed after the construction of the PeerConnection.

This call may result in a change to the state of the ICE Agent, and may result in a change to media state if it results in connectivity being established.

4.1.13. addIceCandidate

The addIceCandidate method provides a remote candidate to the ICE Agent, which, if parsed successfully, will be added to the current and/or pending remote description according to the rules defined for Trickle ICE. Connectivity checks will be sent to the new candidate.

This call will result in a change to the state of the ICE Agent, and may result in a change to media state if it results in connectivity being established.

5. SDP Interaction Procedures

This section describes the specific procedures to be followed when creating and parsing SDP objects.

5.1. Requirements Overview

JSEP implementations must comply with the specifications listed below that govern the creation and processing of offers and answers.

The first set of specifications is the "mandatory-to-implement" set. All implementations must support these behaviors, but may not use all of them if the remote side, which may not be a JSEP endpoint, does not support them.

The second set of specifications is the "mandatory-to-use" set.
The local JSEP endpoint and any remote endpoint must indicate support for these specifications in their session descriptions.

5.1.1. Implementation Requirements

This list of mandatory-to-implement specifications is derived from the requirements outlined in [I-D.ietf-rtcweb-rtp-usage].

R-1 [RFC4566] is the base SDP specification and MUST be implemented.

R-2 [RFC5764] MUST be supported for signaling the UDP/TLS/RTP/SAVPF [RFC5764], TCP/DTLS/RTP/SAVPF [I-D.nandakumar-mmusic-proto-iana-registration], "UDP/DTLS/SCTP" [I-D.ietf-mmusic-sctp-sdp], and "TCP/DTLS/SCTP" [I-D.ietf-mmusic-sctp-sdp] RTP profiles.

R-3 [RFC5245] MUST be implemented for signaling the ICE credentials and candidate lines corresponding to each media stream. The ICE implementation MUST be a Full implementation, not a Lite implementation.

R-4 [RFC5763] MUST be implemented to signal DTLS certificate fingerprints.

R-5 [RFC4568] MUST NOT be implemented to signal SDES SRTP keying information.

R-6 The [RFC5888] grouping framework MUST be implemented for signaling grouping information, and MUST be used to identify m= lines via the a=mid attribute.

R-7 [I-D.ietf-mmusic-msid] MUST be supported, in order to signal associations between RTP objects and W3C MediaStreams and MediaStreamTracks in a standard way.

R-8 The bundle mechanism in [I-D.ietf-mmusic-sdp-bundle-negotiation] MUST be supported to signal the ability to multiplex RTP streams on a single UDP port, in order to avoid excessive use of port number resources.

R-9 The SDP attributes of "sendonly", "recvonly", "inactive", and "sendrecv" from [RFC4566] MUST be implemented to signal information about media direction.

R-10 [RFC5576] MUST be implemented to signal RTP SSRC values and grouping semantics.

R-11 [RFC4585] MUST be implemented to signal RTCP based feedback.
R-12 [RFC5761] MUST be implemented to signal multiplexing of RTP and RTCP.

R-13 [RFC5506] MUST be implemented to signal reduced-size RTCP messages.

R-14 [RFC4588] MUST be implemented to signal RTX payload type associations.

R-15 [RFC3556] with bandwidth modifiers MAY be supported for specifying RTCP bandwidth as a fraction of the media bandwidth, RTCP fraction allocated to the senders and setting maximum media bit-rate boundaries.

R-16 TODO: any others?

As required by [RFC4566], Section 5.13, JSEP implementations MUST ignore unknown attribute (a=) lines.

5.1.2. Usage Requirements

All session descriptions handled by JSEP endpoints, both local and remote, MUST indicate support for the following specifications. If any of these are absent, this omission MUST be treated as an error.

R-1 ICE, as specified in [RFC5245], MUST be used. Note that the remote endpoint may use a Lite implementation; implementations MUST properly handle remote endpoints which do ICE-Lite.

R-2 DTLS [RFC6347] or DTLS-SRTP [RFC5763] MUST be used, as appropriate for the media type, as specified in [I-D.ietf-rtcweb-security-arch].

5.1.3. Profile Names and Interoperability

For media m= sections, JSEP endpoints MUST support both the "UDP/TLS/RTP/SAVPF" and "TCP/DTLS/RTP/SAVPF" profiles and MUST indicate one of these two profiles for each media m= line they produce in an offer. For data m= sections, JSEP endpoints must support both the "UDP/DTLS/SCTP" and "TCP/DTLS/SCTP" profiles and MUST indicate one of these two profiles for each data m= line they produce in an offer. Because ICE can select either TCP or UDP transport depending on network conditions, both advertisements are consistent with ICE eventually selecting either UDP or TCP.
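As a rough illustration of the profile choice described above, the following sketch maps the kind of m= section and a transport to the profile string placed in an offer. Keying the choice on the transport of the default candidate follows Section 5.2.1; the function name and the "media"/"data" labels are hypothetical.

```python
def offered_profile(section_kind: str, default_candidate_transport: str) -> str:
    """Return the profile string for an offered m= line.

    section_kind is "media" (audio/video) or "data"; these labels are
    illustrative only and do not appear in SDP.
    """
    profiles = {
        ("media", "udp"): "UDP/TLS/RTP/SAVPF",
        ("media", "tcp"): "TCP/DTLS/RTP/SAVPF",
        ("data", "udp"): "UDP/DTLS/SCTP",
        ("data", "tcp"): "TCP/DTLS/SCTP",
    }
    key = (section_kind, default_candidate_transport.lower())
    if key not in profiles:
        raise ValueError("unsupported combination: %r" % (key,))
    return profiles[key]
```

Either string is acceptable in an offer, since ICE may later select UDP or TCP regardless of which profile was advertised.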
Unfortunately, in an attempt at compatibility, some endpoints generate other profile strings even when they mean to support one of these profiles. For instance, an endpoint might generate "RTP/AVP" but supply "a=fingerprint" and "a=rtcp-fb" attributes, indicating its willingness to support "(UDP,TCP)/TLS/RTP/SAVPF". In order to simplify compatibility with such endpoints, JSEP endpoints MUST follow these rules when processing the media m= sections in an offer:

o The profile in any "m=" line in any answer MUST exactly match the profile provided in the offer.

o Any profile matching the following patterns MUST be accepted: "RTP/[S]AVP[F]" and "(UDP/TCP)/TLS/RTP/SAVP[F]"

o Because DTLS-SRTP is REQUIRED, the choice of SAVP or AVP has no effect; support for DTLS-SRTP is determined by the presence of one or more "a=fingerprint" attributes. Note that lack of an "a=fingerprint" attribute will lead to negotiation failure.

o The use of AVPF or AVP simply controls the timing rules used for RTCP feedback. If AVPF is provided, or an "a=rtcp-fb" attribute is present, assume AVPF timing, i.e., a default value of "trr-int=0". Otherwise, assume that AVPF is being used in an AVP compatible mode and use AVP timing, i.e., "trr-int=4".

o For data m= sections, JSEP endpoints MUST support receiving the "UDP/DTLS/SCTP", "TCP/DTLS/SCTP", or "DTLS/SCTP" (for backwards compatibility) profiles.

Note that re-offers by JSEP endpoints MUST use the correct profile strings even if the initial offer/answer exchange used an (incorrect) older profile string.

5.2. Constructing an Offer

When createOffer is called, a new SDP description must be created that includes the functionality specified in [I-D.ietf-rtcweb-rtp-usage]. The exact details of this process are explained below.

5.2.1.
Initial Offers

When createOffer is called for the first time, the result is known as the initial offer.

The first step in generating an initial offer is to generate session-level attributes, as specified in [RFC4566], Section 5. Specifically:

o The first SDP line MUST be "v=0", as specified in [RFC4566], Section 5.1.

o The second SDP line MUST be an "o=" line, as specified in [RFC4566], Section 5.2. The value of the <username> field SHOULD be "-". The value of the <sess-id> field SHOULD be a cryptographically random number. To ensure uniqueness, this number SHOULD be at least 64 bits long. The value of the <sess-version> field SHOULD be zero. The value of the <nettype> <addrtype> <unicast-address> tuple SHOULD be set to a non-meaningful address, such as IN IP4 0.0.0.0, to prevent leaking the local address in this field. As mentioned in [RFC4566], the entire o= line needs to be unique, but selecting a random number for <sess-id> is sufficient to accomplish this.

o The third SDP line MUST be an "s=" line, as specified in [RFC4566], Section 5.3; to match the "o=" line, a single dash SHOULD be used as the session name, e.g. "s=-". Note that this differs from the advice in [RFC4566] which proposes a single space, but as both "o=" and "s=" are meaningless, having the same meaningless value seems clearer.

o Session Information ("i="), URI ("u="), Email Address ("e="), Phone Number ("p="), Bandwidth ("b="), Repeat Times ("r="), and Time Zones ("z=") lines are not useful in this context and SHOULD NOT be included.

o Encryption Keys ("k=") lines do not provide sufficient security and MUST NOT be included.

o A "t=" line MUST be added, as specified in [RFC4566], Section 5.9; both <start-time> and <stop-time> SHOULD be set to zero, e.g. "t=0 0".

o An "a=ice-options" line with the "trickle" option MUST be added, as specified in [I-D.ietf-ice-trickle], Section 4.
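The session-level rules above can be sketched as follows. This is only an illustration under the stated SHOULD-level defaults ("-" as the user name, a 64-bit random session identifier, session version zero, and the non-meaningful address IN IP4 0.0.0.0); the function name is hypothetical.

```python
import secrets
from typing import List


def session_level_lines() -> List[str]:
    """Build the session-level SDP lines of an initial offer."""
    # 64 bits drawn from a cryptographically strong source.
    sess_id = secrets.randbits(64)
    return [
        "v=0",
        # o=<username> <sess-id> <sess-version> <nettype> <addrtype> <unicast-address>
        "o=- %d 0 IN IP4 0.0.0.0" % sess_id,
        "s=-",
        "t=0 0",
        "a=ice-options:trickle",
    ]
```

No i=, u=, e=, p=, b=, r=, z=, or k= lines are emitted, in line with the list above.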
The next step is to generate m= sections, as specified in [RFC4566] Section 5.14, for each MediaStreamTrack that has been added to the PeerConnection via the addStream method. (Note that this method takes a MediaStream, which can contain multiple MediaStreamTracks, and therefore multiple m= sections can be generated even if addStream is only called once.) m= sections MUST be sorted first by the order in which the MediaStreams were added to the PeerConnection, and then by the alphabetical ordering of the media type for the MediaStreamTrack. For example, if a MediaStream containing both an audio and a video MediaStreamTrack is added to a PeerConnection, the resultant m=audio section will precede the m=video section. If a second MediaStream containing an audio MediaStreamTrack was added, it would follow the m=video section.

Each m= section, provided it is not marked as bundle-only, MUST generate a unique set of ICE credentials and gather its own unique set of ICE candidates. Bundle-only m= sections MUST NOT contain any ICE credentials and MUST NOT gather any candidates.

For DTLS, all m= sections MUST use the certificate for the identity that has been specified for the PeerConnection; as a result, they MUST all have the same [RFC4572] fingerprint value, or this value MUST be a session-level attribute.

Each m= section should be generated as specified in [RFC4566], Section 5.14. For the m= line itself, the following rules MUST be followed:

o The port value is set to the port of the default ICE candidate for this m= section, but given that no candidates have yet been gathered, the "dummy" port value of 9 (Discard) MUST be used, as indicated in [I-D.ietf-ice-trickle], Section 5.1.
o To properly indicate use of DTLS, the <proto> field MUST be set to "UDP/TLS/RTP/SAVPF", as specified in [RFC5764], Section 8, if the default candidate uses UDP transport, or "TCP/DTLS/RTP/SAVPF", as specified in [I-D.nandakumar-mmusic-proto-iana-registration] if the default candidate uses TCP transport.

The m= line MUST be followed immediately by a "c=" line, as specified in [RFC4566], Section 5.7. Again, as no candidates have yet been gathered, the "c=" line must contain the "dummy" value "IN IP4 0.0.0.0", as defined in [I-D.ietf-ice-trickle], Section 5.1.

Each m= section MUST include the following attribute lines:

o An "a=mid" line, as specified in [RFC5888], Section 4. When generating mid values, it is RECOMMENDED that the values be 3 bytes or less, to allow them to efficiently fit into the RTP header extension defined in [I-D.ietf-mmusic-sdp-bundle-negotiation], Section 11.

o An "a=rtcp" line, as specified in [RFC3605], Section 2.1, containing the dummy value "9 IN IP4 0.0.0.0", because no candidates have yet been gathered.

o An "a=msid" line, as specified in [I-D.ietf-mmusic-msid], Section 2.

o An "a=sendrecv" line, as specified in [RFC3264], Section 5.1.

o For each supported codec, "a=rtpmap" and "a=fmtp" lines, as specified in [RFC4566], Section 6. The audio and video codecs that MUST be supported are specified in [I-D.ietf-rtcweb-audio] (see Section 3) and [I-D.ietf-rtcweb-video] (see Section 5).

o If this m= section is for media with configurable frame sizes, e.g. audio, an "a=maxptime" line, indicating the smallest of the maximum supported frame sizes out of all codecs included above, as specified in [RFC4566], Section 6.

o If this m= section is for video media, and there are known limitations on the size of images which can be decoded, an "a=imageattr" line, as specified in Section 3.5.
o For each primary codec where RTP retransmission should be used, a corresponding "a=rtpmap" line indicating "rtx" with the clock rate of the primary codec and an "a=fmtp" line that references the payload type of the primary codec, as specified in [RFC4588], Section 8.1.

o For each supported FEC mechanism, "a=rtpmap" and "a=fmtp" lines, as specified in [RFC4566], Section 6. The FEC mechanisms that MUST be supported are specified in [I-D.ietf-rtcweb-fec], Section 6, and specific usage for each media type is outlined in Sections 4 and 5.

o "a=ice-ufrag" and "a=ice-pwd" lines, as specified in [RFC5245], Section 15.4.

o An "a=fingerprint" line for each of the endpoint's certificates, as specified in [RFC4572], Section 5; the digest algorithm used for the fingerprint MUST match that used in the certificate signature.

o An "a=setup" line, as specified in [RFC4145], Section 4, and clarified for use in DTLS-SRTP scenarios in [RFC5763], Section 5. The role value in the offer MUST be "actpass".

o An "a=rtcp-mux" line, as specified in [RFC5761], Section 5.1.1.

o An "a=rtcp-rsize" line, as specified in [RFC5506], Section 5.

o For each supported RTP header extension, an "a=extmap" line, as specified in [RFC5285], Section 5. The list of header extensions that SHOULD/MUST be supported is specified in [I-D.ietf-rtcweb-rtp-usage], Section 5.2. Any header extensions that require encryption MUST be specified as indicated in [RFC6904], Section 4.

o For each supported RTCP feedback mechanism, an "a=rtcp-fb" line, as specified in [RFC4585], Section 4.2. The list of RTCP feedback mechanisms that SHOULD/MUST be supported is specified in [I-D.ietf-rtcweb-rtp-usage], Section 5.1.
o An "a=ssrc" line, as specified in [RFC5576], Section 4.1, indicating the SSRC to be used for sending media, along with the mandatory "cname" source attribute, as specified in Section 6.1, indicating the CNAME for the source. The CNAME MUST be generated in accordance with Section 4.9 of [I-D.ietf-rtcweb-rtp-usage].

o If RTX is supported for this media type, another "a=ssrc" line with the RTX SSRC, and an "a=ssrc-group" line, as specified in [RFC5576], Section 4.2, with semantics set to "FID" and including the primary and RTX SSRCs.

o If FEC is supported for this media type, another "a=ssrc" line with the FEC SSRC, and an "a=ssrc-group" line with semantics set to "FEC-FR" and including the primary and FEC SSRCs, as specified in [RFC5956], Section 4.3. For simplicity, if both RTX and FEC are supported, the FEC SSRC MUST be the same as the RTX SSRC.

o If the bundle policy for this PeerConnection is set to "max-bundle", and this is not the first m= section, or the bundle policy is set to "balanced", and this is not the first m= section for this media type, an "a=bundle-only" line.

Lastly, if a data channel has been created, an m= section MUST be generated for data. The <media> field MUST be set to "application" and the <proto> field MUST be set to "UDP/DTLS/SCTP" if the default candidate uses UDP transport, or "TCP/DTLS/SCTP" if the default candidate uses TCP transport [I-D.ietf-mmusic-sctp-sdp]. The "fmt" value MUST be set to "webrtc-datachannel" as specified in [I-D.ietf-mmusic-sctp-sdp], Section 4.1.

Within the data m= section, the "a=mid", "a=ice-ufrag", "a=ice-pwd", "a=fingerprint", and "a=setup" lines MUST be included as mentioned above, along with an "a=fmtp:webrtc-datachannel" line and an "a=sctp-port" line referencing the SCTP port number as defined in [I-D.ietf-mmusic-sctp-sdp], Section 4.1.
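Two of the rules for media m= sections given earlier in this section, the canonical ordering and the "a=bundle-only" condition, can be sketched as follows. This is an illustration only: tracks are modeled as hypothetical (stream index, media type) tuples, the function names are invented for this example, and any bundle policy other than "max-bundle" or "balanced" is assumed to produce no bundle-only sections.

```python
from typing import List, Tuple


def canonical_order(tracks: List[Tuple[int, str]]) -> List[Tuple[int, str]]:
    """Sort by the order in which streams were added, then by media type.

    Each track is (stream_index, media_type); stream_index stands in for
    the order in which its MediaStream was added.
    """
    return sorted(tracks, key=lambda track: (track[0], track[1]))


def bundle_only_flags(media_types: List[str], policy: str) -> List[bool]:
    """Decide, per m= section in offer order, whether it is bundle-only."""
    flags = []
    seen = set()
    for index, media_type in enumerate(media_types):
        if policy == "max-bundle":
            # Every section except the first one.
            flags.append(index > 0)
        elif policy == "balanced":
            # Every section except the first one of its media type.
            flags.append(media_type in seen)
        else:
            # Assumption: other policies never mark sections bundle-only.
            flags.append(False)
        seen.add(media_type)
    return flags
```

For a stream with audio and video followed by a second stream with audio, the order is audio, video, audio; "max-bundle" then marks the last two sections as bundle-only, while "balanced" marks only the last.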
Once all m= sections have been generated, a session-level "a=group" attribute MUST be added as specified in [RFC5888]. This attribute MUST have semantics "bundle", and MUST include the mid identifiers of each m= section. The effect of this is that the browser offers all m= sections as one bundle group. However, whether the m= sections are bundle-only or not depends on the bundle policy.

The next step is to generate session-level lip sync groups as defined in [RFC5888], Section 7. For each MediaStream with more than one MediaStreamTrack, a group of type "LS" MUST be added that contains the mid values for each MediaStreamTrack in that MediaStream.

Attributes which SDP permits to either be at the session level or the media level SHOULD generally be at the media level even if they are identical. This promotes readability, especially if one of a set of initially identical attributes is subsequently changed.

Attributes other than the ones specified above MAY be included, except for the following attributes which are specifically incompatible with the requirements of [I-D.ietf-rtcweb-rtp-usage], and MUST NOT be included:

o "a=crypto"

o "a=key-mgmt"

o "a=ice-lite"

Note that when bundle is used, any additional attributes that are added MUST follow the advice in [I-D.ietf-mmusic-sdp-mux-attributes] on how those attributes interact with bundle.

Note that these requirements are in some cases stricter than those of SDP. Implementations MUST be prepared to accept compliant SDP even if it would not conform to the requirements for generating SDP in this specification.

5.2.2. Subsequent Offers

When createOffer is called a second (or later) time, or is called after a local description has already been installed, the processing is somewhat different than for an initial offer.
If the initial offer was not applied using setLocalDescription, meaning the PeerConnection is still in the "stable" state, the steps for generating an initial offer should be followed, subject to the following restriction:

o The fields of the "o=" line MUST stay the same except for the <sess-version> field, which MUST increment if the session description changes in any way, including the addition of ICE candidates.

If the initial offer was applied using setLocalDescription, but an answer from the remote side has not yet been applied, meaning the PeerConnection is still in the "have-local-offer" state, an offer is generated by following the steps in the "stable" state above, along with these exceptions:

o The "s=" and "t=" lines MUST stay the same.

o Each "m=" and "c=" line MUST be filled in with the port, protocol, and address of the default candidate for the m= section, as described in [RFC5245], Section 4.3. If ICE checking has already completed for one or more candidate pairs and a candidate pair is in active use, then that pair MUST be used, even if ICE has not yet completed. Note that this differs from the guidance in [RFC5245], Section 9.1.2.2, which only refers to offers created when ICE has completed. Each "a=rtcp" attribute line MUST also be filled in with the port and address of the appropriate default candidate, either the default RTP or RTCP candidate, depending on whether RTCP multiplexing is currently active or not. Note that if RTCP multiplexing is being offered, but not yet active, the default RTCP candidate MUST be used, as indicated in [RFC5761], Section 5.1.3. In each case, if no candidates of the desired type have yet been gathered, dummy values MUST be used, as described above.

o Each "a=mid" line MUST stay the same.
o Each "a=ice-ufrag" and "a=ice-pwd" line MUST stay the same, unless the ICE configuration has changed (either changes to the supported STUN/TURN servers, or the ICE candidate policy), or the "IceRestart" option (Section 5.2.3.3) was specified. If the m= section is bundled into another m= section, it still MUST NOT contain any ICE credentials.

o If the m= section is not bundled into another m= section, for each candidate that has been gathered during the most recent gathering phase (see Section 3.4.1), an "a=candidate" line MUST be added, as defined in [RFC5245], Section 4.3, paragraph 3. If candidate gathering for the section has completed, an "a=end-of-candidates" attribute MUST be added, as described in [I-D.ietf-ice-trickle], Section 9.3. If the m= section is bundled into another m= section, both "a=candidate" and "a=end-of-candidates" MUST be omitted.

o For MediaStreamTracks that are still present, the "a=msid", "a=ssrc", and "a=ssrc-group" lines MUST stay the same.

o If any MediaStreamTracks have been removed, either through the removeStream method or by removing them from an added MediaStream, their m= sections MUST be marked as recvonly by changing the value of the [RFC3264] directional attribute to "a=recvonly". The "a=msid", "a=ssrc", and "a=ssrc-group" lines MUST be removed from the associated m= sections.

o If any MediaStreamTracks have been added, and there exist m= sections of the appropriate media type with no associated MediaStreamTracks (i.e. as described in the preceding paragraph), those m= sections MUST be recycled by adding the new MediaStreamTrack to the m= section. This is done by adding the necessary "a=msid", "a=ssrc", and "a=ssrc-group" lines to the recycled m= section, and removing the "a=recvonly" attribute.
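The last two rules in the list above, marking the m= section of a removed track as recvonly and recycling such a section when a track of the same media type is added, can be sketched as follows. The MSection class and both function names are hypothetical; clearing track_id stands in for removing the "a=msid", "a=ssrc", and "a=ssrc-group" lines, and the sketch assumes that removing "a=recvonly" returns the section to "sendrecv".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MSection:
    kind: str                    # "audio" or "video"
    track_id: Optional[str]      # None once the local track is removed
    direction: str = "sendrecv"


def remove_track(sections: List[MSection], track_id: str) -> None:
    """Mark the m= section of a removed track as recvonly."""
    for section in sections:
        if section.track_id == track_id:
            section.track_id = None
            section.direction = "recvonly"


def add_track(sections: List[MSection], kind: str, track_id: str) -> MSection:
    """Recycle a trackless m= section of the same kind, else append one."""
    for section in sections:
        if section.kind == kind and section.track_id is None:
            section.track_id = track_id
            # Assumption: dropping a=recvonly restores sendrecv.
            section.direction = "sendrecv"
            return section
    new_section = MSection(kind, track_id)
    sections.append(new_section)
    return new_section
```

Recycling keeps the number and order of m= sections stable, so a track replaced by another of the same media type does not grow the offer.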
If the initial offer was applied using setLocalDescription, and an answer from the remote side has been applied using setRemoteDescription, meaning the PeerConnection is in the "have-remote-pranswer" or "stable" states, an offer is generated based on the negotiated session descriptions by following the steps mentioned for the "have-local-offer" state above, along with these exceptions:

o If an m= section exists in the current local description, but does not have an associated local MediaStreamTrack (possibly because said MediaStreamTrack was removed since the last exchange), an m= section MUST still be generated in the new offer, as indicated in [RFC3264], Section 8. The disposition of this section will depend on the state of the remote MediaStreamTrack associated with this m= section. If one exists, and it is still in the "live" state, the new m= section MUST be marked as "a=recvonly", with no "a=msid" or related attributes present. If no remote MediaStreamTrack exists, or it is in the "ended" state, the m= section MUST be marked as rejected, by setting the port to zero, as indicated in [RFC3264], Section 8.2.

o If any MediaStreamTracks have been added, and there exist recvonly m= sections of the appropriate media type with no associated MediaStreamTracks, or rejected m= sections of any media type, those m= sections MUST be recycled, and a local MediaStreamTrack associated with these recycled m= sections until all such existing m= sections have been used. This includes any recvonly or rejected m= sections created by the preceding paragraph.
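The two exceptions above amount to a small decision procedure, sketched below. The function names and the tuple-based return value are hypothetical and chosen only for illustration; "live" and "ended" are the MediaStreamTrack states named in the text.

```python
from typing import Optional, Tuple


def trackless_section_disposition(
    remote_track_state: Optional[str],
) -> Tuple[str, Optional[int]]:
    """Disposition of an m= section that has no local MediaStreamTrack.

    Returns (marking, port_override). port_override is 0 for a rejected
    section and None when the port is left as it was.
    """
    if remote_track_state == "live":
        return ("recvonly", None)
    # No remote track exists, or it is in the "ended" state.
    return ("rejected", 0)


def is_recyclable(
    section_kind: str, marking: str, has_local_track: bool, new_track_kind: str
) -> bool:
    """Whether a newly added track of new_track_kind may reuse a section."""
    if marking == "rejected":
        # Rejected sections of any media type can be recycled.
        return True
    return (
        marking == "recvonly"
        and not has_local_track
        and section_kind == new_track_kind
    )
```

Note the asymmetry: a recvonly section can only be reused by a track of the same media type, whereas a rejected section can be reused by any track.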
In addition, for each non-recycled, non-rejected m= section in the new offer, the following adjustments are made based on the contents of the corresponding m= section in the current remote description:

o The m= line and corresponding "a=rtpmap" and "a=fmtp" lines MUST only include codecs present in the remote description.

o The RTP header extensions MUST only include those that are present in the remote description.

o The RTCP feedback extensions MUST only include those that are present in the remote description.

o The "a=rtcp-mux" line MUST only be added if present in the remote description.

o The "a=rtcp-rsize" line MUST only be added if present in the remote description.

The "a=group:BUNDLE" attribute MUST include the mid identifiers specified in the bundle group in the most recent answer, minus any m= sections that have been marked as rejected, plus any newly added or re-enabled m= sections. In other words, the bundle attribute must contain all m= sections that were previously bundled, as long as they are still alive, as well as any new m= sections.

The "LS" groups are generated in the same way as with initial offers.

5.2.3. Options Handling

The createOffer method takes as a parameter an RTCOfferOptions object. Special processing is performed when generating an SDP description if the following options are present.

5.2.3.1. OfferToReceiveAudio

If the "OfferToReceiveAudio" option is specified, with an integer value of N, and M audio MediaStreamTracks have been added to the PeerConnection, the offer MUST include N non-rejected m= sections with media type "audio", even if N is greater than M.
This allows the offerer to receive audio, including multiple independent streams, even when not sending it; accordingly, the directional attribute on the N-M audio m= sections without associated MediaStreamTracks MUST be set to recvonly.

If N is set to a value less than M, the offer MUST mark the m= sections associated with the M-N most recently added (since the last setLocalDescription) MediaStreamTracks as sendonly. This allows the offerer to indicate that it does not want to receive audio on some or all of its newly created streams. For m= sections that have previously been negotiated, this setting has no effect. [TODO: refer to RTCRtpSender in the future]

For backwards compatibility with pre-standard versions of this specification, a value of "true" is interpreted as equivalent to N=1, and "false" as N=0.

5.2.3.2. OfferToReceiveVideo

If the "OfferToReceiveVideo" option is specified, with an integer value of N, and M video MediaStreamTracks have been added to the PeerConnection, the offer MUST include N non-rejected m= sections with media type "video", even if N is greater than M. This allows the offerer to receive video, including multiple independent streams, even when not sending it; accordingly, the directional attribute on the N-M video m= sections without associated MediaStreamTracks MUST be set to recvonly.

If N is set to a value less than M, the offer MUST mark the m= sections associated with the M-N most recently added (since the last setLocalDescription) MediaStreamTracks as sendonly. This allows the offerer to indicate that it does not want to receive video on some or all of its newly created streams. For m= sections that have previously been negotiated, this setting has no effect.
[TODO: refer to RTCRtpSender in the future]

For backwards compatibility with pre-standard versions of this specification, a value of "true" is interpreted as equivalent to N=1, and "false" as N=0.

5.2.3.3. IceRestart

If the "IceRestart" option is specified, with a value of "true", the offer MUST indicate an ICE restart by generating new ICE ufrag and pwd attributes, as specified in [RFC5245], Section 9.1.1.1. If this option is specified on an initial offer, it has no effect (since a new ICE ufrag and pwd are already generated). Similarly, if the ICE configuration has changed, this option has no effect, since new ufrag and pwd attributes will be generated automatically. This option is primarily useful for reestablishing connectivity in cases where failures are detected by the application.

5.2.3.4. VoiceActivityDetection

If the "VoiceActivityDetection" option is specified, with a value of "true", the offer MUST indicate support for silence suppression in the audio it receives by including comfort noise ("CN") codecs for each offered audio codec, as specified in [RFC3389], Section 5.1, except for codecs that have their own internal silence suppression support. For codecs that have their own internal silence suppression support, the appropriate fmtp parameters for that codec MUST be specified to indicate that silence suppression for received audio is desired. For example, when using the Opus codec, the "usedtx=1" parameter would be specified in the offer. This option allows the endpoint to significantly reduce the amount of audio bandwidth it receives, at the cost of some fidelity, depending on the quality of the remote VAD algorithm.

If the "VoiceActivityDetection" option is specified, with a value of "false", the browser MUST NOT emit "CN" codecs.
For codecs that have 1842 their own internal silence suppression support, the appropriate fmtp 1843 parameters for that codec MUST be specified to indicate that silence 1844 suppression for received audio is not desired. For example, when 1845 using the Opus codec, the "usedtx=0" parameter would be specified in 1846 the offer. 1848 Note that setting the "VoiceActivityDetection" parameter when 1849 generating an offer is a request to receive audio with silence 1850 suppression. It has no impact on whether the local endpoint does 1851 silence suppression for the audio it sends. 1853 The "VoiceActivityDetection" option does not have any impact on the 1854 setting of the "vad" value in the signaling of the client to mixer 1855 audio level header extension described in [RFC6464], Section 4. 1857 5.3. Generating an Answer 1859 When createAnswer is called, a new SDP description must be created 1860 that is compatible with the supplied remote description as well as 1861 the requirements specified in [I-D.ietf-rtcweb-rtp-usage]. The exact 1862 details of this process are explained below. 1864 5.3.1. Initial Answers 1866 When createAnswer is called for the first time after a remote 1867 description has been provided, the result is known as the initial 1868 answer. If no remote description has been installed, an answer 1869 cannot be generated, and an error MUST be returned. 1871 Note that the remote description SDP may not have been created by a 1872 JSEP endpoint and may not conform to all the requirements listed in 1873 Section 5.2. For many cases, this is not a problem. However, if any 1874 mandatory SDP attributes are missing, or functionality listed as 1875 mandatory-to-use above is not present, this MUST be treated as an 1876 error, and MUST cause the affected m= sections to be marked as 1877 rejected. 1879 The first step in generating an initial answer is to generate 1880 session-level attributes. 
The process here is identical to that 1881 indicated in the Initial Offers section above, except that the 1882 "a=ice-options" line, with the "trickle" option as specified in 1883 [I-D.ietf-ice-trickle], Section 4, is only included if such an option 1884 was present in the offer. 1886 The next step is to generate lip sync groups as defined in [RFC5888], 1887 Section 7. For each MediaStream with more than one MediaStreamTrack, 1888 a group of type "LS" MUST be added that contains the mid values for 1889 each MediaStreamTrack in that MediaStream. In some cases this may 1890 result in adding a mid to a given LS group that was not in that LS 1891 group in the associated offer. Although this is not allowed by 1892 [RFC5888], it is allowed when implementing this specification. 1893 [[OPEN ISSUE: This is still under discussion. See: 1894 https://github.com/rtcweb-wg/jsep/issues/162.]] 1896 The next step is to generate m= sections for each m= section that is 1897 present in the remote offer, as specified in [RFC3264], Section 6. 1898 For the purposes of this discussion, any session-level attributes in 1899 the offer that are also valid as media-level attributes SHALL be 1900 considered to be present in each m= section. 1902 The next step is to go through each offered m= section. If there is 1903 a local MediaStreamTrack of the same type which has been added to the 1904 PeerConnection via addStream and not yet associated with a m= 1905 section, and the specific m= section is either sendrecv or recvonly, 1906 the MediaStreamTrack will be associated with the m= section at this 1907 time. MediaStreamTracks are assigned to m= sections using the 1908 canonical order described in Section 5.2.1. If there are more m= 1909 sections of a certain type than MediaStreamTracks, some m= sections 1910 will not have an associated MediaStreamTrack. 
   If there are more MediaStreamTracks of a certain type than compatible
   m= sections, only the first N MediaStreamTracks, where N is the
   number of compatible m= sections, can be associated in the
   constructed answer. The remainder will need to be associated in a
   subsequent offer.

   For each offered m= section, if the associated remote
   MediaStreamTrack has been stopped, and is therefore in state "ended",
   and no local MediaStreamTrack has been associated, the corresponding
   m= section in the answer MUST be marked as rejected by setting the
   port in the m= line to zero, as indicated in [RFC3264], Section 6,
   and further processing for this m= section can be skipped.

   Provided that is not the case, each m= section in the answer should
   then be generated as specified in [RFC3264], Section 6.1. For the m=
   line itself, the following rules must be followed:

   o  The port value would normally be set to the port of the default
      ICE candidate for this m= section, but given that no candidates
      have yet been gathered, the "dummy" port value of 9 (Discard) MUST
      be used, as indicated in [I-D.ietf-ice-trickle], Section 5.1.

   o  The <proto> field MUST be set to exactly match the <proto> field
      for the corresponding m= line in the offer.

   The m= line MUST be followed immediately by a "c=" line, as specified
   in [RFC4566], Section 5.7. Again, as no candidates have yet been
   gathered, the "c=" line must contain the "dummy" value "IN IP4
   0.0.0.0", as defined in [I-D.ietf-ice-trickle], Section 5.1.

   If the offer supports bundle, all m= sections to be bundled must use
   the same ICE credentials and candidates; all m= sections not being
   bundled must use unique ICE credentials and candidates. Each m=
   section MUST include the following:

   o  If and only if present in the offer, an "a=mid" line, as specified
      in [RFC5888], Section 9.1. The "mid" value MUST match that
      specified in the offer.
1949 o An "a=rtcp" line, as specified in [RFC3605], Section 2.1, 1950 containing the dummy value "9 IN IP4 0.0.0.0", because no 1951 candidates have yet been gathered. 1953 o If a local MediaStreamTrack has been associated, an "a=msid" line, 1954 as specified in [I-D.ietf-mmusic-msid], Section 2. 1956 o Depending on the directionality of the offer, the disposition of 1957 any associated remote MediaStreamTrack, and the presence of an 1958 associated local MediaStreamTrack, the appropriate directionality 1959 attribute, as specified in [RFC3264], Section 6.1. If the offer 1960 was sendrecv, and the remote MediaStreamTrack is still "live", and 1961 there is a local MediaStreamTrack that has been associated, the 1962 directionality MUST be set as sendrecv. If the offer was 1963 sendonly, and the remote MediaStreamTrack is still "live", the 1964 directionality MUST be set as recvonly. If the offer was 1965 recvonly, and a local MediaStreamTrack has been associated, the 1966 directionality MUST be set as sendonly. If the offer was 1967 inactive, the directionality MUST be set as inactive. 1969 o For each supported codec that is present in the offer, "a=rtpmap" 1970 and "a=fmtp" lines, as specified in [RFC4566], Section 6, and 1971 [RFC3264], Section 6.1. The audio and video codecs that MUST be 1972 supported are specified in [I-D.ietf-rtcweb-audio] (see Section 3) 1973 and [I-D.ietf-rtcweb-video] (see Section 5). 1975 o If this m= section is for media with configurable frame sizes, 1976 e.g. audio, an "a=maxptime" line, indicating the smallest of the 1977 maximum supported frame sizes out of all codecs included above, as 1978 specified in [RFC4566], Section 6. 1980 o If this m= section is for video media, and there are known 1981 limitations on the size of images which can be decoded, an 1982 "a=imageattr" line, as specified in Section 3.5. 
1984 o If "rtx" is present in the offer, for each primary codec where RTP 1985 retransmission should be used, a corresponding "a=rtpmap" line 1986 indicating "rtx" with the clock rate of the primary codec and an 1987 "a=fmtp" line that references the payload type of the primary 1988 codec, as specified in [RFC4588], Section 8.1. 1990 o For each supported FEC mechanism, "a=rtpmap" and "a=fmtp" lines, 1991 as specified in [RFC4566], Section 6. The FEC mechanisms that 1992 MUST be supported are specified in [I-D.ietf-rtcweb-fec], 1993 Section 6, and specific usage for each media type is outlined in 1994 Sections 4 and 5. 1996 o "a=ice-ufrag" and "a=ice-pwd" lines, as specified in [RFC5245], 1997 Section 15.4. 1999 o An "a=fingerprint" line for each of the endpoint's certificates, 2000 as specified in [RFC4572], Section 5; the digest algorithm used 2001 for the fingerprint MUST match that used in the certificate 2002 signature. 2004 o An "a=setup" line, as specified in [RFC4145], Section 4, and 2005 clarified for use in DTLS-SRTP scenarios in [RFC5763], Section 5. 2006 The role value in the answer MUST be "active" or "passive"; the 2007 "active" role is RECOMMENDED. 2009 o If present in the offer, an "a=rtcp-mux" line, as specified in 2010 [RFC5761], Section 5.1.1. If the "require" RTCP multiplexing 2011 policy is set and no "a=rtcp-mux" line is present in the offer, 2012 then the m=line MUST be marked as rejected by setting the port in 2013 the m= line to zero, as indicated in [RFC3264], Section 6. 2015 o If present in the offer, an "a=rtcp-rsize" line, as specified in 2016 [RFC5506], Section 5. 2018 o For each supported RTP header extension that is present in the 2019 offer, an "a=extmap" line, as specified in [RFC5285], Section 5. 2020 The list of header extensions that SHOULD/MUST be supported is 2021 specified in [I-D.ietf-rtcweb-rtp-usage], Section 5.2. Any header 2022 extensions that require encryption MUST be specified as indicated 2023 in [RFC6904], Section 4. 
   o  For each supported RTCP feedback mechanism that is present in the
      offer, an "a=rtcp-fb" line, as specified in [RFC4585],
      Section 4.2. The list of RTCP feedback mechanisms that SHOULD/MUST
      be supported is specified in [I-D.ietf-rtcweb-rtp-usage],
      Section 5.1.

   o  If a local MediaStreamTrack has been associated, an "a=ssrc" line,
      as specified in [RFC5576], Section 4.1, indicating the SSRC to be
      used for sending media, along with the mandatory "cname" source
      attribute, as specified in [RFC5576], Section 6.1, indicating the
      CNAME for the source. The CNAME MUST be generated in accordance
      with Section 4.9 of [I-D.ietf-rtcweb-rtp-usage].

   o  If a local MediaStreamTrack has been associated, and RTX has been
      negotiated for this m= section, another "a=ssrc" line with the RTX
      SSRC, and an "a=ssrc-group" line, as specified in [RFC5576],
      Section 4.2, with semantics set to "FID" and including the primary
      and RTX SSRCs.

   o  If a local MediaStreamTrack has been associated, and FEC has been
      negotiated for this m= section, another "a=ssrc" line with the FEC
      SSRC, and an "a=ssrc-group" line with semantics set to "FEC-FR"
      and including the primary and FEC SSRCs, as specified in
      [RFC5956], Section 4.3. For simplicity, if both RTX and FEC are
      supported, the FEC SSRC MUST be the same as the RTX SSRC.

   If a data channel m= section has been offered, a m= section MUST also
   be generated for data. The <media> field MUST be set to
   "application" and the <proto> and "fmt" fields MUST be set to exactly
   match the fields in the offer.

   Within the data m= section, the "a=mid", "a=ice-ufrag", "a=ice-pwd",
   "a=candidate", "a=fingerprint", and "a=setup" lines MUST be included
   as mentioned above, along with an "a=fmtp:webrtc-datachannel" line
   and an "a=sctp-port" line referencing the SCTP port number as defined
   in [I-D.ietf-mmusic-sctp-sdp], Section 4.1.
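
   As an informal illustration, the directionality rules given earlier
   in this section for media m= sections can be sketched in Python. The
   function name answer_direction and its parameters are hypothetical.
   Combinations that the rules do not list (for example, a sendrecv
   offer with no associated local MediaStreamTrack) are resolved here by
   assuming that the answerer sends only when it has a local track and
   receives only when the remote track is live; that fallback is an
   assumption of this sketch, not a requirement stated in the text.

```python
def answer_direction(offer_direction: str,
                     remote_track_live: bool,
                     local_track_associated: bool) -> str:
    """Pick the answer's direction attribute for one media m= section.

    Hypothetical helper: the name and signature are illustrative only.
    """
    valid = ("sendrecv", "sendonly", "recvonly", "inactive")
    if offer_direction not in valid:
        raise ValueError(f"unknown direction: {offer_direction!r}")

    # The answerer can receive only if the offerer is willing to send
    # and the remote track has not ended.
    will_receive = (offer_direction in ("sendrecv", "sendonly")
                    and remote_track_live)
    # The answerer can send only if the offerer is willing to receive
    # and a local track has been associated with this m= section.
    will_send = (offer_direction in ("sendrecv", "recvonly")
                 and local_track_associated)

    if will_send and will_receive:
        return "sendrecv"
    if will_receive:
        return "recvonly"
    if will_send:
        return "sendonly"
    # Covers an inactive offer and (by assumption) unlisted combinations.
    return "inactive"
```

   The four cases spelled out in the text map to sendrecv, recvonly,
   sendonly, and inactive respectively; everything else follows from the
   stated fallback assumption.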
   If "a=group" attributes with semantics of "BUNDLE" are offered,
   corresponding session-level "a=group" attributes MUST be added as
   specified in [RFC5888]. These attributes MUST have semantics
   "BUNDLE", and MUST include all the mid identifiers from the offered
   bundle groups that have not been rejected. Note that regardless of
   the presence of "a=bundle-only" in the offer, no m= sections in the
   answer should have an "a=bundle-only" line.

   Attributes that are common between all m= sections MAY be moved to
   session-level, if explicitly defined to be valid at session-level.

   The attributes prohibited in the creation of offers are also
   prohibited in the creation of answers.

5.3.2. Subsequent Answers

   When createAnswer is called a second (or later) time, or is called
   after a local description has already been installed, the processing
   differs somewhat from that for an initial answer.

   If the initial answer was not applied using setLocalDescription,
   meaning the PeerConnection is still in the "have-remote-offer" state,
   the steps for generating an initial answer should be followed,
   subject to the following restriction:

   o  The fields of the "o=" line MUST stay the same except for the
      <session-version> field, which MUST increment if the session
      description changes in any way from the previously generated
      answer.

   If any session description was previously supplied to
   setLocalDescription, an answer is generated by following the steps in
   the "have-remote-offer" state above, along with these exceptions:

   o  The "s=" and "t=" lines MUST stay the same.

   o  Each "m=" and "c=" line MUST be filled in with the port and address
      of the default candidate for the m= section, as described in
      [RFC5245], Section 4.3.
Note, however, that the m= line protocol 2101 need not match the default candidate, because this protocol value 2102 must instead match what was supplied in the offer, as described 2103 above. Each "a=rtcp" attribute line MUST also be filled in with 2104 the port and address of the appropriate default candidate, either 2105 the default RTP or RTCP candidate, depending on whether RTCP 2106 multiplexing is enabled in the answer. In each case, if no 2107 candidates of the desired type have yet been gathered, dummy 2108 values MUST be used, as described in the initial answer section 2109 above. 2111 o Each "a=ice-ufrag" and "a=ice-pwd" line MUST stay the same, unless 2112 the m= section is restarting, in which case new ICE credentials 2113 must be created as specified in [RFC5245], Section 9.2.1.1. If 2114 the m= section is bundled into another m= section, it still MUST 2115 NOT contain any ICE credentials. 2117 o If the m= section is not bundled into another m= section, for each 2118 candidate that has been gathered during the most recent gathering 2119 phase (see Section 3.4.1), an "a=candidate" line MUST be added, as 2120 defined in [RFC5245], Section 4.3., paragraph 3. If candidate 2121 gathering for the section has completed, an "a=end-of-candidates" 2122 attribute MUST be added, as described in [I-D.ietf-ice-trickle], 2123 Section 9.3. If the m= section is bundled into another m= 2124 section, both "a=candidate" and "a=end-of-candidates" MUST be 2125 omitted. 2127 o For MediaStreamTracks that are still present, the "a=msid", 2128 "a=ssrc", and "a=ssrc-group" lines MUST stay the same. 2130 5.3.3. Options Handling 2132 The createAnswer method takes as a parameter an RTCAnswerOptions 2133 object. 
The set of parameters for RTCAnswerOptions is different than 2134 those supported in RTCOfferOptions; the OfferToReceiveAudio, 2135 OfferToReceiveVideo, and IceRestart options mentioned in 2136 Section 5.2.3 are meaningless in the context of generating an answer, 2137 as there is no need to generate extra m= lines in an answer, and ICE 2138 credentials will automatically be changed for all m= lines where the 2139 offerer chose to perform ICE restart. 2141 The following options are supported in RTCAnswerOptions. 2143 5.3.3.1. VoiceActivityDetection 2145 Silence suppression in the answer is handled as described in 2146 Section 5.2.3.4, with one exception: if support for silence 2147 suppression was not indicated in the offer, the 2148 VoiceActivityDetection parameter has no effect, and the answer should 2149 be generated as if VoiceActivityDetection was set to false. This is 2150 done on a per-codec basis (e.g., if the offerer somehow offered 2151 support for CN but set "usedtx=0" for Opus, setting 2152 VoiceActivityDetection to true would result in an answer with CN 2153 codecs and "usedtx=0"). 2155 5.4. Processing a Local Description 2157 When a SessionDescription is supplied to setLocalDescription, the 2158 following steps MUST be performed: 2160 o First, the type of the SessionDescription is checked against the 2161 current state of the PeerConnection: 2163 * If the type is "offer", the PeerConnection state MUST be either 2164 "stable" or "have-local-offer". 2166 * If the type is "pranswer" or "answer", the PeerConnection state 2167 MUST be either "have-remote-offer" or "have-local-pranswer". 2169 o If the type is not correct for the current state, processing MUST 2170 stop and an error MUST be returned. 2172 o Next, the SessionDescription is parsed into a data structure, as 2173 described in the Section 5.6 section below. If parsing fails for 2174 any reason, processing MUST stop and an error MUST be returned. 
   o  Finally, the parsed SessionDescription is applied as described in
      Section 5.7 below.

5.5. Processing a Remote Description

   When a SessionDescription is supplied to setRemoteDescription, the
   following steps MUST be performed:

   o  First, the type of the SessionDescription is checked against the
      current state of the PeerConnection:

      *  If the type is "offer", the PeerConnection state MUST be either
         "stable" or "have-remote-offer".

      *  If the type is "pranswer" or "answer", the PeerConnection state
         MUST be either "have-local-offer" or "have-remote-pranswer".

   o  If the type is not correct for the current state, processing MUST
      stop and an error MUST be returned.

   o  Next, the SessionDescription is parsed into a data structure, as
      described in Section 5.6 below. If parsing fails for any reason,
      processing MUST stop and an error MUST be returned.

   o  Finally, the parsed SessionDescription is applied as described in
      Section 5.8 below.

5.6. Parsing a Session Description

   When a SessionDescription of any type is supplied to
   setLocalDescription or setRemoteDescription, the implementation must
   parse it and reject it if it is invalid. The exact details of this
   process are explained below.

   The SDP contained in the session description object consists of a
   sequence of text lines, each containing a key-value expression, as
   described in [RFC4566], Section 5. The SDP is read, line-by-line,
   and converted to a data structure that contains the deserialized
   information. However, SDP allows many types of lines, not all of
   which are relevant to JSEP applications. For each line, the
   implementation will first ensure it is syntactically correct
   according to its defining ABNF, check that it conforms to [RFC4566]
   and [RFC3264] semantics, and then either parse and store or discard
   the provided value, as described below.
   A partial list of ABNF definitions for SDP attributes can be found in
   Table 1:

   +---------------------------+------------------------------------+
   | Attribute                 | Reference                          |
   +---------------------------+------------------------------------+
   | ptime                     | [RFC4566] Section 9                |
   | maxptime                  | [RFC4566] Section 9                |
   | rtpmap                    | [RFC4566] Section 9                |
   | recvonly                  | [RFC4566] Section 9                |
   | sendrecv                  | [RFC4566] Section 9                |
   | sendonly                  | [RFC4566] Section 9                |
   | inactive                  | [RFC4566] Section 9                |
   | framerate                 | [RFC4566] Section 9                |
   | fmtp                      | [RFC4566] Section 9                |
   | quality                   | [RFC4566] Section 9                |
   | msid                      | [I-D.ietf-mmusic-msid] Section 2   |
   | rtcp                      | [RFC3605] Section 2.1              |
   | setup                     | [RFC4145] Sections 3, 4, and 5     |
   | connection                | [RFC4145] Sections 3, 4, and 5     |
   | fingerprint               | [RFC4572] Section 5                |
   | rtcp-fb                   | [RFC4585] Section 4.2              |
   | candidate                 | [RFC5245] Section 15               |
   | extmap                    | [RFC5285] Section 7                |
   | mid                       | [RFC5888] Sections 4 and 5         |
   | group                     | [RFC5888] Sections 4 and 5         |
   | imageattr                 | [RFC6236] Section 3.1              |
   | extmap (encrypt option)   | [RFC6904] Section 4                |
   +---------------------------+------------------------------------+

                      Table 1: SDP ABNF References

   [TODO: ensure that every line is listed below.]

   If the line is not well-formed, or cannot be parsed as described, the
   parser MUST stop with an error and reject the session description.
   This ensures that implementations do not accidentally misinterpret
   ambiguous SDP.

5.6.1. Session-Level Parsing

   First, the session-level lines are checked and parsed. These lines
   MUST occur in a specific order, and with a specific syntax, as
   defined in [RFC4566], Section 5. Note that while the specific line
   types (e.g. "v=", "c=") MUST occur in the defined order, lines of the
   same type (typically "a=") can occur in any order, and their ordering
   is not meaningful.
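
   The ordering rule for session-level lines can be illustrated with a
   short Python sketch. It assigns each line type a rank following the
   order given in [RFC4566], Section 5, and rejects a description whose
   ranks ever decrease before the first "m=" line. The function name is
   hypothetical, and two simplifications are assumptions of the sketch:
   "t=" and "r=" share one rank so that repeated time descriptions pass,
   and the presence of mandatory lines is not checked.

```python
# Rank of each session-level line type, in RFC 4566 Section 5 order.
# "t" and "r" share a rank (simplification: repeated time descriptions).
_SESSION_RANK = {
    "v": 0, "o": 1, "s": 2, "i": 3, "u": 4, "e": 5, "p": 6,
    "c": 7, "b": 8, "t": 9, "r": 9, "z": 10, "k": 11, "a": 12,
}


def session_lines_in_order(sdp: str) -> bool:
    """Return True if the session-level line types appear in order.

    Hypothetical helper; it checks ordering only, not per-line syntax.
    """
    last_rank = -1
    for line in sdp.splitlines():
        if line.startswith("m="):
            break  # media sections begin; session level is finished
        if len(line) < 2 or line[1] != "=" or line[0] not in _SESSION_RANK:
            return False  # not a recognizable "<type>=<value>" line
        rank = _SESSION_RANK[line[0]]
        if rank < last_rank:
            return False  # a line type appeared after a later type
        last_rank = rank
    return True
```

   Because all "a=" lines share the same rank, they are accepted in any
   relative order, which matches the note that ordering among lines of
   the same type is not meaningful.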
2267 For non-attribute (non-"a=") lines, their sequencing, syntax, and 2268 semantics, are checked, as mentioned above. The following lines are 2269 not meaningful in the JSEP context and MAY be discarded once they 2270 have been checked. 2272 The "c=" line MUST be checked for syntax but its value is not 2273 used. This supersedes the guidance in [RFC5245], Section 6.1, to 2274 use "ice-mismatch" to indicate mismatches between "c=" and the 2275 candidate lines; because JSEP always uses ICE, "ice-mismatch" is 2276 not useful in this context. 2278 The "i=", "u=", "e=", "p=", "t=", "r=", "z=", and "k=" lines are 2279 not used by this specification; they MUST be checked for syntax 2280 but their values are not used. 2282 The remaining lines are processed as follows: 2284 The "v=" line MUST have a version of 0, as specified in [RFC4566], 2285 Section 5.1. 2287 The "o=" line MUST be parsed as specified in [RFC4566], 2288 Section 5.2. 2290 The "b=" line, if present, MUST be parsed as specified in 2291 [RFC4566], Section 5.8, and the bwtype and bandwidth values 2292 stored. 2294 Specific processing MUST be applied for the following session-level 2295 attribute ("a=") lines: 2297 o Any "a=group" lines are parsed as specified in [RFC5888], 2298 Section 5, and the group's semantics and mids are stored. 2300 o If present, a single "a=ice-lite" line is parsed as specified in 2301 [RFC5245], Section 15.3, and a value indicating the presence of 2302 ice-lite is stored. 2304 o If present, a single "a=ice-ufrag" line is parsed as specified in 2305 [RFC5245], Section 15.4, and the ufrag value is stored. 2307 o If present, a single "a=ice-pwd" line is parsed as specified in 2308 [RFC5245], Section 15.4, and the password value is stored. 2310 o If present, a single "a=ice-options" line is parsed as specified 2311 in [RFC5245], Section 15.5, and the set of specified options is 2312 stored. 
   o  Any "a=fingerprint" lines are parsed as specified in [RFC4572],
      Section 5, and the set of fingerprint and algorithm values is
      stored.

   o  If present, a single "a=setup" line is parsed as specified in
      [RFC4145], Section 4, and the setup value is stored.

   o  Any "a=extmap" lines are parsed as specified in [RFC5285],
      Section 5, and their values are stored.

   o  TODO: identity, rtcp-rsize, rtcp-mux, and any other attributes
      valid at session level.

   Once all the session-level lines have been parsed, processing
   continues with the lines in media sections.

5.6.2. Media Section Parsing

   Like the session-level lines, the media section lines MUST occur in
   the specific order and with the specific syntax defined in [RFC4566],
   Section 5.

   The "m=" line itself MUST be parsed as described in [RFC4566],
   Section 5.14, and the media, port, proto, and fmt values stored.

   Following the "m=" line, specific processing MUST be applied for the
   following non-attribute lines:

   o  As with the "c=" line at the session level, the "c=" line MUST be
      parsed according to [RFC4566], Section 5.7, but its value is not
      used.

   o  The "b=" line, if present, MUST be parsed as specified in
      [RFC4566], Section 5.8, and the bwtype and bandwidth values
      stored.

   Specific processing MUST also be applied for the following attribute
   lines:

   o  If present, a single "a=ice-ufrag" line is parsed as specified in
      [RFC5245], Section 15.4, and the ufrag value is stored.

   o  If present, a single "a=ice-pwd" line is parsed as specified in
      [RFC5245], Section 15.4, and the password value is stored.

   o  If present, a single "a=ice-options" line is parsed as specified
      in [RFC5245], Section 15.5, and the set of specified options is
      stored.
   o  Any "a=fingerprint" lines are parsed as specified in [RFC4572],
      Section 5, and the set of fingerprint and algorithm values is
      stored.

   o  If present, a single "a=setup" line is parsed as specified in
      [RFC4145], Section 4, and the setup value is stored.

   If the "m=" proto value indicates use of RTP, as described in
   Section 5.1.3 above, the following attribute lines MUST be
   processed:

   o  The "m=" fmt value MUST be parsed as specified in [RFC4566],
      Section 5.14, and the individual values stored.

   o  Any "a=rtpmap" or "a=fmtp" lines MUST be parsed as specified in
      [RFC4566], Section 6, and their values stored.

   o  If present, a single "a=ptime" line MUST be parsed as described in
      [RFC4566], Section 6, and its value stored.

   o  If present, a single "a=maxptime" line MUST be parsed as described
      in [RFC4566], Section 6, and its value stored.

   o  If present, a single direction attribute line (e.g. "a=sendrecv")
      MUST be parsed as described in [RFC4566], Section 6, and its value
      stored.

   o  Any "a=ssrc" or "a=ssrc-group" attributes MUST be parsed as
      specified in [RFC5576], Sections 4.1-4.2, and their values stored.

   o  Any "a=extmap" attributes MUST be parsed as specified in
      [RFC5285], Section 5, and their values stored.

   o  Any "a=rtcp-fb" attributes MUST be parsed as specified in
      [RFC4585], Section 4.2, and their values stored.

   o  If present, a single "a=rtcp-mux" attribute MUST be parsed as
      specified in [RFC5761], Section 5.1.1, and its presence or absence
      flagged and stored.

   o  If present, a single "a=rtcp-rsize" attribute MUST be parsed as
      specified in [RFC5506], Section 5, and its presence or absence
      flagged and stored.

   o  If present, a single "a=rtcp" attribute MUST be parsed as
      specified in [RFC3605], Section 2.1, but its value is ignored.
2410 o If present, a single "a=msid" attribute MUST be parsed as 2411 specified in [I-D.ietf-mmusic-msid], Section 3.2, and its value 2412 stored. 2414 o Any "a=candidate" attributes MUST be parsed as specified in 2415 [RFC5245], Section 4.3, and their values stored. 2417 o Any "a=remote-candidates" attributes MUST be parsed as specified 2418 in [RFC5245], Section 4.3, but their values are ignored. 2420 o If present, a single "a=end-of-candidates" attribute MUST be 2421 parsed as specified in [I-D.ietf-ice-trickle], Section 8.2, and 2422 its presence or absence flagged and stored. 2424 o Any "a=imageattr" attributes MUST be parsed as specified in 2425 [RFC6236], Section 3, and their values stored. 2427 Otherwise, if the "m=" proto value indicates use of SCTP, the 2428 following attribute lines MUST be processed: 2430 o The "m=" fmt value MUST be parsed as specified in 2431 [I-D.ietf-mmusic-sctp-sdp], Section 4.3, and the application 2432 protocol value stored. 2434 o An "a=sctp-port" attribute MUST be present, and it MUST be parsed 2435 as specified in [I-D.ietf-mmusic-sctp-sdp], Section 5.2, and the 2436 value stored. 2438 o If present, a single "a=max-message-size" attribute MUST be parsed 2439 as specified in [I-D.ietf-mmusic-sctp-sdp], Section 6, and the 2440 value stored. Otherwise, use the specified default. 2442 5.6.3. Semantics Verification 2444 Assuming parsing completes successfully, the parsed description is 2445 then evaluated to ensure internal consistency as well as proper 2446 support for mandatory features. Specifically, the following checks 2447 are performed: 2449 o For each m= section, valid values for each of the mandatory-to-use 2450 features enumerated in Section 5.1.2 MUST be present. These 2451 values MAY either be present at the media level, or inherited from 2452 the session level. 2454 * ICE ufrag and password values, which MUST comply with the size 2455 limits specified in [RFC5245], Section 15.4. 
      *  DTLS setup value, which MUST be set according to the rules
         specified in [RFC5763], Section 5, and MUST be consistent with
         the selected role of the current DTLS connection, if one
         exists. [TODO: may need revision, i.e., use of actpass]

      *  DTLS fingerprint values, where at least one fingerprint MUST be
         present.

   o  Each m= section is also checked to ensure prohibited features are
      not used. If this is a local description, the "ice-lite"
      attribute MUST NOT be specified.

   If this session description is of type "pranswer" or "answer", the
   following additional checks are applied:

   o  The session description must follow the rules defined in
      [RFC3264], Section 6, including the requirement that the number of
      m= sections MUST exactly match the number of m= sections in the
      associated offer.

   o  For each m= section, the media type and protocol values MUST
      exactly match the media type and protocol values in the
      corresponding m= section in the associated offer.

5.7. Applying a Local Description

   The following steps are performed at the media engine level to apply
   a local description.

   First, the parsed parameters are checked to ensure that any
   modifications performed fall within those explicitly permitted by
   Section 6; otherwise, processing MUST stop and an error MUST be
   returned.

   Next, media sections are processed. For each media section, the
   following steps MUST be performed; if any parameters are out of
   bounds, or cannot be applied, processing MUST stop and an error MUST
   be returned.

   o  If this media section is new, begin gathering candidates for it,
      as defined in [RFC5245], Section 4.1.1, unless it has been marked
      as bundle-only.
   o  Otherwise, if the ICE ufrag and password values have changed,
      trigger the ICE Agent to start an ICE restart and begin gathering
      new candidates for the media section, as defined in [RFC5245],
      Section 9.1.1.1, unless it has been marked as bundle-only.

   o  If the media section proto value indicates use of RTP:

      *  If RTCP mux is indicated, prepare to demux RTP and RTCP from
         the RTP ICE component, as specified in [RFC5761],
         Section 5.1.1. If RTCP mux is not indicated, but was indicated
         in a previous description, this MUST result in an error.

      *  For each specified RTP header extension, establish a mapping
         between the extension ID and URI, as described in Section 6 of
         [RFC5285]. If any indicated RTP header extension is unknown,
         this MUST result in an error.

      *  If the MID header extension is supported, prepare to demux RTP
         data intended for this media section based on the MID header
         extension, as described in [I-D.ietf-mmusic-msid], Section 3.2.

      *  For each specified payload type, establish a mapping between
         the payload type ID and the actual media format, as described
         in [RFC3264]. If any indicated payload type is unknown, this
         MUST result in an error.

      *  For each specified "rtx" media format, establish a mapping
         between the RTX payload type and its associated primary payload
         type, as described in [RFC4588], Sections 8.6 and 8.7. If any
         referenced primary payload types are not present, this MUST
         result in an error.

      *  If the directional attribute is of type "sendrecv" or
         "recvonly", enable receipt and decoding of media.

   Finally, if this description is of type "pranswer" or "answer",
   follow the processing defined in Section 5.9 below.

5.8.
Applying a Remote Description 2540 If the answer contains any "a=ice-options" attributes where "trickle" 2541 is listed as an attribute, update the PeerConnection canTrickle 2542 property to be true. Otherwise, set this property to false. 2544 The following steps are performed at the media engine level to apply 2545 a remote description. 2547 The following steps MUST be performed for attributes at the session 2548 level; if any parameters are out of bounds, or cannot be applied, 2549 processing MUST stop and an error MUST be returned. 2551 o For any specified "CT" bandwidth value, set this as the limit for 2552 the maximum total bitrate for all m= sections, as specified in 2553 Section 5.8 of [RFC4566]. The implementation can decide how to 2554 allocate the available bandwidth between m= sections to 2555 simultaneously meet any limits on individual m= sections, as well 2556 as this overall session limit. 2558 o For any specified "RR" or "RS" bandwidth values, handle as 2559 specified in [RFC3556], Section 2. 2561 o Any "AS" bandwidth value MUST be ignored, as the meaning of this 2562 construct at the session level is not well defined. 2564 For each media section, the following steps MUST be performed; if any 2565 parameters are out of bounds, or cannot be applied, processing MUST 2566 stop and an error MUST be returned. 2568 o If the description is of type "offer", and the ICE ufrag or 2569 password changed from the previous remote description, as 2570 described in Section 9.1.1.1 of [RFC5245], mark that an ICE 2571 restart is needed. 2573 o Configure the ICE components associated with this media section to 2574 use the supplied ICE remote ufrag and password for their 2575 connectivity checks. 2577 o Pair any supplied ICE candidates with any gathered local 2578 candidates, as described in Section 5.7 of [RFC5245] and start 2579 connectivity checks with the appropriate credentials. 
   o  If the media section proto value indicates use of RTP:

      *  [TODO: header extensions]

      *  For each specified payload type that is also supported by the
         local implementation, establish a mapping between the payload
         type ID and the actual media format.  [TODO - Justin to add
         more to explain mapping.]  If any indicated payload type is
         unknown, it MUST be ignored.  [TODO: should fail on answers]

      *  For each specified "rtx" media format, establish a mapping
         between the RTX payload type and its associated primary payload
         type, as described in [RFC4588].  If any referenced primary
         payload types are not present, this MUST result in an error.

      *  For each specified fmtp parameter that is supported by the
         local implementation, enable it on the associated payload
         types.

      *  For each specified RTCP feedback mechanism that is supported by
         the local implementation, enable it on the associated payload
         types.

      *  For any specified "TIAS" bandwidth value, set this value as a
         constraint on the maximum RTP bitrate to be used when sending
         media, as specified in [RFC3890].  If a "TIAS" value is not
         present, but an "AS" value is specified, generate a "TIAS"
         value using this formula:

            TIAS = AS * 1000 * 0.95 - (50 * 40 * 8)

         The 1000 converts "AS", which is expressed in kilobits per
         second, into the bits per second used by "TIAS".  The 50 is
         based on 50 packets per second, the 40 is based on an estimate
         of total header size in bytes (hence the factor of 8), and the
         0.95 is to allocate 5% to RTCP.  For example, an "AS" value of
         512 yields a "TIAS" value of 470400.  If more accurate control
         of bandwidth is needed, "TIAS" should be used instead of "AS".

      *  For any "RR" or "RS" bandwidth values, handle as specified in
         [RFC3556], Section 2.

      *  Any specified "CT" bandwidth value MUST be ignored, as the
         meaning of this construct at the media level is not well
         defined.
      *  [TODO: handling of CN, telephone-event, "red"]

      *  If the media section is of type audio:

         +  For any specified "ptime" value, configure the available
            payload types to use the specified packet size.  If the
            specified size is not supported for a payload type, use the
            next closest value instead.

   Finally, if this description is of type "pranswer" or "answer",
   follow the processing defined in Section 5.9 below.

5.9.  Applying an Answer

   In addition to the steps mentioned above for processing a local or
   remote description, the following steps are performed when processing
   a description of type "pranswer" or "answer".

   For each media section, the following steps MUST be performed:

   o  If the media section has been rejected (i.e. port is set to zero
      in the answer), stop any reception or transmission of media for
      this section, and discard any associated ICE components, as
      described in Section 9.2.1.3 of [RFC5245].

   o  If the remote DTLS fingerprint has been changed, tear down the
      existing DTLS connection.

   o  If no valid DTLS connection exists, prepare to start a DTLS
      connection, using the specified roles and fingerprints, on any
      underlying ICE components, once they are active.

   o  If the media section proto value indicates use of RTP:

      *  If the media section has RTCP mux enabled, discard any RTCP
         component, and begin or continue muxing RTCP over the RTP
         component, as specified in [RFC5761], Section 5.1.3.
         Otherwise, transmit RTCP over the RTCP component; if no RTCP
         component exists, because RTCP mux was previously enabled, this
         MUST result in an error.

      *  If the media section has reduced-size RTCP enabled, configure
         the RTCP transmission for this media section to use
         reduced-size RTCP, as specified in [RFC5506].
2669 * If the directional attribute in the answer is of type 2670 "sendrecv" or "sendonly", prepare to start transmitting media 2671 using the specified primary SSRC and one of the selected 2672 payload types, once the underlying transport layers have been 2673 established. Otherwise, stop transmitting RTP media, although 2674 RTCP should still be sent, as described in [RFC3264], 2675 Section 5.1. 2677 o If the media section proto value indicates use of SCTP: 2679 * If no SCTP association yet exists, prepare to initiate a SCTP 2680 association over the associated ICE component and DTLS 2681 connection, using the local SCTP port value from the local 2682 description, and the remote SCTP port value from the remote 2683 description, as described in [I-D.ietf-mmusic-sctp-sdp], 2684 Section 10.2. 2686 If the answer contains valid bundle groups, discard any ICE 2687 components for the m= sections that will be bundled onto the primary 2688 ICE components in each bundle, and begin muxing these m= sections 2689 accordingly, as described in 2690 [I-D.ietf-mmusic-sdp-bundle-negotiation], Section 8.2. 2692 6. Configurable SDP Parameters 2694 It is possible to change elements in the SDP returned from 2695 createOffer before passing it to setLocalDescription. When an 2696 implementation receives modified SDP it MUST either: 2698 o Accept the changes and adjust its behavior to match the SDP. 2700 o Reject the changes and return an error via the error callback. 2702 Changes MUST NOT be silently ignored. 2704 The following elements of the session description MUST NOT be changed 2705 between the createOffer and the setLocalDescription (or between the 2706 createAnswer and the setLocalDescription), since they reflect 2707 transport attributes that are solely under browser control, and the 2708 browser MUST NOT honor an attempt to change them: 2710 o The number, type and port number of m= lines. 2712 o The generated ICE credentials (a=ice-ufrag and a=ice-pwd). 
   o  The set of ICE candidates and their parameters (a=candidate).

   o  The DTLS fingerprint(s) (a=fingerprint).

   o  The contents of bundle groups, bundle-only parameters, or
      "a=rtcp-mux" parameters.

   The following modifications, if made by the application to a
   description between createOffer/createAnswer and the
   setLocalDescription, MUST be honored by the browser:

   o  Remove or reorder codecs (m=)

   The following parameters may be controlled by options passed into
   createOffer/createAnswer.  As an open issue, these changes may also
   be performed by manipulating the SDP returned from
   createOffer/createAnswer, as indicated above, as long as the
   capabilities of the endpoint are not exceeded (e.g. asking for a
   resolution greater than what the endpoint can encode):

   o  [[OPEN ISSUE: This is a placeholder for other modifications, which
      we may continue adding as use cases appear.]]

   Implementations MAY choose to either honor or reject any elements not
   listed in the above two categories, but must do so explicitly as
   described at the beginning of this section.  Note that future
   standards may add new SDP elements to the list of elements which must
   be accepted or rejected, but due to version skew, applications must
   be prepared for implementations to accept changes which must be
   rejected and vice versa.

   The application can also modify the SDP to reduce the capabilities in
   the offer it sends to the far side or the offer that it installs from
   the far side in any way the application sees fit, as long as it is a
   valid SDP offer and specifies a subset of what was in the original
   offer.  This is safe because the answer is not permitted to expand
   capabilities and therefore will just respond to what is actually in
   the offer.
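
   As an illustration of the restrictions in this section, the Python
   sketch below compares the SDP returned by createOffer with an
   application-modified copy and reports whether any element listed
   above as unmodifiable has changed.  It is a rough check under stated
   assumptions, not a description of how any browser implements this:
   the function names and the attribute-prefix list are hypothetical,
   chosen to mirror the bullets in this section, and only the media
   type and port of each m= line are compared, so that removing or
   reordering codecs still passes.

```python
# Hypothetical helper names; the prefix list mirrors the bullets in Section 6
# and is an assumption of this sketch, not a normative set.
PROTECTED_PREFIXES = (
    "a=ice-ufrag:",
    "a=ice-pwd:",
    "a=candidate:",
    "a=fingerprint:",
    "a=group:BUNDLE",
    "a=bundle-only",
    "a=rtcp-mux",
)


def protected_view(sdp):
    """Return one (m-line summary, sorted protected attributes) pair per section.

    Index 0 is the session level, whose m-line summary is None.
    """
    sections = [[None, []]]
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            # Keep only media type and port: the format list may legitimately
            # be pruned or reordered by the application.
            summary = " ".join(line[2:].split()[:2])
            sections.append([summary, []])
        elif line.startswith(PROTECTED_PREFIXES):
            sections[-1][1].append(line)
    return [(summary, sorted(attrs)) for summary, attrs in sections]


def changes_protected_elements(original, modified):
    """True if the modified SDP alters anything this sketch treats as protected."""
    return protected_view(original) != protected_view(modified)
```

   An application could run such a check before calling
   setLocalDescription to catch edits that an implementation is
   expected to reject.  Because the number of m= sections is part of
   the comparison, adding or removing an m= line is also flagged.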
2753 As always, the application is solely responsible for what it sends to 2754 the other party, and all incoming SDP will be processed by the 2755 browser to the extent of its capabilities. It is an error to assume 2756 that all SDP is well-formed; however, one should be able to assume 2757 that any implementation of this specification will be able to 2758 process, as a remote offer or answer, unmodified SDP coming from any 2759 other implementation of this specification. 2761 7. Examples 2763 Note that this example section shows several SDP fragments. To 2764 format in 72 columns, some of the lines in SDP have been split into 2765 multiple lines, where leading whitespace indicates that a line is a 2766 continuation of the previous line. In addition, some blank lines 2767 have been added to improve readability but are not valid in SDP. 2769 More examples of SDP for WebRTC call flows can be found in 2770 [I-D.nandakumar-rtcweb-sdp]. 2772 7.1. Simple Example 2774 This section shows a very simple example that sets up a minimal audio 2775 / video call between two browsers and does not use trickle ICE. The 2776 example in the following section provides a more realistic example of 2777 what would happen in a normal browser to browser connection. 2779 The flow shows Alice's browser initiating the session to Bob's 2780 browser. The messages from Alice's JS to Bob's JS are assumed to 2781 flow over some signaling protocol via a web server. The JS on both 2782 Alice's side and Bob's side waits for all candidates before sending 2783 the offer or answer, so the offers and answers are complete. Trickle 2784 ICE is not used. Both Alice and Bob are using the default policy of 2785 balanced. 
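
   The wrapping convention described at the start of Section 7 can be
   undone mechanically.  The Python sketch below rebuilds SDP lines
   from a fragment formatted like the examples in this section.  It
   assumes, which the text does not state, that a continuation
   beginning with ":" is attached to the previous line without a space
   (as in the wrapped fingerprints) and that any other continuation is
   joined with a single space.  The function name is hypothetical.

```python
import textwrap


def unfold_example_sdp(text):
    """Rebuild SDP lines from a fragment wrapped as in the Section 7 examples."""
    lines = []
    # Remove the indentation shared by the whole fragment, so that only
    # continuation lines still start with whitespace.
    for raw in textwrap.dedent(text).splitlines():
        if not raw.strip():
            continue  # blank lines were added for readability only
        if raw[0] in " \t" and lines:
            piece = raw.strip()
            # Assumption: ":"-led pieces continue a token (fingerprints);
            # everything else was split at a space.
            separator = "" if piece.startswith(":") else " "
            lines[-1] += separator + piece
        else:
            lines.append(raw.strip())
    # SDP lines are terminated by CRLF.
    return "".join(line + "\r\n" for line in lines)
```

   The result can then be handed to an SDP parser; the joining rule is
   a heuristic that happens to fit the fragments shown here and may not
   suit SDP wrapped in other ways.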
2787 // set up local media state 2788 AliceJS->AliceUA: create new PeerConnection 2789 AliceJS->AliceUA: addStream with stream containing audio and video 2790 AliceJS->AliceUA: createOffer to get offer 2791 AliceJS->AliceUA: setLocalDescription with offer 2792 AliceUA->AliceJS: multiple onicecandidate events with candidates 2794 // wait for ICE gathering to complete 2795 AliceUA->AliceJS: onicecandidate event with null candidate 2796 AliceJS->AliceUA: get |offer-A1| from pendingLocalDescription 2798 // |offer-A1| is sent over signaling protocol to Bob 2799 AliceJS->WebServer: signaling with |offer-A1| 2800 WebServer->BobJS: signaling with |offer-A1| 2802 // |offer-A1| arrives at Bob 2803 BobJS->BobUA: create a PeerConnection 2804 BobJS->BobUA: setRemoteDescription with |offer-A1| 2805 BobUA->BobJS: onaddstream event with remoteStream 2807 // Bob accepts call 2808 BobJS->BobUA: addStream with local media 2809 BobJS->BobUA: createAnswer 2810 BobJS->BobUA: setLocalDescription with answer 2811 BobUA->BobJS: multiple onicecandidate events with candidates 2813 // wait for ICE gathering to complete 2814 BobUA->BobJS: onicecandidate event with null candidate 2815 BobJS->BobUA: get |answer-A1| from currentLocalDescription 2817 // |answer-A1| is sent over signaling protocol to Alice 2818 BobJS->WebServer: signaling with |answer-A1| 2819 WebServer->AliceJS: signaling with |answer-A1| 2821 // |answer-A1| arrives at Alice 2822 AliceJS->AliceUA: setRemoteDescription with |answer-A1| 2823 AliceUA->AliceJS: onaddstream event with remoteStream 2825 // media flows 2826 BobUA->AliceUA: media sent from Bob to Alice 2827 AliceUA->BobUA: media sent from Alice to Bob 2829 The SDP for |offer-A1| looks like: 2831 v=0 2832 o=- 4962303333179871722 1 IN IP4 0.0.0.0 2833 s=- 2834 t=0 0 2835 a=group:BUNDLE a1 v1 2836 a=ice-options:trickle 2837 m=audio 56500 UDP/TLS/RTP/SAVPF 96 0 8 97 98 2838 c=IN IP4 192.0.2.1 2839 a=mid:a1 2840 a=rtcp:56501 IN IP4 192.0.2.1 2841 
a=msid:47017fee-b6c1-4162-929c-a25110252400 2842 f83006c5-a0ff-4e0a-9ed9-d3e6747be7d9 2843 a=sendrecv 2844 a=rtpmap:96 opus/48000/2 2845 a=rtpmap:0 PCMU/8000 2846 a=rtpmap:8 PCMA/8000 2847 a=rtpmap:97 telephone-event/8000 2848 a=rtpmap:98 telephone-event/48000 2849 a=maxptime:120 2850 a=ice-ufrag:ETEn1v9DoTMB9J4r 2851 a=ice-pwd:OtSK0WpNtpUjkY4+86js7ZQl 2852 a=fingerprint:sha-256 2853 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 2854 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 2855 a=setup:actpass 2856 a=rtcp-mux 2857 a=rtcp-rsize 2858 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level 2859 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 2860 a=ssrc:1732846380 cname:EocUG1f0fcg/yvY7 2861 a=candidate:3348148302 1 udp 2113937151 192.0.2.1 56500 2862 typ host 2863 a=candidate:3348148302 2 udp 2113937151 192.0.2.1 56501 2864 typ host 2865 a=end-of-candidates 2867 m=video 56502 UDP/TLS/RTP/SAVPF 100 101 2868 c=IN IP4 192.0.2.1 2869 a=rtcp:56503 IN IP4 192.0.2.1 2870 a=mid:v1 2871 a=msid:61317484-2ed4-49d7-9eb7-1414322a7aae 2872 f30bdb4a-5db8-49b5-bcdc-e0c9a23172e0 2873 a=sendrecv 2874 a=rtpmap:100 VP8/90000 2875 a=rtpmap:101 rtx/90000 2876 a=fmtp:101 apt=100 2877 a=ice-ufrag:BGKkWnG5GmiUpdIV 2878 a=ice-pwd:mqyWsAjvtKwTGnvhPztQ9mIf 2879 a=fingerprint:sha-256 2880 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 2881 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 2882 a=setup:actpass 2883 a=rtcp-mux 2884 a=rtcp-rsize 2885 a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:mid 2886 a=rtcp-fb:100 ccm fir 2887 a=rtcp-fb:100 nack 2888 a=rtcp-fb:100 nack pli 2889 a=ssrc:1366781083 cname:EocUG1f0fcg/yvY7 2890 a=ssrc:1366781084 cname:EocUG1f0fcg/yvY7 2891 a=ssrc-group:FID 1366781083 1366781084 2892 a=candidate:3348148302 1 udp 2113937151 192.0.2.1 56502 2893 typ host 2894 a=candidate:3348148302 2 udp 2113937151 192.0.2.1 56503 2895 typ host 2896 a=end-of-candidates 2898 The SDP for |answer-A1| looks like: 2900 v=0 2901 o=- 6729291447651054566 1 IN IP4 0.0.0.0 2902 s=- 2903 
   t=0 0
   a=group:BUNDLE a1 v1
   m=audio 20000 UDP/TLS/RTP/SAVPF 96 0 8 97 98
   c=IN IP4 192.0.2.2
   a=mid:a1
   a=rtcp:20000 IN IP4 192.0.2.2
   a=msid:PI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1
          PI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1a0
   a=sendrecv
   a=rtpmap:96 opus/48000/2
   a=rtpmap:0 PCMU/8000
   a=rtpmap:8 PCMA/8000
   a=rtpmap:97 telephone-event/8000
   a=rtpmap:98 telephone-event/48000
   a=maxptime:120
   a=ice-ufrag:6sFvz2gdLkEwjZEr
   a=ice-pwd:cOTZKZNVlO9RSGsEGM63JXT2
   a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35
          :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08
   a=setup:active
   a=rtcp-mux
   a=rtcp-rsize
   a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
   a=ssrc:3429951804 cname:Q/NWs1ao1HmN4Xa5
   a=candidate:2299743422 1 udp 2113937151 192.0.2.2 20000
          typ host
   a=end-of-candidates
   m=video 20000 UDP/TLS/RTP/SAVPF 100 101
   c=IN IP4 192.0.2.2
   a=rtcp:20001 IN IP4 192.0.2.2
   a=mid:v1
   a=msid:PI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1
          PI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1v0
   a=sendrecv
   a=rtpmap:100 VP8/90000
   a=rtpmap:101 rtx/90000
   a=fmtp:101 apt=100
   a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35
          :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08
   a=setup:active
   a=rtcp-mux
   a=rtcp-rsize
   a=rtcp-fb:100 ccm fir
   a=rtcp-fb:100 nack
   a=rtcp-fb:100 nack pli
   a=ssrc:3229706345 cname:Q/NWs1ao1HmN4Xa5
   a=ssrc:3229706346 cname:Q/NWs1ao1HmN4Xa5
   a=ssrc-group:FID 3229706345 3229706346

7.2.  Normal Examples

   This section shows a typical example of a session between two
   browsers setting up an audio channel and a data channel.  Trickle ICE
   is used in full trickle mode with a bundle policy of max-bundle, an
   RTCP mux policy of require, and a single TURN server.
Later, two 2958 video flows, one for the presenter and one for screen sharing, are 2959 added to the session. This example shows Alice's browser initiating 2960 the session to Bob's browser. The messages from Alice's JS to Bob's 2961 JS are assumed to flow over some signaling protocol via a web server. 2963 // set up local media state 2964 AliceJS->AliceUA: create new PeerConnection 2965 AliceJS->AliceUA: addStream that contains audio track 2966 AliceJS->AliceUA: createDataChannel to get data channel 2967 AliceJS->AliceUA: createOffer to get |offer-B1| 2968 AliceJS->AliceUA: setLocalDescription with |offer-B1| 2970 // |offer-B1| is sent over signaling protocol to Bob 2971 AliceJS->WebServer: signaling with |offer-B1| 2972 WebServer->BobJS: signaling with |offer-B1| 2974 // |offer-B1| arrives at Bob 2975 BobJS->BobUA: create a PeerConnection 2976 BobJS->BobUA: setRemoteDescription with |offer-B1| 2977 BobUA->BobJS: onaddstream with audio track from Alice 2978 // candidates are sent to Bob 2979 AliceUA->AliceJS: onicecandidate event with |candidate-B1| (host) 2980 AliceJS->WebServer: signaling with |candidate-B1| 2981 AliceUA->AliceJS: onicecandidate event with |candidate-B2| (srflx) 2982 AliceJS->WebServer: signaling with |candidate-B2| 2984 WebServer->BobJS: signaling with |candidate-B1| 2985 BobJS->BobUA: addIceCandidate with |candidate-B1| 2986 WebServer->BobJS: signaling with |candidate-B2| 2987 BobJS->BobUA: addIceCandidate with |candidate-B2| 2989 // Bob accepts call 2990 BobJS->BobUA: addStream with local audio stream 2991 BobJS->BobUA: createDataChannel to get data channel 2992 BobJS->BobUA: createAnswer to get |answer-B1| 2993 BobJS->BobUA: setLocalDescription with |answer-B1| 2995 // |answer-B1| is sent to Alice 2996 BobJS->WebServer: signaling with |answer-B1| 2997 WebServer->AliceJS: signaling with |answer-B1| 2998 AliceJS->AliceUA: setRemoteDescription with |answer-B1| 2999 AliceUA->AliceJS: onaddstream event with audio track from Bob 3001 // candidates 
are sent to Alice 3002 BobUA->BobJS: onicecandidate event with |candidate-B3| (host) 3003 BobJS->WebServer: signaling with |candidate-B3| 3004 BobUA->BobJS: onicecandidate event with |candidate-B4| (srflx) 3005 BobJS->WebServer: signaling with |candidate-B4| 3007 WebServer->AliceJS: signaling with |candidate-B3| 3008 AliceJS->AliceUA: addIceCandidate with |candidate-B3| 3009 WebServer->AliceJS: signaling with |candidate-B4| 3010 AliceJS->AliceUA: addIceCandidate with |candidate-B4| 3012 // data channel opens 3013 BobUA->BobJS: ondatachannel event 3014 AliceUA->AliceJS: ondatachannel event 3015 BobUA->BobJS: onopen 3016 AliceUA->AliceJS: onopen 3018 // media is flowing between browsers 3019 BobUA->AliceUA: audio+data sent from Bob to Alice 3020 AliceUA->BobUA: audio+data sent from Alice to Bob 3022 // some time later Bob adds two video streams 3023 // note, no candidates exchanged, because of bundle 3024 BobJS->BobUA: addStream with first video stream 3025 BobJS->BobUA: addStream with second video stream 3026 BobJS->BobUA: createOffer to get |offer-B2| 3027 BobJS->BobUA: setLocalDescription with |offer-B2| 3029 // |offer-B2| is sent to Alice 3030 BobJS->WebServer: signaling with |offer-B2| 3031 WebServer->AliceJS: signaling with |offer-B2| 3032 AliceJS->AliceUA: setRemoteDescription with |offer-B2| 3033 AliceUA->AliceJS: onaddstream event with first video stream 3034 AliceUA->AliceJS: onaddstream event with second video stream 3035 AliceJS->AliceUA: createAnswer to get |answer-B2| 3036 AliceJS->AliceUA: setLocalDescription with |answer-B2| 3038 // |answer-B2| is sent over signaling protocol to Bob 3039 AliceJS->WebServer: signaling with |answer-B2| 3040 WebServer->BobJS: signaling with |answer-B2| 3041 BobJS->BobUA: setRemoteDescription with |answer-B2| 3043 // media is flowing between browsers 3044 BobUA->AliceUA: audio+video+data sent from Bob to Alice 3045 AliceUA->BobUA: audio+video+data sent from Alice to Bob 3047 The SDP for |offer-B1| looks like: 3049 v=0 
3050 o=- 4962303333179871723 1 IN IP4 0.0.0.0 3051 s=- 3052 t=0 0 3053 a=group:BUNDLE a1 d1 3054 a=ice-options:trickle 3055 m=audio 9 UDP/TLS/RTP/SAVPF 96 0 8 97 98 3056 c=IN IP4 0.0.0.0 3057 a=rtcp:9 IN IP4 0.0.0.0 3058 a=mid:a1 3059 a=msid:57017fee-b6c1-4162-929c-a25110252400 3060 e83006c5-a0ff-4e0a-9ed9-d3e6747be7d9 3061 a=sendrecv 3062 a=rtpmap:96 opus/48000/2 3063 a=rtpmap:0 PCMU/8000 3064 a=rtpmap:8 PCMA/8000 3065 a=rtpmap:97 telephone-event/8000 3066 a=rtpmap:98 telephone-event/48000 3067 a=maxptime:120 3068 a=ice-ufrag:ATEn1v9DoTMB9J4r 3069 a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl 3070 a=fingerprint:sha-256 3071 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 3072 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3073 a=setup:actpass 3074 a=rtcp-mux 3075 a=rtcp-rsize 3076 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level 3077 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 3078 a=ssrc:1732846380 cname:FocUG1f0fcg/yvY7 3080 m=application 0 UDP/DTLS/SCTP webrtc-datachannel 3081 c=IN IP4 0.0.0.0 3082 a=bundle-only 3083 a=mid:d1 3084 a=fmtp:webrtc-datachannel max-message-size=65536 3085 a=sctp-port 5000 3086 a=fingerprint:sha-256 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 3087 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3088 a=setup:actpass 3090 The SDP for |candidate-B1| looks like: 3092 candidate:109270923 1 udp 2122194687 192.168.1.2 51556 typ host 3094 The SDP for |candidate-B2| looks like: 3096 candidate:4036177503 1 udp 1685987071 11.22.33.44 52546 typ srflx 3097 raddr 192.168.1.2 rport 51556 3099 The SDP for |answer-B1| looks like: 3101 v=0 3102 o=- 7729291447651054566 1 IN IP4 0.0.0.0 3103 s=- 3104 t=0 0 3105 a=group:BUNDLE a1 d1 3106 a=ice-options:trickle 3107 m=audio 9 UDP/TLS/RTP/SAVPF 96 0 8 97 98 3108 c=IN IP4 0.0.0.0 3109 a=rtcp:9 IN IP4 0.0.0.0 3110 a=mid:a1 3111 a=msid:QI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1 3112 QI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1a0 3113 a=sendrecv 3114 a=rtpmap:96 opus/48000/2 3115 a=rtpmap:0 PCMU/8000 3116 a=rtpmap:8 
PCMA/8000 3117 a=rtpmap:97 telephone-event/8000 3118 a=rtpmap:98 telephone-event/48000 3119 a=maxptime:120 3120 a=ice-ufrag:7sFvz2gdLkEwjZEr 3121 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2 3122 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35 3123 :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08 3124 a=setup:active 3125 a=rtcp-mux 3126 a=rtcp-rsize 3127 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level 3128 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 3129 a=ssrc:4429951804 cname:Q/NWs1ao1HmN4Xa5 3131 m=application 9 UDP/DTLS/SCTP webrtc-datachannel 3132 c=IN IP4 0.0.0.0 3133 a=mid:d1 3134 a=fmtp:webrtc-datachannel max-message-size=65536 3135 a=sctp-port 5000 3136 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35 3137 :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08 3138 a=setup:active 3140 The SDP for |candidate-B3| looks like: 3142 candidate:109270924 1 udp 2122194687 192.168.2.3 61665 typ host 3143 The SDP for |candidate-B4| looks like: 3145 candidate:4036177504 1 udp 1685987071 55.66.77.88 64532 typ srflx 3146 raddr 192.168.2.3 rport 61665 3148 The SDP for |offer-B2| looks like: (note the increment of the version 3149 number in the o= line, and the c= and a=rtcp lines, which indicate 3150 the local candidate that was selected) 3152 v=0 3153 o=- 7729291447651054566 2 IN IP4 0.0.0.0 3154 s=- 3155 t=0 0 3156 a=group:BUNDLE a1 d1 v1 v2 3157 a=ice-options:trickle 3158 m=audio 64532 UDP/TLS/RTP/SAVPF 96 0 8 97 98 3159 c=IN IP4 55.66.77.88 3160 a=rtcp:64532 IN IP4 55.66.77.88 3161 a=mid:a1 3162 a=msid:QI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1 3163 QI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1a0 3164 a=sendrecv 3165 a=rtpmap:96 opus/48000/2 3166 a=rtpmap:0 PCMU/8000 3167 a=rtpmap:8 PCMA/8000 3168 a=rtpmap:97 telephone-event/8000 3169 a=rtpmap:98 telephone-event/48000 3170 a=maxptime:120 3171 a=ice-ufrag:7sFvz2gdLkEwjZEr 3172 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2 3173 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35 3174 
:DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08 3175 a=setup:actpass 3176 a=rtcp-mux 3177 a=rtcp-rsize 3178 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level 3179 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 3180 a=ssrc:4429951804 cname:Q/NWs1ao1HmN4Xa5 3181 a=candidate:109270924 1 udp 2122194687 192.168.2.3 61665 typ host 3182 a=candidate:4036177504 1 udp 1685987071 55.66.77.88 64532 typ srflx 3183 raddr 192.168.2.3 rport 61665 3184 a=candidate:3671762467 1 udp 41819903 66.77.88.99 50416 typ relay 3185 raddr 55.66.77.88 rport 64532 3186 a=end-of-candidates 3188 m=application 64532 UDP/DTLS/SCTP webrtc-datachannel 3189 c=IN IP4 55.66.77.88 3190 a=mid:d1 3191 a=fmtp:webrtc-datachannel max-message-size=65536 3192 a=sctp-port 5000 3193 a=ice-ufrag:7sFvz2gdLkEwjZEr 3194 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2 3195 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35 3196 :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08 3197 a=setup:actpass 3198 a=candidate:109270924 1 udp 2122194687 192.168.2.3 61665 typ host 3199 a=candidate:4036177504 1 udp 1685987071 55.66.77.88 64532 typ srflx 3200 raddr 192.168.2.3 rport 61665 3201 a=candidate:3671762467 1 udp 41819903 66.77.88.99 50416 typ relay 3202 raddr 55.66.77.88 rport 64532 3203 a=end-of-candidates 3205 m=video 0 UDP/TLS/RTP/SAVPF 100 101 3206 c=IN IP4 55.66.77.88 3207 a=bundle-only 3208 a=rtcp:64532 IN IP4 55.66.77.88 3209 a=mid:v1 3210 a=msid:61317484-2ed4-49d7-9eb7-1414322a7aae 3211 f30bdb4a-5db8-49b5-bcdc-e0c9a23172e0 3212 a=sendrecv 3213 a=rtpmap:100 VP8/90000 3214 a=rtpmap:101 rtx/90000 3215 a=fmtp:101 apt=100 3216 a=fingerprint:sha-256 3217 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 3218 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3219 a=setup:actpass 3220 a=rtcp-mux 3221 a=rtcp-rsize 3222 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 3223 a=rtcp-fb:100 ccm fir 3224 a=rtcp-fb:100 nack 3225 a=rtcp-fb:100 nack pli 3226 a=ssrc:1366781083 cname:Q/NWs1ao1HmN4Xa5 3227 a=ssrc:1366781084 
cname:Q/NWs1ao1HmN4Xa5
   a=ssrc-group:FID 1366781083 1366781084

   m=video 0 UDP/TLS/RTP/SAVPF 100 101
   c=IN IP4 55.66.77.88
   a=bundle-only
   a=rtcp:64532 IN IP4 55.66.77.88
   a=mid:v2
   a=msid:71317484-2ed4-49d7-9eb7-1414322a7aae
          f30bdb4a-5db8-49b5-bcdc-e0c9a23172e0
   a=sendrecv
   a=rtpmap:100 VP8/90000
   a=rtpmap:101 rtx/90000
   a=fmtp:101 apt=100
   a=fingerprint:sha-256
          19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04
          :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2
   a=setup:actpass
   a=rtcp-mux
   a=rtcp-rsize
   a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid
   a=rtcp-fb:100 ccm fir
   a=rtcp-fb:100 nack
   a=rtcp-fb:100 nack pli
   a=ssrc:2366781083 cname:Q/NWs1ao1HmN4Xa5
   a=ssrc:2366781084 cname:Q/NWs1ao1HmN4Xa5
   a=ssrc-group:FID 2366781083 2366781084

   The SDP for |answer-B2| looks like: (note the use of setup:passive to
   maintain the existing DTLS roles, and the use of a=recvonly to
   indicate that the video streams are one-way)

   v=0
   o=- 4962303333179871723 2 IN IP4 0.0.0.0
   s=-
   t=0 0
   a=group:BUNDLE a1 d1 v1 v2
   a=ice-options:trickle
   m=audio 52546 UDP/TLS/RTP/SAVPF 96 0 8 97 98
   c=IN IP4 11.22.33.44
   a=rtcp:52546 IN IP4 11.22.33.44
   a=mid:a1
   a=msid:57017fee-b6c1-4162-929c-a25110252400
          e83006c5-a0ff-4e0a-9ed9-d3e6747be7d9
   a=sendrecv
   a=rtpmap:96 opus/48000/2
   a=rtpmap:0 PCMU/8000
   a=rtpmap:8 PCMA/8000
   a=rtpmap:97 telephone-event/8000
   a=rtpmap:98 telephone-event/48000
   a=maxptime:120
   a=ice-ufrag:ATEn1v9DoTMB9J4r
   a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl
   a=fingerprint:sha-256
          19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04
          :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2
   a=setup:passive
   a=rtcp-mux
   a=rtcp-rsize
   a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
   a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid
   a=ssrc:1732846380
cname:FocUG1f0fcg/yvY7 3289 a=candidate:109270923 1 udp 2122194687 192.168.1.2 51556 typ host 3290 a=candidate:4036177503 1 udp 1685987071 11.22.33.44 52546 typ srflx 3291 raddr 192.168.1.2 rport 51556 3292 a=candidate:3671762466 1 udp 41819903 22.33.44.55 61405 typ relay 3293 raddr 11.22.33.44 rport 52546 3294 a=end-of-candidates 3296 m=application 52546 UDP/DTLS/SCTP webrtc-datachannel 3297 c=IN IP4 11.22.33.44 3298 a=mid:d1 3299 a=fmtp:webrtc-datachannel max-message-size=65536 3300 a=sctp-port 5000 3301 a=fingerprint:sha-256 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 3302 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3303 a=setup:passive 3305 m=video 52546 UDP/TLS/RTP/SAVPF 100 101 3306 c=IN IP4 11.22.33.44 3307 a=rtcp:52546 IN IP4 11.22.33.44 3308 a=mid:v1 3309 a=recvonly 3310 a=rtpmap:100 VP8/90000 3311 a=rtpmap:101 rtx/90000 3312 a=fmtp:101 apt=100 3313 a=fingerprint:sha-256 3314 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 3315 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3316 a=setup:passive 3317 a=rtcp-mux 3318 a=rtcp-rsize 3319 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 3320 a=rtcp-fb:100 ccm fir 3321 a=rtcp-fb:100 nack 3322 a=rtcp-fb:100 nack pli 3324 m=video 52546 UDP/TLS/RTP/SAVPF 100 101 3325 c=IN IP4 11.22.33.44 3326 a=rtcp:52546 IN IP4 11.22.33.44 3327 a=mid:v2 3328 a=recvonly 3329 a=rtpmap:100 VP8/90000 3330 a=rtpmap:101 rtx/90000 3331 a=fmtp:101 apt=100 3332 a=fingerprint:sha-256 3333 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 3334 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3336 a=setup:passive 3337 a=rtcp-mux 3338 a=rtcp-rsize 3339 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 3340 a=rtcp-fb:100 ccm fir 3341 a=rtcp-fb:100 nack 3342 a=rtcp-fb:100 nack pli 3344 8. Security Considerations 3346 The IETF has published separate documents 3347 [I-D.ietf-rtcweb-security-arch] [I-D.ietf-rtcweb-security] describing 3348 the security architecture for WebRTC as a whole. 
The remainder of this section describes security considerations for
   this document.

   While formally the JSEP interface is an API, it is better to think of
   it as an Internet protocol, with the JS being untrustworthy from the
   perspective of the browser.  Thus, the threat model of [RFC3552]
   applies.  In particular, JS can call the API in any order and with
   any inputs, including malicious ones.  This is particularly relevant
   when we consider the SDP which is passed to setLocalDescription().
   While correct API usage requires that the application pass in SDP
   which was derived from createOffer() or createAnswer() (perhaps
   suitably modified as described in Section 6), there is no guarantee
   that applications do so.  The browser MUST be prepared for the JS to
   pass in bogus data instead.

   Conversely, the application programmer MUST recognize that the JS
   does not have complete control of browser behavior.  One case that
   bears particular mention is that editing ICE candidates out of the
   SDP or suppressing trickled candidates does not have the expected
   behavior: implementations will still perform checks from those
   candidates even if they are not sent to the other side.  Thus, for
   instance, it is not possible to prevent the remote peer from learning
   your public IP address by removing server reflexive candidates.
   Applications which wish to conceal their public IP address should
   instead configure the ICE agent to use only relay candidates.

9.  IANA Considerations

   This document requires no actions from IANA.

10.  Acknowledgements

   Significant text incorporated in the draft, as well as review, was
   provided by Peter Thatcher, Taylor Brandstetter, Harald Alvestrand
   and Suhas Nandakumar.  Dan Burnett, Neil Stratford, Anant Narayanan,
   Andrew Hutton, Richard Ejzak, Adam Bergkvist and Matthew Kaufman all
   provided valuable feedback on this proposal.

11.
References 3388 11.1. Normative References 3390 [I-D.ietf-ice-trickle] 3391 Ivov, E., Rescorla, E., Uberti, J., and P. Saint-Andre, 3392 "Trickle ICE: Incremental Provisioning of Candidates for 3393 the Interactive Connectivity Establishment (ICE) 3394 Protocol". 3396 [I-D.ietf-mmusic-msid] 3397 Alvestrand, H., "Cross Session Stream Identification in 3398 the Session Description Protocol", draft-ietf-mmusic- 3399 msid-01 (work in progress), August 2013. 3401 [I-D.ietf-mmusic-sctp-sdp] 3402 Loreto, S. and G. Camarillo, "Stream Control Transmission 3403 Protocol (SCTP)-Based Media Transport in the Session 3404 Description Protocol (SDP)", draft-ietf-mmusic-sctp-sdp-04 3405 (work in progress), June 2013. 3407 [I-D.ietf-mmusic-sdp-bundle-negotiation] 3408 Holmberg, C., Alvestrand, H., and C. Jennings, 3409 "Multiplexing Negotiation Using Session Description 3410 Protocol (SDP) Port Numbers", draft-ietf-mmusic-sdp- 3411 bundle-negotiation-04 (work in progress), June 2013. 3413 [I-D.ietf-mmusic-sdp-mux-attributes] 3414 Nandakumar, S., "A Framework for SDP Attributes when 3415 Multiplexing", draft-ietf-mmusic-sdp-mux-attributes-01 3416 (work in progress), February 2014. 3418 [I-D.ietf-rtcweb-audio] 3419 Valin, J. and C. Bran, "WebRTC Audio Codec and Processing 3420 Requirements", draft-ietf-rtcweb-audio-02 (work in 3421 progress), August 2013. 3423 [I-D.ietf-rtcweb-fec] 3424 Uberti, J., "WebRTC Forward Error Correction 3425 Requirements", draft-ietf-rtcweb-fec-00 (work in 3426 progress), February 2015. 3428 [I-D.ietf-rtcweb-rtp-usage] 3429 Perkins, C., Westerlund, M., and J. Ott, "Web Real-Time 3430 Communication (WebRTC): Media Transport and Use of RTP", 3431 draft-ietf-rtcweb-rtp-usage-09 (work in progress), 3432 September 2013. 3434 [I-D.ietf-rtcweb-security] 3435 Rescorla, E., "Security Considerations for WebRTC", draft- 3436 ietf-rtcweb-security-06 (work in progress), January 2014. 
3438 [I-D.ietf-rtcweb-security-arch] 3439 Rescorla, E., "WebRTC Security Architecture", draft-ietf- 3440 rtcweb-security-arch-09 (work in progress), February 2014. 3442 [I-D.ietf-rtcweb-video] 3443 Roach, A., "WebRTC Video Processing and Codec 3444 Requirements", draft-ietf-rtcweb-video-00 (work in 3445 progress), July 2014. 3447 [I-D.nandakumar-mmusic-proto-iana-registration] 3448 Nandakumar, S., "IANA registration of SDP 'proto' 3449 attribute for transporting RTP Media over TCP under 3450 various RTP profiles.", September 2014. 3452 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 3453 Requirement Levels", BCP 14, RFC 2119, March 1997. 3455 [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, 3456 A., Peterson, J., Sparks, R., Handley, M., and E. 3457 Schooler, "SIP: Session Initiation Protocol", RFC 3261, 3458 June 2002. 3460 [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model 3461 with Session Description Protocol (SDP)", RFC 3264, June 3462 2002. 3464 [RFC3552] Rescorla, E. and B. Korver, "Guidelines for Writing RFC 3465 Text on Security Considerations", BCP 72, RFC 3552, July 3466 2003. 3468 [RFC3605] Huitema, C., "Real Time Control Protocol (RTCP) attribute 3469 in Session Description Protocol (SDP)", RFC 3605, October 3470 2003. 3472 [RFC3890] Westerlund, M., "A Transport Independent Bandwidth 3473 Modifier for the Session Description Protocol (SDP)", 3474 RFC 3890, DOI 10.17487/RFC3890, September 2004. 3477 [RFC4145] Yon, D. and G. Camarillo, "TCP-Based Media Transport in 3478 the Session Description Protocol (SDP)", RFC 4145, 3479 September 2005. 3481 [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session 3482 Description Protocol", RFC 4566, July 2006. 3484 [RFC4572] Lennox, J., "Connection-Oriented Media Transport over the 3485 Transport Layer Security (TLS) Protocol in the Session 3486 Description Protocol (SDP)", RFC 4572, July 2006.
3488 [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey, 3489 "Extended RTP Profile for Real-time Transport Control 3490 Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July 3491 2006. 3493 [RFC5245] Rosenberg, J., "Interactive Connectivity Establishment 3494 (ICE): A Protocol for Network Address Translator (NAT) 3495 Traversal for Offer/Answer Protocols", RFC 5245, April 3496 2010. 3498 [RFC5285] Singer, D. and H. Desineni, "A General Mechanism for RTP 3499 Header Extensions", RFC 5285, July 2008. 3501 [RFC5761] Perkins, C. and M. Westerlund, "Multiplexing RTP Data and 3502 Control Packets on a Single Port", RFC 5761, April 2010. 3504 [RFC5888] Camarillo, G. and H. Schulzrinne, "The Session Description 3505 Protocol (SDP) Grouping Framework", RFC 5888, June 2010. 3507 [RFC6236] Johansson, I. and K. Jung, "Negotiation of Generic Image 3508 Attributes in the Session Description Protocol (SDP)", 3509 RFC 6236, May 2011. 3511 [RFC6347] Rescorla, E. and N. Modadugu, "Datagram Transport Layer 3512 Security Version 1.2", RFC 6347, January 2012. 3514 [RFC6904] Lennox, J., "Encryption of Header Extensions in the Secure 3515 Real-time Transport Protocol (SRTP)", RFC 6904, April 3516 2013. 3518 11.2. Informative References 3520 [I-D.nandakumar-rtcweb-sdp] 3521 Nandakumar, S. and C. Jennings, "SDP for the WebRTC", 3522 draft-nandakumar-rtcweb-sdp-02 (work in progress), July 3523 2013. 3525 [I-D.shieh-rtcweb-ip-handling] 3526 Shieh, G. and J. Uberti, "WebRTC IP Address Handling 3527 Recommendations", draft-shieh-rtcweb-ip-handling-00 (work 3528 in progress), October 2015. 3530 [RFC3389] Zopf, R., "Real-time Transport Protocol (RTP) Payload for 3531 Comfort Noise (CN)", RFC 3389, September 2002. 3533 [RFC3556] Casner, S., "Session Description Protocol (SDP) Bandwidth 3534 Modifiers for RTP Control Protocol (RTCP) Bandwidth", 3535 RFC 3556, July 2003. 3537 [RFC3960] Camarillo, G. and H. 
Schulzrinne, "Early Media and Ringing 3538 Tone Generation in the Session Initiation Protocol (SIP)", 3539 RFC 3960, December 2004. 3541 [RFC4568] Andreasen, F., Baugher, M., and D. Wing, "Session 3542 Description Protocol (SDP) Security Descriptions for Media 3543 Streams", RFC 4568, July 2006. 3545 [RFC4588] Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R. 3546 Hakenberg, "RTP Retransmission Payload Format", RFC 4588, 3547 July 2006. 3549 [RFC5506] Johansson, I. and M. Westerlund, "Support for Reduced-Size 3550 Real-Time Transport Control Protocol (RTCP): Opportunities 3551 and Consequences", RFC 5506, April 2009. 3553 [RFC5576] Lennox, J., Ott, J., and T. Schierl, "Source-Specific 3554 Media Attributes in the Session Description Protocol 3555 (SDP)", RFC 5576, June 2009. 3557 [RFC5763] Fischl, J., Tschofenig, H., and E. Rescorla, "Framework 3558 for Establishing a Secure Real-time Transport Protocol 3559 (SRTP) Security Context Using Datagram Transport Layer 3560 Security (DTLS)", RFC 5763, May 2010. 3562 [RFC5764] McGrew, D. and E. Rescorla, "Datagram Transport Layer 3563 Security (DTLS) Extension to Establish Keys for the Secure 3564 Real-time Transport Protocol (SRTP)", RFC 5764, May 2010. 3566 [RFC5956] Begen, A., "Forward Error Correction Grouping Semantics in 3567 the Session Description Protocol", RFC 5956, September 3568 2010. 3570 [RFC6464] Lennox, J., Ed., Ivov, E., and E. Marocco, "A Real-time 3571 Transport Protocol (RTP) Header Extension for Client-to- 3572 Mixer Audio Level Indication", RFC 6464, 3573 DOI 10.17487/RFC6464, December 2011. 3576 [W3C.WD-webrtc-20140617] 3577 Bergkvist, A., Burnett, D., Narayanan, A., and C. 3578 Jennings, "WebRTC 1.0: Real-time Communication Between 3579 Browsers", World Wide Web Consortium WD WD-webrtc- 3580 20140617, June 2014. 3583 Appendix A. Change log 3585 Note: This section will be removed by RFC Editor before publication.
3587 Changes in draft-13: 3589 o Clarified which SDP lines can be ignored. 3591 o Clarified how to handle various received attributes. 3593 o Revised how attributes should be generated for bundled m= lines. 3595 o Remove unused references. 3597 o Remove text advocating use of unilateral PTs. 3599 o Trigger an ICE restart even if the ICE candidate policy is being 3600 made more strict. 3602 o Remove the 'public' ICE candidate policy. 3604 o Move open issues/TODOs into GitHub issues. 3606 o Split local/remote description accessors into current/pending. 3608 o Clarify a=imageattr handling. 3610 o Add more detail on VoiceActivityDetection handling. 3612 o Reference draft-shieh-rtcweb-ip-handling. 3614 o Make it clear when an ICE restart should occur. 3616 o Resolve reference TODOs. 3618 o Remove MSID semantics. 3620 o ice-options are now at session level. 3622 o Default RTCP mux policy is now 'require'. 3624 Changes in draft-12: 3626 o Filled in sections on applying local and remote descriptions. 3628 o Discussed downscaling and upscaling to fulfill imageattr 3629 requirements. 3631 o Updated what SDP can be modified by the application. 3633 o Updated to latest datachannel SDP. 3635 o Allowed multiple fingerprint lines. 3637 o Switched back to IPv4 for dummy candidates. 3639 o Added additional clarity on ICE default candidates. 3641 Changes in draft-11: 3643 o Clarified handling of RTP CNAMEs. 3645 o Updated what SDP lines should be processed or ignored. 3647 o Specified how a=imageattr should be used. 3649 Changes in draft-10: 3651 o TODO 3653 Changes in draft-09: 3655 o Don't return null for {local,remote}Description after close(). 3657 o Changed TCP/TLS to UDP/DTLS in RTP profile names. 3659 o Separate out bundle and mux policy. 3661 o Added specific references to FEC mechanisms. 3663 o Added canTrickle mechanism. 3665 o Added section on subsequent answers and answer options. 3667 o Added text defining set{Local,Remote}Description behavior.
3669 Changes in draft-08: 3671 o Added new example section and removed old examples in appendix. 3673 o Fixed field handling. 3675 o Added text describing a=rtcp attribute. 3677 o Reworked handling of OfferToReceiveAudio and OfferToReceiveVideo 3678 per discussion at IETF 90. 3680 o Reworked trickle ICE handling and its impact on m= and c= lines 3681 per discussion at interim. 3683 o Added max-bundle-and-rtcp-mux policy. 3685 o Added description of maxptime handling. 3687 o Updated ICE candidate pool default to 0. 3689 o Resolved open issues around AppID/receiver-ID. 3691 o Reworked and expanded how changes to the ICE configuration are 3692 handled. 3694 o Some reference updates. 3696 o Editorial clarification. 3698 Changes in draft-07: 3700 o Expanded discussion of VAD and Opus DTX. 3702 o Added a security considerations section. 3704 o Rewrote the section on modifying SDP to require implementations to 3705 clearly indicate whether any given modification is allowed. 3707 o Clarified impact of IceRestart on CreateOffer in local-offer 3708 state. 3710 o Guidance on whether attributes should be defined at the media 3711 level or the session level. 3713 o Renamed "default" bundle policy to "balanced". 3715 o Removed default ICE candidate pool size and clarified how it works. 3717 o Defined a canonical order for assignment of MSTs to m= lines. 3719 o Removed discussion of rehydration. 3721 o Added Eric Rescorla as a draft editor. 3723 o Cleaned up references. 3725 o Editorial cleanup. 3727 Changes in draft-06: 3729 o Reworked handling of m= line recycling. 3731 o Added handling of BUNDLE and bundle-only. 3733 o Clarified handling of rollback. 3735 o Added text describing the ICE Candidate Pool and its behavior. 3737 o Allowed OfferToReceiveX to create multiple recvonly m= sections. 3739 Changes in draft-05: 3741 o Fixed several issues identified in the createOffer/Answer sections 3742 during document review. 3744 o Updated references.
3746 Changes in draft-04: 3748 o Filled in sections on createOffer and createAnswer. 3750 o Added SDP examples. 3752 o Fixed references. 3754 Changes in draft-03: 3756 o Added text describing relationship to W3C specification. 3758 Changes in draft-02: 3760 o Converted from nroff. 3761 o Removed comparisons to old approaches abandoned by the working 3762 group. 3764 o Removed stuff that has moved to W3C specification. 3766 o Align SDP handling with W3C draft. 3768 o Clarified section on forking. 3770 Changes in draft-01: 3772 o Added diagrams for architecture and state machine. 3774 o Added sections on forking and rehydration. 3776 o Clarified meaning of "pranswer" and "answer". 3778 o Reworked how ICE restarts and media directions are controlled. 3780 o Added list of parameters that can be changed in a description. 3782 o Updated suggested API and examples to match latest thinking. 3784 o Suggested API and examples have been moved to an appendix. 3786 Changes in draft-00: 3788 o Migrated from draft-uberti-rtcweb-jsep-02. 3790 Authors' Addresses 3792 Justin Uberti 3793 Google 3794 747 6th Ave S 3795 Kirkland, WA 98033 3796 USA 3798 Email: justin@uberti.name 3800 Cullen Jennings 3801 Cisco 3802 170 West Tasman Drive 3803 San Jose, CA 95134 3804 USA 3806 Email: fluffy@iii.ca 3807 Eric Rescorla (editor) 3808 Mozilla 3809 331 Evelyn Ave 3810 Mountain View, CA 94041 3811 USA 3813 Email: ekr@rtfm.com