Network Working Group                                          J. Uberti
Internet-Draft                                                    Google
Intended status: Standards Track                             C. Jennings
Expires: January 6, 2016                                           Cisco
                                                        E. Rescorla, Ed.
                                                                 Mozilla
                                                            July 5, 2015

               Javascript Session Establishment Protocol
                       draft-ietf-rtcweb-jsep-11

Abstract

   This document describes the mechanisms for allowing a Javascript
   application to control the signaling plane of a multimedia session
   via the interface specified in the W3C RTCPeerConnection API, and
   discusses how this relates to existing signaling protocols.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 6, 2016.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
     1.1. General Design of JSEP
     1.2. Other Approaches Considered
   2. Terminology
   3. Semantics and Syntax
     3.1. Signaling Model
     3.2. Session Descriptions and State Machine
     3.3. Session Description Format
     3.4. ICE
       3.4.1. ICE Gathering Overview
       3.4.2. ICE Candidate Trickling
         3.4.2.1. ICE Candidate Format
       3.4.3. ICE Candidate Policy
       3.4.4. ICE Candidate Pool
     3.5. Video Size Negotiation
       3.5.1. Creating an imageattr Attribute
       3.5.2. Interpreting an imageattr Attribute
     3.6. Interactions With Forking
       3.6.1. Sequential Forking
       3.6.2. Parallel Forking
   4. Interface
     4.1. Methods
       4.1.1. Constructor
       4.1.2. createOffer
       4.1.3. createAnswer
       4.1.4. SessionDescriptionType
         4.1.4.1. Use of Provisional Answers
         4.1.4.2. Rollback
       4.1.5. setLocalDescription
       4.1.6. setRemoteDescription
       4.1.7. localDescription
       4.1.8. remoteDescription
       4.1.9. canTrickleIceCandidates
       4.1.10. setConfiguration
       4.1.11. addIceCandidate
   5. SDP Interaction Procedures
     5.1. Requirements Overview
       5.1.1. Implementation Requirements
       5.1.2. Usage Requirements
       5.1.3. Profile Names and Interoperability
     5.2. Constructing an Offer
       5.2.1. Initial Offers
       5.2.2. Subsequent Offers
       5.2.3. Options Handling
         5.2.3.1. OfferToReceiveAudio
         5.2.3.2. OfferToReceiveVideo
         5.2.3.3. IceRestart
         5.2.3.4. VoiceActivityDetection
     5.3. Generating an Answer
       5.3.1. Initial Answers
       5.3.2. Subsequent Answers
       5.3.3. Options Handling
         5.3.3.1. VoiceActivityDetection
     5.4. Processing a Local Description
     5.5. Processing a Remote Description
     5.6. Parsing a Session Description
       5.6.1. Session-Level Parsing
       5.6.2. Media Section Parsing
       5.6.3. Semantics Verification
     5.7. Applying a Local Description
     5.8. Applying a Remote Description
     5.9. Applying an Answer
   6. Configurable SDP Parameters
   7. Examples
     7.1. Simple Example
     7.2. Normal Examples
   8. Security Considerations
   9. IANA Considerations
   10. Acknowledgements
   11. References
     11.1. Normative References
     11.2. Informative References
   Appendix A. Change log
   Authors' Addresses

1. Introduction

   This document describes how the W3C WebRTC RTCPeerConnection
   interface [W3C.WD-webrtc-20140617] is used to control the setup,
   management and teardown of a multimedia session.

1.1. General Design of JSEP

   The thinking behind WebRTC call setup has been to fully specify and
   control the media plane, but to leave the signaling plane up to the
   application as much as possible. The rationale is that different
   applications may prefer to use different protocols, such as the
   existing SIP or Jingle call signaling protocols, or something custom
   to the particular application, perhaps for a novel use case. In this
   approach, the key information that needs to be exchanged is the
   multimedia session description, which specifies the transport and
   media configuration information necessary to establish the media
   plane.

   With these considerations in mind, this document describes the
   Javascript Session Establishment Protocol (JSEP) that allows for full
   control of the signaling state machine from Javascript. JSEP removes
   the browser almost entirely from the core signaling flow, which is
   instead handled by the Javascript making use of two interfaces: (1)
   passing in local and remote session descriptions and (2) interacting
   with the ICE state machine.

   In this document, the use of JSEP is described as if it always occurs
   between two browsers. Note, though, that in many cases it will
   actually be between a browser and some kind of server, such as a
   gateway or MCU. This distinction is invisible to the browser; it just
   follows the instructions it is given via the API.

   JSEP's handling of session descriptions is simple and
   straightforward. Whenever an offer/answer exchange is needed, the
   initiating side creates an offer by calling a createOffer() API. The
   application optionally modifies that offer, and then uses it to set
   up its local config via the setLocalDescription() API.
   The offer is
   then sent off to the remote side over its preferred signaling
   mechanism (e.g., WebSockets); upon receipt of that offer, the remote
   party installs it using the setRemoteDescription() API.

   To complete the offer/answer exchange, the remote party uses the
   createAnswer() API to generate an appropriate answer, applies it
   using the setLocalDescription() API, and sends the answer back to the
   initiator over the signaling channel. When the initiator gets that
   answer, it installs it using the setRemoteDescription() API, and
   initial setup is complete. This process can be repeated for
   additional offer/answer exchanges.

   Regarding ICE [RFC5245], JSEP decouples the ICE state machine from
   the overall signaling state machine, as the ICE state machine must
   remain in the browser, because only the browser has the necessary
   knowledge of candidates and other transport info. Performing this
   separation also provides additional flexibility; in protocols that
   decouple session descriptions from transport, such as Jingle, the
   session description can be sent immediately and the transport
   information can be sent when available. In protocols that don't,
   such as SIP, the information can be used in the aggregated form.
   Sending transport information separately can allow for faster ICE and
   DTLS startup, since ICE checks can start as soon as any transport
   information is available rather than waiting for all of it.

   Through its abstraction of signaling, the JSEP approach does require
   the application to be aware of the signaling process.
   While the
   application does not need to understand the contents of session
   descriptions to set up a call, the application must call the right
   APIs at the right times, convert the session descriptions and ICE
   information into the defined messages of its chosen signaling
   protocol, and perform the reverse conversion on the messages it
   receives from the other side.

   One way to mitigate this is to provide a Javascript library that
   hides this complexity from the developer; said library would
   implement a given signaling protocol along with its state machine and
   serialization code, presenting a higher-level call-oriented interface
   to the application developer. For example, libraries exist to adapt
   the JSEP API into an API suitable for SIP or XMPP. Thus, JSEP
   provides greater control for the experienced developer without
   forcing any additional complexity on the novice developer.

1.2. Other Approaches Considered

   One approach that was considered instead of JSEP was to include a
   lightweight signaling protocol. Instead of providing session
   descriptions to the API, the API would produce and consume messages
   from this protocol. While providing a more high-level API, this put
   more control of signaling within the browser, forcing the browser to
   understand and handle concepts like signaling glare. In addition, it
   prevented the application from driving the state machine to a desired
   state, as is needed in the page reload case.

   A second approach that was considered but not chosen was to decouple
   the management of the media control objects from session
   descriptions, instead offering APIs that would control each component
   directly.
   This was rejected based on a feeling that requiring
   exposure of this level of complexity to the application programmer
   would not be beneficial; it would result in an API where even a
   simple example would require a significant amount of code to
   orchestrate all the needed interactions, as well as creating a large
   API surface that needed to be agreed upon and documented. In
   addition, these API points could be called in any order, resulting in
   a more complex set of interactions with the media subsystem than the
   JSEP approach, which specifies how session descriptions are to be
   evaluated and applied.

   One variation on JSEP that was considered was to keep the basic
   session description-oriented API, but to move the mechanism for
   generating offers and answers out of the browser. Instead of
   providing createOffer/createAnswer methods within the browser, this
   approach would instead expose a getCapabilities API which would
   provide the application with the information it needed in order to
   generate its own session descriptions. This increases the amount of
   work that the application needs to do; it needs to know how to
   generate session descriptions from capabilities, and especially how
   to generate the correct answer from an arbitrary offer and the
   supported capabilities. While this could certainly be addressed by
   using a library like the one mentioned above, it basically forces the
   use of said library even for a simple example. Providing
   createOffer/createAnswer avoids this problem, but still allows
   applications to generate their own offers/answers (to a large extent)
   if they choose, using the description generated by createOffer as an
   indication of the browser's capabilities.

2. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3. Semantics and Syntax

3.1. Signaling Model

   JSEP does not specify a particular signaling model or state machine,
   other than the generic need to exchange session descriptions in the
   fashion described by [RFC3264] (offer/answer) in order for both sides
   of the session to know how to conduct the session. JSEP provides
   mechanisms to create offers and answers, as well as to apply them to
   a session. However, the browser is totally decoupled from the actual
   mechanism by which these offers and answers are communicated to the
   remote side, including addressing, retransmission, forking, and glare
   handling. These issues are left entirely up to the application; the
   application has complete control over which offers and answers get
   handed to the browser, and when.

       +-----------+                               +-----------+
       |  Web App  |<--- App-Specific Signaling -->|  Web App  |
       +-----------+                               +-----------+
             ^                                            ^
             |  SDP                                       |  SDP
             V                                            V
       +-----------+                                +-----------+
       |  Browser  |<----------- Media ------------>|  Browser  |
       +-----------+                                +-----------+

                      Figure 1: JSEP Signaling Model

3.2. Session Descriptions and State Machine

   In order to establish the media plane, the user agent needs specific
   parameters to indicate what to transmit to the remote side, as well
   as how to handle the media that is received. These parameters are
   determined by the exchange of session descriptions in offers and
   answers, and there are certain details to this process that must be
   handled in the JSEP APIs.

   Whether a session description applies to the local side or the remote
   side affects the meaning of that description.
   For example, the list
   of codecs sent to a remote party indicates what the local side is
   willing to receive, which, when intersected with the set of codecs
   the remote side supports, specifies what the remote side should send.
   However, not all parameters follow this rule; for example, the
   DTLS-SRTP parameters [RFC5763] sent to a remote party indicate what
   certificate the local side will use in DTLS setup, and thereby what
   the remote party should expect to receive; the remote party will have
   to accept these parameters, with no option to choose different
   values.

   In addition, various RFCs put different conditions on the format of
   offers versus answers. For example, an offer may propose an
   arbitrary number of media streams (i.e., m= sections), but an answer
   must contain the exact same number as the offer.

   Lastly, while the exact media parameters are known only after an
   offer and an answer have been exchanged, it is possible for the
   offerer to receive media after they have sent an offer and before
   they have received an answer. To properly process incoming media in
   this case, the offerer's media handler must be aware of the details
   of the offer before the answer arrives.

   Therefore, in order to handle session descriptions properly, the user
   agent needs:

   1. To know if a session description pertains to the local or remote
      side.

   2. To know if a session description is an offer or an answer.

   3. To allow the offer to be specified independently of the answer.

   JSEP addresses this by adding both setLocalDescription and
   setRemoteDescription methods and having session description objects
   contain a type field indicating the type of session description being
   supplied.
   This satisfies the requirements listed above for both the
   offerer, who first calls setLocalDescription(sdp [offer]) and then
   later setRemoteDescription(sdp [answer]), as well as for the
   answerer, who first calls setRemoteDescription(sdp [offer]) and then
   later setLocalDescription(sdp [answer]).

   JSEP also allows for an answer to be treated as provisional by the
   application. Provisional answers provide a way for an answerer to
   communicate initial session parameters back to the offerer, in order
   to allow the session to begin, while allowing a final answer to be
   specified later. This concept of a final answer is important to the
   offer/answer model; when such an answer is received, any extra
   resources allocated by the caller can be released, now that the exact
   session configuration is known. These "resources" can include things
   like extra ICE components, TURN candidates, or video decoders.
   Provisional answers, on the other hand, do no such deallocation; as a
   result, multiple dissimilar provisional answers can be received and
   applied during call setup.

   In [RFC3264], the constraint at the signaling level is that only one
   offer can be outstanding for a given session, but at the media stack
   level, a new offer can be generated at any point. For example, when
   using SIP for signaling, if one offer is sent, then cancelled using a
   SIP CANCEL, another offer can be generated even though no answer was
   received for the first offer. To support this, the JSEP media layer
   can provide an offer via the createOffer() method whenever the
   Javascript application needs one for the signaling. The answerer can
   send back zero or more provisional answers, and finally end the
   offer-answer exchange by sending a final answer. The state machine
The state machine 362 for this is as follows: 364 setRemote(OFFER) setLocal(PRANSWER) 365 /-----\ /-----\ 366 | | | | 367 v | v | 368 +---------------+ | +---------------+ | 369 | |----/ | |----/ 370 | | setLocal(PRANSWER) | | 371 | Remote-Offer |------------------- >| Local-Pranswer| 372 | | | | 373 | | | | 374 +---------------+ +---------------+ 375 ^ | | 376 | | setLocal(ANSWER) | 377 setRemote(OFFER) | | 378 | V setLocal(ANSWER) | 379 +---------------+ | 380 | | | 381 | |<---------------------------+ 382 | Stable | 383 | |<---------------------------+ 384 | | | 385 +---------------+ setRemote(ANSWER) | 386 ^ | | 387 | | setLocal(OFFER) | 388 setRemote(ANSWER) | | 389 | V | 390 +---------------+ +---------------+ 391 | | | | 392 | | setRemote(PRANSWER) | | 393 | Local-Offer |------------------- >|Remote-Pranswer| 394 | | | | 395 | |----\ | |----\ 396 +---------------+ | +---------------+ | 397 ^ | ^ | 398 | | | | 399 \-----/ \-----/ 400 setLocal(OFFER) setRemote(PRANSWER) 402 Figure 2: JSEP State Machine 404 Aside from these state transitions there is no other difference 405 between the handling of provisional ("pranswer") and final ("answer") 406 answers. 408 3.3. Session Description Format 410 In the WebRTC specification, session descriptions are formatted as 411 SDP messages. While this format is not optimal for manipulation from 412 Javascript, it is widely accepted, and frequently updated with new 413 features. Any alternate encoding of session descriptions would have 414 to keep pace with the changes to SDP, at least until the time that 415 this new encoding eclipsed SDP in popularity. As a result, JSEP 416 currently uses SDP as the internal representation for its session 417 descriptions. 419 However, to simplify Javascript processing, and provide for future 420 flexibility, the SDP syntax is encapsulated within a 421 SessionDescription object, which can be constructed from SDP, and be 422 serialized out to SDP. 
   If future specifications agree on a JSON
   format for session descriptions, we could easily enable this object
   to generate and consume that JSON.

   Other methods may be added to SessionDescription in the future to
   simplify handling of SessionDescriptions from Javascript. In the
   meantime, Javascript libraries can be used to perform these
   manipulations.

   Note that most applications should be able to treat the
   SessionDescriptions produced and consumed by these various API calls
   as opaque blobs; that is, the application will not need to read or
   change them. The W3C WebRTC API specification will provide
   appropriate APIs to allow the application to control various session
   parameters, which will provide the necessary information to the
   browser about what sort of SessionDescription to produce.

3.4. ICE

3.4.1. ICE Gathering Overview

   JSEP gathers ICE candidates as needed by the application. Collection
   of ICE candidates is referred to as a gathering phase, and this is
   triggered either by the addition of a new or recycled m= line to the
   local session description, or new ICE credentials in the description,
   indicating an ICE restart. Use of new ICE credentials can be
   triggered explicitly by the application, or implicitly by the browser
   in response to changes in the ICE configuration.

   When a new gathering phase starts, the ICE Agent will notify the
   application that gathering is occurring through an event. Then, when
   each new ICE candidate becomes available, the ICE Agent will supply
   it to the application via an additional event; these candidates will
   also automatically be added to the local session description.

   Finally, when all candidates have been gathered, an event will be
   dispatched to signal that the gathering process is complete.
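
   As a non-normative illustration, the sequence of notifications in a
   single gathering phase can be modeled with a toy, synchronous
   stand-in for the ICE Agent. The names ToyIceAgent and
   GatheringEvent, and the event kinds used here, are hypothetical and
   are not defined by JSEP or the W3C API; a real ICE Agent discovers
   candidates asynchronously.

```typescript
// Hypothetical event shapes for one gathering phase (illustration only).
type GatheringEvent =
  | { kind: "gathering" }
  | { kind: "candidate"; candidate: string }
  | { kind: "complete" };

// Toy stand-in for the ICE Agent; not a real browser or library class.
class ToyIceAgent {
  // Lines of the local session description; candidates are appended here.
  readonly localDescription: string[] = [];
  private readonly notify: (event: GatheringEvent) => void;

  constructor(notify: (event: GatheringEvent) => void) {
    this.notify = notify;
  }

  // Runs one gathering phase over an already-known candidate list.
  gather(candidates: readonly string[]): void {
    // 1. Tell the application that gathering has started.
    this.notify({ kind: "gathering" });
    for (const candidate of candidates) {
      // 2. Add each candidate to the local description, then announce it.
      this.localDescription.push(`a=${candidate}`);
      this.notify({ kind: "candidate", candidate });
    }
    // 3. Signal that the gathering process is complete.
    this.notify({ kind: "complete" });
  }
}

// Usage: record the order in which the application sees the events.
const seen: string[] = [];
const agent = new ToyIceAgent((event) => {
  seen.push(event.kind);
});
agent.gather([
  "candidate:1 1 UDP 2130706431 192.0.2.1 50000 typ host",
  "candidate:2 1 UDP 1694498815 198.51.100.7 50001 typ srflx",
]);
```

   In this model the application always observes the start
   notification first, one notification per candidate next, and the
   completion notification last, while the local description
   accumulates the same candidates as a= lines.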

   Note that gathering phases only gather the candidates needed by
   new/recycled/restarting m= lines; other m= lines continue to use
   their existing candidates.

3.4.2. ICE Candidate Trickling

   Candidate trickling is a technique through which a caller may
   incrementally provide candidates to the callee after the initial
   offer has been dispatched; the semantics of "Trickle ICE" are defined
   in [I-D.ietf-mmusic-trickle-ice]. This process allows the callee to
   begin acting upon the call and setting up the ICE (and perhaps DTLS)
   connections immediately, without having to wait for the caller to
   gather all possible candidates. This results in faster media setup
   in cases where gathering is not performed prior to initiating the
   call.

   JSEP supports optional candidate trickling by providing APIs, as
   described above, that provide control and feedback on the ICE
   candidate gathering process. Applications that support candidate
   trickling can send the initial offer immediately and send individual
   candidates when they are notified of a new candidate; applications
   that do not support this feature can simply wait for the indication
   that gathering is complete, and then create and send their offer,
   with all the candidates, at this time.

   Upon receipt of trickled candidates, the receiving application will
   supply them to its ICE Agent. This triggers the ICE Agent to start
   using the new remote candidates for connectivity checks.

3.4.2.1. ICE Candidate Format

   As with session descriptions, the syntax of the IceCandidate object
   provides some abstraction, but can be easily converted to and from
   the SDP candidate lines.

   The candidate lines are the only SDP information that is contained
   within IceCandidate, as they represent the only information needed
   that is not present in the initial offer (i.e., for trickle
   candidates).
   This information is carried with the same syntax as the
   "candidate-attribute" field defined for ICE. For example:

      candidate:1 1 UDP 1694498815 192.0.2.33 10000 typ host

   The IceCandidate object also contains fields to indicate which m=
   line it should be associated with. The m= line can be identified in
   one of two ways: either by an m= line index, or a MID. The m= line
   index is a zero-based index, with index N referring to the (N+1)th m=
   line in the SDP sent by the entity which sent the IceCandidate. The
   MID uses the "media stream identification" attribute, as defined in
   [RFC5888], Section 4, to identify the m= line. JSEP implementations
   creating an ICE Candidate object MUST populate both of these fields.
   Implementations receiving an ICE Candidate object MUST use the MID if
   present, or the m= line index, if not (as it could have come from a
   non-JSEP endpoint).

3.4.3. ICE Candidate Policy

   Typically, when gathering ICE candidates, the browser will gather all
   possible forms of initial candidates - host, server reflexive, and
   relay. However, in certain cases, applications may want to have more
   specific control over the gathering process, due to privacy or
   related concerns. For example, one may want to suppress the use of
   host candidates, to avoid exposing information about the local
   network, or go as far as only using relay candidates, to leak as
   little location information as possible (note that these choices come
   with corresponding operational costs). To accomplish this, the
   browser MUST allow the application to restrict which ICE candidates
   are used in a session. In addition, administrators may also wish to
   control the set of ICE candidates, and so the browser SHOULD also
   allow control via local policy, with the most restrictive policy
   prevailing.
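
   As a non-normative sketch, the following shows one way an
   implementation might combine the application's policy with local
   policy, letting the more restrictive one prevail, and then filter
   candidates by type. The policy names ("all", "no-host",
   "relay-only") and all function names are assumptions made for
   illustration. The sketch only reads the candidate type from the
   "candidate-attribute" syntax and does not address related-address
   fields such as raddr/rport.

```typescript
// Hypothetical policy names, ordered from least to most restrictive.
type CandidatePolicy = "all" | "no-host" | "relay-only";

const RESTRICTIVENESS: Record<CandidatePolicy, number> = {
  "all": 0,
  "no-host": 1,
  "relay-only": 2,
};

// Returns whichever of the two policies is more restrictive.
function effectivePolicy(
  app: CandidatePolicy,
  local: CandidatePolicy,
): CandidatePolicy {
  return RESTRICTIVENESS[app] >= RESTRICTIVENESS[local] ? app : local;
}

// Extracts the token after "typ" (e.g., "host", "srflx", "relay").
function candidateType(line: string): string | undefined {
  const fields = line.trim().split(/\s+/);
  const typIndex = fields.indexOf("typ");
  return typIndex >= 0 ? fields[typIndex + 1] : undefined;
}

// Decides whether a candidate line may be exposed under a policy.
// Lines without a recognizable type are rejected in this sketch.
function isAllowed(line: string, policy: CandidatePolicy): boolean {
  const type = candidateType(line);
  if (type === undefined) return false;
  switch (policy) {
    case "all":
      return true;
    case "no-host":
      return type !== "host";
    case "relay-only":
      return type === "relay";
  }
}

// Usage: the application asks for "no-host"; local policy allows all.
const gathered: string[] = [
  "candidate:1 1 UDP 2130706431 192.0.2.33 10000 typ host",
  "candidate:2 1 UDP 1694498815 198.51.100.7 10001 typ srflx",
  "candidate:3 1 UDP 16777215 203.0.113.9 10002 typ relay",
];
const policy = effectivePolicy("no-host", "all");
const exposed = gathered.filter((line) => isAllowed(line, policy));
```

   Here the effective policy is "no-host", so only the server
   reflexive and relay candidates remain in the exposed list.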

   There may also be cases where the application wants to change which
   types of candidates are used while the session is active. A prime
   example is where a callee may initially want to use only relay
   candidates, to avoid leaking location information to an arbitrary
   caller, but then change to use all candidates (for lower operational
   cost) once the user has indicated they want to take the call. For
   this scenario, the browser MUST allow the candidate policy to be
   changed in mid-session, subject to the aforementioned interactions
   with local policy.

   To administer the ICE candidate policy, the browser will determine
   the current setting at the start of each gathering phase. Then,
   during the gathering phase, the browser MUST NOT expose candidates
   disallowed by the current policy to the application, use them as the
   source of connectivity checks, or indirectly expose them via other
   fields, such as the raddr/rport attributes for other ICE candidates.
   Later, if a different policy is specified by the application, the
   application can apply it by kicking off a new gathering phase via an
   ICE restart.

3.4.4. ICE Candidate Pool

   JSEP applications typically inform the browser to begin ICE gathering
   via the information supplied to setLocalDescription, as this is where
   the app specifies the number of media streams, and thereby ICE
   components, for which to gather candidates. However, to accelerate
   cases where the application knows the number of ICE components to use
   ahead of time, it may ask the browser to gather a pool of potential
   ICE candidates to help ensure rapid media setup.

   When setLocalDescription is eventually called, and the browser goes
   to gather the needed ICE candidates, it SHOULD start by checking if
   any candidates are available in the pool.
If there are candidates in 565 the pool, they SHOULD be handed to the application immediately via 566 the ICE candidate event. If the pool becomes depleted, either 567 because a larger-than-expected number of ICE components is used, or 568 because the pool has not had enough time to gather candidates, the 569 remaining candidates are gathered as usual. 571 One example of where this concept is useful is an application that 572 expects an incoming call at some point in the future, and wants to 573 minimize the time it takes to establish connectivity, to avoid 574 clipping of initial media. By pre-gathering candidates into the 575 pool, it can exchange and start sending connectivity checks from 576 these candidates almost immediately upon receipt of a call. Note 577 though that by holding on to these pre-gathered candidates, which 578 will be kept alive as long as they may be needed, the application 579 will consume resources on the STUN/TURN servers it is using. 581 3.5. Video Size Negotiation 583 Video size negotiation is the process through which a receiver can 584 use the "a=imageattr" SDP attribute [RFC6236] to indicate what video 585 frame sizes it is capable of receiving. A receiver may have hard 586 limits on what its video decoder can process, or it may wish to 587 constrain what it receives due to application preferences, e.g. a 588 specific size for the window in which the video will be displayed. 590 3.5.1. Creating an imageattr Attribute 592 In order to determine the limits on what video resolution a receiver 593 wants to receive, it will intersect its decoder hard limits with any 594 mandatory constraints that have been applied to the associated 595 MediaStreamTrack. If the decoder limits are unknown, e.g. when using 596 a software decoder, the mandatory constraints are used directly. 
For 597 the answerer, these mandatory constraints can be applied to the 598 remote MediaStreamTracks that are created by a setRemoteDescription 599 call, and will affect the output of the ensuing createAnswer call. 601 Any constraints set after setLocalDescription is used to set the 602 answer will result in a new offer-answer exchange. For the offerer, 603 because it does not know about any remote MediaStreamTracks until it 604 receives the answer, the offer can only reflect decoder hard limits. 605 If the offerer wishes to set mandatory constraints on video 606 resolution, it must do so after receiving the answer, and the result 607 will be a new offer-answer to communicate them. 609 If there are no known decoder limits or mandatory constraints, the 610 "a=imageattr" attribute SHOULD be omitted. 612 Otherwise, an "a=imageattr" attribute is created with "recv" 613 direction, and the resulting resolution space formed by intersecting 614 the decoder limits and constraints is used to specify its minimum and 615 maximum x= and y= values. If the intersection is the null set, i.e., 616 there are no resolutions that are permitted by both the decoder and 617 the mandatory constraints, this SHOULD be represented by x=0 and y=0 618 values. 620 The rules here express a single set of preferences, and therefore, 621 the "a=imageattr" q= value is not important. It SHOULD be set to 622 1.0. 624 The "a=imageattr" field is payload type specific. When all video 625 codecs supported have the same capabilities, use of a single 626 attribute, with the wildcard payload type (*), is RECOMMENDED. 627 However, when the supported video codecs have differing capabilities, 628 specific "a=imageattr" attributes MUST be inserted for each payload 629 type. 631 As an example, consider a system with a HD-capable, multiformat video 632 decoder, where the application has constrained the received track to 633 at most 360p. 
In this case, the implementation would generate this 634 attribute: 636 a=imageattr:* recv [x=[16:640],y=[16:360],q=1.0] 638 3.5.2. Interpreting an imageattr Attribute 640 [RFC6236] defines "a=imageattr" to be an advisory field. This means 641 that it does not absolutely constrain the video formats that the 642 sender can use, but gives an indication of the preferred values. 644 This specification prescribes more specific behavior. When a sender 645 of a given MediaStreamTrack, which is producing video of a certain 646 resolution, receives an "a=imageattr recv" attribute, it MUST first 647 check to see if the original resolution meets the criteria specified 648 in the attribute, and transmit it untouched if so. If the original 649 resolution is too large for the attribute criteria, the sender SHOULD 650 apply downscaling to the output of the MediaStreamTrack in order to 651 satisfy the criteria. In rare cases, where a receiver requires a 652 minimum resolution which is greater than the native resolution of the 653 video, the sender SHOULD apply upscaling in order to provide that 654 resolution. The sender SHOULD NOT apply upscaling in any other 655 cases. 657 If there is no appropriate scaling mechanism that allows the received 658 criteria to be satisfied, the sender MUST NOT transmit the track. 660 In the special case of receiving a maximum resolution of [0, 0], as 661 described above, the sender MUST NOT transmit the track. 663 3.6. Interactions With Forking 665 Some call signaling systems allow various types of forking where an 666 SDP Offer may be provided to more than one device. For example, SIP 667 [RFC3261] defines both a "Parallel Search" and "Sequential Search". 668 Although these are primarily signaling level issues that are outside 669 the scope of JSEP, they do have some impact on the configuration of 670 the media plane that is relevant.
When forking happens at the 671 signaling layer, the Javascript application responsible for the 672 signaling needs to make the decisions about what media should be sent 673 or received at any point of time, as well as which remote endpoint it 674 should communicate with; JSEP is used to make sure the media engine 675 can make the RTP and media perform as required by the application. 676 The basic operations that the applications can have the media engine 677 do are: 679 o Start exchanging media with a given remote peer, but keep all the 680 resources reserved in the offer. 682 o Start exchanging media with a given remote peer, and free any 683 resources in the offer that are not being used. 685 3.6.1. Sequential Forking 687 Sequential forking involves a call being dispatched to multiple 688 remote callees, where each callee can accept the call, but only one 689 active session ever exists at a time; no mixing of received media is 690 performed. 692 JSEP handles sequential forking well, allowing the application to 693 easily control the policy for selecting the desired remote endpoint. 694 When an answer arrives from one of the callees, the application can 695 choose to apply it either as a provisional answer, leaving open the 696 possibility of using a different answer in the future, or apply it as 697 a final answer, ending the setup flow. 699 In a "first-one-wins" situation, the first answer will be applied as 700 a final answer, and the application will reject any subsequent 701 answers. In SIP parlance, this would be ACK + BYE. 703 In a "last-one-wins" situation, all answers would be applied as 704 provisional answers, and any previous call leg will be terminated. 705 At some point, the application will end the setup process, perhaps 706 with a timer; at this point, the application could reapply the 707 existing remote description as a final answer. 709 3.6.2. 
Parallel Forking 711 Parallel forking involves a call being dispatched to multiple remote 712 callees, where each callee can accept the call, and multiple 713 simultaneous active signaling sessions can be established as a 714 result. If multiple callees send media at the same time, the 715 possibilities for handling this are described in Section 3.1 of 716 [RFC3960]. Most SIP devices today only support exchanging media with 717 a single device at a time, and do not try to mix multiple early media 718 audio sources, as that could result in a confusing situation. For 719 example, consider having a European ringback tone mixed together with 720 the North American ringback tone - the resulting sound would not be 721 like either tone, and would confuse the user. If the signaling 722 application wishes to only exchange media with one of the remote 723 endpoints at a time, then from a media engine point of view, this is 724 exactly like the sequential forking case. 726 In the parallel forking case where the Javascript application wishes 727 to simultaneously exchange media with multiple peers, the flow is 728 slightly more complex, but the Javascript application can follow the 729 strategy that [RFC3960] describes using UPDATE. The UPDATE approach 730 allows the signaling to set up a separate media flow for each peer 731 that it wishes to exchange media with. In JSEP, this offer used in 732 the UPDATE would be formed by simply creating a new PeerConnection 733 and making sure that the same local media streams have been added 734 into this new PeerConnection. Then the new PeerConnection object 735 would produce a SDP offer that could be used by the signaling to 736 perform the UPDATE strategy discussed in [RFC3960]. 738 As a result of sharing the media streams, the application will end up 739 with N parallel PeerConnection sessions, each with a local and remote 740 description and their own local and remote addresses. 
The media flow 741 from these sessions can be managed by specifying SDP direction 742 attributes in the descriptions, or the application can choose to play 743 out the media from all sessions mixed together. Of course, if the 744 application wants to only keep a single session, it can simply 745 terminate the sessions that it no longer needs. 747 4. Interface 749 This section details the basic operations that must be present to 750 implement JSEP functionality. The actual API exposed in the W3C API 751 may have somewhat different syntax, but should map easily to these 752 concepts. 754 4.1. Methods 756 4.1.1. Constructor 758 The PeerConnection constructor allows the application to specify 759 global parameters for the media session, such as the STUN/TURN 760 servers and credentials to use when gathering candidates, as well as 761 the initial ICE candidate policy and pool size, and also the BUNDLE 762 policy to use. 764 If an ICE candidate policy is specified, it functions as described in 765 Section 3.4.3, causing the browser to only surface the permitted 766 candidates to the application, and only use those candidates for 767 connectivity checks. The set of available policies is as follows: 769 all: All candidates will be gathered and used. 771 public: Candidates with private IP addresses [RFC1918] will be 772 filtered out. This prevents exposure of internal network details, 773 at the cost of requiring relay usage even for intranet calls, if 774 the NAT does not allow hairpinning as described in [RFC4787], 775 section 6. 777 relay: All candidates except relay candidates will be filtered out. 778 This obfuscates the location information that might be ascertained 779 by the remote peer from the received candidates. Depending on how 780 the application deploys its relay servers, this could obfuscate 781 location to a metro or possibly even global level. 
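The three policies listed above can be read as a filter over candidate-attribute strings. The following is a minimal sketch under stated assumptions: the helper names parse_candidate, is_rfc1918, and permitted are hypothetical, only IPv4 literal addresses are classified, and the raddr/rport exposure rule from Section 3.4.3 is not handled.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETS = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def parse_candidate(line):
    """Extract the connection address and type from a candidate string.

    Field order: foundation, component, transport, priority, address,
    port, "typ", type, then optional extensions such as raddr/rport.
    """
    fields = line.split()
    return {"address": fields[4], "type": fields[fields.index("typ") + 1]}


def is_rfc1918(address):
    """True if address is an IPv4 literal inside an RFC 1918 range."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return False  # not an IP literal; this sketch does not classify it
    return ip.version == 4 and any(ip in net for net in RFC1918_NETS)


def permitted(line, policy):
    """Decide whether a candidate may be surfaced under the given policy."""
    cand = parse_candidate(line)
    if policy == "all":
        return True
    if policy == "public":
        return not is_rfc1918(cand["address"])
    if policy == "relay":
        return cand["type"] == "relay"
    raise ValueError("unknown policy: %r" % (policy,))
```

The RFC 1918 ranges are checked explicitly rather than with a generic "is private" test, because the documentation ranges used in this draft's examples (such as 192.0.2.0/24) are special-purpose but are not RFC 1918 addresses.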
783 Although it can be overridden by local policy, the default ICE 784 candidate policy MUST be set to allow all candidates, as this 785 minimizes use of application STUN/TURN server resources. 787 If a size is specified for the ICE candidate pool, this indicates the 788 number of ICE components to pre-gather candidates for. Because pre- 789 gathering results in utilizing STUN/TURN server resources for 790 potentially long periods of time, this must only occur upon 791 application request, and therefore the default candidate pool size 792 MUST be zero. 794 The application can specify its preferred policy regarding use of 795 BUNDLE, the multiplexing mechanism defined in 796 [I-D.ietf-mmusic-sdp-bundle-negotiation]. Regardless of policy, the 797 application will always try to negotiate BUNDLE onto a single 798 transport, and will offer a single BUNDLE group across all media 799 sections; use of this single transport is contingent upon the answerer 800 accepting BUNDLE. However, by specifying a policy from the list 801 below, the application can control exactly how aggressively it will 802 try to BUNDLE media streams together, which affects how it will 803 interoperate with a non-BUNDLE-aware endpoint. When negotiating with 804 a non-BUNDLE-aware endpoint, only the streams not marked as bundle- 805 only streams will be established. The set of available policies is 806 as follows: 808 balanced: The first media section of each type (audio, video, or 809 application) will contain transport parameters, which will allow 810 an answerer to unbundle that section. The second and any 811 subsequent media section of each type will be marked bundle-only. 812 The result is that if there are N distinct media types, then 813 candidates will be gathered for N media streams. This policy 814 balances desire to multiplex with the need to ensure basic audio 815 and video can still be negotiated in legacy cases.
817 max-compat: All media sections will contain transport parameters; 818 none will be marked as bundle-only. This policy will allow all 819 streams to be received by non-BUNDLE-aware endpoints, but require 820 separate candidates to be gathered for each media stream. 822 max-bundle: Only the first media section will contain transport 823 parameters; all streams other than the first will be marked as 824 bundle-only. This policy aims to minimize candidate gathering and 825 maximize multiplexing, at the cost of less compatibility with 826 legacy endpoints. 828 As it provides the best tradeoff between performance and 829 compatibility with legacy endpoints, the default BUNDLE policy MUST 830 be set to "balanced". 832 The application can specify its preferred policy regarding use of 833 RTP/RTCP multiplexing [RFC5761] using one of the following policies: 835 negotiate: The browser will gather both RTP and RTCP candidates but 836 also will offer "a=rtcp-mux", thus allowing for compatibility with 837 either multiplexing or non-multiplexing endpoints. 839 require: The browser will only gather RTP candidates. This halves 840 the number of candidates that the offerer needs to gather. When 841 acting as answerer, the browser will reject any m= section that 842 does not provide an "a=rtcp-mux" attribute. 844 4.1.2. createOffer 846 The createOffer method generates a blob of SDP that contains a 847 [RFC3264] offer with the supported configurations for the session, 848 including descriptions of the local MediaStreams attached to this 849 PeerConnection, the codec/RTP/RTCP options supported by this 850 implementation, and any candidates that have been gathered by the ICE 851 Agent. An options parameter may be supplied to provide additional 852 control over the generated offer. 
This options parameter should 853 allow for the following manipulations to be performed: 855 o To indicate support for a media type even if no MediaStreamTracks 856 of that type have been added to the session (e.g., an audio call 857 that wants to receive video.) 859 o To trigger an ICE restart, for the purpose of reestablishing 860 connectivity. 862 In the initial offer, the generated SDP will contain all desired 863 functionality for the session (functionality that is supported but 864 not desired by default may be omitted); for each SDP line, the 865 generation of the SDP will follow the process defined for generating 866 an initial offer from the document that specifies the given SDP line. 867 The exact handling of initial offer generation is detailed in 868 Section 5.2.1 below. 870 In the event createOffer is called after the session is established, 871 createOffer will generate an offer to modify the current session 872 based on any changes that have been made to the session, e.g. adding 873 or removing MediaStreams, or requesting an ICE restart. For each 874 existing stream, the generation of each SDP line must follow the 875 process defined for generating an updated offer from the RFC that 876 specifies the given SDP line. For each new stream, the generation of 877 the SDP must follow the process of generating an initial offer, as 878 mentioned above. If no changes have been made, or for SDP lines that 879 are unaffected by the requested changes, the offer will only contain 880 the parameters negotiated by the last offer-answer exchange. The 881 exact handling of subsequent offer generation is detailed in 882 Section 5.2.2. below. 884 Session descriptions generated by createOffer must be immediately 885 usable by setLocalDescription; if a system has limited resources 886 (e.g. 
a finite number of decoders), createOffer should return an 887 offer that reflects the current state of the system, so that 888 setLocalDescription will succeed when it attempts to acquire those 889 resources. Because this method may need to inspect the system state 890 to determine the currently available resources, it may be implemented 891 as an async operation. 893 Calling this method may do things such as generate new ICE 894 credentials, but does not result in candidate gathering, or cause 895 media to start or stop flowing. 897 4.1.3. createAnswer 899 The createAnswer method generates a blob of SDP that contains a 900 [RFC3264] SDP answer with the supported configuration for the session 901 that is compatible with the parameters supplied in the most recent 902 call to setRemoteDescription, which MUST have been called prior to 903 calling createAnswer. Like createOffer, the returned blob contains 904 descriptions of the local MediaStreams attached to this 905 PeerConnection, the codec/RTP/RTCP options negotiated for this 906 session, and any candidates that have been gathered by the ICE Agent. 907 An options parameter may be supplied to provide additional control 908 over the generated answer. 910 As an answer, the generated SDP will contain a specific configuration 911 that specifies how the media plane should be established; for each 912 SDP line, the generation of the SDP must follow the process defined 913 for generating an answer from the document that specifies the given 914 SDP line. The exact handling of answer generation is detailed in 915 Section 5.3. below. 917 Session descriptions generated by createAnswer must be immediately 918 usable by setLocalDescription; like createOffer, the returned 919 description should reflect the current state of the system. Because 920 this method may need to inspect the system state to determine the 921 currently available resources, it may need to be implemented as an 922 async operation. 
924 Calling this method may do things such as generate new ICE 925 credentials, but does not trigger candidate gathering or change media 926 state. 928 4.1.4. SessionDescriptionType 930 Session description objects (RTCSessionDescription) may be of type 931 "offer", "pranswer", "answer" or "rollback". These types provide 932 information as to how the description parameter should be parsed, and 933 how the media state should be changed. 935 "offer" indicates that a description should be parsed as an offer; 936 said description may include many possible media configurations. A 937 description used as an "offer" may be applied anytime the 938 PeerConnection is in a stable state, or as an update to a previously 939 supplied but unanswered "offer". 941 "pranswer" indicates that a description should be parsed as an 942 answer, but not a final answer, and so should not result in the 943 freeing of allocated resources. It may result in the start of media 944 transmission, if the answer does not specify an inactive media 945 direction. A description used as a "pranswer" may be applied as a 946 response to an "offer", or an update to a previously sent "pranswer". 948 "answer" indicates that a description should be parsed as an answer, 949 the offer-answer exchange should be considered complete, and any 950 resources (decoders, candidates) that are no longer needed can be 951 released. A description used as an "answer" may be applied as a 952 response to an "offer", or an update to a previously sent "pranswer". 954 The only difference between a provisional and final answer is that 955 the final answer results in the freeing of any unused resources that 956 were allocated as a result of the offer. As such, the application 957 can use some discretion on whether an answer should be applied as 958 provisional or final, and can change the type of the session 959 description as needed. 
For example, in a serial forking scenario, an 960 application may receive multiple "final" answers, one from each 961 remote endpoint. The application could choose to accept the initial 962 answers as provisional answers, and only apply an answer as final 963 when it receives one that meets its criteria (e.g. a live user 964 instead of voicemail). 966 "rollback" is a special session description type implying that the 967 state machine should be rolled back to the previous state, as 968 described in Section 4.1.4.2. The contents MUST be empty. 970 4.1.4.1. Use of Provisional Answers 972 Most web applications will not need to create answers using the 973 "pranswer" type. While it is good practice to send an immediate 974 response to an "offer", in order to warm up the session transport and 975 prevent media clipping, the preferred handling for a web application 976 would be to create and send an "inactive" final answer immediately 977 after receiving the offer. Later, when the called user actually 978 accepts the call, the application can create a new "sendrecv" offer 979 to update the previous offer/answer pair and start the media flow. 980 While this could also be done with an inactive "pranswer", followed 981 by a sendrecv "answer", the initial "pranswer" leaves the offer- 982 answer exchange open, which means that neither side can send an 983 updated offer during this time. 985 As an example, consider a typical web application that will set up a 986 data channel, an audio channel, and a video channel. When an 987 endpoint receives an offer with these channels, it could send an 988 answer accepting the data channel for two-way data, and accepting the 989 audio and video tracks as inactive or receive-only. It could then 990 ask the user to accept the call, acquire the local media streams, and 991 send a new offer to the remote side moving the audio and video to be 992 two-way media. 
By the time the human has accepted the call and 993 triggered the new offer, it is likely that the ICE and DTLS 994 handshaking for all the channels will already have finished. 996 Of course, some applications may not be able to perform this double 997 offer-answer exchange, particularly ones that are attempting to 998 gateway to legacy signaling protocols. In these cases, "pranswer" 999 can still provide the application with a mechanism to warm up the 1000 transport. 1002 4.1.4.2. Rollback 1004 In certain situations it may be desirable to "undo" a change made to 1005 setLocalDescription or setRemoteDescription. Consider a case where a 1006 call is ongoing, and one side wants to change some of the session 1007 parameters; that side generates an updated offer and then calls 1008 setLocalDescription. However, the remote side, either before or 1009 after setRemoteDescription, decides it does not want to accept the 1010 new parameters, and sends a reject message back to the offerer. Now, 1011 the offerer, and possibly the answerer as well, need to return to a 1012 stable state and the previous local/remote description. To support 1013 this, we introduce the concept of "rollback". 1015 A rollback discards any proposed changes to the session, returning 1016 the state machine to the stable state, and setting the modified local 1017 and/or remote description back to their previous values. Any 1018 resources or candidates that were allocated by the abandoned local 1019 description are discarded; any media that is received will be 1020 processed according to the previous local and remote descriptions. 1021 Rollback can only be used to cancel proposed changes; there is no 1022 support for rolling back from a stable state to a previous stable 1023 state. Note that this implies that once the answerer has performed 1024 setLocalDescription with his answer, this cannot be rolled back. 
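The rollback rules can be made concrete with a toy state machine. This is a sketch, not the normative state machine: the state names "stable", "have-local-offer", and "have-remote-offer" are assumed here for illustration, provisional answers are left out for brevity, and the class SessionModel is hypothetical.

```python
class SessionModel:
    """Toy model of offer/answer with rollback (no "pranswer" support)."""

    def __init__(self):
        self.state = "stable"
        self.local = None    # committed local description
        self.remote = None   # committed remote description
        self.pending = None  # proposed offer that has not been answered

    def _apply(self, side, kind, sdp):
        offer_state = "have-%s-offer" % side
        other = "remote" if side == "local" else "local"
        other_offer_state = "have-%s-offer" % other
        if kind == "offer" and self.state in ("stable", offer_state):
            # A new offer, or an update to a previous unanswered offer.
            self.pending, self.state = sdp, offer_state
        elif kind == "answer" and self.state == other_offer_state:
            # The answer completes the exchange: commit both descriptions.
            if side == "local":
                self.local, self.remote = sdp, self.pending
            else:
                self.local, self.remote = self.pending, sdp
            self.pending, self.state = None, "stable"
        elif kind == "rollback" and self.state == offer_state:
            if sdp:
                raise ValueError("rollback contents must be empty")
            # Discard the proposal; committed descriptions are untouched.
            self.pending, self.state = None, "stable"
        else:
            raise RuntimeError(
                "%s %r not allowed in state %s" % (side, kind, self.state))

    def set_local_description(self, kind, sdp=""):
        self._apply("local", kind, sdp)

    def set_remote_description(self, kind, sdp=""):
        self._apply("remote", kind, sdp)
```

In this model a rollback is accepted only on the side that holds the unanswered offer, and it fails once the state is stable again, which mirrors the statement that an applied answer cannot be rolled back.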
1026 A rollback is performed by supplying a session description of type 1027 "rollback" with empty contents to either setLocalDescription or 1028 setRemoteDescription, depending on which was most recently used (i.e. 1029 if the new offer was supplied to setLocalDescription, the rollback 1030 should be done using setLocalDescription as well). 1032 4.1.5. setLocalDescription 1034 The setLocalDescription method instructs the PeerConnection to apply 1035 the supplied session description as its local configuration. The 1036 type field indicates whether the description should be processed as 1037 an offer, provisional answer, or final answer; offers and answers are 1038 checked differently, using the various rules that exist for each SDP 1039 line. 1041 This API changes the local media state; among other things, it sets 1042 up local resources for receiving and decoding media. In order to 1043 successfully handle scenarios where the application wants to offer to 1044 change from one media format to a different, incompatible format, the 1045 PeerConnection must be able to simultaneously support use of both the 1046 old and new local descriptions (e.g. support codecs that exist in 1047 both descriptions) until a final answer is received, at which point 1048 the PeerConnection can fully adopt the new local description, or roll 1049 back to the old description if the remote side denied the change. 1051 This API indirectly controls the candidate gathering process. When a 1052 local description is supplied, and the number of transports currently 1053 in use does not match the number of transports needed by the local 1054 description, the PeerConnection will create transports as needed and 1055 begin gathering candidates for them. 
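How many transports an initial local offer needs depends on the BUNDLE policy chosen in Section 4.1.1. The sketch below lists which media sections would carry their own transport parameters, that is, which are not marked bundle-only, and therefore need candidates gathered before BUNDLE is negotiated. The function name and the list-of-media-types input are assumptions of this illustration.

```python
def sections_with_transport(media_types, policy="balanced"):
    """Return indexes of media sections that are not marked bundle-only.

    media_types is the ordered list of m= section types in the offer,
    for example ["audio", "video", "video", "application"].
    """
    if policy == "max-compat":
        # Every section carries its own transport parameters.
        return list(range(len(media_types)))
    if policy == "max-bundle":
        # Only the first section carries transport parameters.
        return [0] if media_types else []
    if policy == "balanced":
        # The first section of each distinct media type carries them.
        seen = set()
        result = []
        for index, media_type in enumerate(media_types):
            if media_type not in seen:
                seen.add(media_type)
                result.append(index)
        return result
    raise ValueError("unknown BUNDLE policy: %r" % (policy,))
```

The sketch covers only the offerer's initial gathering; once the answerer accepts BUNDLE, the sections share a single transport regardless of policy.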
1057 If setRemoteDescription was previously called with an offer, and 1058 setLocalDescription is called with an answer (provisional or final), 1059 and the media directions are compatible, and media are available to 1060 send, this will result in the starting of media transmission. 1062 4.1.6. setRemoteDescription 1064 The setRemoteDescription method instructs the PeerConnection to apply 1065 the supplied session description as the desired remote configuration. 1066 As in setLocalDescription, the type field of the description 1067 indicates how it should be processed. 1069 This API changes the local media state; among other things, it sets 1070 up local resources for sending and encoding media. 1072 If setLocalDescription was previously called with an offer, and 1073 setRemoteDescription is called with an answer (provisional or final), 1074 and the media directions are compatible, and media are available to 1075 send, this will result in the starting of media transmission. 1077 4.1.7. localDescription 1079 The localDescription method returns a copy of the current local 1080 configuration, i.e. what was most recently passed to 1081 setLocalDescription, plus any local candidates that have been 1082 generated by the ICE Agent. 1084 [[OPEN ISSUE: Do we need to expose accessors for both the current and 1085 proposed local description? https://github.com/rtcweb-wg/jsep/ 1086 issues/16]] 1088 A null object will be returned if the local description has not yet 1089 been established. 1091 4.1.8. remoteDescription 1093 The remoteDescription method returns a copy of the current remote 1094 configuration, i.e. what was most recently passed to 1095 setRemoteDescription, plus any remote candidates that have been 1096 supplied via processIceMessage. 1098 [[OPEN ISSUE: Do we need to expose accessors for both the current and 1099 proposed remote description?
https://github.com/rtcweb-wg/jsep/ 1100 issues/16]] 1102 A null object will be returned if the remote description has not yet 1103 been established. 1105 4.1.9. canTrickleIceCandidates 1107 The canTrickleIceCandidates property indicates whether the remote 1108 side supports receiving trickled candidates. There are three 1109 potential values: 1111 null: No SDP has been received from the other side, so it is not 1112 known if it can handle trickle. This is the initial value before 1113 setRemoteDescription() is called. 1115 true: SDP has been received from the other side indicating that it 1116 can support trickle. 1118 false: SDP has been received from the other side indicating that it 1119 cannot support trickle. 1121 As described in Section 3.4.2, JSEP implementations always provide 1122 candidates to the application individually, consistent with what is 1123 needed for Trickle ICE. However, applications can use the 1124 canTrickleIceCandidates property to determine whether their peer can 1125 actually do Trickle ICE, i.e., whether it is safe to send an initial 1126 offer or answer followed later by candidates as they are gathered. 1127 As "true" is the only value that definitively indicates remote 1128 Trickle ICE support, an application which compares 1129 canTrickleIceCandidates against "true" will by default attempt Half 1130 Trickle on initial offers and Full Trickle on subsequent interactions 1131 with a Trickle ICE-compatible agent. 1133 4.1.10. setConfiguration 1135 The setConfiguration method allows the global configuration of the 1136 PeerConnection, which was initially set by constructor parameters, to 1137 be changed during the session. The effects of this method call 1138 depend on when it is invoked, and differ depending on which specific 1139 parameters are changed: 1141 o Any changes to the STUN/TURN servers to use affect the next 1142 gathering phase. 
If gathering has already occurred, this will 1143 cause the next call to createOffer to generate new ICE 1144 credentials, for the purpose of forcing an ICE restart and kicking 1145 off a new gathering phase, in which the new servers will be used. 1146 If the ICE candidate pool has a nonzero size, any existing 1147 candidates will be discarded, and new candidates will be gathered 1148 from the new servers. 1150 o Any changes to the ICE candidate policy also affect the next 1151 gathering phase, in similar fashion to the server changes 1152 described above. Note though that changes to the policy have no 1153 effect on the candidate pool, because pooled candidates are not 1154 surfaced to the application until a gathering phase occurs, and so 1155 any necessary filtering can still be done on any pooled 1156 candidates. 1158 o Any changes to the ICE candidate pool size take effect 1159 immediately; if increased, additional candidates are pre-gathered; 1160 if decreased, the now-superfluous candidates are discarded. 1162 o The BUNDLE and RTCP-multiplexing policies MUST NOT be changed 1163 after the construction of the PeerConnection. 1165 This call may result in a change to the state of the ICE Agent, and 1166 may result in a change to media state if it results in connectivity 1167 being established. 1169 4.1.11. addIceCandidate 1171 The addIceCandidate method provides a remote candidate to the ICE 1172 Agent, which, if parsed successfully, will be added to the remote 1173 description according to the rules defined for Trickle ICE. 1174 Connectivity checks will be sent to the new candidate. 1176 This call will result in a change to the state of the ICE Agent, and 1177 may result in a change to media state if it results in connectivity 1178 being established. 1180 5. SDP Interaction Procedures 1182 This section describes the specific procedures to be followed when 1183 creating and parsing SDP objects. 1185 5.1. 
Requirements Overview 1187 JSEP implementations must comply with the specifications listed below 1188 that govern the creation and processing of offers and answers. 1190 The first set of specifications is the "mandatory-to-implement" set. 1191 All implementations must support these behaviors, but may not use all 1192 of them if the remote side, which may not be a JSEP endpoint, does 1193 not support them. 1195 The second set of specifications is the "mandatory-to-use" set. The 1196 local JSEP endpoint and any remote endpoint must indicate support for 1197 these specifications in their session descriptions. 1199 5.1.1. Implementation Requirements 1201 This list of mandatory-to-implement specifications is derived from 1202 the requirements outlined in [I-D.ietf-rtcweb-rtp-usage]. 1204 R-1 [RFC4566] is the base SDP specification and MUST be 1205 implemented. 1207 R-2 [RFC5764] MUST be supported for signaling the UDP/TLS/RTP/SAVPF 1208 [RFC5764] and TCP/DTLS/RTP/SAVPF 1209 [I-D.nandakumar-mmusic-proto-iana-registration] RTP profiles. 1211 R-3 [RFC5245] MUST be implemented for signaling the ICE credentials 1212 and candidate lines corresponding to each media stream. The 1213 ICE implementation MUST be a Full implementation, not a Lite 1214 implementation. 1216 R-4 [RFC5763] MUST be implemented to signal DTLS certificate 1217 fingerprints. 1219 R-5 [RFC4568] MUST NOT be implemented to signal SDES SRTP keying 1220 information. 1222 R-6 The [RFC5888] grouping framework MUST be implemented for 1223 signaling grouping information, and MUST be used to identify m= 1224 lines via the a=mid attribute. 1226 R-7 [I-D.ietf-mmusic-msid] MUST be supported, in order to signal 1227 associations between RTP objects and W3C MediaStreams and 1228 MediaStreamTracks in a standard way. 
1230 R-8 The bundle mechanism in 1231 [I-D.ietf-mmusic-sdp-bundle-negotiation] MUST be supported to 1232 signal the ability to multiplex RTP streams on a single UDP 1233 port, in order to avoid excessive use of port number resources. 1235 R-9 The SDP attributes of "sendonly", "recvonly", "inactive", and 1236 "sendrecv" from [RFC4566] MUST be implemented to signal 1237 information about media direction. 1239 R-10 [RFC5576] MUST be implemented to signal RTP SSRC values and 1240 grouping semantics. 1242 R-11 [RFC4585] MUST be implemented to signal RTCP-based feedback. 1244 R-12 [RFC5761] MUST be implemented to signal multiplexing of RTP and 1245 RTCP. 1247 R-13 [RFC5506] MUST be implemented to signal reduced-size RTCP 1248 messages. 1250 R-14 [RFC4588] MUST be implemented to signal RTX payload type 1251 associations. 1253 R-15 [RFC3556] with bandwidth modifiers MAY be supported for 1254 specifying RTCP bandwidth as a fraction of the media bandwidth, 1255 RTCP fraction allocated to the senders and setting maximum 1256 media bit-rate boundaries. 1258 R-16 TODO: any others? 1260 As required by [RFC4566], Section 5.13, JSEP implementations MUST 1261 ignore unknown attribute (a=) lines. 1263 5.1.2. Usage Requirements 1265 All session descriptions handled by JSEP endpoints, both local and 1266 remote, MUST indicate support for the following specifications. If 1267 any of these are absent, this omission MUST be treated as an error. 1269 R-1 ICE, as specified in [RFC5245], MUST be used. Note that the 1270 remote endpoint may use a Lite implementation; implementations 1271 MUST properly handle remote endpoints which do ICE-Lite. 1273 R-2 DTLS [RFC6347] or DTLS-SRTP [RFC5763] MUST be used, as 1274 appropriate for the media type, as specified in 1275 [I-D.ietf-rtcweb-security-arch]. 1277 5.1.3.
Profile Names and Interoperability 1279 For media m= sections, JSEP endpoints MUST support both the "UDP/TLS/ 1280 RTP/SAVPF" and "TCP/DTLS/RTP/SAVPF" profiles and MUST indicate one of 1281 these two profiles for each media m= line they produce in an offer. 1282 For data m= sections, JSEP endpoints MUST support both the "UDP/DTLS/ 1283 SCTP" and "TCP/DTLS/SCTP" profiles and MUST indicate one of these two 1284 profiles for each data m= line they produce in an offer. Because ICE 1285 can select either TCP or UDP transport depending on network 1286 conditions, both advertisements are consistent with ICE eventually 1287 selecting either UDP or TCP. 1289 Unfortunately, in an attempt at compatibility, some endpoints 1290 generate other profile strings even when they mean to support one of 1291 these profiles. For instance, an endpoint might generate "RTP/AVP" 1292 but supply "a=fingerprint" and "a=rtcp-fb" attributes, indicating its 1293 willingness to support "(UDP,TCP)/TLS/RTP/SAVPF". In order to 1294 simplify compatibility with such endpoints, JSEP endpoints MUST 1295 apply the following rules when processing the media m= sections in 1296 an offer: 1298 o The profile in any "m=" line in any answer MUST exactly match the 1299 profile provided in the offer. 1301 o Any profile matching the following patterns MUST be accepted: 1302 "RTP/[S]AVP[F]" and "(UDP/TCP)/TLS/RTP/SAVP[F]". 1304 o Because DTLS-SRTP is REQUIRED, the choice of SAVP or AVP has no 1305 effect; support for DTLS-SRTP is determined by the presence of the 1306 "a=fingerprint" attribute. Note that lack of an "a=fingerprint" 1307 attribute will lead to negotiation failure. 1309 o The use of AVPF or AVP simply controls the timing rules used for 1310 RTCP feedback. If AVPF is provided, or an "a=rtcp-fb" attribute 1311 is present, assume AVPF timing, i.e., a default value of "trr- 1312 int=0". Otherwise, assume that AVPF is being used in an AVP 1313 compatible mode and use AVP timing, i.e., "trr-int=4".
1315 o For data m= sections, JSEP endpoints MUST support receiving the 1316 "UDP/DTLS/SCTP", "TCP/DTLS/SCTP", or "DTLS/SCTP" (for backwards 1317 compatibility) profiles. 1319 Note that re-offers by JSEP endpoints MUST use the correct profile 1320 strings even if the initial offer/answer exchange used an (incorrect) 1321 older profile string. 1323 5.2. Constructing an Offer 1325 When createOffer is called, a new SDP description must be created 1326 that includes the functionality specified in 1327 [I-D.ietf-rtcweb-rtp-usage]. The exact details of this process are 1328 explained below. 1330 5.2.1. Initial Offers 1332 When createOffer is called for the first time, the result is known as 1333 the initial offer. 1335 The first step in generating an initial offer is to generate session- 1336 level attributes, as specified in [RFC4566], Section 5. 1337 Specifically: 1339 o The first SDP line MUST be "v=0", as specified in [RFC4566], 1340 Section 5.1. 1342 o The second SDP line MUST be an "o=" line, as specified in 1343 [RFC4566], Section 5.2. The value of the <username> field SHOULD 1344 be "-". The value of the <sess-id> field SHOULD be a 1345 cryptographically random number. To ensure uniqueness, this 1346 number SHOULD be at least 64 bits long. The value of the <sess-version> field SHOULD be zero. The value of the <nettype> 1348 <addrtype> <unicast-address> tuple SHOULD be set to a non- 1349 meaningful address, such as IN IP4 0.0.0.0, to prevent leaking the 1350 local address in this field. As mentioned in [RFC4566], the 1351 entire o= line needs to be unique, but selecting a random number 1352 for <sess-id> is sufficient to accomplish this. 1354 o The third SDP line MUST be a "s=" line, as specified in [RFC4566], 1355 Section 5.3; to match the "o=" line, a single dash SHOULD be used 1356 as the session name, e.g. "s=-". Note that this differs from the 1357 advice in [RFC4566] which proposes a single space, but as both 1358 "o=" and "s=" are meaningless, having the same meaningless value 1359 seems clearer.
1361 o Session Information ("i="), URI ("u="), Email Address ("e="), 1362 Phone Number ("p="), Bandwidth ("b="), Repeat Times ("r="), and 1363 Time Zones ("z=") lines are not useful in this context and SHOULD 1364 NOT be included. 1366 o Encryption Keys ("k=") lines do not provide sufficient security 1367 and MUST NOT be included. 1369 o A "t=" line MUST be added, as specified in [RFC4566], Section 5.9; 1370 both <start-time> and <stop-time> SHOULD be set to zero, e.g. "t=0 1371 0". 1373 o An "a=msid-semantic:WMS" line MUST be added, as specified in 1374 [I-D.ietf-mmusic-msid], Section 4. 1376 The next step is to generate m= sections, as specified in [RFC4566] 1377 Section 5.14, for each MediaStreamTrack that has been added to the 1378 PeerConnection via the addStream method. (Note that this method 1379 takes a MediaStream, which can contain multiple MediaStreamTracks, 1380 and therefore multiple m= sections can be generated even if addStream 1381 is only called once.) The m= sections MUST be sorted first by the order in 1382 which the MediaStreams were added to the PeerConnection, and then by 1383 the alphabetical ordering of the media type for the MediaStreamTrack. 1384 For example, if a MediaStream containing both an audio and a video 1385 MediaStreamTrack is added to a PeerConnection, the resultant m=audio 1386 section will precede the m=video section. If a second MediaStream 1387 containing an audio MediaStreamTrack was added, it would follow the 1388 m=video section. 1390 Each m= section, provided it is not being bundled into another m= 1391 section, MUST generate a unique set of ICE credentials and gather its 1392 own unique set of ICE candidates. Otherwise, it MUST use the same 1393 ICE credentials and candidates as the m= section into which it is 1394 being bundled. Note that this means that for offers, any m= sections 1395 which are not bundle-only MUST have unique ICE credentials and 1396 candidates, since it is possible that the answerer will accept them 1397 without bundling them.
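As an illustration only, the session-level rules and the m= section ordering described above can be sketched in Python. The function names and the tuple layout used for streams and tracks are assumptions made for this sketch; they are not part of JSEP or of any browser API.

```python
import secrets


def session_level_lines(sess_id=None):
    """Return the session-level lines of an initial offer (illustrative)."""
    if sess_id is None:
        # A cryptographically random 64-bit value for the <sess-id> field.
        sess_id = secrets.randbits(64)
    return [
        "v=0",
        # <username> "-", <sess-version> 0, and a non-meaningful address.
        f"o=- {sess_id} 0 IN IP4 0.0.0.0",
        "s=-",
        "t=0 0",
        "a=msid-semantic:WMS",
    ]


def order_tracks(streams):
    """Order tracks for m= section generation.

    `streams` is a hypothetical list of (stream_id, tracks) pairs in the
    order the MediaStreams were added; each track is a (kind, track_id)
    pair. Tracks are sorted by stream order first, then alphabetically
    by media type ("audio" before "video").
    """
    ordered = []
    for stream_id, tracks in streams:
        # sorted() is stable, so tracks of the same kind keep their order.
        for kind, track_id in sorted(tracks, key=lambda t: t[0]):
            ordered.append((stream_id, kind, track_id))
    return ordered
```

With one stream holding a video and an audio track and a second stream holding only audio, order_tracks yields the first stream's audio, then its video, then the second stream's audio, matching the example in the text.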
1399 For DTLS, all m= sections MUST use the certificate for the identity 1400 that has been specified for the PeerConnection; as a result, they 1401 MUST all have the same [RFC4572] fingerprint value, or this value 1402 MUST be a session-level attribute. 1404 Each m= section should be generated as specified in [RFC4566], 1405 Section 5.14. For the m= line itself, the following rules MUST be 1406 followed: 1408 o The port value is set to the port of the default ICE candidate for 1409 this m= section, but given that no candidates have yet been 1410 gathered, the "dummy" port value of 9 (Discard) MUST be used, as 1411 indicated in [I-D.ietf-mmusic-trickle-ice], Section 5.1. 1413 o To properly indicate use of DTLS, the <proto> field MUST be set to 1414 "UDP/TLS/RTP/SAVPF", as specified in [RFC5764], Section 8, if the 1415 default candidate uses UDP transport, or "TCP/DTLS/RTP/SAVPF", as 1416 specified in [I-D.nandakumar-mmusic-proto-iana-registration] if the 1417 default candidate uses TCP transport. 1419 The m= line MUST be followed immediately by a "c=" line, as specified 1420 in [RFC4566], Section 5.7. Again, as no candidates have yet been 1421 gathered, the "c=" line must contain the "dummy" value "IN IP6 ::", 1422 as defined in [I-D.ietf-mmusic-trickle-ice], Section 5.1. 1424 Each m= section MUST include the following attribute lines: 1426 o An "a=mid" line, as specified in [RFC5888], Section 4. When 1427 generating mid values, it is RECOMMENDED that the values be 3 1428 bytes or less, to allow them to efficiently fit into the RTP 1429 header extension defined in 1430 [I-D.ietf-mmusic-sdp-bundle-negotiation], Section 11. 1432 o An "a=rtcp" line, as specified in [RFC3605], Section 2.1, 1433 containing the dummy value "9 IN IP6 ::", because no candidates 1434 have yet been gathered. 1436 o An "a=msid" line, as specified in [I-D.ietf-mmusic-msid], 1437 Section 2. 1439 o An "a=sendrecv" line, as specified in [RFC3264], Section 5.1.
1441 o For each supported codec, "a=rtpmap" and "a=fmtp" lines, as 1442 specified in [RFC4566], Section 6. The audio and video codecs 1443 that MUST be supported are specified in [I-D.ietf-rtcweb-audio] 1444 (see Section 3) and [I-D.ietf-rtcweb-video] (see Section 5). 1446 o If this m= section is for media with configurable frame sizes, 1447 e.g. audio, an "a=maxptime" line, indicating the smallest of the 1448 maximum supported frame sizes out of all codecs included above, as 1449 specified in [RFC4566], Section 6. 1451 o If this m= section is for video media, and there are known 1452 limitations on the size of images which can be decoded, an 1453 "a=imageattr" line, as specified in Section 3.5. 1455 o For each primary codec where RTP retransmission should be used, a 1456 corresponding "a=rtpmap" line indicating "rtx" with the clock rate 1457 of the primary codec and an "a=fmtp" line that references the 1458 payload type of the primary codec, as specified in [RFC4588], 1459 Section 8.1. 1461 o For each supported FEC mechanism, "a=rtpmap" and "a=fmtp" lines, 1462 as specified in [RFC4566], Section 6. The FEC mechanisms that 1463 MUST be supported are specified in [I-D.ietf-rtcweb-fec], 1464 Section 6, and specific usage for each media type is outlined in 1465 Sections 4 and 5. 1467 o "a=ice-ufrag" and "a=ice-pwd" lines, as specified in [RFC5245], 1468 Section 15.4. 1470 o An "a=ice-options" line, with the "trickle" option, as specified 1471 in [I-D.ietf-mmusic-trickle-ice], Section 4. 1473 o An "a=fingerprint" line, as specified in [RFC4572], Section 5; the 1474 algorithm used for the fingerprint MUST match that used in the 1475 certificate signature. 1477 o An "a=setup" line, as specified in [RFC4145], Section 4, and 1478 clarified for use in DTLS-SRTP scenarios in [RFC5763], Section 5. 1479 The role value in the offer MUST be "actpass". 1481 o An "a=rtcp-mux" line, as specified in [RFC5761], Section 5.1.1. 
1483 o An "a=rtcp-rsize" line, as specified in [RFC5506], Section 5. 1485 o For each supported RTP header extension, an "a=extmap" line, as 1486 specified in [RFC5285], Section 5. The list of header extensions 1487 that SHOULD/MUST be supported is specified in 1488 [I-D.ietf-rtcweb-rtp-usage], Section 5.2. Any header extensions 1489 that require encryption MUST be specified as indicated in 1490 [RFC6904], Section 4. 1492 o For each supported RTCP feedback mechanism, an "a=rtcp-fb" 1493 mechanism, as specified in [RFC4585], Section 4.2. The list of 1494 RTCP feedback mechanisms that SHOULD/MUST be supported is 1495 specified in [I-D.ietf-rtcweb-rtp-usage], Section 5.1. 1497 o An "a=ssrc" line, as specified in [RFC5576], Section 4.1, 1498 indicating the SSRC to be used for sending media, along with the 1499 mandatory "cname" source attribute, as specified in Section 6.1, 1500 indicating the CNAME for the source. The CNAME MUST be generated 1501 in accordance with Section 4.9 of [I-D.ietf-rtcweb-rtp-usage]. 1503 o If RTX is supported for this media type, another "a=ssrc" line 1504 with the RTX SSRC, and an "a=ssrc-group" line, as specified in 1505 [RFC5576], section 4.2, with semantics set to "FID" and including 1506 the primary and RTX SSRCs. 1508 o If FEC is supported for this media type, another "a=ssrc" line 1509 with the FEC SSRC, and an "a=ssrc-group" line with semantics set 1510 to "FEC-FR" and including the primary and FEC SSRCs, as specified 1511 in [RFC5956], section 4.3. For simplicity, if both RTX and FEC 1512 are supported, the FEC SSRC MUST be the same as the RTX SSRC. 1514 o If the BUNDLE policy for this PeerConnection is set to "max- 1515 bundle", and this is not the first m= section, or the BUNDLE 1516 policy is set to "balanced", and this is not the first m= section 1517 for this media type, an "a=bundle-only" line. 1519 Lastly, if a data channel has been created, a m= section MUST be 1520 generated for data. 
The <media> field MUST be set to "application" 1521 and the <proto> field MUST be set to "UDP/DTLS/SCTP" if the default 1522 candidate uses UDP transport, or "TCP/DTLS/SCTP" if the default 1523 candidate uses TCP transport [I-D.ietf-mmusic-sctp-sdp]. The "fmt" 1524 value MUST be set to the SCTP port number, as specified in 1525 Section 4.1. [TODO: update this to use a=sctp-port, as indicated in 1526 the latest data channel docs] 1528 Within the data m= section, the "a=mid", "a=ice-ufrag", "a=ice- 1529 pwd", "a=ice-options", "a=candidate", "a=fingerprint", and 1530 "a=setup" lines MUST be included as mentioned above, along with an 1531 "a=sctpmap" line referencing the SCTP port number and specifying the 1532 application protocol indicated in [I-D.ietf-rtcweb-data-protocol]. 1533 [OPEN ISSUE: the -01 of this document is missing this information.] 1535 Once all m= sections have been generated, a session-level "a=group" 1536 attribute MUST be added as specified in [RFC5888]. This attribute 1537 MUST have semantics "BUNDLE", and MUST include the mid identifiers of 1538 each m= section. The effect of this is that the browser offers all 1539 m= sections as one BUNDLE group. However, whether the m= sections 1540 are bundle-only or not depends on the BUNDLE policy. 1542 The next step is to generate session-level lip sync groups as defined 1543 in [RFC5888], Section 7. For each MediaStream with more than one 1544 MediaStreamTrack, a group of type "LS" MUST be added that contains 1545 the mid values for each MediaStreamTrack in that MediaStream. 1547 Attributes which SDP permits to either be at the session level or the 1548 media level SHOULD generally be at the media level even if they are 1549 identical. This promotes readability, especially if one of a set of 1550 initially identical attributes is subsequently changed.
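The grouping rules above, together with the "a=bundle-only" rule given earlier for the "max-bundle" and "balanced" policies, can be sketched as follows. This is a minimal illustration: the dictionary layout for an m= section and the function names are assumptions of the sketch, and any policy value other than the two named in the text is simply treated as producing no bundle-only sections.

```python
def group_lines(sections):
    """Build session-level a=group lines for an initial offer.

    `sections` is a hypothetical list of dicts with keys "mid", "kind",
    and "stream" (a MediaStream id, or None for a data section), given
    in m= section order.
    """
    # One BUNDLE group containing the mid of every m= section.
    lines = ["a=group:BUNDLE " + " ".join(s["mid"] for s in sections)]
    # One LS group per MediaStream that has more than one track.
    by_stream = {}
    for s in sections:
        if s["stream"] is not None:
            by_stream.setdefault(s["stream"], []).append(s["mid"])
    for mids in by_stream.values():
        if len(mids) > 1:
            lines.append("a=group:LS " + " ".join(mids))
    return lines


def is_bundle_only(index, sections, policy):
    """Decide whether the m= section at `index` gets "a=bundle-only"."""
    if policy == "max-bundle":
        # Every m= section except the first one.
        return index > 0
    if policy == "balanced":
        # Every m= section except the first of its media type.
        kind = sections[index]["kind"]
        return any(s["kind"] == kind for s in sections[:index])
    return False
```

For an offer with two audio sections, one video section, and a data section, the "balanced" policy marks only the second audio section as bundle-only, while "max-bundle" marks everything after the first m= section.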
1552 Attributes other than the ones specified above MAY be included, 1553 except for the following attributes which are specifically 1554 incompatible with the requirements of [I-D.ietf-rtcweb-rtp-usage], 1555 and MUST NOT be included: 1557 o "a=crypto" 1559 o "a=key-mgmt" 1561 o "a=ice-lite" 1563 Note that when BUNDLE is used, any additional attributes that are 1564 added MUST follow the advice in [I-D.ietf-mmusic-sdp-mux-attributes] 1565 on how those attributes interact with BUNDLE. 1567 Note that these requirements are in some cases stricter than those of 1568 SDP. Implementations MUST be prepared to accept compliant SDP even 1569 if it would not conform to the requirements for generating SDP in 1570 this specification. 1572 5.2.2. Subsequent Offers 1574 When createOffer is called a second (or later) time, or is called 1575 after a local description has already been installed, the processing 1576 is somewhat different than for an initial offer. 1578 If the initial offer was not applied using setLocalDescription, 1579 meaning the PeerConnection is still in the "stable" state, the steps 1580 for generating an initial offer should be followed, subject to the 1581 following restriction: 1583 o The fields of the "o=" line MUST stay the same except for the 1584 <sess-version> field, which MUST increment if the session 1585 description changes in any way, including the addition of ICE 1586 candidates. 1588 If the initial offer was applied using setLocalDescription, but an 1589 answer from the remote side has not yet been applied, meaning the 1590 PeerConnection is still in the "local-offer" state, an offer is 1591 generated by following the steps in the "stable" state above, along 1592 with these exceptions: 1594 o The "s=" and "t=" lines MUST stay the same. 1596 o Each "m=" and "c=" line MUST be filled in with the port, protocol, 1597 and address of the default candidate for the m= section, as 1598 described in [RFC5245], Section 4.3.
Each "a=rtcp" attribute line 1599 MUST also be filled in with the port and address of the 1600 appropriate default candidate, either the default RTP or RTCP 1601 candidate, depending on whether RTCP multiplexing is currently 1602 active or not. Note that if RTCP multiplexing is being offered, 1603 but not yet active, the default RTCP candidate MUST be used, as 1604 indicated in [RFC5761], section 5.1.3. In each case, if no 1605 candidates of the desired type have yet been gathered, dummy 1606 values MUST be used, as described above. 1608 o Each "a=mid" line MUST stay the same. 1610 o Each "a=ice-ufrag" and "a=ice-pwd" line MUST stay the same, unless 1611 the ICE configuration has changed (either changes to the supported 1612 STUN/TURN servers, or the ICE candidate policy), or the 1613 "IceRestart" option (Section 5.2.3.3) was specified. 1615 o Within each m= section, for each candidate that has been gathered 1616 during the most recent gathering phase (see Section 3.4.1), an 1617 "a=candidate" line MUST be added, as specified in [RFC5245], 1618 Section 4.3, paragraph 3. If candidate gathering for the section 1619 has completed, an "a=end-of-candidates" attribute MUST be added, 1620 as described in [I-D.ietf-mmusic-trickle-ice], Section 9.3. 1622 o For MediaStreamTracks that are still present, the "a=msid", 1623 "a=ssrc", and "a=ssrc-group" lines MUST stay the same. 1625 o If any MediaStreamTracks have been removed, either through the 1626 removeStream method or by removing them from an added MediaStream, 1627 their m= sections MUST be marked as recvonly by changing the value 1628 of the [RFC3264] directional attribute to "a=recvonly". The 1629 "a=msid", "a=ssrc", and "a=ssrc-group" lines MUST be removed from 1630 the associated m= sections. 1632 o If any MediaStreamTracks have been added, and there exist m= 1633 sections of the appropriate media type with no associated 1634 MediaStreamTracks (i.e.
as described in the preceding paragraph), 1635 those m= sections MUST be recycled by adding the new 1636 MediaStreamTrack to the m= section. This is done by adding the 1637 necessary "a=msid", "a=ssrc", and "a=ssrc-group" lines to the 1638 recycled m= section, and removing the "a=recvonly" attribute. 1640 If the initial offer was applied using setLocalDescription, and an 1641 answer from the remote side has been applied using 1642 setRemoteDescription, meaning the PeerConnection is in the "remote- 1643 pranswer" or "stable" states, an offer is generated based on the 1644 negotiated session descriptions by following the steps mentioned for 1645 the "local-offer" state above, along with these exceptions: [OPEN 1646 ISSUE: should this be permitted in the remote-pranswer state? 1647 https://github.com/rtcweb-wg/jsep/issues/143] 1649 o If a m= section exists in the current local description, but does 1650 not have an associated local MediaStreamTrack (possibly because 1651 said MediaStreamTrack was removed since the last exchange), a m= 1652 section MUST still be generated in the new offer, as indicated in 1653 [RFC3264], Section 8. The disposition of this section will depend 1654 on the state of the remote MediaStreamTrack associated with this 1655 m= section. If one exists, and it is still in the "live" state, 1656 the new m= section MUST be marked as "a=recvonly", with no 1657 "a=msid" or related attributes present. If no remote 1658 MediaStreamTrack exists, or it is in the "ended" state, the m= 1659 section MUST be marked as rejected, by setting the port to zero, 1660 as indicated in [RFC3264], Section 8.2. 
1662 o If any MediaStreamTracks have been added, and there exist recvonly 1663 m= sections of the appropriate media type with no associated 1664 MediaStreamTracks, or rejected m= sections of any media type, 1665 those m= sections MUST be recycled, and a local MediaStreamTrack 1666 associated with these recycled m= sections until all such existing 1667 m= sections have been used. This includes any recvonly or 1668 rejected m= sections created by the preceding paragraph. 1670 In addition, for each non-recycled, non-rejected m= section in the 1671 new offer, the following adjustments are made based on the contents 1672 of the corresponding m= section in the current remote description: 1674 o The m= line and corresponding "a=rtpmap" and "a=fmtp" lines MUST 1675 only include codecs present in the remote description. 1677 o The RTP header extensions MUST only include those that are present 1678 in the remote description. 1680 o The RTCP feedback extensions MUST only include those that are 1681 present in the remote description. 1683 o The "a=rtcp-mux" line MUST only be added if present in the remote 1684 description. 1686 o The "a=rtcp-rsize" line MUST only be added if present in the 1687 remote description. 1689 The "a=group:BUNDLE" attribute MUST include the mid identifiers 1690 specified in the BUNDLE group in the most recent answer, minus any m= 1691 sections that have been marked as rejected, plus any newly added or 1692 re-enabled m= sections. In other words, the BUNDLE attribute must 1693 contain all m= sections that were previously bundled, as long as they 1694 are still alive, as well as any new m= sections. 1696 The "LS" groups are generated in the same way as with initial offers. 1698 5.2.3. Options Handling 1700 The createOffer method takes as a parameter an RTCOfferOptions 1701 object. Special processing is performed when generating a SDP 1702 description if the following options are present. 1704 5.2.3.1. 
OfferToReceiveAudio 1706 If the "OfferToReceiveAudio" option is specified, with an integer 1707 value of N, and M audio MediaStreamTracks have been added to the 1708 PeerConnection, the offer MUST include N non-rejected m= sections 1709 with media type "audio", even if N is greater than M. This allows 1710 the offerer to receive audio, including multiple independent streams, 1711 even when not sending it; accordingly, the directional attribute on 1712 the N-M audio m= sections without associated MediaStreamTracks MUST 1713 be set to recvonly. 1715 If N is set to a value less than M, the offer MUST mark the m= 1716 sections associated with the M-N most recently added (since the last 1717 setLocalDescription) MediaStreamTracks as sendonly. This allows the 1718 offerer to indicate that it does not want to receive audio on some or 1719 all of its newly created streams. For m= sections that have 1720 previously been negotiated, this setting has no effect. [TODO: refer 1721 to RTCRtpSender in the future] 1723 For backwards compatibility with pre-standard versions of this 1724 specification, a value of "true" is interpreted as equivalent to N=1, 1725 and "false" as N=0. 1727 5.2.3.2. OfferToReceiveVideo 1729 If the "OfferToReceiveVideo" option is specified, with an integer 1730 value of N, and M video MediaStreamTracks have been added to the 1731 PeerConnection, the offer MUST include N non-rejected m= sections 1732 with media type "video", even if N is greater than M. This allows 1733 the offerer to receive video, including multiple independent streams, 1734 even when not sending it; accordingly, the directional attribute on 1735 the N-M video m= sections without associated MediaStreamTracks MUST 1736 be set to recvonly. 1738 If N is set to a value less than M, the offer MUST mark the m= 1739 sections associated with the M-N most recently added (since the last 1740 setLocalDescription) MediaStreamTracks as sendonly. 
This allows the 1741 offerer to indicate that it does not want to receive video on some or 1742 all of its newly created streams. For m= sections that have 1743 previously been negotiated, this setting has no effect. [TODO: refer 1744 to RTCRtpSender in the future] 1746 For backwards compatibility with pre-standard versions of this 1747 specification, a value of "true" is interpreted as equivalent to N=1, 1748 and "false" as N=0. 1750 5.2.3.3. IceRestart 1752 If the "IceRestart" option is specified, with a value of "true", the 1753 offer MUST indicate an ICE restart by generating new ICE ufrag and 1754 pwd attributes, as specified in [RFC5245], Section 9.1.1.1. If this 1755 option is specified on an initial offer, it has no effect (since a 1756 new ICE ufrag and pwd are already generated). Similarly, if the ICE 1757 configuration has changed, this option has no effect, since new ufrag 1758 and pwd attributes will be generated automatically. This option is 1759 primarily useful for reestablishing connectivity in cases where 1760 failures are detected by the application. 1762 5.2.3.4. VoiceActivityDetection 1764 If the "VoiceActivityDetection" option is specified, with a value of 1765 "true", the offer MUST indicate support for silence suppression in 1766 the audio it receives by including comfort noise ("CN") codecs for 1767 each offered audio codec, as specified in [RFC3389], Section 5.1, 1768 except for codecs that have their own internal silence suppression 1769 support. For codecs that have their own internal silence suppression 1770 support, the appropriate fmtp parameters for that codec MUST be 1771 specified to indicate that silence suppression for received audio is 1772 desired. For example, when using the Opus codec, the "usedtx=1" 1773 parameter would be specified in the offer. 
This option allows the 1774 endpoint to significantly reduce the amount of audio bandwidth it 1775 receives, at the cost of some fidelity, depending on the quality of 1776 the remote VAD algorithm. 1778 5.3. Generating an Answer 1780 When createAnswer is called, a new SDP description must be created 1781 that is compatible with the supplied remote description as well as 1782 the requirements specified in [I-D.ietf-rtcweb-rtp-usage]. The exact 1783 details of this process are explained below. 1785 5.3.1. Initial Answers 1787 When createAnswer is called for the first time after a remote 1788 description has been provided, the result is known as the initial 1789 answer. If no remote description has been installed, an answer 1790 cannot be generated, and an error MUST be returned. 1792 Note that the remote description SDP may not have been created by a 1793 JSEP endpoint and may not conform to all the requirements listed in 1794 Section 5.2. For many cases, this is not a problem. However, if any 1795 mandatory SDP attributes are missing, or functionality listed as 1796 mandatory-to-use above is not present, this MUST be treated as an 1797 error, and MUST cause the affected m= sections to be marked as 1798 rejected. 1800 The first step in generating an initial answer is to generate 1801 session-level attributes. The process here is identical to that 1802 indicated in the Initial Offers section above. 1804 The next step is to generate lip sync groups as defined in [RFC5888], 1805 Section 7. For each MediaStream with more than one MediaStreamTrack, 1806 a group of type "LS" MUST be added that contains the mid values for 1807 each MediaStreamTrack in that MediaStream. In some cases this may 1808 result in adding a mid to a given LS group that was not in that LS 1809 group in the associated offer. Although this is not allowed by 1810 [RFC5888], it is allowed when implementing this specification. 1811 [[OPEN ISSUE: This is still under discussion. 
See: 1812 https://github.com/rtcweb-wg/jsep/issues/162.]] 1814 The next step is to generate m= sections for each m= section that is 1815 present in the remote offer, as specified in [RFC3264], Section 6. 1816 For the purposes of this discussion, any session-level attributes in 1817 the offer that are also valid as media-level attributes SHALL be 1818 considered to be present in each m= section. 1820 The next step is to go through each offered m= section. If there is 1821 a local MediaStreamTrack of the same type which has been added to the 1822 PeerConnection via addStream and not yet associated with a m= 1823 section, and the specific m= section is either sendrecv or recvonly, 1824 the MediaStreamTrack will be associated with the m= section at this 1825 time. MediaStreamTracks are assigned to m= sections using the 1826 canonical order described in Section 5.2.1. If there are more m= 1827 sections of a certain type than MediaStreamTracks, some m= sections 1828 will not have an associated MediaStreamTrack. If there are more 1829 MediaStreamTracks of a certain type than compatible m= sections, only 1830 the first N MediaStreamTracks will be able to be associated in the 1831 constructed answer. The remainder will need to be associated in a 1832 subsequent offer. 1834 For each offered m= section, if the associated remote 1835 MediaStreamTrack has been stopped, and is therefore in state "ended", 1836 and no local MediaStreamTrack has been associated, the corresponding 1837 m= section in the answer MUST be marked as rejected by setting the 1838 port in the m= line to zero, as indicated in [RFC3264], Section 6, 1839 and further processing for this m= section can be skipped. 1841 Provided that is not the case, each m= section in the answer should 1842 then be generated as specified in [RFC3264], Section 6.1.
For the m= 1843 line itself, the following rules must be followed: 1845 o The port value would normally be set to the port of the default 1846 ICE candidate for this m= section, but given that no candidates 1847 have yet been gathered, the "dummy" port value of 9 (Discard) MUST 1848 be used, as indicated in [I-D.ietf-mmusic-trickle-ice], 1849 Section 5.1. 1851 o The <proto> field MUST be set to exactly match the <proto> field 1852 for the corresponding m= line in the offer. 1854 The m= line MUST be followed immediately by a "c=" line, as specified 1855 in [RFC4566], Section 5.7. Again, as no candidates have yet been 1856 gathered, the "c=" line must contain the "dummy" value "IN IP6 ::", 1857 as defined in [I-D.ietf-mmusic-trickle-ice], Section 5.1. 1859 If the offer supports BUNDLE, all m= sections to be BUNDLEd must use 1860 the same ICE credentials and candidates; all m= sections not being 1861 BUNDLEd must use unique ICE credentials and candidates. Each m= 1862 section MUST include the following: 1864 o If present in the offer, an "a=mid" line, as specified in 1865 [RFC5888], Section 9.1. The "mid" value MUST match that specified 1866 in the offer. 1868 o An "a=rtcp" line, as specified in [RFC3605], Section 2.1, 1869 containing the dummy value "9 IN IP6 ::", because no candidates 1870 have yet been gathered. 1872 o If a local MediaStreamTrack has been associated, an "a=msid" line, 1873 as specified in [I-D.ietf-mmusic-msid], Section 2. 1875 o Depending on the directionality of the offer, the disposition of 1876 any associated remote MediaStreamTrack, and the presence of an 1877 associated local MediaStreamTrack, the appropriate directionality 1878 attribute, as specified in [RFC3264], Section 6.1. If the offer 1879 was sendrecv, and the remote MediaStreamTrack is still "live", and 1880 there is a local MediaStreamTrack that has been associated, the 1881 directionality MUST be set as sendrecv.
If the offer was 1882 sendonly, and the remote MediaStreamTrack is still "live", the 1883 directionality MUST be set as recvonly. If the offer was 1884 recvonly, and a local MediaStreamTrack has been associated, the 1885 directionality MUST be set as sendonly. If the offer was 1886 inactive, the directionality MUST be set as inactive. 1888 o For each supported codec that is present in the offer, "a=rtpmap" 1889 and "a=fmtp" lines, as specified in [RFC4566], Section 6, and 1890 [RFC3264], Section 6.1. The audio and video codecs that MUST be 1891 supported are specified in [I-D.ietf-rtcweb-audio] (see Section 3) 1892 and [I-D.ietf-rtcweb-video] (see Section 5). Note that for 1893 simplicity, the answerer MAY use different payload types for 1894 codecs than the offerer, as it is not prohibited by Section 6.1. 1896 o If this m= section is for media with configurable frame sizes, 1897 e.g. audio, an "a=maxptime" line, indicating the smallest of the 1898 maximum supported frame sizes out of all codecs included above, as 1899 specified in [RFC4566], Section 6. 1901 o If this m= section is for video media, and there are known 1902 limitations on the size of images which can be decoded, an 1903 "a=imageattr" line, as specified in Section 3.5. 1905 o If "rtx" is present in the offer, for each primary codec where RTP 1906 retransmission should be used, a corresponding "a=rtpmap" line 1907 indicating "rtx" with the clock rate of the primary codec and an 1908 "a=fmtp" line that references the payload type of the primary 1909 codec, as specified in [RFC4588], Section 8.1. 1911 o For each supported FEC mechanism, "a=rtpmap" and "a=fmtp" lines, 1912 as specified in [RFC4566], Section 6. The FEC mechanisms that 1913 MUST be supported are specified in [I-D.ietf-rtcweb-fec], 1914 Section 6, and specific usage for each media type is outlined in 1915 Sections 4 and 5. 1917 o "a=ice-ufrag" and "a=ice-pwd" lines, as specified in [RFC5245], 1918 Section 15.4.
1920 o If the "trickle" ICE option is present in the offer, an "a=ice- 1921 options" line, with the "trickle" option, as specified in 1922 [I-D.ietf-mmusic-trickle-ice], Section 4. 1924 o An "a=fingerprint" line, as specified in [RFC4572], Section 5; the 1925 algorithm used for the fingerprint MUST match that used in the 1926 certificate signature. 1928 o An "a=setup" line, as specified in [RFC4145], Section 4, and 1929 clarified for use in DTLS-SRTP scenarios in [RFC5763], Section 5. 1931 The role value in the answer MUST be "active" or "passive"; the 1932 "active" role is RECOMMENDED. 1934 o If present in the offer, an "a=rtcp-mux" line, as specified in 1935 [RFC5761], Section 5.1.1. If the "require" RTCP multiplexing 1936 policy is set and no "a=rtcp-mux" line is present in the offer, 1937 then the m=line MUST be marked as rejected by setting the port in 1938 the m= line to zero, as indicated in [RFC3264], Section 6. 1940 o If present in the offer, an "a=rtcp-rsize" line, as specified in 1941 [RFC5506], Section 5. 1943 o For each supported RTP header extension that is present in the 1944 offer, an "a=extmap" line, as specified in [RFC5285], Section 5. 1945 The list of header extensions that SHOULD/MUST be supported is 1946 specified in [I-D.ietf-rtcweb-rtp-usage], Section 5.2. Any header 1947 extensions that require encryption MUST be specified as indicated 1948 in [RFC6904], Section 4. 1950 o For each supported RTCP feedback mechanism that is present in the 1951 offer, an "a=rtcp-fb" mechanism, as specified in [RFC4585], 1952 Section 4.2. The list of RTCP feedback mechanisms that SHOULD/ 1953 MUST be supported is specified in [I-D.ietf-rtcweb-rtp-usage], 1954 Section 5.1. 
1956 o If a local MediaStreamTrack has been associated, an "a=ssrc" line, 1957 as specified in [RFC5576], Section 4.1, indicating the SSRC to be 1958 used for sending media, along with the mandatory "cname" source 1959 attribute, as specified in Section 6.1, indicating the CNAME for 1960 the source. The CNAME MUST be generated in accordance with 1961 Section 4.9 of [I-D.ietf-rtcweb-rtp-usage]. 1963 o If a local MediaStreamTrack has been associated, and RTX has been 1964 negotiated for this m= section, another "a=ssrc" line with the RTX 1965 SSRC, and an "a=ssrc-group" line, as specified in [RFC5576], 1966 Section 4.2, with semantics set to "FID" and including the primary 1967 and RTX SSRCs. 1969 o If a local MediaStreamTrack has been associated, and FEC has been 1970 negotiated for this m= section, another "a=ssrc" line with the FEC 1971 SSRC, and an "a=ssrc-group" line with semantics set to "FEC-FR" 1972 and including the primary and FEC SSRCs, as specified in 1973 [RFC5956], Section 4.3. For simplicity, if both RTX and FEC are 1974 supported, the FEC SSRC MUST be the same as the RTX SSRC. 1976 If a data channel m= section has been offered, a m= section MUST also 1977 be generated for data. The <media> field MUST be set to 1978 "application" and the <proto> field MUST be set to exactly match the 1979 field in the offer; the "fmt" value MUST be set to the SCTP port 1980 number, as specified in Section 4.1. [TODO: update this to use 1981 a=sctp-port, as indicated in the latest data channel docs] 1983 Within the data m= section, the "a=mid", "a=ice-ufrag", "a=ice- 1984 pwd", "a=ice-options", "a=candidate", "a=fingerprint", and 1985 "a=setup" lines MUST be included as mentioned above, along with an 1986 "a=sctpmap" line referencing the SCTP port number and specifying the 1987 application protocol indicated in [I-D.ietf-rtcweb-data-protocol]. 1988 [OPEN ISSUE: the -01 of this document is missing this information.]
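The directionality rules given in the attribute list above can be illustrated with a small sketch. This is not text from the specification: the function name answer_direction and its parameters are hypothetical, and the handling of combinations that the list does not spell out (for example, a sendrecv offer with no local MediaStreamTrack) is an assumption derived from the general offer/answer model in [RFC3264], Section 6.1.

```python
from typing import Literal

Direction = Literal["sendrecv", "sendonly", "recvonly", "inactive"]


def answer_direction(offer: Direction,
                     remote_track_live: bool,
                     local_track_associated: bool) -> Direction:
    """Pick the direction attribute for one m= section of an answer.

    Hypothetical helper; combinations not listed in the text are
    resolved by the assumption described in the lead-in.
    """
    # The answerer can send only if the offerer is willing to receive
    # and a local MediaStreamTrack has been associated.
    can_send = offer in ("sendrecv", "recvonly") and local_track_associated
    # The answerer receives only if the offerer intends to send and the
    # remote MediaStreamTrack is still "live".
    can_recv = offer in ("sendrecv", "sendonly") and remote_track_live
    if can_send and can_recv:
        return "sendrecv"
    if can_send:
        return "sendonly"
    if can_recv:
        return "recvonly"
    return "inactive"
```

The sketch covers only the direction attribute. Rejecting an m= section by setting its port to zero is a separate decision.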
1990 If "a=group" attributes with semantics of "BUNDLE" are offered, 1991 corresponding session-level "a=group" attributes MUST be added as 1992 specified in [RFC5888]. These attributes MUST have semantics 1993 "BUNDLE", and MUST include all the mid identifiers from the offered 1994 BUNDLE groups that have not been rejected. Note that regardless of 1995 the presence of "a=bundle-only" in the offer, no m= sections in the 1996 answer should have an "a=bundle-only" line. 1998 Attributes that are common between all m= sections MAY be moved to 1999 session-level, if explicitly defined to be valid at session-level. 2001 The attributes prohibited in the creation of offers are also 2002 prohibited in the creation of answers. 2004 5.3.2. Subsequent Answers 2006 When createAnswer is called a second (or later) time, or is called 2007 after a local description has already been installed, the processing 2008 is somewhat different than for an initial answer. 2010 If the initial answer was not applied using setLocalDescription, 2011 meaning the PeerConnection is still in the "have-remote-offer" state, 2012 the steps for generating an initial answer should be followed, 2013 subject to the following restriction: 2015 o The fields of the "o=" line MUST stay the same except for the 2016 <session-version> field, which MUST increment if the session 2017 description changes in any way from the previously generated 2018 answer. 2020 If any session description was previously supplied to 2021 setLocalDescription, an answer is generated by following the steps in 2022 the "have-remote-offer" state above, along with these exceptions: 2024 o The "s=" and "t=" lines MUST stay the same. 2026 o Each "m=" and "c=" line MUST be filled in with the port and address 2027 of the default candidate for the m= section, as described in 2028 [RFC5245], Section 4.3.
Note, however, that the m= line protocol 2029 need not match the default candidate, because this protocol value 2030 must instead match what was supplied in the offer, as described 2031 above. Each "a=rtcp" attribute line MUST also be filled in with 2032 the port and address of the appropriate default candidate, either 2033 the default RTP or RTCP candidate, depending on whether RTCP 2034 multiplexing is enabled in the answer. In each case, if no 2035 candidates of the desired type have yet been gathered, dummy 2036 values MUST be used, as described in the initial answer section 2037 above. 2039 o Each "a=ice-ufrag" and "a=ice-pwd" line MUST stay the same. 2041 o Within each m= section, for each candidate that has been gathered 2042 during the most recent gathering phase (see Section 3.4.1), an 2043 "a=candidate" line MUST be added, as specified in [RFC5245], 2044 Section 4.3., paragraph 3. If candidate gathering for the section 2045 has completed, an "a=end-of-candidates" attribute MUST be added, 2046 as described in [I-D.ietf-mmusic-trickle-ice], Section 9.3. 2048 o For MediaStreamTracks that are still present, the "a=msid", 2049 "a=ssrc", and "a=ssrc-group" lines MUST stay the same. 2051 5.3.3. Options Handling 2053 The createAnswer method takes as a parameter an RTCAnswerOptions 2054 object. The set of parameters for RTCAnswerOptions is different than 2055 those supported in RTCOfferOptions; the OfferToReceiveAudio, 2056 OfferToReceiveVideo, and IceRestart options mentioned in 2057 Section 5.2.3 are meaningless in the context of generating an answer, 2058 as there is no need to generate extra m= lines in an answer, and ICE 2059 credentials will automatically be changed for all m= lines where the 2060 offerer chose to perform ICE restart. 2062 The following options are supported in RTCAnswerOptions. 2064 5.3.3.1. VoiceActivityDetection 2066 Silence suppression in the answer is handled as described in 2067 Section 5.2.3.4. 2069 5.4. 
Processing a Local Description 2071 When a SessionDescription is supplied to setLocalDescription, the 2072 following steps MUST be performed: 2074 o First, the type of the SessionDescription is checked against the 2075 current state of the PeerConnection: 2077 * If the type is "offer", the PeerConnection state MUST be either 2078 "stable" or "have-local-offer". 2080 * If the type is "pranswer" or "answer", the PeerConnection state 2081 MUST be either "have-remote-offer" or "have-local-pranswer". 2083 o If the type is not correct for the current state, processing MUST 2084 stop and an error MUST be returned. 2086 o Next, the SessionDescription is parsed into a data structure, as 2087 described in Section 5.6 below. If parsing fails for 2088 any reason, processing MUST stop and an error MUST be returned. 2090 o Finally, the parsed SessionDescription is applied as described in 2091 Section 5.7 below. 2093 5.5. Processing a Remote Description 2095 When a SessionDescription is supplied to setRemoteDescription, the 2096 following steps MUST be performed: 2098 o First, the type of the SessionDescription is checked against the 2099 current state of the PeerConnection: 2101 * If the type is "offer", the PeerConnection state MUST be either 2102 "stable" or "have-remote-offer". 2104 * If the type is "pranswer" or "answer", the PeerConnection state 2105 MUST be either "have-local-offer" or "have-remote-pranswer". 2107 o If the type is not correct for the current state, processing MUST 2108 stop and an error MUST be returned. 2110 o Next, the SessionDescription is parsed into a data structure, as 2111 described in Section 5.6 below. If parsing fails for 2112 any reason, processing MUST stop and an error MUST be returned. 2114 o Finally, the parsed SessionDescription is applied as described in 2115 Section 5.8 below. 2117 5.6.
Parsing a Session Description 2119 [The behavior described herein is a draft version, and needs more 2120 discussion to resolve various open issues.] 2121 When a SessionDescription of any type is supplied to setLocal/ 2122 RemoteDescription, the implementation must parse it and reject it if 2123 it is invalid. The exact details of this process are explained 2124 below. 2126 The SDP contained in the session description object consists of a 2127 sequence of text lines, each containing a key-value expression, as 2128 described in [RFC4566], Section 5. The SDP is read, line-by-line, 2129 and converted to a data structure that contains the deserialized 2130 information. However, SDP allows many types of lines, not all of 2131 which are relevant to JSEP applications. For each line, the 2132 implementation will first ensure it is syntactically correct 2133 according to its defining ABNF [TODO: reference], check that it conforms 2134 to [RFC4566] and [RFC3264] semantics, and then either parse and store 2135 or discard the provided value, as described below. [TODO: ensure 2136 that every line is listed below.] If the line is not well-formed, or 2137 cannot be parsed as described, the parser MUST stop with an error and 2138 reject the session description. This ensures that implementations do 2139 not accidentally misinterpret ambiguous SDP. 2141 5.6.1. Session-Level Parsing 2143 First, the session-level lines are checked and parsed. These lines 2144 MUST occur in a specific order, and with a specific syntax, as 2145 defined in [RFC4566], Section 5. Note that while the specific line 2146 types (e.g. "v=", "c=") MUST occur in the defined order, lines of the 2147 same type (typically "a=") can occur in any order, and their ordering 2148 is not meaningful. 2150 For non-attribute (non-"a=") lines, their sequencing, syntax, and 2151 semantics are checked, as mentioned above.
The following lines are 2152 not meaningful in the JSEP context and MAY be discarded once they 2153 have been checked. 2155 The "c=" line MUST be checked for syntax but its value is not 2156 used. This supersedes the guidance in [RFC5245], Section 6.1, to 2157 use "ice-mismatch" to indicate mismatches between "c=" and the 2158 candidate lines; because JSEP always uses ICE, "ice-mismatch" is 2159 not useful in this context. 2161 TODO 2163 The remaining lines are processed as follows: 2165 The "b=" line, if present, MUST be parsed as specified in 2166 [RFC4566], Section 5.8, and the bwtype and bandwidth values 2167 stored. 2169 [OPEN ISSUE: is this WG consensus? Are there other non-a= lines 2170 that we need to do more than just syntactical validation, e.g. 2171 v=?] 2173 Specific processing MUST be applied for the following session-level 2174 attribute ("a=") lines: 2176 o Any "a=group" lines are parsed as specified in [RFC5888], 2177 Section 5, and the group's semantics and mids are stored. 2179 o If present, a single "a=ice-lite" line is parsed as specified in 2180 [RFC5245], Section 15.3, and a value indicating the presence of 2181 ice-lite is stored. 2183 o If present, a single "a=ice-ufrag" line is parsed as specified in 2184 [RFC5245], Section 15.4, and the ufrag value is stored. 2186 o If present, a single "a=ice-pwd" line is parsed as specified in 2187 [RFC5245], Section 15.4, and the password value is stored. 2189 o If present, a single "a=ice-options" line is parsed as specified 2190 in [RFC5245], Section 15.5, and the set of specified options is 2191 stored. 2193 o Any "a=fingerprint" lines are parsed as specified in [RFC4572], 2194 Section 5, and the set of fingerprint and algorithm values is 2195 stored. 2197 o If present, a single "a=setup" line is parsed as specified in 2198 [RFC4145], Section 4, and the setup value is stored. 2200 o Any "a=extmap" lines are parsed as specified in [RFC5285], 2201 Section 5, and their values are stored. 
2203 o TODO: msid-semantic, identity, rtcp-rsize, rtcp-mux, and any other 2204 attribs valid at session level. 2206 Once all the session-level lines have been parsed, processing 2207 continues with the lines in media sections. 2209 5.6.2. Media Section Parsing 2211 Like the session-level lines, the media session lines MUST occur in 2212 the specific order and with the specific syntax defined in [RFC4566], 2213 Section 5. 2215 The "m=" line itself MUST be parsed as described in [RFC4566], 2216 Section 5.14, and the media, port, proto, and fmt values stored. 2218 Following the "m=" line, specific processing MUST be applied for the 2219 following non-attribute lines: 2221 o As with the "c=" line at the session level, the "c=" line MUST be 2222 parsed according to [RFC4566], Section 5.7, but its value is not 2223 used. 2225 o The "b=" line, if present, MUST be parsed as specified in 2226 [RFC4566], Section 5.8, and the bwtype and bandwidth values 2227 stored. 2229 Specific processing MUST also be applied for the following attribute 2230 lines: 2232 o If present, a single "a=ice-ufrag" line is parsed as specified in 2233 [RFC5245], Section 15.4, and the ufrag value is stored. 2235 o If present, a single "a=ice-pwd" line is parsed as specified in 2236 [RFC5245], Section 15.4, and the password value is stored. 2238 o If present, a single "a=ice-options" line is parsed as specified 2239 in [RFC5245], Section 15.5, and the set of specified options is 2240 stored. 2242 o Any "a=fingerprint" lines are parsed as specified in [RFC4572], 2243 Section 5, and the set of fingerprint and algorithm values is 2244 stored. 2246 o If present, a single "a=setup" line is parsed as specified in 2247 [RFC4145], Section 4, and the setup value is stored. 
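The line-by-line reading described in Section 5.6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the parsing algorithm of this specification: the names SdpParseError, MediaSection, ParsedSdp, and parse_sdp are hypothetical, only the generic "<type>=<value>" shape and the "m=" line are checked, the ordering rules of [RFC4566], Section 5 are not enforced, and the "<port>/<number of ports>" form of the m= line is not handled.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


class SdpParseError(ValueError):
    """Raised when a line is not well-formed; the description is rejected."""


@dataclass
class MediaSection:
    media: str
    port: int
    proto: str
    fmt: List[str]
    # Remaining lines of the section as (type, value) pairs.
    lines: List[Tuple[str, str]] = field(default_factory=list)


@dataclass
class ParsedSdp:
    session: List[Tuple[str, str]] = field(default_factory=list)
    media: List[MediaSection] = field(default_factory=list)


def parse_sdp(text: str) -> ParsedSdp:
    """Split SDP text into session-level lines and media sections."""
    parsed = ParsedSdp()
    for number, raw in enumerate(text.splitlines(), start=1):
        # Every line must look like <type>=<value> with a one-letter type.
        if len(raw) < 3 or raw[1] != "=" or not raw[0].islower():
            raise SdpParseError(f"line {number}: not <type>=<value>: {raw!r}")
        kind, value = raw[0], raw[2:]
        if kind == "m":
            # m=<media> <port> <proto> <fmt> ...
            parts = value.split(" ")
            if len(parts) < 4 or not parts[1].isdigit():
                raise SdpParseError(f"line {number}: bad m= line: {raw!r}")
            media, port, proto, *fmt = parts
            parsed.media.append(MediaSection(media, int(port), proto, fmt))
        elif parsed.media:
            # Lines after an m= line belong to that media section.
            parsed.media[-1].lines.append((kind, value))
        else:
            parsed.session.append((kind, value))
    return parsed
```

Stopping at the first malformed line, rather than skipping it, mirrors the requirement that the parser stop with an error and reject the session description.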
2249 If the "m=" proto value indicates use of RTP, as described in 2250 Section 5.1.3 above, the following attribute lines MUST be 2251 processed: 2253 o The "m=" fmt value MUST be parsed as specified in [RFC4566], 2254 Section 5.14, and the individual values stored. 2256 o Any "a=rtpmap" or "a=fmtp" lines MUST be parsed as specified in 2257 [RFC4566], Section 6, and their values stored. 2259 o If present, a single "a=ptime" line MUST be parsed as described in 2260 [RFC4566], Section 6, and its value stored. 2262 o If present, a single direction attribute line (e.g. "a=sendrecv") 2263 MUST be parsed as described in [RFC4566], Section 6, and its value 2264 stored. 2266 o Any "a=ssrc" or "a=ssrc-group" attributes MUST be parsed as 2267 specified in [RFC5576], Sections 4.1-4.2, and their values stored. 2269 o Any "a=extmap" attributes MUST be parsed as specified in 2270 [RFC5285], Section 5, and their values stored. 2272 o Any "a=rtcp-fb" attributes MUST be parsed as specified in 2273 [RFC4585], Section 4.2, and their values stored. 2275 o If present, a single "a=rtcp-mux" line MUST be parsed as specified 2276 in [RFC5761], Section 5.1.1, and its presence or absence flagged 2277 and stored. 2279 o TODO: a=rtcp-rsize, a=rtcp, a=msid, a=candidate, a=end-of- 2280 candidates 2282 Otherwise, if the "m=" proto value indicates use of SCTP, the 2283 following attribute lines MUST be processed: 2285 o The "m=" fmt value MUST be parsed as specified in 2286 [I-D.ietf-mmusic-sctp-sdp], Section 4.3, and the application 2287 protocol value stored. 2289 o An "a=sctp-port" attribute MUST be present, and it MUST be parsed 2290 as specified in [I-D.ietf-mmusic-sctp-sdp], Section 5.2, and the 2291 value stored. 2293 o TODO: max message size 2295 5.6.3. Semantics Verification 2297 Assuming parsing completes successfully, the parsed description is 2298 then evaluated to ensure internal consistency as well as proper 2299 support for mandatory features.
Specifically, the following checks 2300 are performed: 2302 o For each m= section, valid values for each of the mandatory-to-use 2303 features enumerated in Section 5.1.2 MUST be present. These 2304 values MAY either be present at the media level, or inherited from 2305 the session level. 2307 * ICE ufrag and password values 2309 * DTLS fingerprint and setup values 2311 If this session description is of type "pranswer" or "answer", the 2312 following additional checks are applied: 2314 o The session description must follow the rules defined in 2315 [RFC3264], Section 6. 2317 o For each m= section, the protocol value MUST exactly match the 2318 protocol value in the corresponding m= section in the associated 2319 offer. 2321 5.7. Applying a Local Description 2323 The following steps are performed at the media engine level to apply 2324 a local description. 2326 First, the parsed parameters are checked to ensure that any 2327 modifications performed fall within those explicitly permitted by 2328 Section 6; otherwise, processing MUST stop and an error MUST be 2329 returned. 2331 Next, media sections are processed. For each media section, the 2332 following steps MUST be performed; if any parameters are out of 2333 bounds, or cannot be applied, processing MUST stop and an error MUST 2334 be returned. 2336 o TODO 2338 Finally, if this description is of type "pranswer" or "answer", 2339 follow the processing defined in Section 5.9 below. 2341 5.8. Applying a Remote Description 2343 TODO 2345 5.9. Applying an Answer 2347 TODO 2349 6. Configurable SDP Parameters 2351 It is possible to change elements in the SDP returned from 2352 createOffer before passing it to setLocalDescription. When an 2353 implementation receives modified SDP it MUST either: 2355 o Accept the changes and adjust its behavior to match the SDP. 2357 o Reject the changes and return an error via the error callback. 2359 Changes MUST NOT be silently ignored.
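Returning briefly to the semantics verification of Section 5.6.3, the checks listed there can be sketched as below. The data layout is an assumption made for illustration only: session attributes are a plain dictionary, each media section is a dictionary with hypothetical keys "proto" and "attrs", and the function name verify_semantics is not defined by this specification. The count comparison reflects the rule in [RFC3264], Section 6, that an answer carries the same number of m= lines as the offer.

```python
from typing import Dict, List, Optional

# Mandatory-to-use values named in the text: ICE ufrag and password,
# DTLS fingerprint and setup.
MANDATORY = ("ice-ufrag", "ice-pwd", "fingerprint", "setup")


def verify_semantics(session_attrs: Dict[str, str],
                     media_sections: List[dict],
                     offer_protos: Optional[List[str]] = None) -> List[str]:
    """Return a list of problems; an empty list means the checks passed.

    offer_protos is given only when the description is a "pranswer" or
    "answer", and holds the proto value of each m= section of the offer.
    """
    problems = []
    for index, section in enumerate(media_sections):
        for name in MANDATORY:
            # A value may be present at the media level or inherited
            # from the session level.
            if not (section["attrs"].get(name) or session_attrs.get(name)):
                problems.append(f"m= section {index}: missing a={name}")
    if offer_protos is not None:
        if len(offer_protos) != len(media_sections):
            problems.append("answer and offer differ in number of m= sections")
        for index, (section, proto) in enumerate(
                zip(media_sections, offer_protos)):
            # The protocol value must exactly match the offer.
            if section["proto"] != proto:
                problems.append(
                    f"m= section {index}: proto does not match the offer")
    return problems
```

Collecting all problems instead of raising on the first one is a design choice of this sketch. An implementation that must stop and return an error can treat any non-empty result as a failure.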
2361 The following elements of the session description MUST NOT be changed 2362 between the createOffer and the setLocalDescription (or between the 2363 createAnswer and the setLocalDescription), since they reflect 2364 transport attributes that are solely under browser control, and the 2365 browser MUST NOT honor an attempt to change them: 2367 o The number, type and port number of m= lines. 2369 o The generated ICE credentials (a=ice-ufrag and a=ice-pwd). 2371 o The set of ICE candidates and their parameters (a=candidate). 2373 o The DTLS fingerprint(s) (a=fingerprint). 2375 The following modifications, if done by the application to a description 2376 between createOffer/createAnswer and the setLocalDescription, MUST be 2377 honored by the browser: 2379 o Remove or reorder codecs (m=) 2381 The following parameters may be controlled by options passed into 2382 createOffer/createAnswer. As an open issue, these changes may also 2383 be performed by manipulating the SDP returned from createOffer/ 2384 createAnswer, as indicated above, as long as the capabilities of the 2385 endpoint are not exceeded (e.g. asking for a resolution greater than 2386 what the endpoint can encode): 2388 o [[OPEN ISSUE: This is a placeholder for other modifications, which 2389 we may continue adding as use cases appear.]] 2391 Implementations MAY choose to either honor or reject any elements not 2392 listed in the above two categories, but must do so explicitly as 2393 described at the beginning of this section. Note that future 2394 standards may add new SDP elements to the list of elements which must 2395 be accepted or rejected, but due to version skew, applications must 2396 be prepared for implementations to accept changes which must be 2397 rejected and vice versa.
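One way an implementation might detect changes to the elements that MUST NOT be modified is to compare a reduced view of the created and the modified SDP. The sketch below is an assumption-laden illustration, not a procedure defined by this specification: the names protected_view and check_local_modification are hypothetical, the input is assumed to have already passed parsing, and only the media type and port of each m= line are compared so that removing or reordering codecs in the fmt list remains possible.

```python
from typing import List

# Attribute lines that reflect transport state under browser control.
PROTECTED_PREFIXES = ("a=ice-ufrag:", "a=ice-pwd:",
                      "a=candidate:", "a=fingerprint:")


def protected_view(sdp: str) -> List[str]:
    """Reduce an SDP blob to the parts that must not be modified."""
    view = []
    for line in sdp.splitlines():
        if line.startswith("m="):
            # Keep the number, type and port of m= lines; drop the proto
            # and fmt list, since codecs may be removed or reordered.
            media, port = line[2:].split(" ")[:2]
            view.append(f"m={media} {port}")
        elif line.startswith(PROTECTED_PREFIXES):
            view.append(line)
    return view


def check_local_modification(created: str, modified: str) -> None:
    """Raise ValueError if a protected element was changed."""
    if protected_view(created) != protected_view(modified):
        raise ValueError("modified SDP changes browser-controlled elements")
```

Raising an error, rather than quietly restoring the original values, follows the rule that changes are either accepted or rejected, never silently ignored.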
2399 The application can also modify the SDP to reduce the capabilities in 2400 the offer it sends to the far side or the offer that it installs from 2401 the far side in any way the application sees fit, as long as it is a 2402 valid SDP offer and specifies a subset of what was in the original 2403 offer. This is safe because the answer is not permitted to expand 2404 capabilities and therefore will just respond to what is actually in 2405 the offer. 2407 As always, the application is solely responsible for what it sends to 2408 the other party, and all incoming SDP will be processed by the 2409 browser to the extent of its capabilities. It is an error to assume 2410 that all SDP is well-formed; however, one should be able to assume 2411 that any implementation of this specification will be able to 2412 process, as a remote offer or answer, unmodified SDP coming from any 2413 other implementation of this specification. 2415 7. Examples 2417 Note that this example section shows several SDP fragments. To 2418 format in 72 columns, some of the lines in SDP have been split into 2419 multiple lines, where leading whitespace indicates that a line is a 2420 continuation of the previous line. In addition, some blank lines 2421 have been added to improve readability but are not valid in SDP. 2423 More examples of SDP for WebRTC call flows can be found in 2424 [I-D.nandakumar-rtcweb-sdp]. 2426 7.1. Simple Example 2428 This section shows a very simple example that sets up a minimal audio 2429 / video call between two browsers and does not use trickle ICE. The 2430 example in the following section provides a more realistic example of 2431 what would happen in a normal browser to browser connection. 2433 The flow shows Alice's browser initiating the session to Bob's 2434 browser. The messages from Alice's JS to Bob's JS are assumed to 2435 flow over some signaling protocol via a web server. 
The JS on both 2436 Alice's side and Bob's side waits for all candidates before sending 2437 the offer or answer, so the offers and answers are complete. Trickle 2438 ICE is not used. Both Alice and Bob are using the default policy of 2439 balanced. 2441 // set up local media state 2442 AliceJS->AliceUA: create new PeerConnection 2443 AliceJS->AliceUA: addStream with stream containing audio and video 2444 AliceJS->AliceUA: createOffer to get offer 2445 AliceJS->AliceUA: setLocalDescription with offer 2446 AliceUA->AliceJS: multiple onicecandidate events with candidates 2448 // wait for ICE gathering to complete 2449 AliceUA->AliceJS: onicecandidate event with null candidate 2450 AliceJS->AliceUA: get |offer-A1| from value of localDescription 2452 // |offer-A1| is sent over signaling protocol to Bob 2453 AliceJS->WebServer: signaling with |offer-A1| 2454 WebServer->BobJS: signaling with |offer-A1| 2456 // |offer-A1| arrives at Bob 2457 BobJS->BobUA: create a PeerConnection 2458 BobJS->BobUA: setRemoteDescription with |offer-A1| 2459 BobUA->BobJS: onaddstream event with remoteStream 2461 // Bob accepts call 2462 BobJS->BobUA: addStream with local media 2463 BobJS->BobUA: createAnswer 2464 BobJS->BobUA: setLocalDescription with answer 2465 BobUA->BobJS: multiple onicecandidate events with candidates 2467 // wait for ICE gathering to complete 2468 BobUA->BobJS: onicecandidate event with null candidate 2469 BobJS->BobUA: get |answer-A1| from value of localDescription 2471 // |answer-A1| is sent over signaling protocol to Alice 2472 BobJS->WebServer: signaling with |answer-A1| 2473 WebServer->AliceJS: signaling with |answer-A1| 2475 // |answer-A1| arrives at Alice 2476 AliceJS->AliceUA: setRemoteDescription with |answer-A1| 2477 AliceUA->AliceJS: onaddstream event with remoteStream 2479 // media flows 2480 BobUA->AliceUA: media sent from Bob to Alice 2481 AliceUA->BobUA: media sent from Alice to Bob 2483 The SDP for |offer-A1| looks like: 2485 v=0 2486 o=- 
4962303333179871722 1 IN IP4 0.0.0.0 2487 s=- 2488 t=0 0 2489 a=msid-semantic:WMS 2490 a=group:BUNDLE a1 v1 2491 m=audio 56500 UDP/TLS/RTP/SAVPF 96 0 8 97 98 2492 c=IN IP4 192.0.2.1 2493 a=mid:a1 2494 a=rtcp:56501 IN IP4 192.0.2.1 2495 a=msid:47017fee-b6c1-4162-929c-a25110252400 2496 f83006c5-a0ff-4e0a-9ed9-d3e6747be7d9 2497 a=sendrecv 2498 a=rtpmap:96 opus/48000/2 2499 a=rtpmap:0 PCMU/8000 2500 a=rtpmap:8 PCMA/8000 2501 a=rtpmap:97 telephone-event/8000 2502 a=rtpmap:98 telephone-event/48000 2503 a=maxptime:120 2504 a=ice-ufrag:ETEn1v9DoTMB9J4r 2505 a=ice-pwd:OtSK0WpNtpUjkY4+86js7ZQl 2506 a=ice-options:trickle 2507 a=fingerprint:sha-256 2508 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 2509 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 2510 a=setup:actpass 2511 a=rtcp-mux 2512 a=rtcp-rsize 2513 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level 2514 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 2515 a=ssrc:1732846380 cname:EocUG1f0fcg/yvY7 2516 a=candidate:3348148302 1 udp 2113937151 192.0.2.1 56500 2517 typ host 2518 a=candidate:3348148302 2 udp 2113937151 192.0.2.1 56501 2519 typ host 2520 a=end-of-candidates 2522 m=video 56502 UDP/TLS/RTP/SAVPF 100 101 2523 c=IN IP4 192.0.2.1 2524 a=rtcp:56503 IN IP4 192.0.2.1 2525 a=mid:v1 2526 a=msid:61317484-2ed4-49d7-9eb7-1414322a7aae 2527 f30bdb4a-5db8-49b5-bcdc-e0c9a23172e0 2528 a=sendrecv 2529 a=rtpmap:100 VP8/90000 2530 a=rtpmap:101 rtx/90000 2531 a=fmtp:101 apt=100 2532 a=ice-ufrag:BGKkWnG5GmiUpdIV 2533 a=ice-pwd:mqyWsAjvtKwTGnvhPztQ9mIf 2534 a=ice-options:trickle 2535 a=fingerprint:sha-256 2536 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 2538 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 2539 a=setup:actpass 2540 a=rtcp-mux 2541 a=rtcp-rsize 2542 a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:mid 2543 a=rtcp-fb:100 ccm fir 2544 a=rtcp-fb:100 nack 2545 a=rtcp-fb:100 nack pli 2546 a=ssrc:1366781083 cname:EocUG1f0fcg/yvY7 2547 a=ssrc:1366781084 cname:EocUG1f0fcg/yvY7 2548 a=ssrc-group:FID 1366781083 
1366781084 2549 a=candidate:3348148302 1 udp 2113937151 192.0.2.1 56502 2550 typ host 2551 a=candidate:3348148302 2 udp 2113937151 192.0.2.1 56503 2552 typ host 2553 a=end-of-candidates 2555 The SDP for |answer-A1| looks like: 2557 v=0 2558 o=- 6729291447651054566 1 IN IP4 0.0.0.0 2559 s=- 2560 t=0 0 2561 a=msid-semantic:WMS 2562 m=audio 20000 UDP/TLS/RTP/SAVPF 96 0 8 97 98 2563 c=IN IP4 192.0.2.2 2564 a=mid:a1 2565 a=rtcp:20000 IN IP4 192.0.2.2 2566 a=msid:PI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1 2567 PI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1a0 2568 a=sendrecv 2569 a=rtpmap:96 opus/48000/2 2570 a=rtpmap:0 PCMU/8000 2571 a=rtpmap:8 PCMA/8000 2572 a=rtpmap:97 telephone-event/8000 2573 a=rtpmap:98 telephone-event/48000 2574 a=maxptime:120 2575 a=ice-ufrag:6sFvz2gdLkEwjZEr 2576 a=ice-pwd:cOTZKZNVlO9RSGsEGM63JXT2 2577 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35 2578 :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08 2579 a=setup:active 2580 a=rtcp-mux 2581 a=rtcp-rsize 2582 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level 2583 a=ssrc:3429951804 cname:Q/NWs1ao1HmN4Xa5 2584 a=candidate:2299743422 1 udp 2113937151 192.0.2.2 20000 2585 typ host 2587 a=end-of-candidates 2589 m=video 20001 UDP/TLS/RTP/SAVPF 100 101 2590 c=IN IP4 192.0.2.2 2591 a=rtcp:20001 IN IP4 192.0.2.2 2592 a=mid:v1 2593 a=msid:PI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1 2594 PI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1v0 2595 a=sendrecv 2596 a=rtpmap:100 VP8/90000 2597 a=rtpmap:101 rtx/90000 2598 a=fmtp:101 apt=100 2599 a=ice-ufrag:6sFvz2gdLkEwjZEr 2600 a=ice-pwd:cOTZKZNVlO9RSGsEGM63JXT2 2601 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35 2602 :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08 2603 a=setup:active 2604 a=rtcp-mux 2605 a=rtcp-rsize 2606 a=rtcp-fb:100 ccm fir 2607 a=rtcp-fb:100 nack 2608 a=rtcp-fb:100 nack pli 2609 a=ssrc:3229706345 cname:Q/NWs1ao1HmN4Xa5 2610 a=ssrc:3229706346 cname:Q/NWs1ao1HmN4Xa5 2611 a=ssrc-group:FID 3229706345 3229706346 2612
a=candidate:2299743422 1 udp 2113937151 192.0.2.2 20001 2613 typ host 2614 a=end-of-candidates 2616 7.2. Normal Examples 2618 This section shows a typical example of a session between two 2619 browsers setting up an audio channel and a data channel. Trickle ICE 2620 is used in full trickle mode with a bundle policy of max-bundle, an 2621 RTCP mux policy of require, and a single TURN server. Later, two 2622 video flows, one for the presenter and one for screen sharing, are 2623 added to the session. This example shows Alice's browser initiating 2624 the session to Bob's browser. The messages from Alice's JS to Bob's 2625 JS are assumed to flow over some signaling protocol via a web server. 2627 // set up local media state 2628 AliceJS->AliceUA: create new PeerConnection 2629 AliceJS->AliceUA: addStream that contains audio track 2630 AliceJS->AliceUA: createDataChannel to get data channel 2631 AliceJS->AliceUA: createOffer to get |offer-B1| 2632 AliceJS->AliceUA: setLocalDescription with |offer-B1| 2634 // |offer-B1| is sent over signaling protocol to Bob 2635 AliceJS->WebServer: signaling with |offer-B1| 2636 WebServer->BobJS: signaling with |offer-B1| 2638 // |offer-B1| arrives at Bob 2639 BobJS->BobUA: create a PeerConnection 2640 BobJS->BobUA: setRemoteDescription with |offer-B1| 2641 BobUA->BobJS: onaddstream with audio track from Alice 2643 // candidates are sent to Bob 2644 AliceUA->AliceJS: onicecandidate event with |candidate-B1| (host) 2645 AliceJS->WebServer: signaling with |candidate-B1| 2646 AliceUA->AliceJS: onicecandidate event with |candidate-B2| (srflx) 2647 AliceJS->WebServer: signaling with |candidate-B2| 2648 AliceUA->AliceJS: onicecandidate event with |candidate-B3| (relay) 2649 AliceJS->WebServer: signaling with |candidate-B3| 2651 WebServer->BobJS: signaling with |candidate-B1| 2652 BobJS->BobUA: addIceCandidate with |candidate-B1| 2653 WebServer->BobJS: signaling with |candidate-B2| 2654 BobJS->BobUA: addIceCandidate with |candidate-B2| 2655 
WebServer->BobJS: signaling with |candidate-B3| 2656 BobJS->BobUA: addIceCandidate with |candidate-B3| 2658 // Bob accepts call 2659 BobJS->BobUA: addStream with local audio stream 2660 BobJS->BobUA: createDataChannel to get data channel 2661 BobJS->BobUA: createAnswer to get |answer-B1| 2662 BobJS->BobUA: setLocalDescription with |answer-B1| 2664 // |answer-B1| is sent to Alice 2665 BobJS->WebServer: signaling with |answer-B1| 2666 WebServer->AliceJS: signaling with |answer-B1| 2667 AliceJS->AliceUA: setRemoteDescription with |answer-B1| 2668 AliceUA->AliceJS: onaddstream event with audio track from Bob 2670 // candidates are sent to Alice 2671 BobUA->BobJS: onicecandidate event with |candidate-B4| (host) 2672 BobJS->WebServer: signaling with |candidate-B4| 2673 BobUA->BobJS: onicecandidate event with |candidate-B5| (srflx) 2674 BobJS->WebServer: signaling with |candidate-B5| 2675 BobUA->BobJS: onicecandidate event with |candidate-B6| (relay) 2676 BobJS->WebServer: signaling with |candidate-B6| 2678 WebServer->AliceJS: signaling with |candidate-B4| 2679 AliceJS->AliceUA: addIceCandidate with |candidate-B4| 2680 WebServer->AliceJS: signaling with |candidate-B5| 2681 AliceJS->AliceUA: addIceCandidate with |candidate-B5| 2682 WebServer->AliceJS: signaling with |candidate-B6| 2683 AliceJS->AliceUA: addIceCandidate with |candidate-B6| 2685 // data channel opens 2686 BobUA->BobJS: ondatachannel event 2687 AliceUA->AliceJS: ondatachannel event 2688 BobUA->BobJS: onopen 2689 AliceUA->AliceJS: onopen 2691 // media is flowing between browsers 2692 BobUA->AliceUA: audio+data sent from Bob to Alice 2693 AliceUA->BobUA: audio+data sent from Alice to Bob 2695 // some time later Bob adds two video streams 2696 // note, no candidates exchanged, because of BUNDLE 2697 BobJS->BobUA: addStream with first video stream 2698 BobJS->BobUA: addStream with second video stream 2699 BobJS->BobUA: createOffer to get |offer-B2| 2700 BobJS->BobUA: setLocalDescription with |offer-B2| 2702 // 
|offer-B2| is sent to Alice 2703 BobJS->WebServer: signaling with |offer-B2| 2704 WebServer->AliceJS: signaling with |offer-B2| 2705 AliceJS->AliceUA: setRemoteDescription with |offer-B2| 2706 AliceUA->AliceJS: onaddstream event with first video stream 2707 AliceUA->AliceJS: onaddstream event with second video stream 2708 AliceJS->AliceUA: createAnswer to get |answer-B2| 2709 AliceJS->AliceUA: setLocalDescription with |answer-B2| 2711 // |answer-B2| is sent over signaling protocol to Bob 2712 AliceJS->WebServer: signaling with |answer-B2| 2713 WebServer->BobJS: signaling with |answer-B2| 2714 BobJS->BobUA: setRemoteDescription with |answer-B2| 2716 // media is flowing between browsers 2717 BobUA->AliceUA: audio+video+data sent from Bob to Alice 2718 AliceUA->BobUA: audio+video+data sent from Alice to Bob 2720 The SDP for |offer-B1| looks like: 2722 v=0 2723 o=- 4962303333179871723 1 IN IP4 0.0.0.0 2724 s=- 2725 t=0 0 2726 a=msid-semantic:WMS 2727 a=group:BUNDLE a1 d1 2728 m=audio 9 UDP/TLS/RTP/SAVPF 96 0 8 97 98 2729 c=IN IP6 :: 2730 a=rtcp:9 IN IP6 :: 2731 a=mid:a1 2732 a=msid:57017fee-b6c1-4162-929c-a25110252400 2733 e83006c5-a0ff-4e0a-9ed9-d3e6747be7d9 2734 a=sendrecv 2735 a=rtpmap:96 opus/48000/2 2736 a=rtpmap:0 PCMU/8000 2737 a=rtpmap:8 PCMA/8000 2738 a=rtpmap:97 telephone-event/8000 2739 a=rtpmap:98 telephone-event/48000 2740 a=maxptime:120 2741 a=ice-ufrag:ATEn1v9DoTMB9J4r 2742 a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl 2743 a=ice-options:trickle 2744 a=fingerprint:sha-256 2745 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 2746 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 2747 a=setup:actpass 2748 a=rtcp-mux 2749 a=rtcp-rsize 2750 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level 2751 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 2752 a=ssrc:1732846380 cname:FocUG1f0fcg/yvY7 2754 m=application 9 UDP/DTLS/SCTP webrtc-datachannel 2755 c=IN IP6 :: 2756 a=mid:d1 2757 a=fmtp:webrtc-datachannel max-message-size=65536 2758 a=sctp-port 5000 2759 
a=ice-ufrag:ATEn1v9DoTMB9J4r 2760 a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl 2761 a=ice-options:trickle 2762 a=fingerprint:sha-256 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 2763 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 2764 a=setup:actpass 2766 The SDP for |candidate-B1| looks like: 2768 candidate:109270923 1 udp 2122194687 192.168.1.2 51556 typ host 2769 The SDP for |candidate-B2| looks like: 2771 candidate:4036177503 1 udp 1685987071 11.22.33.44 52546 typ srflx 2772 raddr 192.168.1.2 rport 51556 2774 The SDP for |candidate-B3| looks like: 2776 candidate:3671762466 1 udp 41819903 22.33.44.55 61405 typ relay 2777 raddr 11.22.33.44 rport 52546 2779 The SDP for |answer-B1| looks like: 2781 v=0 2782 o=- 7729291447651054566 1 IN IP4 0.0.0.0 2783 s=- 2784 t=0 0 2785 a=msid-semantic:WMS 2786 a=group:BUNDLE a1 d1 2787 m=audio 9 UDP/TLS/RTP/SAVPF 96 0 8 97 98 2788 c=IN IP6 :: 2789 a=rtcp:9 IN IP6 :: 2790 a=mid:a1 2791 a=msid:QI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1 2792 QI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1a0 2793 a=sendrecv 2794 a=rtpmap:96 opus/48000/2 2795 a=rtpmap:0 PCMU/8000 2796 a=rtpmap:8 PCMA/8000 2797 a=rtpmap:97 telephone-event/8000 2798 a=rtpmap:98 telephone-event/48000 2799 a=maxptime:120 2800 a=ice-ufrag:7sFvz2gdLkEwjZEr 2801 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2 2802 a=ice-options:trickle 2803 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35 2804 :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08 2805 a=setup:active 2806 a=rtcp-mux 2807 a=rtcp-rsize 2808 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level 2809 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 2810 a=ssrc:4429951804 cname:Q/NWs1ao1HmN4Xa5 2812 m=application 9 UDP/DTLS/SCTP webrtc-datachannel 2813 c=IN IP6 :: 2814 a=mid:d1 2815 a=fmtp:webrtc-datachannel max-message-size=65536 2816 a=sctp-port 5000 2817 a=ice-ufrag:7sFvz2gdLkEwjZEr 2818 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2 2819 a=ice-options:trickle 2820 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35 2821 
:DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08 2822 a=setup:active 2824 The SDP for |candidate-B4| looks like: 2826 candidate:109270924 1 udp 2122194687 192.168.2.3 61665 typ host 2828 The SDP for |candidate-B5| looks like: 2830 candidate:4036177504 1 udp 1685987071 55.66.77.88 64532 typ srflx 2831 raddr 192.168.2.3 rport 61665 2833 The SDP for |candidate-B6| looks like: 2835 candidate:3671762467 1 udp 41819903 66.77.88.99 50416 typ relay 2836 raddr 55.66.77.88 rport 64532 2838 The SDP for |offer-B2| looks like: (note the increment of the version 2839 number in the o= line, and the c= and a=rtcp lines, which indicate 2840 the local candidate that was selected) 2842 v=0 2843 o=- 7729291447651054566 2 IN IP4 0.0.0.0 2844 s=- 2845 t=0 0 2846 a=msid-semantic:WMS 2847 a=group:BUNDLE a1 d1 v1 v2 2848 m=audio 64532 UDP/TLS/RTP/SAVPF 96 0 8 97 98 2849 c=IN IP4 55.66.77.88 2850 a=rtcp:64532 IN IP4 55.66.77.88 2851 a=mid:a1 2852 a=msid:QI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1 2853 QI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1a0 2854 a=sendrecv 2855 a=rtpmap:96 opus/48000/2 2856 a=rtpmap:0 PCMU/8000 2857 a=rtpmap:8 PCMA/8000 2858 a=rtpmap:97 telephone-event/8000 2859 a=rtpmap:98 telephone-event/48000 2860 a=maxptime:120 2861 a=ice-ufrag:7sFvz2gdLkEwjZEr 2862 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2 2863 a=ice-options:trickle 2864 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35 2865 :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08 2866 a=setup:actpass 2867 a=rtcp-mux 2868 a=rtcp-rsize 2869 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level 2870 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 2871 a=ssrc:4429951804 cname:Q/NWs1ao1HmN4Xa5 2872 a=candidate:109270924 1 udp 2122194687 192.168.2.3 61665 typ host 2873 a=candidate:4036177504 1 udp 1685987071 55.66.77.88 64532 typ srflx 2874 raddr 192.168.2.3 rport 61665 2875 a=candidate:3671762467 1 udp 41819903 66.77.88.99 50416 typ relay 2876 raddr 55.66.77.88 rport 64532 2877 a=end-of-candidates 2878 m=application 64532 
UDP/DTLS/SCTP webrtc-datachannel 2879 c=IN IP4 55.66.77.88 2880 a=mid:d1 2881 a=fmtp:webrtc-datachannel max-message-size=65536 2882 a=sctp-port 5000 2883 a=ice-ufrag:7sFvz2gdLkEwjZEr 2884 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2 2885 a=ice-options:trickle 2886 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35 2887 :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08 2888 a=setup:actpass 2889 a=candidate:109270924 1 udp 2122194687 192.168.2.3 61665 typ host 2890 a=candidate:4036177504 1 udp 1685987071 55.66.77.88 64532 typ srflx 2891 raddr 192.168.2.3 rport 61665 2892 a=candidate:3671762467 1 udp 41819903 66.77.88.99 50416 typ relay 2893 raddr 55.66.77.88 rport 64532 2894 a=end-of-candidates 2896 m=video 64532 UDP/TLS/RTP/SAVPF 100 101 2897 c=IN IP4 55.66.77.88 2898 a=rtcp:64532 IN IP4 55.66.77.88 2899 a=mid:v1 2900 a=msid:61317484-2ed4-49d7-9eb7-1414322a7aae 2901 f30bdb4a-5db8-49b5-bcdc-e0c9a23172e0 2902 a=sendrecv 2903 a=rtpmap:100 VP8/90000 2904 a=rtpmap:101 rtx/90000 2905 a=fmtp:101 apt=100 2906 a=ice-ufrag:7sFvz2gdLkEwjZEr 2907 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2 2908 a=ice-options:trickle 2909 a=fingerprint:sha-256 2910 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 2911 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 2912 a=setup:actpass 2913 a=rtcp-mux 2914 a=rtcp-rsize 2915 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 2916 a=rtcp-fb:100 ccm fir 2917 a=rtcp-fb:100 nack 2918 a=rtcp-fb:100 nack pli 2919 a=ssrc:1366781083 cname:Q/NWs1ao1HmN4Xa5 2920 a=ssrc:1366781084 cname:Q/NWs1ao1HmN4Xa5 2921 a=ssrc-group:FID 1366781083 1366781084 2922 a=candidate:109270924 1 udp 2122194687 192.168.2.3 61665 typ host 2923 a=candidate:4036177504 1 udp 1685987071 55.66.77.88 64532 typ srflx 2924 raddr 192.168.2.3 rport 61665 2925 a=candidate:3671762467 1 udp 41819903 66.77.88.99 50416 typ relay 2926 raddr 55.66.77.88 rport 64532 2927 a=end-of-candidates 2929 m=video 64532 UDP/TLS/RTP/SAVPF 100 101 2930 c=IN IP4 55.66.77.88 2931 a=rtcp:64532 IN IP4 55.66.77.88 2932 
a=mid:v2 2933 a=msid:71317484-2ed4-49d7-9eb7-1414322a7aae 2934 f30bdb4a-5db8-49b5-bcdc-e0c9a23172e0 2935 a=sendrecv 2936 a=rtpmap:100 VP8/90000 2937 a=rtpmap:101 rtx/90000 2938 a=fmtp:101 apt=100 2939 a=ice-ufrag:7sFvz2gdLkEwjZEr 2940 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2 2941 a=ice-options:trickle 2942 a=fingerprint:sha-256 2943 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 2944 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 2945 a=setup:actpass 2946 a=rtcp-mux 2947 a=rtcp-rsize 2948 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 2949 a=rtcp-fb:100 ccm fir 2950 a=rtcp-fb:100 nack 2951 a=rtcp-fb:100 nack pli 2952 a=ssrc:2366781083 cname:Q/NWs1ao1HmN4Xa5 2953 a=ssrc:2366781084 cname:Q/NWs1ao1HmN4Xa5 2954 a=ssrc-group:FID 2366781083 2366781084 2955 a=candidate:109270924 1 udp 2122194687 192.168.2.3 61665 typ host 2956 a=candidate:4036177504 1 udp 1685987071 55.66.77.88 64532 typ srflx 2957 raddr 192.168.2.3 rport 61665 2958 a=candidate:3671762467 1 udp 41819903 66.77.88.99 50416 typ relay 2959 raddr 55.66.77.88 rport 64532 2960 a=end-of-candidates 2962 The SDP for |answer-B2| looks like: (note the use of setup:passive to 2963 maintain the existing DTLS roles, and the use of a=recvonly to 2964 indicate that the video streams are one-way) 2966 v=0 2967 o=- 4962303333179871723 2 IN IP4 0.0.0.0 2968 s=- 2969 t=0 0 2970 a=msid-semantic:WMS 2971 a=group:BUNDLE a1 d1 v1 v2 2972 m=audio 52546 UDP/TLS/RTP/SAVPF 96 0 8 97 98 2973 c=IN IP4 11.22.33.44 2974 a=rtcp:52546 IN IP4 11.22.33.44 2975 a=mid:a1 2976 a=msid:57017fee-b6c1-4162-929c-a25110252400 2977 e83006c5-a0ff-4e0a-9ed9-d3e6747be7d9 2978 a=sendrecv 2979 a=rtpmap:96 opus/48000/2 2980 a=rtpmap:0 PCMU/8000 2981 a=rtpmap:8 PCMA/8000 2982 a=rtpmap:97 telephone-event/8000 2983 a=rtpmap:98 telephone-event/48000 2984 a=maxptime:120 2985 a=ice-ufrag:ATEn1v9DoTMB9J4r 2986 a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl 2987 a=ice-options:trickle 2988 a=fingerprint:sha-256 2989 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 2990
:BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 2991 a=setup:passive 2992 a=rtcp-mux 2993 a=rtcp-rsize 2994 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level 2995 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 2996 a=ssrc:1732846380 cname:FocUG1f0fcg/yvY7 2997 a=candidate:109270923 1 udp 2122194687 192.168.1.2 51556 typ host 2998 a=candidate:4036177503 1 udp 1685987071 11.22.33.44 52546 typ srflx 2999 raddr 192.168.1.2 rport 51556 3000 a=candidate:3671762466 1 udp 41819903 22.33.44.55 61405 typ relay 3001 raddr 11.22.33.44 rport 52546 3002 a=end-of-candidates 3004 m=application 52546 UDP/DTLS/SCTP webrtc-datachannel 3005 c=IN IP4 11.22.33.44 3006 a=mid:d1 3007 a=fmtp:webrtc-datachannel max-message-size=65536 3008 a=sctp-port 5000 3009 a=ice-ufrag:ATEn1v9DoTMB9J4r 3010 a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl 3011 a=ice-options:trickle 3012 a=fingerprint:sha-256 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 3013 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3014 a=setup:passive 3015 a=candidate:109270923 1 udp 2122194687 192.168.1.2 51556 typ host 3016 a=candidate:4036177503 1 udp 1685987071 11.22.33.44 52546 typ srflx 3017 raddr 192.168.1.2 rport 51556 3018 a=candidate:3671762466 1 udp 41819903 22.33.44.55 61405 typ relay 3019 raddr 11.22.33.44 rport 52546 3020 a=end-of-candidates 3021 m=video 52546 UDP/TLS/RTP/SAVPF 100 101 3022 c=IN IP4 11.22.33.44 3023 a=rtcp:52546 IN IP4 11.22.33.44 3024 a=mid:v1 3025 a=recvonly 3026 a=rtpmap:100 VP8/90000 3027 a=rtpmap:101 rtx/90000 3028 a=fmtp:101 apt=100 3029 a=ice-ufrag:ATEn1v9DoTMB9J4r 3030 a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl 3031 a=ice-options:trickle 3032 a=fingerprint:sha-256 3033 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 3034 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3035 a=setup:passive 3036 a=rtcp-mux 3037 a=rtcp-rsize 3038 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 3039 a=rtcp-fb:100 ccm fir 3040 a=rtcp-fb:100 nack 3041 a=rtcp-fb:100 nack pli 3042 a=candidate:109270923 1 udp 2122194687 
192.168.1.2 51556 typ host 3043 a=candidate:4036177503 1 udp 1685987071 11.22.33.44 52546 typ srflx 3044 raddr 192.168.1.2 rport 51556 3045 a=candidate:3671762466 1 udp 41819903 22.33.44.55 61405 typ relay 3046 raddr 11.22.33.44 rport 52546 3047 a=end-of-candidates 3049 m=video 52546 UDP/TLS/RTP/SAVPF 100 101 3050 c=IN IP4 11.22.33.44 3051 a=rtcp:52546 IN IP4 11.22.33.44 3052 a=mid:v2 3053 a=recvonly 3054 a=rtpmap:100 VP8/90000 3055 a=rtpmap:101 rtx/90000 3056 a=fmtp:101 apt=100 3057 a=ice-ufrag:ATEn1v9DoTMB9J4r 3058 a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl 3059 a=ice-options:trickle 3060 a=fingerprint:sha-256 3061 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04 3062 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 3063 a=setup:passive 3064 a=rtcp-mux 3065 a=rtcp-rsize 3066 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid 3067 a=rtcp-fb:100 ccm fir 3068 a=rtcp-fb:100 nack 3069 a=rtcp-fb:100 nack pli 3070 a=candidate:109270923 1 udp 2122194687 192.168.1.2 51556 typ host 3071 a=candidate:4036177503 1 udp 1685987071 11.22.33.44 52546 typ srflx 3072 raddr 192.168.1.2 rport 51556 3073 a=candidate:3671762466 1 udp 41819903 22.33.44.55 61405 typ relay 3074 raddr 11.22.33.44 rport 52546 3075 a=end-of-candidates 3077 8. Security Considerations 3079 The IETF has published separate documents 3080 [I-D.ietf-rtcweb-security-arch] [I-D.ietf-rtcweb-security] describing 3081 the security architecture for WebRTC as a whole. The remainder of 3082 this section describes security considerations for this document. 3084 While formally the JSEP interface is an API, it is better to think of 3085 it as an Internet protocol, with the JS being untrustworthy from the 3086 perspective of the browser. Thus, the threat model of [RFC3552] 3087 applies. In particular, JS can call the API in any order and with 3088 any inputs, including malicious ones. This is particularly relevant 3089 when we consider the SDP which is passed to setLocalDescription().
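As an illustration of treating SDP handed over by the JS as untrusted input, the following is a minimal sketch only. The helper name sdp_problems and the particular checks are assumptions made for this example; they are not defined by this document or by any browser, and a real implementation would validate far more than the session-level skeleton.

```python
import re

# Hypothetical helper for illustration; not part of JSEP or any browser API.
# Every SDP line has the shape <single lowercase letter>=<non-empty value>.
_SDP_LINE = re.compile(r"^[a-z]=.+$")


def sdp_problems(sdp: str) -> list[str]:
    """Return a list of structural problems found in an untrusted SDP blob."""
    problems = []
    lines = [ln for ln in sdp.replace("\r\n", "\n").split("\n") if ln]

    # The session description must begin with the version line.
    if not lines or lines[0] != "v=0":
        problems.append("first line must be v=0")

    # Flag any line that is not of the form <type>=<value>.
    for number, line in enumerate(lines, start=1):
        if not _SDP_LINE.match(line):
            problems.append(f"line {number} is not <type>=<value>")

    # Collect the types of well-formed lines that precede the first m= line.
    session_types = []
    for line in lines:
        if line.startswith("m="):
            break
        if _SDP_LINE.match(line):
            session_types.append(line[0])

    # o=, s= and t= are required at the session level.
    for required in "ost":
        if required not in session_types:
            problems.append(f"missing session-level {required}= line")

    return problems
```

An empty result means only that the skeleton is plausible; it says nothing about whether the description was derived from createOffer() or createAnswer().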
3090 While correct API usage requires that the application pass in SDP 3091 which was derived from createOffer() or createAnswer() (perhaps 3092 suitably modified as described in Section 6), there is no guarantee 3093 that applications do so. The browser MUST be prepared for the JS to 3094 pass in bogus data instead. 3096 Conversely, the application programmer MUST recognize that the JS 3097 does not have complete control of browser behavior. One case that 3098 bears particular mention is that editing ICE candidates out of the 3099 SDP or suppressing trickled candidates does not have the expected 3100 behavior: implementations will still perform checks from those 3101 candidates even if they are not sent to the other side. Thus, for 3102 instance, it is not possible to prevent the remote peer from learning 3103 your public IP address by removing server reflexive candidates. 3104 Applications which wish to conceal their public IP address should 3105 instead configure the ICE agent to use only relay candidates. 3107 9. IANA Considerations 3109 This document requires no actions from IANA. 3111 10. Acknowledgements 3113 Harald Alvestrand and Suhas Nandakumar contributed significant text 3114 to the draft, as well as review. Dan Burnett, 3115 Neil Stratford, Eric Rescorla, Anant Narayanan, Andrew Hutton, 3116 Richard Ejzak, Adam Bergkvist and Matthew Kaufman all provided 3117 valuable feedback on this proposal. 3119 11. References 3121 11.1. Normative References 3123 [I-D.ietf-mmusic-msid] 3124 Alvestrand, H., "Cross Session Stream Identification in 3125 the Session Description Protocol", draft-ietf-mmusic- 3126 msid-01 (work in progress), August 2013. 3128 [I-D.ietf-mmusic-sctp-sdp] 3129 Loreto, S. and G. Camarillo, "Stream Control Transmission 3130 Protocol (SCTP)-Based Media Transport in the Session 3131 Description Protocol (SDP)", draft-ietf-mmusic-sctp-sdp-04 3132 (work in progress), June 2013.
3134 [I-D.ietf-mmusic-sdp-bundle-negotiation] 3135 Holmberg, C., Alvestrand, H., and C. Jennings, 3136 "Multiplexing Negotiation Using Session Description 3137 Protocol (SDP) Port Numbers", draft-ietf-mmusic-sdp- 3138 bundle-negotiation-04 (work in progress), June 2013. 3140 [I-D.ietf-mmusic-sdp-mux-attributes] 3141 Nandakumar, S., "A Framework for SDP Attributes when 3142 Multiplexing", draft-ietf-mmusic-sdp-mux-attributes-01 3143 (work in progress), February 2014. 3145 [I-D.ietf-mmusic-trickle-ice] 3146 Ivov, E., Rescorla, E., and J. Uberti, "Trickle ICE: 3147 Incremental Provisioning of Candidates for the Interactive 3148 Connectivity Establishment (ICE) Protocol", draft-ietf- 3149 mmusic-trickle-ice-00 (work in progress), March 2013. 3151 [I-D.ietf-rtcweb-audio] 3152 Valin, J. and C. Bran, "WebRTC Audio Codec and Processing 3153 Requirements", draft-ietf-rtcweb-audio-02 (work in 3154 progress), August 2013. 3156 [I-D.ietf-rtcweb-data-protocol] 3157 Jesup, R., Loreto, S., and M. Tuexen, "WebRTC Data Channel 3158 Protocol", draft-ietf-rtcweb-data-protocol-04 (work in 3159 progress), February 2013. 3161 [I-D.ietf-rtcweb-fec] 3162 Uberti, J., "WebRTC Forward Error Correction 3163 Requirements", draft-ietf-rtcweb-fec-00 (work in 3164 progress), February 2015. 3166 [I-D.ietf-rtcweb-rtp-usage] 3167 Perkins, C., Westerlund, M., and J. Ott, "Web Real-Time 3168 Communication (WebRTC): Media Transport and Use of RTP", 3169 draft-ietf-rtcweb-rtp-usage-09 (work in progress), 3170 September 2013. 3172 [I-D.ietf-rtcweb-security] 3173 Rescorla, E., "Security Considerations for WebRTC", draft- 3174 ietf-rtcweb-security-06 (work in progress), January 2014. 3176 [I-D.ietf-rtcweb-security-arch] 3177 Rescorla, E., "WebRTC Security Architecture", draft-ietf- 3178 rtcweb-security-arch-09 (work in progress), February 2014. 3180 [I-D.ietf-rtcweb-video] 3181 Roach, A., "WebRTC Video Processing and Codec 3182 Requirements", draft-ietf-rtcweb-video-00 (work in 3183 progress), July 2014. 
3185 [I-D.nandakumar-mmusic-proto-iana-registration] 3186 Nandakumar, S., "IANA registration of SDP 'proto' 3187 attribute for transporting RTP Media over TCP under 3188 various RTP profiles.", September 2014. 3190 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 3191 Requirement Levels", BCP 14, RFC 2119, March 1997. 3193 [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, 3194 A., Peterson, J., Sparks, R., Handley, M., and E. 3195 Schooler, "SIP: Session Initiation Protocol", RFC 3261, 3196 June 2002. 3198 [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model 3199 with Session Description Protocol (SDP)", RFC 3264, June 3200 2002. 3202 [RFC3552] Rescorla, E. and B. Korver, "Guidelines for Writing RFC 3203 Text on Security Considerations", BCP 72, RFC 3552, July 3204 2003. 3206 [RFC3605] Huitema, C., "Real Time Control Protocol (RTCP) attribute 3207 in Session Description Protocol (SDP)", RFC 3605, October 3208 2003. 3210 [RFC4145] Yon, D. and G. Camarillo, "TCP-Based Media Transport in 3211 the Session Description Protocol (SDP)", RFC 4145, 3212 September 2005. 3214 [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session 3215 Description Protocol", RFC 4566, July 2006. 3217 [RFC4572] Lennox, J., "Connection-Oriented Media Transport over the 3218 Transport Layer Security (TLS) Protocol in the Session 3219 Description Protocol (SDP)", RFC 4572, July 2006. 3221 [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey, 3222 "Extended RTP Profile for Real-time Transport Control 3223 Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July 3224 2006. 3226 [RFC5124] Ott, J. and E. Carrara, "Extended Secure RTP Profile for 3227 Real-time Transport Control Protocol (RTCP)-Based Feedback 3228 (RTP/SAVPF)", RFC 5124, February 2008. 
3230 [RFC5245] Rosenberg, J., "Interactive Connectivity Establishment 3231 (ICE): A Protocol for Network Address Translator (NAT) 3232 Traversal for Offer/Answer Protocols", RFC 5245, April 3233 2010. 3235 [RFC5285] Singer, D. and H. Desineni, "A General Mechanism for RTP 3236 Header Extensions", RFC 5285, July 2008. 3238 [RFC5761] Perkins, C. and M. Westerlund, "Multiplexing RTP Data and 3239 Control Packets on a Single Port", RFC 5761, April 2010. 3241 [RFC5888] Camarillo, G. and H. Schulzrinne, "The Session Description 3242 Protocol (SDP) Grouping Framework", RFC 5888, June 2010. 3244 [RFC6236] Johansson, I. and K. Jung, "Negotiation of Generic Image 3245 Attributes in the Session Description Protocol (SDP)", RFC 3246 6236, May 2011. 3248 [RFC6347] Rescorla, E. and N. Modadugu, "Datagram Transport Layer 3249 Security Version 1.2", RFC 6347, January 2012. 3251 [RFC6904] Lennox, J., "Encryption of Header Extensions in the Secure 3252 Real-time Transport Protocol (SRTP)", RFC 6904, April 3253 2013. 3255 [RFC7022] Begen, A., Perkins, C., Wing, D., and E. Rescorla, 3256 "Guidelines for Choosing RTP Control Protocol (RTCP) 3257 Canonical Names (CNAMEs)", RFC 7022, September 2013. 3259 11.2. Informative References 3261 [I-D.nandakumar-rtcweb-sdp] 3262 Nandakumar, S. and C. Jennings, "SDP for the WebRTC", 3263 draft-nandakumar-rtcweb-sdp-02 (work in progress), July 3264 2013. 3266 [RFC3389] Zopf, R., "Real-time Transport Protocol (RTP) Payload for 3267 Comfort Noise (CN)", RFC 3389, September 2002. 3269 [RFC3556] Casner, S., "Session Description Protocol (SDP) Bandwidth 3270 Modifiers for RTP Control Protocol (RTCP) Bandwidth", RFC 3271 3556, July 2003. 3273 [RFC3960] Camarillo, G. and H. Schulzrinne, "Early Media and Ringing 3274 Tone Generation in the Session Initiation Protocol (SIP)", 3275 RFC 3960, December 2004. 3277 [RFC4568] Andreasen, F., Baugher, M., and D. 
Wing, "Session 3278 Description Protocol (SDP) Security Descriptions for Media 3279 Streams", RFC 4568, July 2006. 3281 [RFC4588] Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R. 3282 Hakenberg, "RTP Retransmission Payload Format", RFC 4588, 3283 July 2006. 3285 [RFC5506] Johansson, I. and M. Westerlund, "Support for Reduced-Size 3286 Real-Time Transport Control Protocol (RTCP): Opportunities 3287 and Consequences", RFC 5506, April 2009. 3289 [RFC5576] Lennox, J., Ott, J., and T. Schierl, "Source-Specific 3290 Media Attributes in the Session Description Protocol 3291 (SDP)", RFC 5576, June 2009. 3293 [RFC5763] Fischl, J., Tschofenig, H., and E. Rescorla, "Framework 3294 for Establishing a Secure Real-time Transport Protocol 3295 (SRTP) Security Context Using Datagram Transport Layer 3296 Security (DTLS)", RFC 5763, May 2010. 3298 [RFC5764] McGrew, D. and E. Rescorla, "Datagram Transport Layer 3299 Security (DTLS) Extension to Establish Keys for the Secure 3300 Real-time Transport Protocol (SRTP)", RFC 5764, May 2010. 3302 [RFC5956] Begen, A., "Forward Error Correction Grouping Semantics in 3303 the Session Description Protocol", RFC 5956, September 3304 2010. 3306 [W3C.WD-webrtc-20140617] 3307 Bergkvist, A., Burnett, D., Narayanan, A., and C. 3308 Jennings, "WebRTC 1.0: Real-time Communication Between 3309 Browsers", World Wide Web Consortium WD WD-webrtc- 3310 20140617, June 2014, 3311 . 3313 Appendix A. Change log 3315 Note: This section will be removed by RFC Editor before publication. 3317 Changes in draft-09: 3319 o Don't return null for {local,remote}Description after close(). 3321 o Changed TCP/TLS to UDP/DTLS in RTP profile names. 3323 o Separate out bundle and mux policy. 3325 o Added specific references to FEC mechanisms. 3327 o Added canTrickle mechanism. 3329 o Added section on subsequent answers and, answer options. 3331 o Added text defining set{Local,Remote}Description behavior. 
3333 Changes in draft-08: 3335 o Added new example section and removed old examples in appendix. 3337 o Fixed field handling. 3339 o Added text describing a=rtcp attribute. 3341 o Reworked handling of OfferToReceiveAudio and OfferToReceiveVideo 3342 per discussion at IETF 90. 3344 o Reworked trickle ICE handling and its impact on m= and c= lines 3345 per discussion at interim. 3347 o Added max-bundle-and-rtcp-mux policy. 3349 o Added description of maxptime handling. 3351 o Updated ICE candidate pool default to 0. 3353 o Resolved open issues around AppID/receiver-ID. 3355 o Reworked and expanded how changes to the ICE configuration are 3356 handled. 3358 o Some reference updates. 3360 o Editorial clarification. 3362 Changes in draft-07: 3364 o Expanded discussion of VAD and Opus DTX. 3366 o Added a security considerations section. 3368 o Rewrote the section on modifying SDP to require implementations to 3369 clearly indicate whether any given modification is allowed. 3371 o Clarified impact of IceRestart on CreateOffer in local-offer 3372 state. 3374 o Guidance on whether attributes should be defined at the media 3375 level or the session level. 3377 o Renamed "default" bundle policy to "balanced". 3379 o Removed default ICE candidate pool size and clarify how it works. 3381 o Defined a canonical order for assignment of MSTs to m= lines. 3383 o Removed discussion of rehydration. 3385 o Added Eric Rescorla as a draft editor. 3387 o Cleaned up references. 3389 o Editorial cleanup 3391 Changes in draft-06: 3393 o Reworked handling of m= line recycling. 3395 o Added handling of BUNDLE and bundle-only. 3397 o Clarified handling of rollback. 3399 o Added text describing the ICE Candidate Pool and its behavior. 3401 o Allowed OfferToReceiveX to create multiple recvonly m= sections. 3403 Changes in draft-05: 3405 o Fixed several issues identified in the createOffer/Answer sections 3406 during document review. 3408 o Updated references. 
3410 Changes in draft-04: 3412 o Filled in sections on createOffer and createAnswer. 3414 o Added SDP examples. 3416 o Fixed references. 3418 Changes in draft-03: 3420 o Added text describing relationship to W3C specification 3422 Changes in draft-02: 3424 o Converted from nroff 3426 o Removed comparisons to old approaches abandoned by the working 3427 group 3429 o Removed stuff that has moved to W3C specification 3431 o Align SDP handling with W3C draft 3433 o Clarified section on forking. 3435 Changes in draft-01: 3437 o Added diagrams for architecture and state machine. 3439 o Added sections on forking and rehydration. 3441 o Clarified meaning of "pranswer" and "answer". 3443 o Reworked how ICE restarts and media directions are controlled. 3445 o Added list of parameters that can be changed in a description. 3447 o Updated suggested API and examples to match latest thinking. 3449 o Suggested API and examples have been moved to an appendix. 3451 Changes in draft -00: 3453 o Migrated from draft-uberti-rtcweb-jsep-02. 3455 Authors' Addresses 3457 Justin Uberti 3458 Google 3459 747 6th Ave S 3460 Kirkland, WA 98033 3461 USA 3463 Email: justin@uberti.name 3465 Cullen Jennings 3466 Cisco 3467 170 West Tasman Drive 3468 San Jose, CA 95134 3469 USA 3471 Email: fluffy@iii.ca 3473 Eric Rescorla (editor) 3474 Mozilla 3475 331 Evelyn Ave 3476 Mountain View, CA 94041 3477 USA 3479 Email: ekr@rtfm.com