idnits 2.17.1 draft-ietf-rtcweb-jsep-10.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- == There are 44 instances of lines with non-RFC6890-compliant IPv4 addresses in the document. If these are example addresses, they should be changed. == There are 20 instances of lines with private range IPv4 addresses in the document. If these are generic example addresses, they should be changed to use any of the ranges defined in RFC 6890 (or successor): 192.0.2.x, 198.51.100.x or 203.0.113.x. -- The document has examples using IPv4 documentation addresses according to RFC6890, but does not use any IPv6 documentation addresses. Maybe there should be IPv6 examples, too? Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (June 15, 2015) is 3231 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Looks like a reference, but probably isn't: '0' on line 675 == Missing Reference: 'RFC1918' is mentioned on line 786, but not defined == Missing Reference: 'RFC4787' is mentioned on line 789, but not defined == Unused Reference: 'RFC5124' is defined on line 3237, but no explicit reference was found in the text == Outdated reference: A later version (-17) exists of draft-ietf-mmusic-msid-01 == Outdated reference: A later version (-26) exists of draft-ietf-mmusic-sctp-sdp-04 == Outdated reference: A later version (-54) exists of draft-ietf-mmusic-sdp-bundle-negotiation-04 == Outdated reference: A later version (-19) exists of draft-ietf-mmusic-sdp-mux-attributes-01 == Outdated reference: A later version (-02) exists of draft-ietf-mmusic-trickle-ice-00 == Outdated reference: A later version (-11) exists of draft-ietf-rtcweb-audio-02 == Outdated reference: A later version (-09) exists of draft-ietf-rtcweb-data-protocol-04 == Outdated reference: A later version (-10) exists of draft-ietf-rtcweb-fec-00 == Outdated reference: A later version (-26) exists of draft-ietf-rtcweb-rtp-usage-09 == Outdated reference: A later version (-12) exists of draft-ietf-rtcweb-security-06 == Outdated reference: A later version (-20) exists of draft-ietf-rtcweb-security-arch-09 == Outdated reference: A later version (-06) exists of draft-ietf-rtcweb-video-00 -- No information found for draft-nandakumar-mmusic-proto-iana-registration - is the name correct? -- Possible downref: Normative reference to a draft: ref. 
'I-D.nandakumar-mmusic-proto-iana-registration' ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866) ** Obsolete normative reference: RFC 4572 (Obsoleted by RFC 8122) ** Obsolete normative reference: RFC 5245 (Obsoleted by RFC 8445, RFC 8839) ** Obsolete normative reference: RFC 5285 (Obsoleted by RFC 8285) ** Obsolete normative reference: RFC 6347 (Obsoleted by RFC 9147) == Outdated reference: A later version (-08) exists of draft-nandakumar-rtcweb-sdp-02 Summary: 5 errors (**), 0 flaws (~~), 19 warnings (==), 5 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group J. Uberti 3 Internet-Draft Google 4 Intended status: Standards Track C. Jennings 5 Expires: December 17, 2015 Cisco 6 E. Rescorla, Ed. 7 Mozilla 8 June 15, 2015 10 Javascript Session Establishment Protocol 11 draft-ietf-rtcweb-jsep-10 13 Abstract 15 This document describes the mechanisms for allowing a Javascript 16 application to control the signaling plane of a multimedia session 17 via the interface specified in the W3C RTCPeerConnection API, and 18 discusses how this relates to existing signaling protocols. 20 Status of This Memo 22 This Internet-Draft is submitted in full conformance with the 23 provisions of BCP 78 and BCP 79. 25 Internet-Drafts are working documents of the Internet Engineering 26 Task Force (IETF). Note that other groups may also distribute 27 working documents as Internet-Drafts. The list of current Internet- 28 Drafts is at http://datatracker.ietf.org/drafts/current/. 30 Internet-Drafts are draft documents valid for a maximum of six months 31 and may be updated, replaced, or obsoleted by other documents at any 32 time. It is inappropriate to use Internet-Drafts as reference 33 material or to cite them other than as "work in progress." 35 This Internet-Draft will expire on December 17, 2015. 
37 Copyright Notice 39 Copyright (c) 2015 IETF Trust and the persons identified as the 40 document authors. All rights reserved. 42 This document is subject to BCP 78 and the IETF Trust's Legal 43 Provisions Relating to IETF Documents 44 (http://trustee.ietf.org/license-info) in effect on the date of 45 publication of this document. Please review these documents 46 carefully, as they describe your rights and restrictions with respect 47 to this document. Code Components extracted from this document must 48 include Simplified BSD License text as described in Section 4.e of 49 the Trust Legal Provisions and are provided without warranty as 50 described in the Simplified BSD License.
52 Table of Contents
54 1. Introduction
55 1.1. General Design of JSEP
56 1.2. Other Approaches Considered
57 2. Terminology
58 3. Semantics and Syntax
59 3.1. Signaling Model
60 3.2. Session Descriptions and State Machine
61 3.3. Session Description Format
62 3.4. ICE
63 3.4.1. ICE Gathering Overview
64 3.4.2. ICE Candidate Trickling
65 3.4.2.1. ICE Candidate Format
66 3.4.3. ICE Candidate Policy
67 3.4.4. ICE Candidate Pool
68 3.5. RTP CNAME Semantics
69 3.6. Video Size Negotiation
70 3.6.1. Creating an imageattr Attribute
71 3.6.2. Interpreting an imageattr Attribute
72 3.7. Interactions With Forking
73 3.7.1. Sequential Forking
74 3.7.2. Parallel Forking
75 4. Interface
76 4.1. Methods
77 4.1.1. Constructor
78 4.1.2. createOffer
79 4.1.3. createAnswer
80 4.1.4. SessionDescriptionType
81 4.1.4.1. Use of Provisional Answers
82 4.1.4.2. Rollback
83 4.1.5. setLocalDescription
84 4.1.6. setRemoteDescription
85 4.1.7. localDescription
86 4.1.8. remoteDescription
87 4.1.9. canTrickleIceCandidates
88 4.1.10. setConfiguration
89 4.1.11. addIceCandidate
90 5. SDP Interaction Procedures
91 5.1. Requirements Overview
92 5.1.1. Implementation Requirements
93 5.1.2. Usage Requirements
94 5.1.3. Profile Names and Interoperability
95 5.2. Constructing an Offer
96 5.2.1. Initial Offers
97 5.2.2. Subsequent Offers
98 5.2.3. Options Handling
99 5.2.3.1. OfferToReceiveAudio
100 5.2.3.2. OfferToReceiveVideo
101 5.2.3.3. IceRestart
102 5.2.3.4. VoiceActivityDetection
103 5.3. Generating an Answer
104 5.3.1. Initial Answers
105 5.3.2. Subsequent Answers
106 5.3.3. Options Handling
107 5.3.3.1. VoiceActivityDetection
108 5.4. Processing a Local Description
109 5.5. Processing a Remote Description
110 5.6. Parsing a Session Description
111 5.6.1. Session-Level Parsing
112 5.6.2. Media Section Parsing
113 5.6.3. Semantics Verification
114 5.7. Applying a Local Description
115 5.8. Applying a Remote Description
116 5.9. Applying an Answer
117 6. Configurable SDP Parameters
118 7. Examples
119 7.1. Simple Example
120 7.2. Normal Examples
121 8. Security Considerations
122 9. IANA Considerations
123 10. Acknowledgements
124 11. References
125 11.1. Normative References
126 11.2. Informative References
127 Appendix A. Change log
128 Authors' Addresses
130 1.
Introduction 132 This document describes how the W3C WebRTC RTCPeerConnection 133 interface [W3C.WD-webrtc-20140617] is used to control the setup, 134 management and teardown of a multimedia session. 136 1.1. General Design of JSEP 138 The thinking behind WebRTC call setup has been to fully specify and 139 control the media plane, but to leave the signaling plane up to the 140 application as much as possible. The rationale is that different 141 applications may prefer to use different protocols, such as the 142 existing SIP or Jingle call signaling protocols, or something custom 143 to the particular application, perhaps for a novel use case. In this 144 approach, the key information that needs to be exchanged is the 145 multimedia session description, which specifies the 146 transport and media configuration information necessary to establish 147 the media plane. 149 With these considerations in mind, this document describes the 150 Javascript Session Establishment Protocol (JSEP) that allows for full 151 control of the signaling state machine from Javascript. JSEP removes 152 the browser almost entirely from the core signaling flow, which is 153 instead handled by the Javascript making use of two interfaces: (1) 154 passing in local and remote session descriptions and (2) interacting 155 with the ICE state machine. 157 In this document, the use of JSEP is described as if it always occurs 158 between two browsers. Note, though, that in many cases it will actually be 159 between a browser and some kind of server, such as a gateway or MCU. 160 This distinction is invisible to the browser; it just follows the 161 instructions it is given via the API. 163 JSEP's handling of session descriptions is simple and 164 straightforward. Whenever an offer/answer exchange is needed, the 165 initiating side creates an offer by calling the createOffer() API. 
The 166 application optionally modifies that offer, and then uses it to set 167 up its local config via the setLocalDescription() API. The offer is 168 then sent off to the remote side over its preferred signaling 169 mechanism (e.g., WebSockets); upon receipt of that offer, the remote 170 party installs it using the setRemoteDescription() API. 172 To complete the offer/answer exchange, the remote party uses the 173 createAnswer() API to generate an appropriate answer, applies it 174 using the setLocalDescription() API, and sends the answer back to the 175 initiator over the signaling channel. When the initiator gets that 176 answer, it installs it using the setRemoteDescription() API, and 177 initial setup is complete. This process can be repeated for 178 additional offer/answer exchanges. 180 Regarding ICE [RFC5245], JSEP decouples the ICE state machine from 181 the overall signaling state machine, as the ICE state machine must 182 remain in the browser, because only the browser has the necessary 183 knowledge of candidates and other transport info. Performing this 184 separation also provides additional flexibility; in protocols that 185 decouple session descriptions from transport, such as Jingle, the 186 session description can be sent immediately and the transport 187 information can be sent when available. In protocols that don't, 188 such as SIP, the information can be used in the aggregated form. 189 Sending transport information separately can allow for faster ICE and 190 DTLS startup, since ICE checks can start as soon as any transport 191 information is available rather than waiting for all of it. 193 Through its abstraction of signaling, the JSEP approach does require 194 the application to be aware of the signaling process. 
While the 195 application does not need to understand the contents of session 196 descriptions to set up a call, the application must call the right 197 APIs at the right times, convert the session descriptions and ICE 198 information into the defined messages of its chosen signaling 199 protocol, and perform the reverse conversion on the messages it 200 receives from the other side. 202 One way to mitigate this is to provide a Javascript library that 203 hides this complexity from the developer; said library would 204 implement a given signaling protocol along with its state machine and 205 serialization code, presenting a higher-level call-oriented interface 206 to the application developer. For example, libraries exist to adapt 207 the JSEP API into an API suitable for SIP or XMPP. Thus, JSEP 208 provides greater control for the experienced developer without 209 forcing any additional complexity on the novice developer. 211 1.2. Other Approaches Considered 213 One approach that was considered instead of JSEP was to include a 214 lightweight signaling protocol. Instead of providing session 215 descriptions to the API, the API would produce and consume messages 216 from this protocol. While providing a higher-level API, this put 217 more control of signaling within the browser, forcing the browser to 218 understand and handle concepts like signaling glare. In 219 addition, it prevented the application from driving the state machine 220 to a desired state, as is needed in the page reload case. 222 A second approach that was considered but not chosen was to decouple 223 the management of the media control objects from session 224 descriptions, instead offering APIs that would control each component 225 directly. 
This was rejected based on a feeling that requiring 226 exposure of this level of complexity to the application programmer 227 would not be beneficial; it would result in an API where even a 228 simple example would require a significant amount of code to 229 orchestrate all the needed interactions, as well as creating a large 230 API surface that needed to be agreed upon and documented. In 231 addition, these API points could be called in any order, resulting in 232 a more complex set of interactions with the media subsystem than the 233 JSEP approach, which specifies how session descriptions are to be 234 evaluated and applied. 236 One variation on JSEP that was considered was to keep the basic 237 session description-oriented API, but to move the mechanism for 238 generating offers and answers out of the browser. Instead of 239 providing createOffer/createAnswer methods within the browser, this 240 approach would instead expose a getCapabilities API which would 241 provide the application with the information it needed in order to 242 generate its own session descriptions. This increases the amount of 243 work that the application needs to do; it needs to know how to 244 generate session descriptions from capabilities, and especially how 245 to generate the correct answer from an arbitrary offer and the 246 supported capabilities. While this could certainly be addressed by 247 using a library like the one mentioned above, it basically forces the 248 use of said library even for a simple example. Providing 249 createOffer/createAnswer avoids this problem, but still allows 250 applications to generate their own offers/answers (to a large extent) 251 if they choose, using the description generated by createOffer as an 252 indication of the browser's capabilities. 254 2. 
Terminology 256 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 257 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 258 document are to be interpreted as described in [RFC2119]. 260 3. Semantics and Syntax 262 3.1. Signaling Model 264 JSEP does not specify a particular signaling model or state machine, 265 other than the generic need to exchange session descriptions in the 266 fashion described by [RFC3264] (offer/answer) in order for both sides 267 of the session to know how to conduct the session. JSEP provides 268 mechanisms to create offers and answers, as well as to apply them to 269 a session. However, the browser is totally decoupled from the actual 270 mechanism by which these offers and answers are communicated to the 271 remote side, including addressing, retransmission, forking, and glare 272 handling. These issues are left entirely up to the application; the 273 application has complete control over which offers and answers get 274 handed to the browser, and when.
276      +-----------+                               +-----------+
277      |  Web App  |<--- App-Specific Signaling -->|  Web App  |
278      +-----------+                               +-----------+
279            ^                                            ^
280            |  SDP                                       |  SDP
281            V                                            V
282      +-----------+                                +-----------+
283      |  Browser  |<----------- Media ------------>|  Browser  |
284      +-----------+                                +-----------+
286 Figure 1: JSEP Signaling Model
288 3.2. Session Descriptions and State Machine 290 In order to establish the media plane, the user agent needs specific 291 parameters to indicate what to transmit to the remote side, as well 292 as how to handle the media that is received. These parameters are 293 determined by the exchange of session descriptions in offers and 294 answers, and there are certain details to this process that must be 295 handled in the JSEP APIs. 297 Whether a session description applies to the local side or the remote 298 side affects the meaning of that description. 
For example, the list 299 of codecs sent to a remote party indicates what the local side is 300 willing to receive, which, when intersected with the set of codecs 301 the remote side supports, specifies what the remote side should send. 302 However, not all parameters follow this rule; for example, the DTLS-SRTP 303 parameters [RFC5763] sent to a remote party indicate what 304 certificate the local side will use in DTLS setup, and thereby what 305 the remote party should expect to receive; the remote party will have 306 to accept these parameters, with no option to choose different 307 values. 309 In addition, various RFCs put different conditions on the format of 310 offers versus answers. For example, an offer may propose an 311 arbitrary number of media streams (i.e., m= sections), but an answer 312 must contain the exact same number as the offer. 314 Lastly, while the exact media parameters are known only after an 315 offer and an answer have been exchanged, it is possible for the 316 offerer to receive media after they have sent an offer and before 317 they have received an answer. To properly process incoming media in 318 this case, the offerer's media handler must be aware of the details 319 of the offer before the answer arrives. 321 Therefore, in order to handle session descriptions properly, the user 322 agent needs: 324 1. To know if a session description pertains to the local or remote 325 side. 327 2. To know if a session description is an offer or an answer. 329 3. To allow the offer to be specified independently of the answer. 331 JSEP addresses this by adding both setLocalDescription and 332 setRemoteDescription methods and having session description objects 333 contain a type field indicating the type of session description being 334 supplied. 
This satisfies the requirements listed above for both the 335 offerer, who first calls setLocalDescription(sdp [offer]) and then 336 later setRemoteDescription(sdp [answer]), as well as for the 337 answerer, who first calls setRemoteDescription(sdp [offer]) and then 338 later setLocalDescription(sdp [answer]). 340 JSEP also allows for an answer to be treated as provisional by the 341 application. Provisional answers provide a way for an answerer to 342 communicate initial session parameters back to the offerer, in order 343 to allow the session to begin, while allowing a final answer to be 344 specified later. This concept of a final answer is important to the 345 offer/answer model; when such an answer is received, any extra 346 resources allocated by the caller can be released, now that the exact 347 session configuration is known. These "resources" can include things 348 like extra ICE components, TURN candidates, or video decoders. 349 Provisional answers, on the other hand, cause no such deallocation; 350 as a result, multiple dissimilar provisional answers can be 351 received and applied during call setup. 353 In [RFC3264], the constraint at the signaling level is that only one 354 offer can be outstanding for a given session, but at the media stack 355 level, a new offer can be generated at any point. For example, when 356 using SIP for signaling, if one offer is sent, then cancelled using a 357 SIP CANCEL, another offer can be generated even though no answer was 358 received for the first offer. To support this, the JSEP media layer 359 can provide an offer via the createOffer() method whenever the 360 Javascript application needs one for the signaling. The answerer can 361 send back zero or more provisional answers, and finally end the 362 offer-answer exchange by sending a final answer. 
The state machine 363 for this is as follows:
365                     setRemote(OFFER)                     setLocal(PRANSWER)
366                          /-----\                               /-----\
367                          |     |                               |     |
368                          v     |                               v     |
369           +---------------+    |                +---------------+    |
370           |               |----/                |               |----/
371           |               | setLocal(PRANSWER)  |               |
372           |  Remote-Offer |------------------- >| Local-Pranswer|
373           |               |                     |               |
374           |               |                     |               |
375           +---------------+                     +---------------+
376                ^   |                                   |
377                |   | setLocal(ANSWER)                  |
378 setRemote(OFFER)   |                                   |
379                |   V                  setLocal(ANSWER) |
380           +---------------+                            |
381           |               |                            |
382           |               |<---------------------------+
383           |    Stable     |
384           |               |<---------------------------+
385           |               |                            |
386           +---------------+          setRemote(ANSWER) |
387                ^   |                                   |
388                |   | setLocal(OFFER)                   |
389 setRemote(ANSWER)  |                                   |
390                |   V                                   |
391           +---------------+                     +---------------+
392           |               |                     |               |
393           |               | setRemote(PRANSWER) |               |
394           |  Local-Offer  |------------------- >|Remote-Pranswer|
395           |               |                     |               |
396           |               |----\                |               |----\
397           +---------------+    |                +---------------+    |
398                          ^     |                               ^     |
399                          |     |                               |     |
400                          \-----/                               \-----/
401                      setLocal(OFFER)                     setRemote(PRANSWER)
403 Figure 2: JSEP State Machine
405 Aside from these state transitions there is no other difference 406 between the handling of provisional ("pranswer") and final ("answer") 407 answers. 409 3.3. Session Description Format 411 In the WebRTC specification, session descriptions are formatted as 412 SDP messages. While this format is not optimal for manipulation from 413 Javascript, it is widely accepted, and frequently updated with new 414 features. Any alternate encoding of session descriptions would have 415 to keep pace with the changes to SDP, at least until the time that 416 this new encoding eclipsed SDP in popularity. As a result, JSEP 417 currently uses SDP as the internal representation for its session 418 descriptions. 420 However, to simplify Javascript processing, and provide for future 421 flexibility, the SDP syntax is encapsulated within a 422 SessionDescription object, which can be constructed from SDP, and be 423 serialized out to SDP. 
If future specifications agree on a JSON 424 format for session descriptions, we could easily enable this object 425 to generate and consume that JSON. 427 Other methods may be added to SessionDescription in the future to 428 simplify handling of SessionDescriptions from Javascript. In the 429 meantime, Javascript libraries can be used to perform these 430 manipulations. 432 Note that most applications should be able to treat the 433 SessionDescriptions produced and consumed by these various API calls 434 as opaque blobs; that is, the application will not need to read or 435 change them. The W3C WebRTC API specification will provide 436 appropriate APIs to allow the application to control various session 437 parameters, which will provide the necessary information to the 438 browser about what sort of SessionDescription to produce. 440 3.4. ICE 442 3.4.1. ICE Gathering Overview 444 JSEP gathers ICE candidates as needed by the application. Collection 445 of ICE candidates is referred to as a gathering phase, and this is 446 triggered either by the addition of a new or recycled m= line to the 447 local session description, or new ICE credentials in the description, 448 indicating an ICE restart. Use of new ICE credentials can be 449 triggered explicitly by the application, or implicitly by the browser 450 in response to changes in the ICE configuration. 452 When a new gathering phase starts, the ICE Agent will notify the 453 application that gathering is occurring through an event. Then, when 454 each new ICE candidate becomes available, the ICE Agent will supply 455 it to the application via an additional event; these candidates will 456 also automatically be added to the local session description. 458 Finally, when all candidates have been gathered, an event will be 459 dispatched to signal that the gathering process is complete. 
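As an illustration only, the event sequence described in this section can be modeled in a few lines of TypeScript. The event names, the LocalDescription shape, and runGatheringPhase below are hypothetical names chosen for this sketch; they are not defined by JSEP or by the W3C API, and a real ICE Agent gathers asynchronously rather than from a ready-made list.

```typescript
// Hypothetical event shapes; the real event names belong to the W3C API.
type GatheringEvent =
  | { kind: "gathering-started" }
  | { kind: "candidate"; candidate: string }
  | { kind: "gathering-complete" };

// Minimal stand-in for the part of the local session description that
// accumulates candidate lines.
interface LocalDescription {
  candidates: string[];
}

// Models one gathering phase: announce the start, surface each candidate
// (adding it to the local description as well), then signal completion.
function runGatheringPhase(
  discovered: string[],
  local: LocalDescription,
  notify: (event: GatheringEvent) => void
): void {
  notify({ kind: "gathering-started" });
  for (const candidate of discovered) {
    local.candidates.push(candidate);
    notify({ kind: "candidate", candidate });
  }
  notify({ kind: "gathering-complete" });
}
```

An application that trickles would forward each "candidate" event as it arrives; one that does not would wait for "gathering-complete" and then read the candidates from the local description.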
461 Note that gathering phases only gather the candidates needed by 462 new/recycled/restarting m= lines; other m= lines continue to use 463 their existing candidates. 465 3.4.2. ICE Candidate Trickling 467 Candidate trickling is a technique through which a caller may 468 incrementally provide candidates to the callee after the initial 469 offer has been dispatched; the semantics of "Trickle ICE" are defined 470 in [I-D.ietf-mmusic-trickle-ice]. This process allows the callee to 471 begin acting upon the call and setting up the ICE (and perhaps DTLS) 472 connections immediately, without having to wait for the caller to 473 gather all possible candidates. This results in faster media setup 474 in cases where gathering is not performed prior to initiating the 475 call. 477 JSEP supports optional candidate trickling by providing APIs, as 478 described above, that provide control and feedback on the ICE 479 candidate gathering process. Applications that support candidate 480 trickling can send the initial offer immediately and send individual 481 candidates when they are notified of a new candidate; 482 applications that do not support this feature can simply wait for the 483 indication that gathering is complete, and then create and send their 484 offer, with all the candidates, at that time. 486 Upon receipt of trickled candidates, the receiving application will 487 supply them to its ICE Agent. This triggers the ICE Agent to start 488 using the new remote candidates for connectivity checks. 490 3.4.2.1. ICE Candidate Format 492 As with session descriptions, the syntax of the IceCandidate object 493 provides some abstraction, but can be easily converted to and from 494 the SDP candidate lines. 496 The candidate lines are the only SDP information that is contained 497 within IceCandidate, as they represent the only information needed 498 that is not present in the initial offer (i.e., for trickle 499 candidates). 
This information is carried with the same syntax as the 500 "candidate-attribute" field defined for ICE. For example: 502 candidate:1 1 UDP 1694498815 192.0.2.33 10000 typ host 504 The IceCandidate object also contains fields to indicate which m= 505 line it should be associated with. The m= line can be identified in 506 one of two ways; either by a m= line index, or a MID. The m= line 507 index is a zero-based index, with index N referring to the N+1th m= 508 line in the SDP sent by the entity which sent the IceCandidate. The 509 MID uses the "media stream identification" attribute, as defined in 510 [RFC5888], Section 4, to identify the m= line. JSEP implementations 511 creating an ICE Candidate object MUST populate both of these fields. 512 Implementations receiving an ICE Candidate object MUST use the MID if 513 present, or the m= line index, if not (as it could have come from a 514 non-JSEP endpoint). 516 3.4.3. ICE Candidate Policy 518 Typically, when gathering ICE candidates, the browser will gather all 519 possible forms of initial candidates - host, server reflexive, and 520 relay. However, in certain cases, applications may want to have more 521 specific control over the gathering process, due to privacy or 522 related concerns. For example, one may want to suppress the use of 523 host candidates, to avoid exposing information about the local 524 network, or go as far as only using relay candidates, to leak as 525 little location information as possible (note that these choices come 526 with corresponding operational costs). To accomplish this, the 527 browser MUST allow the application to restrict which ICE candidates 528 are used in a session. In addition, administrators may also wish to 529 control the set of ICE candidates, and so the browser SHOULD also 530 allow control via local policy, with the most restrictive policy 531 prevailing. 
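The combination of application policy and local policy described above can be sketched as follows. This is a minimal model under stated assumptions: the policy names "all", "no-host", and "relay-only", the numeric strictness ordering, and the function names are all hypothetical and are not taken from JSEP or the W3C API. The candidate type tokens ("host", "srflx", "relay") follow the "typ" values used in ICE candidate lines.

```typescript
type CandidateType = "host" | "srflx" | "relay";

// Hypothetical policy names, ordered from least to most restrictive.
type CandidatePolicy = "all" | "no-host" | "relay-only";

const STRICTNESS: Record<CandidatePolicy, number> = {
  "all": 0,
  "no-host": 1,
  "relay-only": 2,
};

// The most restrictive of the application policy and the local
// (administrator) policy prevails.
function effectivePolicy(
  appPolicy: CandidatePolicy,
  localPolicy: CandidatePolicy
): CandidatePolicy {
  return STRICTNESS[appPolicy] >= STRICTNESS[localPolicy] ? appPolicy : localPolicy;
}

// Decides whether a candidate of the given type may be used under a policy.
function isAllowed(type: CandidateType, policy: CandidatePolicy): boolean {
  if (policy === "relay-only") return type === "relay";
  if (policy === "no-host") return type !== "host";
  return true;
}

// Returns only the candidate types that may be used in the session.
function permittedCandidates(
  gathered: CandidateType[],
  appPolicy: CandidatePolicy,
  localPolicy: CandidatePolicy
): CandidateType[] {
  const policy = effectivePolicy(appPolicy, localPolicy);
  return gathered.filter((type) => isAllowed(type, policy));
}
```

For example, an application asking for "all" under a local policy of "no-host" would be left with server reflexive and relay candidates only. The linear ordering works here because each assumed policy permits a subset of the previous one; policies that are not nested would need an explicit intersection instead.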
533 There may also be cases where the application wants to change which 534 types of candidates are used while the session is active. A prime 535 example is where a callee may initially want to use only relay 536 candidates, to avoid leaking location information to an arbitrary 537 caller, but then change to use all candidates (for lower operational 538 cost) once the user has indicated they want to take the call. For 539 this scenario, the browser MUST allow the candidate policy to be 540 changed in mid-session, subject to the aforementioned interactions 541 with local policy. 543 To administer the ICE candidate policy, the browser will determine 544 the current setting at the start of each gathering phase. Then, 545 during the gathering phase, the browser MUST NOT expose candidates 546 disallowed by the current policy to the application, use them as the 547 source of connectivity checks, or indirectly expose them via other 548 fields, such as the raddr/rport attributes for other ICE candidates. 549 Later, if a different policy is specified by the application, the 550 application can apply it by kicking off a new gathering phase via an 551 ICE restart. 553 3.4.4. ICE Candidate Pool 555 JSEP applications typically inform the browser to begin ICE gathering 556 via the information supplied to setLocalDescription, as this is where 557 the app specifies the number of media streams, and thereby ICE 558 components, for which to gather candidates. However, to accelerate 559 cases where the application knows the number of ICE components to use 560 ahead of time, it may ask the browser to gather a pool of potential 561 ICE candidates to help ensure rapid media setup. 563 When setLocalDescription is eventually called, and the browser goes 564 to gather the needed ICE candidates, it SHOULD start by checking if 565 any candidates are available in the pool. 
If there are candidates in 566 the pool, they SHOULD be handed to the application immediately via 567 the ICE candidate event. If the pool becomes depleted, either 568 because a larger-than-expected number of ICE components is used, or 569 because the pool has not had enough time to gather candidates, the 570 remaining candidates are gathered as usual. 572 One example of where this concept is useful is an application that 573 expects an incoming call at some point in the future, and wants to 574 minimize the time it takes to establish connectivity, to avoid 575 clipping of initial media. By pre-gathering candidates into the 576 pool, it can exchange and start sending connectivity checks from 577 these candidates almost immediately upon receipt of a call. Note 578 though that by holding on to these pre-gathered candidates, which 579 will be kept alive as long as they may be needed, the application 580 will consume resources on the STUN/TURN servers it is using. 582 3.5. RTP CNAME Semantics 584 RTP CNAME values provide a canonical name for the RTP endpoint, 585 allowing other RTP endpoints to determine which RTP streams are using 586 the same clock and thus which clock sources can be used to 587 synchronize media playout. 589 Any MediaStreamTracks which have different clock sources MUST have 590 different CNAMEs [TODO: need a reference for this.] Any 591 MediaStreamTracks which are in different PeerConnection objects MUST 592 have different CNAMEs; this prevents peers from linking calls from 593 multiple remote PeerConnections based on the CNAME. For simplicity, 594 MediaStreamTracks in the same PeerConnection which have the same 595 clock source SHOULD have the same CNAME. 597 3.6. Video Size Negotiation 599 Video size negotiation is the process through which a receiver can 600 use the "a=imageattr" SDP attribute [RFC6236] to indicate what video 601 frame sizes it is capable of receiving.
A receiver may have hard 602 limits on what its video decoder can process, or it may wish to 603 constrain what it receives due to application preferences, e.g. a 604 specific size for the window in which the video will be displayed. 606 3.6.1. Creating an imageattr Attribute 608 In order to determine the limits on what video resolution a receiver 609 wants to receive, it will intersect its decoder hard limits with any 610 mandatory constraints that have been applied to the associated 611 MediaStreamTrack. If the decoder limits are unknown, e.g. when using 612 a software decoder, the mandatory constraints are used directly. For 613 the answerer, these mandatory constraints can be applied to the 614 remote MediaStreamTracks that are created by a setRemoteDescription 615 call, and will affect the output of the ensuing createAnswer call. 616 Any constraints set after setLocalDescription is used to set the 617 answer will result in a new offer-answer exchange. For the offerer, 618 because it does not know about any remote MediaStreamTracks until it 619 receives the answer, the offer can only reflect decoder hard limits. 620 If the offerer wishes to set mandatory constraints on video 621 resolution, it must do so after receiving the answer, and the result 622 will be a new offer-answer to communicate them. 624 If there are no known decoder limits or mandatory constraints, the 625 "a=imageattr" attribute SHOULD be omitted. 627 Otherwise, an "a=imageattr" attribute is created with "recv" 628 direction, and the resulting resolution space formed by intersecting 629 the decoder limits and constraints is used to specify its minimum and 630 maximum x= and y= values. If the intersection is the null set, i.e., 631 there are no resolutions that are permitted by both the decoder and 632 the mandatory constraints, this SHOULD be represented by x=0 and y=0 633 values. 
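The procedure above can be sketched in Python as follows. Limits are modeled as a pair of inclusive (low, high) ranges for x and y, with None meaning "unknown" or "not set"; this representation, the function names, and the fixed q=1.0 are assumptions made for the example. A null intersection is emitted as x=0 and y=0, as described above.

```python
def intersect(a, b):
    """Intersect two inclusive (low, high) ranges."""
    return (max(a[0], b[0]), min(a[1], b[1]))


def fmt(axis, rng):
    """Render one axis as a single value or as a [low:high] range."""
    low, high = rng
    return f"{axis}={low}" if low == high else f"{axis}=[{low}:{high}]"


def build_imageattr(decoder_limits, constraints, payload_type="*"):
    """Return an a=imageattr line, or None if it should be omitted.

    decoder_limits and constraints are ((x_low, x_high), (y_low, y_high))
    tuples, or None when unknown or not set.
    """
    if decoder_limits is None and constraints is None:
        return None  # nothing known: omit the attribute
    if decoder_limits is None:
        x, y = constraints       # constraints are used directly
    elif constraints is None:
        x, y = decoder_limits    # only decoder hard limits apply
    else:
        x = intersect(decoder_limits[0], constraints[0])
        y = intersect(decoder_limits[1], constraints[1])
    if x[0] > x[1] or y[0] > y[1]:
        # Null intersection: no resolution satisfies both sets of limits.
        return f"a=imageattr:{payload_type} recv [x=0,y=0,q=1.0]"
    return f"a=imageattr:{payload_type} recv [{fmt('x', x)},{fmt('y', y)},q=1.0]"
```

For instance, a decoder limited to ((16, 1920), (16, 1080)) combined with constraints of ((1, 640), (1, 360)) produces an attribute with x=[16:640] and y=[16:360].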
635 The rules here express a single set of preferences, and therefore, 636 the "a=imageattr" q= value is not important. It SHOULD be set to 637 1.0. 639 The "a=imageattr" field is payload type specific. When all video 640 codecs supported have the same capabilities, use of a single 641 attribute, with the wildcard payload type (*), is RECOMMENDED. 642 However, when the supported video codecs have differing capabilities, 643 specific "a=imageattr" attributes MUST be inserted for each payload 644 type. 646 As an example, consider a system with an HD-capable, multiformat video 647 decoder, where the application has constrained the received track to 648 at most 360p. In this case, the implementation would generate this 649 attribute: 651 a=imageattr:* recv [x=[16:640],y=[16:360],q=1.0] 653 3.6.2. Interpreting an imageattr Attribute 655 [RFC6236] defines "a=imageattr" to be an advisory field. This means 656 that it does not absolutely constrain the video formats that the 657 sender can use, but gives an indication of the preferred values. 659 This specification prescribes more specific behavior. When a sender 660 of a given MediaStreamTrack, which is producing video of a certain 661 resolution, receives an "a=imageattr recv" attribute, it MUST first 662 check to see if the original resolution meets the criteria specified 663 in the attribute, and transmit it untouched if so. If the original 664 resolution is too large for the attribute criteria, the sender SHOULD 665 apply downscaling to the output of the MediaStreamTrack in order to 666 satisfy the criteria. In rare cases, where a receiver requires a 667 minimum resolution which is greater than the native resolution of the 668 video, the sender SHOULD apply upscaling in order to provide that 669 resolution. The sender SHOULD NOT apply upscaling in any other 670 cases. 672 If there is no appropriate scaling mechanism that allows the received 673 criteria to be satisfied, the sender MUST NOT transmit the track.
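The sender-side decision can be sketched in Python as below. The sketch assumes that the only available scaling mechanism is uniform scaling that preserves the aspect ratio, and it treats a maximum of 0 on either axis as "do not send"; both are simplifications for illustration, as are the function name and the returned labels.

```python
def choose_scaling(native, x_range, y_range):
    """Decide how a sender might treat a track given a received imageattr.

    native is (width, height); x_range and y_range are inclusive
    (min, max) pairs taken from the "recv" attribute.
    Returns (action, resolution), where action is one of "untouched",
    "downscale", "upscale", or "do-not-send".
    """
    width, height = native
    (min_w, max_w), (min_h, max_h) = x_range, y_range

    if max_w == 0 or max_h == 0:
        return ("do-not-send", None)       # receiver accepts no resolution

    if min_w <= width <= max_w and min_h <= height <= max_h:
        return ("untouched", native)       # original resolution is acceptable

    # A uniform factor s must satisfy min <= s * size <= max on both axes.
    lowest = max(min_w / width, min_h / height)
    highest = min(max_w / width, max_h / height)
    if lowest > highest:
        return ("do-not-send", None)       # no uniform factor fits

    # Too large: shrink as little as possible. Too small: enlarge as
    # little as possible. Upscaling happens only to reach a minimum.
    factor = highest if highest < 1 else lowest
    scaled = (round(width * factor), round(height * factor))
    if not (min_w <= scaled[0] <= max_w and min_h <= scaled[1] <= max_h):
        return ("do-not-send", None)       # rounding pushed it out of range
    return ("downscale" if factor < 1 else "upscale", scaled)
```

A 1280x720 track checked against x=[16:640], y=[16:360] is downscaled by one half under these assumptions; a real implementation may have other scaling options, such as cropping, that this sketch does not model.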
675 In the special case of receiving a maximum resolution of [0, 0], as 676 described above, the sender MUST NOT transmit the track. 678 3.7. Interactions With Forking 680 Some call signaling systems allow various types of forking where an 681 SDP Offer may be provided to more than one device. For example, SIP 682 [RFC3261] defines both a "Parallel Search" and "Sequential Search". 683 Although these are primarily signaling level issues that are outside 684 the scope of JSEP, they do have some impact on the configuration of 685 the media plane that is relevant. When forking happens at the 686 signaling layer, the Javascript application responsible for the 687 signaling needs to make the decisions about what media should be sent 688 or received at any point of time, as well as which remote endpoint it 689 should communicate with; JSEP is used to make sure the media engine 690 can make the RTP and media perform as required by the application. 691 The basic operations that the applications can have the media engine 692 do are: 694 o Start exchanging media with a given remote peer, but keep all the 695 resources reserved in the offer. 697 o Start exchanging media with a given remote peer, and free any 698 resources in the offer that are not being used. 700 3.7.1. Sequential Forking 702 Sequential forking involves a call being dispatched to multiple 703 remote callees, where each callee can accept the call, but only one 704 active session ever exists at a time; no mixing of received media is 705 performed. 707 JSEP handles sequential forking well, allowing the application to 708 easily control the policy for selecting the desired remote endpoint. 709 When an answer arrives from one of the callees, the application can 710 choose to apply it either as a provisional answer, leaving open the 711 possibility of using a different answer in the future, or apply it as 712 a final answer, ending the setup flow. 
714 In a "first-one-wins" situation, the first answer will be applied as 715 a final answer, and the application will reject any subsequent 716 answers. In SIP parlance, this would be ACK + BYE. 718 In a "last-one-wins" situation, all answers would be applied as 719 provisional answers, and any previous call leg will be terminated. 720 At some point, the application will end the setup process, perhaps 721 with a timer; at this point, the application could reapply the 722 existing remote description as a final answer. 724 3.7.2. Parallel Forking 726 Parallel forking involves a call being dispatched to multiple remote 727 callees, where each callee can accept the call, and multiple 728 simultaneous active signaling sessions can be established as a 729 result. If multiple callees send media at the same time, the 730 possibilities for handling this are described in Section 3.1 of 731 [RFC3960]. Most SIP devices today only support exchanging media with 732 a single device at a time, and do not try to mix multiple early media 733 audio sources, as that could result in a confusing situation. For 734 example, consider having a European ringback tone mixed together with 735 the North American ringback tone - the resulting sound would not be 736 like either tone, and would confuse the user. If the signaling 737 application wishes to only exchange media with one of the remote 738 endpoints at a time, then from a media engine point of view, this is 739 exactly like the sequential forking case. 741 In the parallel forking case where the Javascript application wishes 742 to simultaneously exchange media with multiple peers, the flow is 743 slightly more complex, but the Javascript application can follow the 744 strategy that [RFC3960] describes using UPDATE. The UPDATE approach 745 allows the signaling to set up a separate media flow for each peer 746 that it wishes to exchange media with. 
In JSEP, the offer used in 747 the UPDATE would be formed by simply creating a new PeerConnection 748 and making sure that the same local media streams have been added 749 into this new PeerConnection. Then the new PeerConnection object 750 would produce an SDP offer that could be used by the signaling to 751 perform the UPDATE strategy discussed in [RFC3960]. 753 As a result of sharing the media streams, the application will end up 754 with N parallel PeerConnection sessions, each with a local and remote 755 description and their own local and remote addresses. The media flow 756 from these sessions can be managed by specifying SDP direction 757 attributes in the descriptions, or the application can choose to play 758 out the media from all sessions mixed together. Of course, if the 759 application wants to only keep a single session, it can simply 760 terminate the sessions that it no longer needs. 762 4. Interface 764 This section details the basic operations that must be present to 765 implement JSEP functionality. The actual API exposed in the W3C API 766 may have somewhat different syntax, but should map easily to these 767 concepts. 769 4.1. Methods 771 4.1.1. Constructor 773 The PeerConnection constructor allows the application to specify 774 global parameters for the media session, such as the STUN/TURN 775 servers and credentials to use when gathering candidates, as well as 776 the initial ICE candidate policy and pool size, and also the BUNDLE 777 policy to use. 779 If an ICE candidate policy is specified, it functions as described in 780 Section 3.4.3, causing the browser to only surface the permitted 781 candidates to the application, and only use those candidates for 782 connectivity checks. The set of available policies is as follows: 784 all: All candidates will be gathered and used. 786 public: Candidates with private IP addresses [RFC1918] will be 787 filtered out.
This prevents exposure of internal network details, 788 at the cost of requiring relay usage even for intranet calls, if 789 the NAT does not allow hairpinning as described in [RFC4787], 790 section 6. 792 relay: All candidates except relay candidates will be filtered out. 793 This obfuscates the location information that might be ascertained 794 by the remote peer from the received candidates. Depending on how 795 the application deploys its relay servers, this could obfuscate 796 location to a metro or possibly even global level. 798 Although it can be overridden by local policy, the default ICE 799 candidate policy MUST be set to allow all candidates, as this 800 minimizes use of application STUN/TURN server resources. 802 If a size is specified for the ICE candidate pool, this indicates the 803 number of ICE components to pre-gather candidates for. Because pre- 804 gathering results in utilizing STUN/TURN server resources for 805 potentially long periods of time, this must only occur upon 806 application request, and therefore the default candidate pool size 807 MUST be zero. 809 The application can specify its preferred policy regarding use of 810 BUNDLE, the multiplexing mechanism defined in 811 [I-D.ietf-mmusic-sdp-bundle-negotiation]. Regardless of policy, the 812 application will always try to negotiate BUNDLE onto a single 813 transport, and will offer a single BUNDLE group across all media 814 sections. However, by specifying a policy from the list below, the 815 application can control how aggressively it will try to BUNDLE media 816 streams together, which affects how it will interoperate with a non- 817 BUNDLE-aware endpoint. When negotiating with a non-BUNDLE-aware 818 endpoint, only the streams not marked as bundle-only streams will be 819 established. 
The set of available policies is as follows: 821 balanced: The first media section of each type (audio, video, or 822 application) will contain transport parameters, which will allow 823 an answerer to unbundle that section. The second and any 824 subsequent media section of each type will be marked bundle-only. 825 The result is that if there are N distinct media types, then 826 candidates will be gathered for N media streams. This policy 827 balances desire to multiplex with the need to ensure basic audio 828 and video can still be negotiated in legacy cases. 830 max-compat: All media sections will contain transport parameters; 831 none will be marked as bundle-only. This policy will allow all 832 streams to be received by non-BUNDLE-aware endpoints, but require 833 separate candidates to be gathered for each media stream. 835 max-bundle: Only the first media section will contain transport 836 parameters; all streams other than the first will be marked as 837 bundle-only. This policy aims to minimize candidate gathering and 838 maximize multiplexing, at the cost of less compatibility with 839 legacy endpoints. 841 As it provides the best tradeoff between performance and 842 compatibility with legacy endpoints, the default BUNDLE policy MUST 843 be set to "balanced". 845 The application can specify its preferred policy regarding use of 846 RTP/RTCP multiplexing [RFC5761] using one of the following policies: 848 negotiate: The browser will gather both RTP and RTCP candidates but 849 also will offer "a=rtcp-mux", thus allowing for compatibility with 850 either multiplexing or non-multiplexing endpoints. 852 require: The browser will only gather RTP candidates. This halves 853 the number of candidates that the offerer needs to gather. When 854 acting as answerer, the browser will reject any m= section that 855 does not provide an "a=rtcp-mux" attribute. 857 4.1.2.
createOffer 859 The createOffer method generates a blob of SDP that contains a 860 [RFC3264] offer with the supported configurations for the session, 861 including descriptions of the local MediaStreams attached to this 862 PeerConnection, the codec/RTP/RTCP options supported by this 863 implementation, and any candidates that have been gathered by the ICE 864 Agent. An options parameter may be supplied to provide additional 865 control over the generated offer. This options parameter should 866 allow for the following manipulations to be performed: 868 o To indicate support for a media type even if no MediaStreamTracks 869 of that type have been added to the session (e.g., an audio call 870 that wants to receive video.) 872 o To trigger an ICE restart, for the purpose of reestablishing 873 connectivity. 875 In the initial offer, the generated SDP will contain all desired 876 functionality for the session (functionality that is supported but 877 not desired by default may be omitted); for each SDP line, the 878 generation of the SDP will follow the process defined for generating 879 an initial offer from the document that specifies the given SDP line. 880 The exact handling of initial offer generation is detailed in 881 Section 5.2.1 below. 883 In the event createOffer is called after the session is established, 884 createOffer will generate an offer to modify the current session 885 based on any changes that have been made to the session, e.g. adding 886 or removing MediaStreams, or requesting an ICE restart. For each 887 existing stream, the generation of each SDP line must follow the 888 process defined for generating an updated offer from the RFC that 889 specifies the given SDP line. For each new stream, the generation of 890 the SDP must follow the process of generating an initial offer, as 891 mentioned above. 
If no changes have been made, or for SDP lines that 892 are unaffected by the requested changes, the offer will only contain 893 the parameters negotiated by the last offer-answer exchange. The 894 exact handling of subsequent offer generation is detailed in 895 Section 5.2.2. below. 897 Session descriptions generated by createOffer must be immediately 898 usable by setLocalDescription; if a system has limited resources 899 (e.g. a finite number of decoders), createOffer should return an 900 offer that reflects the current state of the system, so that 901 setLocalDescription will succeed when it attempts to acquire those 902 resources. Because this method may need to inspect the system state 903 to determine the currently available resources, it may be implemented 904 as an async operation. 906 Calling this method may do things such as generate new ICE 907 credentials, but does not result in candidate gathering, or cause 908 media to start or stop flowing. 910 4.1.3. createAnswer 912 The createAnswer method generates a blob of SDP that contains a 913 [RFC3264] SDP answer with the supported configuration for the session 914 that is compatible with the parameters supplied in the most recent 915 call to setRemoteDescription, which MUST have been called prior to 916 calling createAnswer. Like createOffer, the returned blob contains 917 descriptions of the local MediaStreams attached to this 918 PeerConnection, the codec/RTP/RTCP options negotiated for this 919 session, and any candidates that have been gathered by the ICE Agent. 920 An options parameter may be supplied to provide additional control 921 over the generated answer. 923 As an answer, the generated SDP will contain a specific configuration 924 that specifies how the media plane should be established; for each 925 SDP line, the generation of the SDP must follow the process defined 926 for generating an answer from the document that specifies the given 927 SDP line. 
The exact handling of answer generation is detailed in 928 Section 5.3. below. 930 Session descriptions generated by createAnswer must be immediately 931 usable by setLocalDescription; like createOffer, the returned 932 description should reflect the current state of the system. Because 933 this method may need to inspect the system state to determine the 934 currently available resources, it may need to be implemented as an 935 async operation. 937 Calling this method may do things such as generate new ICE 938 credentials, but does not trigger candidate gathering or change media 939 state. 941 4.1.4. SessionDescriptionType 943 Session description objects (RTCSessionDescription) may be of type 944 "offer", "pranswer", "answer" or "rollback". These types provide 945 information as to how the description parameter should be parsed, and 946 how the media state should be changed. 948 "offer" indicates that a description should be parsed as an offer; 949 said description may include many possible media configurations. A 950 description used as an "offer" may be applied anytime the 951 PeerConnection is in a stable state, or as an update to a previously 952 supplied but unanswered "offer". 954 "pranswer" indicates that a description should be parsed as an 955 answer, but not a final answer, and so should not result in the 956 freeing of allocated resources. It may result in the start of media 957 transmission, if the answer does not specify an inactive media 958 direction. A description used as a "pranswer" may be applied as a 959 response to an "offer", or an update to a previously sent "pranswer". 961 "answer" indicates that a description should be parsed as an answer, 962 the offer-answer exchange should be considered complete, and any 963 resources (decoders, candidates) that are no longer needed can be 964 released. A description used as an "answer" may be applied as a 965 response to an "offer", or an update to a previously sent "pranswer". 
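The rules for when "offer", "pranswer", and "answer" may be applied can be summarized as a small transition table. The Python sketch below is illustrative only: the state names ("stable", "have-local-offer", and so on) are borrowed from the W3C API's signaling states and are an assumption here, as are the "local"/"remote" labels standing for setLocalDescription and setRemoteDescription.

```python
# (current state, which side is being set, description type) -> next state.
# State names are assumed from the W3C API; they are not defined in this text.
TRANSITIONS = {
    # An offer may be applied in the stable state, or as an update to a
    # previously supplied but unanswered offer.
    ("stable", "local", "offer"): "have-local-offer",
    ("have-local-offer", "local", "offer"): "have-local-offer",
    ("stable", "remote", "offer"): "have-remote-offer",
    ("have-remote-offer", "remote", "offer"): "have-remote-offer",
    # A pranswer responds to an offer or updates an earlier pranswer.
    ("have-remote-offer", "local", "pranswer"): "have-local-pranswer",
    ("have-local-pranswer", "local", "pranswer"): "have-local-pranswer",
    ("have-local-offer", "remote", "pranswer"): "have-remote-pranswer",
    ("have-remote-pranswer", "remote", "pranswer"): "have-remote-pranswer",
    # A final answer completes the exchange and returns to stable.
    ("have-remote-offer", "local", "answer"): "stable",
    ("have-local-pranswer", "local", "answer"): "stable",
    ("have-local-offer", "remote", "answer"): "stable",
    ("have-remote-pranswer", "remote", "answer"): "stable",
}


def apply_description(state, side, sdp_type):
    """Return the next state, or raise ValueError if the type is not allowed."""
    try:
        return TRANSITIONS[(state, side, sdp_type)]
    except KeyError:
        raise ValueError(
            f"cannot apply {side} {sdp_type!r} in state {state!r}"
        ) from None
```

An answerer would, for example, move from "stable" to "have-remote-offer" on receiving an offer, optionally through "have-local-pranswer", and back to "stable" once its final answer is set.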
967 The only difference between a provisional and final answer is that 968 the final answer results in the freeing of any unused resources that 969 were allocated as a result of the offer. As such, the application 970 can use some discretion on whether an answer should be applied as 971 provisional or final, and can change the type of the session 972 description as needed. For example, in a serial forking scenario, an 973 application may receive multiple "final" answers, one from each 974 remote endpoint. The application could choose to accept the initial 975 answers as provisional answers, and only apply an answer as final 976 when it receives one that meets its criteria (e.g. a live user 977 instead of voicemail). 979 "rollback" is a special session description type implying that the 980 state machine should be rolled back to the previous state, as 981 described in Section 4.1.4.2. The contents MUST be empty. 983 4.1.4.1. Use of Provisional Answers 985 Most web applications will not need to create answers using the 986 "pranswer" type. While it is good practice to send an immediate 987 response to an "offer", in order to warm up the session transport and 988 prevent media clipping, the preferred handling for a web application 989 would be to create and send an "inactive" final answer immediately 990 after receiving the offer. Later, when the called user actually 991 accepts the call, the application can create a new "sendrecv" offer 992 to update the previous offer/answer pair and start the media flow. 993 While this could also be done with an inactive "pranswer", followed 994 by a sendrecv "answer", the initial "pranswer" leaves the offer- 995 answer exchange open, which means that neither side can send an 996 updated offer during this time. 998 As an example, consider a typical web application that will set up a 999 data channel, an audio channel, and a video channel. 
When an 1000 endpoint receives an offer with these channels, it could send an 1001 answer accepting the data channel for two-way data, and accepting the 1002 audio and video tracks as inactive or receive-only. It could then 1003 ask the user to accept the call, acquire the local media streams, and 1004 send a new offer to the remote side moving the audio and video to be 1005 two-way media. By the time the human has accepted the call and 1006 triggered the new offer, it is likely that the ICE and DTLS 1007 handshaking for all the channels will already have finished. 1009 Of course, some applications may not be able to perform this double 1010 offer-answer exchange, particularly ones that are attempting to 1011 gateway to legacy signaling protocols. In these cases, "pranswer" 1012 can still provide the application with a mechanism to warm up the 1013 transport. 1015 4.1.4.2. Rollback 1017 In certain situations it may be desirable to "undo" a change made to 1018 setLocalDescription or setRemoteDescription. Consider a case where a 1019 call is ongoing, and one side wants to change some of the session 1020 parameters; that side generates an updated offer and then calls 1021 setLocalDescription. However, the remote side, either before or 1022 after setRemoteDescription, decides it does not want to accept the 1023 new parameters, and sends a reject message back to the offerer. Now, 1024 the offerer, and possibly the answerer as well, need to return to a 1025 stable state and the previous local/remote description. To support 1026 this, we introduce the concept of "rollback". 1028 A rollback discards any proposed changes to the session, returning 1029 the state machine to the stable state, and setting the modified local 1030 and/or remote description back to their previous values. 
Any 1031 resources or candidates that were allocated by the abandoned local 1032 description are discarded; any media that is received will be 1033 processed according to the previous local and remote descriptions. 1034 Rollback can only be used to cancel proposed changes; there is no 1035 support for rolling back from a stable state to a previous stable 1036 state. Note that this implies that once the answerer has performed 1037 setLocalDescription with his answer, this cannot be rolled back. 1039 A rollback is performed by supplying a session description of type 1040 "rollback" with empty contents to either setLocalDescription or 1041 setRemoteDescription, depending on which was most recently used (i.e. 1042 if the new offer was supplied to setLocalDescription, the rollback 1043 should be done using setLocalDescription as well). 1045 4.1.5. setLocalDescription 1047 The setLocalDescription method instructs the PeerConnection to apply 1048 the supplied session description as its local configuration. The 1049 type field indicates whether the description should be processed as 1050 an offer, provisional answer, or final answer; offers and answers are 1051 checked differently, using the various rules that exist for each SDP 1052 line. 1054 This API changes the local media state; among other things, it sets 1055 up local resources for receiving and decoding media. In order to 1056 successfully handle scenarios where the application wants to offer to 1057 change from one media format to a different, incompatible format, the 1058 PeerConnection must be able to simultaneously support use of both the 1059 old and new local descriptions (e.g. support codecs that exist in 1060 both descriptions) until a final answer is received, at which point 1061 the PeerConnection can fully adopt the new local description, or roll 1062 back to the old description if the remote side denied the change. 1064 This API indirectly controls the candidate gathering process. 
When a 1065 local description is supplied, and the number of transports currently 1066 in use does not match the number of transports needed by the local 1067 description, the PeerConnection will create transports as needed and 1068 begin gathering candidates for them. 1070 If setRemoteDescription was previously called with an offer, and 1071 setLocalDescription is called with an answer (provisional or final), 1072 and the media directions are compatible, and media are available to 1073 send, this will result in the starting of media transmission. 1075 4.1.6. setRemoteDescription 1077 The setRemoteDescription method instructs the PeerConnection to apply 1078 the supplied session description as the desired remote configuration. 1079 As in setLocalDescription, the type field of the description 1080 indicates how it should be processed. 1082 This API changes the local media state; among other things, it sets 1083 up local resources for sending and encoding media. 1085 If setLocalDescription was previously called with an offer, and 1086 setRemoteDescription is called with an answer (provisional or final), 1087 and the media directions are compatible, and media are available to 1088 send, this will result in the starting of media transmission. 1090 4.1.7. localDescription 1092 The localDescription method returns a copy of the current local 1093 configuration, i.e. what was most recently passed to 1094 setLocalDescription, plus any local candidates that have been 1095 generated by the ICE Agent. 1097 [[OPEN ISSUE: Do we need to expose accessors for both the current and 1098 proposed local description? https://github.com/rtcweb-wg/jsep/ 1099 issues/16]] 1101 A null object will be returned if the local description has not yet 1102 been established. 1104 4.1.8. remoteDescription 1106 The remoteDescription method returns a copy of the current remote 1107 configuration, i.e.
what was most recently passed to 1108 setRemoteDescription, plus any remote candidates that have been 1109 supplied via addIceCandidate. 1111 [[OPEN ISSUE: Do we need to expose accessors for both the current and 1112 proposed remote description? https://github.com/rtcweb-wg/jsep/ 1113 issues/16]] 1115 A null object will be returned if the remote description has not yet 1116 been established. 1118 4.1.9. canTrickleIceCandidates 1120 The canTrickleIceCandidates property indicates whether the remote 1121 side supports receiving trickled candidates. There are three 1122 potential values: 1124 null: No SDP has been received from the other side, so it is not 1125 known if it can handle trickle. This is the initial value before 1126 setRemoteDescription() is called. 1128 true: SDP has been received from the other side indicating that it 1129 can support trickle. 1131 false: SDP has been received from the other side indicating that it 1132 cannot support trickle. 1134 As described in Section 3.4.2, JSEP implementations always provide 1135 candidates to the application individually, consistent with what is 1136 needed for Trickle ICE. However, applications can use the 1137 canTrickleIceCandidates property to determine whether their peer can 1138 actually do Trickle ICE, i.e., whether it is safe to send an initial 1139 offer or answer followed later by candidates as they are gathered. 1140 As "true" is the only value that definitively indicates remote 1141 Trickle ICE support, an application which compares 1142 canTrickleIceCandidates against "true" will by default attempt Half 1143 Trickle on initial offers and Full Trickle on subsequent interactions 1144 with a Trickle ICE-compatible agent. 1146 4.1.10. setConfiguration 1148 The setConfiguration method allows the global configuration of the 1149 PeerConnection, which was initially set by constructor parameters, to 1150 be changed during the session.
The effects of this method call 1151 depend on when it is invoked, and differ depending on which specific 1152 parameters are changed: 1154 o Any changes to the STUN/TURN servers to use affect the next 1155 gathering phase. If gathering has already occurred, this will 1156 cause the next call to createOffer to generate new ICE 1157 credentials, for the purpose of forcing an ICE restart and kicking 1158 off a new gathering phase, in which the new servers will be used. 1160 If the ICE candidate pool has a nonzero size, any existing 1161 candidates will be discarded, and new candidates will be gathered 1162 from the new servers. 1164 o Any changes to the ICE candidate policy also affect the next 1165 gathering phase, in similar fashion to the server changes 1166 described above. Note though that changes to the policy have no 1167 effect on the candidate pool, because pooled candidates are not 1168 surfaced to the application until a gathering phase occurs, and so 1169 any necessary filtering can still be done on any pooled 1170 candidates. 1172 o Any changes to the ICE candidate pool size take effect 1173 immediately; if increased, additional candidates are pre-gathered; 1174 if decreased, the now-superfluous candidates are discarded. 1176 o The BUNDLE and RTCP-multiplexing policies MUST NOT be changed 1177 after the construction of the PeerConnection. 1179 This call may result in a change to the state of the ICE Agent, and 1180 may result in a change to media state if it results in connectivity 1181 being established. 1183 4.1.11. addIceCandidate 1185 The addIceCandidate method provides a remote candidate to the ICE 1186 Agent, which, if parsed successfully, will be added to the remote 1187 description according to the rules defined for Trickle ICE. 1188 Connectivity checks will be sent to the new candidate. 
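Before a remote candidate can be added, it has to be associated with an m= line. The Python sketch below illustrates the rule stated earlier for the IceCandidate object: use the MID if present, and otherwise fall back to the zero-based m= line index. The parameter names sdp_mid and sdp_mline_index and the list-of-MIDs representation of the remote description are assumptions made for this example.

```python
def resolve_media_section(mids, sdp_mid=None, sdp_mline_index=None):
    """Return the index of the m= line a remote candidate belongs to.

    mids lists the a=mid value of each m= line in the remote description,
    in order; an entry may be None if that m= line carries no a=mid.
    """
    if sdp_mid is not None:
        # The MID takes precedence whenever it is present.
        if sdp_mid not in mids:
            raise ValueError(f"unknown MID {sdp_mid!r}")
        return mids.index(sdp_mid)
    if sdp_mline_index is not None and 0 <= sdp_mline_index < len(mids):
        # Fall back to the index, e.g. for a non-JSEP endpoint.
        return sdp_mline_index
    raise ValueError("candidate does not identify a valid m= line")
```

If both fields are supplied and disagree, this sketch follows the MID and ignores the index.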
1190 This call will result in a change to the state of the ICE Agent, and 1191 may result in a change to media state if it results in connectivity 1192 being established. 1194 5. SDP Interaction Procedures 1196 This section describes the specific procedures to be followed when 1197 creating and parsing SDP objects. 1199 5.1. Requirements Overview 1201 JSEP implementations must comply with the specifications listed below 1202 that govern the creation and processing of offers and answers. 1204 The first set of specifications is the "mandatory-to-implement" set. 1205 All implementations must support these behaviors, but may not use all 1206 of them if the remote side, which may not be a JSEP endpoint, does 1207 not support them. 1209 The second set of specifications is the "mandatory-to-use" set. The 1210 local JSEP endpoint and any remote endpoint must indicate support for 1211 these specifications in their session descriptions. 1213 5.1.1. Implementation Requirements 1215 This list of mandatory-to-implement specifications is derived from 1216 the requirements outlined in [I-D.ietf-rtcweb-rtp-usage]. 1218 R-1 [RFC4566] is the base SDP specification and MUST be 1219 implemented. 1221 R-2 [RFC5764] MUST be supported for signaling the UDP/TLS/RTP/SAVPF 1222 [RFC5764] and TCP/DTLS/RTP/SAVPF 1223 [I-D.nandakumar-mmusic-proto-iana-registration] RTP profiles. 1225 R-3 [RFC5245] MUST be implemented for signaling the ICE credentials 1226 and candidate lines corresponding to each media stream. The 1227 ICE implementation MUST be a Full implementation, not a Lite 1228 implementation. 1230 R-4 [RFC5763] MUST be implemented to signal DTLS certificate 1231 fingerprints. 1233 R-5 [RFC4568] MUST NOT be implemented to signal SDES SRTP keying 1234 information. 1236 R-6 The [RFC5888] grouping framework MUST be implemented for 1237 signaling grouping information, and MUST be used to identify m= 1238 lines via the a=mid attribute. 
1240 R-7 [I-D.ietf-mmusic-msid] MUST be supported, in order to signal 1241 associations between RTP objects and W3C MediaStreams and 1242 MediaStreamTracks in a standard way. 1244 R-8 The bundle mechanism in 1245 [I-D.ietf-mmusic-sdp-bundle-negotiation] MUST be supported to 1246 signal the ability to multiplex RTP streams on a single UDP 1247 port, in order to avoid excessive use of port number resources. 1249 R-9 The SDP attributes of "sendonly", "recvonly", "inactive", and 1250 "sendrecv" from [RFC4566] MUST be implemented to signal 1251 information about media direction. 1253 R-10 [RFC5576] MUST be implemented to signal RTP SSRC values and 1254 grouping semantics. 1256 R-11 [RFC4585] MUST be implemented to signal RTCP based feedback. 1258 R-12 [RFC5761] MUST be implemented to signal multiplexing of RTP and 1259 RTCP. 1261 R-13 [RFC5506] MUST be implemented to signal reduced-size RTCP 1262 messages. 1264 R-14 [RFC4588] MUST be implemented to signal RTX payload type 1265 associations. 1267 R-15 [RFC3556] with bandwidth modifiers MAY be supported for 1268 specifying RTCP bandwidth as a fraction of the media bandwidth, 1269 RTCP fraction allocated to the senders and setting maximum 1270 media bit-rate boundaries. 1272 R-16 TODO: any others? 1274 As required by [RFC4566], Section 5.13, JSEP implementations MUST 1275 ignore unknown attribute (a=) lines. 1277 5.1.2. Usage Requirements 1279 All session descriptions handled by JSEP endpoints, both local and 1280 remote, MUST indicate support for the following specifications. If 1281 any of these are absent, this omission MUST be treated as an error. 1283 R-1 ICE, as specified in [RFC5245], MUST be used. Note that the 1284 remote endpoint may use a Lite implementation; implementations 1285 MUST properly handle remote endpoints which do ICE-Lite. 1287 R-2 DTLS [RFC6347] or DTLS-SRTP [RFC5763], MUST be used, as 1288 appropriate for the media type, as specified in 1289 [I-D.ietf-rtcweb-security-arch] 1291 5.1.3. 
Profile Names and Interoperability 1293 For media m= sections, JSEP endpoints MUST support both the "UDP/TLS/RTP/SAVPF" 1294 and "TCP/DTLS/RTP/SAVPF" profiles and MUST indicate one of 1295 these two profiles for each media m= line they produce in an offer. 1296 For data m= sections, JSEP endpoints MUST support both the "UDP/DTLS/SCTP" 1297 and "TCP/DTLS/SCTP" profiles and MUST indicate one of these two 1298 profiles for each data m= line they produce in an offer. Because ICE 1299 can select either TCP or UDP transport depending on network 1300 conditions, both advertisements are consistent with ICE eventually 1301 selecting either UDP or TCP. 1303 Unfortunately, in an attempt at compatibility, some endpoints 1304 generate other profile strings even when they mean to support one of 1305 these profiles. For instance, an endpoint might generate "RTP/AVP" 1306 but supply "a=fingerprint" and "a=rtcp-fb" attributes, indicating its 1307 willingness to support "(UDP,TCP)/TLS/RTP/SAVPF". In order to 1308 simplify compatibility with such endpoints, JSEP endpoints MUST 1309 apply the following rules when processing the media m= sections in 1310 an offer: 1312 o The profile in any "m=" line in any answer MUST exactly match the 1313 profile provided in the offer. 1315 o Any profile matching the following patterns MUST be accepted: 1316 "RTP/[S]AVP[F]" and "(UDP/TCP)/TLS/RTP/SAVP[F]". 1318 o Because DTLS-SRTP is REQUIRED, the choice of SAVP or AVP has no 1319 effect; support for DTLS-SRTP is determined by the presence of the 1320 "a=fingerprint" attribute. Note that lack of an "a=fingerprint" 1321 attribute will lead to negotiation failure. 1323 o The use of AVPF or AVP simply controls the timing rules used for 1324 RTCP feedback. If AVPF is provided, or an "a=rtcp-fb" attribute 1325 is present, assume AVPF timing, i.e., a default value of "trr-int=0". 1326 Otherwise, assume that AVPF is being used in an AVP 1327 compatible mode and use AVP timing, i.e., "trr-int=4".
1329 o For data m= sections, JSEP endpoints MUST support receiving the 1330 "UDP/DTLS/SCTP", "TCP/DTLS/SCTP", or "DTLS/SCTP" (for backwards 1331 compatibility) profiles. 1333 Note that re-offers by JSEP endpoints MUST use the correct profile 1334 strings even if the initial offer/answer exchange used an (incorrect) 1335 older profile string. 1337 5.2. Constructing an Offer 1339 When createOffer is called, a new SDP description must be created 1340 that includes the functionality specified in 1341 [I-D.ietf-rtcweb-rtp-usage]. The exact details of this process are 1342 explained below. 1344 5.2.1. Initial Offers 1346 When createOffer is called for the first time, the result is known as 1347 the initial offer. 1349 The first step in generating an initial offer is to generate session-level 1350 attributes, as specified in [RFC4566], Section 5. 1351 Specifically: 1353 o The first SDP line MUST be "v=0", as specified in [RFC4566], 1354 Section 5.1. 1356 o The second SDP line MUST be an "o=" line, as specified in 1357 [RFC4566], Section 5.2. The value of the <username> field SHOULD 1358 be "-". The value of the <sess-id> field SHOULD be a 1359 cryptographically random number. To ensure uniqueness, this 1360 number SHOULD be at least 64 bits long. The value of the <sess-version> field SHOULD be zero. The value of the 1362 <nettype> <addrtype> <unicast-address> tuple SHOULD be set to a non-meaningful 1363 address, such as IN IP4 0.0.0.0, to prevent leaking the 1364 local address in this field. As mentioned in [RFC4566], the 1365 entire o= line needs to be unique, but selecting a random number 1366 for <sess-id> is sufficient to accomplish this. 1368 o The third SDP line MUST be a "s=" line, as specified in [RFC4566], 1369 Section 5.3; to match the "o=" line, a single dash SHOULD be used 1370 as the session name, e.g., "s=-". Note that this differs from the 1371 advice in [RFC4566], which proposes a single space, but as both 1372 "o=" and "s=" are meaningless, having the same meaningless value 1373 seems clearer.
1375 o Session Information ("i="), URI ("u="), Email Address ("e="), 1376 Phone Number ("p="), Bandwidth ("b="), Repeat Times ("r="), and 1377 Time Zones ("z=") lines are not useful in this context and SHOULD 1378 NOT be included. 1380 o Encryption Keys ("k=") lines do not provide sufficient security 1381 and MUST NOT be included. 1383 o A "t=" line MUST be added, as specified in [RFC4566], Section 5.9; 1384 both <start-time> and <stop-time> SHOULD be set to zero, e.g., "t=0 0". 1385 1387 o An "a=msid-semantic:WMS" line MUST be added, as specified in 1388 [I-D.ietf-mmusic-msid], Section 4. 1390 The next step is to generate m= sections, as specified in [RFC4566], 1391 Section 5.14, for each MediaStreamTrack that has been added to the 1392 PeerConnection via the addStream method. (Note that this method 1393 takes a MediaStream, which can contain multiple MediaStreamTracks, 1394 and therefore multiple m= sections can be generated even if addStream 1395 is only called once.) m= sections MUST be sorted first by the order in 1396 which the MediaStreams were added to the PeerConnection, and then by 1397 the alphabetical ordering of the media type for the MediaStreamTrack. 1398 For example, if a MediaStream containing both an audio and a video 1399 MediaStreamTrack is added to a PeerConnection, the resultant m=audio 1400 section will precede the m=video section. If a second MediaStream 1401 containing an audio MediaStreamTrack were added, its m=audio section would follow the 1402 m=video section. 1404 Each m= section, provided it is not being bundled into another m= 1405 section, MUST generate a unique set of ICE credentials and gather its 1406 own unique set of ICE candidates. Otherwise, it MUST use the same 1407 ICE credentials and candidates as the m= section into which it is 1408 being bundled. Note that this means that for offers, any m= sections 1409 which are not bundle-only MUST have unique ICE credentials and 1410 candidates, since it is possible that the answerer will accept them 1411 without bundling them.
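The session-level rules listed earlier in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: session_level_lines is a hypothetical helper name, the standard-library secrets module is used as one possible source of cryptographically random numbers, and the line order follows [RFC4566], Section 5.

```python
import secrets
from typing import List


def session_level_lines() -> List[str]:
    """Build the session-level lines of an initial offer (illustrative only)."""
    # <sess-id>: a cryptographically random 64-bit number.
    sess_id = secrets.randbits(64)
    return [
        "v=0",
        # <username> "-", <sess-version> 0, and a non-meaningful address.
        f"o=- {sess_id} 0 IN IP4 0.0.0.0",
        "s=-",
        # <start-time> and <stop-time> both zero.
        "t=0 0",
        "a=msid-semantic:WMS",
    ]


# SDP lines are terminated by CRLF.
sdp_prefix = "\r\n".join(session_level_lines()) + "\r\n"
```

The m= sections would then be appended after this prefix, in the sort order described above.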
1413 For DTLS, all m= sections MUST use the certificate for the identity 1414 that has been specified for the PeerConnection; as a result, they 1415 MUST all have the same [RFC4572] fingerprint value, or this value 1416 MUST be a session-level attribute. 1418 Each m= section should be generated as specified in [RFC4566], 1419 Section 5.14. For the m= line itself, the following rules MUST be 1420 followed: 1422 o The port value would normally be set to the port of the default ICE candidate for 1423 this m= section, but given that no candidates have yet been 1424 gathered, the "dummy" port value of 9 (Discard) MUST be used, as 1425 indicated in [I-D.ietf-mmusic-trickle-ice], Section 5.1. 1427 o To properly indicate use of DTLS, the <proto> field MUST be set to 1428 "UDP/TLS/RTP/SAVPF", as specified in [RFC5764], Section 8, if the 1429 default candidate uses UDP transport, or "TCP/DTLS/RTP/SAVPF", as 1430 specified in [I-D.nandakumar-mmusic-proto-iana-registration], if the 1431 default candidate uses TCP transport. 1433 The m= line MUST be followed immediately by a "c=" line, as specified 1434 in [RFC4566], Section 5.7. Again, as no candidates have yet been 1435 gathered, the "c=" line must contain the "dummy" value "IN IP6 ::", 1436 as defined in [I-D.ietf-mmusic-trickle-ice], Section 5.1. 1438 Each m= section MUST include the following attribute lines: 1440 o An "a=mid" line, as specified in [RFC5888], Section 4. When 1441 generating mid values, it is RECOMMENDED that the values be 3 1442 bytes or less, to allow them to efficiently fit into the RTP 1443 header extension defined in 1444 [I-D.ietf-mmusic-sdp-bundle-negotiation], Section 11. 1446 o An "a=rtcp" line, as specified in [RFC3605], Section 2.1, 1447 containing the dummy value "9 IN IP6 ::", because no candidates 1448 have yet been gathered. 1450 o An "a=msid" line, as specified in [I-D.ietf-mmusic-msid], 1451 Section 2. 1453 o An "a=sendrecv" line, as specified in [RFC3264], Section 5.1.
1455 o For each supported codec, "a=rtpmap" and "a=fmtp" lines, as 1456 specified in [RFC4566], Section 6. The audio and video codecs 1457 that MUST be supported are specified in [I-D.ietf-rtcweb-audio] 1458 (see Section 3) and [I-D.ietf-rtcweb-video] (see Section 5). 1460 o If this m= section is for media with configurable frame sizes, 1461 e.g. audio, an "a=maxptime" line, indicating the smallest of the 1462 maximum supported frame sizes out of all codecs included above, as 1463 specified in [RFC4566], Section 6. 1465 o If this m= section is for video media, an "a=imageattr" line, as 1466 specified in Section 3.6. 1468 o For each primary codec where RTP retransmission should be used, a 1469 corresponding "a=rtpmap" line indicating "rtx" with the clock rate 1470 of the primary codec and an "a=fmtp" line that references the 1471 payload type of the primary codec, as specified in [RFC4588], 1472 Section 8.1. 1474 o For each supported FEC mechanism, "a=rtpmap" and "a=fmtp" lines, 1475 as specified in [RFC4566], Section 6. The FEC mechanisms that 1476 MUST be supported are specified in [I-D.ietf-rtcweb-fec], 1477 Section 6, and specific usage for each media type is outlined in 1478 Sections 4 and 5. 1480 o "a=ice-ufrag" and "a=ice-pwd" lines, as specified in [RFC5245], 1481 Section 15.4. 1483 o An "a=ice-options" line, with the "trickle" option, as specified 1484 in [I-D.ietf-mmusic-trickle-ice], Section 4. 1486 o An "a=fingerprint" line, as specified in [RFC4572], Section 5; the 1487 algorithm used for the fingerprint MUST match that used in the 1488 certificate signature. 1490 o An "a=setup" line, as specified in [RFC4145], Section 4, and 1491 clarified for use in DTLS-SRTP scenarios in [RFC5763], Section 5. 1492 The role value in the offer MUST be "actpass". 1494 o An "a=rtcp-mux" line, as specified in [RFC5761], Section 5.1.1. 1496 o An "a=rtcp-rsize" line, as specified in [RFC5506], Section 5. 
1498 o For each supported RTP header extension, an "a=extmap" line, as 1499 specified in [RFC5285], Section 5. The list of header extensions 1500 that SHOULD/MUST be supported is specified in 1501 [I-D.ietf-rtcweb-rtp-usage], Section 5.2. Any header extensions 1502 that require encryption MUST be specified as indicated in 1503 [RFC6904], Section 4. 1505 o For each supported RTCP feedback mechanism, an "a=rtcp-fb" 1506 mechanism, as specified in [RFC4585], Section 4.2. The list of 1507 RTCP feedback mechanisms that SHOULD/MUST be supported is 1508 specified in [I-D.ietf-rtcweb-rtp-usage], Section 5.1. 1510 o An "a=ssrc" line, as specified in [RFC5576], Section 4.1, 1511 indicating the SSRC to be used for sending media, along with the 1512 mandatory "cname" source attribute, as specified in Section 6.1, 1513 indicating the CNAME for the source. The CNAME must be generated 1514 in accordance with [RFC7022] and Section 3.5. 1516 o If RTX is supported for this media type, another "a=ssrc" line 1517 with the RTX SSRC, and an "a=ssrc-group" line, as specified in 1518 [RFC5576], section 4.2, with semantics set to "FID" and including 1519 the primary and RTX SSRCs. 1521 o If FEC is supported for this media type, another "a=ssrc" line 1522 with the FEC SSRC, and an "a=ssrc-group" line with semantics set 1523 to "FEC-FR" and including the primary and FEC SSRCs, as specified 1524 in [RFC5956], section 4.3. For simplicity, if both RTX and FEC 1525 are supported, the FEC SSRC MUST be the same as the RTX SSRC. 1527 o [OPEN ISSUE: Handling of a=imageattr] 1529 o If the BUNDLE policy for this PeerConnection is set to "max- 1530 bundle", and this is not the first m= section, or the BUNDLE 1531 policy is set to "balanced", and this is not the first m= section 1532 for this media type, an "a=bundle-only" line. 1534 Lastly, if a data channel has been created, a m= section MUST be 1535 generated for data. 
The <media> field MUST be set to "application" 1536 and the <proto> field MUST be set to "UDP/DTLS/SCTP" if the default 1537 candidate uses UDP transport, or "TCP/DTLS/SCTP" if the default 1538 candidate uses TCP transport [I-D.ietf-mmusic-sctp-sdp]. The "fmt" 1539 value MUST be set to the SCTP port number, as specified in 1540 Section 4.1. [TODO: update this to use a=sctp-port, as indicated in 1541 the latest data channel docs] 1543 Within the data m= section, the "a=mid", "a=ice-ufrag", "a=ice-pwd", 1544 "a=ice-options", "a=candidate", "a=fingerprint", and 1545 "a=setup" lines MUST be included as mentioned above, along with an 1546 "a=sctpmap" line referencing the SCTP port number and specifying the 1547 application protocol indicated in [I-D.ietf-rtcweb-data-protocol]. 1548 [OPEN ISSUE: the -01 of this document is missing this information.] 1550 Once all m= sections have been generated, a session-level "a=group" 1551 attribute MUST be added as specified in [RFC5888]. This attribute 1552 MUST have semantics "BUNDLE", and MUST include the mid identifiers of 1553 each m= section. The effect of this is that the browser offers all 1554 m= sections as one BUNDLE group. However, whether the m= sections 1555 are bundle-only or not depends on the BUNDLE policy. 1557 The next step is to generate session-level lip sync groups as defined 1558 in [RFC5888], Section 7. For each MediaStream with more than one 1559 MediaStreamTrack, a group of type "LS" MUST be added that contains 1560 the mid values for each MediaStreamTrack in that MediaStream. 1562 Attributes which SDP permits to either be at the session level or the 1563 media level SHOULD generally be at the media level even if they are 1564 identical. This promotes readability, especially if one of a set of 1565 initially identical attributes is subsequently changed.
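The BUNDLE rules for initial offers can be sketched in Python as follows. The sketch combines the session-level "a=group" rule with the "a=bundle-only" rule for the "max-bundle" and "balanced" policies. The bundle_attributes function and its (mid, media type) input are hypothetical constructs for this example; treating any other policy value as producing no bundle-only sections is an assumption of the sketch, not a statement of this specification.

```python
from typing import List, Set, Tuple


def bundle_attributes(
    sections: List[Tuple[str, str]], policy: str
) -> Tuple[str, Set[str]]:
    """Return the a=group:BUNDLE line and the mids marked a=bundle-only.

    sections holds (mid, media type) pairs in m= section order.
    """
    # Every m= section is offered in a single BUNDLE group.
    group_line = " ".join(["a=group:BUNDLE"] + [mid for mid, _ in sections])

    bundle_only: Set[str] = set()
    seen_types: Set[str] = set()
    for index, (mid, media_type) in enumerate(sections):
        if policy == "max-bundle" and index > 0:
            # Not the first m= section overall.
            bundle_only.add(mid)
        elif policy == "balanced" and media_type in seen_types:
            # Not the first m= section for this media type.
            bundle_only.add(mid)
        seen_types.add(media_type)
    return group_line, bundle_only


offer_sections = [("a1", "audio"), ("v1", "video"), ("a2", "audio")]
group, only = bundle_attributes(offer_sections, "balanced")
```

With the "balanced" policy only the second audio section is bundle-only in this example, whereas "max-bundle" marks every section after the first.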
1567 Attributes other than the ones specified above MAY be included, 1568 except for the following attributes which are specifically 1569 incompatible with the requirements of [I-D.ietf-rtcweb-rtp-usage], 1570 and MUST NOT be included: 1572 o "a=crypto" 1574 o "a=key-mgmt" 1576 o "a=ice-lite" 1578 Note that when BUNDLE is used, any additional attributes that are 1579 added MUST follow the advice in [I-D.ietf-mmusic-sdp-mux-attributes] 1580 on how those attributes interact with BUNDLE. 1582 Note that these requirements are in some cases stricter than those of 1583 SDP. Implementations MUST be prepared to accept compliant SDP even 1584 if it would not conform to the requirements for generating SDP in 1585 this specification. 1587 5.2.2. Subsequent Offers 1589 When createOffer is called a second (or later) time, or is called 1590 after a local description has already been installed, the processing 1591 is somewhat different from that for an initial offer. 1593 If the initial offer was not applied using setLocalDescription, 1594 meaning the PeerConnection is still in the "stable" state, the steps 1595 for generating an initial offer should be followed, subject to the 1596 following restriction: 1598 o The fields of the "o=" line MUST stay the same except for the 1599 <sess-version> field, which MUST increment if the session 1600 description changes in any way, including the addition of ICE 1601 candidates. 1603 If the initial offer was applied using setLocalDescription, but an 1604 answer from the remote side has not yet been applied, meaning the 1605 PeerConnection is still in the "local-offer" state, an offer is 1606 generated by following the steps in the "stable" state above, along 1607 with these exceptions: 1609 o The "s=" and "t=" lines MUST stay the same. 1611 o Each "m=" and "c=" line MUST be filled in with the port, protocol, 1612 and address of the default candidate for the m= section, as 1613 described in [RFC5245], Section 4.3.
Each "a=rtcp" attribute line 1614 MUST also be filled in with the port and address of the 1615 appropriate default candidate, either the default RTP or RTCP 1616 candidate, depending on whether RTCP multiplexing is currently 1617 active or not. Note that if RTCP multiplexing is being offered, 1618 but not yet active, the default RTCP candidate MUST be used, as 1619 indicated in [RFC5761], Section 5.1.3. In each case, if no 1620 candidates of the desired type have yet been gathered, dummy 1621 values MUST be used, as described above. 1623 o Each "a=mid" line MUST stay the same. 1625 o Each "a=ice-ufrag" and "a=ice-pwd" line MUST stay the same, unless 1626 the ICE configuration has changed (either changes to the supported 1627 STUN/TURN servers, or the ICE candidate policy), or the 1628 "IceRestart" option (Section 5.2.3.3) was specified. 1630 o Within each m= section, for each candidate that has been gathered 1631 during the most recent gathering phase (see Section 3.4.1), an 1632 "a=candidate" line MUST be added, as specified in [RFC5245], 1633 Section 4.3, paragraph 3. If candidate gathering for the section 1634 has completed, an "a=end-of-candidates" attribute MUST be added, 1635 as described in [I-D.ietf-mmusic-trickle-ice], Section 9.3. 1637 o For MediaStreamTracks that are still present, the "a=msid", 1638 "a=ssrc", and "a=ssrc-group" lines MUST stay the same. 1640 o If any MediaStreamTracks have been removed, either through the 1641 removeStream method or by removing them from an added MediaStream, 1642 their m= sections MUST be marked as recvonly by changing the value 1643 of the [RFC3264] directional attribute to "a=recvonly". The 1644 "a=msid", "a=ssrc", and "a=ssrc-group" lines MUST be removed from 1645 the associated m= sections. 1647 o If any MediaStreamTracks have been added, and there exist m= 1648 sections of the appropriate media type with no associated 1649 MediaStreamTracks (i.e.
as described in the preceding paragraph), 1650 those m= sections MUST be recycled by adding the new 1651 MediaStreamTrack to the m= section. This is done by adding the 1652 necessary "a=msid", "a=ssrc", and "a=ssrc-group" lines to the 1653 recycled m= section, and removing the "a=recvonly" attribute. 1655 If the initial offer was applied using setLocalDescription, and an 1656 answer from the remote side has been applied using 1657 setRemoteDescription, meaning the PeerConnection is in the "remote- 1658 pranswer" or "stable" states, an offer is generated based on the 1659 negotiated session descriptions by following the steps mentioned for 1660 the "local-offer" state above, along with these exceptions: [OPEN 1661 ISSUE: should this be permitted in the remote-pranswer state? 1662 https://github.com/rtcweb-wg/jsep/issues/143] 1664 o If a m= section exists in the current local description, but does 1665 not have an associated local MediaStreamTrack (possibly because 1666 said MediaStreamTrack was removed since the last exchange), a m= 1667 section MUST still be generated in the new offer, as indicated in 1668 [RFC3264], Section 8. The disposition of this section will depend 1669 on the state of the remote MediaStreamTrack associated with this 1670 m= section. If one exists, and it is still in the "live" state, 1671 the new m= section MUST be marked as "a=recvonly", with no 1672 "a=msid" or related attributes present. If no remote 1673 MediaStreamTrack exists, or it is in the "ended" state, the m= 1674 section MUST be marked as rejected, by setting the port to zero, 1675 as indicated in [RFC3264], Section 8.2. 
1677 o If any MediaStreamTracks have been added, and there exist recvonly 1678 m= sections of the appropriate media type with no associated 1679 MediaStreamTracks, or rejected m= sections of any media type, 1680 those m= sections MUST be recycled, and a local MediaStreamTrack 1681 associated with these recycled m= sections until all such existing 1682 m= sections have been used. This includes any recvonly or 1683 rejected m= sections created by the preceding paragraph. 1685 In addition, for each non-recycled, non-rejected m= section in the 1686 new offer, the following adjustments are made based on the contents 1687 of the corresponding m= section in the current remote description: 1689 o The m= line and corresponding "a=rtpmap" and "a=fmtp" lines MUST 1690 only include codecs present in the remote description. 1692 o The RTP header extensions MUST only include those that are present 1693 in the remote description. 1695 o The RTCP feedback extensions MUST only include those that are 1696 present in the remote description. 1698 o The "a=rtcp-mux" line MUST only be added if present in the remote 1699 description. 1701 o The "a=rtcp-rsize" line MUST only be added if present in the 1702 remote description. 1704 The "a=group:BUNDLE" attribute MUST include the mid identifiers 1705 specified in the BUNDLE group in the most recent answer, minus any m= 1706 sections that have been marked as rejected, plus any newly added or 1707 re-enabled m= sections. In other words, the BUNDLE attribute must 1708 contain all m= sections that were previously bundled, as long as they 1709 are still alive, as well as any new m= sections. 1711 The "LS" groups are generated in the same way as with initial offers. 1713 5.2.3. Options Handling 1715 The createOffer method takes as a parameter an RTCOfferOptions 1716 object. Special processing is performed when generating a SDP 1717 description if the following options are present. 1719 5.2.3.1. 
OfferToReceiveAudio 1721 If the "OfferToReceiveAudio" option is specified, with an integer 1722 value of N, and M audio MediaStreamTracks have been added to the 1723 PeerConnection, the offer MUST include N non-rejected m= sections 1724 with media type "audio", even if N is greater than M. This allows 1725 the offerer to receive audio, including multiple independent streams, 1726 even when not sending it; accordingly, the directional attribute on 1727 the N-M audio m= sections without associated MediaStreamTracks MUST 1728 be set to recvonly. 1730 If N is set to a value less than M, the offer MUST mark the m= 1731 sections associated with the M-N most recently added (since the last 1732 setLocalDescription) MediaStreamTracks as sendonly. This allows the 1733 offerer to indicate that it does not want to receive audio on some or 1734 all of its newly created streams. For m= sections that have 1735 previously been negotiated, this setting has no effect. [TODO: refer 1736 to RTCRtpSender in the future] 1737 For backwards compatibility with pre-standard versions of this 1738 specification, a value of "true" is interpreted as equivalent to N=1, 1739 and "false" as N=0. 1741 5.2.3.2. OfferToReceiveVideo 1743 If the "OfferToReceiveVideo" option is specified, with an integer 1744 value of N, and M video MediaStreamTracks have been added to the 1745 PeerConnection, the offer MUST include N non-rejected m= sections 1746 with media type "video", even if N is greater than M. This allows 1747 the offerer to receive video, including multiple independent streams, 1748 even when not sending it; accordingly, the directional attribute on 1749 the N-M video m= sections without associated MediaStreamTracks MUST 1750 be set to recvonly. 1752 If N is set to a value less than M, the offer MUST mark the m= 1753 sections associated with the M-N most recently added (since the last 1754 setLocalDescription) MediaStreamTracks as sendonly. 
This allows the 1755 offerer to indicate that it does not want to receive video on some or 1756 all of its newly created streams. For m= sections that have 1757 previously been negotiated, this setting has no effect. [TODO: refer 1758 to RTCRtpSender in the future] 1760 For backwards compatibility with pre-standard versions of this 1761 specification, a value of "true" is interpreted as equivalent to N=1, 1762 and "false" as N=0. 1764 5.2.3.3. IceRestart 1766 If the "IceRestart" option is specified, with a value of "true", the 1767 offer MUST indicate an ICE restart by generating new ICE ufrag and 1768 pwd attributes, as specified in [RFC5245], Section 9.1.1.1. If this 1769 option is specified on an initial offer, it has no effect (since a 1770 new ICE ufrag and pwd are already generated). Similarly, if the ICE 1771 configuration has changed, this option has no effect, since new ufrag 1772 and pwd attributes will be generated automatically. This option is 1773 primarily useful for reestablishing connectivity in cases where 1774 failures are detected by the application. 1776 5.2.3.4. VoiceActivityDetection 1778 If the "VoiceActivityDetection" option is specified, with a value of 1779 "true", the offer MUST indicate support for silence suppression in 1780 the audio it receives by including comfort noise ("CN") codecs for 1781 each offered audio codec, as specified in [RFC3389], Section 5.1, 1782 except for codecs that have their own internal silence suppression 1783 support. For codecs that have their own internal silence suppression 1784 support, the appropriate fmtp parameters for that codec MUST be 1785 specified to indicate that silence suppression for received audio is 1786 desired. For example, when using the Opus codec, the "usedtx=1" 1787 parameter would be specified in the offer. 
This option allows the 1788 endpoint to significantly reduce the amount of audio bandwidth it 1789 receives, at the cost of some fidelity, depending on the quality of 1790 the remote VAD algorithm. 1792 5.3. Generating an Answer 1794 When createAnswer is called, a new SDP description must be created 1795 that is compatible with the supplied remote description as well as 1796 the requirements specified in [I-D.ietf-rtcweb-rtp-usage]. The exact 1797 details of this process are explained below. 1799 5.3.1. Initial Answers 1801 When createAnswer is called for the first time after a remote 1802 description has been provided, the result is known as the initial 1803 answer. If no remote description has been installed, an answer 1804 cannot be generated, and an error MUST be returned. 1806 Note that the remote description SDP may not have been created by a 1807 JSEP endpoint and may not conform to all the requirements listed in 1808 Section 5.2. For many cases, this is not a problem. However, if any 1809 mandatory SDP attributes are missing, or functionality listed as 1810 mandatory-to-use above is not present, this MUST be treated as an 1811 error, and MUST cause the affected m= sections to be marked as 1812 rejected. 1814 The first step in generating an initial answer is to generate 1815 session-level attributes. The process here is identical to that 1816 indicated in the Initial Offers section above. 1818 The next step is to generate lip sync groups as defined in [RFC5888], 1819 Section 7. For each MediaStream with more than one MediaStreamTrack, 1820 a group of type "LS" MUST be added that contains the mid values for 1821 each MediaStreamTrack in that MediaStream. In some cases this may 1822 result in adding a mid to a given LS group that was not in that LS 1823 group in the associated offer. Although this is not allowed by 1824 [RFC5888], it is allowed when implementing this specification. 
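The lip sync grouping rule above can be sketched in Python as follows. The lip_sync_group_lines function and its input mapping (MediaStream identifier to the mid values of the m= sections carrying that stream's tracks) are hypothetical constructs chosen for this example; the "a=group:LS" line format is that of [RFC5888].

```python
from typing import Dict, List


def lip_sync_group_lines(stream_mids: Dict[str, List[str]]) -> List[str]:
    """Return one a=group:LS line per MediaStream with more than one track."""
    lines: List[str] = []
    for mids in stream_mids.values():
        # A MediaStream with a single track needs no LS group.
        if len(mids) > 1:
            lines.append("a=group:LS " + " ".join(mids))
    return lines


# Stream "s1" carries audio and video; stream "s2" carries audio only.
ls_lines = lip_sync_group_lines({"s1": ["a1", "v1"], "s2": ["a2"]})
```

Here only stream "s1" yields a group line, since "s2" contains a single MediaStreamTrack.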

   The next step is to generate m= sections for each m= section that is
   present in the remote offer, as specified in [RFC3264], Section 6.
   For the purposes of this discussion, any session-level attributes in
   the offer that are also valid as media-level attributes SHALL be
   considered to be present in each m= section.

   The next step is to go through each offered m= section. If there is
   a local MediaStreamTrack of the same type which has been added to the
   PeerConnection via addStream and not yet associated with a m=
   section, and the specific m= section is either sendrecv or recvonly,
   the MediaStreamTrack will be associated with the m= section at this
   time. MediaStreamTracks are assigned to m= sections using the
   canonical order described in Section 5.2.1. If there are more m=
   sections of a certain type than MediaStreamTracks, some m= sections
   will not have an associated MediaStreamTrack. If there are more
   MediaStreamTracks of a certain type than compatible m= sections, only
   the first N MediaStreamTracks will be able to be associated in the
   constructed answer. The remainder will need to be associated in a
   subsequent offer.

   For each offered m= section, if the associated remote
   MediaStreamTrack has been stopped, and is therefore in state "ended",
   and no local MediaStreamTrack has been associated, the corresponding
   m= section in the answer MUST be marked as rejected by setting the
   port in the m= line to zero, as indicated in [RFC3264], Section 6,
   and further processing for this m= section can be skipped.

   Provided that is not the case, each m= section in the answer should
   then be generated as specified in [RFC3264], Section 6.1. For the m=
   line itself, the following rules must be followed:

   o  The port value would normally be set to the port of the default
      ICE candidate for this m= section, but given that no candidates
      have yet been gathered, the "dummy" port value of 9 (Discard) MUST
      be used, as indicated in [I-D.ietf-mmusic-trickle-ice],
      Section 5.1.

   o  The "proto" field MUST be set to exactly match the "proto" field
      for the corresponding m= line in the offer.

   The m= line MUST be followed immediately by a "c=" line, as specified
   in [RFC4566], Section 5.7. Again, as no candidates have yet been
   gathered, the "c=" line must contain the "dummy" value "IN IP6 ::",
   as defined in [I-D.ietf-mmusic-trickle-ice], Section 5.1.

   If the offer supports BUNDLE, all m= sections to be BUNDLEd must use
   the same ICE credentials and candidates; all m= sections not being
   BUNDLEd must use unique ICE credentials and candidates. Each m=
   section MUST include the following:

   o  If present in the offer, an "a=mid" line, as specified in
      [RFC5888], Section 9.1. The "mid" value MUST match that specified
      in the offer.

   o  An "a=rtcp" line, as specified in [RFC3605], Section 2.1,
      containing the dummy value "9 IN IP6 ::", because no candidates
      have yet been gathered.

   o  If a local MediaStreamTrack has been associated, an "a=msid" line,
      as specified in [I-D.ietf-mmusic-msid], Section 2.

   o  Depending on the directionality of the offer, the disposition of
      any associated remote MediaStreamTrack, and the presence of an
      associated local MediaStreamTrack, the appropriate directionality
      attribute, as specified in [RFC3264], Section 6.1. If the offer
      was sendrecv, and the remote MediaStreamTrack is still "live", and
      there is a local MediaStreamTrack that has been associated, the
      directionality MUST be set as sendrecv. If the offer was
      sendonly, and the remote MediaStreamTrack is still "live", the
      directionality MUST be set as recvonly. If the offer was
      recvonly, and a local MediaStreamTrack has been associated, the
      directionality MUST be set as sendonly. If the offer was
      inactive, the directionality MUST be set as inactive.

   o  For each supported codec that is present in the offer, "a=rtpmap"
      and "a=fmtp" lines, as specified in [RFC4566], Section 6, and
      [RFC3264], Section 6.1. The audio and video codecs that MUST be
      supported are specified in [I-D.ietf-rtcweb-audio] (see Section 3)
      and [I-D.ietf-rtcweb-video] (see Section 5). Note that for
      simplicity, the answerer MAY use different payload types for
      codecs than the offerer, as it is not prohibited by Section 6.1.

   o  If this m= section is for media with configurable frame sizes,
      e.g. audio, an "a=maxptime" line, indicating the smallest of the
      maximum supported frame sizes out of all codecs included above, as
      specified in [RFC4566], Section 6.

   o  If this m= section is for video media, an "a=imageattr" line, as
      specified in Section 3.6.

   o  If "rtx" is present in the offer, for each primary codec where RTP
      retransmission should be used, a corresponding "a=rtpmap" line
      indicating "rtx" with the clock rate of the primary codec and an
      "a=fmtp" line that references the payload type of the primary
      codec, as specified in [RFC4588], Section 8.1.

   o  For each supported FEC mechanism, "a=rtpmap" and "a=fmtp" lines,
      as specified in [RFC4566], Section 6. The FEC mechanisms that
      MUST be supported are specified in [I-D.ietf-rtcweb-fec],
      Section 6, and specific usage for each media type is outlined in
      Sections 4 and 5.

   o  "a=ice-ufrag" and "a=ice-pwd" lines, as specified in [RFC5245],
      Section 15.4.

   o  If the "trickle" ICE option is present in the offer, an
      "a=ice-options" line, with the "trickle" option, as specified in
      [I-D.ietf-mmusic-trickle-ice], Section 4.

   o  An "a=fingerprint" line, as specified in [RFC4572], Section 5; the
      algorithm used for the fingerprint MUST match that used in the
      certificate signature.

   o  An "a=setup" line, as specified in [RFC4145], Section 4, and
      clarified for use in DTLS-SRTP scenarios in [RFC5763], Section 5.
      The role value in the answer MUST be "active" or "passive"; the
      "active" role is RECOMMENDED.

   o  If present in the offer, an "a=rtcp-mux" line, as specified in
      [RFC5761], Section 5.1.1. If the "require" RTCP multiplexing
      policy is set and no "a=rtcp-mux" line is present in the offer,
      then the m= line MUST be marked as rejected by setting the port in
      the m= line to zero, as indicated in [RFC3264], Section 6.

   o  If present in the offer, an "a=rtcp-rsize" line, as specified in
      [RFC5506], Section 5.

   o  For each supported RTP header extension that is present in the
      offer, an "a=extmap" line, as specified in [RFC5285], Section 5.
      The list of header extensions that SHOULD/MUST be supported is
      specified in [I-D.ietf-rtcweb-rtp-usage], Section 5.2. Any header
      extensions that require encryption MUST be specified as indicated
      in [RFC6904], Section 4.

   o  For each supported RTCP feedback mechanism that is present in the
      offer, an "a=rtcp-fb" line, as specified in [RFC4585],
      Section 4.2. The list of RTCP feedback mechanisms that
      SHOULD/MUST be supported is specified in
      [I-D.ietf-rtcweb-rtp-usage], Section 5.1.

   o  If a local MediaStreamTrack has been associated, an "a=ssrc" line,
      as specified in [RFC5576], Section 4.1, indicating the SSRC to be
      used for sending media, along with the mandatory "cname" source
      attribute, as specified in Section 6.1, indicating the CNAME for
      the source. The CNAME must be generated in accordance with
      [RFC7022] and Section 3.5.

   o  If a local MediaStreamTrack has been associated, and RTX has been
      negotiated for this m= section, another "a=ssrc" line with the RTX
      SSRC, and an "a=ssrc-group" line, as specified in [RFC5576],
      Section 4.2, with semantics set to "FID" and including the primary
      and RTX SSRCs.

   o  If a local MediaStreamTrack has been associated, and FEC has been
      negotiated for this m= section, another "a=ssrc" line with the FEC
      SSRC, and an "a=ssrc-group" line with semantics set to "FEC-FR"
      and including the primary and FEC SSRCs, as specified in
      [RFC5956], Section 4.3. For simplicity, if both RTX and FEC are
      supported, the FEC SSRC MUST be the same as the RTX SSRC.

   o  [OPEN ISSUE: Handling of a=imageattr]

   If a data channel m= section has been offered, a m= section MUST also
   be generated for data. The "media" field MUST be set to
   "application" and the "proto" field MUST be set to exactly match the
   "proto" field in the offer; the "fmt" value MUST be set to the SCTP
   port number, as specified in Section 4.1. [TODO: update this to use
   a=sctp-port, as indicated in the latest data channel docs]

   Within the data m= section, the "a=mid", "a=ice-ufrag", "a=ice-pwd",
   "a=ice-options", "a=candidate", "a=fingerprint", and "a=setup" lines
   MUST be included as mentioned above, along with an "a=sctpmap" line
   referencing the SCTP port number and specifying the application
   protocol indicated in [I-D.ietf-rtcweb-data-protocol].
   [OPEN ISSUE: the -01 of this document is missing this information.]
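
   The directionality rules given in the attribute list earlier in this
   section can be sketched as follows. This is a minimal illustration
   under stated assumptions: the function and parameter names are
   hypothetical, and the handling of combinations that the text does
   not spell out (for example, a sendrecv offer with no local track) is
   an assumption based on the general rules of [RFC3264], Section 6.1.
   Rejection of the m= section, covered separately above, is not
   modeled.

```python
def answer_direction(offer_dir, remote_live, has_local_track):
    """Pick the direction attribute for one m= section of an answer.

    offer_dir:       direction in the offer ("sendrecv", "sendonly",
                     "recvonly" or "inactive")
    remote_live:     True if the associated remote track is still "live"
    has_local_track: True if a local track was associated with the section
    """
    # The answerer can send only if it has a track and the offerer receives.
    can_send = has_local_track and offer_dir in ("sendrecv", "recvonly")
    # The answerer receives only if the offerer sends and the track is live.
    can_recv = remote_live and offer_dir in ("sendrecv", "sendonly")

    if can_send and can_recv:
        return "sendrecv"
    if can_send:
        return "sendonly"
    if can_recv:
        return "recvonly"
    # Covers an inactive offer, plus cases the text leaves open (assumption).
    return "inactive"


print(answer_direction("sendonly", remote_live=True, has_local_track=True))
```

   The four cases named in the text (sendrecv, sendonly, recvonly, and
   inactive offers) all fall out of the two boolean conditions.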

   If "a=group" attributes with semantics of "BUNDLE" are offered,
   corresponding session-level "a=group" attributes MUST be added as
   specified in [RFC5888]. These attributes MUST have semantics
   "BUNDLE", and MUST include all the mid identifiers from the offered
   BUNDLE groups that have not been rejected. Note that regardless of
   the presence of "a=bundle-only" in the offer, no m= sections in the
   answer should have an "a=bundle-only" line.

   Attributes that are common between all m= sections MAY be moved to
   session-level, if explicitly defined to be valid at session-level.

   The attributes prohibited in the creation of offers are also
   prohibited in the creation of answers.

5.3.2. Subsequent Answers

   When createAnswer is called a second (or later) time, or is called
   after a local description has already been installed, the processing
   is somewhat different than for an initial answer.

   If the initial answer was not applied using setLocalDescription,
   meaning the PeerConnection is still in the "have-remote-offer" state,
   the steps for generating an initial answer should be followed,
   subject to the following restriction:

   o  The fields of the "o=" line MUST stay the same except for the
      "session-version" field, which MUST increment if the session
      description changes in any way from the previously generated
      answer.

   If any session description was previously supplied to
   setLocalDescription, an answer is generated by following the steps in
   the "have-remote-offer" state above, along with these exceptions:

   o  The "s=" and "t=" lines MUST stay the same.

   o  Each "m=" and "c=" line MUST be filled in with the port and address
      of the default candidate for the m= section, as described in
      [RFC5245], Section 4.3. Note, however, that the m= line protocol
      need not match the default candidate, because this protocol value
      must instead match what was supplied in the offer, as described
      above. Each "a=rtcp" attribute line MUST also be filled in with
      the port and address of the appropriate default candidate, either
      the default RTP or RTCP candidate, depending on whether RTCP
      multiplexing is enabled in the answer. In each case, if no
      candidates of the desired type have yet been gathered, dummy
      values MUST be used, as described in the initial answer section
      above.

   o  Each "a=ice-ufrag" and "a=ice-pwd" line MUST stay the same.

   o  Within each m= section, for each candidate that has been gathered
      during the most recent gathering phase (see Section 3.4.1), an
      "a=candidate" line MUST be added, as specified in [RFC5245],
      Section 4.3, paragraph 3. If candidate gathering for the section
      has completed, an "a=end-of-candidates" attribute MUST be added,
      as described in [I-D.ietf-mmusic-trickle-ice], Section 9.3.

   o  For MediaStreamTracks that are still present, the "a=msid",
      "a=ssrc", and "a=ssrc-group" lines MUST stay the same.

5.3.3. Options Handling

   The createAnswer method takes as a parameter an RTCAnswerOptions
   object. The set of parameters for RTCAnswerOptions is different than
   those supported in RTCOfferOptions; the OfferToReceiveAudio,
   OfferToReceiveVideo, and IceRestart options mentioned in
   Section 5.2.3 are meaningless in the context of generating an answer,
   as there is no need to generate extra m= lines in an answer, and ICE
   credentials will automatically be changed for all m= lines where the
   offerer chose to perform ICE restart.

   The following options are supported in RTCAnswerOptions.

5.3.3.1. VoiceActivityDetection

   Silence suppression in the answer is handled as described in
   Section 5.2.3.4.

5.4. Processing a Local Description

   When a SessionDescription is supplied to setLocalDescription, the
   following steps MUST be performed:

   o  First, the type of the SessionDescription is checked against the
      current state of the PeerConnection:

      *  If the type is "offer", the PeerConnection state MUST be either
         "stable" or "have-local-offer".

      *  If the type is "pranswer" or "answer", the PeerConnection state
         MUST be either "have-remote-offer" or "have-local-pranswer".

   o  If the type is not correct for the current state, processing MUST
      stop and an error MUST be returned.

   o  Next, the SessionDescription is parsed into a data structure, as
      described in Section 5.6 below. If parsing fails for any reason,
      processing MUST stop and an error MUST be returned.

   o  Finally, the parsed SessionDescription is applied as described in
      Section 5.7 below.

5.5. Processing a Remote Description

   When a SessionDescription is supplied to setRemoteDescription, the
   following steps MUST be performed:

   o  First, the type of the SessionDescription is checked against the
      current state of the PeerConnection:

      *  If the type is "offer", the PeerConnection state MUST be either
         "stable" or "have-remote-offer".

      *  If the type is "pranswer" or "answer", the PeerConnection state
         MUST be either "have-local-offer" or "have-remote-pranswer".

   o  If the type is not correct for the current state, processing MUST
      stop and an error MUST be returned.

   o  Next, the SessionDescription is parsed into a data structure, as
      described in Section 5.6 below. If parsing fails for any reason,
      processing MUST stop and an error MUST be returned.

   o  Finally, the parsed SessionDescription is applied as described in
      Section 5.8 below.

5.6. Parsing a Session Description

   [The behavior described herein is a draft version, and needs more
   discussion to resolve various open issues.]

   When a SessionDescription of any type is supplied to
   setLocal/RemoteDescription, the implementation must parse it and
   reject it if it is invalid. The exact details of this process are
   explained below.

   The SDP contained in the session description object consists of a
   sequence of text lines, each containing a key-value expression, as
   described in [RFC4566], Section 5. The SDP is read, line-by-line,
   and converted to a data structure that contains the deserialized
   information. However, SDP allows many types of lines, not all of
   which are relevant to JSEP applications. For each line, the
   implementation will first ensure it is syntactically correct
   according to its defining ABNF [TODO: reference], check that it
   conforms to [RFC4566] and [RFC3264] semantics, and then either parse
   and store or discard the provided value, as described below. [TODO:
   ensure that every line is listed below.] If the line is not
   well-formed, or cannot be parsed as described, the parser MUST stop
   with an error and reject the session description. This ensures that
   implementations do not accidentally misinterpret ambiguous SDP.

5.6.1. Session-Level Parsing

   First, the session-level lines are checked and parsed. These lines
   MUST occur in a specific order, and with a specific syntax, as
   defined in [RFC4566], Section 5. Note that while the specific line
   types (e.g. "v=", "c=") MUST occur in the defined order, lines of the
   same type (typically "a=") can occur in any order, and their ordering
   is not meaningful.

   For non-attribute (non-"a=") lines, their sequencing, syntax, and
   semantics are checked, as mentioned above. The following lines are
   not meaningful in the JSEP context and MAY be discarded once they
   have been checked.

      The "c=" line MUST be checked for syntax but its value is not
      used. This supersedes the guidance in [RFC5245], Section 6.1, to
      use "ice-mismatch" to indicate mismatches between "c=" and the
      candidate lines; because JSEP always uses ICE, "ice-mismatch" is
      not useful in this context.

      TODO

   The remaining lines are processed as follows:

      The "b=" line, if present, MUST be parsed as specified in
      [RFC4566], Section 5.8, and the bwtype and bandwidth values
      stored.

      [OPEN ISSUE: is this WG consensus? Are there other non-a= lines
      that we need to do more than just syntactical validation, e.g.
      v=?]

   Specific processing MUST be applied for the following session-level
   attribute ("a=") lines:

   o  Any "a=group" lines are parsed as specified in [RFC5888],
      Section 5, and the group's semantics and mids are stored.

   o  If present, a single "a=ice-lite" line is parsed as specified in
      [RFC5245], Section 15.3, and a value indicating the presence of
      ice-lite is stored.

   o  If present, a single "a=ice-ufrag" line is parsed as specified in
      [RFC5245], Section 15.4, and the ufrag value is stored.

   o  If present, a single "a=ice-pwd" line is parsed as specified in
      [RFC5245], Section 15.4, and the password value is stored.

   o  If present, a single "a=ice-options" line is parsed as specified
      in [RFC5245], Section 15.5, and the set of specified options is
      stored.

   o  Any "a=fingerprint" lines are parsed as specified in [RFC4572],
      Section 5, and the set of fingerprint and algorithm values is
      stored.

   o  If present, a single "a=setup" line is parsed as specified in
      [RFC4145], Section 4, and the setup value is stored.

   o  Any "a=extmap" lines are parsed as specified in [RFC5285],
      Section 5, and their values are stored.

   o  TODO: msid-semantic, identity, rtcp-rsize, rtcp-mux, and any other
      attribs valid at session level.

   Once all the session-level lines have been parsed, processing
   continues with the lines in media sections.

5.6.2. Media Section Parsing

   Like the session-level lines, the media section lines MUST occur in
   the specific order and with the specific syntax defined in [RFC4566],
   Section 5.

   The "m=" line itself MUST be parsed as described in [RFC4566],
   Section 5.14, and the media, port, proto, and fmt values stored.

   Following the "m=" line, specific processing MUST be applied for the
   following non-attribute lines:

   o  As with the "c=" line at the session level, the "c=" line MUST be
      parsed according to [RFC4566], Section 5.7, but its value is not
      used.

   o  The "b=" line, if present, MUST be parsed as specified in
      [RFC4566], Section 5.8, and the bwtype and bandwidth values
      stored.

   Specific processing MUST also be applied for the following attribute
   lines:

   o  If present, a single "a=ice-ufrag" line is parsed as specified in
      [RFC5245], Section 15.4, and the ufrag value is stored.

   o  If present, a single "a=ice-pwd" line is parsed as specified in
      [RFC5245], Section 15.4, and the password value is stored.

   o  If present, a single "a=ice-options" line is parsed as specified
      in [RFC5245], Section 15.5, and the set of specified options is
      stored.

   o  Any "a=fingerprint" lines are parsed as specified in [RFC4572],
      Section 5, and the set of fingerprint and algorithm values is
      stored.

   o  If present, a single "a=setup" line is parsed as specified in
      [RFC4145], Section 4, and the setup value is stored.
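
   As a rough illustration of the parsing steps above, the following
   sketch splits an SDP blob into session-level lines and m= sections,
   stores the media, port, proto, and fmt values of each "m=" line, and
   collects the ICE and DTLS attributes listed in this section. It is
   not a conforming parser: the function names and the returned
   dictionary layout are hypothetical, ordering and ABNF checks are
   omitted, and the port is assumed to carry no "/<number of ports>"
   suffix.

```python
# Attributes that may appear at most once per level, per the text above.
SINGLE_VALUED = {"ice-ufrag", "ice-pwd", "ice-options", "setup"}
# Attributes that may repeat ("Any ... lines").
MULTI_VALUED = {"fingerprint"}


def parse_attributes(lines):
    """Collect the a= attributes named above from one level's lines."""
    attrs = {}
    for line in lines:
        if not line.startswith("a="):
            continue  # non-attribute lines are ignored in this sketch
        name, _, value = line[2:].partition(":")
        if name in SINGLE_VALUED:
            if name in attrs:
                raise ValueError("duplicate a=" + name)
            attrs[name] = value
        elif name in MULTI_VALUED:
            attrs.setdefault(name, []).append(value)
    return attrs


def parse_sdp(sdp):
    """Return (session_attrs, sections) for an SDP string."""
    session_lines, sections = [], []
    for line in sdp.splitlines():
        if not line.strip():
            continue
        if line.startswith("m="):
            parts = line[2:].split(" ")
            if len(parts) < 4:
                raise ValueError("malformed m= line: " + line)
            sections.append({"media": parts[0], "port": int(parts[1]),
                             "proto": parts[2], "fmt": parts[3:],
                             "lines": []})
        elif sections:
            sections[-1]["lines"].append(line)  # belongs to the last m=
        else:
            session_lines.append(line)
    for section in sections:
        section["attrs"] = parse_attributes(section.pop("lines"))
    return parse_attributes(session_lines), sections
```

   A caller could then look up, for example, sections[0]["attrs"] to
   obtain the media-level values, and fall back to the session-level
   dictionary when a value is absent.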

   If the "m=" proto value indicates use of RTP, as described in
   Section 5.1.3 above, the following attribute lines MUST be
   processed:

   o  The "m=" fmt value MUST be parsed as specified in [RFC4566],
      Section 5.14, and the individual values stored.

   o  Any "a=rtpmap" or "a=fmtp" lines MUST be parsed as specified in
      [RFC4566], Section 6, and their values stored.

   o  If present, a single "a=ptime" line MUST be parsed as described in
      [RFC4566], Section 6, and its value stored.

   o  If present, a single direction attribute line (e.g. "a=sendrecv")
      MUST be parsed as described in [RFC4566], Section 6, and its value
      stored.

   o  Any "a=ssrc" or "a=ssrc-group" attributes MUST be parsed as
      specified in [RFC5576], Sections 4.1-4.2, and their values stored.

   o  Any "a=extmap" attributes MUST be parsed as specified in
      [RFC5285], Section 5, and their values stored.

   o  Any "a=rtcp-fb" attributes MUST be parsed as specified in
      [RFC4585], Section 4.2, and their values stored.

   o  If present, a single "a=rtcp-mux" line MUST be parsed as specified
      in [RFC5761], Section 5.1.1, and its presence or absence flagged
      and stored.

   o  TODO: a=rtcp-rsize, a=rtcp, a=msid, a=candidate,
      a=end-of-candidates

   Otherwise, if the "m=" proto value indicates use of SCTP, the
   following attribute lines MUST be processed:

   o  The "m=" fmt value MUST be parsed as specified in
      [I-D.ietf-mmusic-sctp-sdp], Section 4.3, and the application
      protocol value stored.

   o  An "a=sctp-port" attribute MUST be present, and it MUST be parsed
      as specified in [I-D.ietf-mmusic-sctp-sdp], Section 5.2, and the
      value stored.

   o  TODO: max message size

5.6.3. Semantics Verification

   Assuming parsing completes successfully, the parsed description is
   then evaluated to ensure internal consistency as well as proper
   support for mandatory features. Specifically, the following checks
   are performed:

   o  For each m= section, valid values for each of the mandatory-to-use
      features enumerated in Section 5.1.2 MUST be present. These
      values MAY either be present at the media level, or inherited from
      the session level.

      *  ICE ufrag and password values

      *  DTLS fingerprint and setup values

   If this session description is of type "pranswer" or "answer", the
   following additional checks are applied:

   o  The session description must follow the rules defined in
      [RFC3264], Section 6.

   o  For each m= section, the protocol value MUST exactly match the
      protocol value in the corresponding m= section in the associated
      offer.

5.7. Applying a Local Description

   The following steps are performed at the media engine level to apply
   a local description.

   First, the parsed parameters are checked to ensure that any
   modifications performed fall within those explicitly permitted by
   Section 6; otherwise, processing MUST stop and an error MUST be
   returned.

   Next, media sections are processed. For each media section, the
   following steps MUST be performed; if any parameters are out of
   bounds, or cannot be applied, processing MUST stop and an error MUST
   be returned.

   o  TODO

   Finally, if this description is of type "pranswer" or "answer",
   follow the processing defined in Section 5.9 below.

5.8. Applying a Remote Description

   TODO

5.9. Applying an Answer

   TODO

6. Configurable SDP Parameters

   It is possible to change elements in the SDP returned from
   createOffer before passing it to setLocalDescription. When an
   implementation receives modified SDP it MUST either:

   o  Accept the changes and adjust its behavior to match the SDP.

   o  Reject the changes and return an error via the error callback.

   Changes MUST NOT be silently ignored.

   The following elements of the session description MUST NOT be changed
   between the createOffer and the setLocalDescription (or between the
   createAnswer and the setLocalDescription), since they reflect
   transport attributes that are solely under browser control, and the
   browser MUST NOT honor an attempt to change them:

   o  The number, type and port number of m= lines.

   o  The generated ICE credentials (a=ice-ufrag and a=ice-pwd).

   o  The set of ICE candidates and their parameters (a=candidate).

   o  The DTLS fingerprint(s) (a=fingerprint).

   The following modifications, if done by the browser to a description
   between createOffer/createAnswer and the setLocalDescription, MUST be
   honored by the browser:

   o  Remove or reorder codecs (m=)

   The following parameters may be controlled by options passed into
   createOffer/createAnswer. As an open issue, these changes may also
   be performed by manipulating the SDP returned from
   createOffer/createAnswer, as indicated above, as long as the
   capabilities of the endpoint are not exceeded (e.g. asking for a
   resolution greater than what the endpoint can encode):

   o  [[OPEN ISSUE: This is a placeholder for other modifications, which
      we may continue adding as use cases appear.]]

   Implementations MAY choose to either honor or reject any elements not
   listed in the above two categories, but must do so explicitly as
   described at the beginning of this section. Note that future
   standards may add new SDP elements to the list of elements which must
   be accepted or rejected, but due to version skew, applications must
   be prepared for implementations to accept changes which must be
   rejected and vice versa.
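
   One way to picture the "MUST NOT be changed" list above is a check
   that compares the SDP produced by createOffer/createAnswer with the
   SDP later passed to setLocalDescription. The sketch below is only an
   illustration: the function names are hypothetical, no browser is
   known to implement the check this way, and treating a reordering of
   protected attribute lines as a change is a simplifying assumption.
   Codec removal or reordering within an m= line is deliberately not
   flagged, since the text allows it.

```python
# Attribute lines that reflect transport state under browser control.
PROTECTED_PREFIXES = ("a=ice-ufrag:", "a=ice-pwd:",
                      "a=candidate:", "a=fingerprint:")


def protected_view(sdp):
    """Extract the parts of an SDP string that must stay unchanged."""
    view = []
    for line in sdp.splitlines():
        if line.startswith("m="):
            # Keep only media type and port; the fmt list may be edited.
            media, port = line[2:].split(" ")[:2]
            view.append(("m", media, port))
        elif line.startswith(PROTECTED_PREFIXES):
            view.append(("a", line))
    return view


def has_prohibited_change(generated, modified):
    """True if the modified SDP alters a protected element."""
    return protected_view(generated) != protected_view(modified)


generated = ("v=0\n"
             "m=audio 9 UDP/TLS/RTP/SAVPF 96 0\n"
             "a=ice-ufrag:abcd\n")
reordered = generated.replace("SAVPF 96 0", "SAVPF 0 96")
print(has_prohibited_change(generated, reordered))
```

   An implementation that detects such a change would reject the
   description with an error rather than ignore the modification.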

   The application can also modify the SDP to reduce the capabilities in
   the offer it sends to the far side or the offer that it installs from
   the far side in any way the application sees fit, as long as it is a
   valid SDP offer and specifies a subset of what was in the original
   offer. This is safe because the answer is not permitted to expand
   capabilities and therefore will just respond to what is actually in
   the offer.

   As always, the application is solely responsible for what it sends to
   the other party, and all incoming SDP will be processed by the
   browser to the extent of its capabilities. It is an error to assume
   that all SDP is well-formed; however, one should be able to assume
   that any implementation of this specification will be able to
   process, as a remote offer or answer, unmodified SDP coming from any
   other implementation of this specification.

7. Examples

   Note that this example section shows several SDP fragments. To
   format in 72 columns, some of the lines in SDP have been split into
   multiple lines, where leading whitespace indicates that a line is a
   continuation of the previous line. In addition, some blank lines
   have been added to improve readability but are not valid in SDP.

   More examples of SDP for WebRTC call flows can be found in
   [I-D.nandakumar-rtcweb-sdp].

7.1. Simple Example

   This section shows a very simple example that sets up a minimal audio
   / video call between two browsers and does not use trickle ICE. The
   example in the following section provides a more realistic example of
   what would happen in a normal browser to browser connection.

   The flow shows Alice's browser initiating the session to Bob's
   browser. The messages from Alice's JS to Bob's JS are assumed to
   flow over some signaling protocol via a web server. The JS on both
   Alice's side and Bob's side waits for all candidates before sending
   the offer or answer, so the offers and answers are complete. Trickle
   ICE is not used. Both Alice and Bob are using the default policy of
   balanced.

   // set up local media state
   AliceJS->AliceUA: create new PeerConnection
   AliceJS->AliceUA: addStream with stream containing audio and video
   AliceJS->AliceUA: createOffer to get offer
   AliceJS->AliceUA: setLocalDescription with offer
   AliceUA->AliceJS: multiple onicecandidate events with candidates

   // wait for ICE gathering to complete
   AliceUA->AliceJS: onicecandidate event with null candidate
   AliceJS->AliceUA: get |offer-A1| from value of localDescription

   // |offer-A1| is sent over signaling protocol to Bob
   AliceJS->WebServer: signaling with |offer-A1|
   WebServer->BobJS: signaling with |offer-A1|

   // |offer-A1| arrives at Bob
   BobJS->BobUA: create a PeerConnection
   BobJS->BobUA: setRemoteDescription with |offer-A1|
   BobUA->BobJS: onaddstream event with remoteStream

   // Bob accepts call
   BobJS->BobUA: addStream with local media
   BobJS->BobUA: createAnswer
   BobJS->BobUA: setLocalDescription with answer
   BobUA->BobJS: multiple onicecandidate events with candidates

   // wait for ICE gathering to complete
   BobUA->BobJS: onicecandidate event with null candidate
   BobJS->BobUA: get |answer-A1| from value of localDescription

   // |answer-A1| is sent over signaling protocol to Alice
   BobJS->WebServer: signaling with |answer-A1|
   WebServer->AliceJS: signaling with |answer-A1|

   // |answer-A1| arrives at Alice
   AliceJS->AliceUA: setRemoteDescription with |answer-A1|
   AliceUA->AliceJS: onaddstream event with remoteStream

   // media flows
   BobUA->AliceUA: media sent from Bob to Alice
   AliceUA->BobUA: media sent from Alice to Bob

   The SDP for |offer-A1| looks like:

   v=0
   o=- 4962303333179871722 1 IN IP4 0.0.0.0
   s=-
   t=0 0
   a=msid-semantic:WMS
   a=group:BUNDLE a1 v1
   m=audio 56500 UDP/TLS/RTP/SAVPF 96 0 8 97 98
   c=IN IP4 192.0.2.1
   a=mid:a1
   a=rtcp:56501 IN IP4 192.0.2.1
   a=msid:47017fee-b6c1-4162-929c-a25110252400
          f83006c5-a0ff-4e0a-9ed9-d3e6747be7d9
   a=sendrecv
   a=rtpmap:96 opus/48000/2
   a=rtpmap:0 PCMU/8000
   a=rtpmap:8 PCMA/8000
   a=rtpmap:97 telephone-event/8000
   a=rtpmap:98 telephone-event/48000
   a=maxptime:120
   a=ice-ufrag:ETEn1v9DoTMB9J4r
   a=ice-pwd:OtSK0WpNtpUjkY4+86js7ZQl
   a=ice-options:trickle
   a=fingerprint:sha-256
          19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04
          :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2
   a=setup:actpass
   a=rtcp-mux
   a=rtcp-rsize
   a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
   a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid
   a=ssrc:1732846380 cname:EocUG1f0fcg/yvY7
   a=candidate:3348148302 1 udp 2113937151 192.0.2.1 56500
               typ host
   a=candidate:3348148302 2 udp 2113937151 192.0.2.1 56501
               typ host
   a=end-of-candidates

   m=video 56502 UDP/TLS/RTP/SAVPF 100 101
   c=IN IP4 192.0.2.1
   a=rtcp:56503 IN IP4 192.0.2.1
   a=mid:v1
   a=msid:61317484-2ed4-49d7-9eb7-1414322a7aae
          f30bdb4a-5db8-49b5-bcdc-e0c9a23172e0
   a=sendrecv
   a=rtpmap:100 VP8/90000
   a=rtpmap:101 rtx/90000
   a=fmtp:101 apt=100
   a=ice-ufrag:BGKkWnG5GmiUpdIV
   a=ice-pwd:mqyWsAjvtKwTGnvhPztQ9mIf
   a=ice-options:trickle
   a=fingerprint:sha-256
          19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04
          :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2
   a=setup:actpass
   a=rtcp-mux
   a=rtcp-rsize
   a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:mid
   a=rtcp-fb:100 ccm fir
   a=rtcp-fb:100 nack
   a=rtcp-fb:100 nack pli
   a=ssrc:1366781083 cname:EocUG1f0fcg/yvY7
   a=ssrc:1366781084 cname:EocUG1f0fcg/yvY7
   a=ssrc-group:FID 1366781083 1366781084
   a=candidate:3348148302 1 udp 2113937151 192.0.2.1 56502
               typ host
   a=candidate:3348148302 2 udp 2113937151 192.0.2.1 56503
               typ host
   a=end-of-candidates

   The SDP for |answer-A1| looks like:

   v=0
   o=- 6729291447651054566 1 IN IP4 0.0.0.0
   s=-
   t=0 0
   a=msid-semantic:WMS
   m=audio 20000 UDP/TLS/RTP/SAVPF 96 0 8 97 98
   c=IN IP4 192.0.2.2
   a=mid:a1
   a=rtcp:20000 IN IP4 192.0.2.2
   a=msid:PI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1
          PI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1a0
   a=sendrecv
   a=rtpmap:96 opus/48000/2
   a=rtpmap:0 PCMU/8000
   a=rtpmap:8 PCMA/8000
   a=rtpmap:97 telephone-event/8000
   a=rtpmap:98 telephone-event/48000
   a=maxptime:120
   a=ice-ufrag:6sFvz2gdLkEwjZEr
   a=ice-pwd:cOTZKZNVlO9RSGsEGM63JXT2
   a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35
          :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08
   a=setup:active
   a=rtcp-mux
   a=rtcp-rsize
   a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
   a=ssrc:3429951804 cname:Q/NWs1ao1HmN4Xa5
   a=candidate:2299743422 1 udp 2113937151 192.0.2.2 20000
               typ host
   a=end-of-candidates

   m=video 20001 UDP/TLS/RTP/SAVPF 100 101
   c=IN IP4 192.0.2.2
   a=rtcp:20001 IN IP4 192.0.2.2
   a=mid:v1
   a=msid:PI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1
          PI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1v0
   a=sendrecv
   a=rtpmap:100 VP8/90000
   a=rtpmap:101 rtx/90000
   a=fmtp:101 apt=100
   a=ice-ufrag:6sFvz2gdLkEwjZEr
   a=ice-pwd:cOTZKZNVlO9RSGsEGM63JXT2
   a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35
          :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08
   a=setup:active
   a=rtcp-mux
   a=rtcp-rsize
   a=rtcp-fb:100 ccm fir
   a=rtcp-fb:100 nack
   a=rtcp-fb:100 nack pli
   a=ssrc:3229706345 cname:Q/NWs1ao1HmN4Xa5
   a=ssrc:3229706346 cname:Q/NWs1ao1HmN4Xa5
   a=ssrc-group:FID 3229706345 3229706346
a=candidate:2299743422 1 udp 2113937151 192.0.2.2 20001
2624 typ host
2625 a=end-of-candidates

2627 7.2. Normal Examples

2629 This section shows a typical example of a session between two
2630 browsers setting up an audio channel and a data channel. Trickle ICE
2631 is used in full trickle mode with a bundle policy of max-bundle, an
2632 RTCP mux policy of require, and a single TURN server. Later, two
2633 video flows, one for the presenter and one for screen sharing, are
2634 added to the session. This example shows Alice's browser initiating
2635 the session to Bob's browser. The messages from Alice's JS to Bob's
2636 JS are assumed to flow over some signaling protocol via a web server.

2638 // set up local media state
2639 AliceJS->AliceUA: create new PeerConnection
2640 AliceJS->AliceUA: addStream that contains audio track
2641 AliceJS->AliceUA: createDataChannel to get data channel
2642 AliceJS->AliceUA: createOffer to get |offer-B1|
2643 AliceJS->AliceUA: setLocalDescription with |offer-B1|

2645 // |offer-B1| is sent over signaling protocol to Bob
2646 AliceJS->WebServer: signaling with |offer-B1|
2647 WebServer->BobJS: signaling with |offer-B1|

2649 // |offer-B1| arrives at Bob
2650 BobJS->BobUA: create a PeerConnection
2651 BobJS->BobUA: setRemoteDescription with |offer-B1|
2652 BobUA->BobJS: onaddstream with audio track from Alice

2654 // candidates are sent to Bob
2655 AliceUA->AliceJS: onicecandidate event with |candidate-B1| (host)
2656 AliceJS->WebServer: signaling with |candidate-B1|
2657 AliceUA->AliceJS: onicecandidate event with |candidate-B2| (srflx)
2658 AliceJS->WebServer: signaling with |candidate-B2|
2659 AliceUA->AliceJS: onicecandidate event with |candidate-B3| (relay)
2660 AliceJS->WebServer: signaling with |candidate-B3|

2662 WebServer->BobJS: signaling with |candidate-B1|
2663 BobJS->BobUA: addIceCandidate with |candidate-B1|
2664 WebServer->BobJS: signaling with |candidate-B2|
2665 BobJS->BobUA: addIceCandidate with |candidate-B2|
2666
WebServer->BobJS: signaling with |candidate-B3|
2667 BobJS->BobUA: addIceCandidate with |candidate-B3|

2669 // Bob accepts call
2670 BobJS->BobUA: addStream with local audio stream
2671 BobJS->BobUA: createDataChannel to get data channel
2672 BobJS->BobUA: createAnswer to get |answer-B1|
2673 BobJS->BobUA: setLocalDescription with |answer-B1|

2675 // |answer-B1| is sent to Alice
2676 BobJS->WebServer: signaling with |answer-B1|
2677 WebServer->AliceJS: signaling with |answer-B1|
2678 AliceJS->AliceUA: setRemoteDescription with |answer-B1|
2679 AliceUA->AliceJS: onaddstream event with audio track from Bob

2681 // candidates are sent to Alice
2682 BobUA->BobJS: onicecandidate event with |candidate-B4| (host)
2683 BobJS->WebServer: signaling with |candidate-B4|
2684 BobUA->BobJS: onicecandidate event with |candidate-B5| (srflx)
2685 BobJS->WebServer: signaling with |candidate-B5|
2686 BobUA->BobJS: onicecandidate event with |candidate-B6| (relay)
2687 BobJS->WebServer: signaling with |candidate-B6|

2689 WebServer->AliceJS: signaling with |candidate-B4|
2690 AliceJS->AliceUA: addIceCandidate with |candidate-B4|
2691 WebServer->AliceJS: signaling with |candidate-B5|
2692 AliceJS->AliceUA: addIceCandidate with |candidate-B5|
2693 WebServer->AliceJS: signaling with |candidate-B6|
2694 AliceJS->AliceUA: addIceCandidate with |candidate-B6|

2696 // data channel opens
2697 BobUA->BobJS: ondatachannel event
2698 AliceUA->AliceJS: ondatachannel event
2699 BobUA->BobJS: onopen
2700 AliceUA->AliceJS: onopen

2702 // media is flowing between browsers
2703 BobUA->AliceUA: audio+data sent from Bob to Alice
2704 AliceUA->BobUA: audio+data sent from Alice to Bob

2706 // some time later Bob adds two video streams
2707 // note, no candidates exchanged, because of BUNDLE
2708 BobJS->BobUA: addStream with first video stream
2709 BobJS->BobUA: addStream with second video stream
2710 BobJS->BobUA: createOffer to get |offer-B2|
2711 BobJS->BobUA: setLocalDescription with |offer-B2|

2713 //
|offer-B2| is sent to Alice
2714 BobJS->WebServer: signaling with |offer-B2|
2715 WebServer->AliceJS: signaling with |offer-B2|
2716 AliceJS->AliceUA: setRemoteDescription with |offer-B2|
2717 AliceUA->AliceJS: onaddstream event with first video stream
2718 AliceUA->AliceJS: onaddstream event with second video stream
2719 AliceJS->AliceUA: createAnswer to get |answer-B2|
2720 AliceJS->AliceUA: setLocalDescription with |answer-B2|

2722 // |answer-B2| is sent over signaling protocol to Bob
2723 AliceJS->WebServer: signaling with |answer-B2|
2724 WebServer->BobJS: signaling with |answer-B2|
2725 BobJS->BobUA: setRemoteDescription with |answer-B2|

2727 // media is flowing between browsers
2728 BobUA->AliceUA: audio+video+data sent from Bob to Alice
2729 AliceUA->BobUA: audio+video+data sent from Alice to Bob

2731 The SDP for |offer-B1| looks like:

2733 v=0
2734 o=- 4962303333179871723 1 IN IP4 0.0.0.0
2735 s=-
2736 t=0 0
2737 a=msid-semantic:WMS
2738 a=group:BUNDLE a1 d1
2739 m=audio 9 UDP/TLS/RTP/SAVPF 96 0 8 97 98
2740 c=IN IP6 ::
2741 a=rtcp:9 IN IP6 ::
2742 a=mid:a1
2743 a=msid:57017fee-b6c1-4162-929c-a25110252400
2744 e83006c5-a0ff-4e0a-9ed9-d3e6747be7d9
2745 a=sendrecv
2746 a=rtpmap:96 opus/48000/2
2747 a=rtpmap:0 PCMU/8000
2748 a=rtpmap:8 PCMA/8000
2749 a=rtpmap:97 telephone-event/8000
2750 a=rtpmap:98 telephone-event/48000
2751 a=maxptime:120
2752 a=ice-ufrag:ATEn1v9DoTMB9J4r
2753 a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl
2754 a=ice-options:trickle
2755 a=fingerprint:sha-256
2756 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04
2757 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2
2758 a=setup:actpass
2759 a=rtcp-mux
2760 a=rtcp-rsize
2761 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
2762 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid
2763 a=ssrc:1732846380 cname:FocUG1f0fcg/yvY7

2765 m=application 9 UDP/DTLS/SCTP webrtc-datachannel
2766 c=IN IP6 ::
2767 a=mid:d1
2768 a=fmtp:webrtc-datachannel max-message-size=65536
2769 a=sctp-port 5000
2770
a=ice-ufrag:ATEn1v9DoTMB9J4r
2771 a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl
2772 a=ice-options:trickle
2773 a=fingerprint:sha-256 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04
2774 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2
2775 a=setup:actpass

2777 The SDP for |candidate-B1| looks like:

2779 candidate:109270923 1 udp 2122194687 192.168.1.2 51556 typ host
2780 The SDP for |candidate-B2| looks like:

2782 candidate:4036177503 1 udp 1685987071 11.22.33.44 52546 typ srflx
2783 raddr 192.168.1.2 rport 51556

2785 The SDP for |candidate-B3| looks like:

2787 candidate:3671762466 1 udp 41819903 22.33.44.55 61405 typ relay
2788 raddr 11.22.33.44 rport 52546

2790 The SDP for |answer-B1| looks like:

2792 v=0
2793 o=- 7729291447651054566 1 IN IP4 0.0.0.0
2794 s=-
2795 t=0 0
2796 a=msid-semantic:WMS
2797 a=group:BUNDLE a1 d1
2798 m=audio 9 UDP/TLS/RTP/SAVPF 96 0 8 97 98
2799 c=IN IP6 ::
2800 a=rtcp:9 IN IP6 ::
2801 a=mid:a1
2802 a=msid:QI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1
2803 QI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1a0
2804 a=sendrecv
2805 a=rtpmap:96 opus/48000/2
2806 a=rtpmap:0 PCMU/8000
2807 a=rtpmap:8 PCMA/8000
2808 a=rtpmap:97 telephone-event/8000
2809 a=rtpmap:98 telephone-event/48000
2810 a=maxptime:120
2811 a=ice-ufrag:7sFvz2gdLkEwjZEr
2812 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2
2813 a=ice-options:trickle
2814 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35
2815 :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08
2816 a=setup:active
2817 a=rtcp-mux
2818 a=rtcp-rsize
2819 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
2820 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid
2821 a=ssrc:4429951804 cname:Q/NWs1ao1HmN4Xa5

2823 m=application 9 UDP/DTLS/SCTP webrtc-datachannel
2824 c=IN IP6 ::
2825 a=mid:d1
2826 a=fmtp:webrtc-datachannel max-message-size=65536
2827 a=sctp-port 5000
2828 a=ice-ufrag:7sFvz2gdLkEwjZEr
2829 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2
2830 a=ice-options:trickle
2831 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35
2832
:DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08
2833 a=setup:active

2835 The SDP for |candidate-B4| looks like:

2837 candidate:109270924 1 udp 2122194687 192.168.2.3 61665 typ host

2839 The SDP for |candidate-B5| looks like:

2841 candidate:4036177504 1 udp 1685987071 55.66.77.88 64532 typ srflx
2842 raddr 192.168.2.3 rport 61665

2844 The SDP for |candidate-B6| looks like:

2846 candidate:3671762467 1 udp 41819903 66.77.88.99 50416 typ relay
2847 raddr 55.66.77.88 rport 64532

2849 The SDP for |offer-B2| looks like: (note the increment of the version
2850 number in the o= line, and the c= and a=rtcp lines, which indicate
2851 the local candidate that was selected)

2853 v=0
2854 o=- 7729291447651054566 2 IN IP4 0.0.0.0
2855 s=-
2856 t=0 0
2857 a=msid-semantic:WMS
2858 a=group:BUNDLE a1 d1 v1 v2
2859 m=audio 64532 UDP/TLS/RTP/SAVPF 96 0 8 97 98
2860 c=IN IP4 55.66.77.88
2861 a=rtcp:64532 IN IP4 55.66.77.88
2862 a=mid:a1
2863 a=msid:QI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1
2864 QI39StLS8W7ZbQl1sJsWUXkr3Zf12fJUvzQ1a0
2865 a=sendrecv
2866 a=rtpmap:96 opus/48000/2
2867 a=rtpmap:0 PCMU/8000
2868 a=rtpmap:8 PCMA/8000
2869 a=rtpmap:97 telephone-event/8000
2870 a=rtpmap:98 telephone-event/48000
2871 a=maxptime:120
2872 a=ice-ufrag:7sFvz2gdLkEwjZEr
2873 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2
2874 a=ice-options:trickle
2875 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35
2876 :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08
2877 a=setup:actpass
2878 a=rtcp-mux
2879 a=rtcp-rsize
2880 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
2881 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid
2882 a=ssrc:4429951804 cname:Q/NWs1ao1HmN4Xa5
2883 a=candidate:109270924 1 udp 2122194687 192.168.2.3 61665 typ host
2884 a=candidate:4036177504 1 udp 1685987071 55.66.77.88 64532 typ srflx
2885 raddr 192.168.2.3 rport 61665
2886 a=candidate:3671762467 1 udp 41819903 66.77.88.99 50416 typ relay
2887 raddr 55.66.77.88 rport 64532
2888 a=end-of-candidates
2889 m=application 64532
UDP/DTLS/SCTP webrtc-datachannel
2890 c=IN IP4 55.66.77.88
2891 a=mid:d1
2892 a=fmtp:webrtc-datachannel max-message-size=65536
2893 a=sctp-port 5000
2894 a=ice-ufrag:7sFvz2gdLkEwjZEr
2895 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2
2896 a=ice-options:trickle
2897 a=fingerprint:sha-256 6B:8B:F0:65:5F:78:E2:51:3B:AC:6F:F3:3F:46:1B:35
2898 :DC:B8:5F:64:1A:24:C2:43:F0:A1:58:D0:A1:2C:19:08
2899 a=setup:actpass
2900 a=candidate:109270924 1 udp 2122194687 192.168.2.3 61665 typ host
2901 a=candidate:4036177504 1 udp 1685987071 55.66.77.88 64532 typ srflx
2902 raddr 192.168.2.3 rport 61665
2903 a=candidate:3671762467 1 udp 41819903 66.77.88.99 50416 typ relay
2904 raddr 55.66.77.88 rport 64532
2905 a=end-of-candidates

2907 m=video 64532 UDP/TLS/RTP/SAVPF 100 101
2908 c=IN IP4 55.66.77.88
2909 a=rtcp:64532 IN IP4 55.66.77.88
2910 a=mid:v1
2911 a=msid:61317484-2ed4-49d7-9eb7-1414322a7aae
2912 f30bdb4a-5db8-49b5-bcdc-e0c9a23172e0
2913 a=sendrecv
2914 a=rtpmap:100 VP8/90000
2915 a=rtpmap:101 rtx/90000
2916 a=fmtp:101 apt=100
2917 a=ice-ufrag:7sFvz2gdLkEwjZEr
2918 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2
2919 a=ice-options:trickle
2920 a=fingerprint:sha-256
2921 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04
2922 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2
2923 a=setup:actpass
2924 a=rtcp-mux
2925 a=rtcp-rsize
2926 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid
2927 a=rtcp-fb:100 ccm fir
2928 a=rtcp-fb:100 nack
2929 a=rtcp-fb:100 nack pli
2930 a=ssrc:1366781083 cname:Q/NWs1ao1HmN4Xa5
2931 a=ssrc:1366781084 cname:Q/NWs1ao1HmN4Xa5
2932 a=ssrc-group:FID 1366781083 1366781084
2933 a=candidate:109270924 1 udp 2122194687 192.168.2.3 61665 typ host
2934 a=candidate:4036177504 1 udp 1685987071 55.66.77.88 64532 typ srflx
2935 raddr 192.168.2.3 rport 61665
2936 a=candidate:3671762467 1 udp 41819903 66.77.88.99 50416 typ relay
2937 raddr 55.66.77.88 rport 64532
2938 a=end-of-candidates

2940 m=video 64532 UDP/TLS/RTP/SAVPF 100 101
2941 c=IN IP4 55.66.77.88
2942 a=rtcp:64532 IN IP4 55.66.77.88
2943
a=mid:v2
2944 a=msid:71317484-2ed4-49d7-9eb7-1414322a7aae
2945 f30bdb4a-5db8-49b5-bcdc-e0c9a23172e0
2946 a=sendrecv
2947 a=rtpmap:100 VP8/90000
2948 a=rtpmap:101 rtx/90000
2949 a=fmtp:101 apt=100
2950 a=ice-ufrag:7sFvz2gdLkEwjZEr
2951 a=ice-pwd:dOTZKZNVlO9RSGsEGM63JXT2
2952 a=ice-options:trickle
2953 a=fingerprint:sha-256
2954 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04
2955 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2
2956 a=setup:actpass
2957 a=rtcp-mux
2958 a=rtcp-rsize
2959 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid
2960 a=rtcp-fb:100 ccm fir
2961 a=rtcp-fb:100 nack
2962 a=rtcp-fb:100 nack pli
2963 a=ssrc:2366781083 cname:Q/NWs1ao1HmN4Xa5
2964 a=ssrc:2366781084 cname:Q/NWs1ao1HmN4Xa5
2965 a=ssrc-group:FID 2366781083 2366781084
2966 a=candidate:109270924 1 udp 2122194687 192.168.2.3 61665 typ host
2967 a=candidate:4036177504 1 udp 1685987071 55.66.77.88 64532 typ srflx
2968 raddr 192.168.2.3 rport 61665
2969 a=candidate:3671762467 1 udp 41819903 66.77.88.99 50416 typ relay
2970 raddr 55.66.77.88 rport 64532
2971 a=end-of-candidates

2973 The SDP for |answer-B2| looks like: (note the use of setup:passive to
2974 maintain the existing DTLS roles, and the use of a=recvonly to
2975 indicate that the video streams are one-way)

2977 v=0
2978 o=- 4962303333179871723 2 IN IP4 0.0.0.0
2979 s=-
2980 t=0 0
2981 a=msid-semantic:WMS
2982 a=group:BUNDLE a1 d1 v1 v2
2983 m=audio 52546 UDP/TLS/RTP/SAVPF 96 0 8 97 98
2984 c=IN IP4 11.22.33.44
2985 a=rtcp:52546 IN IP4 11.22.33.44
2986 a=mid:a1
2987 a=msid:57017fee-b6c1-4162-929c-a25110252400
2988 e83006c5-a0ff-4e0a-9ed9-d3e6747be7d9
2989 a=sendrecv
2990 a=rtpmap:96 opus/48000/2
2991 a=rtpmap:0 PCMU/8000
2992 a=rtpmap:8 PCMA/8000
2993 a=rtpmap:97 telephone-event/8000
2994 a=rtpmap:98 telephone-event/48000
2995 a=maxptime:120
2996 a=ice-ufrag:ATEn1v9DoTMB9J4r
2997 a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl
2998 a=ice-options:trickle
2999 a=fingerprint:sha-256
3000 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04
3001
:BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2
3002 a=setup:passive
3003 a=rtcp-mux
3004 a=rtcp-rsize
3005 a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
3006 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid
3007 a=ssrc:1732846380 cname:FocUG1f0fcg/yvY7
3008 a=candidate:109270923 1 udp 2122194687 192.168.1.2 51556 typ host
3009 a=candidate:4036177503 1 udp 1685987071 11.22.33.44 52546 typ srflx
3010 raddr 192.168.1.2 rport 51556
3011 a=candidate:3671762466 1 udp 41819903 22.33.44.55 61405 typ relay
3012 raddr 11.22.33.44 rport 52546
3013 a=end-of-candidates

3015 m=application 52546 UDP/DTLS/SCTP webrtc-datachannel
3016 c=IN IP4 11.22.33.44
3017 a=mid:d1
3018 a=fmtp:webrtc-datachannel max-message-size=65536
3019 a=sctp-port 5000
3020 a=ice-ufrag:ATEn1v9DoTMB9J4r
3021 a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl
3022 a=ice-options:trickle
3023 a=fingerprint:sha-256 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04
3024 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2
3025 a=setup:passive
3026 a=candidate:109270923 1 udp 2122194687 192.168.1.2 51556 typ host
3027 a=candidate:4036177503 1 udp 1685987071 11.22.33.44 52546 typ srflx
3028 raddr 192.168.1.2 rport 51556
3029 a=candidate:3671762466 1 udp 41819903 22.33.44.55 61405 typ relay
3030 raddr 11.22.33.44 rport 52546
3031 a=end-of-candidates
3032 m=video 52546 UDP/TLS/RTP/SAVPF 100 101
3033 c=IN IP4 11.22.33.44
3034 a=rtcp:52546 IN IP4 11.22.33.44
3035 a=mid:v1
3036 a=recvonly
3037 a=rtpmap:100 VP8/90000
3038 a=rtpmap:101 rtx/90000
3039 a=fmtp:101 apt=100
3040 a=ice-ufrag:ATEn1v9DoTMB9J4r
3041 a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl
3042 a=ice-options:trickle
3043 a=fingerprint:sha-256
3044 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04
3045 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2
3046 a=setup:passive
3047 a=rtcp-mux
3048 a=rtcp-rsize
3049 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid
3050 a=rtcp-fb:100 ccm fir
3051 a=rtcp-fb:100 nack
3052 a=rtcp-fb:100 nack pli
3053 a=candidate:109270923 1 udp 2122194687
192.168.1.2 51556 typ host
3054 a=candidate:4036177503 1 udp 1685987071 11.22.33.44 52546 typ srflx
3055 raddr 192.168.1.2 rport 51556
3056 a=candidate:3671762466 1 udp 41819903 22.33.44.55 61405 typ relay
3057 raddr 11.22.33.44 rport 52546
3058 a=end-of-candidates

3060 m=video 52546 UDP/TLS/RTP/SAVPF 100 101
3061 c=IN IP4 11.22.33.44
3062 a=rtcp:52546 IN IP4 11.22.33.44
3063 a=mid:v2
3064 a=recvonly
3065 a=rtpmap:100 VP8/90000
3066 a=rtpmap:101 rtx/90000
3067 a=fmtp:101 apt=100
3068 a=ice-ufrag:ATEn1v9DoTMB9J4r
3069 a=ice-pwd:AtSK0WpNtpUjkY4+86js7ZQl
3070 a=ice-options:trickle
3071 a=fingerprint:sha-256
3072 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04
3073 :BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2
3074 a=setup:passive
3075 a=rtcp-mux
3076 a=rtcp-rsize
3077 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:mid
3078 a=rtcp-fb:100 ccm fir
3079 a=rtcp-fb:100 nack
3080 a=rtcp-fb:100 nack pli
3081 a=candidate:109270923 1 udp 2122194687 192.168.1.2 51556 typ host
3082 a=candidate:4036177503 1 udp 1685987071 11.22.33.44 52546 typ srflx
3083 raddr 192.168.1.2 rport 51556
3084 a=candidate:3671762466 1 udp 41819903 22.33.44.55 61405 typ relay
3085 raddr 11.22.33.44 rport 52546
3086 a=end-of-candidates

3088 8. Security Considerations

3090 The IETF has published separate documents
3091 [I-D.ietf-rtcweb-security-arch] [I-D.ietf-rtcweb-security] describing
3092 the security architecture for WebRTC as a whole. The remainder of
3093 this section describes security considerations for this document.

3095 While formally the JSEP interface is an API, it is better to think of
3096 it as an Internet protocol, with the JS being untrustworthy from the
3097 perspective of the browser. Thus, the threat model of [RFC3552]
3098 applies. In particular, JS can call the API in any order and with
3099 any inputs, including malicious ones. This is particularly relevant
3100 when we consider the SDP which is passed to setLocalDescription().
3101 While correct API usage requires that the application pass in SDP
3102 which was derived from createOffer() or createAnswer() (perhaps
3103 suitably modified as described in Section 6), there is no guarantee
3104 that applications do so. The browser MUST be prepared for the JS to
3105 pass in bogus data instead.

3107 Conversely, the application programmer MUST recognize that the JS
3108 does not have complete control of browser behavior. One case that
3109 bears particular mention is that editing ICE candidates out of the
3110 SDP or suppressing trickled candidates does not have the expected
3111 behavior: implementations will still perform checks from those
3112 candidates even if they are not sent to the other side. Thus, for
3113 instance, it is not possible to prevent the remote peer from learning
3114 your public IP address by removing server reflexive candidates.
3115 Applications which wish to conceal their public IP address should
3116 instead configure the ICE agent to use only relay candidates.

3118 9. IANA Considerations

3120 This document requires no actions from IANA.

3122 10. Acknowledgements

3124 Significant text incorporated in the draft, as well as review, was
3125 provided by Harald Alvestrand and Suhas Nandakumar. Dan Burnett,
3126 Neil Stratford, Eric Rescorla, Anant Narayanan, Andrew Hutton,
3127 Richard Ejzak, Adam Bergkvist and Matthew Kaufman all provided
3128 valuable feedback on this proposal.

3130 11. References

3132 11.1. Normative References

3134 [I-D.ietf-mmusic-msid]
3135 Alvestrand, H., "Cross Session Stream Identification in
3136 the Session Description Protocol", draft-ietf-mmusic-
3137 msid-01 (work in progress), August 2013.

3139 [I-D.ietf-mmusic-sctp-sdp]
3140 Loreto, S. and G. Camarillo, "Stream Control Transmission
3141 Protocol (SCTP)-Based Media Transport in the Session
3142 Description Protocol (SDP)", draft-ietf-mmusic-sctp-sdp-04
3143 (work in progress), June 2013.

3145 [I-D.ietf-mmusic-sdp-bundle-negotiation]
3146 Holmberg, C., Alvestrand, H., and C. Jennings,
3147 "Multiplexing Negotiation Using Session Description
3148 Protocol (SDP) Port Numbers", draft-ietf-mmusic-sdp-
3149 bundle-negotiation-04 (work in progress), June 2013.

3151 [I-D.ietf-mmusic-sdp-mux-attributes]
3152 Nandakumar, S., "A Framework for SDP Attributes when
3153 Multiplexing", draft-ietf-mmusic-sdp-mux-attributes-01
3154 (work in progress), February 2014.

3156 [I-D.ietf-mmusic-trickle-ice]
3157 Ivov, E., Rescorla, E., and J. Uberti, "Trickle ICE:
3158 Incremental Provisioning of Candidates for the Interactive
3159 Connectivity Establishment (ICE) Protocol", draft-ietf-
3160 mmusic-trickle-ice-00 (work in progress), March 2013.

3162 [I-D.ietf-rtcweb-audio]
3163 Valin, J. and C. Bran, "WebRTC Audio Codec and Processing
3164 Requirements", draft-ietf-rtcweb-audio-02 (work in
3165 progress), August 2013.

3167 [I-D.ietf-rtcweb-data-protocol]
3168 Jesup, R., Loreto, S., and M. Tuexen, "WebRTC Data Channel
3169 Protocol", draft-ietf-rtcweb-data-protocol-04 (work in
3170 progress), February 2013.

3172 [I-D.ietf-rtcweb-fec]
3173 Uberti, J., "WebRTC Forward Error Correction
3174 Requirements", draft-ietf-rtcweb-fec-00 (work in
3175 progress), February 2015.

3177 [I-D.ietf-rtcweb-rtp-usage]
3178 Perkins, C., Westerlund, M., and J. Ott, "Web Real-Time
3179 Communication (WebRTC): Media Transport and Use of RTP",
3180 draft-ietf-rtcweb-rtp-usage-09 (work in progress),
3181 September 2013.

3183 [I-D.ietf-rtcweb-security]
3184 Rescorla, E., "Security Considerations for WebRTC", draft-
3185 ietf-rtcweb-security-06 (work in progress), January 2014.

3187 [I-D.ietf-rtcweb-security-arch]
3188 Rescorla, E., "WebRTC Security Architecture", draft-ietf-
3189 rtcweb-security-arch-09 (work in progress), February 2014.

3191 [I-D.ietf-rtcweb-video]
3192 Roach, A., "WebRTC Video Processing and Codec
3193 Requirements", draft-ietf-rtcweb-video-00 (work in
3194 progress), July 2014.

3196 [I-D.nandakumar-mmusic-proto-iana-registration]
3197 Nandakumar, S., "IANA registration of SDP 'proto'
3198 attribute for transporting RTP Media over TCP under
3199 various RTP profiles.", September 2014.

3201 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
3202 Requirement Levels", BCP 14, RFC 2119, March 1997.

3204 [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
3205 A., Peterson, J., Sparks, R., Handley, M., and E.
3206 Schooler, "SIP: Session Initiation Protocol", RFC 3261,
3207 June 2002.

3209 [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
3210 with Session Description Protocol (SDP)", RFC 3264, June
3211 2002.

3213 [RFC3552] Rescorla, E. and B. Korver, "Guidelines for Writing RFC
3214 Text on Security Considerations", BCP 72, RFC 3552, July
3215 2003.

3217 [RFC3605] Huitema, C., "Real Time Control Protocol (RTCP) attribute
3218 in Session Description Protocol (SDP)", RFC 3605, October
3219 2003.

3221 [RFC4145] Yon, D. and G. Camarillo, "TCP-Based Media Transport in
3222 the Session Description Protocol (SDP)", RFC 4145,
3223 September 2005.

3225 [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
3226 Description Protocol", RFC 4566, July 2006.

3228 [RFC4572] Lennox, J., "Connection-Oriented Media Transport over the
3229 Transport Layer Security (TLS) Protocol in the Session
3230 Description Protocol (SDP)", RFC 4572, July 2006.

3232 [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
3233 "Extended RTP Profile for Real-time Transport Control
3234 Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July
3235 2006.

3237 [RFC5124] Ott, J. and E. Carrara, "Extended Secure RTP Profile for
3238 Real-time Transport Control Protocol (RTCP)-Based Feedback
3239 (RTP/SAVPF)", RFC 5124, February 2008.

3241 [RFC5245] Rosenberg, J., "Interactive Connectivity Establishment
3242 (ICE): A Protocol for Network Address Translator (NAT)
3243 Traversal for Offer/Answer Protocols", RFC 5245, April
3244 2010.

3246 [RFC5285] Singer, D. and H. Desineni, "A General Mechanism for RTP
3247 Header Extensions", RFC 5285, July 2008.

3249 [RFC5761] Perkins, C. and M. Westerlund, "Multiplexing RTP Data and
3250 Control Packets on a Single Port", RFC 5761, April 2010.

3252 [RFC5888] Camarillo, G. and H. Schulzrinne, "The Session Description
3253 Protocol (SDP) Grouping Framework", RFC 5888, June 2010.

3255 [RFC6236] Johansson, I. and K. Jung, "Negotiation of Generic Image
3256 Attributes in the Session Description Protocol (SDP)", RFC
3257 6236, May 2011.

3259 [RFC6347] Rescorla, E. and N. Modadugu, "Datagram Transport Layer
3260 Security Version 1.2", RFC 6347, January 2012.

3262 [RFC6904] Lennox, J., "Encryption of Header Extensions in the Secure
3263 Real-time Transport Protocol (SRTP)", RFC 6904, April
3264 2013.

3266 [RFC7022] Begen, A., Perkins, C., Wing, D., and E. Rescorla,
3267 "Guidelines for Choosing RTP Control Protocol (RTCP)
3268 Canonical Names (CNAMEs)", RFC 7022, September 2013.

3270 11.2. Informative References

3272 [I-D.nandakumar-rtcweb-sdp]
3273 Nandakumar, S. and C. Jennings, "SDP for the WebRTC",
3274 draft-nandakumar-rtcweb-sdp-02 (work in progress), July
3275 2013.

3277 [RFC3389] Zopf, R., "Real-time Transport Protocol (RTP) Payload for
3278 Comfort Noise (CN)", RFC 3389, September 2002.

3280 [RFC3556] Casner, S., "Session Description Protocol (SDP) Bandwidth
3281 Modifiers for RTP Control Protocol (RTCP) Bandwidth", RFC
3282 3556, July 2003.

3284 [RFC3960] Camarillo, G. and H. Schulzrinne, "Early Media and Ringing
3285 Tone Generation in the Session Initiation Protocol (SIP)",
3286 RFC 3960, December 2004.

3288 [RFC4568] Andreasen, F., Baugher, M., and D.
Wing, "Session
3289 Description Protocol (SDP) Security Descriptions for Media
3290 Streams", RFC 4568, July 2006.

3292 [RFC4588] Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R.
3293 Hakenberg, "RTP Retransmission Payload Format", RFC 4588,
3294 July 2006.

3296 [RFC5506] Johansson, I. and M. Westerlund, "Support for Reduced-Size
3297 Real-Time Transport Control Protocol (RTCP): Opportunities
3298 and Consequences", RFC 5506, April 2009.

3300 [RFC5576] Lennox, J., Ott, J., and T. Schierl, "Source-Specific
3301 Media Attributes in the Session Description Protocol
3302 (SDP)", RFC 5576, June 2009.

3304 [RFC5763] Fischl, J., Tschofenig, H., and E. Rescorla, "Framework
3305 for Establishing a Secure Real-time Transport Protocol
3306 (SRTP) Security Context Using Datagram Transport Layer
3307 Security (DTLS)", RFC 5763, May 2010.

3309 [RFC5764] McGrew, D. and E. Rescorla, "Datagram Transport Layer
3310 Security (DTLS) Extension to Establish Keys for the Secure
3311 Real-time Transport Protocol (SRTP)", RFC 5764, May 2010.

3313 [RFC5956] Begen, A., "Forward Error Correction Grouping Semantics in
3314 the Session Description Protocol", RFC 5956, September
3315 2010.

3317 [W3C.WD-webrtc-20140617]
3318 Bergkvist, A., Burnett, D., Narayanan, A., and C.
3319 Jennings, "WebRTC 1.0: Real-time Communication Between
3320 Browsers", World Wide Web Consortium WD WD-webrtc-
3321 20140617, June 2014,
3322 .

3324 Appendix A. Change log

3326 Note: This section will be removed by RFC Editor before publication.

3328 Changes in draft-09:

3330 o Don't return null for {local,remote}Description after close().

3332 o Changed TCP/TLS to UDP/DTLS in RTP profile names.

3334 o Separate out bundle and mux policy.

3336 o Added specific references to FEC mechanisms.

3338 o Added canTrickle mechanism.

3340 o Added section on subsequent answers and answer options.

3342 o Added text defining set{Local,Remote}Description behavior.

3344 Changes in draft-08:

3346 o Added new example section and removed old examples in appendix.

3348 o Fixed field handling.

3350 o Added text describing a=rtcp attribute.

3352 o Reworked handling of OfferToReceiveAudio and OfferToReceiveVideo
3353 per discussion at IETF 90.

3355 o Reworked trickle ICE handling and its impact on m= and c= lines
3356 per discussion at interim.

3358 o Added max-bundle-and-rtcp-mux policy.

3360 o Added description of maxptime handling.

3362 o Updated ICE candidate pool default to 0.

3364 o Resolved open issues around AppID/receiver-ID.

3366 o Reworked and expanded how changes to the ICE configuration are
3367 handled.

3369 o Some reference updates.

3371 o Editorial clarification.

3373 Changes in draft-07:

3375 o Expanded discussion of VAD and Opus DTX.

3377 o Added a security considerations section.

3379 o Rewrote the section on modifying SDP to require implementations to
3380 clearly indicate whether any given modification is allowed.

3382 o Clarified impact of IceRestart on CreateOffer in local-offer
3383 state.

3385 o Guidance on whether attributes should be defined at the media
3386 level or the session level.

3388 o Renamed "default" bundle policy to "balanced".

3390 o Removed default ICE candidate pool size and clarified how it works.

3392 o Defined a canonical order for assignment of MSTs to m= lines.

3394 o Removed discussion of rehydration.

3396 o Added Eric Rescorla as a draft editor.

3398 o Cleaned up references.

3400 o Editorial cleanup.

3402 Changes in draft-06:

3404 o Reworked handling of m= line recycling.

3406 o Added handling of BUNDLE and bundle-only.

3408 o Clarified handling of rollback.

3410 o Added text describing the ICE Candidate Pool and its behavior.

3412 o Allowed OfferToReceiveX to create multiple recvonly m= sections.

3414 Changes in draft-05:

3416 o Fixed several issues identified in the createOffer/Answer sections
3417 during document review.

3419 o Updated references.

3421 Changes in draft-04:

3423 o Filled in sections on createOffer and createAnswer.

3425 o Added SDP examples.

3427 o Fixed references.

3429 Changes in draft-03:

3431 o Added text describing relationship to W3C specification.

3433 Changes in draft-02:

3435 o Converted from nroff.

3437 o Removed comparisons to old approaches abandoned by the working
3438 group.

3440 o Removed stuff that has moved to W3C specification.

3442 o Aligned SDP handling with W3C draft.

3444 o Clarified section on forking.

3446 Changes in draft-01:

3448 o Added diagrams for architecture and state machine.

3450 o Added sections on forking and rehydration.

3452 o Clarified meaning of "pranswer" and "answer".

3454 o Reworked how ICE restarts and media directions are controlled.

3456 o Added list of parameters that can be changed in a description.

3458 o Updated suggested API and examples to match latest thinking.

3460 o Suggested API and examples have been moved to an appendix.

3462 Changes in draft-00:

3464 o Migrated from draft-uberti-rtcweb-jsep-02.

3466 Authors' Addresses

3468 Justin Uberti
3469 Google
3470 747 6th Ave S
3471 Kirkland, WA 98033
3472 USA

3474 Email: justin@uberti.name

3476 Cullen Jennings
3477 Cisco
3478 170 West Tasman Drive
3479 San Jose, CA 95134
3480 USA

3482 Email: fluffy@iii.ca

3484 Eric Rescorla (editor)
3485 Mozilla
3486 331 Evelyn Ave
3487 Mountain View, CA 94041
3488 USA

3490 Email: ekr@rtfm.com