idnits 2.17.1

draft-raymond-rtcweb-webrtc-js-obj-api-rationale-01.txt:

Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------

No issues found here.

Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------

No issues found here.

Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------

** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.)

** The abstract seems to contain references ([RFC4566], [RFC3264]), which it shouldn't. Please replace those with straight textual mentions of the documents in question.

Miscellaneous warnings:
----------------------------------------------------------------------------

== The copyright year in the IETF Trust and authors Copyright Line does not match the current year

== The document doesn't use any RFC 2119 keywords, yet seems to have RFC 2119 boilerplate text.

-- The document date (July 06, 2013) is 3940 days in the past. Is this intentional?

Checking references for intended status: Informational
----------------------------------------------------------------------------

== Unused Reference: 'MediaCapture' is defined on line 1441, but no explicit reference was found in the text

== Unused Reference: 'WebRTC10' is defined on line 1473, but no explicit reference was found in the text

** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866)

-- Obsolete informational reference (is this intentional?): RFC 5245 (Obsoleted by RFC 8445, RFC 8839)

-- Obsolete informational reference (is this intentional?): RFC 5389 (Obsoleted by RFC 8489)

-- Obsolete informational reference (is this intentional?): RFC 5766 (Obsoleted by RFC 8656)

-- Obsolete informational reference (is this intentional?): RFC 6347 (Obsoleted by RFC 9147)

Summary: 3 errors (**), 0 flaws (~~), 4 warnings (==), 5 comments (--).

Run idnits with the --verbose option for more detailed information about the items above.

--------------------------------------------------------------------------------

Network Working Group                                         R. Raymond
Internet-Draft                                               E. Lagerway
Intended status: Informational                                 Hookflash
Expires: January 07, 2014                                I. Baz Castillo
                                                               Versatica
                                                              R. Shpount
                                                             TurboBridge
                                                           July 06, 2013

                 WebRTC JavaScript Object API Rationale
          draft-raymond-rtcweb-webrtc-js-obj-api-rationale-01

Abstract

This document describes the reasons why a JavaScript Object Model approach is a far better solution than using SDP [RFC4566] as a surface API for interfacing with WebRTC. The document outlines the issues and pitfalls as well as use cases that are difficult (or impossible) with SDP with offer / answer [RFC3264], and explains the benefits and goals of an alternative JavaScript object model approach.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF).
Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on January 07, 2014.

Copyright Notice

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1.  Introduction
  1.1.  Terminology
2.  Issues with a Universal Session Description Format (and Offer / Answer)
  2.1.  Goal of Minimized Requirements
  2.2.  Offer / Answer State Machine
    2.2.1.  Offer / Answer Violations
  2.3.  Browser to Browser Format Compatibility Issue
  2.4.  Browser to JavaScript Compatibility Issues
  2.5.  SDP as a surface API for JavaScript developers
  2.6.  Is SDP allowed to be mangled?
  2.7.  SDP errata and bugs compatibility issues
    2.7.1.  SDP Bugs Become Enshrined
  2.8.  SIP/SDP compatibility worsened
  2.9.  Increased surface API
  2.10.  Impossible API to implement to achieve browser compatibility
    2.10.1.  Example Oddities That Need Definition
  2.11.  Plan A, Plan B vs NoPlan
  2.12.  SIP Forking Issue
3.  Alternatives to Fixing these Issues Now
  3.1.  Waiting for WebRTC 2.0
    3.1.1.  Cost now to fix versus fixing later
    3.1.2.  If starting over, would even SIP people want SDP as a surface API?
    3.1.3.  Incremental Approach may make Compatibility Worse
  3.2.  Session Description Format Construction API
4.  Example Difficult Usage Cases with Current Model
  4.1.  On / off hold example usage case
  4.2.  One-Sided Constraints Negotiation use Case Scenario
  4.3.  Meet-me Negotiation Use Case Scenario
  4.4.  Browser to Browser Compatibility Extension Compatibility Issue Scenario
  4.5.  Building Interoperability between WebRTC and a SIP Service Scenario
  4.6.  Bit-rate Change Scenario
  4.7.  Video Codec Option Change Scenario
  4.8.  Video Upgrade Scenario
5.  Proposal: WebRTC JavaScript Object Model
  5.1.  Overview
  5.2.  Benefits
    5.2.1.  Greater compatibility
    5.2.2.  Easier to extend
    5.2.3.  Faster Reaction Time To Issues
    5.2.4.  Decreased surface API
    5.2.5.  Greater compatibility for SIP
    5.2.6.  Alternative formats
  5.3.  Design Goals and Considerations
    5.3.1.  Objects Model Kept Simple
    5.3.2.  Simple to Gather Negotiation Information
    5.3.3.  Offer / Answer
    5.3.4.  Extensions
    5.3.5.  Well Defined Behaviors
    5.3.6.  Data Channel
    5.3.7.  Satisfy the expectations of the RTCWEB charter
    5.3.8.  SIP/SDP and current WebRTC API shim compatibility statement
    5.3.9.  Greater Separation of RTCWEB Working Group and Other Working Groups
6.  Security Considerations
7.  References
  7.1.  Normative References
  7.2.  Informative References
Authors' Addresses

1.  Introduction

While the IETF RTCWEB WG is not specifically tasked with providing an API by the W3C, the group has effectively defined a surface API with the mandate to use SDP [RFC4566] with offer / answer [RFC3264].

SDP is a condensed text based format that typically describes all of the real-time media streams, networking properties, codecs, media state and media attributes.
SDP is completely extensible and can be used to describe absolutely anything so long as it is formatted correctly within its minimally defined limitations.

The arguments for mandating SDP with an offer / answer API typically boil down to:

1. It's really easy to establish communication, especially with SIP [RFC3261].

2. The decision was already made.

3. SDP yields greater compatibility (especially with SIP networks).

4. We must have some kind of universal exchange format.

5. There is no alternative to this approach except destroying everything created and starting from scratch.

This document explains why these reasons are insufficient to justify continuing with the SDP with offer / answer mandate. It presents logical arguments and real-world scenarios where the approach fails, and describes its lasting consequences (including negative consequences for SIP).

The document highlights the benefits and goals of a different "JavaScript Object Model" approach, which satisfies the RTCWEB WG charter's requirements, yields greater compatibility and offers a road-map where future potential extensions can be readily added without breaking existing implementations.

A "JavaScript shim" is described, including details on how it can offer a wrapped API around a core WebRTC JavaScript Object Model. This shim will provide the same level of "ease of use" as experienced with the current SDP WebRTC API. However, use of this JavaScript shim is not mandatory for those who do not require an "SDP with offer / answer" model.

1.1.  Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Issues with a Universal Session Description Format (and Offer / Answer)

The issue with SDP is not the expressiveness of the format but its usage as an arbitrary universal format and an API surface instead of providing JavaScript developers an object model they can readily understand. JavaScript could be used to control the plumbing of media objects using familiar JavaScript concepts expressed as methods, properties and events. Today, in many real-world use cases, controlling WebRTC requires modifying SDP directly.

Requiring JavaScript developers to serialize their API control requests into a text format (via modifications of existing SDP blobs) is only one aspect of the many issues the SDP approach creates for developers. An offer / answer state machine is needlessly imposed on JavaScript developers as well.

While the currently mandated SDP based API allows developers to quickly implement basic calling demos and interoperability with some SIP networks, it has many issues that are explored in this document, including (but not limited to):

1. Defining a standard universal all-encompassing session description format for use with WebRTC that describes all connections, media, constraints, streams and tracks for all scenarios is especially challenging.

2. Rather than focusing on and defining the properties needed for communication, the focus is put on the best way to express the format, where every nuance and behavior will need to be detailed for any browser vendor to capably implement the SDP based WebRTC specification.

3. The bar for browsers (or other applications with WebRTC engines) to produce a WebRTC engine is raised substantially by forcing the browser to implement an entire SDP offer / answer engine too, with little to no added benefit.

4. A universal format built into the browser's API is entirely unneeded and goes well beyond the chartered mandate of the RTCWEB Working Group.

5. A flexible and expandable universal exchange format leads to more divergent interpretations and mistakes in various implementations, which in turn leads to increased incompatibilities.

6. Given the format is entirely flexible and open to interpretation, resulting implementations will more likely be prone to errors relative to the other truly needed aspects of RTC (which have better defined boundaries, behaviours, and scope).

7. Mistakes in the format won't be fixed until a new browser binary update is released and deployed amongst users.

8. Mistakes in implementation of the session description format can become enshrined and difficult to deprecate (for the sake of compatibility).

9. Compatibility issues caused by the format will not be limited to browsers only, as many hybrid browser-engine based applications now exist too.

10. Using alternative signaling formats will require complete understanding of the universal format to be able to translate it into those formats.

11. JavaScript (or proxies) will need to parse and rewrite the output session description format with 100% precision and without loss. They will also require pre-knowledge of what each browser produces and expects, despite the likelihood of a multitude of output flavors on various platforms and from version to version, and despite the inability to easily predict or detect the variants.

12. JavaScript developers trying to control WebRTC features will need to manipulate any defined universal format rather than interacting with JavaScript objects.

13. Offer / answer is mandated and the state machine is required, but the exact rules and violations of the rules are ill defined when used within WebRTC.

14. The rules of how a universal format can be modified before being delivered to remote parties need to be meticulously defined or compatibility issues will arise (including the rules for what may be modified after the browser generates the format and then fed back into the browser).

15. Due to the issues defined above, SIP compatibility will worsen, not improve.

An alternative that avoids the issues caused by a universal format and state machine is described later in the document. This alternative allows JavaScript to control the behavior of the media engine's plumbing while providing extensible and modifiable shims written entirely in JavaScript that produce consistent signaling and exchange formats for the specific network where those formats operate.

2.1.  Goal of Minimized Requirements

While the primary goal of WebRTC is to enable browser to browser communication, the definition of a "browser" is ever expanding. Beyond just traditional hand-held applications, hybrid applications that are part HTML-5 and part native code exist. Servers will become as much a part of the WebRTC infrastructure as browsers. Minimizing the requirements to the basic wire compatibility necessary to achieve RTC is essential for maximum compatibility, flexibility and varying usage scenarios.

The mandate of the RTCWEB charter is simply to define requirements, provide basic "on-the-wire" compatibility, and define security requirements (such as enforcing ICE connection agreements). The work has gone well beyond that scope by mandating an API that works fine for simple SIP interoperability demos but does not provide easy access to the basic constructs, as outlined in the charter, needed for use with on-the-wire signaling protocols other than SIP.
If SIP is the only end goal of the WG, then that goal must be specifically stated rather than effectively mandated by making alternative signaling approaches unreasonably difficult to achieve.

2.2.  Offer / Answer State Machine

The current SDP approach requires an offer / answer state machine. Mandating an offer / answer state machine implies that:

1. SDP must be generated by browser A and sent to browser B

2. Browser B must respond to the offer with an answer

3. If either party issues a new offer but the offer is rejected, the state must revert to the previously agreed SDP (or fail to none)

4. If one side receives an offer while its own offer is still outstanding, a conflict occurs and both sides must reject, revert, and perform SDP conflict resolution before issuing an offer again

5. Changes to the media are allowed only if both sides agree

6. Any change required to the SDP requires a network round trip where both sides mutually agree (at least as traditionally defined in offer / answer, but the rules are in flux)

This offer / answer model is required in the current implementation. Not only do the browser vendors have to enforce the rules, all JavaScript authors must also adhere to these rules of signaling. While WebRTC does not dictate the signaling mechanism between browsers, it is effectively imposing this signaling state machine on all implementations (which is not a mandate of the RTCWEB Working Group).

There are signaling models other than offer / answer. For example, one-sided constraints based negotiation is an alternative model. This type of negotiation requires each side to determine what it wants to receive independent of the other. This signaling is akin to saying "if you plan to send anything, make sure it conforms to the following".
Changes to the media may occur without agreement from the remote party, since each side decides what is acceptable to receive independently of the other. The remote side can decide if it wants to send within those constraints or not. There is no round trip offer / answer required in this model to effect change.

Offer / answer introduces unnecessary asynchronism to the API and to JavaScript implementations. For example, changing the list of codecs a party expects to receive, or the current sending codec, could be done immediately without the need for asynchronous calls.

Offer / answer is not required to achieve RTC wire compatibility but it is currently mandated when alternatives could exist.

2.2.1.  Offer / Answer Violations

The offer / answer SDP state machine is already violated in WebRTC. Trickle ICE precludes offer / answer round trips and other proposed standards like NoPlan [I-D.ivov-rtcweb-noplan] suggest relaxing the offer / answer model even more. At this point the rules of offer / answer are undefined and in clear violation of the previous strict rules, with no clear direction on what exactly constitutes offer / answer anymore or where it should and should not be used.

A new state for offer / answer called PRANSWER is now defined, which did not exist as part of the standard offer / answer state machine. Offer rollback is not adequately defined either, should an offer / answer conflict occur.

Currently, switching codecs requires an SDP offer / answer round trip even though it is not technically needed for an RTC engine to change codecs. Should this be another exception to the offer / answer state machine?

2.3.  Browser to Browser Format Compatibility Issue

SDP is a flexible format, and it allows many alternative methods to express the same intentions. The smallest change may alter the SDP's meaning.
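To illustrate how the same intention can be written in more than one way, the following is a minimal TypeScript sketch and not code from any browser. The two SDP fragments are hypothetical examples written for this illustration: both describe the same two audio codecs, but they assign a different dynamic payload type number to one codec and list the attribute lines in a different order. The helper name codecSet is likewise an assumption.

```typescript
// Two hypothetical SDP fragments that describe the same two codecs.
const fragmentA = [
  "m=audio 9 RTP/SAVPF 111 0",
  "a=rtpmap:111 opus/48000/2",
  "a=rtpmap:0 PCMU/8000",
].join("\r\n");

const fragmentB = [
  "m=audio 9 RTP/SAVPF 0 96",
  "a=rtpmap:0 PCMU/8000",
  "a=rtpmap:96 opus/48000/2",
].join("\r\n");

// Collect the codec descriptions from a=rtpmap lines, ignoring the
// payload type numbers and the order in which the lines appear.
function codecSet(sdp: string): string[] {
  const codecs: string[] = [];
  for (const line of sdp.split(/\r?\n/)) {
    const match = /^a=rtpmap:(\d+) (.+)$/.exec(line);
    const codec = match?.[2];
    if (codec !== undefined) {
      codecs.push(codec.toLowerCase());
    }
  }
  return codecs.sort();
}

// A plain text comparison sees two different documents...
const sameText = fragmentA === fragmentB;

// ...while a comparison of the described codecs sees the same set.
const sameCodecs =
  JSON.stringify(codecSet(fragmentA)) === JSON.stringify(codecSet(fragmentB));

console.log({ sameText, sameCodecs });
```

Note that the two fragments also list the payload types in a different order on the m= line, which in offer / answer usage expresses a different codec preference. A receiver therefore has to decide which differences are cosmetic and which are meaningful.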

This creates parsing and SDP generation compatibility issues. If SDP is packaged by JavaScript and delivered to the remote browser then each browser must support every single possible variant of SDP for every browser version and platform in existence. They must do this without failure. To maximize compatibility, a browser should generate the SDP format in the variant expected by the remote party (despite not having sufficient knowledge about the remote party to provide the correct SDP).

2.4.  Browser to JavaScript Compatibility Issues

Since WebRTC is not supposed to mandate the format on the wire for signaling, one supported use case for WebRTC must be allowing the browser generated SDP to be converted into alternative on-the-wire formats. This SDP conversion may be performed by JavaScript in the browser, or later by an intermediate gateway. In either case, the converter must be entirely aware of all SDP variants possible from every browser platform and version, despite browser version detection being heavily frowned upon by industry best practices. Likewise, the JavaScript or gateway must know how to generate the correct SDP for all browsers and versions before passing the serialized SDP blob into the browser. Generating compatible SDP may be impossible unless the exact formats and restrictions are unquestionably clear to all implementers of the specification (and they are anything but clearly described in the current WebRTC SDP based API that developers are mandated to use).

2.5.  SDP as a surface API for JavaScript developers

The current SDP based API is limited to placing a call, answering a call and adding media. Handling common edge cases or utilizing RTC features beyond the basic API typically requires SDP mangling.

Many of the operations from JavaScript to control or fetch properties from RTC will be through serialization to / from the SDP instead of a developer using familiar JavaScript language constructs (e.g. object methods, structures, properties and events). The JavaScript developer must learn an entirely new protocol called "SDP" and be able to parse and generate not only basic SDP but any SDP extensions without introducing a single compatibility issue.

For example, a JavaScript developer wants to hold / un-hold media streams. The developer must use a widely adopted but hidden feature to parse the SDP from the browser, change it to add the appropriate "hold" state, send that hold state to the remote side, wait for the "answer" to accept the hold, parse the result on the return to see if the hold was accepted and feed the result to the browser.

Worse, a flood of extensions to SDP for WebRTC is being written to "enhance" and "extend" the functionality of the browser with new features. Many basic things are ill defined in the current SDP based API, for example, changing non-negotiated codec parameters, such as codec bandwidth.

There is no facility for JavaScript to detect what SDP the browser is currently using or capable of delivering. The developer has no idea of the extensions available, or what SDP will be produced, or what SDP is compatible. The developer's JavaScript code must be able to handle everything generated by the browser for any use case beyond basic call, answer and hang-up. This is a heavy burden to place on a JavaScript developer who is not familiar with the details of RTC concepts as expressed in SDP, and is a challenge even for those who are familiar.

Effective APIs are meant to be contracts between a producer and consumer, whereas this SDP methodology offers little in the form of any such contract.
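The hold example earlier in this section can be sketched in TypeScript as follows. This is a minimal illustration under stated assumptions: the function name markOnHold and the sample session description are hypothetical, and the rewrite follows the RFC 3264 convention of signaling hold by changing the media direction attribute (sendrecv becomes sendonly, recvonly becomes inactive). Real browser output carries many more lines and is not guaranteed to contain an explicit direction attribute at all.

```typescript
// Rewrite media direction attributes to request hold, following the
// RFC 3264 convention: sendrecv -> sendonly, recvonly -> inactive.
// Assumes CRLF line endings and media-level direction attributes.
function markOnHold(sdp: string): string {
  return sdp
    .split("\r\n")
    .map((line) => {
      if (line === "a=sendrecv") return "a=sendonly";
      if (line === "a=recvonly") return "a=inactive";
      return line;
    })
    .join("\r\n");
}

// Hypothetical, heavily trimmed session description for illustration.
const sampleOffer = [
  "v=0",
  "o=- 1 1 IN IP4 192.0.2.1",
  "s=-",
  "c=IN IP4 192.0.2.1",
  "t=0 0",
  "m=audio 9 RTP/SAVPF 111",
  "a=rtpmap:111 opus/48000/2",
  "a=sendrecv",
  "",
].join("\r\n");

const heldOffer = markOnHold(sampleOffer);
console.log(heldOffer);
```

Even this small rewrite has to assume the line ending, the presence of an explicit direction line, and that the direction is set per media section rather than at session level. That is the kind of pre-knowledge about browser output that the text above describes, and the answer coming back from the remote side would need the same treatment in reverse.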

If SDP is to become standardized for use with WebRTC then JavaScript developers must learn SDP to use RTC's available features and build new features. Alternatively, accessors will need to be provided to manipulate the SDP on behalf of the JavaScript (and if so, then why not move to an object model straight away and do away with SDP?).

2.6.  Is SDP allowed to be mangled?

A choice must be made as to whether SDP may be modified or not. If modifications are the only way to reach the available RTC features, then what is allowed to be modified must be clearly defined in exact detail, and the expected behavior of each feature (and of each modification to that feature), as expressed in SDP, must be defined. Anything short of exact specifications will cause incompatibility. Again, the implication is that Web / JavaScript developers must learn SDP to utilize the available RTC features and they must learn the rules of modification equally well, rules which barely exist today.

If the choice is to disallow SDP modification entirely, then the protocol becomes extremely tied to SDP based protocols like SIP. Yet, there is no mandate for SIP to be the standardized protocol in WebRTC. In fact, the mandate to require SIP was explicitly denied, which supports the argument that SDP manipulation must be allowed.

The SDP mangling issue isn't just an issue when the format is sent on-the-wire. If Browser A sends Browser B an SDP, the current philosophy is that the SDP is allowed to be modified. However, there is also the possibility of modifying the SDP generated by Browser A and giving that modified SDP back to Browser A to change its behavior (i.e. a serialized text based API call) before the offer is given to Browser B (and likewise with Browser B when it responds with its SDP answer).

How much of the SDP is allowed to be modified before giving the SDP back to the local browser? SDP is a free-form format so anything can theoretically get changed, but should it be allowed? If not, what can and cannot be modified? CODECs? SSRC? SDES? Fingerprints? Transports? M-lines? And so on...

This issue becomes further compounded when extensions are factored in as well.

2.7.  SDP errata and bugs compatibility issues

With the SDP baked into the browser binary, the only way SDP compatibility issues can be fixed is by releasing a new browser update, and the JavaScript developers must support or work around flaws until the browser vendors deliver the fix and the user base upgrades their browsers.

While it could be argued that any bug must be worked around, SDP is a unique problem. SDP is a free-form format. Being compatible isn't as easy as implementing a limited wire protocol for media transport or an API contract with well defined features and attributes. The likelihood of free-form SDP containing errors is far greater than that of a typical well defined API due to SDP's many flavors, interpretations and lack of strong definition.

2.7.1.  SDP Bugs Become Enshrined

To illustrate a scenario:

1. Browser Vendor A has a bug

2. Browser Vendor B can't work with A because of the bug so it implements a "work around"

3. Browser Vendor A fixes the bug but implements a work around to be compatible with Browser Vendor B's "work around"

This situation demonstrates how browser bugs can become enshrined, as there's no way to update the SDP produced by the browser binary once it's released until the next update release cycle occurs.
This would not be true if JavaScript were used via a shim to produce SDP, as JavaScript can be dynamically updated as needed at any time and a service provider can choose to update their JavaScript implementation to exacting expectations for their network regardless of the browser version.

The lower level RTC wire protocols that need to be mandated by the RTCWEB Working Group have limited scopes and well defined behaviors. Any mistakes are obvious and likely to surface very rapidly. It is easy to spot which party is doing something wrong, and the mistakes are much easier to fix early as a result. This is not true with a free form, highly descriptive language for sessions. The combinations are limitless and every scenario is difficult to test, especially in concert with every other browser vendor with every version released. The session description will be the likely place of failure across the browsers when the session description is generated inside the browser's binary.

2.8.  SIP/SDP compatibility worsened

One of the main arguments for using SDP with offer / answer was supposed to be ease of compatibility with existing signaling networks, like SIP. Instead, variations in the browser's SDP will likely worsen SIP compatibility rather than enhance it.

A SIP provider must now be compatible with every browser's SDP on every platform and version and the browser's SDP must be compatible with every SDP from a SIP network. Alternatively, JavaScript or SBCs (Session Border Controllers) must be used to re-write any incompatible SDP to be compatible. However, this moves the problem from the browser to JavaScript, or requires SBCs to "fix" the problem.
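The shim idea raised in Section 2.7.1 could look roughly like the following TypeScript sketch, in which the media description is a plain object and SDP text is produced only at the edge where a SIP network needs it. The interfaces CodecDescription and AudioDescription, the function toSdpMediaSection, and all field names are hypothetical. They are not part of any WebRTC API or of the object model proposed in this document.

```typescript
// Hypothetical object description of one codec.
interface CodecDescription {
  payloadType: number;
  name: string;
  clockRate: number;
  channels?: number;
}

// Hypothetical object description of one audio media section.
interface AudioDescription {
  port: number;
  profile: string; // for example "RTP/SAVPF"
  codecs: CodecDescription[]; // listed in order of preference
}

// Serialize the object into the SDP lines a particular provider expects.
// A provider with different expectations would ship a different function.
function toSdpMediaSection(audio: AudioDescription): string {
  const payloads = audio.codecs.map((c) => c.payloadType).join(" ");
  const lines = [`m=audio ${audio.port} ${audio.profile} ${payloads}`];
  for (const c of audio.codecs) {
    const channels = c.channels !== undefined ? `/${c.channels}` : "";
    lines.push(`a=rtpmap:${c.payloadType} ${c.name}/${c.clockRate}${channels}`);
  }
  return lines.join("\r\n") + "\r\n";
}

const section = toSdpMediaSection({
  port: 9,
  profile: "RTP/SAVPF",
  codecs: [
    { payloadType: 111, name: "opus", clockRate: 48000, channels: 2 },
    { payloadType: 0, name: "PCMU", clockRate: 8000 },
  ],
});

console.log(section);
```

Because such code is ordinary JavaScript served with the page, a service provider could change it at any time to match their network, without waiting for a browser release or for users to upgrade.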

Had SDP been entirely generated by JavaScript rather than by the browser engine, the JavaScript could create only SDP compatible with a particular SIP provider, and the SIP provider could choose which JavaScript SDP parsing / generation code to run for maximum compatibility.

2.9.  Increased surface API

By mandating SDP, the requirement for compatibility with WebRTC is increased substantially with little benefit. Instead of just supporting basic media RTP [RFC3550], STUN/ICE/TURN [RFC5389]/[RFC5245]/[RFC5766], DTLS [RFC6347] and CODECs, an additional bar must be passed, i.e. a browser or other WebRTC compliant API must support SDP with a full offer / answer state machine (or a state machine with additional rules to make it flexible for various scenarios).

With an alternative approach, the entire requirement for SDP could be removed without any loss of compatibility or increase in complexity while achieving greater compatibility via the JavaScript shim.

2.10.  Impossible API to implement to achieve browser compatibility

The current mandated SDP based API cannot be implemented as a standard by independent browser vendors in its current form. The usage, parsing, handling, extensions, behaviors and constraints, along with other such reference documents, must be meticulously defined for SDP with the modified offer / answer state machine, or no browser can ever claim to be "compliant". The current definition process is far from complete.

The current WebRTC SDP based API is far from achieving that goal due to the inclusion of free-form SDP with offer / answer, and this is grounds for removing it, as it goes beyond the RTCWEB charter and its limited scope.

   Any incremental approach that does not remove the offer / answer
   model requirement creates a roadblock to achieving WebRTC signaling
   protocols other than SIP.

   An alternative WebRTC JavaScript object model approach that does not
   require an all-encompassing session description and related state
   machine is being proposed so that the goals defined in the RTCWEB
   charter can be completed in a timely fashion.

2.10.1. Example Oddities That Need Definition

   There are many oddities in the SDP RFC [RFC4566] and the various
   related extensions.

   For example, will RTP CODEC maps be required or not? They are not
   required for basic CODECs according to the SDP RFC. However, with
   all the flavors of CODECs being offered, defining a mapping between
   payloads is critical to compatibility and not just a good idea.

   Another example: should "t=0 0" be respected? Is that allowed to be
   changed? Do the browser vendors need to enforce the timing field, or
   should the JavaScript layer enforce it? Should the streams wait to
   start until the NTP time stamp and close when the NTP time completes?

   These are just small samples of questions that must all be completely
   addressed in detail. This could also cause a cascade of updated
   reference drafts and confusion as to which version is to be adhered
   to by browsers as well as what each browser specifically supports.
   Nominally referencing the SDP RFC will not be sufficient, and deltas
   from the established standards will need to be defined wherever the
   rules change.

2.11. Plan A, Plan B vs NoPlan

   At the time of authoring this document, three plans on how to handle
   a large number of media streams in SDP have emerged and are
   currently under consideration in the IETF, referred to as PlanA
   [I-D.roach-rtcweb-plan-a], PlanB [I-D.uberti-rtcweb-plan] and NoPlan
   [I-D.ivov-rtcweb-noplan].

   PlanA and PlanB acknowledge that using SDP as it is historically
   defined in SIP is inefficient and problematic for a large number of
   media streams, especially factoring in that each media line must have
   its own unique ports.

   NoPlan allows for media to be described in a more JavaScript friendly
   way and goes a long way towards improving the situation by taking the
   mapping of the streams out of the SDP, but it does not remove the
   reliance upon SDP. This creates a dual format system where some
   information is initially carried over SDP and other information is
   signaled through an alternative approach (including the possibility
   of SDP offer / answer). NoPlan could have been a sufficient approach
   if it had gone one step further and removed SDP entirely.

   PlanA, PlanB and NoPlan are a perfect example of why not to use SDP
   as the basis for WebRTC. SDP has some arbitrary limitations as a
   description protocol for multiple streams whereas no such limitations
   exist at the lower layer transports themselves. RTP allows for
   multiplexing multiple SSRCs. In other words, the problem is SDP, not
   the real time transport technologies.

   These drafts illustrate the limitations of SDP and attempt to solve
   them by introducing even more complex descriptions around SDP and /
   or by "relaxation" of the offer / answer model combined with altering
   the description language of SDP.

   None of these drafts address most of the concerns outlined in this
   draft. If anything, they further illustrate how divergent the SDP
   will become as more and more effort is put into working around
   problems inherent to the nature of utilizing SDP (or any universal
   format).

   The issues that SDP implementers face should be isolated to those who
   require SDP for their signaling protocols (namely SIP) where they can
   choose the best practices for their networks for interoperability.
   These complex approaches do not have to be forced on other signaling
   protocols that do not have or require such limitations.

   Certainly JavaScript programmers and the W3C should not be impacted
   by such limitations through the introduction of SDP (or any universal
   format) into the mix when it adds zero value and fails in its primary
   objective, namely interoperability with existing SIP vendors and
   networks.

   This further illustrates why SDP baked into the browser binary is not
   beneficial for SIP vendors either. They will be forced to upgrade
   their SIP infrastructure to support SDP from browsers with these
   kinds of extensions or be forced to utilize a JavaScript SDP re-write
   approach to "fix" these incompatibilities.

   With an object approach, newer signaling protocols could describe
   multiple media streams with ease. SIP providers could ensure they
   only generate SDP compatible with their networks, agree on their
   best practices, and launch new features that incorporate approaches
   like PlanA, PlanB or NoPlan in a manner they deem fit rather than
   when the browser vendors decide to upgrade the SDP arbitrarily.

2.12. SIP Forking Issue

   The current SDP based API model does not allow for SIP parallel
   forking even though the RTC engine can allow for demuxing a media
   stream. The current model does not allow one offer to be transmitted
   and multiple answers to be accepted, which is legal in SIP. A
   complex UPDATE process is described on how to work around the problem
   instead of fixing the original problem, i.e. the state machine being
   required.

   A WebRTC JavaScript object model is designed to easily allow forking
   but does not care if an upper shim supports SDP / SIP style forking
   in the negotiation or not, so long as the basic rules of the RTC
   media engine are respected.

3. Alternatives to Fixing these Issues Now

3.1. Waiting for WebRTC 2.0

   If we don't get WebRTC 1.0 correct, fixing the API in WebRTC 2.0 may
   become even more difficult.

   At this stage, prototypes are underway but to our knowledge there are
   no major commercial services deployed by more than one major vendor
   using the current WebRTC API. Yet, the argument against even
   considering an alternative is that 'it's too late'. Imagine trying to
   argue for fixing it after major networks are reliant upon specific
   browser implementations. Having a good but simple architecture from
   the start could alleviate a lot of pressure to fix a broken 1.0 in a
   2.0 release before APIs become entrenched.

3.1.1. Cost now to fix versus fixing later

   The cost of fixing the API issues today may pale in comparison to the
   cost of compatibility problems spread across entire sets of
   industries where constant fixes and workarounds may be required.

3.1.2. If starting over, would even SIP people want SDP as a surface
       API?

   Even SIP providers and vendors have started to realize that baking
   SDP into the browser is not necessarily in their best interests, but
   they do have an interest in a simple API to use since they aren't
   specialized JavaScript developers but SIP integrators.

   If an alternative approach provides SIP providers the simple
   JavaScript API shim they desire and achieves greater interoperability
   because of predictable, controllable and tailored SDP for their
   network, would they not prefer such a model over the current "baked
   in the browser" approach?

   If the current WebRTC specification were ever rebooted, the current
   mandated SDP based API would undoubtedly be scrapped in favor of a
   better approach without its inherent design and use case flaws and
   their negative long term compatibility consequences.

3.1.3. Incremental Approach may make Compatibility Worse

   One argument put forward to keep the current SDP model proposes that
   the current WebRTC SDP-based API must be completed soon and that an
   incremental improvement approach can be used to gradually move away
   from these obvious problems.

   The trouble with an incremental approach is that it may increase
   incompatibility further. Not all browser vendors will match the
   incremental improvements in unison nor will all customers upgrade
   simultaneously. This puts the onus on JavaScript developers to
   support multiple versions of the WebRTC API and increases the number
   of APIs they must learn and maintain. The JavaScript developers must
   still perform all the workarounds required for the current API even
   if they support the increments. This limits their willingness to use
   any additional APIs until all browsers universally support the
   incremental improvements. This will likely slow innovation and
   adoption of future improvements.

   This will likely create a situation where browser vendors cannot
   easily achieve compliance because they too must support the existing
   API and incremental improvements along the way, or break those
   reliant upon the current methods.

   Having a good solid simple foundation is key to ensuring basic
   compatibility while allowing for innovation to occur for those
   developers who are willing to give new APIs a trial without needing
   to support multiple sets of equivalent but incompatible APIs
   simultaneously.

3.2. Session Description Format Construction API

   An alternative JavaScript model, other than the model advocated in
   this draft, has been floated in the past. That model creates a
   JavaScript session description format construction API in the
   browser.
   Such an API would use JavaScript objects to construct the session
   description format rather than allowing direct control of how media
   should be plumbed together from JavaScript.

   While using SDP as the chosen format for WebRTC highlights the issues
   described in this draft particularly well, using an alternative
   format like JSON instead of SDP does not remove many of the issues
   presented in this draft. The issues expressed are not solely caused
   by the lack of expressiveness of the SDP format. The root of the
   issue is the nature of creating a universal all-encompassing format
   to describe all transport, media, constraints, and negotiations with
   an attached inflexible state machine. This format must do everything
   and encompass all concepts, and it becomes the effective mandate for
   signaling even if not explicitly required to perform signaling.

   A few years ago there was an attempt to create a new "SDP 2.0" format
   with a draft named Session Description and Capability Negotiation
   [I-D.ietf-mmusic-sdpng]. This effort to create the "ultimate" SDP
   format in XML was ultimately abandoned, in no small part because of
   the difficulties in coming up with a single solution that works for
   all scenarios.

   Given the difficulty in creating a universal all-encompassing format
   that works for all scenarios, the idea of creating a JavaScript
   based API that constructs a similarly flexible but well defined
   universal session description format using JavaScript objects is
   highly likely to fail in the same way. The reality is that such an
   effort is complex.

   Even if successful, this format is not necessarily the format that
   will be sent on-the-wire, especially for existing alternative
   signaling protocols. As such, the format will still need to be
   transformed into alternative formats by JavaScript (or by a gateway).
   If the format must be parsed or interpreted by an intermediary then
   the format becomes an interaction point to the browser no matter how
   clever the JavaScript session description construction API
   implementation. Whatever format is selected, each browser or
   alternative protocol format will have to decide how to convert and
   interpret the output, generate new compatible inputs, and deal with
   the variations that will undoubtedly arrive from browser to browser
   and from version to version.

   Even if JavaScript APIs are made available to simplify the
   construction or interpretation of a defined format, this format would
   still become a do-everything serialization access point for the
   browser and the defined exchange point for the local and remote
   browser. Therefore the format itself must be described in meticulous
   detail.

   The standardization requirements for such an approach would increase
   substantially over the WebRTC JavaScript object model advocated by
   this draft since not only would such a JavaScript format construction
   API have to be standardized (as any JavaScript model would) but the
   formatting rules and state machine it relies upon would need to be
   standardized in detail as well.

   Every combination of this all-encompassing format would need to be
   outlined, rather than the minimal definition of fixed properties
   needed on scoped objects as used in the WebRTC JavaScript object
   model. Any slight variations would likely cause JavaScript
   developers or other browsers to break their implementations.
   Obtaining 100% stability in such an output equally across all
   browsers, on all platforms with all versions is highly doubtful.

   While a JavaScript format construction API is merely hypothetical at
   the time of writing this draft, any proposal will need to be vetted
   to see if it addresses all the concerns and issues brought up in this
   draft.

   This hypothetical JavaScript session description construction API
   still puts the emphasis on driving the developer towards building up
   a media signaling exchange format rather than on the logic of how the
   media should be controlled and pipelined.

   The WebRTC JavaScript object model is being proposed as the
   alternative. A follow-up to this draft will describe how the
   JavaScript developer gains control over the stream's pipelining for
   the browser's media/RTC engine, thus freeing the JavaScript developer
   to express signaling and state machines using whatever mechanism is
   desired. A simplified shim implemented entirely in JavaScript will
   allow easier translation to any format desired by the JavaScript
   developer in a way that can be updated independently of a browser's
   binary release. Should any changes be needed in signaling, a
   JavaScript shim generating this custom format is strictly under the
   control of the service provider and not the browser.

4. Example Difficult Usage Cases with Current Model

4.1. On / off hold example usage case

   This scenario uses the widely adopted SIP technique of setting an SDP
   attribute to place a stream on / off hold. This is the accepted
   methodology and performing alternative approaches would deviate from
   the expected practices for use with SIP and its manipulation of SDP.
   Although not officially documented as supported, it is effectively
   supported in WebRTC implementations. This is a typical use case
   needed by media applications:

   1. Browser A establishes a connection with Browser B

   2. Browser A and Browser B are streaming media

   3. JavaScript developer wants Browser A to place the stream "on
      hold"

   These are the steps that must be performed by a JavaScript developer:

   1. createOffer to obtain the SDP from Browser A

   2. Parse the SDP

   3. Add "a=sendonly" or "a=inactive" to all media

   4. Regenerate the SDP, feed back to browser

   5. Send the SDP to Browser B

   6. Receive the answer from Browser B (which should respond with
      a=recvonly if it still wishes media)

   7. Parse the received SDP and modify with "a=recvonly" if it did not
      respond correctly (to ensure the local side holds back its media)

   8. Pass the modified SDP answer back into Browser A

   This also implies that:

   1. All future SDP events received from Browser B must be mangled to
      ensure the "sendonly/recvonly/inactive" attribute is maintained
      while on hold

   2. All future createOffer/createAnswer calls from Browser A must be
      modified to ensure the "sendonly" property is maintained

   3. We need to handle alternative formats to describe hold, e.g.
      "c=0.0.0.0" from Browser B which may not utilize the latest SDP
      specifications depending on the remote device / platform

   Ironically, hold is a very SIP and telephony specific concept. The
   better approach would be to allow the streams to be paused / unpaused
   at will, as that does not require interaction with the SDP, and allow
   the higher layers to signal the desire to pause the session to the
   remote peer in whatever manner desired.

   This is a very basic use case that is extremely complex for a
   JavaScript developer, yet SDP manipulation via the "SDP surface API"
   is the only way to perform this particular action, which is
   effectively supported by the browsers. Even if this particular use
   case ends up being exposed by the browser as a JavaScript method that
   manipulates the SDP, there are countless other scenarios where
   tweaking a field to modify the behavior in the format will only be
   available via SDP manipulation.

4.2. One-Sided Constraints Negotiation use Case Scenario

   As WebRTC is a web API and not a SIP API, the API must be capable of
   allowing for alternative signaling methods without enforcing its own
   signaling aspects (other than basic principles like ensuring ICE
   agreement has been achieved for security reasons).

   Consider the following scenario:

   1. Browser A and Browser B establish a connection

   2. Browser A and Browser B use one-sided constraints negotiation
      where each party independently decides what "it expects to
      receive"

   3. Browser A decides that it wishes to alter the properties of the
      video it expects to receive

   With this model, Browser A must be capable of independently modifying
   its expectations without waiting for an answer from the remote side,
   which is illegal by the nature of offer / answer signaling unless
   the rules are relaxed and special exceptions are made. For the model
   to work, Browser A's receive constraints must be applied to the send
   constraints of the remote peer. This model does not require an SDP
   offer / answer exchange since the sending peer can monitor the
   expectations of the receiving peer and set its send constraints as
   appropriate.

   To achieve this for one-sided negotiation:

   1. Browser A's JavaScript must respond to every SDP offer with an
      answer locally generated from JavaScript without a round trip,
      extracting the last known expectations from the remote SDP last
      received as part of the answer

   2. The JavaScript must update the constraint signaling for the
      remote party

   3. Browser B's JavaScript sees the constraints have changed from
      Browser A thus it initiates a fake offer from the remote party
      (generating the intentions of the constraint and generating an
      SDP format)

   4. Browser B's JavaScript must examine the answer to see if any
      constraints have changed, and if so, it may trigger another
      reverse situation where step 1 is repeated, except with Browser A
      and B's roles reversed.

   Is this really doable? Maybe, with a great deal of difficulty and
   SDP mangling, but it is unquestionably a hack and a violation of
   offer / answer (and relaxed rules create exceptions and exceptions
   require additional logic to handle). The offer / answer rules are
   violated because no round trip was performed at the time when the
   constraints were changed.

   This is also fragile because if Browser B failed to accept the fake
   offer there is no way to enforce the constraint nor can the
   JavaScript roll back the expected constraint. Likewise if the state
   machine in Browser A expected an offer to be generated before a new
   offer would be accepted, the conflict resolution process would be
   extremely difficult and messy.

   This offer / answer state machine is not even required to fulfill the
   mandate of the RTCWEB Working Group charter but it is currently
   mandated because it supposedly makes producing "SIP interoperability"
   easier (which is highly suspect at best).

   A JavaScript shim approach on a WebRTC JavaScript object model and
   without offer / answer could achieve the same (or better) "SIP
   interoperability" without breaking other stateless negotiation
   models, such as one-sided negotiation.

4.3. Meet-me Negotiation Use Case Scenario

   1. WebRTC client A generates an offer and sends it to a server

   2. WebRTC client B generates an offer and sends it to a server

   3. WebRTC client C generates an offer and sends it to a server

   4. The server returns all the exchanged offers to each of these
      clients simultaneously

   5. WebRTC client A, B and C interconnect

   Technically, there is no need for independent SDP offer / answer
   negotiation amongst all these peers to achieve a mesh scenario for
   this use case. Each client has enough information about the other
   clients to establish a peer connection. The current WebRTC SDP API
   imposes independent round trip negotiations that are not technically
   necessary. If WebRTC client D were added later, the original
   connection could be forked and re-use the same DTLS fingerprints to
   negotiate new encryption keys for media or data. Fingerprint or
   identity signature reuse should not introduce any additional security
   concerns since identities will be verified and keys negotiated for
   each peer-to-peer connection.

   A JavaScript object model approach would allow for this kind of
   scenario without independent round trip negotiations for each WebRTC
   client in the mesh.

4.4. Browser to Browser Compatibility Extension Compatibility Issue
     Scenario

   Consider the following scenario:

   1. Browser A has implemented an extension to SDP (which is allowed)

   2. Browser B has no knowledge of such an extension

   3. The JavaScript engine running on Browser A has no knowledge of
      the extension

   4. The JavaScript engine packages up the SDP from Browser A and
      sends it to Browser B

   Under this scenario, what should Browser B do? To reject the offer
   means communication cannot occur. To accept the offer has ambiguous
   meaning because the answerer might have misunderstood the extension's
   intention and might not apply the appropriate behavior.

   The exact rules of what is allowed in SDP and what is not and how
   extensions are treated must be defined clearly and unambiguously.
   Even though the current SDP offer / answer API can deal with some
   extensions, like new codecs being introduced, it is ambiguous on how
   to deal with more major extensions such as new SDP profiles,
   transports, or encryption methods.

   It is not acceptable to assume that a lack of response to an
   extension means non-agreement to use the extension. For example, if
   the extension was security related, dictating some security
   precondition to opening a stream, the offer must be rejected as the
   precondition cannot be met. Ignoring the extension would mean the
   offer was accepted where it cannot be accepted. Another example
   would be the introduction of a new SDP profile, like AVPF2. Offer /
   answer negotiation simply fails when it encounters an unknown profile
   even if it is backwards compatible. For instance, most calls to
   current SIP devices will fail if AVPF is used instead of AVP. A
   better approach is to define the rules for how extensions can be
   made, whereas SDP has no such rules.

   Currently, in SIP networks, such extensions are agreed upon in
   advance and extensively tested before they are introduced. SBCs
   (Session Border Controllers) are often used to make devices with
   different feature sets work with each other. By allowing JavaScript
   control over the format generated on the wire, feature roll out is
   under strict control of the provider, and does not happen whenever a
   browser vendor decides to produce an update.

4.5. Building Interoperability between WebRTC and a SIP Service
     Scenario

   Consider the following scenario:

   1. Developer takes SDP produced by the browser and sends it to a SIP
      gateway (which is supposed to be SIP "compatible")

   2. Users happily use this service

   3. Browser Vendor A updates the browser SDP generator and a slight
      variation in SDP results

   4. The service is now broken for users

   5. SIP gateway must be updated to handle new SDP (and old SDP)

   6. Browser Vendor B updates their browser SDP generator (with a
      different SDP variation)

   7. The service is now broken for users again

   8. SIP gateway must be updated to handle another variation of SDP
      (and maintain the old variations)

   9. Repeat from step 3, but add Browser Vendor C, D and multiple
      platforms

   This is not an unrealistic scenario by any stretch of the
   imagination. This currently happens in the SIP world, but at least
   in that world new devices are tested to ensure compatibility before
   roll outs occur on the network so issues can be addressed before the
   user's experience is broken. Since the SIP provider and gateway
   vendor do not have control over the update cycle of the browsers,
   their users are much more prone to breakage when the SDP is taken
   from the browser and sent to their network.

   In contrast, this is what happens with a JavaScript Object API model
   with an SDP shim written in JavaScript only:

   1. Developer uses the shim to generate SDP in JavaScript and sends
      it to the SIP gateway (with SDP that is compatible)

   2. Users happily use this service

   3. Browser Vendor A updates the browser with a new RTC feature.

   4. Repeat from step 2

   The browser update does not affect the gateway because the SDP is
   generated entirely in JavaScript and thus updates to the browser do
   not change the SDP generation logic. The SDP is entirely in control
   of the SIP network provider. Any bugs with SDP compatibility can be
   addressed by the SIP provider without changes in the browser's
   binary. Bugs, updates and improvements are completely within the
   boundary and control of the SIP network provider.

4.6. Bit-rate Change Scenario

   Consider the following scenario:

   1. User is connected to a conference server

   2. While the user is listening, the user transmits at a low bit-rate

   3. The user starts to communicate and the bit-rate is adjusted to
      maximum quality

   Using the current WebRTC API, this would require an offer / answer
   round trip to perform the change and thus the quality would not be
   updated until the answer was acknowledged, although proposals have
   been made to alter the rules for offer / answer in this case and
   allow for an exception. This round trip is technically unnecessary
   since the bit-rate can be dynamically adjusted without remote
   acknowledgment. Yet, the current offer / answer model imposes a
   round trip (unless yet another exception to the SDP rules is
   adopted).

4.7. Video Codec Option Change Scenario

   Consider the following scenario:

   1. JavaScript wishes to change a video codec option

   Using the current WebRTC API, this would require parsing the entire
   SDP, isolating the video codecs for a particular video media line,
   figuring out the mapping and then reconstructing the original SDP
   with the newly incorporated changes. Accessors have been suggested
   for these common use cases but do not exist yet. If such accessors
   are created then a more involved API cannot be avoided. Yet one of
   the main justifications given by SDP proponents for only having an
   API that creates and accepts SDP is its supposed simplicity, as
   opposed to providing a more involved API.

4.8. Video Upgrade Scenario

   1. Alice and Bob are having an audio conversation

   2. Alice presses the video button on her application and offers Bob
      video

   3. Bob does not wish to see Alice's video, so the application
      rejects the media (e.g. using "a=inactive" or "m=video 0")

   4. Alice's web application successfully parses and interprets Bob's
      rejection

   5. As Alice's video window of herself is independent of the SDP
      negotiation, Alice's HTML5 application successfully renders
      Alice's video locally

   The current WebRTC implementation offers no event to indicate the
   rejection, thus Alice is given no feedback of the rejection. She
   incorrectly assumes she's in a video conversation. In order to solve
   this scenario, custom signaling must be added to indicate Bob's
   rejection of Alice's video. Yet this is duplication of signaling as
   the video is already rejected in the SDP. This leaves the JavaScript
   developer with a choice: either parse the SDP, understand the SDP and
   derive meaning, or duplicate the SDP efforts by introducing custom
   signaling for a common scenario when upgrading from audio to video
   and providing appropriate user feedback.

5. Proposal: WebRTC JavaScript Object Model

5.1. Overview

   The browser can expose simple object methods, properties and events
   representing the various RTC components at an abstracted level and
   provide a solid API for controlling how the media should be
   pipelined. The properties that need to be exchanged are separated
   into the appropriate objects rather than meshed into an
   all-encompassing format.

   A JavaScript-only shim can be layered on top of an object model to
   provide easy SDP offer / answer capability for those who want a
   similar "simple" API to the current WebRTC API for use with SIP. A
   developer can choose not to use this shim if they do not need SDP.
   Likewise, the object model could be used to produce alternative
   formats to SDP if the same do-everything format is needed but in an
   alternative on-the-wire session description format.

   The object model described in the solution is presented in a related
   draft. This solution will allow for the RTCWEB Working Group to
   complete its chartered mandate without starting from scratch.
   If adopted, all of the drafts proposed to solve issues in expressing
   SDP for WebRTC can be moved to more appropriate working groups. For
   example, SDP for SIP issues can be moved to the appropriate SIP
   working groups and multi-party SDP to the MMUSIC Working Group (e.g.
   drafts like PlanA or PlanB).

5.2. Benefits

5.2.1. Greater compatibility

   By having a WebRTC JavaScript object model, the exact inputs,
   outputs, properties and events can be well defined on individual
   objects and each object will be designed to be a specific contract
   between browser vendors and JavaScript developers.

5.2.2. Easier to extend

   New objects and methods can be added without breaking existing
   compatibility. Compliance can be verified with unit tests able to
   test each and every behavior across all browsers' versions on every
   platform. JavaScript developers can expect their version of the API
   object contract to remain fixed to expected behaviors and not break
   (unless through well planned deprecation).

   Any extensions added to a JavaScript object model do not change the
   behavior expected by JavaScript developers when using the current
   version of the API, unless something is explicitly deprecated. This
   is unlike SDP where extensions could be silently added into the SDP
   produced by the browsers at will, even in minor browser version
   changes, where any component that consumes the SDP may be unaware
   what those additional feature behaviors imply or require as a
   result.

5.2.3. Faster Reaction Time To Issues

   Signaling related bugs produced by the JavaScript shims can easily be
   fixed and updated at any time regardless of the browser's release
   cycle.
   If a SIP provider discovers an incompatibility between their SIP
   network and their JavaScript shim, the SIP provider can update the
   shim code to their own needs dynamically without lobbying the
   browser vendor and waiting for the browser to be patched and
   updated.

5.2.4. Decreased surface API

   With a JavaScript object model, the features are well defined so
   the surface API is fixed to the agreed contract. Once agreed, a
   browser vendor only has to ensure compatibility through well
   defined, limited scope unit tests, and need not worry about some
   free-form format that may introduce untold compatibility issues
   should another vendor issue an update. This is also true of any
   non-browser implementations that wish to implement and comply with
   the WebRTC API for JavaScript and provide their own JavaScript and
   WebRTC engines.

5.2.5. Greater compatibility for SIP

   While SIP is not the main chartered responsibility of the RTCWEB
   Working Group, SIP compatibility is highly desirable. By
   exclusively generating SDP from a JavaScript shim, the SDP produced
   will be identical across all platforms and all devices with every
   browser version and entirely under the control of the SIP provider.
   This increases compatibility for SIP providers. The SDP produced
   from the shim can be custom tailored to a SIP network without
   affecting any other SIP vendor or harming compatibility with others
   utilizing WebRTC.

5.2.6. Alternative formats

   With a JavaScript shim approach on top of an object model, the
   information going over the wire can be transformed from the
   JavaScript object properties to alternative formats, including
   JSON, XML or SIP (or anything custom).
   As the JavaScript shim is under the control of the service provider
   and identical regardless of the platform, the output from the
   JavaScript format generation is consistent and controllable, thus
   ensuring maximum compatibility within a network.

   The party receiving this format can be sure it conforms to an exact
   specification of their choosing rather than relying on whatever
   format a given browser vendor produces.

5.3. Design Goals and Considerations

5.3.1. Object Model Kept Simple

   The JavaScript developer should not need to understand the
   mechanics of RTC other than understanding how to plumb the objects
   together. Those who need extended properties or events for finer
   control can obtain them with simple method access to an object, but
   those extended attributes should not be required for simple use
   cases.

5.3.2. Simple to Gather Negotiation Information

   The object model should provide a simple way to collect, per
   object, the information needed for the various alternative
   negotiation models. SDP and SIP must be among the negotiation
   targets.

5.3.3. Offer / Answer

   The proposed JavaScript object model should not require the offer /
   answer state machine but must not preclude this state machine being
   built in a layer above. The offer / answer state machine must be
   possible to implement as a JavaScript shim without any additional
   built-in browser services needing to be implemented.

5.3.4. Extensions

   Extending the object model for the expected common extension use
   cases without breaking the JavaScript API should be possible. Such
   possible extension use cases should include items like local mixing
   and data synchronization, or extended properties, events or
   features.
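
   The compatibility expectation in this section can be illustrated
   with a small sketch. It is not taken from the related draft:
   createMediaSender, withLocalMixing and every member name below are
   hypothetical, chosen only to show that code written against a base
   object keeps behaving the same once an extension adds new members.

```javascript
// Hypothetical base object; none of these names come from the draft.
function createMediaSender() {
  const sender = {
    state: "idle",
    onstatechange: null,
    start() { setState("sending"); },
    stop() { setState("idle"); },
  };
  // Updates the state and notifies the registered listener, if any.
  function setState(next) {
    sender.state = next;
    if (typeof sender.onstatechange === "function") {
      sender.onstatechange(next);
    }
  }
  return sender;
}

// Hypothetical extension: adds local mixing members to the same object
// without altering any existing member.
function withLocalMixing(sender) {
  const mixedSources = [];
  sender.addMixSource = (label) => {
    mixedSources.push(label);
    return mixedSources.length;
  };
  sender.getMixSources = () => mixedSources.slice();
  return sender;
}

// Application code written against the base object only.
function runBaseApplication(sender) {
  const seen = [];
  sender.onstatechange = (state) => seen.push(state);
  sender.start();
  sender.stop();
  return seen;
}

const plainResult = runBaseApplication(createMediaSender());
const extendedResult = runBaseApplication(
  withLocalMixing(createMediaSender())
);
```

   In this sketch both runs observe the same sequence of state
   changes, and an application that wants the extension can detect it
   by checking whether addMixSource is a function before calling it.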

   As with any design, there may be limitations, but the design should
   hold up to various realistic scenarios that are likely to happen in
   the near future.

5.3.5. Well Defined Behaviors

   An API must describe specific API behavior sets to the browser
   vendors so they have the appropriate guidelines for implementation,
   including the mapping to on-the-wire RTC protocols. The API
   presented in the related draft may be the input to a W3C effort to
   define specific and exact expected behavior sets for an object
   based JavaScript API for an official WebRTC 1.0 release.

5.3.6. Data Channel

   The proposed WebRTC JavaScript Object model will provide a
   definition for basic JavaScript usage of the data channel.

5.3.7. Satisfy the expectations of the RTCWEB charter

   The object model must adhere to the expectations of the RTCWEB
   charter, either directly, via extensions that can be defined by the
   working group on top of the object model, or possibly via a
   JavaScript shim written to utilize the functionality of the object
   model. It must not preclude the RTCWEB Working Group from
   fulfilling the previously stated goals of its charter.

5.3.8. SIP/SDP and current WebRTC API shim compatibility statement

   The goal of the object model is to allow for a JavaScript shim that
   provides a simple mechanism for parsing and generating SDP for
   basic compatibility with SIP networks (capable of supporting the
   WebRTC wire protocols).

   The goal of this object based model is not to provide a working
   JavaScript shim that matches the current WebRTC API 1-for-1,
   including all behaviors, features, bugs and expectations, since the
   current approach is not defined well enough to produce that level
   of compatibility. This would be an impossible goal as a result, and
   would add little value.
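
   As a rough illustration of the shim idea in this section, the
   following sketch serializes one hypothetical object-model snapshot
   to both JSON and a deliberately minimal SDP fragment, then reads
   the codec list back. The property names, the toJson, toSdp and
   codecsFromSdp helpers, and the choice of RTP/SAVPF are assumptions
   made for the example; the output is far from a complete or
   interoperable session description.

```javascript
// Hypothetical object-model snapshot; every property name is assumed.
const audioDescription = {
  kind: "audio",
  port: 9,
  codecs: [
    { payloadType: 111, name: "opus", clockRate: 48000, channels: 2 },
    { payloadType: 0, name: "PCMU", clockRate: 8000, channels: 1 },
  ],
};

// Alternative wire format: plain JSON of the same properties.
function toJson(desc) {
  return JSON.stringify(desc);
}

// Generates a minimal SDP fragment (one media section, rtpmap only).
function toSdp(desc, sessionId) {
  const payloadTypes = desc.codecs.map((c) => c.payloadType).join(" ");
  const lines = [
    "v=0",
    `o=- ${sessionId} 1 IN IP4 0.0.0.0`,
    "s=-",
    "t=0 0",
    `m=${desc.kind} ${desc.port} RTP/SAVPF ${payloadTypes}`,
    "c=IN IP4 0.0.0.0",
  ];
  for (const c of desc.codecs) {
    // The channel count is only written when it is greater than one.
    const channels = c.channels > 1 ? `/${c.channels}` : "";
    lines.push(
      `a=rtpmap:${c.payloadType} ${c.name}/${c.clockRate}${channels}`
    );
  }
  return lines.join("\r\n") + "\r\n";
}

// Parses the rtpmap attributes back into codec objects.
function codecsFromSdp(sdp) {
  const codecs = [];
  for (const line of sdp.split("\r\n")) {
    const m = /^a=rtpmap:(\d+) ([^/]+)\/(\d+)(?:\/(\d+))?$/.exec(line);
    if (m) {
      codecs.push({
        payloadType: Number(m[1]),
        name: m[2],
        clockRate: Number(m[3]),
        channels: m[4] ? Number(m[4]) : 1,
      });
    }
  }
  return codecs;
}
```

   Because both serializers read the same object properties, a service
   provider could swap toSdp for toJson (or any custom format) without
   touching the code that configures the underlying objects.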

   Extensions are beyond the scope of the JavaScript shim, but it is
   possible for others to fork and modify the shim to their own needs
   specific to their own SIP/SDP network infrastructure.

   Compatibility with the SDP used in all SIP networks is not a stated
   goal for any JavaScript shim, since not even SIP providers can
   agree on a definitive common set of RFCs and drafts.

5.3.9. Greater Separation of RTCWEB Working Group and Other Working
       Groups

   A JavaScript object model would remove much of the need for
   coordination across IETF working groups, which has become
   commonplace with the current approach because of its use of SDP and
   SDP's close ties to SIP. By limiting the RTCWEB technologies used
   to only those required for Real-Time Communication from the browser
   (e.g. RTP, ICE/STUN/TURN, DTLS), the RTCWEB Working Group is freed
   from tight couplings with other IETF working groups, each having
   their own charters, schedules, agendas and interests. This ensures
   more rapid progress among the RTCWEB Working Group, the W3C and the
   developers who are to use this technology.

6. Security Considerations

   While RTCWEB has its own security considerations for protocols, a
   JavaScript object model has no additional requirements other than
   those already established for use within RTCWEB, e.g. ICE
   connectivity permission checks or DTLS fingerprint checks.

   JavaScript as a browser language has its own security
   considerations, but none are inherent to using a JavaScript object
   model versus a JavaScript SDP API model, as any proposed
   implementation must have a JavaScript API. Any API specification
   must list the security considerations specific to its defined model
   and API, should any exist.

   Any specific issues for the proposed JavaScript object model will
   be outlined in the separate WebRTC JavaScript object model draft as
   needed and warranted.

7. References

7.1. Normative References

   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
              with Session Description Protocol (SDP)", RFC 3264, June
              2002.

   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
              Description Protocol", RFC 4566, July 2006.

7.2. Informative References

   [I-D.ietf-mmusic-sdpng]
              Kutscher, D., Ott, J., and C. Bormann, "Session
              Description and Capability Negotiation",
              draft-ietf-mmusic-sdpng-08 (work in progress), February
              2005.

   [I-D.ivov-rtcweb-noplan]
              Ivov, E., Marocco, E., and P. Thatcher, "No Plan:
              Economical Use of the Offer/Answer Model in WebRTC
              Sessions with Multiple Media Sources",
              draft-ivov-rtcweb-noplan-01 (work in progress), June
              2013.

   [I-D.roach-rtcweb-plan-a]
              Roach, A. and M. Thomson, "Using SDP with Large Numbers
              of Media Flows", draft-roach-rtcweb-plan-a-00 (work in
              progress), May 2013.

   [I-D.uberti-rtcweb-plan]
              Uberti, J., "Plan B: a proposal for signaling multiple
              media sources in WebRTC.", draft-uberti-rtcweb-plan-00
              (work in progress), May 2013.

   [MediaCapture]
              Burnett, D., "Media Capture and Streams", May 2013, .

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
              A., Peterson, J., Sparks, R., Handley, M., and E.
              Schooler, "SIP: Session Initiation Protocol", RFC 3261,
              June 2002.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC5245]  Rosenberg, J., "Interactive Connectivity Establishment
              (ICE): A Protocol for Network Address Translator (NAT)
              Traversal for Offer/Answer Protocols", RFC 5245, April
              2010.

   [RFC5389]  Rosenberg, J., Mahy, R., Matthews, P., and D. Wing,
              "Session Traversal Utilities for NAT (STUN)", RFC 5389,
              October 2008.

   [RFC5766]  Mahy, R., Matthews, P., and J. Rosenberg, "Traversal
              Using Relays around NAT (TURN): Relay Extensions to
              Session Traversal Utilities for NAT (STUN)", RFC 5766,
              April 2010.

   [RFC6347]  Rescorla, E. and N. Modadugu, "Datagram Transport Layer
              Security Version 1.2", RFC 6347, January 2012.

   [WebRTC10]
              Bergkvist, A., "WebRTC 1.0 Real-time Communication
              Between Browsers", August 2012, .

Authors' Addresses

   Robin Raymond
   Hookflash
   436, 3553 31 St. NW
   Calgary, Alberta T2L 2K7

   Email: robin@hookflash.com

   Erik Lagerway
   Hookflash
   436, 3553 31 St. NW
   Calgary, Alberta T2L 2K7
   Canada

   Email: erik@hookflash.com

   Inaki Baz Castillo
   Versatica
   Barakaldo
   Basque Country
   Spain

   Email: ibc@aliax.net

   Roman Shpount
   TurboBridge
   4905 Del Ray Ave Suite 300
   Bethesda, MD 20814
   USA

   Email: rshpount@turbobridge.com