idnits 2.17.1
draft-raymond-rtcweb-webrtc-js-obj-api-rationale-01.txt:
Checking boilerplate required by RFC 5378 and the IETF Trust (see
https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------
** The document seems to lack an IANA Considerations section. (See Section
2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
when there are no actions for IANA.)
** The abstract seems to contain references ([RFC4566], [RFC3264]), which
it shouldn't. Please replace those with straight textual mentions of the
documents in question.
Miscellaneous warnings:
----------------------------------------------------------------------------
== The copyright year in the IETF Trust and authors Copyright Line does not
match the current year
== The document doesn't use any RFC 2119 keywords, yet seems to have RFC
2119 boilerplate text.
-- The document date (July 06, 2013) is 3940 days in the past. Is this
intentional?
Checking references for intended status: Informational
----------------------------------------------------------------------------
== Unused Reference: 'MediaCapture' is defined on line 1441, but no
explicit reference was found in the text
== Unused Reference: 'WebRTC10' is defined on line 1473, but no explicit
reference was found in the text
** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866)
-- Obsolete informational reference (is this intentional?): RFC 5245
(Obsoleted by RFC 8445, RFC 8839)
-- Obsolete informational reference (is this intentional?): RFC 5389
(Obsoleted by RFC 8489)
-- Obsolete informational reference (is this intentional?): RFC 5766
(Obsoleted by RFC 8656)
-- Obsolete informational reference (is this intentional?): RFC 6347
(Obsoleted by RFC 9147)
Summary: 3 errors (**), 0 flaws (~~), 4 warnings (==), 5 comments (--).
Run idnits with the --verbose option for more detailed information about
the items above.
--------------------------------------------------------------------------------
2 Network Working Group R. Raymond
3 Internet-Draft E. Lagerway
4 Intended status: Informational Hookflash
5 Expires: January 07, 2014 I. Baz Castillo
6 Versatica
7 R. Shpount
8 TurboBridge
9 July 06, 2013
11 WebRTC JavaScript Object API Rationale
12 draft-raymond-rtcweb-webrtc-js-obj-api-rationale-01
14 Abstract
16 This document describes the reasons why a JavaScript Object Model
17 approach is a far better solution than using SDP [RFC4566] as a
18 surface API for interfacing with WebRTC. The document outlines the
19 issues and pitfalls as well as use cases that are difficult (or
20 impossible) with SDP with offer / answer [RFC3264], and explains the
21 benefits and goals of an alternative JavaScript object model
22 approach.
24 Status of This Memo
26 This Internet-Draft is submitted in full conformance with the
27 provisions of BCP 78 and BCP 79.
29 Internet-Drafts are working documents of the Internet Engineering
30 Task Force (IETF). Note that other groups may also distribute
31 working documents as Internet-Drafts. The list of current Internet-
32 Drafts is at http://datatracker.ietf.org/drafts/current/.
34 Internet-Drafts are draft documents valid for a maximum of six months
35 and may be updated, replaced, or obsoleted by other documents at any
36 time. It is inappropriate to use Internet-Drafts as reference
37 material or to cite them other than as "work in progress."
39 This Internet-Draft will expire on January 07, 2014.
41 Copyright Notice
43 Copyright (c) 2013 IETF Trust and the persons identified as the
44 document authors. All rights reserved.
46 This document is subject to BCP 78 and the IETF Trust's Legal
47 Provisions Relating to IETF Documents
48 (http://trustee.ietf.org/license-info) in effect on the date of
49 publication of this document. Please review these documents
50 carefully, as they describe your rights and restrictions with respect
51 to this document. Code Components extracted from this document must
52 include Simplified BSD License text as described in Section 4.e of
53 the Trust Legal Provisions and are provided without warranty as
54 described in the Simplified BSD License.
56 Table of Contents
58 1. Introduction
59 1.1. Terminology
60 2. Issues with a Universal Session Description Format (and Offer
61 / Answer)
62 2.1. Goal of Minimized Requirements
63 2.2. Offer / Answer State Machine
64 2.2.1. Offer / Answer Violations
65 2.3. Browser to Browser Format Compatibility Issue
66 2.4. Browser to JavaScript Compatibility Issues
67 2.5. SDP as a surface API for JavaScript developers
68 2.6. Is SDP allowed to be mangled?
69 2.7. SDP errata and bugs compatibility issues
70 2.7.1. SDP Bugs Become Enshrined
71 2.8. SIP/SDP compatibility worsened
72 2.9. Increased surface API
73 2.10. Impossible API to implement to achieve browser
74 compatibility
75 2.10.1. Example Oddities That Need Definition
76 2.11. Plan A, Plan B vs NoPlan
77 2.12. SIP Forking Issue
78 3. Alternatives to Fixing these Issues Now
79 3.1. Waiting for WebRTC 2.0
80 3.1.1. Cost now to fix versus fixing later
81 3.1.2. If starting over, would even SIP people want SDP as a
82 surface API?
83 3.1.3. Incremental Approach may make Compatibility Worse
84 3.2. Session Description Format Construction API
85 4. Example Difficult Usage Cases with Current Model
86 4.1. On / off hold example usage case
87 4.2. One-Sided Constraints Negotiation use Case Scenario
88 4.3. Meet-me Negotiation Use Case Scenario
89 4.4. Browser to Browser Compatibility Extension Compatibility
90 Issue Scenario
91 4.5. Building Interoperability between WebRTC and a SIP
92 Service Scenario
93 4.6. Bit-rate Change Scenario
94 4.7. Video Codec Option Change Scenario
95 4.8. Video Upgrade Scenario
96 5. Proposal: WebRTC JavaScript Object Model
97 5.1. Overview
98 5.2. Benefits
99 5.2.1. Greater compatibility
100 5.2.2. Easier to extend
101 5.2.3. Faster Reaction Time To Issues
102 5.2.4. Decreased surface API
103 5.2.5. Greater compatibility for SIP
104 5.2.6. Alternative formats
105 5.3. Design Goals and Considerations
106 5.3.1. Objects Model Kept Simple
107 5.3.2. Simple to Gather Negotiation Information
108 5.3.3. Offer / Answer
109 5.3.4. Extensions
110 5.3.5. Well Defined Behaviors
111 5.3.6. Data Channel
112 5.3.7. Satisfy the expectations of the RTCWEB charter
113 5.3.8. SIP/SDP and current WebRTC API shim compatibility
114 statement
115 5.3.9. Greater Separation of RTCWEB Working Group and Other
116 Working Groups
117 6. Security Considerations
118 7. References
119 7.1. Normative References
120 7.2. Informative References
121 Authors' Addresses
123 1. Introduction
125 While the IETF RTCWEB WG is not specifically tasked with providing an
126 API by the W3C, the group has effectively defined a surface API with
127 the mandate to use SDP [RFC4566] with offer / answer [RFC3264].
129 SDP is a condensed text based format that typically describes all of
130 the real-time media streams, networking properties, codecs, media
131 state and media attributes. SDP is completely extensible and can be
132 used to describe absolutely anything so long as it is formatted
133 correctly within its minimally defined limitations.
135 The arguments for mandating SDP with an offer / answer API typically
136 boil down to:
138 1. It's really easy to establish communication, especially with SIP
139 [RFC3261].
141 2. The decision was already made.
143 3. SDP yields greater compatibility (especially with SIP networks).
145 4. We must have some kind of universal exchange format.
147 5. There is no alternative to this approach except destroying
148 everything created and starting from scratch.
150 This document explains why these reasons are insufficient to justify
151 continuing with the SDP with offer / answer mandate. It presents
152 logical arguments and real-world scenarios where this approach fails,
153 and describes its lasting consequences (including negative
154 consequences for SIP).
156 The document highlights the benefits and goals for a different
157 "JavaScript Object Model" approach, which satisfies the RTCWEB WG
158 charter's requirements, yields greater compatibility and offers a
159 road-map where future potential extensions can be readily added
160 without breaking existing implementations.
162 A "JavaScript shim" is described including details on how it can
163 offer a wrapped API around a core WebRTC JavaScript Object Model.
164 This shim will provide the same level of "ease of use" as experienced
165 with the current SDP WebRTC API. However, this JavaScript shim is
166 not mandatory to use for those who do not require an "SDP with offer
167 / answer" model.
169 1.1. Terminology
171 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
172 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
173 document are to be interpreted as described in RFC 2119 [RFC2119].
175 2. Issues with a Universal Session Description Format (and Offer /
176 Answer)
178 The issue with SDP is not the expressiveness of the format but its
179 usage as an arbitrary universal format and an API surface instead of
180 providing JavaScript developers an object model they can readily
181 understand. JavaScript could be used to control the plumbing of
182 media objects using familiar JavaScript expressive concepts enshrined
183 with methods, properties and events. Today, in many real-world use
184 cases, controlling WebRTC requires modifying SDP directly.
186 Requiring JavaScript developers to serialize their API control
187 requests into a text format (via modifications of existing SDP blobs)
188 is only one aspect of the many issues the SDP approach creates for
189 developers. Needlessly, an offer / answer state machine is imposed
190 on JavaScript developers as well.
192 While the currently mandated SDP based API allows developers to
193 quickly implement basic calling demos and interoperability with some
194 SIP networks, it has many issues that will be explored and explained
195 in this document, including (but not limited to):
197 1. Defining a standard universal all-encompassing session
198 description format for use with WebRTC that describes all
199 connections, media, constraints, streams and tracks for all
200 scenarios is especially challenging.
202 2. Rather than focusing and defining the properties needed for
203 communication, the focus is put on the best way to express the
204 format where every nuance and behavior will need to be detailed
205 for any browser vendor to capably implement the SDP based WebRTC
206 specification.
208 3. The bar for browsers (or other applications with WebRTC engines)
209 to produce a WebRTC engine is raised substantially by forcing
210 the browser to implement an entire SDP offer / answer engine
211 too, with little to no added benefit.
213 4. A universal format built into the browser's API is entirely
214 unneeded and goes well beyond the RTCWEB chartered mandate for
215 the RTCWEB Working Group.
217 5. A flexible and extensible universal exchange format leads to
218 divergent interpretations and mistakes in various implementations,
219 which in turn leads to increased incompatibilities.
221 6. Given the format is entirely flexible and open to
222 interpretation, resulting implementations will more likely be
223 prone to errors relative to the other truly needed aspects of
224 RTC (which have better defined boundaries, behaviours, and
225 scope).
227 7. Mistakes in the format won't be fixed until a new browser binary
228 update is released and deployed amongst users.
230 8. Mistakes in implementation of the session description format can
231 become enshrined and difficult to deprecate (for the sake of
232 compatibility).
234 9. Compatibility issues caused by the format will not be limited to
235 browsers-only as many hybrid browser-engine based applications
236 now exist too.
238 10. Using alternative signaling formats will require complete
239 understanding of the universal format to be able to translate it
240 into other alternative signaling formats.
242 11. JavaScript (or proxies) will need to parse and rewrite the
243 output session description format with 100% precision and
244 without loss. They will also require pre-knowledge of what each
245 browser produces and expects, despite the likelihood of a
246 multitude of outputted flavors, on various platforms, and from
247 version to version and despite the inability to easily predict
248 or detect the variants.
250 12. JavaScript developers trying to control WebRTC features will
251 need to manipulate any defined universal format rather than
252 interacting with JavaScript objects.
254 13. Offer / answer is mandated and the state machine is required, but
255 the exact rules and violations of the rules are ill defined when
256 used within WebRTC.
258 14. The rules of how a universal format can be modified before being
259 delivered to remote parties need to be meticulously defined or
260 compatibility issues will arise (including the allowed rules of
261 post browser format regeneration as to what can be modified and
262 fed back into the browser).
264 15. Due to the issues defined above, SIP compatibility will worsen,
265 not strengthen.
267 An alternative to all of the issues caused by a universal format and
268 state machine is described later in the document. This alternative
269 allows JavaScript to control the behavior of the media engine's
270 plumbing while providing extensible and modifiable shims written
271 entirely in JavaScript that produce consistent signaling and exchange
272 formats for the specific network where those formats operate.
274 2.1. Goal of Minimized Requirements
276 While the primary goal of WebRTC is to enable browser to browser
277 communication, the definition of a "browser" is ever expanding.
278 Beyond just traditional hand-held applications, hybrid applications
279 that are part HTML-5 and part native code exist. Servers will become
280 as much a part of the WebRTC infrastructure as browsers. Minimizing
281 the requirements to the basic wire compatibility necessary to achieve
282 RTC is essential for maximum compatibility, flexibility and varying
283 usage scenarios.
285 The mandate for the RTCWEB charter is to simply define requirements,
286 provide basic "on-the-wire" compatibility, and define security
287 requirements (such as enforcing ICE connection agreements). The
288 RTCWEB charter goals have been exceeded by going well beyond that
289 scope by mandating an API that works fine for simple SIP
290 interoperability demos but does not provide easy compatibility to the
291 basic constructs needed as outlined from the charter for use with
292 other on-the-wire signaling protocols (other than SIP). If SIP is
293 the only end goal of the WG, then that goal must be specifically
294 stated rather than effectively mandated by making alternative
295 signaling approaches unreasonably difficult to achieve.
297 2.2. Offer / Answer State Machine
299 The current SDP approach requires an offer / answer state machine.
300 Mandating an offer / answer state machine implies that:
302 1. SDP be generated by browser A and sent to browser B
304 2. Browser B must respond to the offer with an answer
306 3. If either party issues a new offer but the offer is rejected, the
307 state must revert to the previous agreed SDP (or fail to none)
309 4. If one side receives an offer while the other side has an
310 outstanding offer, a conflict occurs and both sides must reject
311 and revert and perform SDP conflict resolution to issue an offer
312 again
314 5. Changes to the media are allowed only if both sides
315 agree
317 6. Any change required to the SDP requires a network round trip
318 where both sides mutually agree (at least as traditionally
319 defined in offer / answer but the rules are in flux)
321 This offer / answer model is defined as required with the current
322 implementation. Not only do the browser vendors have to enforce the
323 rules, all JavaScript authors must also adhere to these rules of
324 signaling. While WebRTC does not dictate the signaling mechanism
325 between browsers, effectively it is imposing this signaling state
326 machine on all implementations (which is not a mandate of the RTCWEB
327 Working Group).
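The offer / answer rules listed above can be illustrated with a
minimal sketch. It is not the WebRTC API: the class, method and
state names are hypothetical, session descriptions are reduced to
opaque strings, and conflict resolution is simplified to "discard
the outstanding offer and let the application retry".

```typescript
// Hypothetical sketch of an offer / answer state machine.
type SignalingState = "stable" | "have-local-offer" | "have-remote-offer";

class OfferAnswerMachine {
  state: SignalingState = "stable";
  agreed: string | null = null;          // last mutually agreed description
  private pending: string | null = null; // description under negotiation

  // Rule 1: one side generates an offer and sends it to the other.
  createOffer(description: string): void {
    if (this.state !== "stable") throw new Error("offer already outstanding");
    this.pending = description;
    this.state = "have-local-offer";
  }

  // Rule 4: an offer arriving while a local offer is outstanding is a
  // conflict; this side reverts and reports the conflict (false).
  receiveOffer(description: string): boolean {
    if (this.state === "have-local-offer") {
      this.rollback();
      return false;
    }
    if (this.state !== "stable") throw new Error("offer already outstanding");
    this.pending = description;
    this.state = "have-remote-offer";
    return true;
  }

  // Rules 2 and 5: only an answer turns the pending description into
  // the agreed one.
  accept(): void {
    if (this.state === "stable") throw new Error("no outstanding offer");
    this.agreed = this.pending;
    this.pending = null;
    this.state = "stable";
  }

  // Rule 3: a rejected offer reverts to the previously agreed description.
  rollback(): void {
    this.pending = null;
    this.state = "stable";
  }
}
```

Even in this reduced form, every change to the agreed description
passes through at least two calls separated by a network round trip,
and the application must be prepared for a conflict at any time.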
329 There are other models for signaling other than offer / answer. For
330 example, one-sided constraints based negotiation is an alternative
331 model. This type of negotiation requires each side to determine what
332 it wants to receive independent of the other. This signaling is akin
333 to saying "if you plan to send anything, make sure it conforms to the
334 following". Changes to the media may occur without agreement from
335 the remote party where each side decides what is acceptable to
336 receive without agreement from the other. The remote side can decide
337 if it wants to send within those constraints or not. There is no
338 round trip offer / answer required in this model to effect change.
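As an illustration of the one-sided model, the sketch below shows a
sender choosing what to transmit from the constraints the remote side
has announced. All type, field and function names are hypothetical
and chosen only for this example; how the constraints reach the
sender is left to the application's signaling.

```typescript
// Hypothetical description of what one side is willing to receive.
interface ReceiveConstraints {
  codecs: string[];        // acceptable codecs, most preferred first
  maxBitrateKbps: number;  // upper bound on the incoming bit rate
}

interface SendPlan {
  codec: string;
  bitrateKbps: number;
}

// The sender decides on its own what to send within the remote side's
// constraints. Returns null when it cannot send anything acceptable.
function planSend(
  remote: ReceiveConstraints,
  localCodecs: string[],
  desiredBitrateKbps: number
): SendPlan | null {
  for (const codec of remote.codecs) {
    if (localCodecs.indexOf(codec) !== -1) {
      return {
        codec: codec,
        bitrateKbps: Math.min(desiredBitrateKbps, remote.maxBitrateKbps),
      };
    }
  }
  return null;
}
```

When the receiver changes its constraints, it simply announces the
new values and the sender calls planSend() again; no answer is
awaited and no state has to be rolled back.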
340 Offer / answer introduces unnecessary asynchronism to the API and
341 JavaScript implementations. For example, changing the list of codecs
342 a party expects to receive, or the current sending codec, could be
343 done immediately without the need for asynchronous calls.
345 Offer / answer is not required to achieve RTC wire compatibility but
346 it is currently mandated when alternatives could exist.
348 2.2.1. Offer / Answer Violations
350 The offer / answer SDP state machine is already violated in WebRTC.
351 Trickle ICE precludes offer / answer round trips and other proposed
352 standards like NoPlan [I-D.ivov-rtcweb-noplan] suggest relaxing the
353 offer / answer model even more. What offer / answer means at this
354 point is undefined, and current practice clearly violates the strict
355 previous rules, without clear direction on what exactly constitutes
356 offer / answer anymore and where it should and should not be used.
358 A new state for offer / answer called PRANSWER is now defined, which
359 did not exist as part of the standard offer / answer state machine.
360 Offer rollback is not adequately defined either should an offer /
361 answer conflict occur.
363 Currently, switching codecs requires that an SDP offer / answer
364 perform a round trip even though it is not technically needed for an
365 RTC engine to change codecs. Should this be another exception to the
366 offer / answer state machine?
368 2.3. Browser to Browser Format Compatibility Issue
370 SDP is a flexible format, and it allows many alternative methods to
371 express the same intentions. The smallest change may alter the SDP's
372 meaning.
374 This creates parsing and SDP generation compatibility issues. If
375 SDP is packaged by JavaScript and delivered to the remote browser
376 then each browser must support every single possible variant of SDP
377 for every browser version and platform in existence. They must do
378 this without failure. To maximize compatibility, a browser should
379 generate the SDP format in the variant expected by the remote party
380 (despite not having sufficient knowledge about the remote party to
381 provide the correct SDP).
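One concrete case of "many alternative methods" is the stream
direction attribute, which SDP allows at either the session level or
the media level, with "sendrecv" assumed when it is absent. The
sketch below is a minimal illustration (the function name is
hypothetical and the inputs are SDP fragments, not complete session
descriptions) of the logic a consumer needs just to read this one
property reliably.

```typescript
type Direction = "sendrecv" | "sendonly" | "recvonly" | "inactive";
const DIRECTIONS: Direction[] = ["sendrecv", "sendonly", "recvonly", "inactive"];

// Returns the effective direction of each media section, in order.
function mediaDirections(sdp: string): Direction[] {
  let sessionLevel: Direction = "sendrecv"; // default when absent
  const perMedia: Direction[] = [];
  for (const line of sdp.split(/\r?\n/)) {
    if (line.indexOf("m=") === 0) {
      // A new media section inherits the session-level value until
      // a media-level attribute overrides it.
      perMedia.push(sessionLevel);
      continue;
    }
    for (const d of DIRECTIONS) {
      if (line === "a=" + d) {
        if (perMedia.length === 0) sessionLevel = d;
        else perMedia[perMedia.length - 1] = d;
      }
    }
  }
  return perMedia;
}

// Two fragments that express the same intention in different ways.
const sessionLevelForm = "v=0\r\na=sendonly\r\nm=audio 9 RTP/AVP 0\r\n";
const mediaLevelForm = "v=0\r\nm=audio 9 RTP/AVP 0\r\na=sendonly\r\n";
```

A parser that only looks for the attribute inside the media section
would misread the first fragment as "sendrecv", which is the kind of
divergence between implementations this section describes.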
383 2.4. Browser to JavaScript Compatibility Issues
385 Since WebRTC is not supposed to mandate the format on the wire for
386 signaling, one supported use case for WebRTC must be allowing the
387 browser generated SDP to be converted into alternative on-the-wire
388 formats. This SDP conversion may be performed by JavaScript in the
389 browser, or later by an intermediate gateway. In either case, the
390 converter must be entirely aware of all variants to the SDP possible
391 from every browser platform and version, despite browser version
392 detection being heavily frowned upon by industry best practices.
393 Likewise, the JavaScript or gateway must know how to generate the
394 correct SDP for all browsers and versions before passing the
395 serialized SDP blob into the browser. Generating compatible SDP may
396 be impossible unless the exact formats and restrictions are
397 unquestionably clear to all implementers of the specification (which
398 is anything but clearly described in the current WebRTC SDP based API
399 that developers are mandated to use).
401 2.5. SDP as a surface API for JavaScript developers
403 The current SDP based API is limited to placing a call and answering
404 a call and adding media. To perform common edge cases or to utilize
405 RTC features beyond the basic API typically requires SDP mangling.
407 Many of the operations from JavaScript to control or fetch properties
408 from RTC will be through serialization to / from the SDP instead of a
409 developer using familiar JavaScript language constructs (e.g. object
410 methods, structures, properties and events). The JavaScript
411 developer must learn an entirely new protocol called "SDP" and be
412 able to parse and generate not only basic SDP but any SDP extensions
413 without introducing a single compatibility issue.
415 For example, a JavaScript developer wants to hold / un-hold media
416 streams. The developer must use a widely adopted but hidden feature
417 to parse the SDP from the browser, change it to add the appropriate
418 "hold" state, send that hold state to the remote side, wait for the
419 "answer" to accept the hold, parse the result on the return to see if
420 the hold was accepted and feed the result to the browser.
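The text manipulation this involves might look like the sketch below.
It is a deliberately naive illustration, not a recommended
implementation: setHold is a hypothetical helper, the inputs are SDP
fragments, and it assumes the browser writes an explicit
"a=sendrecv" line for every stream.

```typescript
// Naive hold / un-hold by rewriting direction attributes in place.
// Hold is expressed by marking a "sendrecv" stream as "sendonly".
function setHold(sdp: string, hold: boolean): string {
  const from = hold ? "a=sendrecv" : "a=sendonly";
  const to = hold ? "a=sendonly" : "a=sendrecv";
  return sdp
    .split("\r\n")
    .map((line) => (line === from ? to : line))
    .join("\r\n");
}

// Works when the direction is explicit ...
const explicitOffer = "v=0\r\nm=audio 9 RTP/AVP 0\r\na=sendrecv\r\n";
const heldOffer = setHold(explicitOffer, true);

// ... but does nothing when the browser relies on the default
// direction and omits the attribute.
const implicitOffer = "v=0\r\nm=audio 9 RTP/AVP 0\r\n";
const notActuallyHeld = setHold(implicitOffer, true);
```

The second case fails silently: the returned text is unchanged, so
the call is never placed on hold, and the developer only finds out by
knowing in advance which form each browser version produces.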
422 Worse, a flood of extensions to SDP for WebRTC is being written to
423 "enhance" and "extend" the functionality of the browser with new
424 features. Many basic things are ill defined in the current SDP based
425 API, for example, changing non-negotiated codec parameters, such as
426 codec bandwidth.
428 There is no facility for JavaScript to detect what SDP the browser is
429 currently using or capable of delivering. The developer has no idea
430 of the extensions available, or what SDP will be produced, or what
431 SDP is compatible. The developer's JavaScript code must be able to
432 handle everything generated by the browser for any use case beyond
433 basic call, answer and hang-up. This is a heavy burden to place on a
434 JavaScript developer who is not familiar with the details of RTC
435 concepts as expressed in SDP, and is a challenge even for those who
436 are familiar.
438 Effective APIs are meant to be contracts between a producer and
439 consumer, whereas this SDP methodology offers little in the form of
440 any such contract.
442 If SDP is to become standardized for use with WebRTC then JavaScript
443 developers must learn SDP to use RTC's available features and build
444 new features. Alternatively, accessors will need to be provided to
445 manipulate the SDP on behalf of the JavaScript (and if so, then why
446 not move to an object model straight away and do away with SDP?).
448 2.6. Is SDP allowed to be mangled?
450 The choice must be made if SDP may be modified or not. If
451 modifications are the only way to achieve RTC features available then
452 what is allowed to be modified must be clearly defined in exact
453 detail and the expected behavior of each feature (and modification of
454 each feature), as expressed in SDP, must be defined. Anything short
455 of exact specifications will cause incompatibility. Again, the
456 implication is that Web / JavaScript developers must learn SDP to
457 utilize the available RTC features and they must learn the rules of
458 modification equally well, which virtually do not exist at all today.
460 If the choice is to not allow complete SDP modification at all, then
461 the protocol becomes extremely tied to SDP based protocols like SIP.
462 Yet, there is no mandate for SIP to be the standardized protocol in
463 WebRTC. In fact, the mandate to require SIP was explicitly denied,
464 which presents the argument that SDP manipulation must be allowed.
466 The SDP mangling issue isn't just an issue when the format is sent
467 on-the-wire. If Browser A sends Browser B an SDP, the current
468 philosophy is that the SDP is allowed to be modified. However, there
469 is the possibility of modifying the SDP generated by Browser A and
470 giving that modified SDP back to Browser A to change its behavior
471 (i.e. a serialized text based API call) before the offer is given to
472 Browser B (and likewise with Browser B when it responds with its SDP
473 answer).
475 How much of the SDP is allowed to be modified before giving the SDP
476 back to the local browser? SDP is a free-form format so anything can
477 theoretically get changed, but should it be allowed? If not, what
478 can and cannot be modified? CODECS? SSRC? SDES? Fingerprints?
479 Transports? M-lines? And so on...
481 This issue becomes further compounded when extensions are factored in
482 as well.
484 2.7. SDP errata and bugs compatibility issues
486 With the SDP baked into the browser binary, the only way SDP
487 compatibility issues can be fixed is by releasing a new browser
488 update, and the JavaScript developers must support or work around
489 flaws until the browser vendors deliver the fix and the user base
490 upgrades their browsers.
492 While it could be argued that any bug must be worked around, SDP is a
493 unique problem. SDP is a free-form format. Being compatible isn't
494 as easy as implementing a limited wire protocol for media transport
495 or an API contract with well defined features and attributes. The
496 likelihood of free-form SDP containing errors is far greater than a
497 typical well defined API due to SDP's many flavors, interpretations
498 and lack of strong definition.
500 2.7.1. SDP Bugs Become Enshrined
502 To illustrate a scenario:
504 1. Browser Vendor A has a bug
506 2. Browser Vendor B can't work with A because of the bug so it
507 implements a "work around"
509 3. Browser Vendor A fixes the bug but implements a work around to be
510 compatible with Browser Vendor B's "work around"
512 This situation demonstrates how browser bugs can become enshrined
513 as there's no way to update the SDP produced by the browser binary
514 once it's released until the next update release cycle occurs. This
515 would not be true if JavaScript was used via a shim to produce SDP as
516 JavaScript can be dynamically updated as needed at any time and a
517 service provider can choose to update their JavaScript implementation
518 to exacting expectations for their network regardless of the browser
519 version.
521 The lower level RTC wire protocols that need to be mandated by the
522 RTCWEB Working Group have limited scopes and well defined behaviors.
523 Any mistakes are obvious and likely to present very rapidly; it is
524 easy to spot which party is doing something wrong, and much easier
525 to fix early as a result. This is not true with a free form highly
526 descriptive language for sessions. The combinations are limitless
527 and every scenario is difficult to test, especially in concert with
528 every other browser vendor with every version released. The session
529 description will be the likely place of failure across the browsers
530 when the session description is generated inside the browser's
531 binary.
533 2.8. SIP/SDP compatibility worsened
535 One of the main arguments for using SDP with offer / answer was
536 supposed to be ease of compatibility with existing signaling
537 networks, like SIP. Instead, variations in the browser's SDP will
538 likely worsen SIP compatibility rather than enhance it.
540 A SIP provider must now be compatible with every browser's SDP on
541 every platform and version and the browser's SDP must be compatible
542 with every SDP from a SIP network. Alternatively, JavaScript or SBCs
543 (Session Border Controllers) must be used to re-write any incompatible
544 SDP to be compatible. However, this moves the problem from the
545 browser to JavaScript, or requires SBCs to "fix" the problem.
547 Had SDP been entirely generated by JavaScript rather than come from
548 the browser engine, the JavaScript could create only SDPs compatible
549 with a particular SIP provider under control of their own JavaScript
550 and the SIP provider could choose which JavaScript SDP parsing /
551 generation code to run, for maximum compatibility.
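A rough sketch of what such JavaScript-side generation could look
like follows. Everything in it is an assumption made for
illustration: the interfaces, the "quirks" options and the function
name are hypothetical, and only the audio media section is produced
rather than a complete session description.

```typescript
// Hypothetical object model for one audio stream.
interface CodecSpec {
  payloadType: number;
  name: string;
  clockRate: number;
}

interface AudioSection {
  port: number;
  codecs: CodecSpec[];
}

// Hypothetical per-provider options, maintained by the service
// provider in its own JavaScript.
interface ProviderQuirks {
  profile: string;           // e.g. "RTP/AVP"
  alwaysEmitRtpmap: boolean; // emit a=rtpmap even for static payload types
}

function audioSectionToSdp(section: AudioSection, quirks: ProviderQuirks): string {
  const payloadTypes: number[] = [];
  const attributes: string[] = [];
  for (const c of section.codecs) {
    payloadTypes.push(c.payloadType);
    const isDynamic = c.payloadType >= 96;
    if (isDynamic || quirks.alwaysEmitRtpmap) {
      attributes.push("a=rtpmap:" + c.payloadType + " " + c.name + "/" + c.clockRate);
    }
  }
  const mLine =
    "m=audio " + section.port + " " + quirks.profile + " " + payloadTypes.join(" ");
  return [mLine].concat(attributes).join("\r\n") + "\r\n";
}
```

Because the generator is ordinary script, a provider that needs a
different flavor changes the quirks (or the function) on its own
servers, without waiting for a browser release.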
553 2.9. Increased surface API
555 By mandating SDP, the requirement for compatibility with WebRTC is
556 increased substantially with little benefit. Instead of just
557 supporting basic media RTP [RFC3550], STUN/ICE/TURN [RFC5389]/
558 [RFC5245]/[RFC5766], DTLS [RFC6347] and CODECS, an additional bar must
559 be passed, i.e. a browser or other WebRTC compliant API must support
560 SDP with a full offer / answer state machine (or a state machine with
561 additional rules to make it flexible for various scenarios).
563 With an alternative approach, the entire requirement for SDP could be
564 removed without any loss of compatibility or increase in complexity
565 while achieving greater compatibility via the JavaScript shim.
567 2.10. Impossible API to implement to achieve browser compatibility
569 The current mandated SDP based API cannot be implemented as a
570 standard by independent browser vendors in its current form. A list
571 of subsequent behaviors regarding the usage, parsing, handling,
572 extensions, behaviors, constraints and other such reference documents
573 must be meticulously defined for SDP with the modified offer / answer
574 state machine or no browser can ever claim to be "compliant". The
575 current definition process is far from complete.
577 The current WebRTC SDP based API is far from achieving that goal due
578 to the inclusion of free-form SDP with offer / answer and it is
579 grounds for removing it as it goes beyond the RTCWEB's charter and
580 limited scope.
582 Any incremental approach that does not remove the offer / answer
583 model requirement yields a road block to achieving alternative WebRTC
584 signaling protocols other than SIP.
586 An alternative WebRTC JavaScript object model approach that does not
587 require an all-encompassing session description and related state
588 machine is being proposed as a solution so the RTCWEB Working Group
589 can complete its chartered goals in a timely fashion.
591 2.10.1. Example Oddities That Need Definition
593 There are many oddities in the SDP RFC [RFC4566] and the various
594 related extensions.
596 For example, will RTP CODEC maps be required or not? They are not
597 required for basic CODECs according to the SDP RFC. However, with
598 all the flavors of CODECs being offered, defining a mapping between
599 payloads is critical to compatibility and not just a good idea.
601 Another example: should "t=0 0" be respected? Is that allowed to be
602 changed? Do the browser vendors need to enforce the attribute, or
603 should the JavaScript layer enforce it? Should the streams wait to
604 start until the NTP time stamp and close when the NTP time completes?
606 These are just small samples of questions that must all be completely
607 addressed in detail. This could also cause a cascade of updated
608 reference drafts and confusion as to which version is to be adhered
609 to by browsers as well as what each browser specifically supports.
610 Nominally referencing the SDP RFC will not be sufficient, and deltas
611 from the established standards when violated will need to be defined
612 when the rules change.
614 2.11. Plan A, Plan B vs NoPlan
616 At the time of authoring this document, three plans on how to handle
617 a large number of media streams in SDP have emerged and are currently
618 under consideration at the IETF, referred to as PlanA
619 [I-D.roach-rtcweb-plan-a], PlanB [I-D.uberti-rtcweb-plan] and NoPlan
620 [I-D.ivov-rtcweb-noplan].
622 PlanA and PlanB acknowledge that using SDP as it is historically
623 defined in SIP is inefficient and problematic for a large number of
624 media streams, especially factoring in that each media line must have
625 its own unique ports.
627 NoPlan allows for media to be described in a more JavaScript friendly
628 way and goes a long way towards improving the situation from SDP by
629 taking out the mapping of the streams from the SDP but does not
630 remove the reliance upon SDP. This creates a dual format system
631 where some information is initially carried over SDP and other
632 information is signaled through an alternative approach (including
633 the possibility of SDP offer/answer). NoPlan could have been a
634 sufficient approach had it gone one step further and removed SDP
635 entirely.
637 PlanA, PlanB and NoPlan are a perfect example of why not to use SDP
638 as the basis for WebRTC. SDP has some arbitrary limitations as a
639 description protocol for multiple streams whereas no such limitations
640 exist at the lower layer transports themselves. RTP allows for
641 multiplexing multiple SSRCs. In other words, the problem is SDP, not
642 the real time transportation technologies.
644 These drafts illustrate the limitations of SDP and attempt to solve
645 them by introducing even more complex descriptions around SDP and / or
646 by "relaxation" of the offer / answer model combined with altering the
647 description language of SDP.
649 None of these drafts address most of the concerns outlined in this
650 draft. If anything, they further illustrate how divergent the SDP
651 will become as more and more effort is put into working around
652 problems inherent to the nature of utilizing SDP (or any universal
653 format).
655 The issue that SDP implementers face should be isolated to those who
656 require SDP for their signaling protocols (namely SIP) where they can
657 choose the best practices for their networks for interoperability.
658 These complex approaches do not have to be forced on other signaling
659 protocols that do not have or require such limitations.
661 Certainly JavaScript programmers and the W3C should not be impacted
662 by such limitations by introducing SDP (or any universal format) into
663 the mix when it adds zero value and fails in its primary objectives,
664 namely: interoperability with existing SIP vendors & networks.
666 This further illustrates why SDP baked into the browser binary is not
667 beneficial for SIP vendors either. They will be forced to upgrade
668 their SIP infrastructure to support SDP packets from browsers with
669 these kinds of extensions or be forced to utilize a JavaScript SDP
670 re-write approach to "fix" these incompatibilities.
672 With an object approach, newer signaling protocols could describe
673 multiple media streams with ease and SIP providers could ensure they
674 only generate compatible SDP with their networks and agree on their
675 best practices and launch new features that incorporate approaches
676 like PlanA, PlanB or NoPlan in a manner they deem fit rather than
677 when the browser vendors decide to upgrade the SDP arbitrarily.
679 2.12. SIP Forking Issue
681 The current SDP based API model does not allow for SIP parallel
682 forking even though the RTC engine can allow for demuxing a media
683 stream. The current model does not allow for one offer to be
684 transmitted and multiple answers accepted, which is legal in SIP. A
685 complex UPDATE process is described on how to work around the problem
686 instead of fixing the original problem, i.e. the state machine being
687 required.
689 A WebRTC JavaScript object model is designed to easily allow forking
690 but does not care if an upper shim supports SDP / SIP style forking
691 in the negotiation or not, so long as the basic rules of the RTC
692 media engine are respected.
694 3. Alternatives to Fixing these Issues Now
696 3.1. Waiting for WebRTC 2.0
698 If we don't get WebRTC 1.0 correct, fixing the API in WebRTC 2.0 may
699 become even more difficult.
701 At this stage, prototypes are underway but to our knowledge there are
702 no major commercial services deployed by more than one major vendor
703 using the current WebRTC API. Yet, the argument to even consider an
704 alternative is that 'it's too late'. Imagine trying to argue for
705 fixing it after major networks are reliant upon specific browser
706 implementations. Having a good but simple architecture from the start
707 could alleviate a lot of pressure to fix a broken 1.0 in a 2.0
708 release before APIs become entrenched.
710 3.1.1. Cost now to fix versus fixing later
712 The cost of fixing the API issues today may pale in comparison to the
713 cost of compatibility problems spread across entire sets of
714 industries where constant fixes and workarounds may be required.
716 3.1.2. If starting over, would even SIP people want SDP as a surface
717 API?
719 Even SIP providers and vendors have started to realize that baking
720 SDP into the browser is not necessarily in their best interests, but
721 they do have an interest in a simple API to use since they aren't
722 specialized JavaScript developers but SIP integrators.
724 If an alternative approach provides SIP providers a simple JavaScript
725 API shim they desire and achieves greater interoperability because of
726 predictable, controllable and tailored SDP for their network, would
727 they not prefer such a model over the current "baked in the browser"
728 approach?
730 If the current WebRTC specification was ever rebooted, the current
731 mandated SDP based API would undoubtedly be scrapped in favor of a
732 better approach without its inherent design and use case flaws with
733 negative long term compatibility consequences.
735 3.1.3. Incremental Approach may make Compatibility Worse
737 One argument put forward, to keep the current SDP model, proposes the
738 current WebRTC SDP-based API must be completed soon and an
739 incremental improvement approach can be used to gradually move away
740 from these obvious problems.
742 The trouble with an incremental approach is that it may increase
743 incompatibility further. Not all browser vendors will match the
744 incremental improvements in unison nor will all customers upgrade
745 simultaneously. This puts the onus on JavaScript developers to
746 support multiple versions of the WebRTC API and increases the number
747 of APIs they must learn and maintain. The JavaScript developers must
748 still perform all the workarounds required for the current API even
749 if they support the increments. This limits their willingness to use
750 any additional APIs until all browsers universally support the
751 incremental improvements. This will likely slow innovation and
752 adoption of future improvements.
754 This will likely create a situation where browser vendors cannot
755 easily achieve compliance because they too must support the existing
756 API and incremental improvements along the way, or break those
757 reliant upon the current methods.
759 Having a good solid simple foundation is key to ensuring basic
760 compatibility while allowing for innovation to occur for those
761 developers who are willing to give new APIs a trial without needing
762 to support multiple sets of equivalent but incompatible APIs
763 simultaneously.
765 3.2. Session Description Format Construction API
767 An alternative JavaScript model has in the past been floated around,
768 other than the model advocated in this draft. That model creates a
769 JavaScript session description format construction API in the
770 browser. Such an API would use JavaScript objects to construct the
771 session description format rather than allowing direct control of how
772 media should be plumbed together from JavaScript.
774 While using SDP as the chosen format for WebRTC highlights the issues
775 described in this draft particularly well, using an alternative
776 format like JSON instead of SDP does not remove many of the issues
777 presented in this draft. The issues expressed are not solely caused
778 by the lack of expressiveness of the SDP format; they stem from the
779 nature of creating a universal all-encompassing format to describe all
780 transport, media, constraints, and negotiations with an attached
781 inflexible state machine. This format
783 effective mandate for signaling even if not explicitly required to
784 perform signaling.
786 A few years ago there was an attempt to create a new "SDP 2.0" format
787 with a draft named Session Description and Capability Negotiation
788 [I-D.ietf-mmusic-sdpng]. This effort to create the "ultimate" SDP
789 format in XML was ultimately abandoned, in no small part because of
790 the difficulties in coming up with a single solution that works for
791 all scenarios.
793 Given the difficulty in creating a universal all-encompassing format
794 that works for all scenarios, the idea of creating a JavaScript
795 based API that constructs a similarly flexible, but well defined,
796 universal session description format using JavaScript objects is
797 highly likely to fail equally. The reality is that such an effort
798 is complex.
800 Even if successful, this format is not necessarily the format that
801 will be sent on-the-wire, especially for existing alternative
802 signaling protocols. As such, the format will still need to be
803 transformed into alternative formats by JavaScript (or by a gateway).
804 If the format must be parsed or interpreted by an intermediary then
805 the format becomes an interaction point to the browser no matter how
806 clever the JavaScript session description construction API
807 implementation. Whatever format is selected, each browser or
808 alternative protocol format will have to decide how to convert and
809 interpret the output and generate new compatible inputs and deal with
810 the variations that will undoubtedly arise from browser to browser
811 and from version to version.
813 Even if JavaScript APIs are made available to simplify the
814 construction or interpretation of a defined format, this format would
815 still become a do-everything serialization access point for the
816 browser and the defined exchange point for the local and remote
817 browser. Therefore the format itself must be described in meticulous
818 detail.
820 The standardization requirements for such an approach would increase
821 substantially over the WebRTC JavaScript object model advocated by
822 this draft since not only would such a JavaScript format construction
823 API have to be standardized (as any JavaScript Model would) but the
824 formatting rules and state machine it relies upon need to be
825 standardized in detail as well.
827 Every combination of this all-encompassing format would need to be
828 outlined, rather than a minimal definition of the fixed properties
829 needed on scoped objects as used in the WebRTC JavaScript Object Model.
830 Any slight variations would likely cause JavaScript developers or
831 other browsers to break their implementations. Obtaining 100%
832 stability in such an output equally across all browsers, on all
833 platforms with all versions is highly doubtful.
835 While a JavaScript format construction API is merely hypothetical at
836 the time of writing this draft, any proposal will need to be vetted
837 to see if it addresses all the concerns and issues brought up in this
838 draft.
840 This hypothetical JavaScript session description construction API
841 still puts the emphasis on driving the developer towards building up
842 a media signaling exchange format rather than on the logic of how the
843 media should be controlled and pipelined.
845 The WebRTC JavaScript object model is being proposed as the
846 alternative. In a follow-up to this draft the model will describe
847 how the JavaScript developer gains control over the stream's
848 pipelining for the browser's media/RTC engine and thus free the
849 JavaScript developer to express signaling and state machines using
850 whatever mechanism desired. A simplified shim implemented entirely
851 in JavaScript will allow easier interpretation to any format desired
852 by the JavaScript developer in a way that can be updated
853 independently of a browser's binary release. Should any changes be
854 needed in signaling, a JavaScript shim generating this custom format
855 is strictly under the control of the service provider and not the
856 browser.
858 4. Example Difficult Usage Cases with Current Model
860 4.1. On / off hold example usage case
862 This scenario uses the widely adopted SIP technique of setting an SDP
863 attribute to place a stream on / off hold. This is the accepted
864 methodology and performing alternative approaches would deviate from
865 the expected practices for use with SIP and its manipulation of SDP.
866 Although not officially documented as supported, it is effectively
867 supported in WebRTC implementations. This is a typical use case needed
868 by media applications:
870 1. Browser A establishes a connection with Browser B
872 2. Browser A and browser B are streaming media
874 3. JavaScript developer wants to put Browser A "on hold"
876 These are the steps that must be performed by a JavaScript developer:
878 1. createOffer to obtain the SDP from Browser A
880 2. Parse the SDP
882 3. Add "a=sendonly" or "a=inactive" to all media
884 4. Regenerate the SDP, feed back to browser
886 5. Send the SDP to Browser B
888 6. Receive the answer from Browser B (which should respond with
889 a=recvonly if it still wishes media)
891 7. Parse the received SDP and modify with "a=recvonly" if it did not
892 respond correctly (to ensure the local side holds back its media)
894 8. Pass the modified SDP answer back into Browser A
896 This also implies that:
898 1. All future SDP events received from Browser B must be mangled to
899 ensure the "sendonly/recvonly/inactive" attribute is maintained
900 while on hold
902 2. All future createOffer/createAnswer calls from Browser A must be
903 modified to ensure the "sendonly" property is maintained
905 3. We need to handle alternative formats to describe hold, e.g.
906 "c=0.0.0.0" from Browser B which may not utilize the latest SDP
907 specifications depending on the remote device / platform
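As a rough illustration of steps 2 to 4 above, the following
TypeScript sketch parses an SDP blob, forces one direction attribute
onto every media section and regenerates the text. The function name
setMediaDirection is hypothetical and is not part of any WebRTC API.
Session-level direction attributes and the older "c=0.0.0.0" hold
convention are deliberately left untouched to keep the sketch short.

```typescript
type Direction = "sendrecv" | "sendonly" | "recvonly" | "inactive";

// Matches a complete SDP direction attribute line.
const DIRECTION_LINE = /^a=(sendrecv|sendonly|recvonly|inactive)$/;

// Hypothetical helper (not a WebRTC API): rewrite every media section
// of an SDP blob so that it carries exactly one direction attribute.
function setMediaDirection(sdp: string, direction: Direction): string {
  const lines = sdp.split(/\r\n|\n/).filter((line) => line.length > 0);
  const out: string[] = [];
  let inMedia = false; // true once the first "m=" line has been seen
  let written = false; // direction already emitted for this section?

  for (const line of lines) {
    if (line.indexOf("m=") === 0) {
      // A new media section starts: close the previous one if needed.
      if (inMedia && !written) {
        out.push("a=" + direction);
      }
      inMedia = true;
      written = false;
      out.push(line);
    } else if (inMedia && DIRECTION_LINE.test(line)) {
      // Replace the first existing direction line, drop duplicates.
      if (!written) {
        out.push("a=" + direction);
        written = true;
      }
    } else {
      // Everything else, including session-level lines, is kept as is.
      out.push(line);
    }
  }
  if (inMedia && !written) {
    out.push("a=" + direction);
  }
  return out.join("\r\n") + "\r\n";
}
```

Calling setMediaDirection(offer, "sendonly") before handing the offer
back to the browser corresponds to step 3. The same rewrite would have
to be repeated on every later offer and answer while the hold is in
effect, which is the maintenance burden the list above describes.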
909 Ironically, hold is a very SIP and telephony specific concept. The
910 better approach would be to allow the streams to be paused/unpaused at
911 will as that does not require interaction with the SDP, and allow the
912 higher layers to signal the desire to pause the session to the remote
913 peer in whatever manner desired.
915 This is a very basic use case that is extremely complex for a
916 JavaScript developer. Although this particular action is effectively
917 supported by the browsers, the only way to perform it is via the
918 "SDP surface API". Even if this particular use
919 case ends up being an exposed JavaScript method to manipulate the SDP
920 by the browser, there are countless other scenarios where tweaking a
921 field to modify the behavior in the format will only be
922 available via SDP manipulation.
924 4.2. One-Sided Constraints Negotiation use Case Scenario
926 As WebRTC is a web API and not a SIP API, the API must be capable of
927 allowing for alternative signaling methods without enforcing its own
928 signaling aspects (other than basic principles like ensuring ICE
929 agreement has been achieved for security reasons).
931 Consider the following scenario:
933 1. Browser A and Browser B establish a connection
935 2. Browser A and Browser B use one-sided constraints negotiation
936 where each party independently decides what "it expects to
937 receive"
939 3. Browser A decides that it wishes to alter the properties of the
940 video it expects to receive
942 With this model, browser A must be capable of independently modifying
943 its expectations without waiting for an answer from the remote side
944 (as that's illegal by the nature of the offer / answer signaling),
945 unless the rules are relaxed and special exceptions are made. For
946 the model to work, browser A's receive constraints must be applied to
947 the send constraints of the remote peer. This model does not require
948 an SDP offer / answer exchange since the sending peer can monitor the
949 expectations of the receiving peer and set its send constraints as
950 appropriate.
952 To achieve this for a one-sided negotiation:
954 1. Browser A's JavaScript must respond to every SDP offer with an
955 answer locally generated from JavaScript without a round trip,
956 extracting out last known expectations from the remote SDP last
957 received as part of the answer
959 2. The JavaScript must update the constraint signaling for the
960 remote party
962 3. Browser B's JavaScript sees the constraints have changed from
963 Browser A thus it initiates a fake offer from the remote party
964 (generating the intentions of the constraint and generating an
965 SDP format)
967 4. Browser B's JavaScript must examine the answer to see if any
968 constraints have changed, and if so, it may trigger another reverse
969 situation where step 1 is repeated, except with Browser A and B's
970 roles reversed.
972 Is this really doable? Maybe, with a great deal of difficulty and
973 SDP mangling but it is unquestionably a hack and a violation of offer
974 / answer (and relaxed rules create exceptions and exceptions require
975 additional logic to handle). The offer / answer rules are violated
976 because no round trip was performed at the time when the constraints
977 were changed.
979 This is also fragile because if Browser B failed to accept the fake
980 offer there is no way to enforce the constraint nor can the
981 JavaScript roll back the expected constraint. Likewise if the state
982 machine in Browser A expected an offer to be generated before a new
983 offer would be accepted, the conflict resolution process would be
984 extremely difficult and messy.
986 This offer / answer state machine is not even required to fulfill the
987 mandate of the RTCWEB Working Group charter but it is currently
988 mandated because it supposedly makes producing "SIP interoperability"
989 easier (which is highly suspect at best).
991 A JavaScript shim approach on a WebRTC JavaScript object model and
992 without offer / answer could achieve the same (or better) "SIP
993 interoperability" without breaking other stateless negotiation
994 models, such as one-sided negotiation.
996 4.3. Meet-me Negotiation Use Case Scenario
998 1. WebRTC client A generates an offer and sends to a server
1000 2. WebRTC client B generates an offer and sends to a server
1002 3. WebRTC client C generates an offer and sends to a server
1004 4. The server returns all the exchanges to each of these clients
1005 simultaneously
1007 5. WebRTC client A, B and C interconnect
1009 Technically, there is no need for independent SDP offer / answer
1010 negotiation amongst all these peers to achieve a mesh scenario for
1011 this use case. Each client has enough information about the other
1012 clients to establish a peer connection. The current WebRTC SDP API
1013 imposes independent round trip negotiations that are not technically
1014 necessary. If WebRTC client D was added later, the original
1015 connection can be forked and re-use the same DTLS fingerprints to
1016 negotiate new encryption keys for media or data. Fingerprint or
1017 identity signature reuse should not introduce any additional security
1018 concerns since identities will be verified and keys negotiated for
1019 each peer-to-peer connection.
1021 A JavaScript object model approach would allow for this kind of
1022 scenario without independent round trip negotiations for each WebRTC
1023 client in the mesh.
1025 4.4. Browser to Browser Compatibility Extension Compatibility Issue
1026 Scenario
1028 Consider the following scenario:
1030 1. Browser A has implemented an extension to SDP (which is allowed)
1032 2. Browser B has no knowledge of such an extension
1034 3. The JavaScript engine running on Browser A has no knowledge of
1035 the extension
1037 4. The JavaScript engine packages up the SDP from Browser A and
1038 sends it to Browser B
1040 Under this scenario, what should browser B do? To reject the offer
1041 means communication cannot occur. To accept the offer has ambiguous
1042 meaning because the answerer might have misunderstood the extension's
1043 intention and does not allow for the appropriate behavior.
1045 The exact rules of what is allowed in SDP and what is not and how
1046 extensions are treated must be defined clearly and unambiguously.
1047 Even though the current SDP offer / answer API can deal with some
1048 extensions, like new codecs being introduced, it is ambiguous on how
1049 to deal with more major extensions such as new SDP profiles,
1050 transports, or encryption methods.
1052 Assuming that a lack of response to an extension is non-agreement to
1053 use the extension is not acceptable. For example, if the extension
1054 was security related dictating some security precondition to opening
1055 a stream, the offer must be rejected as the precondition cannot be
1056 met. Ignoring the extension would mean the offer was accepted where
1057 it cannot be accepted. Another example would be the introduction of
1058 a new SDP profile, like AVPF2. Offer/answer negotiation simply fails
1059 when it encounters an unknown profile even if it is backwards
1060 compatible; for instance, most calls to current SIP devices will fail
1061 if AVPF is used instead of AVP. A better approach is to define the
1062 rules for how extensions can be made, whereas SDP has no such rules.
1064 Currently, in SIP networks, such extensions are agreed upon in
1065 advance and extensively tested before they are introduced. SBCs
1066 (Session Border Controllers) are often used to make devices with
1067 different feature sets work with each other. By allowing JavaScript
1068 control over the format generated on the wire, feature roll out is
1069 under strict control of the provider, and not whenever a browser
1070 vendor decides to produce an update.
1072 4.5. Building Interoperability between WebRTC and a SIP Service
1073 Scenario
1075 Consider the following scenario:
1077 1. Developer takes SDP produced by browser and sends it to SIP gateway
1078 (which is supposed to be SIP "compatible")
1080 2. Users happily use this service
1082 3. Browser Vendor A updates the browser SDP generator and a slight
1083 variation in SDP changes
1085 4. Users are now broken
1087 5. SIP gateway must be updated to handle new SDP (and old SDP)
1088 6. Browser Vendor B updates their browser SDP generator (with a
1089 different SDP variation)
1091 7. Users are now broken again
1093 8. SIP gateway must be updated to handle another variation of SDP
1094 (and maintain the old variations)
1096 9. Repeat from step 3, but add Browser Vendor C, D and multiple
1097 platforms
1099 This is not an unrealistic scenario by any stretch of the
1100 imagination. This currently happens in the SIP world, but at least
1101 in that world new devices are tested to ensure compatibility before
1102 roll outs occur on the network so issues can be addressed before the
1103 user's experience is broken. Since the SIP provider and gateway
1104 vendor do not have control over the update cycle of the browsers,
1105 their users are much more prone to breakage by taking the SDP from
1106 the browser and sending to their network.
1108 By contrast, this is what happens with a JavaScript Object API model
1109 with an SDP shim written entirely in JavaScript:
1111 1. Developer uses shim to generate SDP in the browser and sends it to
1112 SIP gateway (with SDP that is compatible)
1114 2. Users happily use this service
1116 3. Browser Vendor A updates the browser with a new RTC feature.
1118 4. Return to step 2
1120 The reason why the browser update does not affect the gateway is
1121 because the SDP is generated entirely in JavaScript and thus updates
1122 to the browser do not change the SDP generation logic. The SDP is
1123 entirely under control of the SIP network provider. Any bugs with SDP
1124 compatibility can be addressed by the SIP provider without changes in
1125 the browser's binary. Bugs, updates and improvements are completely
1126 within the boundary and control of the SIP network provider.
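To make the idea of a JavaScript-only SDP shim more concrete, the
following TypeScript sketch serializes a small description object into
SDP text. The SessionInfo, MediaInfo and CodecInfo shapes and the
toSdp function are assumptions made purely for illustration; they are
not the object model presented in the related draft. The choice of
"RTP/AVP" and IPv4 addressing stands in for whatever a particular SIP
provider's network expects.

```typescript
// Hypothetical shapes, invented for this sketch only.
interface CodecInfo {
  payloadType: number;
  name: string;
  clockRate: number;
}

interface MediaInfo {
  kind: "audio" | "video";
  port: number;
  codecs: CodecInfo[];
}

interface SessionInfo {
  sessionId: string; // numeric string used in the o= line
  address: string;   // IPv4 address used in the o= and c= lines
  media: MediaInfo[];
}

// Serialize the description into SDP. Because this code ships with the
// web application, its output changes only when the provider changes it.
function toSdp(session: SessionInfo): string {
  const lines: string[] = [
    "v=0",
    "o=- " + session.sessionId + " 1 IN IP4 " + session.address,
    "s=-",
    "c=IN IP4 " + session.address,
    "t=0 0",
  ];
  for (const media of session.media) {
    const payloadTypes = media.codecs
      .map((codec) => String(codec.payloadType))
      .join(" ");
    lines.push(
      "m=" + media.kind + " " + media.port + " RTP/AVP " + payloadTypes
    );
    for (const codec of media.codecs) {
      lines.push(
        "a=rtpmap:" + codec.payloadType + " " + codec.name + "/" +
          codec.clockRate
      );
    }
  }
  return lines.join("\r\n") + "\r\n";
}
```

A provider whose gateway rejects a particular line could change toSdp
and redeploy the web application, with no browser release involved.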
1128 4.6. Bit-rate Change Scenario
1130 Consider the following scenario:
1132 1. User is connected to a conference server
1134 2. While the user is listening, the user transmits at a low bit-rate
1135 3. The user starts to communicate and the bit-rate is adjusted to
1136 maximum quality
1138 Using the current WebRTC API, this would require an offer / answer
1139 round trip to perform the change and thus the quality would not be
1140 updated until the answer was acknowledged, although proposals have
1141 been made to alter the rules for offer / answer in this case and
1142 allow for an exception. This round trip is unnecessary technically
1143 since the bit-rate can be dynamically adjusted without remote
1144 acknowledgment. Yet, the current offer / answer model imposes a
1145 round trip (unless yet another exception to the SDP rules is
1146 adopted).
1148 4.7. Video Codec Option Change Scenario
1150 Consider the following scenario:
1152 1. JavaScript wishes to change a video codec option
1154 Using the current WebRTC API, this would require parsing the entire
1155 SDP, isolating the video codecs for a particular video media line,
1156 figuring out the mapping and then reconstructing the original SDP with
1157 the newly incorporated changes. Accessors have been suggested for
1158 these common use cases but do not exist yet. If such accessors are
1159 created then a more involved API cannot be avoided out of necessity.
1160 One of the main justifications given by SDP proponents for only
1161 having an API that creates and accepts SDP is its supposed
1162 simplicity, as opposed to providing a more involved API.
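The amount of work involved can be seen in a minimal TypeScript
sketch of such a change. The helper name setCodecOption is
hypothetical, the "max-fr" parameter used in the usage note is only an
illustrative format parameter, and the sketch handles just the first
media section of the requested kind.

```typescript
// Hypothetical helper: set one "key=value" format parameter for a named
// codec inside the first media section of the given kind.
function setCodecOption(
  sdp: string,
  kind: string,
  codec: string,
  key: string,
  value: string
): string {
  const lines = sdp.split(/\r\n|\n/).filter((line) => line.length > 0);

  // 1. Isolate the media section: [start, end) in the line array.
  let start = -1;
  let end = lines.length;
  for (let i = 0; i < lines.length; i++) {
    if (lines[i].indexOf("m=") !== 0) continue;
    if (start >= 0) {
      end = i;
      break;
    }
    if (lines[i].indexOf("m=" + kind + " ") === 0) start = i;
  }
  if (start < 0) return sdp; // no such media section

  // 2. Figure out the payload type mapped to the codec name.
  let payloadType = "";
  for (let i = start + 1; i < end; i++) {
    const match = /^a=rtpmap:(\d+) ([^\/]+)\//.exec(lines[i]);
    if (match && match[2].toLowerCase() === codec.toLowerCase()) {
      payloadType = match[1];
      break;
    }
  }
  if (payloadType === "") return sdp; // codec not offered

  // 3. Update the matching a=fmtp line, or add one to the section.
  const prefix = "a=fmtp:" + payloadType + " ";
  for (let i = start + 1; i < end; i++) {
    if (lines[i].indexOf(prefix) !== 0) continue;
    const params = lines[i]
      .slice(prefix.length)
      .split(";")
      .map((param) => param.trim())
      .filter((param) => param.length > 0)
      .filter((param) => param.indexOf(key + "=") !== 0);
    params.push(key + "=" + value);
    lines[i] = prefix + params.join(";");
    return lines.join("\r\n") + "\r\n";
  }
  lines.splice(end, 0, prefix + key + "=" + value);
  return lines.join("\r\n") + "\r\n";
}
```

For example, setCodecOption(offer, "video", "VP8", "max-fr", "15")
rewrites the VP8 format parameters and leaves every other line alone.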
1164 4.8. Video Upgrade Scenario
1166 1. Alice and Bob are having an audio conversation
1168 2. Alice presses the video button on her application and offers Bob
1169 video
1171 3. Bob does not wish to see Alice's video, so the application
1172 rejects the media (e.g. using "a=inactive" or "m=video 0")
1174 4. Alice's web application successfully parses and interprets Bob's
1175 rejection
1177 5. As Alice's video window of herself is independent of the SDP
1178 negotiation, Alice's HTML5 application successfully renders
1179 Alice's video locally
1181 The current WebRTC implementation offers no event to indicate the
1182 rejection, thus Alice is given no feedback of the rejection. She
1183 incorrectly assumes she's in a video conversation. In order to solve
1184 this scenario, custom signaling must be added to indicate Bob's
1185 rejection of Alice's video. Yet this is duplication of signaling as
1186 the video is already rejected in the SDP. This leaves the JavaScript
1187 developer with a choice: either parse the SDP, understand the SDP and
1188 derive meaning, or duplicate the SDP efforts by introducing custom
1189 signaling for a common scenario when upgrading from audio to video
1190 and providing appropriate user feedback.
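If the application takes the first option and parses the SDP itself,
the check might look roughly like the following TypeScript sketch. The
names MediaStatus, parseMediaStatus and rejectedKinds are hypothetical.
As in step 3 above, a media section is treated as rejected when its
port is 0 or its direction is "a=inactive". A section without a
direction attribute is assumed to be "sendrecv", and session-level
attributes are ignored for brevity.

```typescript
// Hypothetical summary of one media section of an SDP answer.
interface MediaStatus {
  kind: string;
  port: number;
  direction: string;
}

// Collect kind, port and direction for every media section.
function parseMediaStatus(sdp: string): MediaStatus[] {
  const sections: MediaStatus[] = [];
  for (const line of sdp.split(/\r\n|\n/)) {
    const media = /^m=(\S+) (\d+)/.exec(line);
    const direction =
      /^a=(sendrecv|sendonly|recvonly|inactive)$/.exec(line);
    if (media) {
      sections.push({
        kind: media[1],
        port: parseInt(media[2], 10),
        direction: "sendrecv", // assumed default when none is given
      });
    } else if (direction && sections.length > 0) {
      // Media-level attribute: applies to the most recent m= line.
      sections[sections.length - 1].direction = direction[1];
    }
  }
  return sections;
}

// Kinds of media that the answerer rejected, e.g. ["video"].
function rejectedKinds(answer: string): string[] {
  return parseMediaStatus(answer)
    .filter((s) => s.port === 0 || s.direction === "inactive")
    .map((s) => s.kind);
}
```

Alice's application could call rejectedKinds on Bob's answer and, if
the result contains "video", tell Alice that Bob declined the video
instead of silently rendering only her local preview.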
1192 5. Proposal: WebRTC JavaScript Object Model
1194 5.1. Overview
1196 The browser can expose simple object methods, properties and events
1197 representing the various RTC components at an abstracted level and
1198 provide a solid API for controlling how the media should be
1199 pipelined. The properties that need to be exchanged are separated into
1200 the appropriate objects rather than meshed into an all-encompassing
1201 format.
1203 A JavaScript-only shim can be layered on top of an object model to
1204 provide easy SDP offer / answer capability for those who want a
1205 similar "simple" API to the current WebRTC API for use with SIP. A
1206 developer can choose to use this shim, or not if they do not need SDP.
1207 Likewise, the object model could be used to produce alternative
1208 formats to SDP if the same do-everything format is needed but in an
1209 alternative on-the-wire session description format.
1211 The object model described in the solution is presented in a related
1212 draft. This solution will allow for the RTCWEB Working Group to
1213 complete its chartered mandate without starting from scratch. If
1214 adopted, all of the drafts proposed to solve issues in expressing SDP
1215 for WebRTC can be moved to more appropriate working groups. For
1216 example, SDP for SIP issues can be moved to the appropriate SIP
1217 working groups and multi-party SDP to MMUSIC (e.g. drafts like
1218 PlanA or PlanB).
1220 5.2. Benefits
1222 5.2.1. Greater compatibility
1224 By having a WebRTC JavaScript object model, the exact inputs,
1225 outputs, properties and events can be well defined on individual
1226 objects and each object will be designed to be a specific contract
1227 between browser vendors and JavaScript developers.
1229 5.2.2. Easier to extend
1230 New objects and methods can be added without breaking existing
1231 compatibility. Compliance can be verified with unit tests able to
1232 test each and every behavior across all browsers' versions on every
1233 platform. JavaScript developers can expect their version of the API
1234 object contract to remain fixed to expected behaviors and not break
1235 (unless through well planned deprecation).
1237 Any extensions added to a JavaScript object model do not change the
1238 behavior expectation from JavaScript developers when using the
1239 current version of the API regardless of any extensions, unless
1240 explicitly deprecated. This is unlike SDP where extensions could be
1241 silently added into the SDP produced by the browsers at will, even in
1242 minor browser version changes, where any component that consumes the
1243 SDP may be unaware what those additional feature behaviors imply or
1244 require as a result.
1246 5.2.3. Faster Reaction Time To Issues
1248 Signaling-related bugs produced by the JavaScript shims can easily be
1249 fixed and updated at any time regardless of the browser's release
1250 cycle. If a SIP provider discovers that its JavaScript shim is not
1251 compatible with its SIP network, the provider can update the shim
1252 code to its own needs dynamically without lobbying the browser
1253 vendor and waiting for the browser to be patched and updated.
1255 5.2.4. Decreased surface API
1257 With a JavaScript object model, the features are well defined so the
1258 surface API is fixed to the agreed contract. Once agreed, a browser
1259 vendor only has to ensure their compatibility with well defined
1260 limited scope unit tests, and need not worry about some free-form
1261 format that may introduce untold compatibility issues should another
1262 vendor issue an update. This is also true of any non-browser
1263 implementations that wish to comply with the WebRTC API for
1264 JavaScript and provide their own JavaScript and WebRTC engines.
1266 5.2.5. Greater compatibility for SIP
1268 While SIP is not the main RTCWEB Working Group charter responsibility
1269 for WebRTC, SIP compatibility is highly desirable. By exclusively
1270 generating SDP from a JavaScript shim, the SDP produced will be
1271 identical across all platforms and all devices with every browser
1272 version and entirely under the control of the SIP provider. This
1273 increases compatibility for SIP providers. The SDP produced from the
1274 shim can be custom tailored to a SIP network without affecting any
1275 other SIP vendor or harming compatibility with others utilizing
1276 WebRTC.
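As a rough illustration of the point above, a shim that generates SDP
from object properties could look like the following sketch. The
function name toSdp and every property name on the description object
are assumptions made for this example and are not taken from the
related object model draft. The output is deliberately minimal and
omits the ICE and DTLS attributes a real WebRTC session would need.

```javascript
// Hypothetical description object; all property names are illustrative
// assumptions, not part of any defined WebRTC object model.
const audioDescription = {
  sessionId: "1234567890",
  sessionVersion: 1,
  address: "192.0.2.10",
  port: 49170,
  codecs: [
    { payloadType: 111, name: "opus", clockRate: 48000, channels: 2 },
    { payloadType: 0, name: "PCMU", clockRate: 8000 },
  ],
};

// Build a minimal SDP body (RFC 4566 line syntax) for one audio stream.
function toSdp(desc) {
  const payloadTypes = desc.codecs.map((c) => c.payloadType).join(" ");
  const lines = [
    "v=0",
    `o=- ${desc.sessionId} ${desc.sessionVersion} IN IP4 ${desc.address}`,
    "s=-",
    `c=IN IP4 ${desc.address}`,
    "t=0 0",
    `m=audio ${desc.port} RTP/AVP ${payloadTypes}`,
  ];
  for (const c of desc.codecs) {
    // The channel count is optional in an rtpmap attribute.
    const channels = c.channels ? `/${c.channels}` : "";
    lines.push(`a=rtpmap:${c.payloadType} ${c.name}/${c.clockRate}${channels}`);
  }
  // SDP lines are terminated by CRLF.
  return lines.join("\r\n") + "\r\n";
}

console.log(toSdp(audioDescription));
```

Because the same shim code runs in every browser, the same input
object always yields the same SDP text, which is the property the
SIP provider relies on.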
1278 5.2.6. Alternative formats
1280 With a JavaScript shim approach on top of an object model, the
1281 information going over the wire can be transformed from the
1282 JavaScript object properties to alternative formats, including JSON,
1283 XML or SIP (or anything custom). As the JavaScript shim to use is
1284 under control of the service provider and identical regardless of the
1285 platform, the output from the JavaScript format generation is
1286 consistent and controllable, thus ensuring maximum compatibility
1287 within a network.
1289 The party receiving this format can be sure the format is to an
1290 exacting specification of their choosing rather than relying on
1291 whatever format is produced by whatever browser vendor.
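A minimal sketch of such an alternative format follows, using JSON.
The wire layout (the keys version, media, pt, rate and so on) and the
function names toWireJson and fromWireJson are invented for this
example; a service provider would define its own.

```javascript
// Hypothetical stream description; property names are assumptions.
const streamDescription = {
  address: "192.0.2.10",
  port: 49170,
  codecs: [
    { payloadType: 111, name: "opus", clockRate: 48000 },
    { payloadType: 0, name: "PCMU", clockRate: 8000 },
  ],
};

// Serialize to a provider-defined JSON wire format. Keys are written in a
// fixed order, so the same input always produces the same text.
function toWireJson(desc) {
  const wire = {
    version: 1,
    media: "audio",
    address: desc.address,
    port: desc.port,
    codecs: desc.codecs.map((c) => ({
      pt: c.payloadType,
      name: c.name,
      rate: c.clockRate,
    })),
  };
  return JSON.stringify(wire);
}

// Parse the wire format back into the description shape used above.
function fromWireJson(text) {
  const wire = JSON.parse(text);
  if (wire.version !== 1) {
    throw new Error(`unsupported wire format version: ${wire.version}`);
  }
  return {
    address: wire.address,
    port: wire.port,
    codecs: wire.codecs.map((c) => ({
      payloadType: c.pt,
      name: c.name,
      clockRate: c.rate,
    })),
  };
}

console.log(toWireJson(streamDescription));
```

The receiving party only needs fromWireJson (or its equivalent in
another language) and the version check to reject formats it does
not understand.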
1293 5.3. Design Goals and Considerations
1295 5.3.1. Object Model Kept Simple
1297 The JavaScript developer should not need to understand the mechanics
1298 of RTC other than understanding how to plumb the objects together.
1299 Those who need extended properties or events for finer control can
1300 obtain them with simple method access to an object, but those
1301 extended attributes should not be required for simple use cases.
1303 5.3.2. Simple to Gather Negotiation Information
1305 The object model should allow a simple method for collecting the
1306 information needed for various alternative negotiation models, with
1307 that information scoped to the individual object. Negotiation
1308 targets must include SDP and SIP.
1310 5.3.3. Offer / Answer
1312 The proposed JavaScript object model should not require the offer /
1313 answer state machine but must not preclude this state machine being
1314 built in a layer above. The offer / answer state machine must be
1315 possible to implement as a JavaScript shim without any additional
1316 built-in browser services needing to be implemented.
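To illustrate that the offer / answer state machine needs nothing
beyond plain JavaScript, the sketch below tracks the exchange in a
small class. The class name OfferAnswerSession, its method names and
the state strings are assumptions chosen for this example; the
descriptions passed around are opaque to the state machine.

```javascript
// Minimal offer/answer state machine written purely in JavaScript.
// Class, method and state names are illustrative assumptions.
class OfferAnswerSession {
  constructor() {
    this.state = "stable";
    this.localDescription = null;
    this.remoteDescription = null;
  }

  // Guard helper: reject calls made in the wrong state.
  _expect(expected, action) {
    if (this.state !== expected) {
      throw new Error(`cannot ${action} in state "${this.state}"`);
    }
  }

  sendOffer(description) {
    this._expect("stable", "send offer");
    this.localDescription = description;
    this.state = "have-local-offer";
    return description;
  }

  receiveOffer(description) {
    this._expect("stable", "receive offer");
    this.remoteDescription = description;
    this.state = "have-remote-offer";
  }

  sendAnswer(description) {
    this._expect("have-remote-offer", "send answer");
    this.localDescription = description;
    this.state = "stable";
    return description;
  }

  receiveAnswer(description) {
    this._expect("have-local-offer", "receive answer");
    this.remoteDescription = description;
    this.state = "stable";
  }
}

// Usage: one complete exchange between two endpoints.
const alice = new OfferAnswerSession();
const bob = new OfferAnswerSession();
bob.receiveOffer(alice.sendOffer({ codecs: ["opus", "PCMU"] }));
alice.receiveAnswer(bob.sendAnswer({ codecs: ["opus"] }));
console.log(alice.state, bob.state);
```

An application that does not want offer / answer semantics simply
does not load this layer; the underlying objects are unaffected.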
1318 5.3.4. Extensions
1320 Extending the object model for the expected common extension use
1321 cases without breaking the JavaScript API should be possible. Such
1322 possible extension use cases should include items like local mixing
1323 and data synchronization, or extended properties, events or features.
1325 As with any design, there may be limitations, but the design should
1326 hold up to various realistic scenarios that are likely to happen in
1327 the near future.
1329 5.3.5. Well Defined Behaviors
1331 An API must describe specific API behavior sets to the browser
1332 vendors so they have the appropriate guidelines for implementation,
1333 including the mapping to on-the-wire RTC protocols. The API
1334 presented in the related draft may be the input to a W3C effort to
1335 define specific and exact expected behavior sets for an object based
1336 JavaScript API for an official WebRTC 1.0 release.
1338 5.3.6. Data Channel
1340 The proposed WebRTC JavaScript Object model will provide a definition
1341 for basic JavaScript usage of the data channel.
1343 5.3.7. Satisfy the expectations of the RTCWEB charter
1345 The object model must adhere to the expectations of the RTCWEB
1346 charter, either directly, via extensions that the working group
1347 can define on top of the object model, or via a JavaScript shim
1348 written to use the functionality of the object model. It must not
1349 preclude the RTCWEB Working Group from fulfilling the previously
1350 stated goals of its charter.
1352 5.3.8. SIP/SDP and current WebRTC API shim compatibility statement
1354 The goal of the object model is to allow for a JavaScript shim that
1355 provides a simple mechanism for parsing and generating SDP for basic
1356 compatibility with SIP networks (capable of supporting the WebRTC
1357 wire protocols).
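The parsing half of such a shim could start from something like the
sketch below, which reads the audio media line and its rtpmap
attributes out of an SDP body. The function name parseAudioSdp and
the shape of the returned object are assumptions for this example,
and the parser intentionally ignores every other SDP line.

```javascript
// Extract the audio port and codec list from an SDP body.
// Only "m=audio" and "a=rtpmap" lines are understood; all other lines
// are skipped. The returned object shape is an illustrative assumption.
function parseAudioSdp(sdp) {
  const result = { port: null, codecs: [] };
  for (const line of sdp.split(/\r?\n/)) {
    const media = /^m=audio (\d+) /.exec(line);
    if (media) {
      result.port = Number(media[1]);
      continue;
    }
    // a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]
    const rtpmap = /^a=rtpmap:(\d+) ([^\/]+)\/(\d+)(?:\/(\d+))?$/.exec(line);
    if (rtpmap) {
      const codec = {
        payloadType: Number(rtpmap[1]),
        name: rtpmap[2],
        clockRate: Number(rtpmap[3]),
      };
      if (rtpmap[4] !== undefined) {
        codec.channels = Number(rtpmap[4]);
      }
      result.codecs.push(codec);
    }
  }
  return result;
}

const sample =
  "v=0\r\n" +
  "m=audio 49170 RTP/AVP 111 0\r\n" +
  "a=rtpmap:111 opus/48000/2\r\n" +
  "a=rtpmap:0 PCMU/8000\r\n";
console.log(parseAudioSdp(sample));
```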
1359 The goal of this object based model is not to provide a working
1360 JavaScript shim on top that is a 1-for-1 match of the current
1361 WebRTC API, including all behaviors, features, bugs and
1362 expectations, since the current approach is not defined well
1363 enough to produce that level of compatibility. Such a goal would
1364 be impossible to meet and would add little value.
1367 Extensions are beyond the scope of the JavaScript shim, but it is
1368 possible for others to fork and modify the shim to their own needs
1369 specific to their own SIP/SDP network infrastructure.
1371 Compatibility with the SDP used in all SIP networks is not a stated
1372 goal for any JavaScript shim since not even SIP providers can agree
1373 on a definitive common set of RFCs and drafts.
1375 5.3.9. Greater Separation of RTCWEB Working Group and Other Working
1376 Groups
1378 A JavaScript object model would remove much of the need for cross
1379 IETF working group coordination, which has become commonplace with
1380 the current approach because it utilizes SDP and its close ties to
1381 SIP. By limiting the RTCWEB technologies used to only those required
1382 for Real-Time Communication from the browser (e.g. RTP, ICE/STUN/
1383 TURN, DTLS), the RTCWEB Working Group is freed from tight couplings
1384 with other IETF working groups, each having their own charters,
1385 schedules, agendas and interests. This enables more rapid progress
1386 among the RTCWEB Working Group, the W3C and the developers who are
1387 to use this technology.
1389 6. Security Considerations
1391 While RTCWEB has its own security considerations for protocols, a
1392 JavaScript object model has no additional requirements other than
1393 those already established for use within RTCWEB, e.g. ICE
1394 connectivity permission check or DTLS fingerprint checks.
1396 JavaScript as a browser language itself has security considerations,
1397 but none is specific to using a JavaScript object model versus a
1398 JavaScript SDP API model, as any proposed implementation must have a
1399 JavaScript API. The specification of any API must list the security
1400 considerations specific to its defined model and API, should any
1401 exist.
1403 Any specific issues for the proposed JavaScript object model will be
1404 outlined in the separate WebRTC JavaScript object model draft, as
1405 needed and warranted.
1407 7. References
1409 7.1. Normative References
1411 [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
1412 with Session Description Protocol (SDP)", RFC 3264, June
1413 2002.
1415 [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
1416 Description Protocol", RFC 4566, July 2006.
1418 7.2. Informative References
1420 [I-D.ietf-mmusic-sdpng]
1421 Kutscher, D., Ott, J., and C. Bormann, "Session
1422 Description and Capability Negotiation", draft-ietf-
1423 mmusic-sdpng-08 (work in progress), February 2005.
1425 [I-D.ivov-rtcweb-noplan]
1426 Ivov, E., Marocco, E., and P. Thatcher, "No Plan:
1427 Economical Use of the Offer/Answer Model in WebRTC
1428 Sessions with Multiple Media Sources", draft-ivov-rtcweb-
1429 noplan-01 (work in progress), June 2013.
1431 [I-D.roach-rtcweb-plan-a]
1432 Roach, A. and M. Thomson, "Using SDP with Large Numbers of
1433 Media Flows", draft-roach-rtcweb-plan-a-00 (work in
1434 progress), May 2013.
1436 [I-D.uberti-rtcweb-plan]
1437 Uberti, J., "Plan B: a proposal for signaling multiple
1438 media sources in WebRTC.", draft-uberti-rtcweb-plan-00
1439 (work in progress), May 2013.
1441 [MediaCapture]
1442 Burnett, D., "Media Capture and Streams", May 2013, .
1445 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
1446 Requirement Levels", BCP 14, RFC 2119, March 1997.
1448 [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
1449 A., Peterson, J., Sparks, R., Handley, M., and E.
1450 Schooler, "SIP: Session Initiation Protocol", RFC 3261,
1451 June 2002.
1453 [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
1454 Jacobson, "RTP: A Transport Protocol for Real-Time
1455 Applications", STD 64, RFC 3550, July 2003.
1457 [RFC5245] Rosenberg, J., "Interactive Connectivity Establishment
1458 (ICE): A Protocol for Network Address Translator (NAT)
1459 Traversal for Offer/Answer Protocols", RFC 5245, April
1460 2010.
1462 [RFC5389] Rosenberg, J., Mahy, R., Matthews, P., and D. Wing,
1463 "Session Traversal Utilities for NAT (STUN)", RFC 5389,
1464 October 2008.
1466 [RFC5766] Mahy, R., Matthews, P., and J. Rosenberg, "Traversal Using
1467 Relays around NAT (TURN): Relay Extensions to Session
1468 Traversal Utilities for NAT (STUN)", RFC 5766, April 2010.
1470 [RFC6347] Rescorla, E. and N. Modadugu, "Datagram Transport Layer
1471 Security Version 1.2", RFC 6347, January 2012.
1473 [WebRTC10]
1474 Bergkvist, A., "WebRTC 1.0 Real-time Communication Between
1475 Browsers", August 2012,
1476 .
1478 Authors' Addresses
1480 Robin Raymond
1481 Hookflash
1482 436, 3553 31 St. NW
1483 Calgary, Alberta T2L 2K7
1485 Email: robin@hookflash.com
1487 Erik Lagerway
1488 Hookflash
1489 436, 3553 31 St. NW
1490 Calgary, Alberta T2L 2K7
1491 Canada
1493 Email: erik@hookflash.com
1495 Inaki Baz Castillo
1496 Versatica
1497 Barakaldo
1498 Basque Country
1499 Spain
1501 Email: ibc@aliax.net
1503 Roman Shpount
1504 TurboBridge
1505 4905 Del Ray Ave Suite 300
1506 Bethesda, MD 20814
1507 USA
1509 Email: rshpount@turbobridge.com