Network Working Group                                            E. Ivov
Internet-Draft                                                     Jitsi
Intended status: Standards Track                            May 29, 2013
Expires: November 30, 2013

  No Plan: Economical Use of the Offer/Answer Model in WebRTC Sessions
                       with Multiple Media Sources
                       draft-ivov-rtcweb-noplan-00

Abstract

   This document describes a model for the lightweight use of SDP
   Offer/Answer in WebRTC.  The goal is to minimize reliance on
   Offer/Answer exchanges in a WebRTC session and provide applications
   with the tools necessary to implement the signalling that they may
   need in a way that best fits their custom requirements and
   topologies.  This simplifies tasks such as signalling multiple media
   sources or providing RTP Synchronisation source (SSRC)
   identification in multi-party sessions.  Another important goal of
   this model is to remove from clients topological constraints such as
   the requirement to know in advance all SSRC identifiers that they
   could potentially introduce in a particular session.

   This document does not question the use of SDP and the Offer/Answer
   model or the value they have in terms of interoperability with
   legacy or other non-WebRTC devices.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.
   It is inappropriate to use Internet-Drafts as reference material or
   to cite them other than as "work in progress."

   This Internet-Draft will expire on November 30, 2013.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1. Background
   2. Introduction
   3. Reliance on Offer/Answer
      3.1. Interoperability with Legacy
   4. Additional Session Control and Signalling
   5. Demultiplexing and Identifying Streams (Use of Bundle)
   6. Simulcasting, FEC, Layering and RTX (Open Issue)
   7. WebRTC API Requirements
   8. IANA Considerations
   9. Informative References
   Appendix A. Acknowledgements
   Author's Address

1. Background

   In its early stages the RTCWEB working group chose to use the
   Session Description Protocol (SDP) and the Offer/Answer model
   [RFC3264] when establishing and negotiating sessions.  This choice
   was also accompanied by the decision not to mandate a specific
   signalling protocol so that, once interoperability has been
   achieved, web applications can choose the semantics that best fit
   their requirements.  In some scenarios however, such as those
   involving the use of multiple media sources, these choices have
   left open the issue of exactly which operations should be handled
   by SDP Offer/Answer and which of them should be left to
   application-specific signalling.

   At the time of writing of this document, the RTCWEB working group is
   considering two approaches to addressing the issue, often referred
   to as Plan A [PlanA] and Plan B [PlanB].  Both of them describe
   semantics that require Offer/Answer exchanges in a number of
   situations where this could be avoided, particularly when adding
   media sources to or removing them from a session.  This requirement
   applies equally to cases where a client adds the stream of a newly
   activated web cam or a simulcast flow, and to the arrival or
   departure of a conference participant.

   Plan A handles such notifications with the addition or removal of
   independent m= lines [PlanA], while Plan B relies on the use of
   multiplexed m= lines but still depends on Offer/Answer exchanges
   for the addition or removal of media stream identifiers [MSID].

   By taking the Offer/Answer approach, both Plan A and Plan B take
   away from the application the opportunity to handle such events in
   a way that is most fitting for the use case, which, among other
   things, also goes against the working group's decision not to
   define a specific signalling protocol.  (It could be argued that it
   is therefore only natural that proponents of each plan, having
   different use cases in mind, are remarkably far from reaching
   consensus.)

   Another problem, more specific to Plan B, is the reliance on
   preliminary announcement of SSRC identifiers for stream
   identification.  While this could be perceived as relatively
   straightforward in one-to-one sessions or even conference calls
   within controlled environments, it can be a problem in the
   following cases:

   o  interoperability with legacy/non-WebRTC endpoints

   o  use within non-controlled and potentially federated conference
      environments where new RTP streams may appear relatively often.
      In such cases the signalling required to describe all of them
      through Offer/Answer may represent substantial overhead while
      none or only a part of it (e.g. the description of a main,
      active speaker stream) may be required by the application.

   By increasing the number of Offer/Answer exchanges, both Plan A and
   Plan B also increase the risk of encountering glare situations
   (i.e. cases where both parties attempt to modify a session at the
   same time).  While glare is also possible with basic Offer/Answer
   and resolution of such situations must be implemented anyway, the
   need to frequently resort to such code may either negatively impact
   user experience (e.g. when "back off" resolution is used) or
   require substantial modifications in the Offer/Answer model and/or
   further venturing into the land of signalling protocols
   [ROACH-GLARELESS-ADD].

   Finally, both Plan A and Plan B also create expectations that fine
   grained control of FEC, layering and RTX flows will always be
   implemented through Offer/Answer, which would not necessarily be
   the best way to handle this in congested situations.

2. Introduction

   The goal of this document is to provide directions for use of the
   SDP Offer/Answer model in a way that satisfies the following
   requirements:

   o  the addition and removal of media sources (e.g. conference
      participants, multiple web cams or "slides") must be possible
      without the need of Offer/Answer exchanges;

   o  the addition or removal of simulcast or layered streams must be
      possible without the need for Offer/Answer exchanges beyond the
      initial declaration of such capabilities for either direction;

   o  call establishment must not require preliminary announcement or
      even knowledge of all potentially participating media sources;

   o  application-specific signalling should be used to cover most
      semantics following call establishment, such as adding, removing
      or identifying SSRCs (a sketch of how this combines with a
      single Offer/Answer exchange is shown after this list);

   o  straightforward interoperability with widely deployed legacy
      endpoints with rudimentary support for Offer/Answer.  This
      includes devices that allow for one audio and potentially one
      video m= line and that expect to only ever be required to render
      a single RTP stream at a time for any of them.  (Note that this
      does NOT include devices that expect to see multiple "m=video"
      lines for different SSRCs as they can hardly be viewed as
      "widely deployed legacy").
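
   The following TypeScript sketch illustrates how these requirements
   could play out in practice: a single Offer/Answer exchange at call
   establishment, carried over whatever channel the application
   already uses, with all later changes conveyed as application
   messages.  The sketch uses the standard WebRTC JavaScript API; the
   WebSocket channel and the "offer"/"answer"/"candidate" message
   shapes are illustrative assumptions, not something defined by this
   document.

      // One-time Offer/Answer exchange over an application-chosen
      // signalling channel (here assumed to be a WebSocket).
      async function establishSession(channel: WebSocket,
                                      localStream: MediaStream) {
        const pc = new RTCPeerConnection();
        localStream.getTracks().forEach(t => pc.addTrack(t, localStream));

        // Trickle ICE candidates through the same application channel.
        pc.onicecandidate = e => {
          if (e.candidate) {
            channel.send(JSON.stringify({ type: "candidate",
                                          candidate: e.candidate }));
          }
        };

        const offer = await pc.createOffer();
        await pc.setLocalDescription(offer);
        channel.send(JSON.stringify({ type: "offer", sdp: offer.sdp }));

        channel.onmessage = async (e) => {
          const msg = JSON.parse(e.data);
          if (msg.type === "answer") {
            await pc.setRemoteDescription({ type: "answer", sdp: msg.sdp });
          }
          // Under the model described here, subsequent additions or
          // removals of media sources would arrive as application
          // messages rather than as new offers.
        };
        return pc;
      }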

   To achieve the above requirements this specification expects that
   browsers and WebRTC endpoints in general will only use SDP
   Offer/Answer to establish transport channels and initialize an RTP
   stack and codec/processing chains.  This also includes any
   renegotiation that requires the re-initialisation of these chains.
   For example, adding VP8 to a session that was set up with only
   H.264 would obviously still require an Offer/Answer exchange.

   All other session control and signalling are to be left to
   applications.

   The actual Offer/Answer semantics presented here do not differ
   fundamentally from those proposed by Plan A and Plan B.  The main
   differentiation point of this approach is the fact that the exact
   protocol mechanism is left to WebRTC applications.  Such
   applications or lightweight signalling gateways can then implement
   either Plan A, or Plan B, or an entirely different signalling
   protocol, depending on what best matches their use cases and
   topology.

3. Reliance on Offer/Answer

   The model presented in this specification relies on use of SDP and
   Offer/Answer in quite the same way as many of the pre-WebRTC (and
   most of the legacy) endpoints do: negotiating formats, establishing
   transport channels and exchanging, in a declarative way, media and
   transport parameters that are then used for the initialization of
   the corresponding stacks.

   The following is an example presenting what this specification
   views as a typical offer sent by a WebRTC endpoint:

   v=0
   o=- 0 0 IN IP4 198.51.100.33
   s=
   t=0 0

   a=group:BUNDLE audio video          // declaring BUNDLE support
   c=IN IP4 198.51.100.33
   a=ice-ufrag:Qq8o/jZwknkmXpIh        // initializing ICE
   a=ice-pwd:gTMACiJcZv1xdPrjfbTHL5qo
   a=ice-options:trickle
   a=fingerprint:sha-1                 // DTLS-SRTP keying
    a4:b1:97:ab:c7:12:9b:02:12:b8:47:45:df:d8:3a:97:54:08:3f:16

   m=audio 5000 RTP/SAVPF 96 0 8
   a=mid:audio
   a=rtcp-mux

   a=rtpmap:96 opus/48000/2            // PT mappings
   a=rtpmap:0 PCMU/8000
   a=rtpmap:8 PCMA/8000

   a=extmap:1 urn:ietf:params:rtp-hdrext:csrc-audio-level // 5285 header
   a=extmap:2 urn:ietf:params:rtp-hdrext:ssrc-audio-level // extensions

   [ICE Candidates]

   m=video 5002 RTP/SAVPF 97 98
   a=mid:video
   a=rtcp-mux

   a=rtpmap:97 VP8/90000    // PT mappings and resolution capabilities
   a=imageattr:97 \
       send [x=[480:16:800],y=[320:16:640],par=[1.2-1.3],q=0.6] \
            [x=[176:8:208],y=[144:8:176],par=[1.2-1.3]] \
       recv *
   a=rtpmap:98 H264/90000
   a=imageattr:98 send [x=800,y=640,sar=1.1,q=0.6] [x=480,y=320] \
                  recv [x=330,y=250]

   a=extmap:3 urn:ietf:params:rtp-hdrext:fec-source-ssrc  // 5285 header
   a=extmap:4 urn:ietf:params:rtp-hdrext:rtx-source-ssrc  // extensions

   a=max-send-ssrc:{*:1}               // declaring maximum
   a=max-recv-ssrc:{*:4}               // number of SSRCs

   [ICE Candidates]

   The answer to the offer above would have roughly the same structure
   and content.  The most important aspects here are:

   o  Preserves interoperability with most kinds of legacy or
      non-WebRTC endpoints.

   o  Allows the negotiation of most parameters that concern the
      media/RTP stack (typically the browser).

   o  Only a single Offer/Answer exchange is required for session
      establishment and, in most cases, for the entire duration of a
      session.

   o  Leaves complete freedom to applications as to the way that they
      are going to signal any other information such as SSRC
      identification information or the addition or removal of RTP
      streams.

3.1. Interoperability with Legacy

   Interoperating with "widely deployed legacy endpoints" is one of
   the main reasons for the RTCWEB working group to choose the SDP
   Offer/Answer model as the basis for media negotiation.  It is hence
   important to clarify the compatibility claims that this
   specification makes.

   A "widely deployed legacy endpoint" is considered to have the
   following characteristics:

   o  Likely to use the SIP protocol.

   o  Capability to gracefully handle one audio and potentially one
      video m= line in an SDP Offer.

   o  Capability to render one SSRC per m= line at any given moment
      but multiple, consecutive SSRCs over a period of time.  This
      would be the case with session replacements following a
      transfer, for example.  While the capability to handle multiple
      SSRCs simultaneously is not uncommon, it cannot be relied upon
      and should first be confirmed by signalling.

   o  Possibly have features such as ICE, BUNDLE, RTCP-MUX, etc.  Just
      as likely not to.

   o  Very unlikely to announce in SDP the SSRCs that they intend to
      use for a given session.

   o  Exact set of features and capabilities: Guaranteed to be wildly
      and widely diverse.

   While it is relatively simple for RTCWEB to accommodate some of the
   above, it is obviously impossible to design a model that could
   simply be labeled as "compatible with legacy".  It is reasonable to
   assume that use cases involving the use of such endpoints will be
   designed for a relatively specific set of devices and applications.
   The role of the WebRTC framework is hence to provide a least-
   common-denominator model that can then be extended by applications.

   It is just as important not to make choices or assumptions that
   will render interoperability for some applications or topologies
   difficult or even impossible.

   This is exactly what the use of Offer/Answer discussed here strives
   to achieve.  Audio/Video offers originating from WebRTC endpoints
   will always have a maximum of one audio and one video m= line.  It
   will be up to applications to determine exactly how many streams
   they can afford to send once such a session has been established.
   The exact mechanism to do this is outside the scope of this
   document (or WebRTC in general).

   Note that it is still possible for WebRTC endpoints to indicate
   support for a maximum number of incoming or outgoing streams for
   reasons such as processing constraints.  Use of the "max-send-ssrc"
   and "max-recv-ssrc" attributes [MAX-SSRC] could be one way of doing
   this, although that mechanism would need to be extended to provide
   ways of distinguishing between independent flows and complementary
   ones such as layered FEC and RTX.  Even with this in mind it is
   still important not to rely on the presence of that indication in
   incoming descriptions, as well as to provide applications with a
   way of retrieving such capabilities from the WebRTC stack (e.g. the
   browser).

   Determining whether a peer has the ability to seamlessly switch
   from one SSRC to another is also left to application-specific
   signalling.  It is worth noting that protocols such as SIP often
   accompany SSRC replacements with extra signalling (re-INVITEs with
   a "Replaces" header) that can easily be reused by applications or
   mapped to something that they deem more convenient.
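
   A WebRTC application that does not reuse SIP could convey the same
   information with a trivial message of its own.  The following
   TypeScript sketch is purely illustrative; the message name and its
   fields are hypothetical and not defined by this document:

      // Hypothetical notification announcing that newSsrc will replace
      // oldSsrc, mirroring what SIP achieves with a re-INVITE carrying
      // a "Replaces" header.
      interface SsrcReplaced {
        type: "ssrc-replaced";
        oldSsrc: number;
        newSsrc: number;
        reason?: "transfer" | "camera-switch" | "other";
      }

      function announceSsrcReplacement(channel: WebSocket,
                                       oldSsrc: number,
                                       newSsrc: number): void {
        const msg: SsrcReplaced = {
          type: "ssrc-replaced",
          oldSsrc: oldSsrc,
          newSsrc: newSsrc,
          reason: "camera-switch"
        };
        // Sent over the application's own channel, not through SDP.
        channel.send(JSON.stringify(msg));
      }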

   For the sake of interoperability this specification strongly
   advises against the use of multiple m= lines for a single media
   type.  Not only would such use be meaningless to a large number of
   legacy endpoints but it is also likely to be mishandled by many of
   them and to cause unexpected behaviour.

   Finally, it is also worth pointing out that there is a significant
   number of feature rich non-WebRTC applications and devices that
   have relatively advanced, modern sets of capabilities.  Such
   endpoints hardly fit the "legacy" qualification.  Yet, as is often
   the case with novel and/or proprietary applications, they too have
   adopted diverse signalling mechanisms and the requirements
   described in this section fully apply when it comes to
   interoperating with them.

4. Additional Session Control and Signalling

   This specification leaves semantics such as the following entirely
   to application-specific signalling:

   o  Adding RTP streams to and removing them from an existing
      session.

   o  Accepting and refusing some of them.

   o  Identifying SSRCs and obtaining additional metadata for them
      (e.g. the user corresponding to a specific SSRC).

   All of the above semantics are best handled by and hence should be
   left to applications.  There are numerous existing or emerging
   solutions, some of them developed by the IETF, that already cover
   this.  These include CLUE channels [CLUE], the SIP Event Package
   for Conference State [RFC4575] and its XMPP variant [COIN].
   Additional mechanisms, undoubtedly many based on JSON, are very
   likely to emerge in the future as WebRTC applications address
   varying use cases, scenarios and topologies.

   The most important part of this specification is hence to prevent
   certain assumptions or topologies from being imposed on
   applications.  One example of this is the need to know and include
   in the Offer/Answer exchange all the SSRCs that can show up in a
   session.  This can be particularly problematic for scenarios that
   involve non-WebRTC endpoints.

   Large scale conference calls, potentially federated through RTP
   translator-like bridges, would be another problematic scenario.

   Being able to always pre-announce SSRCs in such situations could of
   course be made to work but it would come at a price.  It would
   either require a very high number of Offer/Answer updates that
   propagate the information through the entire topology, or the use
   of tricks such as pre-allocating a range of "fake" SSRCs,
   announcing them to participants and then overwriting the actual
   SSRCs with them.  Depending on the scenario both options could
   prove inappropriate or inefficient while some applications may not
   even need such information.  Others could be retrieving it through
   simplistic means such as access to a centralized resource (e.g. a
   URL pointing to a JSON description of the conference).
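
   As a purely hypothetical illustration (neither the resource nor the
   JSON structure below is defined by this document), such a
   centralized description and its retrieval could be as simple as:

      // Hypothetical shape of a conference description served over
      // HTTP instead of being propagated through Offer/Answer updates.
      interface ConferenceInfo {
        focus: string;                   // e.g. a conference URI
        participants: {
          name: string;                  // "Alice", "Bob", ...
          audioSsrcs: number[];
          videoSsrcs: number[];
        }[];
      }

      async function fetchConferenceInfo(url: string):
          Promise<ConferenceInfo> {
        const response = await fetch(url);
        return (await response.json()) as ConferenceInfo;
      }

   An application could poll or subscribe to such a resource and match
   the advertised SSRCs against incoming RTP, without any of those
   SSRCs ever appearing in SDP.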

5. Demultiplexing and Identifying Streams (Use of Bundle)

   This document assumes use of BUNDLE in WebRTC endpoints.  This
   implies that all RTP streams are likely to end up being received on
   the same port.  A demuxing mechanism is therefore necessary in
   order for these packets to then be fed into the appropriate
   processing chain (i.e. matched to an m= line).

      Note: it is important to distinguish between the demultiplexing
      and the identification of incoming flows.  Throughout this
      specification the former is used to refer to the process of
      selecting a depacketizing/decoding/processing chain to feed
      incoming packets to.  Such decisions depend solely on the format
      that is used to encode the content of incoming packets.

      The above is not to be confused with the process of making
      rendering decisions about a processed flow.  Such decisions
      include showing a "current speaker" flow at a specific location,
      window or video tag, while choosing a different one for a
      second, "slides" flow.  Another example would be the possibility
      to attach "Alice", "Bob" and "Carol" labels on top of the
      appropriate UI components.  This specification leaves such
      rendering choices entirely to application-specific signalling as
      described in Section 4.

   This specification uses demuxing based on RTP payload types.  When
   creating offers and answers WebRTC applications MUST therefore
   allocate RTP payload types only once per bundle group.  In cases
   where rtcp-mux is in use this would mean a maximum of 96 payload
   types per bundle [RFC5761].  It has been pointed out that some
   legacy devices may have unpredictable behaviour with payload types
   that are outside the 96-127 range reserved by [RFC3551] for dynamic
   use.  Some applications or implementations may therefore choose not
   to use values outside this range.  Whatever the reason, offerers
   that find they need more than the available payload type numbers
   will simply need to either use a second bundle group or not use
   BUNDLE at all (which in the case of a single audio and a single
   video m= line amounts to roughly the same thing).  This would also
   imply building a dynamic table, mapping SSRCs to PTs and m= lines,
   in order to then also allow for RTCP demuxing.

   While not desirable, the implications of such a decision would be
   relatively limited.  Use of trickle ICE [TRICKLE-ICE] is going to
   lessen the impact on call establishment latency.  Also, the fact
   that this would only occur in a limited number of cases makes it
   unlikely to have a significant effect on port consumption.

   An additional requirement that has been expressed toward demuxing
   is the ability to assign incoming packets with the same payload
   type to different processing chains depending on their SSRCs.  A
   possible example for this is a scenario where two video streams are
   being rendered on different video screens that each have their own
   decoding hardware.

   While the above may appear as a demuxing and a decoding related
   problem it is really mostly a rendering policy specific to an
   application.  As such it should be handled by application-specific
   signalling that could involve custom-formatted, per-SSRC
   information that accompanies SDP offers and answers.
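
   Returning to the payload type based rule above, the demultiplexing
   step itself boils down to a simple lookup.  The following sketch
   uses illustrative data structures only; in practice this logic
   lives inside the browser's RTP stack rather than in the web
   application:

      // Payload-type based demultiplexing: because each payload type
      // appears in at most one m= line of the bundle, the PT field of
      // an incoming packet is enough to pick a processing chain.
      interface ProcessingChain {
        mid: string;                        // "audio" or "video"
        handle(packet: Uint8Array): void;   // depacketize/decode/...
      }

      // Built from the local and remote descriptions, e.g.
      // { 96: audioChain, 0: audioChain, 8: audioChain,
      //   97: videoChain, 98: videoChain }
      const chainsByPayloadType = new Map<number, ProcessingChain>();

      function demux(packet: Uint8Array): void {
        const payloadType = packet[1] & 0x7f;  // low 7 bits of byte 1
        const chain = chainsByPayloadType.get(payloadType);
        if (chain !== undefined) {
          chain.handle(packet);
        }
      }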

6. Simulcasting, FEC, Layering and RTX (Open Issue)

   From a WebRTC perspective, repair flows such as layering, FEC, RTX
   and, to some extent, simulcasting present an interesting challenge,
   which is why they are considered an open issue by this
   specification.

   On the one hand they are transport utilities that need to be
   understood, supported and used by browsers in a way that is mostly
   transparent to applications.  On the other, some applications may
   need to be made aware of them and given the option to control their
   use.  This could be necessary in cases where their use needs to be
   signalled to non-WebRTC endpoints in an application-specific way.
   Another example is the possibility for an application to choose to
   disable some or all repair flows because it has been made aware by
   application-specific signalling that they are temporarily not being
   used/rendered by the remote end (e.g. because it is only displaying
   a thumbnail or because a corresponding video tag is not currently
   visible).

   One way of handling such flows would be to advertise them in the
   way suggested by [RFC5956] and to then control them through
   application-specific signalling.  This option has the merit of
   already existing but it also implies the pre-announcement and
   propagation of SSRCs and the bloated signalling that this incurs.
   Also, relying solely on Offer/Answer here would expose an offerer
   to the typical race condition of repair SSRCs arriving before the
   answer and the processing ambiguity that this would imply.

   Another approach could be a combination of RTCP and RTP header
   extensions [RFC5285] in a way similar to the one employed by the
   Rapid Synchronisation of RTP Flows [RFC6051].  While such a
   mechanism is not currently defined by the IETF, specifying it could
   be relatively straightforward:

   Every packet belonging to a repair flow could carry an RTP header
   extension [RFC5285] that points to the source stream (or source
   layer in case of layered mechanisms).  The following shows one
   possible way of signalling this:

   a=extmap:3 urn:ietf:params:rtp-hdrext:fec-source-ssrc

   In this case the actual RTP packet and header extension could look
   like this:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|1|  CC   |M|     PT      |        sequence number        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+R
   |                           timestamp                           |T
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+P
   |           synchronisation source (SSRC) identifier            |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |      0xBE     |      0xDE     |           length=3            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+E
   | ID-3  |  L=3  |      SSRC of the source RTP flow ...          |x
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+t
   |   ... SSRC    |    0 (pad)    |    0 (pad)    |    0 (pad)    |n
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                          payload data                         |
   |                             ....                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Note that the above is just a stub.  It is an example that is meant
   to show one possible solution with some mechanisms (e.g. 1-D
   Interleaved Parity [RFC6015]).  Other mechanisms may and probably
   will require different extensions or signalling ([SRCNAME] will
   likely be an option for some).  In some cases, where layering
   information is provided by the codec, an extension is not going to
   be necessary at all.

   In cases where FEC or simulcast relations are not immediately
   needed by the recipient, the above information could also be
   delayed until the reception of the first RTCP packet.
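
   Were a header extension based mechanism like the stub above to be
   adopted, recovering the relation on the receiving side would amount
   to a few lines of parsing.  The sketch below assumes the one-byte
   header extension format from [RFC5285] and the hypothetical
   "fec-source-ssrc" element with ID 3 shown earlier:

      // Extract the source SSRC carried in a one-byte RTP header
      // extension element with the given ID.  This is what a
      // receiving RTP stack, not the web application, would do.
      function findSourceSsrc(packet: Uint8Array,
                              extId: number): number | null {
        if ((packet[0] & 0x10) === 0) return null;   // no extension

        const csrcCount = packet[0] & 0x0f;
        const offset = 12 + 4 * csrcCount;    // fixed header + CSRCs

        // One-byte extension header: 0xBEDE + length in 32-bit words.
        if (packet[offset] !== 0xbe || packet[offset + 1] !== 0xde) {
          return null;
        }
        const lengthWords = (packet[offset + 2] << 8) | packet[offset + 3];
        let pos = offset + 4;
        const end = pos + lengthWords * 4;

        while (pos < end) {
          const byte = packet[pos];
          if (byte === 0) { pos++; continue; }        // padding byte
          const id = byte >> 4;
          if (id === 15) break;                       // reserved: stop
          const len = (byte & 0x0f) + 1;              // L is length - 1
          if (id === extId && len === 4) {
            return ((packet[pos + 1] << 24) | (packet[pos + 2] << 16) |
                    (packet[pos + 3] << 8) | packet[pos + 4]) >>> 0;
          }
          pos += 1 + len;
        }
        return null;
      }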

7. WebRTC API Requirements

   One of the main characteristics of this specification is the use of
   SDP for transport channel setup and media stack initialisation
   only.  In order for applications to be able to cover everything
   else it is important that WebRTC APIs actually allow for it.  Given
   the initial directions taken by early implementations and
   specification work, this is currently almost, but not entirely,
   possible.

   The following is a list of requirements that the WebRTC APIs would
   need to satisfy in order for this specification to be usable.  (A
   hypothetical sketch of what such extensions might look like is
   given after the list.  Note: some of the items are already possible
   and are only included for the sake of completeness.)

   1.  Expose the SSRCs of all local MediaStreamTrack-s that the
       application may want to attach to a PeerConnection.

   2.  Expose the SSRCs of all remote MediaStreamTrack-s that are
       received on a PeerConnection.

   3.  Expose to applications all locally generated repair flows that
       exist for a source (e.g. FEC and RTX flows that will be
       generated for a webcam), their types, relations and SSRCs.

   4.  Expose information about the maximum number of incoming streams
       that can be decoded and rendered.

   5.  Applications should be able to pause and resume (disable and
       enable) any MediaStreamTrack.  This should also include the
       possibility to do so for specific repair flows.

   6.  Information about how certain MediaStreamTrack-s relate to each
       other (e.g. a given audio flow is related to a specific video
       flow) may be exchanged by applications after media has started
       arriving.  At that point the corresponding MediaStreamTrack-s
       may have been announced to the application within independent
       MediaStream-s.  It should therefore be possible for
       applications to join such tracks within a single MediaStream.
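
   The following TypeScript sketch is entirely hypothetical: it merely
   illustrates the shape that extensions satisfying requirements 1-4
   might take and does not describe any existing or proposed browser
   API.

      // Hypothetical extensions; none of these members exist in the
      // actual WebRTC API.
      interface RepairFlowInfo {
        kind: "fec" | "rtx" | "layer";
        ssrc: number;
        sourceSsrc: number;
      }

      interface ExtendedMediaStreamTrack extends MediaStreamTrack {
        ssrcs?: number[];                 // requirements 1 and 2
        repairFlows?: RepairFlowInfo[];   // requirement 3
      }

      interface ExtendedPeerConnection extends RTCPeerConnection {
        maxDecodableStreams?: number;     // requirement 4
      }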

8. IANA Considerations

   None.

9. Informative References

   [CLUE]     Duckworth, M., Pepperell, A., and S. Wenger, "Framework
              for Telepresence Multi-Streams", draft-ietf-clue-
              framework (work in progress), May 2013.

   [COIN]     Ivov, E. and E. Marocco, "XEP-0298: Delivering Conference
              Information to Jingle Participants (Coin)", XSF XEP 0298,
              June 2011.

   [MAX-SSRC] Westerlund, M., Burman, B., and F. Jansson, "Multiple
              Synchronization sources (SSRC) in RTP Session Signaling",
              draft-westerlund-avtcore-max-ssrc (work in progress),
              July 2012.

   [MSID]     Alvestrand, H., "Cross Session Stream Identification in
              the Session Description Protocol", draft-ietf-mmusic-msid
              (work in progress), February 2013.

   [PlanA]    Roach, A. B. and M. Thomson, "Using SDP with Large
              Numbers of Media Flows", draft-roach-rtcweb-plan-a (work
              in progress), May 2013.

   [PlanB]    Uberti, J., "Plan B: a proposal for signaling multiple
              media sources in WebRTC", draft-uberti-rtcweb-plan (work
              in progress), May 2013.

   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
              with Session Description Protocol (SDP)", RFC 3264, June
              2002.

   [RFC3551]  Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
              Video Conferences with Minimal Control", STD 65, RFC
              3551, July 2003.

   [RFC4575]  Rosenberg, J., Schulzrinne, H., and O. Levin, "A Session
              Initiation Protocol (SIP) Event Package for Conference
              State", RFC 4575, August 2006.

   [RFC5285]  Singer, D. and H. Desineni, "A General Mechanism for RTP
              Header Extensions", RFC 5285, July 2008.

   [RFC5761]  Perkins, C. and M. Westerlund, "Multiplexing RTP Data and
              Control Packets on a Single Port", RFC 5761, April 2010.

   [RFC5956]  Begen, A., "Forward Error Correction Grouping Semantics
              in the Session Description Protocol", RFC 5956,
              September 2010.

   [RFC6015]  Begen, A., "RTP Payload Format for 1-D Interleaved Parity
              Forward Error Correction (FEC)", RFC 6015, October 2010.

   [RFC6051]  Perkins, C. and T. Schierl, "Rapid Synchronisation of RTP
              Flows", RFC 6051, November 2010.

   [ROACH-GLARELESS-ADD]
              Roach, A. B., "An Approach for Adding RTCWEB Media
              Streams without Glare", draft-roach-rtcweb-glareless-add
              (work in progress), May 2013.

   [SRCNAME]  Westerlund, M., Burman, B., and P. Sandgren, "RTCP SDES
              Item SRCNAME to Label Individual Sources", draft-
              westerlund-avtext-rtcp-sdes-srcname (work in progress),
              October 2012.

   [TRICKLE-ICE]
              Ivov, E., Rescorla, E.K., and J. Uberti, "Trickle ICE:
              Incremental Provisioning of Candidates for the
              Interactive Connectivity Establishment (ICE) Protocol",
              draft-ivov-mmusic-trickle-ice (work in progress), March
              2013.

Appendix A. Acknowledgements

   Many thanks to Enrico Marocco, Bernard Aboba and Peter Thatcher for
   reviewing this document and providing numerous comments and
   substantial input.

Author's Address

   Emil Ivov
   Jitsi
   Strasbourg  67000
   France

   Phone: +33-177-624-330
   Email: emcho@jitsi.org