idnits 2.17.1

draft-ietf-mmusic-msid-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (February 10, 2013) is 4094 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-54) exists of
     draft-ietf-mmusic-sdp-bundle-negotiation-01

     Summary: 0 errors (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                      H. Alvestrand
Internet-Draft                                                    Google
Intended status: Standards Track                       February 10, 2013
Expires: August 14, 2013

 Cross Session Stream Identification in the Session Description Protocol
                        draft-ietf-mmusic-msid-00

Abstract

   This document specifies a grouping mechanism for RTP media streams
   that can be used to specify relations between media streams within
   different RTP sessions as well as within a single RTP session, and
   independently of whether these media streams are described within one
   SDP m-line or in multiple m-lines.

   This mechanism is used to signal the association between the RTP
   concept of SSRC and the WebRTC concept of "MediaStream" /
   "MediaStreamTrack" using SDP signaling.

   This document is a work item of the MMUSIC WG, whose discussion list
   is mmusic@ietf.org.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 14, 2013.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Structure Of This Document
     1.2.  Why A New Mechanism Is Needed
     1.3.  Application to the WEBRTC MediaStream
   2.  The Msid Mechanism
   3.  The Msid-Semantic Attribute
   4.  Applying Msid to WebRTC MediaStreams
     4.1.  Handling of non-signalled tracks
   5.  IANA Considerations
   6.  Security Considerations
   7.  Acknowledgements
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Appendix A.  Design considerations, open questions and alternatives
   Appendix B.  Change log
     B.1.  Changes from rtcweb-msid-00 to -01
     B.2.  Changes from alvestrand-rtcweb-msid-01 to -02
     B.3.  Changes from alvestrand-rtcweb-msid-02 to mmusic-msid-00
     B.4.  Changes from alvestrand-mmusic-msid-00 to -01
     B.5.  Changes from alvestrand-mmusic-msid-01 to -02
     B.6.  Changes from alvestrand-mmusic-msid-02 to ietf-mmusic-00
   Author's Address

1.  Introduction

1.1.  Structure Of This Document

   This document extends the SSRC grouping framework [RFC5576] by adding
   a new grouping relation that can cross RTP session boundaries if
   needed.

   Section 1.2 gives the background on why a new mechanism is needed.

   Section 2 gives the definition of the new mechanism.

   Section 4 gives the application of the new mechanism for providing
   necessary semantic information for the association of
   MediaStreamTracks to MediaStreams in the WebRTC API.

1.2.  Why A New Mechanism Is Needed

   When media is carried by RTP [RFC3550], each RTP media stream is
   distinguished inside an RTP session by its SSRC; each RTP session is
   distinguished from all other RTP sessions by being on a different
   transport association (strictly speaking, 2 transport associations,
   one used for RTP and one used for RTCP, unless RTCP multiplexing
   [RFC5761] is used).

   There exist cases where an application using RTP and SDP needs to
   signal some relationship between RTP media streams that may be
   carried in either the same RTP session or different RTP sessions.
   For instance, there may be a need to signal a relationship between a
   video track in one RTP session and an audio track in another RTP
   session. In traditional SDP, it is not possible to signal that these
   two tracks should be carried in one session, so they are carried in
   different RTP sessions.

   Traditionally, SDP was used to describe the RTP sessions, with one
   m-line being used to describe each RTP session.
   With the advent of extensions like BUNDLE
   [I-D.ietf-mmusic-sdp-bundle-negotiation], this association may be
   more complex, with multiple m-lines being used to describe one RTP
   session; the rest of this document therefore talks about m-lines,
   not RTP sessions, when describing the signalling mechanism.

   The SSRC grouping mechanism ("a=ssrc-group") [RFC5576] can be used to
   associate RTP media streams when those RTP media streams are
   described by the same m-line. The semantics of this mechanism
   prevent the association of RTP media streams that are spread across
   different m-lines.

   The SDP grouping framework [RFC5888] can be used to group m-lines.
   When an m-line describes one and only one RTP media stream, it is
   possible to associate RTP media streams across different m-lines.
   However, if an m-line has multiple RTP media streams, using multiple
   SSRCs, the SDP grouping framework cannot be used for this purpose.

   There are use cases (some of which are discussed in
   [I-D.westerlund-avtcore-multiplex-architecture]) where neither of
   these approaches is appropriate; in those cases, a new mechanism is
   needed.

   In addition, there is sometimes the need for an application to
   specify some application-level information about the association
   between the SSRC and the group. This is not possible using either of
   the frameworks above.

1.3.  Application to the WEBRTC MediaStream

   The W3C WebRTC API specification [W3C.WD-webrtc-20120209] specifies
   that communication between WebRTC entities is done via MediaStreams,
   which contain MediaStreamTracks. A MediaStreamTrack is generally
   carried using a single SSRC in an RTP session (forming an RTP media
   stream; the collision of terminology is unfortunate). There might
   be additional SSRCs, possibly within additional RTP sessions, in
   order to support functionality like forward error correction or
   simulcast.
   This complication is ignored below.

   In the RTP specification, media streams are identified using the SSRC
   field. Streams are grouped into RTP Sessions, and also carry a
   CNAME. Neither CNAME nor RTP session corresponds to a MediaStream.
   Therefore, the association of an RTP media stream to MediaStreams
   needs to be explicitly signaled.

   The marking needs to be on a per-SSRC basis, since one RTP session
   can carry media from multiple MediaStreams, and one MediaStream can
   have media in multiple RTP sessions. This means that the [RFC4574]
   "label" attribute, which is used to label m-lines, is not usable for
   this purpose.

   The marking also needs to carry the unique identifier of the RTP
   media stream as a MediaStreamTrack within the media stream; this is
   done using a single letter to identify whether it belongs in the
   video or audio track list, and the MediaStreamTrack's position within
   that array.

   This usage is described in Section 4.

2.  The Msid Mechanism

   This document extends the Source-Specific Media Attributes framework
   [RFC5576] by adding a new "msid" attribute that can be used with the
   "a=ssrc" SDP attribute. This new attribute allows endpoints to
   associate RTP media streams that are carried in the same or different
   m-lines, as well as allowing application-specific information to be
   attached to the association.

   The value of the "msid" attribute consists of an identifier and
   optional application-specific data, according to the following ABNF
   [RFC5234] grammar:

     ; "attribute" is defined in RFC 4566.
     ; This attribute should be used with the ssrc-attr from RFC 5576.
     attribute =/ msid-attr
     msid-attr = "msid:" identifier [ " " appdata ]
     identifier = token
     appdata = token

   An example MSID value for the SSRC 1234 might look like this:

     a=ssrc:1234 msid:examplefoo v1

   The identifier is a string of ASCII characters chosen from 0-9, a-z,
   A-Z and - (hyphen), consisting of between 1 and 64 characters. It
   MUST be unique among the identifier values used in the same SDP
   session. It is RECOMMENDED that it be generated using a
   random-number generator.

   Application data is carried on the same line as the identifier,
   separated from the identifier by a space.

   The identifier uniquely identifies a group within the scope of an SDP
   description.

   There may be multiple msid attributes on a single SSRC. There may
   also be multiple SSRCs that have the same value for identifier and
   application data.

   Endpoints can update the associations between SSRCs as expressed by
   msid attributes at any time; the semantics and restrictions of such
   grouping and ungrouping are application dependent.

3.  The Msid-Semantic Attribute

   In order to fully reproduce the semantics of the SDP and SSRC
   grouping frameworks, a session-level attribute is defined for
   signaling the semantics associated with an msid grouping.

   This OPTIONAL attribute gives the group identifier and its group
   semantic; it carries the same meaning as the ssrc-group-attr of RFC
   5576 section 4.2, but uses the identifier of the group rather than a
   list of SSRC values.

   An empty list of identifiers is an indication that the sender
   understands the indicated semantic, but has no msid groupings of the
   given type in the present SDP.
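
   Before turning to the msid-semantic syntax, the msid attribute of
   Section 2 can be illustrated with a short Python sketch. It is not
   part of the specification. The function names and the regular
   expressions are assumptions made for illustration; in particular,
   appdata is approximated as any run of non-space characters rather
   than the exact "token" rule of RFC 4566.

```python
import re
from collections import defaultdict

# Identifier rule from Section 2: 1 to 64 characters from 0-9, a-z, A-Z and "-".
IDENTIFIER_RE = re.compile(r"[0-9A-Za-z-]{1,64}")

# Illustrative line pattern: "a=ssrc:<ssrc-id> msid:<identifier> [<appdata>]".
# appdata is approximated as non-space characters (an assumption, see lead-in).
MSID_LINE_RE = re.compile(r"a=ssrc:(\d+) msid:(\S+)(?: (\S+))?")


def parse_msid_line(line):
    """Return (ssrc, identifier, appdata) for an msid line, or None otherwise."""
    match = MSID_LINE_RE.fullmatch(line.strip())
    if match is None:
        return None  # not an "a=ssrc ... msid:" line
    ssrc, identifier, appdata = match.groups()
    if IDENTIFIER_RE.fullmatch(identifier) is None:
        raise ValueError("invalid msid identifier: %r" % identifier)
    return int(ssrc), identifier, appdata


def group_ssrcs_by_msid(sdp_lines):
    """Map each msid identifier to the list of (ssrc, appdata) pairs using it."""
    groups = defaultdict(list)
    for line in sdp_lines:
        parsed = parse_msid_line(line)
        if parsed is not None:
            ssrc, identifier, appdata = parsed
            groups[identifier].append((ssrc, appdata))
    return dict(groups)
```

   Grouping by identifier reflects the statement in Section 2 that
   several SSRCs may share the same identifier, and that one SSRC may
   carry several msid attributes.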

   The ABNF of msid-semantic is:

     attribute =/ msid-semantic-attr
     msid-semantic-attr = "msid-semantic:" token (" " identifier)*
     token =

   The semantic field may hold values from the IANA registries
   "Semantics for the "ssrc-group" SDP Attribute" and "Semantics for the
   "group" SDP Attribute".

   An example msid-semantic might look like this:

     a=msid-semantic:LS xyzzy forolow

   This means that the SDP description has two lip sync groups, with the
   group identifiers xyzzy and forolow, respectively.

4.  Applying Msid to WebRTC MediaStreams

   This section creates a new semantic for use with the framework
   defined in Section 2, to be used for associating SSRCs representing
   MediaStreamTracks within MediaStreams as defined in
   [W3C.WD-webrtc-20120209].

   The semantic token for this semantic is "WMS" (short for WebRTC Media
   Stream).

   The value of the msid corresponds to the "id" attribute of a
   MediaStream.

   In a WebRTC-compatible SDP description, all SSRCs intended to be
   sent from one peer will be identified in the SDP generated by that
   entity.

   The appdata for a WebRTC MediaStreamTrack consists of the "id"
   attribute of a MediaStreamTrack.

   If two different SSRCs have the same value for identifier and
   appdata, it means that these two SSRCs are both intended for the same
   MediaStreamTrack. This may occur if the sender wishes to use
   simulcast or forward error correction, or if the sender intends to
   switch between multiple codecs on the same MediaStreamTrack.

   When an SDP description is updated, a specific msid continues to
   refer to the same MediaStream. Once negotiation has completed on a
   session, there is no memory; an msid value that appears in a later
   negotiation will be taken to refer to a new MediaStream.

   The following are the rules for handling updates of the list of SSRCs
   and their msid values.

   o  When a new msid value occurs in the description, the recipient can
      signal to its application that a new MediaStream has been added.

   o  When a description is updated to have more SSRCs with the same
      msid value, but different appdata values, the recipient can signal
      to its application that new media stream tracks have been added to
      the media stream.

   o  When a description is updated to no longer list the msid value on
      a specific ssrc, the recipient can signal to its application that
      the corresponding media stream track has been closed.

   o  When a description is updated to no longer list the msid value on
      any ssrc, the recipient can signal to its application that the
      media stream has been closed.

   In addition to signaling that the track is closed when it disappears
   from the SDP, the track will also be signaled as being closed when
   the SSRC disappears by the rules of [RFC3550] section 6.3.4 (BYE
   packet received) and 6.3.5 (timeout).

4.1.  Handling of non-signalled tracks

   Pre-WebRTC entities will not send msid. This means that there will
   be some incoming RTP packets with SSRCs where the recipient does not
   know about a corresponding MediaStream id.

   Handling will depend on whether or not any SSRCs are signaled in the
   relevant m-line(s). There are two cases:

   o  No SSRC is signaled with an msid attribute. The SDP session is
      assumed to be a backwards-compatible session. All incoming SSRCs,
      on all m-lines that are part of the SDP session, are assumed to
      belong to independent media streams, each with one track. The
      identifier of this media stream and of the media stream track is a
      randomly generated string; the label of this media stream will be
      set to "Non-WMS stream".

   o  Some SSRCs are signaled with an msid attribute.
      In this case, the session is WebRTC compatible, and the newly
      arrived SSRCs are either caused by a bug or by timing skew between
      the arrival of the media packets and the SDP description. These
      packets MAY be discarded, or they MAY be buffered for a while in
      order to allow immediate startup of the media stream when the SDP
      description is updated. The arrival of media packets MUST NOT
      cause a new MediaStreamTrack to be signaled.

   If a WebRTC entity sends a description, it MUST include the
   msid-semantic:WMS attribute, even if no media streams are sent. This
   allows us to distinguish between the case of no media streams at the
   moment and the case of legacy SDP generation.

   It follows from the above that the WebRTC entity must have the SDP of
   the other party before it can decide correctly whether or not a
   "default" MediaStream should be created. RTP media packets that
   arrive before the remote party's SDP MUST be buffered or discarded,
   and MUST NOT cause a new MediaStreamTrack to be signaled.

   It follows from the above that media stream tracks in the "default"
   media stream cannot be closed by signaling; the application must
   instead signal these as closed when the SSRC disappears according to
   the rules of RFC 3550 section 6.3.4 and 6.3.5.

   NOTE IN DRAFT: Previous versions of this memo suggested adding all
   incoming SSRCs to a single MediaStream. This is problematic because
   we do not know if the SSRCs are synchronized or not before we learn
   the CNAME of the SSRCs, which only happens when an RTCP packet
   arrives. How to identify a non-WMS stream is still open for
   discussion, including whether it's necessary to do so. Using the
   stream label seems like an easy thing to do for debuggability: it's
   not signaled, and is intended for human consumption anyway.

5.  IANA Considerations

   This document requests IANA to register the "msid" attribute in the
   "att-field (source level)" registry within the SDP parameters
   registry, according to the procedures of [RFC5576].

   The required information is:

   o  Contact name, email: IETF, contacted via mmusic@ietf.org, or a
      successor address designated by IESG

   o  Attribute name: msid

   o  Long-form attribute name: Media stream group Identifier

   o  The attribute value contains only ASCII characters, and is
      therefore not subject to the charset attribute.

   o  The attribute gives an association over a set of SSRCs,
      potentially in different m-lines. It can be used to signal the
      relationship between a WebRTC MediaStream and a set of SSRCs.

   o  The details of appropriate values are given in RFC XXXX.

   This document requests IANA to create a new registry called
   "Semantics for the msid-semantic SDP attribute", which should have
   exactly the same rules as for the "Semantics for the ssrc-group SDP
   attribute" registry (Expert Review), and to register the "WMS"
   semantic within this new registry.

   The required information is:

   o  Description: WebRTC Media Stream, as given in RFC XXXX.

   o  Token: WMS

   o  Standards track reference: RFC XXXX

   IANA is requested to replace "RFC XXXX" with the RFC number of this
   document upon publication.

6.  Security Considerations

   An adversary with the ability to modify SDP descriptions has the
   ability to switch around tracks between media streams. This is a
   special case of the general security consideration that modification
   of SDP descriptions needs to be confined to entities trusted by the
   application.

   If implementing buffering as mentioned in Section 4.1, the amount of
   buffering should be limited to avoid memory exhaustion attacks.

   No other attacks that are relevant to the browser's security have
   been identified that depend on this mechanism.
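
   The buffering limit recommended above can be illustrated with a
   minimal Python sketch. It is not part of the specification. The
   class name, the method names, and the two limits (max_ssrcs and
   max_packets_per_ssrc) are hypothetical choices; this document does
   not prescribe specific values or a specific eviction policy.

```python
from collections import deque


class EarlyPacketBuffer:
    """Bounded store for RTP packets whose SSRC is not yet described in SDP."""

    def __init__(self, max_ssrcs=16, max_packets_per_ssrc=50):
        # Both limits are illustrative defaults, not values from the draft.
        self.max_ssrcs = max_ssrcs
        self.max_packets_per_ssrc = max_packets_per_ssrc
        self._queues = {}  # ssrc -> deque of raw packets

    def add(self, ssrc, packet):
        """Buffer a packet; return False if it was discarded instead."""
        queue = self._queues.get(ssrc)
        if queue is None:
            if len(self._queues) >= self.max_ssrcs:
                return False  # too many unknown SSRCs: discard the packet
            queue = deque(maxlen=self.max_packets_per_ssrc)
            self._queues[ssrc] = queue
        queue.append(packet)  # a full deque drops its oldest packet
        return True

    def release(self, ssrc):
        """Return and forget the buffered packets once the SDP lists the SSRC."""
        return list(self._queues.pop(ssrc, ()))
```

   Capping both the number of unknown SSRCs and the packets kept per
   SSRC bounds total memory use regardless of how many distinct SSRC
   values a sender generates.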

7.  Acknowledgements

   This note is based on sketches from, among others, Justin Uberti and
   Cullen Jennings.

   Special thanks to Miguel Garcia and Paul Kyzivat for their work in
   reviewing this draft, with many specific language suggestions.

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", RFC 3550, July 2003.

   [RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", RFC 5234, January 2008.

   [RFC5576]  Lennox, J., Ott, J., and T. Schierl, "Source-Specific
              Media Attributes in the Session Description Protocol
              (SDP)", RFC 5576, June 2009.

   [W3C.WD-webrtc-20120209]
              Bergkvist, A., Burnett, D., Narayanan, A., and C.
              Jennings, "WebRTC 1.0: Real-time Communication Between
              Browsers", World Wide Web Consortium WD WD-webrtc-
              20120209, February 2012, .

8.2.  Informative References

   [I-D.ietf-mmusic-sdp-bundle-negotiation]
              Holmberg, C. and H. Alvestrand, "Multiplexing Negotiation
              Using Session Description Protocol (SDP) Port Numbers",
              draft-ietf-mmusic-sdp-bundle-negotiation-01 (work in
              progress), August 2012.

   [I-D.westerlund-avtcore-multiplex-architecture]
              Westerlund, M., Burman, B., Perkins, C., and H.
              Alvestrand, "Guidelines for using the Multiplexing
              Features of RTP",
              draft-westerlund-avtcore-multiplex-architecture-02 (work
              in progress), July 2012.

   [RFC4574]  Levin, O. and G. Camarillo, "The Session Description
              Protocol (SDP) Label Attribute", RFC 4574, August 2006.

   [RFC5761]  Perkins, C. and M. Westerlund, "Multiplexing RTP Data and
              Control Packets on a Single Port", RFC 5761, April 2010.

   [RFC5888]  Camarillo, G. and H. Schulzrinne, "The Session Description
              Protocol (SDP) Grouping Framework", RFC 5888, June 2010.

Appendix A.  Design considerations, open questions and alternatives

   This appendix should be deleted before publication as an RFC.

   One suggested mechanism has been to use CNAME instead of a new
   attribute. This was abandoned because CNAME identifies a
   synchronization context; one can imagine both wanting to have tracks
   from the same synchronization context in multiple MediaStreams and
   wanting to have tracks from multiple synchronization contexts within
   one MediaStream (but the latter is impossible, since a MediaStream is
   defined to impose synchronization on its members).

   Another suggestion has been to put the msid value within an attribute
   of RTCP SR (sender report) packets. This doesn't offer the ability
   to know that you have seen all the tracks currently configured for a
   media stream.

   There has been a suggestion that this mechanism could be used to mute
   tracks too. This is not done at the moment.

   Discarding of incoming data when the SDP description isn't updated
   yet (Section 4.1) may cause clipping. However, the same issue exists
   when crypto keys aren't available. Input sought.

   There's been a suggestion that acceptable SSRCs should be signaled in
   a response, giving a recipient the ability to say "no" to certain
   SSRCs. This is not supported in the current version of this
   document.

Appendix B.  Change log

   This appendix should be deleted before publication as an RFC.

B.1.  Changes from rtcweb-msid-00 to -01

   Added track identifier.

   Added inclusion-by-reference of draft-lennox-mmusic-source-selection
   for track muting.

   Some rewording.

B.2.  Changes from alvestrand-rtcweb-msid-01 to -02

   Split document into sections describing a generic grouping mechanism
   and sections describing the application of this grouping mechanism to
   the WebRTC MediaStream concept.

   Removed the mechanism for muting tracks, since this is not central to
   the MSID mechanism.

B.3.  Changes from alvestrand-rtcweb-msid-02 to mmusic-msid-00

   Changed the draft name according to the wishes of the MMUSIC group
   chairs.

   Added text indicating cases where it's appropriate to have the same
   appdata for multiple SSRCs.

   Minor textual updates.

B.4.  Changes from alvestrand-mmusic-msid-00 to -01

   Increased the amount of explanatory text, much based on a review by
   Miguel Garcia.

   Removed references to BUNDLE, since that spec is under active
   discussion.

   Removed distinguished values of the MSID identifier.

B.5.  Changes from alvestrand-mmusic-msid-01 to -02

   Changed the order of the "msid-semantic: " attribute's value fields
   and allowed multiple identifiers. This makes the attribute useful as
   a marker for "I understand this semantic".

   Changed the syntax for "identifier" and "appdata" to be "token".

   Changed the registry for the "msid-semantic" attribute values to be a
   new registry, based on advice given in Atlanta.

B.6.  Changes from alvestrand-mmusic-msid-02 to ietf-mmusic-00

   Updated terminology to refer to m-lines rather than RTP sessions when
   discussing SDP formats and the ability of other linking mechanisms to
   refer to SSRCs.

   Changed the "default" mechanism to return independent streams after
   considering the synchronization problem.

   Removed the space from between "msid-semantic" and its value, to be
   consistent with RFC 5576.

Author's Address

   Harald Alvestrand
   Google
   Kungsbron 2
   Stockholm, 11122
   Sweden

   Email: harald@alvestrand.no