idnits 2.17.1

draft-abhishek-mmusic-overlay-grouping-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (October 27, 2020) is 1276 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866)

     Summary: 1 error (**), 0 flaws (~~), 1 warning (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

mmusic                                                       R. Abhishek
Internet-Draft                                                   Tencent
Intended status: Standards Track                        October 27, 2020
Expires: April 30, 2021

 SDP Overlay Grouping framework for immersive telepresence media streams
               draft-abhishek-mmusic-overlay-grouping-00

Abstract

   This document defines semantics that allow for signalling a new SDP
   group "OL" for overlays in an immersive telepresence session.
   The "OL" attribute can be used by the application to relate all the
   overlay media streams, enabling them to be added as overlays on top
   of the immersive video.  The overlay grouping semantics is required
   if the media data is separate and transported via different
   protocols.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 30, 2021.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Contribution and Discussion Venues for this draft
   3.  Terminology
   4.  Overview of Operation
   5.  Overlay Stream Group Identification Attribute
   6.  Use of group and mid
   7.  Example of OL
   8.  Security Considerations
   9.  IANA Considerations
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Author's Address

1.  Introduction

   Telepresence [RFC7205] can be described as a technology that allows a
   person the experience of "being present" at a remote location for
   video as well as audio telepresence sessions, so as to enable the
   user's sense of realism and presence [TS26.223].  SDP [RFC4566] is
   predominantly used for describing the format of multimedia
   communication sessions for telepresence conferencing.  These sessions
   use open standards such as RTP [RFC3550] and SIP [RFC3261].

   An SDP session may contain more than one media line, with each media
   line identified by an "m=" line.  Each line denotes a single media
   stream.  If multiple media lines are present in a session, a receiver
   needs to identify the relationship between those media lines.

   An overlay media stream can be defined as a piece of visual media
   that can be rendered over an immersive video or image or over a
   viewport [ISO23090].  When an overlay is transmitted, its media
   stream needs to be uniquely identified across multiple SDP
   descriptions exchanged with different receivers so that the stream
   can be identified in terms of its role in the session irrespective of
   its media type and transport protocol.

   In an immersive telepresence session, one media stream is sent as
   the immersive stream whereas other media streams are overlaid on top
   of the immersive video/image.  An end user can stream more than one
   overlay, subject to its decoding capacity.  When multiple overlay
   streams are transmitted within a session, the receiving application
   needs to be able to relate the media streams to each other.  This
   can be achieved with the SDP grouping framework by using the "group"
   attribute, which groups different "m" lines in a session.  However,
   the current SDP signalling framework does not provide such grouping
   semantics for overlays.

   This document describes a new SDP group semantics for grouping the
   overlays when an immersive media stream is transmitted for
   telepresence conferencing.  An SDP session description consists of
   one or multiple media lines, known as "m" lines, which can be
   identified by a token carried in a "mid" attribute.  The session
   description carries a session-level "group" attribute that groups
   different media lines using a defined group semantics.  The
   semantics defined in this memo is to be used in conjunction with
   "The Session Description Protocol (SDP) Grouping Framework"
   [RFC5888].

2.  Contribution and Discussion Venues for this draft

   (Note to RFC Editor - if this document ever reaches you, please
   remove this section)

   Substantial discussion of this document should take place on the
   MMUSIC working group mailing list (mmusic@ietf.org).  Subscription
   and archive details are at
   https://www.ietf.org/mailman/listinfo/mmusic.

3.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

4.  Overview of Operation

   This section gives a non-normative description of the SDP overlay
   group semantics.
   An immersive telepresence session may consist of one or more
   conference rooms, each with a 360-degree camera, and remote users
   who use head-mounted displays for streaming.  "Participant cameras"
   are used to capture the conference participants, whereas
   "presentation cameras" or "content cameras" can be used for document
   display [RFC7205].  A remote participant can stream any of the
   available immersive videos in the session as the background, whereas
   other available streams, such as the presentation stream or 2D video
   from any other room or participant, can be used as overlays on top
   of the immersive video/image.

   A user with a head-mounted display may stream more than one overlay
   in a single SDP session.  Each overlay stream is described by an "m"
   line in the SDP session description.  Each "m" line in the session
   description is identified by a token carried via the "mid" attribute.
   When multiple overlay streams are transmitted within a session, the
   receiving application needs to be able to relate the media streams
   to each other.  This is achieved by using the SDP grouping framework
   [RFC5888].  The session description carries a session-level "group"
   attribute for the overlays, which groups different "m" lines using
   the overlay (OL) group semantics.

5.  Overlay Stream Group Identification Attribute

   The "overlay media stream identification" attribute is used to
   identify overlay media streams within a session description.  In an
   overlay group, the media lines MAY have different media contents.
   Its formatting in SDP [RFC4566] is described by the following
   Augmented Backus-Naur Form (ABNF) [RFC5234]:

       mid-attribute      = "a=mid:" identification-tag
       identification-tag = token
                            ; token is defined in RFC4566

   This document defines a new group semantics, "OL", which is used to
   identify the media streams of an overlay group within a session
   description.  It is used for grouping the media streams for
   different overlays together within a session.  An application that
   receives a session description that contains "m" lines grouped
   together using "OL" semantics MUST overlay the corresponding media
   streams on top of the immersive media stream.

6.  Use of group and mid

   All group and mid attributes MUST follow the rules defined in
   [RFC5888].  The "mid" attribute should be used for all "m" lines
   within a session description.  If any "m" line within a session
   description has no "mid" attribute, the application MUST NOT perform
   any media line grouping.  If the identification-tags associated with
   "a=group" lines do not map to any "m" lines, they MUST be ignored.

       group-attribute     = "a=group:" semantics
                             *(SP identification-tag)
       semantics           = "OL" / semantics-extension
       semantics-extension = token
                             ; token is defined in RFC4566

7.  Example of OL

   The following two examples show a session description for overlays in
   an immersive telepresence conference.  The "group" line indicates
   that the "m" lines with tokens 1 and 2 are grouped for the purpose of
   overlays and are intended to be overlaid on top of the immersive
   video.

   In the first example shown below, two overlays are being transmitted.
   The first media stream (mid:1) carries a video stream, and the
   second media stream (mid:2) carries an audio stream.

       v=0
       o=Alice 292742730 29277831 IN IP4 233.252.0.74
       c=IN IP4 233.252.0.79
       t=0 0
       a=group:OL 1 2
       m=video 30000 RTP/AVP 31
       a=mid:1
       m=audio 30002 RTP/AVP 0
       a=mid:2

   The second example, below, uses the "content" attribute with the
   media streams that are transmitted for overlay purposes.

       v=0
       o=Alice 292742730 29277831 IN IP4 233.252.0.74
       c=IN IP4 233.252.0.79
       t=0 0
       a=group:OL 1 2
       m=video 30000 RTP/AVP 31
       a=content:slides
       a=mid:1
       m=video 30002 RTP/AVP 31
       a=content:speaker
       a=mid:2

8.  Security Considerations

   All security considerations as defined in [RFC5888] apply:

   Using the "group" parameter with FID semantics, an entity that
   managed to modify the session descriptions exchanged between the
   participants to establish a multimedia session could force the
   participants to send a copy of the media to any destination of its
   choosing.

   Integrity mechanisms provided by protocols used to exchange session
   descriptions and media encryption can be used to prevent this attack.

   In SIP, Secure/Multipurpose Internet Mail Extensions (S/MIME)
   [RFC8550] and Transport Layer Security (TLS) [RFC8446] can be used to
   protect session description exchanges in an end-to-end and a
   hop-by-hop fashion, respectively.

9.  IANA Considerations

   The following contact information shall be used for all registrations
   included here:

       Contact:  Rohit Abhishek
       email:    rabhishek@rabhishek.com
       tel:      +1-816-585-7500

   This document defines a new SDP group semantics for overlays for an
   immersive telepresence session.  This attribute can be used by the
   application to group all the overlays in a session.  Semantics values
   to be used with this framework should be registered by the IANA
   following the Standards Action policy [RFC8126].  This document adds
   a new group semantics and follows the registry group defined in
   [RFC5888].

   The following semantics needs to be registered by IANA in the
   'Semantics for the "group" SDP Attribute' registry under SDP
   Parameters.

       Semantics        Token         Reference
       ----------------------------------------------
       Overlay          OL            RFCXXXX

   The "OL" attribute is used to group different media streams to be
   rendered as overlays.  Its format is defined in Section 5.

   The IANA Considerations section of the RFC MUST include the following
   information, which appears in the IANA registry along with the RFC
   number of the publication.

   o  A brief description of the semantics.

   o  Token to be used within the "group" attribute.  This token may be
      of any length, but SHOULD be no more than four characters long.

   o  Reference to a standards track RFC.

10.  References

10.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
              A., Peterson, J., Sparks, R., Handley, M., and E.
              Schooler, "SIP: Session Initiation Protocol", RFC 3261,
              DOI 10.17487/RFC3261, June 2002.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
              July 2003.

   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
              Description Protocol", RFC 4566, DOI 10.17487/RFC4566,
              July 2006.

   [RFC5234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234,
              DOI 10.17487/RFC5234, January 2008.

   [RFC5888]  Camarillo, G. and H. Schulzrinne, "The Session Description
              Protocol (SDP) Grouping Framework", RFC 5888,
              DOI 10.17487/RFC5888, June 2010.

   [RFC8126]  Cotton, M., Leiba, B., and T.
              Narten, "Guidelines for Writing an IANA Considerations
              Section in RFCs", BCP 26, RFC 8126, DOI 10.17487/RFC8126,
              June 2017.

   [RFC8446]  Rescorla, E., "The Transport Layer Security (TLS) Protocol
              Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018.

   [RFC8550]  Schaad, J., Ramsdell, B., and S. Turner, "Secure/
              Multipurpose Internet Mail Extensions (S/MIME) Version 4.0
              Certificate Handling", RFC 8550, DOI 10.17487/RFC8550,
              April 2019.

10.2.  Informative References

   [ISO23090]
              "Information technology -- Coded representation of
              immersive media -- Part 2: Omnidirectional MediA Format
              (OMAF) 2nd Edition", ISO 23090-2:2020(E), February 2020.

   [RFC7205]  Romanow, A., Botzko, S., Duckworth, M., and R. Even, Ed.,
              "Use Cases for Telepresence Multistreams", RFC 7205,
              DOI 10.17487/RFC7205, April 2014.

   [TS26.223]
              "3rd Generation Partnership Project; Technical
              Specification Group Services and System Aspects;
              Telepresence using the IP Multimedia Subsystem (IMS);
              Media Handling and Interaction", 3GPP TS26.223, March
              2020.

Author's Address

   Rohit Abhishek
   Tencent
   2747 Park Blvd
   Palo Alto 94588
   USA

   Email: rabhishek@rabhishek.com
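
   As a non-normative illustration, the following Python sketch shows
   how a receiving application might apply the rules of Section 6 to a
   session description modelled on the second example in Section 7.
   The function name parse_overlay_groups is hypothetical and is not
   defined by this document.  The sketch assumes that only session-level
   "a=group" lines are considered, and that an identification-tag that
   maps to no "m" line is dropped individually while the rest of the
   group is kept.  It is not a complete SDP parser.

```python
from typing import List, Optional

# Session description modelled on the second example in Section 7.
EXAMPLE_SDP = """\
v=0
o=Alice 292742730 29277831 IN IP4 233.252.0.74
c=IN IP4 233.252.0.79
t=0 0
a=group:OL 1 2
m=video 30000 RTP/AVP 31
a=content:slides
a=mid:1
m=video 30002 RTP/AVP 31
a=content:speaker
a=mid:2
"""


def parse_overlay_groups(sdp: str) -> List[List[str]]:
    """Return the "mid" tags grouped by each session-level "a=group:OL" line.

    Hypothetical helper for illustration only.
    """
    group_tags: List[List[str]] = []  # tags listed on each OL group line
    mids: List[Optional[str]] = []    # one entry per "m" line; None if no a=mid

    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # A new media section starts; its mid is not known yet.
            mids.append(None)
        elif line.startswith("a=mid:") and mids:
            mids[-1] = line[len("a=mid:"):].strip()
        elif line.startswith("a=group:") and not mids:
            # Assumption: only group lines before the first "m" line count.
            parts = line[len("a=group:"):].split()
            if parts and parts[0] == "OL":
                group_tags.append(parts[1:])

    # Section 6: if any "m" line has no "mid", perform no grouping at all.
    if any(mid is None for mid in mids):
        return []

    # Section 6: tags that map to no "m" line are ignored (assumed per tag).
    known = set(mids)
    return [[tag for tag in tags if tag in known] for tags in group_tags]


if __name__ == "__main__":
    print(parse_overlay_groups(EXAMPLE_SDP))  # prints [['1', '2']]
```

   An application could then treat the "m" lines whose "mid" values
   appear in a returned group as the overlay streams to render on top
   of the immersive media stream.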