Network Working Group                                         T. Schierl
Internet-Draft                                            Fraunhofer HHI
Intended status: Standards Track                               S. Wenger
Expires: August 24, 2008                                           Nokia
                                                       February 25, 2008

Signaling media decoding dependency in Session Description Protocol (SDP)
draft-ietf-mmusic-decoding-dependency-01

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on August 24, 2008.

Copyright Notice

Copyright (C) The IETF Trust (2008).

Abstract

This memo defines semantics that allow for signaling the decoding dependency of different media descriptions with the same media type in the Session Description Protocol (SDP). This is required, for example, if media data is separated and transported in different network streams as a result of the use of a layered or multiple descriptive media coding process.

A new grouping type "DDP" -- decoding dependency -- is defined, to be used in conjunction with RFC 3388 entitled "Grouping of Media Lines in the Session Description Protocol".
In addition, an attribute is specified describing the relationship of the media streams in a "DDP" group indicated by media identification attribute(s) and RTP payload type(s).

Table of Contents

1. Introduction
2. Terminology
3. Definitions
4. Motivation, Use Cases, and Architecture
   4.1. Motivation
   4.2. Use cases
5. Signaling Media Dependencies
   5.1. Design Principles
   5.2. Semantics
      5.2.1. SDP grouping semantics for decoding dependency
      5.2.2. Attribute for dependency signaling per media-stream
6. Usage of new semantics in SDP
   6.1. Usage with the SDP Offer/Answer Model
   6.2. Declarative usage
   6.3. Usage with Capability Negotiation
   6.4. Examples
7. Security Considerations
8. IANA Considerations
9. Open Issues
10. References
   10.1. Normative References
   10.2. Informative References
Appendix A. Changes From Earlier Versions
Authors' Addresses
Full Copyright Statement
Intellectual Property Statement
Acknowledgements

1. Introduction

An SDP session description may contain one or more media descriptions, each identifying a single media stream. A media description is identified by one "m=" line. Today, if more than one "m=" line exists indicating the same media type, a receiver cannot identify a specific relationship between those media.

A Multiple Description Coding (MDC) or layered Media Bitstream contains, by definition, one or more Media Partitions that are conveyed in their own media stream. In Multi View Coding (MVC) [I-D.wang-avt-rtp-mvc], layered dependencies between views are used to increase the coding efficiency. The cases we are interested in are layered and MDC Bitstreams with two or more Media Partitions. Carrying more than one Media Partition in its own session is one of the key use cases for employing layered or MDC coded media. Senders, network elements, or receivers can suppress sending/forwarding/subscribing/decoding individual Media Partitions and still preserve perhaps suboptimal, but still useful media quality.

One property of all Media Bitstreams relevant to this memo is that their Media Partitions have a well-defined usage relationship. For example, in layered coding, "higher" Media Partitions are useless without "lower" ones. In MDC coding, Media Partitions are complementary -- the more Media Partitions one receives, the better the reproduced quality may be. At present, SDP and its supporting infrastructure of RFCs lack the means to express such a usage relationship.

The trigger for the present memo was the standardization process of the RTP payload format for the Scalable Video Coding extension to ITU-T Rec.
H.264 / MPEG-4 AVC [I-D.ietf-avt-rtp-svc]. When drafting [I-D.ietf-avt-rtp-svc], it was observed that the aforementioned lack of signaling support is not specific to SVC, but applies to all layered or MDC codecs. Therefore, this memo presents a generic solution.

The mechanisms defined herein are media transport protocol dependent, i.e. applicable to the use of RTP [RFC3550] only.

The SDP grouping of Media Lines of different media types is out of scope of this memo.

2. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14, RFC 2119 [RFC2119].

3. Definitions

Media stream:
As per [RFC4566].

Media Bitstream:
A valid, decodable stream, containing all Media Partitions generated by the encoder. A Media Bitstream normally conforms to a media coding standard.

Media Partition:
A subset of a Media Bitstream intended for independent transportation. An integer number of Media Partitions forms a Media Bitstream. In layered coding, a Media Partition represents one or more layers that are handled as a unit. In MDC coding, a Media Partition represents one or more descriptions that are handled as a unit.

Decoding dependency:
The class of relationships Media Partitions have to each other. At present, this memo defines two decoding dependencies: layering and multiple description.

Layered coding dependency:
Each Media Partition is only useful (i.e. can be decoded) when all of the Media Partitions it depends on are available. The dependencies between the Media Partitions therefore create a directed graph. Note: normally, in layered coding, the more Media Partitions are employed (following the rule above), the better the achievable reproduced quality.
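
As an informal illustration of the layered coding dependency defined above (not part of this specification), the Media Partitions needed to decode a given partition can be found by walking the directed dependency graph. The following Python sketch assumes the graph is available as a mapping from each Media Partition to the partitions it directly depends on. The partition names, the mapping, and both function names are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Set


def required_partitions(target: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return the target partition plus everything it depends on, directly or indirectly."""
    needed: Set[str] = set()
    stack = [target]
    while stack:
        part = stack.pop()
        if part in needed:
            continue  # already visited; also guards against malformed cyclic input
        needed.add(part)
        stack.extend(depends_on.get(part, []))
    return needed


def is_decodable(target: str, depends_on: Dict[str, List[str]], available: Set[str]) -> bool:
    """A partition is useful only if all partitions it depends on are available."""
    return required_partitions(target, depends_on) <= available


# Hypothetical layered Media Bitstream: a base layer, two enhancement
# layers that each refer to the base layer, and one layer that needs both.
DEPENDS_ON = {
    "base": [],
    "spatial": ["base"],
    "temporal": ["base"],
    "quality": ["spatial", "temporal"],
}

if __name__ == "__main__":
    print(sorted(required_partitions("quality", DEPENDS_ON)))
    print(is_decodable("quality", DEPENDS_ON, {"base", "quality"}))
```

In this sketch a receiver holding only "base" and "quality" cannot decode "quality", because "spatial" and "temporal" are missing from the set it has received.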

Multi description coding (MDC) dependency:
N of M Media Partitions are required to form a Media Bitstream, but there is no hierarchy between these Media Partitions. Most MDC schemes aim at an increase of reproduced media quality when more Media Partitions are decoded. Some MDC schemes require more than one Media Partition to form an Operation Point.

Operation point:
In layered coding, a subset of a layered Media Bitstream that includes all Media Partitions required for reconstruction at a certain point of quality, error resilience, or another property, and does not include any other Media Partitions. In MDC coding, a subset of an MDC Media Bitstream that is compliant with the MDC coding standard in question.

4. Motivation, Use Cases, and Architecture

4.1. Motivation

This memo is concerned with two types of decoding dependencies: layered and multi-description. The transport of layered and multi-description coded media shares, as a key motivator, the desire to adapt media to network conditions such as bandwidth, error rates, connectivity of endpoints in multicast or broadcast scenarios, and similar.

o Layered decoding dependency:

In layered coding, the partitions of a Media Bitstream are known as media layers or simply layers. One or more layers may be transported in different media streams in the sense of [RFC4566]. A classic use case is known as receiver-driven layered multicast, in which a receiver selects a combination of media streams in response to quality or bit-rate requirements.

Back in the mid 1990s, the then available layered media formats and codecs envisioned primarily (or even exclusively) a one-dimensional hierarchy of layers. That is, each so-called enhancement layer referred to exactly one layer "below". The single exception has been the base layer, which is self-contained.
Therefore, the identification of one enhancement layer fully specifies the Operation Point of a layered coding scheme, including knowledge about all the other layers that need to be decoded.

[RFC4566] contains rudimentary support for exactly this use case and media formats, in that it allows for signaling a range of transport addresses in a certain media description. By definition, a higher transport address identifies a higher layer in the one-dimensional hierarchy. A receiver needs only to decode data conveyed over this transport address and lower transport addresses to decode this Operation Point.

Newer media formats depart from this simple one-dimensional hierarchy, in that highly complex (at least tree-shaped) dependency hierarchies can be implemented. Compelling use cases for these complex hierarchies have been identified by industry. Support for them is therefore desirable. However, SDP, in its current form, does not allow for the signaling of these complex relationships. Therefore, receivers cannot make an informed decision on which layers to subscribe to (in case of layered multicast).

Layered decoding dependency may also exist in a Multi View Coding environment. Views may be coded using inter-view dependencies to increase coding efficiency. This results in Media Bitstreams that may logically be separated into Media Partitions representing different views of the reconstructed video signal. These Media Partitions cannot be decoded independently, and, therefore, other Media Partitions are required for reconstruction. To express this relationship, the signaling needs to express the dependencies of the views, which in turn are Media Partitions in the sense of this document.

o Multi descriptive decoding dependency:

In the most basic form of MDC, each Media Partition forms an independent representation of the media.
That is, decoding of any of the Media Partitions yields useful reproduced media data. When more than one Media Partition is available, a decoder can process them jointly, and the resulting media quality increases. The highest reproduced quality is available if all original Media Partitions are available for decoding.

More complex forms of multiple description coding can also be envisioned, i.e. where, as a minimum, N out of M total Media Partitions need to be available to allow meaningful decoding.

MDC has not yet been embraced heavily by the media standardization community, though it is the subject of a lot of academic research. As an example, we refer to [MDC].

In this memo, we cover MDC because a) we envision that MDC media formats will come into practical use within the lifetime of this memo, and b) the solution for its signaling is very similar to the one for layered coding.

4.2. Use cases

o Receiver driven layered multicast

This technology is discussed in [RFC3550] and references therein. We refrain from elaborating further; the subject is well known and understood.

o Multiple end-to-end transmission with different properties

Assume a unicast and point-to-point topology, wherein one endpoint sends media to another. Assume further that different forms of media transmission are available. The difference may lie in the cost of the transmission (free, charged), in the available protection (unprotected/secure), in the quality of service (guaranteed quality / best effort), or other factors.

Layered and MDC coding allow matching the media characteristics to the available transmission path(s). For example, in layered coding it makes sense to convey the base layer over high QoS.
Enhancement layers, on the other hand, can be conveyed over best effort, as they are "optional" in their characteristic -- nice to have, but non-essential for media consumption. In a different scenario, the base layer may be offered in a non-encrypted session as a free preview. An encrypted enhancement layer references this base layer and allows optimal quality play-back; however, it is only accessible to users who have the key, which may have been distributed by a conditional access mechanism.

5. Signaling Media Dependencies

5.1. Design Principles

The dependency signaling is only feasible between media descriptions described with an "m="-line and with an assigned media identification attribute ("mid"), as defined in [RFC3388].

5.2. Semantics

5.2.1. SDP grouping semantics for decoding dependency

This specification defines a new grouping semantic Decoding Dependency "DDP":

DDP associates a media stream, identified by its mid attribute, with a DDP group. Each media stream MUST be composed of an integer number of Media Partitions. A media stream is identified by a session-unique RTP payload type number within an "m="-line. In a DDP group, all media streams MUST have the same type of decoding dependency (as signaled by the attribute defined in 5.2.2). All media streams MUST contain at least one Operation Point. The DDP group type informs a receiver about the requirement for treating the media streams of the group according to the new media level attribute "depend", as defined in 5.2.2.

When using multiple codecs, e.g. for the Offer/Answer model, the media streams MUST have the same dependency structure, regardless of which payload type number is used.

5.2.2. Attribute for dependency signaling per media-stream

This memo defines a new media-level attribute, "depend", with the following ABNF [RFC5234].
The "identification-tag" is defined in [RFC3388]:

   depend-attribute = "a" "=" "depend" ":"
                      ( dependent-payload-type dependency-tag ";" )
                      *( SP dependent-payload-type dependency-tag ";" )
                      CRLF

   dependency-tag   = dependency-type
                      *1( SP identification-tag ":"
                      payload-type-dependency
                      *( "," payload-type-dependency ) )

   dependency-type  = "lay" / "mdc"

"dependent-payload-type" indicates the payload type number, as defined in [RFC4566], that depends on a "payload-type-dependency" in the "m="-line indicated by the value of "identification-tag" within the "dependency-tag".

"payload-type-dependency" indicates the payload type number in the "m="-line identified by the "identification-tag" within the "dependency-tag", which the "dependent-payload-type" number of the dependent "m="-line depends on.

The "depend"-attribute describes the decoding dependency. The "depend"-attribute MAY be followed by a sequence of "dependency-tag"(s) which identify all related RTP payload types in all related "m="-lines. The attribute MAY be used with multicast as well as with unicast transport addresses. The following types of dependencies are defined:

o lay: Layered decoding dependency -- identifies the described media stream as one or more Media Partitions of a layered Media Bitstream. When "lay" is used, all required media streams for the Operation Point MUST be identified by "identification-tag" and "payload-type-dependency" following the "lay" string.

o mdc: Multi descriptive coding dependency -- signals that the described media stream is part of an MDC Media Bitstream. By definition, at least N out of M media streams of the group need to be available to form an Operation Point. The values of N and M depend on the properties of the Media Bitstream and are not signaled within this context.
When "mdc" is used, all required media streams for the Operation Point MUST be identified by "identification-tag" and "payload-type-dependency" following the "mdc" string.

6. Usage of new semantics in SDP

6.1. Usage with the SDP Offer/Answer Model

The backward compatibility in offer / answer is generally handled as specified in [RFC3388].

Depending on the implementation, a node that does not understand DDP grouping (either does not understand line grouping at all, or just does not understand the DDP semantics) SHOULD respond to an offer containing DDP grouping either (1) with an answer that ignores the grouping attribute or (2) with a refusal to the request (e.g., 488 Not acceptable here or 606 Not acceptable in SIP).

In the first case, the original sender of the offer MUST respond by offering a single media stream that represents an Operation Point. Note: in most cases, this will be the base layer of a layered Media Bitstream; equally possible are Operation Points containing a set of enhancement layers, as long as all are part of a single media stream. In the second case, if the sender of the offer still wishes to establish the session, it SHOULD re-try the request with an offer including only a single media stream.

6.2. Declarative usage

If an RTSP receiver understands signaling according to this memo, it SHALL set up all media streams that are required to decode the Operation Point of its choice.

If an RTSP receiver does not understand the signaling defined within this memo, it falls back to normal SDP processing. Two likely cases have to be distinguished: (1) If at least one of the media types included in the SDP is within the receiver's capabilities, it selects among those candidates according to implementation specific criteria for setup, as usual. (2) If none of the media types included in the SDP can be processed, then obviously no setup can occur.

6.3. Usage with Capability Negotiation

This memo does not cover the interaction with Capability Negotiation [I-D.ietf-mmusic-sdp-capability-negotiation]. This issue should be addressed in a different memo.

6.4. Examples

a.) Example for signaling layered decoding dependency:

   v=0
   o=svcsrv 289083124 289083124 IN IP4 host.example.com
   s=LAYERED VIDEO SIGNALING Seminar
   t=0 0
   c=IN IP4 192.0.2.1/127
   a=group:DDP 1 2 3 4
   m=video 40000 RTP/AVP 94 194
   b=AS:96
   a=framerate:15
   a=rtpmap:94 H264/90000
   a=rtpmap:194 H264/90000
   a=mid:1
   m=video 40002 RTP/AVP 95 195
   b=AS:64
   a=framerate:15
   a=rtpmap:95 H264-SVC/90000
   a=rtpmap:195 H264-SVC/90000
   a=mid:2
   a=depend:95 lay 1:94,194; 195 lay 1:194;
   m=video 40004 RTP/AVP 96 196
   b=AS:128
   a=framerate:30
   a=rtpmap:96 H264-SVC/90000
   a=rtpmap:196 H264-SVC/90000
   a=mid:3
   a=depend:96 lay 1:94,194; 196 lay 1:194;
   m=video 40004 RTP/SAVP 100 200
   c=IN IP4 192.0.2.2/127
   b=AS:512
   k=uri:conditional-access-server.example.com
   a=framerate:30
   a=rtpmap:100 H264-SVC/90000
   a=rtpmap:200 H264-SVC/90000
   a=mid:4
   a=depend:100 lay 1:94,194 3:96; 200 lay 1:194 3:196;

b.) Example for signaling of multi descriptive coding dependency:

   v=0
   o=mdcsrv 289083124 289083124 IN IP4 host.example.com
   s=MULTI DESCRIPTION VIDEO SIGNALING Seminar
   t=0 0
   c=IN IP4 192.0.2.1/127
   a=group:DDP 1 2 3
   m=video 40000 RTP/AVP 94
   a=mid:1
   a=depend:94 mdc 2:95 3:96;
   m=video 40002 RTP/AVP 95
   a=mid:2
   a=depend:95 mdc 1:94 3:96;
   m=video 40004 RTP/AVP 96
   c=IN IP4 192.0.2.2/127
   a=mid:3
   a=depend:96 mdc 1:94 2:95;

7. Security Considerations

All security implications of SDP apply.

There may be a risk of manipulation of the dependency signaling of a session description by an attacker. This may mislead a receiver or middle box, e.g. a receiver may try to compose a bitstream that does not form an Operation Point, although the signaling made it believe it would form a valid Operation Point, with potentially fatal consequences for the media decoding process. The receiver SHOULD perform an integrity check on SDP and follow the security considerations of SDP to only trust SDP from trusted sources.

8. IANA Considerations

This document defines the "DDP" semantics to be used with grouping of media lines in SDP as defined in RFC 3388. The "DDP" semantics defined in this memo are to be registered by the IANA when they are published in a Standards Track RFC.

The attribute "depend" is to be registered by IANA as a new media-level attribute. The purpose of this attribute is to express a dependency, which may exist between "m"-lines of a media session.

9. Open Issues

- Requirement on media stream: With the new draft, different media streams can be present in a DDP group, that is, different codecs may be used within the same DDP group?

- IANA registration for 'lay' and 'mdc'?

10. References

10.1. Normative References

[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session Description Protocol", RFC 4566, July 2006.

[RFC3388] Camarillo, G., Holler, J., and H. Schulzrinne, "Grouping of Media Lines in the Session Description Protocol (SDP)", RFC 3388, December 2002.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003.

[RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", RFC 5234, January 2008.

[I-D.ietf-mmusic-sdp-capability-negotiation] Andreasen, F., "SDP Capability Negotiation", draft-ietf-mmusic-sdp-capability-negotiation-08 (work in progress), December 2008.

10.2. Informative References

[I-D.ietf-avt-rtp-svc] Wenger, S., Wang, Y.-K., and T. Schierl, "RTP Payload Format for SVC Video", draft-ietf-avt-rtp-svc-07 (work in progress), February 2008.

[MDC] Vitali, A., Borneo, A., Fumagalli, M., and R. Rinaldo, "Video over IP using Standard-Compatible Multiple Description Coding: an IETF proposal", Packet Video Workshop, April 2006, Hangzhou, China.

[I-D.wang-avt-rtp-mvc] Wang, Y.-K. and T. Schierl, "RTP Payload Format for MVC Video", draft-wang-avt-rtp-mvc-00 (work in progress), November 2007.

Appendix A. Changes From Earlier Versions

A.1 Changes from individual submission

19Dec06 / TS:
   removed SSRC multiplexing and with that various information about RTP
   draft title correction
   corrected SDP reference
   editorial modifications throughout the document
   added Stephan Wenger to the list of authors
   removed section "network elements not supporting dependency signaling"

20-28Dec06 / TS, StW: Editorial improvements

3Mar07 / TS: adjustment for new I-D style, added Offer/Answer text, corrected ABNF reference, added Security and IANA considerations, added section Usage with existing entities not supporting new signaling, added text for Declarative usage section, added Open issues section.

21-Jun07: Numerous editorial changes and reworked section 6.

11-Nov07: Added Payload Type of media stream in question to dependency signaling. Note on usage with Cap. Negotiation. Added multi view coding (MVC) dependency as part of 'lay'-dependency. Added ref. to MVC activity at ITU-T/MPEG.

A.2 Changes from draft-ietf-mmusic-decoding-dependency-00 to draft-ietf-mmusic-decoding-dependency-01:

21-Feb08: Enhanced mechanism by multiple "payload-type-dependencies" for the same "mid". Typically the case, when using different packetization modes as defined in RFC3984.

25-Feb08: Modification throughout informative part of definition section. Different codecs may be present within the same DDP group.

Authors' Addresses

Thomas Schierl
Fraunhofer HHI
Einsteinufer 37
D-10587 Berlin
Germany

Phone: +49-30-31002-227
Email: schierl@hhi.fhg.de

Stephan Wenger
Nokia
955 Page Mill Road
Palo Alto, CA, 94304
USA

Phone: +1-650-862-7368
Email: stewe@stewe.org

Full Copyright Statement

Copyright (C) The IETF Trust (2008).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights.
Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

Acknowledgements

Funding for the RFC Editor function is currently provided by the Internet Society.