Network Working Group                                          M. Zanaty
Internet-Draft                                             S. Nandakumar
Intended status: Informational                                     Cisco
Expires: 14 September 2023                                   P. Thatcher
                                                               Microsoft
                                                           13 March 2023

                      Low Overhead Media Container
                        draft-mzanaty-moq-loc-00

Abstract

   This specification describes a media container format for encoded
   and encrypted audio and video media data to be used for interactive
   media use cases, with the goal of being a low-overhead format.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 14 September 2023.

Copyright Notice

   Copyright (c) 2023 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Revised BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Revised BSD License.

Table of Contents

   1. Introduction
      1.1. Requirements Notation and Conventions
      1.2. Terminology
   2. Payload Format
   3. Payload Header Data
      3.1. Common Header Data
      3.2. Video Header Data
      3.3. Audio Header Data
   4. Header Data Registration
   5. Payload Encryption
   6. Container Serialization
   7. MOQ Transport Mapping
   8. Security Considerations
   9. IANA Considerations
   10. Normative References
   Appendix A. Acknowledgements
   Authors' Addresses

1. Introduction

   This specification describes a low-overhead media container format
   for encoded and encrypted audio and video media data.

   "Low-overhead" refers to minimal extra encapsulation as well as
   minimal application overhead when interfacing with WebCodecs.

   The container format description is specified for all audio and
   video codecs defined in the WebCodecs Codec Registry. The audio and
   video payload bitstream is identical to the internal data inside an
   EncodedAudioChunk and EncodedVideoChunk, respectively, specified in
   the registry.

   In addition to the media payloads, critical metadata is also
   specified for audio and video payloads.

   A primary motivation is to align with media formats used in
   WebCodecs to minimize application overhead when interfacing with
   WebCodecs. Other container formats like CMAF or RTP would require
   more extensive application overhead in format conversions, as well
   as larger encapsulation overhead, which may burden some use cases
   like low bitrate audio scenarios.

1.1. Requirements Notation and Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   [RFC2119].

1.2. Terminology

   TODO

2. Payload Format

   The WebCodecs Codec Registry defines the contents of an
   EncodedAudioChunk and EncodedVideoChunk for the audio and video
   codec formats in the registry. The "internal data" in these chunks
   is used directly in this specification as the payload bitstream.

3. Payload Header Data

   This section specifies the metadata that needs to be carried as
   payload metadata.
   Payload header data provides the information that intermediaries
   need to make switching decisions when the payload is inaccessible
   due to encryption.

   Section 4 provides a framework for registering new payload header
   fields that are not defined by this specification.

3.1. Common Header Data

   The following metadata MUST be captured for each media frame:

   Sequence Number: A variable-length integer that increases
      sequentially and is incremented per encoded media frame.

   Capture Timestamp in Microseconds: Captures the wall-clock time of
      the encoded media frame.

3.2. Video Header Data

   Flags for frames that are independent, discardable, or base layer
   sync points, as well as temporal and spatial layer identification.
   See [I-D.ietf-avtext-framemarking].

3.3. Audio Header Data

   Audio Level: Captures the magnitude of the audio level of the
      corresponding audio frame. The value is encoded in 7 bits as
      defined in Section 3 of [RFC6464].

4. Header Data Registration

   This section details the procedures to register header data fields
   that might be useful for a particular class of media applications.

   Registering a given metadata field requires the following attributes
   to be specified:

   Shortname: Short name for the metadata. (Not sent on the wire.)

   Description: Detailed description for the metadata. (Not sent on the
      wire.)

   ID: Identifier assigned by the registry. (varint)

   Length: Length of the metadata value in bytes. (varint)

   Value: Value of the metadata. (Length bytes)

   New header data values are registered under the "Specification
   Required" policy.

5. Payload Encryption

   When end-to-end encryption is supported, the encoded payload is
   encrypted with keys from symmetric keying mechanisms, such as MLS,
   and the payload itself is protected using SFrame or equivalent.

6. Container Serialization

   The wire encoding of the payload conforming to this specification is
   a set of length-delimited values as shown below.
   Each Bytes value is the output of the AEAD operation that encrypts
   the Payload, with the header data supplied as the additional data
   input.

   +---------+------------+---------+------------+
   | Payload |   Bytes    | Payload |   Bytes    |
   | Len (0) |    (0)     | Len (1) |    (1)     | ...
   +---------+------------+---------+------------+

7. MOQ Transport Mapping

   TODO

8. Security Considerations

   TODO

9. IANA Considerations

   TODO on specification required for metadata registration.

10. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [I-D.ietf-avtext-framemarking]
              Zanaty, M., Berger, E., and S. Nandakumar, "Frame Marking
              RTP Header Extension", Work in Progress, Internet-Draft,
              draft-ietf-avtext-framemarking-13, 11 November 2021.

   [RFC6464]  Lennox, J., Ed., Ivov, E., and E. Marocco, "A Real-time
              Transport Protocol (RTP) Header Extension for
              Client-to-Mixer Audio Level Indication", RFC 6464,
              DOI 10.17487/RFC6464, December 2011.

Appendix A. Acknowledgements

   Thanks to Cullen Jennings for suggestions and review.

Authors' Addresses

   Mo Zanaty
   Cisco
   Email: mzanaty@cisco.com

   Suhas Nandakumar
   Cisco
   Email: snandaku@cisco.com

   Peter Thatcher
   Microsoft
   Email: pthatcher@microsoft.com
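
   The following non-normative sketch illustrates the ID/Length/Value
   layout of Section 4 and the length-delimited serialization of
   Section 6. It rests on several assumptions that this document does
   not state: that "varint" means the QUIC variable-length integer
   encoding of RFC 9000, that Payload Len is itself a varint, and that
   the Bytes values have already been produced by the AEAD operation.
   All function names and the field ID in the example are hypothetical;
   no IDs are assigned by this document.

```python
from typing import List, Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a QUIC-style varint (assumed format)."""
    if value < 0:
        raise ValueError("varint must be non-negative")
    # The two most significant bits of the first byte select the total size.
    for prefix, size in enumerate((1, 2, 4, 8)):
        if value < 1 << (8 * size - 2):
            return (value | (prefix << (8 * size - 2))).to_bytes(size, "big")
    raise ValueError("value does not fit in a 62-bit varint")


def decode_varint(buf: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one varint at buf[offset]; return (value, next_offset)."""
    if offset >= len(buf):
        raise ValueError("truncated varint")
    size = 1 << (buf[offset] >> 6)  # 1, 2, 4, or 8 bytes
    end = offset + size
    if end > len(buf):
        raise ValueError("truncated varint")
    raw = int.from_bytes(buf[offset:end], "big")
    return raw & ((1 << (8 * size - 2)) - 1), end  # clear the size bits


def encode_header_field(field_id: int, value: bytes) -> bytes:
    """Serialize one header data field as ID, Length, Value (Section 4)."""
    return encode_varint(field_id) + encode_varint(len(value)) + value


def serialize_container(encrypted_payloads: List[bytes]) -> bytes:
    """Concatenate (Payload Len, Bytes) pairs as pictured in Section 6."""
    out = bytearray()
    for data in encrypted_payloads:
        out += encode_varint(len(data))  # assumption: Payload Len is a varint
        out += data                      # opaque AEAD output, not inspected
    return bytes(out)


def parse_container(wire: bytes) -> List[bytes]:
    """Split a serialized container back into its Bytes values."""
    payloads, offset = [], 0
    while offset < len(wire):
        length, offset = decode_varint(wire, offset)
        if offset + length > len(wire):
            raise ValueError("truncated payload")
        payloads.append(wire[offset:offset + length])
        offset += length
    return payloads


if __name__ == "__main__":
    HYPOTHETICAL_AUDIO_LEVEL_ID = 1  # illustration only, not a registered ID
    field = encode_header_field(HYPOTHETICAL_AUDIO_LEVEL_ID, bytes([0x2A]))
    wire = serialize_container([b"\x01\x02\x03", b"\xaa" * 70])
    print(field.hex(), len(wire), parse_container(wire)[0].hex())
```

   A parser built this way can skip any Bytes value without decrypting
   it, because the Payload Len that precedes each value is enough to
   find the next one.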