idnits 2.17.1

draft-ietf-cellar-codec-09.txt:
-(2076): Line appears to be too long, but this could be caused by non-ascii characters in UTF-8 encoding
-(2108): Line appears to be too long, but this could be caused by non-ascii characters in UTF-8 encoding
-(2387): Line appears to be too long, but this could be caused by non-ascii characters in UTF-8 encoding

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == There are 6 instances of lines with non-ascii characters in the document.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (1 May 2022) is 719 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'Events' is mentioned on line 1791, but not defined

  -- Possible downref: Non-RFC (?) normative reference: ref.
     'DolbyVisionWithinIso'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'IEEE.1857-10'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'IEEE.1857-4'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'IEEE.754'

  ** Downref: Normative reference to an Informational RFC: RFC 6386

  -- Possible downref: Non-RFC (?) normative reference: ref. 'ST12'

     Summary: 1 error (**), 0 flaws (~~), 3 warnings (==), 7 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

cellar                                                         S. Lhomme
Internet-Draft
Intended status: Standards Track                               M. Bunkus
Expires: 2 November 2022
                                                                 D. Rice
                                                              1 May 2022

             Matroska Media Container Codec Specifications
                       draft-ietf-cellar-codec-09

Abstract

   This document defines the Matroska codec mappings, including the
   codec ID, layout of data in a Block Element and in an optional
   CodecPrivate Element.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 2 November 2022.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Revised
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Revised BSD License.

Table of Contents

   1.  Introduction
   2.  Status of this document
   3.  Notation and Conventions
   4.  Codec Mappings
     4.1.  Defining Matroska Codec Support
       4.1.1.  Codec ID
       4.1.2.  Codec Name
       4.1.3.  Description
       4.1.4.  Initialization
       4.1.5.  Codec BlockAdditions
       4.1.6.  Citation
       4.1.7.  Deprecation Date
       4.1.8.  Superseded By
     4.2.  Recommendations for the Creation of New Codec Mappings
     4.3.  Video Codec Mappings
       4.3.1.  V_MS/VFW/FOURCC
       4.3.2.  V_UNCOMPRESSED
       4.3.3.  V_MPEG4/ISO/SP
       4.3.4.  V_MPEG4/ISO/ASP
       4.3.5.  V_MPEG4/ISO/AP
       4.3.6.  V_MPEG4/MS/V3
       4.3.7.  V_MPEG1
       4.3.8.  V_MPEG2
       4.3.9.  V_MPEG4/ISO/AVC
       4.3.10. V_MPEGH/ISO/HEVC
       4.3.11. V_AVS2
       4.3.12. V_AVS3
       4.3.13. V_REAL/RV10
       4.3.14. V_REAL/RV20
       4.3.15. V_REAL/RV30
       4.3.16. V_REAL/RV40
       4.3.17. V_QUICKTIME
       4.3.18. V_THEORA
       4.3.19. V_PRORES
       4.3.20. V_VP8
       4.3.21. V_VP9
       4.3.22. V_FFV1
     4.4.  Audio Codec Mappings
       4.4.1.  A_MPEG/L3
       4.4.2.  A_MPEG/L2
       4.4.3.  A_MPEG/L1
       4.4.4.  A_PCM/INT/BIG
       4.4.5.  A_PCM/INT/LIT
       4.4.6.  A_PCM/FLOAT/IEEE
       4.4.7.  A_MPC
       4.4.8.  A_AC3
       4.4.9.  A_AC3/BSID9
       4.4.10. A_AC3/BSID10
       4.4.11. A_ALAC
       4.4.12. A_DTS
       4.4.13. A_DTS/EXPRESS
       4.4.14. A_DTS/LOSSLESS
       4.4.15. A_VORBIS
       4.4.16. A_FLAC
       4.4.17. A_REAL/14_4
       4.4.18. A_REAL/28_8
       4.4.19. A_REAL/COOK
       4.4.20. A_REAL/SIPR
       4.4.21. A_REAL/RALF
       4.4.22. A_REAL/ATRC
       4.4.23. A_MS/ACM
       4.4.24. A_AAC/MPEG2/MAIN
       4.4.25. A_AAC/MPEG2/LC
       4.4.26. A_AAC/MPEG2/LC/SBR
       4.4.27. A_AAC/MPEG2/SSR
       4.4.28. A_AAC/MPEG4/MAIN
       4.4.29. A_AAC/MPEG4/LC
       4.4.30. A_AAC/MPEG4/LC/SBR
       4.4.31. A_AAC/MPEG4/SSR
       4.4.32. A_AAC/MPEG4/LTP
       4.4.33. A_QUICKTIME
       4.4.34. A_QUICKTIME/QDMC
       4.4.35. A_QUICKTIME/QDM2
       4.4.36. A_TTA1
       4.4.37. A_WAVPACK4
     4.5.  Subtitle Codec Mappings
       4.5.1.  S_TEXT/UTF8
       4.5.2.  S_TEXT/SSA
       4.5.3.  S_TEXT/ASS
       4.5.4.  S_TEXT/WEBVTT
       4.5.5.  S_IMAGE/BMP
       4.5.6.  S_DVBSUB
       4.5.7.  S_VOBSUB
       4.5.8.  S_HDMV/PGS
       4.5.9.  S_HDMV/TEXTST
       4.5.10. S_KATE
     4.6.  Button Codec Mappings
       4.6.1.  B_VOBBTN
     4.7.  Block Addition Mappings
       4.7.1.  Use BlockAddIDValue
       4.7.2.  Opaque data
       4.7.3.  ITU T.35 metadata
       4.7.4.  avcE
       4.7.5.  dvcC
       4.7.6.  dvvC
       4.7.7.  hvcE
       4.7.8.  mvcC
   5.  Subtitles
     5.1.  Images Subtitles
     5.2.  SRT Subtitles
     5.3.  SSA/ASS Subtitles
     5.4.  WebVTT
       5.4.1.  Storage of WebVTT in Matroska
         5.4.1.1.  CodecID: codec identification
         5.4.1.2.  CodecPrivate: storage of global WebVTT blocks
         5.4.1.3.  Storage of non-global WebVTT blocks
         5.4.1.4.  Storage of Cues in Matroska blocks
         5.4.1.5.  BlockAdditions: storing non-global WebVTT blocks,
                   Cue Settings Lists and Cue identifiers
       5.4.2.  Examples of transformation
         5.4.2.1.  Example WebVTT file
         5.4.2.2.  Example of CodecPrivate
         5.4.2.3.  Storage of Cue 1
         5.4.2.4.  Storage of Cue 2
         5.4.2.5.  Storage of Cue 3
         5.4.2.6.  Storage of Cue 4
       5.4.3.  Storage of WebVTT in Matroska vs. WebM
     5.5.  HDMV presentation graphics subtitles
       5.5.1.  Storage of HDMV presentation graphics subtitles
         5.5.1.1.  Storage of HDMV PGS Segments in Matroska Blocks
     5.6.  HDMV text subtitles
       5.6.1.  Storage of HDMV text subtitles
         5.6.1.1.  Storage of HDMV TextST Dialog Presentation Segments
                   in Matroska Blocks
         5.6.1.2.  Character set
     5.7.  Digital Video Broadcasting (DVB) subtitles
       5.7.1.  Storage of DVB subtitles
         5.7.1.1.  CodecID
         5.7.1.2.  CodecPrivate
         5.7.1.3.  Storage of DVB subtitles in Matroska Blocks
   6.  Block Additional Mapping
     6.1.  Summary of Assigned BlockAddIDType Values
     6.2.  SMPTE ST 12-1 Timecode
       6.2.1.  Timecode Description
       6.2.2.  BlockAddIDType
       6.2.3.  BlockAddIDName
       6.2.4.  BlockAddIDExtraData
   7.  Security Considerations
   8.  IANA Considerations
   9.  Normative References
   10. Informative References
   Authors' Addresses

1.  Introduction

   Matroska aims to become THE standard of multimedia container formats.
   It stores interleaved and timestamped audio/video/subtitle data using
   various codecs.  To interpret the codec data, a mapping between the
   way the data is stored in Matroska and how it is understood by such a
   codec is necessary.

   This document intends to define this mapping for many commonly used
   codecs in Matroska.

2.  Status of this document

   This document is a work-in-progress specification defining the
   Matroska file format as part of the IETF Cellar working group
   (https://datatracker.ietf.org/wg/cellar/charter/).  It uses basic
   elements and concepts already defined in the Matroska specifications
   produced by this working group.

3.  Notation and Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

4.  Codec Mappings

   A Codec Mapping is a set of attributes to identify, name, and
   contextualize the format and characteristics of encoded data that can
   be contained within Matroska Clusters.

   Each TrackEntry used within Matroska MUST reference a defined Codec
   Mapping using the Codec ID to identify and describe the format of the
   encoded data in its associated Clusters.  This Codec ID is a unique
   registered identifier that represents the encoding stored within the
   Track.  Certain encodings MAY also require some form of codec
   initialization in order to provide their decoders with context and
   technical metadata.

   The intention behind this list is not to list all existing audio and
   video codecs, but rather to list those codecs that are currently
   supported in Matroska and therefore need a well-defined Codec ID so
   that all developers supporting Matroska will use the same Codec ID.
   If you feel we missed support for a very important codec, please tell
   us on our development mailing list (cellar at ietf.org).

4.1.  Defining Matroska Codec Support

   Support for a codec is defined in Matroska with the following values.

4.1.1.  Codec ID

   Each codec supported for storage in Matroska MUST have a unique Codec
   ID.  Each Codec ID MUST be prefixed with the string from the
   following table according to the associated type of the codec.  All
   characters of a Codec ID Prefix MUST be capital letters (A-Z) except
   for the last character of a Codec ID Prefix which MUST be an
   underscore ("_").

   +============+=================+
   | Codec Type | Codec ID Prefix |
   +============+=================+
   | Video      | "V_"            |
   +------------+-----------------+
   | Audio      | "A_"            |
   +------------+-----------------+
   | Subtitle   | "S_"            |
   +------------+-----------------+
   | Button     | "B_"            |
   +------------+-----------------+

                Table 1

   Each Codec ID MUST include a Major Codec ID immediately following the
   Codec ID Prefix.  A Major Codec ID MAY be followed by an OPTIONAL
   Codec ID Suffix to communicate a refinement of the Major Codec ID.
   If a Codec ID Suffix is used, then the Codec ID MUST include a
   forward slash ("/") as a separator between the Major Codec ID and the
   Codec ID Suffix.  The Major Codec ID MUST be composed of only capital
   letters (A-Z) and numbers (0-9).  The Codec ID Suffix MUST be
   composed of only capital letters (A-Z), numbers (0-9), underscore
   ("_"), and forward slash ("/").
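
   The Codec ID syntax rules above can be sketched as a small parser.
   The following Python is a minimal illustration and not part of this
   specification: the names CODEC_TYPES, CodecID, and parse_codec_id are
   hypothetical, and the sketch checks only the character rules of this
   section, not whether a Codec ID corresponds to a defined Codec
   Mapping.

```python
import re
from typing import NamedTuple, Optional

# Codec ID Prefixes and their codec types, taken from Table 1.
CODEC_TYPES = {"V_": "Video", "A_": "Audio", "S_": "Subtitle", "B_": "Button"}

# Prefix, then a Major Codec ID (A-Z, 0-9), then optionally a "/" separator
# followed by a Codec ID Suffix (A-Z, 0-9, "_", "/").
_CODEC_ID_RE = re.compile(r"([VASB]_)([A-Z0-9]+)(?:/([A-Z0-9_/]+))?")


class CodecID(NamedTuple):
    """Components of a Codec ID (hypothetical helper type)."""
    codec_type: str
    prefix: str
    major: str
    suffix: Optional[str]  # None when no Codec ID Suffix is present


def parse_codec_id(codec_id: str) -> CodecID:
    """Split a Codec ID into its components, or raise ValueError."""
    match = _CODEC_ID_RE.fullmatch(codec_id)
    if match is None:
        raise ValueError(f"not a well-formed Codec ID: {codec_id!r}")
    prefix, major, suffix = match.groups()
    return CodecID(CODEC_TYPES[prefix], prefix, major, suffix)
```

   Under these assumptions, parse_codec_id("A_AAC/MPEG2/LC/SBR") splits
   the string into the prefix "A_", the Major Codec ID "AAC", and the
   suffix "MPEG2/LC/SBR", while strings such as "X_FOO" or "v_mpeg1" are
   rejected with a ValueError.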

   The following table provides examples of valid Codec IDs and their
   components:

   +==========+==========+===========+==============+====================+
   | Codec ID | Major    | Separator | Codec ID     | Codec ID           |
   | Prefix   | Codec ID |           | Suffix       |                    |
   +==========+==========+===========+==============+====================+
   | A_       | AAC      | /         | MPEG2/LC/SBR | A_AAC/MPEG2/LC/SBR |
   +----------+----------+-----------+--------------+--------------------+
   | V_       | MPEG4    | /         | ISO/ASP      | V_MPEG4/ISO/ASP    |
   +----------+----------+-----------+--------------+--------------------+
   | V_       | MPEG1    |           |              | V_MPEG1            |
   +----------+----------+-----------+--------------+--------------------+

                                 Table 2

4.1.2.  Codec Name

   Each encoding supported for storage in Matroska MUST have a Codec
   Name.  The Codec Name provides a readable label for the encoding.

4.1.3.  Description

   An optional description for the encoding.  This value is only
   intended for human consumption.

4.1.4.  Initialization

   Each encoding supported for storage in Matroska MUST have a defined
   Initialization.  The Initialization MUST describe the storage of data
   necessary to initialize the decoder, which MUST be stored within the
   CodecPrivate Element.  When the Initialization is updated within a
   track, then that updated Initialization data MUST be written into the
   CodecState Element of the first Cluster to require it.  If the
   encoding does not require any form of Initialization, then "none"
   MUST be used to define the Initialization, and the CodecPrivate
   Element SHOULD NOT be written and MUST be ignored.  Data that the
   Initialization defines to be stored in the CodecPrivate Element is
   known as Private Data.

4.1.5.  Codec BlockAdditions

   Additional data that contextualizes or supplements a Block can be
   stored within the BlockAdditional Element of a BlockMore Element.
   This BlockAdditional data MAY be passed to the associated decoder
   along with the content of the Block Element.  Each BlockAdditional is
   coupled with a BlockAddID that identifies the kind of data it
   contains.  The following table defines the meanings of BlockAddID
   values.

   +============+=====================================================+
   | BlockAddID | Definition                                          |
   | Value      |                                                     |
   +============+=====================================================+
   | 0          | Invalid.                                            |
   +------------+-----------------------------------------------------+
   | 1          | Indicates that the context of the BlockAdditional   |
   |            | data is defined by the corresponding Codec Mapping. |
   +------------+-----------------------------------------------------+
   | 2 or       | BlockAddID values of 2 and greater are mapped to    |
   | greater    | the BlockAddIDValue of the BlockAdditionMapping of  |
   |            | the associated Track.                               |
   +------------+-----------------------------------------------------+

                                  Table 3

   The values of BlockAddID that are 2 or greater have no semantic
   meaning, but simply associate the BlockMore Element with a
   BlockAdditionMapping of the associated Track.  See Section 6 on Block
   Additional Mappings for more information.

   The following XML depicts the nested Elements of a BlockGroup Element
   with an example of BlockAdditions:

   <BlockGroup>
     <Block>{Binary data of a VP9 video frame in YUV}</Block>
     <BlockAdditions>
       <BlockMore>
         <BlockAddID>1</BlockAddID>
         <BlockAdditional>
           {alpha channel encoding to supplement the VP9 frame}
         </BlockAdditional>
       </BlockMore>
     </BlockAdditions>
   </BlockGroup>

4.1.6.  Citation

   Documentation of the associated normative and informative references
   for the codec is RECOMMENDED.

4.1.7.  Deprecation Date

   A timestamp, expressed in the format of [RFC3339], that notes when
   support for the Codec Mapping within Matroska was deprecated.  If a
   Codec Mapping is defined with a Deprecation Date, then Matroska
   writers SHOULD NOT use the Codec Mapping after the Deprecation Date.

4.1.8.  Superseded By

   A Codec Mapping MAY be defined with a Superseded By value only if it
   has an expressed Deprecation Date.  If used, the Superseded By value
   MUST store the Codec ID of another Codec Mapping that has superseded
   the Codec Mapping.

4.2.  Recommendations for the Creation of New Codec Mappings

   Creators of new Codec Mappings to be used in the context of Matroska:

   *  SHOULD assume that all Codec Mappings they create might become
      standardized, public, commonly deployed, or usable across multiple
      implementations.

   *  SHOULD employ meaningful values for Codec ID and Codec Name that
      they have reason to believe are currently unused.

   *  SHOULD NOT prefix their Codec ID with "X_" or similar constructs.

   These recommendations are based upon Section 3 of [RFC6648].

4.3.  Video Codec Mappings

4.3.1.  V_MS/VFW/FOURCC

   Codec ID: V_MS/VFW/FOURCC

   Codec Name: Microsoft (TM) Video Codec Manager (VCM)

   Description: The Private Data contains the VCM structure
   BITMAPINFOHEADER, including the extra private bytes, as defined by
   Microsoft
   (https://msdn.microsoft.com/en-us/library/windows/desktop/dd318229(v=vs.85).aspx).
   The data are stored in little-endian format (as on IA32 machines).
   Open questions: where is the Huffman table of HuffYUV stored, if not
   in AVISTREAMINFO?  And where is the FourCC stored, if not in
   AVISTREAMINFO.fccHandler?

   Initialization: Private Data contains the VCM structure
   BITMAPINFOHEADER, including the extra private bytes, as defined by
   Microsoft in
   https://msdn.microsoft.com/en-us/library/windows/desktop/dd183376(v=vs.85).aspx.

   Citation:
   https://msdn.microsoft.com/en-us/library/windows/desktop/dd183376(v=vs.85).aspx

4.3.2.  V_UNCOMPRESSED

   Codec ID: V_UNCOMPRESSED

   Codec Name: Video, raw uncompressed video frames

   Description: All details about the color specification and bit depth
   in use are to be written to and read from the
   TrackEntry\Video\UncompressedFourCC elements.

   Initialization: none

4.3.3.  V_MPEG4/ISO/SP

   Codec ID: V_MPEG4/ISO/SP

   Codec Name: MPEG4 ISO simple profile (DivX4)

   Description: The stream was created via the improved codec API (UCI)
   or even transmuxed from AVI (there are no B-frames in Simple
   Profile); the frame order is the coding order.

   Initialization: none

4.3.4.  V_MPEG4/ISO/ASP

   Codec ID: V_MPEG4/ISO/ASP

   Codec Name: MPEG4 ISO advanced simple profile (DivX5, XviD, FFMPEG)

   Description: The stream was created via the improved codec API (UCI)
   or transmuxed from MP4, not simply transmuxed from AVI.  Note that
   B-frames are handled differently in these native streams than in a
   VfW-created stream: no dummy frames are inserted, and the frame order
   is exactly the same as the coding order, as in MP4 streams.

   Initialization: none

4.3.5.  V_MPEG4/ISO/AP

   Codec ID: V_MPEG4/ISO/AP

   Codec Name: MPEG4 ISO advanced profile

   Description: The stream was created via the improved codec API (UCI)
   or transmuxed from MP4, not simply transmuxed from AVI.  Note that
   B-frames are handled differently in these native streams than in a
   VfW-created stream: no dummy frames are inserted, and the frame order
   is exactly the same as the coding order, as in MP4 streams.

   Initialization: none

4.3.6.  V_MPEG4/MS/V3

   Codec ID: V_MPEG4/MS/V3

   Codec Name: Microsoft (TM) MPEG4 V3

   Description: Microsoft (TM) MPEG4 V3 and derivatives, such as DivX3,
   Angelpotion, SMR, etc.  The stream was created using a VfW codec or
   transmuxed from AVI.  Note that V1/V2 are covered in VfW
   compatibility mode.

   Initialization: none

4.3.7.  V_MPEG1

   Codec ID: V_MPEG1

   Codec Name: MPEG 1

   Description: The Matroska video stream will contain a demuxed
   Elementary Stream (ES), where block boundaries are still to be
   defined.  It is RECOMMENDED to use MPEG2MKV.exe for creating those
   files, and to compare the results with self-made implementations.

   Initialization: none

4.3.8.  V_MPEG2

   Codec ID: V_MPEG2

   Codec Name: MPEG 2

   Description: The Matroska video stream will contain a demuxed
   Elementary Stream (ES), where block boundaries are still to be
   defined.  It is RECOMMENDED to use MPEG2MKV.exe for creating those
   files, and to compare the results with self-made implementations.

   Initialization: none

4.3.9.  V_MPEG4/ISO/AVC

   Codec ID: V_MPEG4/ISO/AVC

   Codec Name: AVC/H.264

   Description: Individual pictures (which could be a frame, a field, or
   2 fields having the same timestamp) of AVC/H.264 stored as described
   in [ISO.14496-15].

   Initialization: The Private Data contains an
   AVCDecoderConfigurationRecord structure, as defined in
   [ISO.14496-15].  For legacy reasons, the
   AVCDecoderConfigurationRecord structure MAY be followed by an
   extension block, although Block Addition Mappings are preferred; see
   Section 4.7.  The extension block begins with a 4-byte extension
   block size field in big-endian byte order, whose value is the size of
   the extension block minus 4 (that is, excluding the size of the
   extension block size field itself).  It is followed by a 4-byte field
   corresponding to a BlockAddIDType of "mvcC" and then by content
   corresponding to the content of BlockAddIDExtraData for mvcC; see
   Section 4.7.8.

4.3.10.  V_MPEGH/ISO/HEVC

   Codec ID: V_MPEGH/ISO/HEVC

   Codec Name: HEVC/H.265

   Description: Individual pictures (which could be a frame, a field, or
   2 fields having the same timestamp) of HEVC/H.265 stored as described
   in [ISO.14496-15].

   Initialization: The Private Data contains an
   HEVCDecoderConfigurationRecord structure, as defined in
   [ISO.14496-15].

4.3.11.  V_AVS2

   Codec ID: V_AVS2

   Codec Name: AVS2-P2/IEEE.1857.4

   Description: Individual pictures of AVS2-P2 stored as described in
   the second part of [IEEE.1857-4].

   Initialization: none.

4.3.12.  V_AVS3

   Codec ID: V_AVS3

   Codec Name: AVS3-P2/IEEE.1857.10

   Description: Individual pictures of AVS3-P2 stored as described in
   the second part of [IEEE.1857-10].

   Initialization: none.

4.3.13.  V_REAL/RV10

   Codec ID: V_REAL/RV10

   Codec Name: RealVideo 1.0 aka RealVideo 5

   Description: Individual slices from the Real container are combined
   into a single frame.

   Initialization: The Private Data contains a real_video_props_t
   structure in big-endian byte order as found in librmff
   (https://github.com/mbunkus/mkvtoolnix/blob/master/lib/librmff/librmff.h).

4.3.14.  V_REAL/RV20

   Codec ID: V_REAL/RV20

   Codec Name: RealVideo G2 and RealVideo G2+SVT

   Description: Individual slices from the Real container are combined
   into a single frame.

   Initialization: The Private Data contains a real_video_props_t
   structure in big-endian byte order as found in librmff
   (https://github.com/mbunkus/mkvtoolnix/blob/master/lib/librmff/librmff.h).

4.3.15.  V_REAL/RV30

   Codec ID: V_REAL/RV30

   Codec Name: RealVideo 8

   Description: Individual slices from the Real container are combined
   into a single frame.

   Initialization: The Private Data contains a real_video_props_t
   structure in big-endian byte order as found in librmff
   (https://github.com/mbunkus/mkvtoolnix/blob/master/lib/librmff/librmff.h).

4.3.16.  V_REAL/RV40

   Codec ID: V_REAL/RV40

   Codec Name: rv40 : RealVideo 9

   Description: Individual slices from the Real container are combined
   into a single frame.

   Initialization: The Private Data contains a real_video_props_t
   structure in big-endian byte order as found in librmff
   (https://github.com/mbunkus/mkvtoolnix/blob/master/lib/librmff/librmff.h).

4.3.17.  V_QUICKTIME

   Codec ID: V_QUICKTIME

   Codec Name: Video taken from QuickTime(TM) files

   Description: Several codecs as stored in QuickTime, e.g., Sorenson or
   Cinepak.

   Initialization: The Private Data contains all additional data that is
   stored in the 'stsd' (sample description) atom in the QuickTime file
   *after* the mandatory video descriptor structure (starting with the
   size and FourCC fields).  For an explanation of the QuickTime file
   format, read the QuickTime File Format Specification
   (https://developer.apple.com/library/mac/documentation/QuickTime/QTFF/QTFFPreface/qtffPreface.html).

4.3.18.  V_THEORA

   Codec ID: V_THEORA

   Codec Name: Theora

   Initialization: The Private Data contains the first three Theora
   packets in order.  The packets are preceded by their lengths.  The
   actual layout is:

   *  Byte 1: the number of distinct packets inside the CodecPrivate
      block minus one, denoted #p.  This MUST be "2" for current (as of
      2016-07-08) Theora headers.

   *  Bytes 2..n: lengths of the first #p packets, coded in Xiph-style
      lacing.  The length of the last packet is the length of the
      CodecPrivate block minus the lengths coded in these bytes, minus
      the number of bytes used to code those lengths, minus one (for
      byte 1).

   *  Bytes n+1..: The Theora identification header, followed by the
      comment header, followed by the codec setup header.  Those are
      described in the Theora specification
      (http://www.theora.org/doc/Theora.pdf).

4.3.19.  V_PRORES

   Codec ID: V_PRORES

   Codec Name: Apple ProRes

   Initialization: The Private Data contains the FourCC as found in MP4
   movies:

   *  ap4x: ProRes 4444 XQ

   *  ap4h: ProRes 4444

   *  apch: ProRes 422 High Quality

   *  apcn: ProRes 422 Standard Definition

   *  apcs: ProRes 422 LT

   *  apco: ProRes 422 Proxy

   *  aprh: ProRes RAW High Quality

   *  aprn: ProRes RAW Standard Definition

   See this page for more technical details on ProRes:
   http://wiki.multimedia.cx/index.php?title=Apple_ProRes#Frame_layout

4.3.20.  V_VP8

   Codec ID: V_VP8

   Codec Name: VP8 Codec format

   Description: VP8 is an open and royalty-free video compression format
   developed by Google and created by On2 Technologies as a successor to
   VP7; see [RFC6386].

   Codec BlockAdditions: A single-channel encoding of an alpha channel
   MAY be stored in BlockAdditions.  The BlockAddID of the BlockMore
   containing these data MUST be 1.

   Initialization: none

4.3.21.  V_VP9

   Codec ID: V_VP9

   Codec Name: VP9 Codec format

   Description: VP9 is an open and royalty-free video compression format
   developed by Google as a successor to VP8.  See the draft VP9
   Bitstream and Decoding Process Specification
   (https://www.webmproject.org/vp9/).

   Codec BlockAdditions: A single-channel encoding of an alpha channel
   MAY be stored in BlockAdditions.  The BlockAddID of the BlockMore
   containing these data MUST be 1.

   Initialization: none

4.3.22.  V_FFV1

   Codec ID: V_FFV1

   Codec Name: FF Video Codec 1

   Description: FFV1 is a lossless intra-frame video encoding format
   designed to efficiently compress video data in a variety of pixel
   formats.  Compared to uncompressed video, FFV1 offers storage
   compression, frame fixity, and self-description, which makes FFV1
   useful as a preservation or intermediate video format.
Draft FFV1 743 Specification (https://datatracker.ietf.org/doc/draft-ietf-cellar- 744 ffv1/) 746 Initialization: For FFV1 versions 0 or 1, Private Data SHOULD NOT be 747 written. For FFV1 version 3 or greater, the Private Data MUST 748 contain the FFV1 Configuration Record structure, as defined in 749 https://tools.ietf.org/html/draft-ietf-cellar-ffv1-04#section-4.2 750 (https://tools.ietf.org/html/draft-ietf-cellar-ffv1-04#section-4.2), 751 and no other data. 753 4.4. Audio Codec Mappings 755 4.4.1. A_MPEG/L3 757 Codec ID: A_MPEG/L3 759 Codec Name: MPEG Audio 1, 2, 2.5 Layer III 761 Description: The data contain everything needed for playback in the 762 MPEG Audio header of each frame. Corresponding ACM wFormatTag : 763 0x0055 765 Initialization: none 767 4.4.2. A_MPEG/L2 769 Codec ID: A_MPEG/L2 771 Codec Name: MPEG Audio 1, 2 Layer II 773 Description: The data contain everything needed for playback in the 774 MPEG Audio header of each frame. Corresponding ACM wFormatTag : 775 0x0050 777 Initialization: none 779 4.4.3. A_MPEG/L1 781 Codec ID: A_MPEG/L1 783 Codec Name: MPEG Audio 1, 2 Layer I 785 Description: The data contain everything needed for playback in the 786 MPEG Audio header of each frame. Corresponding ACM wFormatTag : 787 0x0050 789 Initialization: none 791 4.4.4. A_PCM/INT/BIG 793 Codec ID: A_PCM/INT/BIG 795 Codec Name: PCM Integer Big Endian 797 Description: The audio bit depth MUST be read and set from the 798 BitDepth Element. Audio samples MUST be considered as signed values, 799 except if the audio bit depth is 8 which MUST be interpreted as 800 unsigned values. Corresponding ACM wFormatTag : ??? 801 Initialization: none 803 4.4.5. A_PCM/INT/LIT 805 Codec ID: A_PCM/INT/LIT 807 Codec Name: PCM Integer Little Endian 809 Description: The audio bit depth MUST be read and set from the 810 BitDepth Element. Audio samples MUST be considered as signed values, 811 except if the audio bit depth is 8 which MUST be interpreted as 812 unsigned values. 
Corresponding ACM wFormatTag : 0x0001 814 Initialization: none 816 4.4.6. A_PCM/FLOAT/IEEE 818 Codec ID: A_PCM/FLOAT/IEEE 820 Codec Name: Floating Point, IEEE compatible 822 Description: The audio bit depth MUST be read and set from the 823 BitDepth Element (32 bit in most cases). The floats are stored as 824 defined in [IEEE.754] and in little-endian order. Corresponding ACM 825 wFormatTag : 0x0003 827 Initialization: none 829 4.4.7. A_MPC 831 Codec ID: A_MPC 833 Codec Name: MPC (musepack) SV8 835 Description: The main developer for musepack has requested that we 836 wait until the SV8 framing has been fully defined for musepack before 837 defining how to store it in Matroska. 839 4.4.8. A_AC3 841 Codec ID: A_AC3 843 Codec Name: (Dolby™ (U+2122)) AC3 845 Description: BSID <= 8. The private data is void. Corresponding 846 ACM wFormatTag : 0x2000. The channel number has to be read from the 847 corresponding audio element. 849 4.4.9. A_AC3/BSID9 851 Codec ID: A_AC3/BSID9 853 Codec Name: (Dolby™ (U+2122)) AC3 855 Description: The AC3 frame header has, similar to the MPEG audio 856 header, a version field. Normal AC3 is defined as bitstream id 8 (5 857 bits, numbers are 0-15). Everything below 8 is still compatible with 858 all decoders that handle 8 correctly. Everything higher is an 859 addition that breaks decoder compatibility. For the sample rates 860 24 kHz (00), 22.05 kHz (01), and 16 kHz (10), the BSID is 9. For the 861 sample rates 12 kHz (00), 11.025 kHz (01), and 8 kHz (10), the BSID is 10. 863 Initialization: none 865 4.4.10. A_AC3/BSID10 867 Codec ID: A_AC3/BSID10 869 Codec Name: (Dolby™ (U+2122)) AC3 871 Description: The AC3 frame header has, similar to the MPEG audio 872 header, a version field. Normal AC3 is defined as bitstream id 8 (5 873 bits, numbers are 0-15). Everything below 8 is still compatible with 874 all decoders that handle 8 correctly. Everything higher is an 875 addition that breaks decoder compatibility.
For the sample rates 876 24 kHz (00), 22.05 kHz (01), and 16 kHz (10), the BSID is 9. For the 877 sample rates 12 kHz (00), 11.025 kHz (01), and 8 kHz (10), the BSID is 10. 879 Initialization: none 881 4.4.11. A_ALAC 883 Codec ID: A_ALAC 885 Codec Name: ALAC (Apple Lossless Audio Codec) 887 Initialization: The Private Data contains ALAC's magic cookie (both 888 the codec-specific configuration and the optional channel 889 layout information). Its format is described in ALAC's official 890 source code (http://alac.macosforge.org/trac/browser/trunk/ 891 ALACMagicCookieDescription.txt). 893 4.4.12. A_DTS 895 Codec ID: A_DTS 896 Codec Name: Digital Theatre System 898 Description: Supports DTS, DTS-ES, DTS-96/24, DTS-HD High Resolution 899 Audio and DTS-HD Master Audio. The private data is void. 900 Corresponding ACM wFormatTag : 0x2001 902 Initialization: none 904 4.4.13. A_DTS/EXPRESS 906 Codec ID: A_DTS/EXPRESS 908 Codec Name: Digital Theatre System Express 910 Description: DTS Express (a.k.a. LBR) audio streams. The private 911 data is void. Corresponding ACM wFormatTag : 0x2001 913 Initialization: none 915 4.4.14. A_DTS/LOSSLESS 917 Codec ID: A_DTS/LOSSLESS 919 Codec Name: Digital Theatre System Lossless 921 Description: DTS Lossless audio that does not have a core substream. 922 The private data is void. Corresponding ACM wFormatTag : 0x2001 924 Initialization: none 926 4.4.15. A_VORBIS 928 Codec ID: A_VORBIS 930 Codec Name: Vorbis 932 Initialization: The Private Data contains the first three Vorbis 933 packets in order. The lengths of the packets precede them. The 934 actual layout is: - Byte 1: number of distinct packets #p minus one 935 inside the CodecPrivate block. This MUST be "2" for current (as of 936 2016-07-08) Vorbis headers. - Bytes 2..n: lengths of the first #p 937 packets, coded in Xiph-style lacing. The length of the last packet 938 is the length of the CodecPrivate block minus the lengths coded in 939 these bytes minus one.
- Bytes n+1..: The Vorbis identification 940 header (https://xiph.org/vorbis/doc/Vorbis_I_spec.html), followed by 941 the Vorbis comment header (https://xiph.org/vorbis/doc/ 942 v-comment.html) followed by the codec setup header 943 (https://xiph.org/vorbis/doc/Vorbis_I_spec.html). 945 4.4.16. A_FLAC 947 Codec ID: A_FLAC 949 Codec Name: FLAC (Free Lossless Audio Codec) 950 (http://flac.sourceforge.net/) 952 Initialization: The Private Data contains all the header/metadata 953 packets before the first data packet. These include the first header 954 packet containing only the word fLaC as well as all metadata packets. 956 4.4.17. A_REAL/14_4 958 Codec ID: A_REAL/14_4 960 Codec Name: Real Audio 1 962 Initialization: The Private Data contains either the 963 "real_audio_v4_props_t" or the "real_audio_v5_props_t" structure 964 (differentiated by their "version" field; big-endian byte order) as 965 found in librmff 966 (https://github.com/mbunkus/mkvtoolnix/blob/master/lib/librmff/ 967 librmff.h). 969 4.4.18. A_REAL/28_8 971 Codec ID: A_REAL/28_8 973 Codec Name: Real Audio 2 975 Initialization: The Private Data contains either the 976 "real_audio_v4_props_t" or the "real_audio_v5_props_t" structure 977 (differentiated by their "version" field; big-endian byte order) as 978 found in librmff 979 (https://github.com/mbunkus/mkvtoolnix/blob/master/lib/librmff/ 980 librmff.h). 982 4.4.19. A_REAL/COOK 984 Codec ID: A_REAL/COOK 986 Codec Name: Real Audio Cook Codec (codename: Gecko) 987 Initialization: The Private Data contains either the 988 "real_audio_v4_props_t" or the "real_audio_v5_props_t" structure 989 (differentiated by their "version" field; big-endian byte order) as 990 found in librmff 991 (https://github.com/mbunkus/mkvtoolnix/blob/master/lib/librmff/ 992 librmff.h). 994 4.4.20. 
A_REAL/SIPR 996 Codec ID: A_REAL/SIPR 998 Codec Name: Sipro Voice Codec 1000 Initialization: The Private Data contains either the 1001 "real_audio_v4_props_t" or the "real_audio_v5_props_t" structure 1002 (differentiated by their "version" field; big-endian byte order) as 1003 found in librmff 1004 (https://github.com/mbunkus/mkvtoolnix/blob/master/lib/librmff/ 1005 librmff.h). 1007 4.4.21. A_REAL/RALF 1009 Codec ID: A_REAL/RALF 1011 Codec Name: Real Audio Lossless Format 1013 Initialization: The Private Data contains either the 1014 "real_audio_v4_props_t" or the "real_audio_v5_props_t" structure 1015 (differentiated by their "version" field; big-endian byte order) as 1016 found in librmff 1017 (https://github.com/mbunkus/mkvtoolnix/blob/master/lib/librmff/ 1018 librmff.h). 1020 4.4.22. A_REAL/ATRC 1022 Codec ID: A_REAL/ATRC 1024 Codec Name: Sony Atrac3 Codec 1026 Initialization: The Private Data contains either the 1027 "real_audio_v4_props_t" or the "real_audio_v5_props_t" structure 1028 (differentiated by their "version" field; big-endian byte order) as 1029 found in librmff 1030 (https://github.com/mbunkus/mkvtoolnix/blob/master/lib/librmff/ 1031 librmff.h). 1033 4.4.23. A_MS/ACM 1035 Codec ID: A_MS/ACM 1037 Codec Name: Microsoft(TM) Audio Codec Manager (ACM) 1039 Description: The data are stored in little-endian format (like on 1040 IA32 machines). 1042 Initialization: The Private Data contains the ACM structure 1043 WAVEFORMATEX including the extra private bytes, as defined by 1044 Microsoft 1045 (http://msdn.microsoft.com/library/default.asp?url=/library/en- 1046 us/multimed/mmstr_625u.asp). 1048 4.4.24. A_AAC/MPEG2/MAIN 1050 Codec ID: A_AAC/MPEG2/MAIN 1052 Codec Name: MPEG2 Main Profile 1054 Description: Channel number and sample rate have to be read from the 1055 corresponding audio element. Audio stream is stripped from ADTS 1056 headers and normal Matroska frame based muxing scheme is applied. 1057 AAC audio always uses wFormatTag 0xFF. 
1059 Initialization: none 1061 4.4.25. A_AAC/MPEG2/LC 1063 Codec ID: A_AAC/MPEG2/LC 1065 Codec Name: Low Complexity 1067 Description: Channel number and sample rate have to be read from the 1068 corresponding audio element. Audio stream is stripped from ADTS 1069 headers and normal Matroska frame based muxing scheme is applied. 1070 AAC audio always uses wFormatTag 0xFF. 1072 Initialization: none 1074 4.4.26. A_AAC/MPEG2/LC/SBR 1076 Codec ID: A_AAC/MPEG2/LC/SBR 1078 Codec Name: Low Complexity with Spectral Band Replication 1079 Description: Channel number and sample rate have to be read from the 1080 corresponding audio element. Audio stream is stripped from ADTS 1081 headers and normal Matroska frame based muxing scheme is applied. 1082 AAC audio always uses wFormatTag 0xFF. 1084 Initialization: none 1086 4.4.27. A_AAC/MPEG2/SSR 1088 Codec ID: A_AAC/MPEG2/SSR 1090 Codec Name: Scalable Sampling Rate 1092 Description: Channel number and sample rate have to be read from the 1093 corresponding audio element. Audio stream is stripped from ADTS 1094 headers and normal Matroska frame based muxing scheme is applied. 1095 AAC audio always uses wFormatTag 0xFF. 1097 Initialization: none 1099 4.4.28. A_AAC/MPEG4/MAIN 1101 Codec ID: A_AAC/MPEG4/MAIN 1103 Codec Name: MPEG4 Main Profile 1105 Description: Channel number and sample rate have to be read from the 1106 corresponding audio element. Audio stream is stripped from ADTS 1107 headers and normal Matroska frame based muxing scheme is applied. 1108 AAC audio always uses wFormatTag 0xFF. 1110 Initialization: none 1112 4.4.29. A_AAC/MPEG4/LC 1114 Codec ID: A_AAC/MPEG4/LC 1116 Codec Name: Low Complexity 1118 Description: Channel number and sample rate have to be read from the 1119 corresponding audio element. Audio stream is stripped from ADTS 1120 headers and normal Matroska frame based muxing scheme is applied. 1121 AAC audio always uses wFormatTag 0xFF. 1123 Initialization: none 1125 4.4.30. 
A_AAC/MPEG4/LC/SBR 1127 Codec ID: A_AAC/MPEG4/LC/SBR 1129 Codec Name: Low Complexity with Spectral Band Replication 1131 Description: Channel number and sample rate have to be read from the 1132 corresponding audio element. Audio stream is stripped from ADTS 1133 headers and normal Matroska frame based muxing scheme is applied. 1134 AAC audio always uses wFormatTag 0xFF. 1136 Initialization: none 1138 4.4.31. A_AAC/MPEG4/SSR 1140 Codec ID: A_AAC/MPEG4/SSR 1142 Codec Name: Scalable Sampling Rate 1144 Description: Channel number and sample rate have to be read from the 1145 corresponding audio element. Audio stream is stripped from ADTS 1146 headers and normal Matroska frame based muxing scheme is applied. 1147 AAC audio always uses wFormatTag 0xFF. 1149 Initialization: none 1151 4.4.32. A_AAC/MPEG4/LTP 1153 Codec ID: A_AAC/MPEG4/LTP 1155 Codec Name: Long Term Prediction 1157 Description: Channel number and sample rate have to be read from the 1158 corresponding audio element. Audio stream is stripped from ADTS 1159 headers and normal Matroska frame based muxing scheme is applied. 1160 AAC audio always uses wFormatTag 0xFF. 1162 Initialization: none 1164 4.4.33. A_QUICKTIME 1166 Codec ID: A_QUICKTIME 1168 Codec Name: Audio taken from QuickTime(TM) files 1170 Description: Several codecs as stored in QuickTime, e.g., QDesign 1171 Music v1 or v2. 1173 Initialization: The Private Data contains all additional data that is 1174 stored in the 'stsd' (sample description) atom in the QuickTime file 1175 *after* the mandatory sound descriptor structure (starting with the 1176 size and FourCC fields). For an explanation of the QuickTime file 1177 format read QuickTime File Format Specification 1178 (https://developer.apple.com/library/mac/documentation/QuickTime/ 1179 QTFF/QTFFPreface/qtffPreface.html). 1181 4.4.34. 
A_QUICKTIME/QDMC 1183 Codec ID: A_QUICKTIME/QDMC 1185 Codec Name: QDesign Music 1187 Description: 1189 Initialization: The Private Data contains all additional data that is 1190 stored in the 'stsd' (sample description) atom in the QuickTime file 1191 *after* the mandatory sound descriptor structure (starting with the 1192 size and FourCC fields). For an explanation of the QuickTime file 1193 format read QuickTime File Format Specification 1194 (https://developer.apple.com/library/mac/documentation/QuickTime/ 1195 QTFF/QTFFPreface/qtffPreface.html). 1197 Superseded By: A_QUICKTIME 1199 4.4.35. A_QUICKTIME/QDM2 1201 Codec ID: A_QUICKTIME/QDM2 1203 Codec Name: QDesign Music v2 1205 Description: 1207 Initialization: The Private Data contains all additional data that is 1208 stored in the 'stsd' (sample description) atom in the QuickTime file 1209 *after* the mandatory sound descriptor structure (starting with the 1210 size and FourCC fields). For an explanation of the QuickTime file 1211 format read QuickTime File Format Specification 1212 (https://developer.apple.com/library/mac/documentation/QuickTime/ 1213 QTFF/QTFFPreface/qtffPreface.html). 1215 Superseded By: A_QUICKTIME 1217 4.4.36. A_TTA1 1219 Codec ID: A_TTA1 1220 Codec Name: The True Audio (http://tausoft.org/) lossless audio 1221 compressor 1223 Description: TTA format description (http://tausoft.org/wiki/ 1224 True_Audio_Codec_Format) Each frame is kept intact, including the 1225 CRC32. The header and seektable are dropped. SamplingFrequency, 1226 Channels and BitDepth are used in the TrackEntry. wFormatTag = 0x77A1 1228 Initialization: none 1230 4.4.37. A_WAVPACK4 1232 Codec ID: A_WAVPACK4 1234 Codec Name: WavPack (http://www.wavpack.com/) lossless audio 1235 compressor 1237 Description: The Wavpack packets consist of a stripped header 1238 followed by the frame data. For multi-track (> 2 tracks) a frame 1239 consists of many packets. 
For more details, check the WavPack muxing 1240 description (wavpack.html). 1242 Codec BlockAdditions: For hybrid A_WAVPACK4 encodings (that include a 1243 lossy encoding with a supplemental correction to produce a lossless 1244 encoding), the correction part is stored in BlockAdditional. The 1245 BlockAddId of the BlockMore containing these data MUST be 1. 1247 Initialization: none 1249 4.5. Subtitle Codec Mappings 1251 4.5.1. S_TEXT/UTF8 1253 Codec ID: S_TEXT/UTF8 1255 Codec Name: UTF-8 Plain Text 1257 Description: Basic text subtitles. For more information, see 1258 Section 5 on Subtitles. 1260 4.5.2. S_TEXT/SSA 1262 Codec ID: S_TEXT/SSA 1264 Codec Name: Subtitles Format 1265 Description: The [Script Info] and [V4 Styles] sections are stored in 1266 the CodecPrivate. Each event is stored in its own Block. For more 1267 information, see Section 5.3 on SSA/ASS. 1269 4.5.3. S_TEXT/ASS 1271 Codec ID: S_TEXT/ASS 1273 Codec Name: Advanced Subtitles Format 1275 Description: The [Script Info] and [V4 Styles] sections are stored in 1276 the CodecPrivate. Each event is stored in its own Block. For more 1277 information, see Section 5.3 on SSA/ASS. 1279 4.5.4. S_TEXT/WEBVTT 1281 Codec ID: S_TEXT/WEBVTT 1283 Codec Name: Web Video Text Tracks Format (WebVTT) 1285 Description: Advanced text subtitles. For more information, see 1286 Section 5.4 on WebVTT. 1288 4.5.5. S_IMAGE/BMP 1290 Codec ID: S_IMAGE/BMP 1292 Codec Name: Bitmap 1294 Description: Basic image-based subtitle format. The subtitles are 1295 stored as images, as on a DVD. The timestamp in the block header 1296 of Matroska indicates the start display time; the duration is set 1297 with the Duration element. The full data for the subtitle bitmap is 1298 stored in the Block's data section. 1300 4.5.6. S_DVBSUB 1302 Codec ID: S_DVBSUB 1304 Codec Name: Digital Video Broadcasting (DVB) subtitles 1306 Description: This is the graphical subtitle format used in the 1307 Digital Video Broadcasting standard.
For more information, see 1308 Section 5.7 on Digital Video Broadcasting (DVB). 1310 4.5.7. S_VOBSUB 1312 Codec ID: S_VOBSUB 1314 Codec Name: VobSub subtitles 1316 Description: The same subtitle format used on DVDs. Only 1317 format version 7 and newer is supported. VobSubs consist of two files: the 1318 .idx, containing information, and the .sub, containing the actual 1319 data. The .idx file is stripped of all empty lines, of all comments, 1320 and of lines beginning with alt: or langidx:. The line beginning with 1321 id: SHOULD be transformed into the appropriate Matroska track 1322 language element and is then discarded. All remaining lines but the ones 1323 containing timestamps and file positions are put into the 1324 CodecPrivate element. 1326 For each line containing a timestamp and file position, data is read 1327 from the appropriate position in the .sub file. This data consists 1328 of an MPEG program stream which in turn contains SPU packets. The 1329 MPEG program stream data is discarded, and each SPU packet is put 1330 into one Matroska frame. 1332 4.5.8. S_HDMV/PGS 1334 Codec ID: S_HDMV/PGS 1336 Codec Name: HDMV presentation graphics subtitles (PGS) 1338 Description: This is the graphical subtitle format used on Blu-rays. 1339 For more information, see Section 5.6 on HDMV text presentation. 1341 4.5.9. S_HDMV/TEXTST 1343 Codec ID: S_HDMV/TEXTST 1345 Codec Name: HDMV text subtitles 1347 Description: This is the textual subtitle format used on Blu-rays. 1348 For more information, see Section 5.5 on HDMV graphics presentation. 1350 4.5.10. S_KATE 1352 Codec ID: S_KATE 1354 Codec Name: Karaoke And Text Encapsulation 1355 Description: A subtitle format developed for Ogg. The mapping for 1356 Matroska is described on the Xiph wiki 1357 (http://wiki.xiph.org/index.php/OggKate#Matroska_mapping). As for 1358 Theora and Vorbis, Kate headers are stored in the private data as 1359 Xiph-laced packets. 1361 4.6. Button Codec Mappings 1363 4.6.1.
B_VOBBTN 1365 Codec ID: B_VOBBTN 1367 Codec Name: VobBtn Buttons 1369 Description: Based on MPEG/VOB PCI packets 1370 (http://dvd.sourceforge.net/dvdinfo/pci_pkt.html). The file contains 1371 a header consisting of the string "butonDVD" followed by the width 1372 and height in pixels (16 bits integer each) and 4 reserved bytes. 1373 The rest is full PCI packets (http://dvd.sourceforge.net/dvdinfo/ 1374 pci_pkt.html). 1376 4.7. Block Addition Mappings 1378 Registered BlockAddIDType are: 1380 4.7.1. Use BlockAddIDValue 1382 Block type identifier: 0 1384 Block type name: Use BlockAddIDValue 1386 Description: This value indicates that the actual type is stored in 1387 BlockAddIDValue instead. This value is expected to be used when it 1388 is important to have a strong compatibility with players or derived 1389 formats not supporting BlockAdditionMapping but using BlockAdditions 1390 with an unknown BlockAddIDValue, and SHOULD NOT be used if it is 1391 possible to use another value. 1393 4.7.2. Opaque data 1395 Block type identifier: 1 1397 Block type name: Opaque data 1399 Description: the BlockAdditional data is interpreted as opaque 1400 additional data passed to the codec with the Block data. 1401 BlockAddIDValue MUST be 1. 1403 4.7.3. ITU T.35 metadata 1405 Block type identifier: 4 1407 Block type name: ITU T.35 metadata 1409 Description: the BlockAdditional data is interpreted as ITU T.35 1410 metadata, as defined by ITU-T T.35 terminal codes. BlockAddIDValue 1411 MUST be 4. 1413 4.7.4. avcE 1415 Block type identifier: 0x61766345 1417 Block type name: Dolby Vision enhancement-layer AVC configuration 1419 Description: the BlockAddIDExtraData data is interpreted as the Dolby 1420 Vision enhancement-layer AVC configuration box as described in 1421 [DolbyVisionWithinIso]. This extension MUST NOT be used if Codec ID 1422 is not V_MPEG4/ISO/AVC. 1424 4.7.5. 
dvcC 1426 Block type identifier: 0x64766343 1428 Block type name: Dolby Vision configuration 1430 Description: The BlockAddIDExtraData data is interpreted as a 1431 DOVIDecoderConfigurationRecord structure, as defined in 1432 [DolbyVisionWithinIso], for Dolby Vision profiles less than or equal 1433 to 7. 1435 4.7.6. dvvC 1437 Block type identifier: 0x64767643 1439 Block type name: Dolby Vision configuration 1441 Description: The BlockAddIDExtraData data is interpreted as a 1442 DOVIDecoderConfigurationRecord structure, as defined in 1443 [DolbyVisionWithinIso], for Dolby Vision profiles greater than 7. 1445 4.7.7. hvcE 1447 Block type identifier: 0x68766345 1449 Block type name: Dolby Vision enhancement-layer HEVC configuration 1450 Description: The BlockAddIDExtraData data is interpreted as the Dolby 1451 Vision enhancement-layer HEVC configuration as described in 1452 [DolbyVisionWithinIso]. This extension MUST NOT be used if Codec ID 1453 is not V_MPEGH/ISO/HEVC. 1455 4.7.8. mvcC 1457 Block type identifier: 0x6D766343 1459 Block type name: MVC configuration 1461 Description: The BlockAddIDExtraData data is interpreted as an 1462 MVCDecoderConfigurationRecord structure, as defined in 1463 [ISO.14496-15]. This extension MUST NOT be used if Codec ID is not 1464 V_MPEG4/ISO/AVC. 1466 5. Subtitles 1468 Because Matroska is a general container format, we try to avoid 1469 specifying the formats to store in it. This type of work is really 1470 outside of the scope of a container-only format. However, because 1471 the use of subtitles in A/V containers has been so limited (with the 1472 exception of DVD), we are taking the time to specify how to store some 1473 of the more common subtitle formats in Matroska. This is being done 1474 to help facilitate their growth. Otherwise, incompatibilities could 1475 prevent the standardization and use of subtitle storage.
1477 This section is not meant to be a complete listing of all subtitle 1478 formats that will be used in Matroska; it is only meant to be a guide 1479 for the more common, current formats. It is possible that we will 1480 add future formats to this section as they are created, but it is not 1481 likely, as any other new subtitle format designer would likely have 1482 their own specifications. Any specification listed here SHOULD be 1483 strictly adhered to; otherwise, the corresponding Codec ID SHOULD NOT be used. 1485 Here is a list of pointers for storing subtitles in Matroska: 1487 * Any Matroska file containing only subtitles SHOULD use the 1488 extension ".mks". 1490 * As a general rule of thumb for all codecs, information that is 1491 global to an entire stream SHOULD be stored in the CodecPrivate 1492 element. 1494 * Start and stop timestamps that are used in a subtitle's native 1495 storage format SHOULD be removed when being placed in Matroska as 1496 they could interfere if the file is edited afterwards. Instead, 1497 the Block's timestamp and Duration SHOULD be used to say when the 1498 subtitle is displayed. 1500 * Because a "subtitle" stream is actually just an overlay stream, 1501 anything with a transparency layer could be used, including video. 1503 5.1. Images Subtitles 1505 The first image format targeted for import into Matroska is the 1506 VobSub subtitle format. This subtitle type is generated by exporting 1507 the subtitles from a DVD. 1509 The requirement for muxing VobSub into Matroska is v7 subtitles (see 1510 first line of the .IDX file). If the version is lower, you must 1511 remux them using the SubResync utility from VobSub 2.23 (or MPC) into 1512 v7 format. Generally, any newly created subs will be in v7 format. 1514 The .IFO file will not be used at all. 1516 If there is more than one subtitle stream in the VobSub set, each 1517 stream will need to be separated into separate tracks for storage in 1518 Matroska. E.g.
the VobSub file contains streams for both English and 1519 German subtitles. Then the resulting Matroska file SHOULD contain 1520 two tracks. That way the language information can be dropped and 1521 mapped to Matroska's language tags. 1523 The .IDX file is reformatted (see below) and placed in the 1524 CodecPrivate. 1526 Each .BMP will be stored in its own Block. The timestamp will be 1527 stored in the Block's timestamp and the duration will be stored in the 1528 Default Duration. 1530 Here is an example .IDX file: 1532 # VobSub index file, v7 (do not modify this line!) 1533 # 1534 # To repair desynchronization, you can insert gaps this way: 1535 # (it usually happens after vob id changes) 1536 # 1537 # delay: [sign]hh:mm:ss:ms 1538 # 1539 # Where: 1540 # [sign]: +, - (optional) 1541 # hh: hours (0 <= hh) 1542 # mm/ss: minutes/seconds (0 <= mm/ss <= 59) 1543 # ms: milliseconds (0 <= ms <= 999) 1544 # 1545 # Note: You can't position a sub before the previous with a negative 1546 # value. 1547 # 1548 # You can also modify timestamps or delete a few subs you don't 1549 # like. Just make sure they stay in increasing order. 1551 # Settings 1553 # Original frame size 1554 size: 720x480 1556 # Origin, relative to the upper-left corner, can be overloaded by 1557 # alignment 1558 org: 0, 0 1560 # Image scaling (hor,ver), origin is at the upper-left corner or at 1561 # the alignment coord (x, y) 1562 scale: 100%, 100% 1564 # Alpha blending 1565 alpha: 100% 1567 # Smoothing for very blocky images (use OLD for no filtering) 1568 smooth: OFF 1570 # In millisecs 1571 fadein/out: 50, 50 1573 # Force subtitle placement relative to (org.x, org.y) 1574 align: OFF at LEFT TOP 1576 # For correcting non-progressive desync. (in millisecs or 1577 # hh:mm:ss:ms) 1578 # Note: Not effective in DirectVobSub, use "delay: ... " instead.
1579 time offset: 0 1581 # ON: displays only forced subtitles, OFF: shows everything 1582 forced subs: OFF 1584 # The original palette of the DVD 1585 palette: 000000, 7e7e7e, fbff8b, cb86f1, 7f74b8, e23f06, 0a48ea, \ 1586 b3d65a, 6b92f1, 87f087, c02081, f8d0f4, e3c411, 382201, e8840b, \ 1587 fdfdfd 1589 # Custom colors (transp idxs and the four colors) 1590 custom colors: OFF, tridx: 0000, colors: 000000, 000000, 000000, \ 1591 000000 1593 # Language index in use 1594 langidx: 0 1596 # English 1597 id: en, index: 0 1598 # Uncomment next line to activate alternative name in DirectVobSub / 1599 # Windows Media Player 6.x 1600 # alt: English 1601 # Vob/Cell ID: 1, 1 (PTS: 0) 1602 timestamp: 00:00:01:101, filepos: 000000000 1603 timestamp: 00:00:08:708, filepos: 000001000 1605 First, lines beginning with "#" are removed. These are comments to 1606 make text file editing easier, and as this is not a text file, they 1607 aren't needed. 1609 Next remove the "langidx" and "id" lines. These are used to 1610 differentiate the subtitle streams and define the language. As the 1611 streams will be stored separately anyway, there is no need to 1612 differentiate them here. Also, the language setting will be stored 1613 in the Matroska tags, so there is no need to store it here. 1615 Finally, the "timestamp" will be used to set the Block's timestamp. 1616 Once it is set there, there is no need for it to be stored here. 1617 Also, as it may interfere if the file is edited, it SHOULD NOT be 1618 stored here. 
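The stripping steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the mapping itself: the function name split_idx and its return shape are hypothetical, the input is assumed to hold a single subtitle stream, and "alt:" lines are dropped as well because Section 4.5.7 lists them among the lines to strip. The file position is kept as the raw string from the .idx line, since its interpretation is left to the muxer.

```python
import re

# Hypothetical helper, not taken from any existing muxer.
# Matches lines such as "timestamp: 00:00:01:101, filepos: 000000000".
_TIMESTAMP_RE = re.compile(
    r"^timestamp:\s*(\d+):(\d{2}):(\d{2}):(\d{3}),\s*filepos:\s*(\S+)$"
)


def split_idx(idx_text):
    """Split single-stream .idx text into (codec_private, entries).

    codec_private is the text that remains after stripping.
    entries is a list of (timestamp_ms, filepos_string), one per Block.
    """
    kept = []
    entries = []
    for raw in idx_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # empty lines and comments are dropped
        if line.startswith(("langidx:", "id:", "alt:")):
            continue  # stream selection and language go to the track
        match = _TIMESTAMP_RE.match(line)
        if match:
            hh, mm, ss, ms = (int(g) for g in match.groups()[:4])
            timestamp_ms = ((hh * 60 + mm) * 60 + ss) * 1000 + ms
            entries.append((timestamp_ms, match.group(5)))
            continue  # timestamps move to the Block, not CodecPrivate
        kept.append(line)
    return "\n".join(kept) + "\n", entries
```

A caller would write the first return value into the CodecPrivate element and use each entry to locate one SPU packet in the .sub file and to set the timestamp of the Block that carries it. The language from the "id:" line has to be captured separately before it is discarded; this sketch does not do that.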
1620 Once all of these items are removed, the data to store in the 1621 CodecPrivate SHOULD look like this: 1623 size: 720x480 1624 org: 0, 0 1625 scale: 100%, 100% 1626 alpha: 100% 1627 smooth: OFF 1628 fadein/out: 50, 50 1629 align: OFF at LEFT TOP 1630 time offset: 0 1631 forced subs: OFF 1632 palette: 000000, 7e7e7e, fbff8b, cb86f1, 7f74b8, e23f06, 0a48ea, \ 1633 b3d65a, 6b92f1, 87f087, c02081, f8d0f4, e3c411, 382201, e8840b, \ 1634 fdfdfd 1635 custom colors: OFF, tridx: 0000, colors: 000000, 000000, 000000, \ 1636 000000 1638 There SHOULD also be two Blocks containing one image each with the 1639 timestamps "00:00:01:101" and "00:00:08:708". 1641 5.2. SRT Subtitles 1643 SRT is perhaps the most basic of all subtitle formats. 1645 It consists of four parts, all in text: 1647 1. A number indicating which subtitle it is in the sequence. 2. The 1648 time that the subtitle appears on the screen, and then disappears. 3. 1649 The subtitle itself. 4. A blank line indicating the start of a new 1650 subtitle. 1652 When placing SRT in Matroska, part 3 is converted to UTF-8 (S_TEXT/ 1653 UTF8) and placed in the data portion of the Block. Part 2 is used to 1654 set the timestamp of the Block, and BlockDuration element. Nothing 1655 else is used. 1657 Here is an example SRT file: 1659 1 1660 00:02:17,440 --> 00:02:20,375 1661 Senator, we're making 1662 our final approach into Coruscant. 1664 2 1665 00:02:20,476 --> 00:02:22,501 1666 Very good, Lieutenant. 1668 In this example, the text "Senator, we're making our final approach 1669 into Coruscant." would be converted into UTF-8 and placed in the 1670 Block. The timestamp of the block would be set to "00:02:17,440". 1671 And the BlockDuration element would be set to "00:00:02,935". 1673 The same is repeated for the next subtitle. 1675 Because there are no general settings for SRT, the CodecPrivate is 1676 left blank. 1678 5.3. SSA/ASS Subtitles 1680 SSA stands for Sub Station Alpha. 
It's the file format used by the 1681 popular subtitle editor, SubStation Alpha (http://wiki.multimedia.cx/ 1682 index.php?title=SubStation_Alpha). This format is widely used by 1683 fansubbers. 1685 It supports some advanced display features, like positioning, 1686 karaoke, and style management. 1688 For detailed information on SSA/ASS, see the SSA specs 1689 (http://moodub.free.fr/video/ass-specs.doc). It includes an SSA 1690 specs description and the advanced features added by the ASS format 1691 (standing for Advanced SSA). Because SSA and ASS are so similar, 1692 they are treated the same here. 1694 Like SRT, this format is text based with a particular syntax. 1696 A file consists of 4 or 5 parts, declared in the manner of an INI file (but it is not 1697 an INI file). 1699 The first, "[Script Info]", contains some information about the 1700 subtitle file, such as its title, who created it, the type of script, and 1701 a very important one: "PlayResY". Be careful with this value: 1702 everything in your script (font size, positioning) is scaled by it. 1703 Sub Station Alpha uses your desktop's Y resolution to write this 1704 value, so if a friend with a large monitor and a high screen 1705 resolution gives you an edited script, you can mess everything up by 1706 saving the script in SSA with your low-cost monitor. 1708 The second, "[V4 Styles]", is a list of style definitions. A style 1709 describes how text will look on the screen. It defines font, font 1710 size, primary/.../outline colour, position, alignment, etc. 1712 For example: 1714 Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, \ 1715 TertiaryColour, BackColour, Bold, Italic, BorderStyle, Outline, \ 1716 Shadow, Alignment, MarginL, MarginR, MarginV, AlphaLevel, Encoding 1717 Style: Wolf main,Wolf_Rain,56,15724527,15724527,15724527,4144959,0,\ 1718 0,1,1,2,2,5,5,30,0,0 1720 The third, "[Events]", is the list of text you want to display at the 1721 right timing. You can specify some attributes here.
These include the style 1722 to use for this event (which MUST be defined in the list), the position of 1723 the text (left, right, and vertical margins), and an effect. Name is mostly 1724 used by translators to record who said the line. Timing is in 1725 h:mm:ss.cc (centiseconds). 1727 Format: Marked, Start, End, Style, Name, MarginL, MarginR, MarginV, \ 1728 Effect, Text 1729 Dialogue: Marked=0,0:02:40.65,0:02:41.79,Wolf main,Cher,0000,0000,\ 1730 0000,,Et les enregistrements de ses ondes delta ? 1731 Dialogue: Marked=0,0:02:42.42,0:02:44.15,Wolf main,autre,0000,0000,\ 1732 0000,,Toujours rien. 1734 A "[Pictures]" or "[Fonts]" part can be found in some SSA files. These parts 1735 contain UUE-encoded pictures or fonts, but those features are only used 1736 by Sub Station Alpha; no filter (VobSub or Avery Lee's Subtitler 1737 filter) uses them. 1739 Now, how are they stored in Matroska? 1741 * All text is converted to UTF-8. 1743 * All the headers are stored in CodecPrivate (Script Info and the 1744 Styles list). 1746 * The Start and End fields are used to set the Block's timestamp and the BlockDuration 1747 element. 1749 * Events are stored in the Block with their fields in this order: ReadOrder, Layer, 1750 Style, Name, MarginL, MarginR, MarginV, Effect, Text. (Layer comes 1751 from the ASS specs; it is empty for SSA.) The ReadOrder field is 1752 needed for the decoder to be able to reorder the streamed samples 1753 as they were placed originally in the file. 1755 Here is an example of an SSA file. 1757 [Script Info] 1758 ; This is a Sub Station Alpha v4 script.
1759 ; For Sub Station Alpha info and downloads, 1760 ; go to \ 1761 ; [http://www.eswat.demon.co.uk/](http://www.eswat.demon.co.uk/) 1762 ; or email \ 1763 ; [kotus@eswat.demon.co.uk](mailto:kotus@eswat.demon.co.uk) 1764 Title: Wolf's rain 2 1765 Original Script: Anime-spirit Ishin-francais 1766 Original Translation: Coolman 1767 Original Editing: Spikewolfwood 1768 Original Timing: Lord_alucard 1769 Original Script Checking: Spikewolfwood 1770 ScriptType: v4.00 1771 Collisions: Normal 1772 PlayResY: 1024 1773 PlayDepth: 0 1774 Wav: 0, 128697,D:\Alex\Anime\- Fansub -\- TAFF -\WR_-_02_Wav.wav 1775 Wav: 0, 120692,H:\team truc\WR_-_02.wav 1776 Wav: 0, 116504,E:\sub\wolf's_rain\WOLF'S RAIN 02.wav 1777 LastWav: 3 1778 Timer: 100,0000 1780 [V4 Styles] 1781 Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, \ 1782 TertiaryColour, BackColour, Bold, Italic, BorderStyle, Outline, \ 1783 Shadow, Alignment, MarginL, MarginR, MarginV, AlphaLevel, Encoding 1784 Style: Default,Arial,20,65535,65535,65535,-2147483640,-1,0,1,3,0,2,\ 1785 30,30,30,0,0 1786 Style: Titre_episode,Akbar,140,15724527,65535,65535,986895,-1,0,1,1,\ 1787 0,3,30,30,30,0,0 1788 Style: Wolf main,Wolf_Rain,56,15724527,15724527,15724527,4144959,0,\ 1789 0,1,1,2,2,5,5,30,0,0 1791 [Events] 1792 Format: Marked, Start, End, Style, Name, MarginL, MarginR, MarginV, \ 1793 Effect, Text 1794 Dialogue: Marked=0,0:02:40.65,0:02:41.79,Wolf main,Cher,0000,0000,\ 1795 0000,,Et les enregistrements de ses ondes delta ? 1796 Dialogue: Marked=0,0:02:42.42,0:02:44.15,Wolf main,autre,0000,0000,\ 1797 0000,,Toujours rien. 1799 Here is what would be placed into the CodecPrivate element. 1801 [Script Info] 1802 ; This is a Sub Station Alpha v4 script. 
1803 ; For Sub Station Alpha info and downloads, 1804 ; go to \ 1805 ; [http://www.eswat.demon.co.uk/](http://www.eswat.demon.co.uk/) 1806 ; or email \ 1807 ; [kotus@eswat.demon.co.uk](mailto:kotus@eswat.demon.co.uk) 1808 Title: Wolf's rain 2 1809 Original Script: Anime-spirit Ishin-francais 1810 Original Translation: Coolman 1811 Original Editing: Spikewolfwood 1812 Original Timing: Lord_alucard 1813 Original Script Checking: Spikewolfwood 1814 ScriptType: v4.00 1815 Collisions: Normal 1816 PlayResY: 1024 1817 PlayDepth: 0 1818 Wav: 0, 128697,D:\Alex\Anime\- Fansub -\- TAFF -\WR_-_02_Wav.wav 1819 Wav: 0, 120692,H:\team truc\WR_-_02.wav 1820 Wav: 0, 116504,E:\sub\wolf's_rain\WOLF'S RAIN 02.wav 1821 LastWav: 3 1822 Timer: 100,0000 1824 [V4 Styles] 1825 Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, \ 1826 TertiaryColour, BackColour, Bold, Italic, BorderStyle, Outline, \ 1827 Shadow, Alignment, MarginL, MarginR, MarginV, AlphaLevel, Encoding 1828 Style: Default,Arial,20,65535,65535,65535,-2147483640,-1,0,1,3,0,2,\ 1829 30,30,30,0,0 1830 Style: Titre_episode,Akbar,140,15724527,65535,65535,986895,-1,0,1,1,\ 1831 0,3,30,30,30,0,0 1832 Style: Wolf main,Wolf_Rain,56,15724527,15724527,15724527,4144959,0,\ 1833 0,1,1,2,2,5,5,30,0,0 1835 And here are the two blocks that would be generated. 1837 Block's timestamp: 00:02:40.650 BlockDuration: 00:00:01.140 1839 1,,Wolf main,Cher,0000,0000,0000,,Et les enregistrements de ses \ 1840 ondes delta ? 1842 Block's timestamp: 00:02:42.420 BlockDuration: 00:00:01.730 1844 2,,Wolf main,autre,0000,0000,0000,,Toujours rien. 1846 5.4. WebVTT 1848 The "Web Video Text Tracks Format" (short: WebVTT) is developed by 1849 the World Wide Web Consortium (W3C) (https://www.w3.org/). Its 1850 specifications are freely available (https://w3c.github.io/webvtt/). 
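The storage rules given for this format in Section 5.4.1 (Cue start timestamp as the Block timestamp, end minus start as the Block duration, timestamps inside the Cue Text made relative to the Block, and a three-part BlockAddition) can be illustrated with a minimal Python sketch. The function names, the dictionary keys, and the use of milliseconds are assumptions made for illustration only; they are not part of this specification or of any Matroska library. The sketch also assumes a single space before the Cue Settings List, which is simpler than what a full WebVTT parser accepts.

```python
import re

# [hh:]mm:ss.ttt as used in WebVTT cue timings and in-text cue timestamps.
_TS = re.compile(r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})")


def parse_ts(text):
    """Convert a WebVTT timestamp to milliseconds."""
    m = _TS.fullmatch(text)
    if m is None:
        raise ValueError(f"not a WebVTT timestamp: {text!r}")
    hours, minutes, seconds, millis = (int(g) if g else 0 for g in m.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def format_ts(ms):
    """Format milliseconds as hh:mm:ss.ttt."""
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def cue_to_block(timing_line, cue_text, identifier="", comments=""):
    """Map one WebVTT cue to hypothetical Block fields (times in ms)."""
    start_text, _, rest = timing_line.partition(" --> ")
    end_text, _, settings = rest.partition(" ")
    start, end = parse_ts(start_text), parse_ts(end_text)

    # Rewrite in-text cue timestamps so they are relative to the Block start.
    def rebase(match):
        return "<" + format_ts(parse_ts(match.group(1)) - start) + ">"

    payload = re.sub(r"<((?:\d+:)?\d{2}:\d{2}\.\d{3})>", rebase, cue_text)

    # BlockAddition: settings line, identifier line, then any comment blocks.
    # It is omitted entirely when all three components are absent.
    addition = None
    if settings or identifier or comments:
        addition = settings.strip() + "\n" + identifier + "\n" + comments
    return {"timestamp": start, "duration": end - start,
            "payload": payload, "addition": addition}
```

Calling cue_to_block with a timing line of "00:03:10.000 --> 00:03:20.000" and Cue Text containing "<00:03:15.000>" yields a Block timestamp of 190000 ms, a duration of 10000 ms, an in-text timestamp rewritten to "<00:00:05.000>", and no BlockAddition.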
1852 The guiding principles for the storage of WebVTT in Matroska are: 1854 * Consistency: store data in a similar way to other subtitle codecs 1856 * Simplicity: make decoding and remuxing as easy as possible for 1857 existing infrastructures 1859 * Completeness: keep as much data as possible from the original 1860 WebVTT file 1862 5.4.1. Storage of WebVTT in Matroska 1864 5.4.1.1. CodecID: codec identification 1866 The CodecID to use is S_TEXT/WEBVTT. 1868 5.4.1.2. CodecPrivate: storage of global WebVTT blocks 1870 This element contains all global blocks before the first subtitle 1871 entry. This starts at the "WEBVTT" file identification marker but 1872 excludes the optional byte order mark. 1874 5.4.1.3. Storage of non-global WebVTT blocks 1876 Non-global WebVTT blocks (e.g., "NOTE") before a WebVTT Cue Text are 1877 stored in Matroska's BlockAddition element together with the Matroska 1878 Block containing the WebVTT Cue Text these blocks precede (see below 1879 for the actual format). 1881 5.4.1.4. Storage of Cues in Matroska blocks 1883 Each WebVTT Cue Text is stored directly in the Matroska Block. 1885 A muxer MUST change all WebVTT Cue Timestamps present within the Cue 1886 Text to be relative to the Matroska Block's timestamp. 1888 The Cue's start timestamp is used as the Matroska Block's timestamp. 1890 The difference between the Cue's end timestamp and its start 1891 timestamp is used as the Matroska Block's duration. 1893 5.4.1.5. BlockAdditions: storing non-global WebVTT blocks, Cue Settings 1894 Lists and Cue identifiers 1896 Each Matroska Block may be accompanied by one BlockAdditions element. 1897 Its format is as follows: 1899 1. The first line contains the WebVTT Cue Text's optional Cue 1900 Settings List followed by one line feed character (U+000A). 1901 The Cue Settings List may be empty, in which case the line 1902 consists of the line feed character only. 1904 2.
The second line contains the WebVTT Cue Text's optional Cue 1905 Identifier followed by one line feed character (U+000A). The 1906 line may be empty, indicating that there was no Cue Identifier in 1907 the source file, in which case the line consists of the line feed 1908 character only. 1910 3. The third and all following lines contain all WebVTT Comment 1911 Blocks that precede the current WebVTT Cue Block. These may be 1912 absent. 1914 If there is no Matroska BlockAddition element stored together with 1915 the Matroska Block, then all three components (Cue Settings List, Cue 1916 Identifier, Cue Comments) MUST be assumed to be absent. 1918 5.4.2. Examples of transformation 1920 Here is an example of how a WebVTT file is transformed. 1922 5.4.2.1. Example WebVTT file 1924 Let's take the following example file: 1926 WEBVTT with text after the signature 1928 STYLE 1929 ::cue { 1930 background-image: linear-gradient(to bottom, dimgray, lightgray); 1931 color: papayawhip; 1932 } 1933 /* Style blocks cannot use blank lines nor "dash dash greater \ 1934 than" */ 1936 NOTE comment blocks can be used between style blocks. 1938 STYLE 1939 ::cue(b) { 1940 color: peachpuff; 1942 } 1944 REGION 1945 id:bill 1946 width:40% 1947 lines:3 1948 regionanchor:0%,100% 1949 viewportanchor:10%,90% 1950 scroll:up 1952 NOTE 1953 Notes always span a whole block and can cover multiple 1954 lines. Like this one. 1955 An empty line ends the block. 1957 hello 1958 00:00:00.000 --> 00:00:10.000 1959 Example entry 1: Hello world. 1961 NOTE style blocks cannot appear after the first cue. 1963 00:00:25.000 --> 00:00:35.000 1964 Example entry 2: Another entry. 1965 This one has multiple lines. 1967 00:01:03.000 --> 00:01:06.500 position:90% align:right size:35% 1968 Example entry 3: That stuff to the right of the timestamps are cue \ 1969 settings. 1971 00:03:10.000 --> 00:03:20.000 1972 Example entry 4: Entries can even include timestamps.
1973 For example:<00:03:15.000>This becomes visible five seconds 1974 after the first part. 1976 5.4.2.2. Example of CodecPrivate 1978 The resulting CodecPrivate element will look like this: 1980 WEBVTT with text after the signature 1982 STYLE 1983 ::cue { 1984 background-image: linear-gradient(to bottom, dimgray, lightgray); 1985 color: papayawhip; 1986 } 1987 /* Style blocks cannot use blank lines nor "dash dash greater \ 1988 than" */ 1990 NOTE comment blocks can be used between style blocks. 1992 STYLE 1993 ::cue(b) { 1994 color: peachpuff; 1995 } 1997 REGION 1998 id:bill 1999 width:40% 2000 lines:3 2001 regionanchor:0%,100% 2002 viewportanchor:10%,90% 2003 scroll:up 2005 NOTE 2006 Notes always span a whole block and can cover multiple 2007 lines. Like this one. 2008 An empty line ends the block. 2010 5.4.2.3. Storage of Cue 1 2012 Example Cue 1: timestamp 00:00:00.000, duration 00:00:10.000, Block's 2013 content: 2015 Example entry 1: Hello world. 2017 BlockAddition's content starts with one empty line as there's no Cue 2018 Settings List: 2020 hello 2022 5.4.2.4. Storage of Cue 2 2024 Example Cue 2: timestamp 00:00:25.000, duration 00:00:10.000, Block's 2025 content: 2027 Example entry 2: Another entry. 2028 This one has multiple lines. 2030 BlockAddition's content starts with two empty lines as there's 2031 neither a Cue Settings List nor a Cue Identifier: 2033 NOTE style blocks cannot appear after the first cue. 2035 5.4.2.5. Storage of Cue 3 2037 Example Cue 3: timestamp 00:01:03.000, duration 00:00:03.500, Block's 2038 content: 2040 Example entry 3: That stuff to the right of the timestamps are cue \ 2041 settings. 2043 BlockAddition's content ends with an empty line as there's no Cue 2044 Identifier and there were no WebVTT Comment blocks: 2046 position:90% align:right size:35% 2048 5.4.2.6. Storage of Cue 4 2050 Example Cue 4: timestamp 00:03:10.000, duration 00:00:10.000, Block's 2051 content: 2053 Example entry 4: Entries can even include timestamps. 
For 2054 example:<00:00:05.000>This becomes visible five seconds 2055 after the first part. 2057 This Block does not need a BlockAddition as the Cue did not contain 2058 an Identifier, nor a Settings List, and it wasn't preceded by Comment 2059 blocks. 2061 5.4.3. Storage of WebVTT in Matroska vs. WebM 2063 Note: the storage of WebVTT in Matroska is not the same as the design 2064 document for storage of WebVTT in WebM. There are several reasons 2065 for this, including but not limited to the following: the WebM document is old (from 2066 February 2012), was based on an earlier draft of WebVTT, and 2067 ignores several parts that were added to WebVTT later; WebM 2068 still does not support subtitles at all (http://www.webmproject.org/docs/ 2069 container/); and the proposal suggests splitting the information across 2070 multiple tracks, which makes demuxing and remuxing very difficult. 2072 5.5. HDMV presentation graphics subtitles 2074 The specifications for the HDMV presentation graphics subtitle format 2075 (short: HDMV PGS) can be found in the document "Blu-ray Disc Read- 2076 Only Format; Part 3 - Audio Visual Basic Specifications" in 2077 section 9.14 "HDMV graphics streams". 2079 5.5.1. Storage of HDMV presentation graphics subtitles 2081 The CodecID to use is S_HDMV/PGS. A CodecPrivate element is not 2082 used. 2084 5.5.1.1. Storage of HDMV PGS Segments in Matroska Blocks 2086 Each HDMV PGS Segment (short: Segment) will be stored in a Matroska 2087 Block. A Segment is the data structure described in section 9.14.2.1 2088 "Segment coding structure and parameters" of the Blu-ray 2089 specifications. 2091 Each Segment contains a presentation timestamp. This timestamp will 2092 be used as the timestamp for the Matroska Block. 2094 A Segment is normally shown until a subsequent Segment is 2095 encountered. Therefore the Matroska Block MAY have no Duration.
In 2096 that case, a player MUST display the Segment stored in that Matroska Block 2097 until the next Segment is encountered. 2099 A muxer MAY use a Duration, e.g., by calculating the distance between 2100 two subsequent Segments. If a Matroska Block has a Duration, a 2101 player MUST display that Segment only for the Block's 2102 Duration. 2104 5.6. HDMV text subtitles 2106 The specifications for the HDMV text subtitle format (short: HDMV 2107 TextST) can be found in the document "Blu-ray Disc Read-Only Format; 2108 Part 3 - Audio Visual Basic Specifications" in section 9.15 2109 "HDMV text subtitle streams". 2111 5.6.1. Storage of HDMV text subtitles 2113 The CodecID to use is S_HDMV/TEXTST. 2115 A CodecPrivate Element is required. It MUST contain the stream's 2116 Dialog Style Segment as described in section 9.15.4.2 "Dialog Style 2117 Segment" of the Blu-ray specifications. 2119 5.6.1.1. Storage of HDMV TextST Dialog Presentation Segments in 2120 Matroska Blocks 2122 Each HDMV Dialog Presentation Segment (short: Segment) will be stored 2123 in a Matroska Block. A Segment is the data structure described in 2124 section 9.15.4.3 "Dialog presentation segment" of the Blu-ray 2125 specifications. 2127 Each Segment contains a start and an end presentation timestamp 2128 (short: start PTS and end PTS). The start PTS will be used as the 2129 timestamp for the Matroska Block. The Matroska Block MUST have a 2130 Duration, and that Duration is the difference between the end PTS and 2131 the start PTS. 2133 A player MUST use the Matroska Block's timestamp and Duration instead 2134 of the Segment's start and end PTS for determining when and how long 2135 to show the Segment. 2137 5.6.1.2. Character set 2139 When TextST subtitles are stored inside Matroska, the only allowed 2140 character set is UTF-8. 2142 Each HDMV text subtitle stream in a Blu-ray can use one of a handful 2143 of character sets.
This information is not stored in the MPEG-2 2144 Transport Stream itself but in the accompanying Clip Information 2145 file. 2147 Therefore a muxer MUST parse the accompanying Clip Information file. 2148 If the information indicates a character set other than UTF-8, it 2149 MUST re-encode all text Dialog Presentation Segments from the 2150 indicated character set to UTF-8 prior to storing them in Matroska. 2152 5.7. Digital Video Broadcasting (DVB) subtitles 2154 The specifications for the Digital Video Broadcasting subtitle 2155 bitstream format (short: DVB subtitles) can be found in the document 2156 "ETSI EN 300 743 - Digital Video Broadcasting (DVB); Subtitling 2157 systems". The storage of DVB subtitles in MPEG transport streams is 2158 specified in the document "ETSI EN 300 468 - Digital Video 2159 Broadcasting (DVB); Specification for Service Information (SI) in DVB 2160 systems". 2162 5.7.1. Storage of DVB subtitles 2164 5.7.1.1. CodecID 2166 The CodecID to use is S_DVBSUB. 2168 5.7.1.2. CodecPrivate 2170 The CodecPrivate element is five bytes long and has the following 2171 structure: 2173 * 2 bytes: composition page ID (bit string, left bit first) 2175 * 2 bytes: ancillary page ID (bit string, left bit first) 2177 * 1 byte: subtitling type (bit string, left bit first) 2179 The semantics of these bytes are the same as the ones described in 2180 section 6.2.41 "Subtitling descriptor" of ETSI EN 300 468. 2182 5.7.1.3. Storage of DVB subtitles in Matroska Blocks 2184 Each Matroska Block consists of one or more DVB Subtitle Segments as 2185 described in section 7.2 "Syntax and semantics of the subtitling 2186 segment" of ETSI EN 300 743. 2188 Each Matroska Block SHOULD have a Duration indicating how long the 2189 DVB Subtitle Segments in that Block SHOULD be displayed. 2191 6. Block Additional Mapping 2193 Extra data or metadata can be added to each Block using 2194 BlockAdditional data.
Each BlockAdditional contains a BlockAddID 2195 that identifies the kind of data it contains. When the BlockAddID is 2196 set to "1", the contents of the BlockAdditional Element are defined by 2197 the Codec Mapping; see Section 4.1.5. When the BlockAddID 2198 is set to a value greater than "1", the contents of the 2199 BlockAdditional Element are defined by the BlockAdditionalMapping 2200 Element of the associated Track Element whose BlockAddIDValue 2201 equals the BlockAddID of the BlockAdditional 2202 Element. That 2203 BlockAdditionalMapping Element identifies a particular Block 2204 Additional Mapping by the BlockAddIDType. 2206 The following XML depicts a use of a Block Additional Mapping to 2207 associate a timecode value with a Block: 2209 2210 2211 2212 2213 1 2214 568001708 2215 1 2216 2217 2 2219 timecode 2220 12 2221 2222 V_FFV1 2223 2227 2228 2229 2230 3000 2231 2232 {binary video frame} 2233 2234 2235 2 2237 01:00:00:00 2238 2239 2240 2241 2242 2244 Block Additional Mappings detail how additional data MAY be stored in 2245 the BlockMore Element with a BlockAdditionalMapping Element, within the 2246 Track Element, which identifies the BlockAdditional content. Block 2247 Additional Mappings define the BlockAddIDType value reserved to 2248 identify that type of data and provide an optional label 2249 stored within the BlockAddIDName Element. When the Block Additional 2250 Mapping is dependent on additional contextual information, then the 2251 Mapping SHOULD describe how such additional contextual information is 2252 stored within the BlockAddIDExtraData Element. 2254 The following Block Additional Mappings are defined. 2256 6.1. Summary of Assigned BlockAddIDType Values 2258 For convenience, the following table shows the assigned 2259 BlockAddIDType values along with the BlockAddIDName and Citation.
2261 +================+========================+=============+ 2262 | BlockAddIDType | BlockAddIDName | Citation | 2263 +================+========================+=============+ 2264 | 121 | SMPTE ST 12-1 timecode | Section 6.2 | 2265 +----------------+------------------------+-------------+ 2267 Table 4 2269 6.2. SMPTE ST 12-1 Timecode 2271 6.2.1. Timecode Description 2273 SMPTE ST 12-1 timecode values can be stored in the BlockMore Element 2274 to associate the content of a Matroska Block with a particular 2275 timecode value. If the Block uses Lacing, the timecode value is 2276 associated with the first frame of the Lace. 2278 The Block Additional Mapping contains a full binary representation of 2279 a 64-bit SMPTE timecode value stored in big-endian format and 2280 expressed exactly as defined in Sections 8 and 9 of SMPTE 12M [ST12]. 2281 For convenience, here are the bit assignments for a SMPTE ST 12-1 2282 binary representation as described in Section 6.2 of [RFC5484]: 2284 +===============+========================+ 2285 | Bit Positions | Label | 2286 +===============+========================+ 2287 | 0--3 | Units of frames | 2288 +---------------+------------------------+ 2289 | 4--7 | First binary group | 2290 +---------------+------------------------+ 2291 | 8--9 | Tens of frames | 2292 +---------------+------------------------+ 2293 | 10 | Drop frame flag | 2294 +---------------+------------------------+ 2295 | 11 | Color frame flag | 2296 +---------------+------------------------+ 2297 | 12--15 | Second binary group | 2298 +---------------+------------------------+ 2299 | 16--19 | Units of seconds | 2300 +---------------+------------------------+ 2301 | 20--23 | Third binary group | 2302 +---------------+------------------------+ 2303 | 24--26 | Tens of seconds | 2304 +---------------+------------------------+ 2305 | 27 | Polarity correction | 2306 +---------------+------------------------+ 2307 | 28--31 | Fourth binary group | 2308
+---------------+------------------------+ 2309 | 32--35 | Units of minutes | 2310 +---------------+------------------------+ 2311 | 36--39 | Fifth binary group | 2312 +---------------+------------------------+ 2313 | 40--42 | Tens of minutes | 2314 +---------------+------------------------+ 2315 | 43 | Binary group flag BGF0 | 2316 +---------------+------------------------+ 2317 | 44--47 | Sixth binary group | 2318 +---------------+------------------------+ 2319 | 48--51 | Units of hours | 2320 +---------------+------------------------+ 2321 | 52--55 | Seventh binary group | 2322 +---------------+------------------------+ 2323 | 56--57 | Tens of hours | 2324 +---------------+------------------------+ 2325 | 58 | Binary group flag BGF1 | 2326 +---------------+------------------------+ 2327 | 59 | Binary group flag BGF2 | 2328 +---------------+------------------------+ 2329 | 60--63 | Eighth binary group | 2330 +---------------+------------------------+ 2332 Table 5 2334 For example, a timecode value of "07:32:54;18" can be expressed as a 2335 64 bit SMPTE 12M value as: 2337 10000000 01100000 01100000 01010000 2338 00100000 00110000 01110000 00000000 2340 6.2.2. BlockAddIDType 2342 The BlockAddIDType value reserved for timecode is "121". 2344 6.2.3. BlockAddIDName 2346 The BlockAddIDName value reserved for timecode is "SMPTE ST 12-1 2347 timecode". 2349 6.2.4. BlockAddIDExtraData 2351 BlockAddIDExtraData is unused within this block additional mapping. 2353 7. Security Considerations 2355 This document inherits security considerations from the EBML and 2356 Matroska documents. 2358 8. IANA Considerations 2360 To be determined. 2362 9. Normative References 2364 [DolbyVisionWithinIso] 2365 Dolby, "Dolby Vision Streams Within the ISO Base MediaFile 2366 Format", 7 February 2020, 2367 . 2371 [IEEE.1857-10] 2372 IEEE, "IEEE Standard for Third Generation Video Coding", 9 2373 November 2021, 2374 . 
2376 [IEEE.1857-4] 2377 IEEE, "IEEE Standard for Second-Generation IEEE 1857 Video 2378 Coding", 23 October 2018, 2379 . 2381 [IEEE.754] IEEE, "IEEE Standard for Binary Floating-Point 2382 Arithmetic", 13 June 2019, 2383 . 2385 [ISO.14496-15] 2386 International Organization for Standardization, 2387 "Information technology — Coding of audio-visual objects — 2388 Part 15: Carriage of network abstraction layer (NAL) unit 2389 structured video in ISO base media file format", 2390 ISO Standard 14496, 2014. 2392 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 2393 Requirement Levels", BCP 14, RFC 2119, 2394 DOI 10.17487/RFC2119, March 1997, 2395 . 2397 [RFC3339] Klyne, G. and C. Newman, "Date and Time on the Internet: 2398 Timestamps", RFC 3339, DOI 10.17487/RFC3339, July 2002, 2399 . 2401 [RFC6386] Bankoski, J., Koleszar, J., Quillio, L., Salonen, J., 2402 Wilkins, P., and Y. Xu, "VP8 Data Format and Decoding 2403 Guide", RFC 6386, DOI 10.17487/RFC6386, November 2011, 2404 . 2406 [RFC6648] Saint-Andre, P., Crocker, D., and M. Nottingham, 2407 "Deprecating the "X-" Prefix and Similar Constructs in 2408 Application Protocols", BCP 178, RFC 6648, 2409 DOI 10.17487/RFC6648, June 2012, 2410 . 2412 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2413 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 2414 May 2017, . 2416 [ST12] SMPTE, "Time and Control Code", ST ST 12-1:2014, DOI 2417 10.5594/SMPTE.ST12-1.2014, 20 February 2014, 2418 . 2420 10. Informative References 2422 [RFC5484] Singer, D., "Associating Time-Codes with RTP Streams", 2423 RFC 5484, DOI 10.17487/RFC5484, March 2009, 2424 . 2426 Authors' Addresses 2428 Steve Lhomme 2429 Email: slhomme@matroska.org 2431 Moritz Bunkus 2432 Email: moritz@bunkus.org 2434 Dave Rice 2435 Email: dave@dericed.com