idnits 2.17.1

draft-ietf-cellar-ebml-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to use 'NOT RECOMMENDED' as an RFC 2119 keyword, but
     does not include the phrase in its RFC 2119 key words list.

  -- The document date (January 3, 2018) is 2302 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'EBMLParentPath' is mentioned on line 978, but not
     defined

  == Missing Reference: 'EBMLMinOccurrence' is mentioned on line 989, but not
     defined

  == Missing Reference: 'EBMLMaxOccurrence' is mentioned on line 989, but not
     defined

  == Missing Reference: 'PathMinOccurrence' is mentioned on line 992, but not
     defined

  == Missing Reference: 'PathMaxOccurrence' is mentioned on line 992, but not
     defined

  -- Possible downref: Non-RFC (?) normative reference: ref. 'IEEE.754.1985'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'ITU.V42.1994'

     Summary: 0 errors (**), 0 flaws (~~), 7 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

cellar                                                         S. Lhomme
Internet-Draft
Intended status: Standards Track                                 D. Rice
Expires: July 7, 2018
                                                               M. Bunkus
                                                         January 3, 2018

                    Extensible Binary Meta Language
                       draft-ietf-cellar-ebml-04

Abstract

   This document defines the Extensible Binary Meta Language (EBML)
   format as a generalized file format for any type of data in a
   hierarchical form. EBML is designed as a binary equivalent to XML
   and uses a storage-efficient approach to build nested Elements with
   identifiers, lengths, and values. Similar to how an XML Schema
   defines the structure and semantics of an XML Document, this document
   defines how EBML Schemas are created to convey the semantics of an
   EBML Document.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 7, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Notation and Conventions
   3.  Security Considerations
   4.  IANA Considerations
   5.  Structure
   6.  Variable Size Integer
     6.1.  VINT_WIDTH
     6.2.  VINT_MARKER
     6.3.  VINT_DATA
     6.4.  VINT Examples
   7.  Element ID
   8.  Element Data Size
   9.  EBML Element Types
     9.1.  Signed Integer Element
     9.2.  Unsigned Integer Element
     9.3.  Float Element
     9.4.  String Element
     9.5.  UTF-8 Element
     9.6.  Date Element
     9.7.  Master Element
     9.8.  Binary Element
   10. Terminating Elements
   11. Guidelines for Updating Elements
     11.1.  Reducing a Element Data in Size
       11.1.1.  Adding a Void Element
       11.1.2.  Extending the Element Data Size
       11.1.3.  Terminating Element Data
     11.2.  Considerations when Updating Elements with CRC
   12. EBML Document
     12.1.  EBML Header
     12.2.  EBML Body
   13. EBML Stream
   14. Elements semantic
     14.1.  EBML Schema
       14.1.1.  Element
       14.1.2.  Attributes
       14.1.3.  Element
       14.1.4.  Attributes
       14.1.5.  Element
       14.1.6.  Attributes
       14.1.7.  Element
       14.1.8.  Element
       14.1.9.  Attributes
       14.1.10. XML Schema for EBML Schema
       14.1.11. EBML Schema Example
       14.1.12. Identically Recurring Elements
       14.1.13. Expression of range
       14.1.14. Textual expression of floats
       14.1.15. Note on the use of default attributes to define
                Mandatory EBML Elements
     14.2.  EBML Header Elements
       14.2.1.  EBML Element
       14.2.2.  EBMLVersion Element
       14.2.3.  EBMLReadVersion Element
       14.2.4.  EBMLMaxIDLength Element
       14.2.5.  EBMLMaxSizeLength Element
       14.2.6.  DocType Element
       14.2.7.  DocTypeVersion Element
       14.2.8.  DocTypeReadVersion Element
     14.3.  Global Elements
       14.3.1.  CRC-32 Element
       14.3.2.  Void Element
   15. References
     15.1.  Normative References
     15.2.  Informative References
   Authors' Addresses

1.  Introduction

   "EBML", short for Extensible Binary Meta Language, specifies a binary
   and octet (byte) aligned format inspired by the principle of XML (a
   framework for structuring data).

   The goal of this document is to define a generic, binary,
   space-efficient format that can be used to define more complex formats
   (such as containers for multimedia content) using an "EBML Schema".
   The definition of the "EBML" format recognizes the idea behind HTML
   and XML as a good one: separate structure and semantics allowing the
   same structural layer to be used with multiple, possibly widely
   differing semantic layers. Except for the "EBML Header" and a few
   "Global Elements" this specification does not define particular
   "EBML" format semantics; however this specification is intended to
   define how other "EBML"-based formats can be defined.

   "EBML" uses a simple approach of building "Elements" upon three
   pieces of data (tag, length, and value) as this approach is well
   known, easy to parse, and allows selective data parsing.
   The "EBML" structure additionally allows for hierarchical arrangement
   to support complex structural formats in an efficient manner.

2.  Notation and Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   This document defines specific terms in order to define the format
   and application of "EBML". Specific terms are defined below:

   "EBML": Extensible Binary Meta Language

   "EBML Document Type": A name provided by an "EBML Schema" to
   designate a particular implementation of "EBML" for a data format
   (e.g.: matroska and webm).

   "EBML Schema": A standardized definition for the structure of an
   "EBML Document Type".

   "EBML Document": A datastream comprised of only two components, an
   "EBML Header" and an "EBML Body".

   "EBML Reader": A data parser that interprets the semantics of an
   "EBML Document" and creates a way for programs to use "EBML".

   "EBML Stream": A file that consists of one or more "EBML Documents"
   that are concatenated together.

   "EBML Header": A declaration that provides processing instructions
   and identification of the "EBML Body". The "EBML Header" may be
   considered as analogous to an XML Declaration [W3C.REC-xml-20081126]
   (see section 2.8 on Prolog and Document Type Declaration).

   "EBML Body": All data of an "EBML Document" following the "EBML
   Header".

   "Variable Size Integer": A compact variable-length binary value which
   defines its own length.

   "VINT": Also known as "Variable Size Integer".

   "EBML Element": A foundation block of data that contains three parts:
   an "Element ID", an "Element Data Size", and "Element Data".

   "Element ID": The "Element ID" is a binary value, encoded as a
   "Variable Size Integer", used to uniquely identify a defined "EBML
   Element" within a specific "EBML Schema".

   "EBML Class": A representation of the octet length of an "Element
   ID".

   "Element Data Size": An expression, encoded as a "Variable Size
   Integer", of the length in octets of "Element Data".

   "VINTMAX": The maximum possible value that can be stored as "Element
   Data Size".

   "Unknown-Sized Element": An "Element" with an unknown "Element Data
   Size".

   "Element Data": The value(s) of the "EBML Element" which is
   identified by its "Element ID" and "Element Data Size". The form of
   the "Element Data" is defined by this document and the corresponding
   "EBML Schema" of the Element's "EBML Document Type".

   "Root Level": The starting level in the hierarchy of an "EBML
   Document".

   "Root Element": A mandatory, non-repeating "EBML Element" which
   occurs at the top level of the path hierarchy within an "EBML Body"
   and contains all other "EBML Elements" of the "EBML Body", excepting
   optional "Void Elements".

   "Top-Level Element": An "EBML Element" defined to only occur as a
   "Child Element" of the "Root Element".

   "Master Element": The "Master Element" contains zero, one, or many
   other "EBML Elements".

   "Child Element": A "Child Element" is a relative term to describe the
   "EBML Elements" immediately contained within a "Master Element".

   "Parent Element": A relative term to describe the "Master Element"
   which contains a specified element. For any specified "EBML Element"
   that is not at "Root Level", the "Parent Element" refers to the
   "Master Element" in which that "EBML Element" is contained.

   "Descendant Element": A relative term to describe any "EBML Elements"
   contained within a "Master Element", including any of the "Child
   Elements" of its "Child Elements", and so on.

   "Void Element": A "Void Element" is an "Element" used to overwrite
   damaged data or reserve space within a "Master Element" for later
   use.

   "Element Name": The official human-readable name of the "EBML
   Element".

   "Element Path": The hierarchy of "Parent Element" where the "EBML
   Element" is expected to be found in the "EBML Body".

   "Empty Element": An "EBML Element" that has an "Element Data Size"
   with all "VINT_DATA" bits set to zero, which indicates that the
   "Element Data" of the "Element" is zero octets in length.

3.  Security Considerations

   "EBML" itself does not offer any kind of security and does not
   provide confidentiality. "EBML" does not provide any kind of
   authorization. "EBML" only offers marginally useful and effective
   data integrity options, such as CRC elements.

   Even if the semantic layer offers any kind of encryption, "EBML"
   itself could leak information at both the semantic layer (as declared
   via the "DocType Element") and within the "EBML" structure (the
   presence of "EBML Elements" can be derived even with an unknown
   semantic layer using a heuristic approach; not without errors, of
   course, but with a certain degree of confidence).

   Attacks on an "EBML Reader" could include:

   o  Invalid "Element IDs" that are longer than the limit stated in the
      "EBMLMaxIDLength Element" of the "EBML Header".

   o  Invalid "Element IDs" that are not encoded in the
      shortest-possible way.

   o  Invalid "Element IDs" comprised of reserved values.

   o  Invalid "Element Data Size" values that are longer than the limit
      stated in the "EBMLMaxSizeLength Element" of the "EBML Header".

   o  Invalid "Element Data Size" values (e.g. extending the length of
      the "EBML Element" beyond the scope of the "Parent Element";
      possibly triggering access-out-of-bounds issues).

   o  Very high lengths in order to force out-of-memory situations
      resulting in a denial of service, access-out-of-bounds issues etc.

   o  Missing "EBML Elements" that are mandatory and have no declared
      default value.

   o  Usage of "0x00" octets in "EBML Elements" with a string type.

   o  Usage of invalid UTF-8 encoding in "EBML Elements" of UTF-8 type
      (e.g. in order to trigger access-out-of-bounds or buffer overflow
      issues).

   o  Usage of invalid data in "EBML Elements" with a date type.

   Side channel attacks could exploit:

   o  The semantic equivalence of the same string stored in a "String
      Element" or "UTF-8 Element" with and without zero-bit padding.

   o  The semantic equivalence of "VINT_DATA" within "Element Data Size"
      with two different lengths due to left-padding zero bits.

   o  Data contained within a "Master Element" which is not itself part
      of an "EBML Element".

   o  Extraneous copies of "Identically Recurring Element".

   o  Copies of "Identically Recurring Element" within a "Parent
      Element" that contain invalid "CRC-32 Elements".

   o  Use of "Void Elements".

4.  IANA Considerations

   This document has no IANA actions.

5.  Structure

   "EBML" uses a system of "Elements" to compose an "EBML Document".
   "EBML Elements" incorporate three parts: an "Element ID", an "Element
   Data Size", and "Element Data". The "Element Data", which is
   described by the "Element ID", includes either binary data, one or
   many other "EBML Elements", or both.

6.  Variable Size Integer

   The "Element ID" and "Element Data Size" are both encoded as a
   "Variable Size Integer", developed according to a UTF-8 like system.
   The "Variable Size Integer" is composed of a "VINT_WIDTH",
   "VINT_MARKER", and "VINT_DATA", in that order. "Variable Size
   Integers" MUST left-pad the "VINT_DATA" value with zero bits so that
   the whole "Variable Size Integer" is octet-aligned. "Variable Size
   Integer" will be referred to as "VINT" for shorthand.

6.1.  VINT_WIDTH

   Each "Variable Size Integer" begins with a "VINT_WIDTH" which
   consists of zero or many zero-value bits. The count of consecutive
   zero-values of the "VINT_WIDTH" plus one equals the length in octets
   of the "Variable Size Integer". For example, a "Variable Size
   Integer" that starts with a "VINT_WIDTH" which contains zero
   consecutive zero-value bits is one octet in length and a "Variable
   Size Integer" that starts with one consecutive zero-value bit is two
   octets in length. The "VINT_WIDTH" MUST only contain zero-value bits
   or be empty.

   Within the "EBML Header" the "VINT_WIDTH" MUST NOT exceed three bits
   in length (meaning that the "Variable Size Integer" MUST NOT exceed
   four octets in length). Within the "EBML Body", when a "VINT" is
   used to express an "Element ID", the maximum length allowed for the
   "VINT_WIDTH" is one less than the value set in the "EBMLMaxIDLength
   Element". Within the "EBML Body", when a "VINT" is used to express
   an "Element Data Size", the maximum length allowed for the
   "VINT_WIDTH" is one less than the value set in the "EBMLMaxSizeLength
   Element".

6.2.  VINT_MARKER

   The "VINT_MARKER" serves as a separator between the "VINT_WIDTH" and
   "VINT_DATA". Each "Variable Size Integer" MUST contain exactly one
   "VINT_MARKER". The "VINT_MARKER" MUST be one bit in length and
   contain a bit with a value of one. The first bit with a value of one
   within the "Variable Size Integer" is the "VINT_MARKER".

6.3.  VINT_DATA

   The "VINT_DATA" portion of the "Variable Size Integer" includes all
   data that follows (but not including) the "VINT_MARKER" until end of
   the "Variable Size Integer" whose length is derived from the
   "VINT_WIDTH".
   The bits required for the "VINT_WIDTH" and the
   "VINT_MARKER" combined use one out of eight bits of the total length
   of the "Variable Size Integer". Thus a "Variable Size Integer" of 1
   octet length supplies 7 bits for "VINT_DATA", a 2 octet length
   supplies 14 bits for "VINT_DATA", and a 3 octet length supplies 21
   bits for "VINT_DATA". If the number of bits required for "VINT_DATA"
   is less than the bit size of "VINT_DATA", then "VINT_DATA" SHOULD be
   zero-padded to the left to a size that fits. The "VINT_DATA" value
   MUST be expressed as a big-endian unsigned integer.

6.4.  VINT Examples

   This table shows examples of "Variable Size Integers" with lengths
   from 1 to 5 octets. The Size column gives the number of distinct
   values that the "VINT_DATA" can express, that is, 2 raised to the
   number of "VINT_DATA" bits. The Representation column depicts a
   binary expression of "Variable Size Integers" where "VINT_WIDTH" is
   depicted by '0', the "VINT_MARKER" as '1', and the "VINT_DATA" as 'x'.

   +-------------+------+----------------------------------------------+
   | Octet       | Size | Representation                               |
   | Length      |      |                                              |
   +-------------+------+----------------------------------------------+
   | 1           | 2^7  | 1xxx xxxx                                    |
   | 2           | 2^14 | 01xx xxxx xxxx xxxx                          |
   | 3           | 2^21 | 001x xxxx xxxx xxxx xxxx xxxx                |
   | 4           | 2^28 | 0001 xxxx xxxx xxxx xxxx xxxx xxxx xxxx      |
   | 5           | 2^35 | 0000 1xxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx |
   |             |      | xxxx                                         |
   +-------------+------+----------------------------------------------+

   Data encoded as a "Variable Size Integer" MAY be rendered at octet
   lengths larger than needed to store the data. In this table a binary
   value of "0b10" is shown encoded as different "Variable Size
   Integers" with lengths from one octet to four octets. All four
   encoded examples have identical semantic meaning though the
   "VINT_WIDTH" and the padding of the "VINT_DATA" vary.

   +--------------+--------------+-------------------------------------+
   | Binary Value | Octet Length | As Represented in Variable Size     |
   |              |              | Integer                             |
   +--------------+--------------+-------------------------------------+
   | 10           | 1            | 1000 0010                           |
   | 10           | 2            | 0100 0000 0000 0010                 |
   | 10           | 3            | 0010 0000 0000 0000 0000 0010       |
   | 10           | 4            | 0001 0000 0000 0000 0000 0000 0000  |
   |              |              | 0010                                |
   +--------------+--------------+-------------------------------------+

7.  Element ID

   The "Element ID" MUST be encoded as a "Variable Size Integer". By
   default, "Element IDs" are encoded in lengths from one octet to four
   octets, although "Element IDs" of greater lengths are used if the
   octet length of the longest "Element ID" of the "EBML Document" is
   declared in the "EBMLMaxIDLength Element" of the "EBML Header" (see
   Section 14.2.4). The "VINT_DATA" component of the "Element ID" MUST
   NOT be defined or written as either all zero values or all one
   values. Any "Element ID" with the "VINT_DATA" component set as all
   zero values or all one values MUST be ignored and MUST NOT be
   considered an error in the "EBML Document". The "VINT_DATA"
   component of the "Element ID" MUST be encoded at the shortest valid
   length. For example, an "Element ID" with binary encoding of "1011
   1111" is valid, whereas an "Element ID" with binary encoding of "0100
   0000 0011 1111" stores a semantically equal "VINT_DATA" but is
   invalid because a shorter "VINT" encoding is possible. Additionally,
   an "Element ID" with binary encoding of "1111 1111" is invalid since
   the "VINT_DATA" section is set to all one values, whereas an "Element
   ID" with binary encoding of "0100 0000 0111 1111" stores a
   semantically equal "VINT_DATA" and is the shortest possible "VINT"
   encoding.
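
   As an informal illustration (not part of the format definition), the
   "VINT" layout of Section 6 and the "Element ID" rules above can be
   sketched in Python. The function names "parse_vint" and
   "element_id_status", the returned status strings, and the
   "max_id_length" parameter are hypothetical names chosen for this
   sketch; "VINTs" longer than eight octets are not handled.

```python
def parse_vint(buf: bytes, max_len: int = 8):
    """Split the VINT at the start of buf into (octet_length, vint_data)."""
    if not buf or buf[0] == 0:
        # A first octet of 0x00 would mean a VINT_WIDTH of eight or more
        # bits; this sketch does not handle VINTs longer than eight octets.
        raise ValueError("empty input or unsupported VINT_WIDTH")
    # Number of leading zero bits (VINT_WIDTH) plus one is the octet length.
    length = 9 - buf[0].bit_length()
    if length > max_len or len(buf) < length:
        raise ValueError("VINT too long or truncated")
    raw = int.from_bytes(buf[:length], "big")
    # Each octet contributes 7 bits of VINT_DATA; mask off the VINT_MARKER.
    vint_data = raw & ((1 << (7 * length)) - 1)
    return length, vint_data


def element_id_status(buf: bytes, max_id_length: int = 4) -> str:
    """Classify an encoded Element ID following the rules of Section 7."""
    length, data = parse_vint(buf, max_id_length)
    if data == 0:
        return "invalid: VINT_DATA all 0"
    if data == (1 << (7 * length)) - 1:
        return "invalid: VINT_DATA all 1"
    # A shorter encoding exists when the value fits in one octet less
    # without becoming the all-ones pattern of that shorter length.
    if length > 1 and data < (1 << (7 * (length - 1))) - 1:
        return "invalid: shorter encoding available"
    return "valid"


for encoded in (b"\xbf", b"\x40\x3f", b"\xff", b"\x40\x7f"):
    print(encoded.hex(), element_id_status(encoded))
```

   The four inputs in the loop are the binary encodings discussed in the
   paragraph above ("1011 1111", "0100 0000 0011 1111", "1111 1111", and
   "0100 0000 0111 1111").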

   The following table details these specific examples further:

   +------------+-------------+----------------+-----------------------+
   | VINT_WIDTH | VINT_MARKER | VINT_DATA      | Element ID Status     |
   +------------+-------------+----------------+-----------------------+
   |            | 1           | 0000000        | Invalid: "VINT_DATA"  |
   |            |             |                | MUST NOT be set to    |
   |            |             |                | all 0                 |
   | 0          | 1           | 00000000000000 | Invalid: "VINT_DATA"  |
   |            |             |                | MUST NOT be set to    |
   |            |             |                | all 0                 |
   |            | 1           | 0000001        | Valid                 |
   | 0          | 1           | 00000000000001 | Invalid: A shorter    |
   |            |             |                | "VINT_DATA" encoding  |
   |            |             |                | is available.         |
   |            | 1           | 0111111        | Valid                 |
   | 0          | 1           | 00000000111111 | Invalid: A shorter    |
   |            |             |                | "VINT_DATA" encoding  |
   |            |             |                | is available.         |
   |            | 1           | 1111111        | Invalid: "VINT_DATA"  |
   |            |             |                | MUST NOT be set to    |
   |            |             |                | all 1                 |
   | 0          | 1           | 00000001111111 | Valid                 |
   +------------+-------------+----------------+-----------------------+

   The octet length of an "Element ID" determines its "EBML Class".

   +------------+--------------+--------------------------------+
   | EBML Class | Octet Length | Number of Possible Element IDs |
   +------------+--------------+--------------------------------+
   | Class A    | 1            | 2^7 - 2 = 126                  |
   | Class B    | 2            | 2^14 - 2^7 - 1 = 16,255        |
   | Class C    | 3            | 2^21 - 2^14 - 1 = 2,080,767    |
   | Class D    | 4            | 2^28 - 2^21 - 1 = 266,338,303  |
   +------------+--------------+--------------------------------+

8.  Element Data Size

   The "Element Data Size" expresses the length in octets of "Element
   Data". The "Element Data Size" itself MUST be encoded as a "Variable
   Size Integer".
   By default, "Element Data Sizes" can be encoded in
   lengths from one octet to eight octets, although "Element Data Sizes"
   of greater lengths MAY be used if the octet length of the longest
   "Element Data Size" of the "EBML Document" is declared in the
   "EBMLMaxSizeLength Element" of the "EBML Header" (see
   Section 14.2.5). Unlike the "VINT_DATA" of the "Element ID", the
   "VINT_DATA" component of the "Element Data Size" is not mandated to
   be encoded at the shortest valid length. For example, an "Element
   Data Size" with binary encoding of "1011 1111" or a binary encoding
   of "0100 0000 0011 1111" are both valid "Element Data Sizes" and both
   store a semantically equal value (both "0b00000000111111" and
   "0b0111111", the "VINT_DATA" sections of the examples, represent the
   integer 63).

   Although an "Element ID" with all "VINT_DATA" bits set to zero is
   invalid, an "Element Data Size" with all "VINT_DATA" bits set to zero
   is allowed for "EBML Element Types" which do not mandate a non-zero
   length (see Section 9). An "Element Data Size" with all "VINT_DATA"
   bits set to zero indicates that the "Element Data" is zero octets in
   length. Such an "EBML Element" is referred to as an "Empty Element".
   If an "Empty Element" has a "default" value declared then the "EBML
   Reader" MUST interpret the value of the "Empty Element" as the
   "default" value. If an "Empty Element" has no "default" value
   declared then the "EBML Reader" MUST interpret the value of the
   "Empty Element" as defined as part of the definition of the
   corresponding "EBML Element Type" associated with the "Element ID".

   An "Element Data Size" with all "VINT_DATA" bits set to one is
   reserved as an indicator that the size of the "EBML Element" is
   unknown. The only reserved value for the "VINT_DATA" of "Element
   Data Size" is all bits set to one.
   An "EBML Element" with an unknown
   "Element Data Size" is referred to as an "Unknown-Sized Element".
   Only "Master Elements" SHALL be "Unknown-Sized Elements". "Master
   Elements" MUST NOT use an unknown size unless the
   "unknownsizeallowed" attribute of their "EBML Schema" is set to true
   (see Section 14.1.4.10). The use of "Unknown-Sized Elements" allows
   for an "EBML Element" to be written and read before the size of the
   "EBML Element" is known. "Unknown-Sized Elements" MUST NOT be used or
   defined unnecessarily; however if the "Element Data Size" is not
   known before the "Element Data" is written, such as in some cases of
   data streaming, then "Unknown-Sized Elements" MAY be used. The end
   of an "Unknown-Sized Element" is determined by whichever comes first:
   the end of the file or the beginning of the next "EBML Element",
   defined by this document or the corresponding "EBML Schema", that is
   not independently valid as a "Descendant Element" of the
   "Unknown-Sized Element".

   For "Element Data Sizes" encoded at octet lengths from one to eight,
   this table depicts the range of possible values that can be encoded
   as an "Element Data Size". An "Element Data Size" with an octet
   length of 8 is able to express a size of 2^56-2 or
   72,057,594,037,927,934 octets (or about 72 petabytes). The maximum
   possible value that can be stored as "Element Data Size" is referred
   to as "VINTMAX".

   +--------------+----------------------+
   | Octet Length | Possible Value Range |
   +--------------+----------------------+
   | 1            | 0 to 2^7-2           |
   | 2            | 0 to 2^14-2          |
   | 3            | 0 to 2^21-2          |
   | 4            | 0 to 2^28-2          |
   | 5            | 0 to 2^35-2          |
   | 6            | 0 to 2^42-2          |
   | 7            | 0 to 2^49-2          |
   | 8            | 0 to 2^56-2          |
   +--------------+----------------------+

   If the length of "Element Data" equals "2^(n*7)-1" then the octet
   length of the "Element Data Size" MUST be at least "n+1".
   This rule
   prevents an "Element Data Size" from being expressed as a reserved
   value. For example, an "EBML Element" with an octet length of 127
   MUST NOT be encoded in an "Element Data Size" encoding with a one
   octet length. The following table clarifies this rule by showing a
   valid and invalid expression of an "Element Data Size" with a
   "VINT_DATA" of 127 (which is equal to 2^(1*7)-1).

   +------------+-------------+----------------+-----------------------+
   | VINT_WIDTH | VINT_MARKER | VINT_DATA      | Element Data Size     |
   |            |             |                | Status                |
   +------------+-------------+----------------+-----------------------+
   |            | 1           | 1111111        | Reserved (meaning     |
   |            |             |                | Unknown)              |
   | 0          | 1           | 00000001111111 | Valid (meaning 127    |
   |            |             |                | octets)               |
   +------------+-------------+----------------+-----------------------+

9.  EBML Element Types

   "EBML Elements" are defined by an "EBML Schema" which MUST declare
   one of the following "EBML Element Types" for each "EBML Element".
   An "EBML Element Type" defines a concept of storing data within an
   "EBML Element" that describes such characteristics as length,
   endianness, and definition.

   "EBML Elements" which are defined as a "Signed Integer Element",
   "Unsigned Integer Element", "Float Element", or "Date Element" use
   big endian storage.

9.1.  Signed Integer Element

   A "Signed Integer Element" MUST declare a length from zero to eight
   octets. If the "EBML Element" is not defined to have a "default"
   value, then a "Signed Integer Element" with a zero-octet length
   represents an integer value of zero.

   A "Signed Integer Element" stores an integer (meaning that it can be
   written without a fractional component) which could be negative,
   positive, or zero. Signed Integers MUST be stored with two's
   complement notation with the leftmost bit being the sign bit.
   Because "EBML" limits Signed Integers to 8 octets in length a "Signed
   Integer Element" stores a number from -9,223,372,036,854,775,808 to
   +9,223,372,036,854,775,807.

9.2.  Unsigned Integer Element

   An "Unsigned Integer Element" MUST declare a length from zero to
   eight octets. If the "EBML Element" is not defined to have a
   "default" value, then an "Unsigned Integer Element" with a zero-octet
   length represents an integer value of zero.

   An "Unsigned Integer Element" stores an integer (meaning that it can
   be written without a fractional component) which could be positive or
   zero. Because "EBML" limits Unsigned Integers to 8 octets in length
   an "Unsigned Integer Element" stores a number from 0 to
   18,446,744,073,709,551,615.

9.3.  Float Element

   A "Float Element" MUST declare a length of either zero octets (0
   bit), four octets (32 bit) or eight octets (64 bit). If the "EBML
   Element" is not defined to have a "default" value, then a "Float
   Element" with a zero-octet length represents a numerical value of
   zero.

   A "Float Element" stores a floating-point number as defined in
   [IEEE.754.1985].

9.4.  String Element

   A "String Element" MUST declare a length in octets from zero to
   "VINTMAX". If the "EBML Element" is not defined to have a "default"
   value, then a "String Element" with a zero-octet length represents an
   empty string.

   A "String Element" MUST either be empty (zero-length) or contain
   printable ASCII characters [RFC0020] in the range of "0x20" to
   "0x7E", with an exception made for termination (see Section 10).

9.5.  UTF-8 Element

   A "UTF-8 Element" MUST declare a length in octets from zero to
   "VINTMAX". If the "EBML Element" is not defined to have a "default"
   value, then a "UTF-8 Element" with a zero-octet length represents an
   empty string.
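
   As an informal aside, the big-endian storage and zero-length rules
   given for the numeric types in Sections 9.1 through 9.3 can be
   sketched in Python as follows. The function name
   "decode_numeric_element" and the type labels "integer", "uinteger",
   and "float" are assumptions made for this sketch, and it only covers
   the case where no "default" value is declared.

```python
import struct


def decode_numeric_element(element_type: str, data: bytes):
    """Decode the Element Data of a numeric element with no declared default.

    element_type is a hypothetical label: "integer" (Signed Integer
    Element), "uinteger" (Unsigned Integer Element), or "float".
    """
    size = len(data)
    if element_type in ("integer", "uinteger"):
        if size > 8:
            raise ValueError("integer elements are at most eight octets")
        # Big-endian; two's complement for signed values.
        # Zero octets decode to the integer zero.
        return int.from_bytes(data, "big", signed=(element_type == "integer"))
    if element_type == "float":
        if size == 0:
            return 0.0
        if size == 4:
            return struct.unpack(">f", data)[0]  # 32-bit IEEE 754
        if size == 8:
            return struct.unpack(">d", data)[0]  # 64-bit IEEE 754
        raise ValueError("float elements are zero, four, or eight octets")
    raise ValueError("unsupported element type in this sketch")


print(decode_numeric_element("integer", b"\xff\x7f"))
print(decode_numeric_element("uinteger", b"\xff\x7f"))
print(decode_numeric_element("float", b""))
```

   The same two octets decode to different numbers depending on whether
   the element is declared signed or unsigned, which is why the type has
   to come from the "EBML Schema" rather than from the data itself.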

   A "UTF-8 Element" contains only a valid Unicode string as defined in
   [RFC3629], with an exception made for termination (see Section 10).

9.6.  Date Element

   A "Date Element" MUST declare a length of either zero octets or eight
   octets.  If the "EBML Element" is not defined to have a "default"
   value, then a "Date Element" with a zero-octet length represents a
   timestamp of 2001-01-01T00:00:00.000000000 UTC [RFC3339].

   The "Date Element" stores an integer in the same format as the
   "Signed Integer Element" that expresses a point in time referenced in
   nanoseconds from the precise beginning of the third millennium of the
   Gregorian Calendar in Coordinated Universal Time (also known as
   2001-01-01T00:00:00.000000000 UTC).  Because a signed 8-octet integer
   spans -2^63 to 2^63-1 nanoseconds around that reference, this
   provides a possible expression of time from
   1708-09-22T00:12:43.145224192 UTC to 2293-04-11T23:47:16.854775807
   UTC.

9.7.  Master Element

   A "Master Element" MUST declare a length in octets from zero to
   "VINTMAX".  The "Master Element" MAY also use an unknown length.  See
   Section 8 for rules that apply to elements of unknown length.

   The "Master Element" contains zero, one, or many other elements.
   "EBML Elements" contained within a "Master Element" MUST have the
   "EBMLParentPath" of their "Element Path" equal to the
   "EBMLReferencePath" of the "Master Element" "Element Path" (see
   Section 14.1.4.2).  "Element Data" stored within "Master Elements"
   SHOULD only consist of "EBML Elements" and SHOULD NOT contain any
   data that is not part of an "EBML Element".  When "EBML" is used in
   transmission or streaming, data that is not part of an "EBML Element"
   is permitted to be present within a "Master Element" if
   "unknownsizeallowed" is enabled within the definition for that
   "Master Element".
   In this case, the "EBML Reader" should skip data
   until a valid "Element ID" of the same "EBMLParentPath" or the next
   upper level "Element Path" of the "Master Element" is found.  What
   "Element IDs" are considered valid within a "Master Element" is
   identified by the "EBML Schema" for that version of the "EBML
   Document Type".  Any data contained within a "Master Element" that is
   not part of a "Child Element" MUST be ignored.

9.8.  Binary Element

   A "Binary Element" MUST declare a length in octets from zero to
   "VINTMAX".

   The contents of a "Binary Element" should not be interpreted by the
   "EBML Reader".

10.  Terminating Elements

   "Null Octets", which are octets with all bits set to zero, MAY follow
   the value of a "String Element" or "UTF-8 Element" to serve as a
   terminator.  An "EBML Writer" MAY terminate a "String Element" or
   "UTF-8 Element" with "Null Octets" in order to overwrite a stored
   value with a new value of lesser length while maintaining the same
   "Element Data Size" (this can prevent the need to rewrite large
   portions of an "EBML Document"); otherwise the use of "Null Octets"
   within a "String Element" or "UTF-8 Element" is NOT RECOMMENDED.  An
   "EBML Reader" MUST consider the value of the "String Element" or
   "UTF-8 Element" to be terminated upon the first read "Null Octet" and
   MUST ignore any data following the first "Null Octet" within that
   "Element".  A string value and a copy of that string value terminated
   by one or more "Null Octets" are semantically equal.

   The following table shows examples of semantics and validation for
   the use of "Null Octets".  Both the "Stored Value" and the "Semantic
   Meaning" are expressed as hexadecimal values.

   +---------------------+---------------------+
   | Stored Value        | Semantic Meaning    |
   +---------------------+---------------------+
   | 0x65 0x62 0x6d 0x6c | 0x65 0x62 0x6d 0x6c |
   | 0x65 0x62 0x00 0x6c | 0x65 0x62           |
   | 0x65 0x62 0x00 0x00 | 0x65 0x62           |
   | 0x65 0x62           | 0x65 0x62           |
   +---------------------+---------------------+

11.  Guidelines for Updating Elements

   An "EBML Document" can be updated without requiring that the entire
   "EBML Document" be rewritten.  These recommendations describe
   strategies to change the "Element Data" of a written "EBML Element"
   with minimal disruption to the rest of the "EBML Document".

11.1.  Reducing Element Data in Size

   There are three methods to reduce the size of "Element Data" of a
   written "EBML Element".

11.1.1.  Adding a Void Element

   When an "EBML Element" is changed to reduce its total length by more
   than one octet, an "EBML Writer" SHOULD fill the freed space with a
   "Void Element".

11.1.2.  Extending the Element Data Size

   The same value for "Element Data Size" MAY be written in variable
   lengths, so for minor reductions in octet length the "Element Data
   Size" MAY be written to a longer octet length to fill the freed
   space.

   For example, the first row of the following table depicts a "String
   Element" that stores an "Element ID" (3 octets), "Element Data Size"
   (1 octet), and "Element Data" (4 octets).  If the "Element Data" is
   changed to reduce the length by one octet and if the current length
   of the "Element Data Size" is less than its maximum permitted length,
   then the "Element Data Size" of that "Element" MAY be rewritten to
   increase its length by one octet.  Thus before and after the change
   the "EBML Element" maintains the same length of 8 octets and data
   around the "Element" does not need to be moved.

   +-------------+------------+-------------------+--------------+
   | Status      | Element ID | Element Data Size | Element Data |
   +-------------+------------+-------------------+--------------+
   | Before edit | 0x3B4040   | 0x84              | 0x65626d6c   |
   | After edit  | 0x3B4040   | 0x4003            | 0x6d6b76     |
   +-------------+------------+-------------------+--------------+

   This method is only RECOMMENDED for reducing "Element Data" by a
   single octet; for reductions by two or more octets it is RECOMMENDED
   to fill the freed space with a "Void Element".

   Note that if the "Element Data" needs to be shortened by one octet
   and the "Element Data Size" could be rewritten as a shorter "VINT",
   then it is RECOMMENDED to rewrite the "Element Data Size" as one
   octet shorter, shorten the "Element Data" by one octet, and follow
   that "Element" with a "Void Element".  For example, the following
   table depicts a "String Element" that stores an "Element ID" (3
   octets), "Element Data Size" (2 octets, but could be rewritten in one
   octet), and "Element Data" (3 octets).  If the "Element Data" is to
   be rewritten to a two octet length, then another octet can be taken
   from "Element Data Size" so that there is enough space to add a two
   octet "Void Element".

   +--------+------------+-----------------+-------------+-------------+
   | Status | Element ID | Element Data    | Element     | Void        |
   |        |            | Size            | Data        | Element     |
   +--------+------------+-----------------+-------------+-------------+
   | Before | 0x3B4040   | 0x4003          | 0x6d6b76    |             |
   | After  | 0x3B4040   | 0x82            | 0x6869      | 0xEC80      |
   +--------+------------+-----------------+-------------+-------------+

11.1.3.  Terminating Element Data

   For "String Elements" and "UTF-8 Elements" the length of "Element
   Data" MAY be reduced by adding "Null Octets" to terminate the
   "Element Data" (see Section 10).
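
   The termination rule of Section 10 and the padding method described
   above can be sketched in Python as follows.  Both function names are
   hypothetical and chosen only for this illustration; the sketch covers
   the octet handling alone and does not validate the character
   repertoire of the string.

```python
def semantic_value(element_data: bytes) -> bytes:
    """Return the stored value up to, but excluding, the first Null Octet."""
    terminator = element_data.find(b"\x00")
    if terminator == -1:
        return element_data
    # Everything from the first Null Octet onward is ignored by a reader.
    return element_data[:terminator]


def pad_with_null_octets(old_data: bytes, new_value: bytes) -> bytes:
    """Pad a shorter value so that the Element Data Size stays unchanged."""
    if len(new_value) > len(old_data):
        raise ValueError("new value does not fit in the existing Element Data")
    return new_value + b"\x00" * (len(old_data) - len(new_value))
```

   For instance, padding the three-octet value "mkv" into the space of
   the four-octet value "ebml" yields 0x6d6b7600, and reading that data
   back yields 0x6d6b76.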

   In the following table, a four octet long "Element Data" is changed
   to a three octet long value followed by a "Null Octet"; the "Element
   Data Size" includes any "Null Octets" used to terminate "Element
   Data" and so remains unchanged.

   +-------------+------------+-------------------+--------------+
   | Status      | Element ID | Element Data Size | Element Data |
   +-------------+------------+-------------------+--------------+
   | Before edit | 0x3B4040   | 0x84              | 0x65626d6c   |
   | After edit  | 0x3B4040   | 0x84              | 0x6d6b7600   |
   +-------------+------------+-------------------+--------------+

   Note that this method is NOT RECOMMENDED.  For reductions of one
   octet, the method for "Extending the Element Data Size" SHOULD be
   used.  For reduction by more than one octet, the method for "Adding a
   Void Element" SHOULD be used.

11.2.  Considerations when Updating Elements with CRC

   If the "Element" to be changed is a "Descendant Element" of any
   "Master Element" that contains a "CRC-32 Element" then the "CRC-32
   Element" MUST be verified before permitting the change.  Additionally
   the "CRC-32 Element" value MUST be subsequently updated to reflect
   the changed data.

12.  EBML Document

   An "EBML Document" is composed of only two components, an "EBML
   Header" and an "EBML Body".  An "EBML Document" MUST start with an
   "EBML Header" that declares significant characteristics of the entire
   "EBML Body".  An "EBML Document" consists of "EBML Elements" and MUST
   NOT contain any data that is not part of an "EBML Element".

12.1.  EBML Header

   The "EBML Header" is a declaration that provides processing
   instructions and identification of the "EBML Body".  The "EBML
   Header" of an "EBML Document" is analogous to the XML Declaration of
   an XML Document.

   The "EBML Header" documents the "EBML Schema" (also known as the
   "EBML DocType") that is used to semantically interpret the structure
   and meaning of the "EBML Document".  Additionally the "EBML Header"
   documents the versions of both "EBML" and the "EBML Schema" that were
   used to write the "EBML Document" and the versions required to read
   the "EBML Document".

   The "EBML Header" MUST contain a single "Master Element" with an
   "Element Name" of "EBML" and "Element ID" of "0x1A45DFA3" (see
   Section 14.2.1) and any number of additional "EBML Elements" within
   it.  The "EBML Header" MUST only contain "EBML Elements" that are
   defined as part of this document.

   All "EBML Elements" within the "EBML Header" MUST NOT use any
   "Element ID" with a length greater than 4 octets.  All "EBML
   Elements" within the "EBML Header" MUST NOT use any "Element Data
   Size" with a length greater than 4 octets.

12.2.  EBML Body

   All data of an "EBML Document" following the "EBML Header" is the
   "EBML Body".  The end of the "EBML Body", as well as the end of the
   "EBML Document" that contains the "EBML Body", is considered as
   whichever comes first: the beginning of a new "EBML Header" at the
   "Root Level" or the end of the file.  The "EBML Body" MUST consist
   only of "EBML Elements" and MUST NOT contain any data that is not
   part of an "EBML Element".  This document defines precisely what
   "EBML Elements" are to be used within the "EBML Header", but does not
   name or define what "EBML Elements" are to be used within the "EBML
   Body".  The definition of what "EBML Elements" are to be used within
   the "EBML Body" is defined by an "EBML Schema".

13.  EBML Stream

   An "EBML Stream" is a file that consists of one or many "EBML
   Documents" that are concatenated together.  An occurrence of an "EBML
   Header" at the "Root Level" marks the beginning of an "EBML
   Document".

14.  Elements semantic

14.1.  EBML Schema

   An "EBML Schema" is an XML Document that defines the properties,
   arrangement, and usage of "EBML Elements" that compose a specific
   "EBML Document Type".  The relationship of an "EBML Schema" to an
   "EBML Document" may be considered analogous to the relationship of an
   XML Schema [W3C.REC-xmlschema-0-20010502] to an XML Document
   [W3C.REC-xml-20081126].  An "EBML Schema" MUST be clearly associated
   with one or many "EBML Document Types".  An "EBML Schema" must be
   expressed as well-formed XML.  An "EBML Document Type" is identified
   by a string stored within the "EBML Header" in the "DocType Element";
   for example "matroska" or "webm" (see Section 14.2.6).  The "DocType"
   value for an "EBML Document Type" SHOULD be unique and persistent.

   An "EBML Schema" MUST declare exactly one "EBML Element" at "Root
   Level" (referred to as the "Root Element") that MUST occur exactly
   once within an "EBML Document".  The "Void Element" MAY also occur at
   "Root Level" but is not considered to be a "Root Element" (see
   Section 14.3.2).

   The "EBML Schema" MUST document all Elements of the "EBML Body".  The
   "EBML Schema" does not document "Global Elements" that are defined by
   this document (namely the "Void Element" and the "CRC-32 Element").

   An "EBML Schema" MAY constrain the use of "EBML Header Elements" (see
   Section 14.2) by adding or constraining that Element's "range"
   attribute.  For example, an "EBML Schema" MAY constrain the
   "EBMLMaxSizeLength" to a maximum value of "8" or MAY constrain the
   "EBMLVersion" to only support a value of "1".  If an "EBML Schema"
   adopts the "EBML Header Element" as-is, then it is not REQUIRED to
   document that Element within the "EBML Schema".  If an "EBML Schema"
   constrains the range of an "EBML Header Element", then that "Element"
   MUST be documented within an "<element>" node of the "EBML Schema".
   This document provides an example of an "EBML Schema", see
   Section 14.1.11.

14.1.1.  <EBMLSchema> Element

   As an XML Document, the "EBML Schema" MUST use "<EBMLSchema>" as the
   top level element.  The "<EBMLSchema>" element MAY contain
   "<element>" sub-elements.

14.1.2.  <EBMLSchema> Attributes

   Within an "EBML Schema" the "<EBMLSchema>" element uses the following
   attributes:

14.1.2.1.  docType

   The "docType" lists the official name of the "EBML Document Type"
   that is defined by the "EBML Schema"; for example, "<EBMLSchema
   docType="matroska">".

   The "docType" attribute is REQUIRED within the "<EBMLSchema>"
   Element.

14.1.2.2.  version

   The "version" lists an incremental non-negative integer that
   specifies the version of the docType documented by the "EBML Schema".
   Unlike XML Schemas, an "EBML Schema" documents all versions of a
   docType's definition rather than using separate "EBML Schemas" for
   each version of a "docType".  "EBML Elements" may be introduced and
   deprecated by using the "minver" and "maxver" attributes of
   "<element>".

   The "version" attribute is REQUIRED within the "<EBMLSchema>"
   Element.

14.1.3.  <element> Element

   Each "<element>" defines one "EBML Element" through the use of
   several attributes that are defined in Section 14.1.4.  "EBML
   Schemas" MAY contain additional attributes to extend the semantics
   but MUST NOT conflict with the definitions of the "<element>"
   attributes defined within this document.

   The "<element>" nodes contain a description of the meaning and use of
   the "EBML Element" stored within one or many "<documentation>" sub-
   elements and zero or one "<restriction>" sub-element.  All
   "<element>" nodes MUST be sub-elements of the "<EBMLSchema>".

14.1.4.  <element> Attributes

   Within an "EBML Schema" the "<element>" uses the following attributes
   to define an "EBML Element":

14.1.4.1.  name

   The "name" provides the official human-readable name of the "EBML
   Element".  The value of the name MUST be in the form of characters
   "A" to "Z", "a" to "z", "0" to "9", "-" and ".".

   The "name" attribute is REQUIRED.

14.1.4.2.  path

   The path defines the allowed storage locations of the "EBML Element"
   within an "EBML Document".  This path MUST be defined with the full
   hierarchy of "EBML Elements" separated with a "\".  The top "EBML
   Element" in the path hierarchy is the first in the value.  The
   syntax of the "path" attribute is defined using this Augmented
   Backus-Naur Form (ABNF) [RFC5234] with the case sensitive update
   [RFC7405] notation:

   The "path" attribute is REQUIRED.

   EBMLFullPath           = EBMLElementOccurrence "(" EBMLReferencePath ")"
   EBMLReferencePath      = [EBMLParentPath] EBMLElementPath
   EBMLParentPath         = EBMLFixedParent EBMLLastParent
   EBMLFixedParent        = *(EBMLPathAtom)
   EBMLElementPath        = EBMLPathAtom / EBMLPathAtomRecursive
   EBMLPathAtom           = PathDelimiter EBMLAtomName
   EBMLPathAtomRecursive  = "(1*(" EBMLPathAtom "))"
   EBMLLastParent         = EBMLPathAtom / EBMLVariableParent
   EBMLVariableParent     = "(" VariableParentOccurrence "\)"
   EBMLAtomName           = 1*(EBMLNameChar)
   EBMLNameChar           = ALPHA / DIGIT / "-" / "."
   PathDelimiter          = "\"
   EBMLElementOccurrence  = [EBMLMinOccurrence] "*" [EBMLMaxOccurrence]
   EBMLMinOccurrence      = 1*DIGIT
   EBMLMaxOccurrence      = 1*DIGIT
   VariableParentOccurrence = [PathMinOccurrence] "*" [PathMaxOccurrence]
   PathMinOccurrence      = 1*DIGIT
   PathMaxOccurrence      = 1*DIGIT

   The ""*"", ""("" and "")"" symbols MUST be interpreted as they are
   defined in the ABNF.

   The "EBMLPathAtom" part of the "EBMLElementPath" MUST be equal to the
   "name" attribute of the "EBML Schema".

   The starting "PathDelimiter" of the path corresponds to the root of
   the "EBML Document".

   The "EBMLElementOccurrence" part is interpreted as an ABNF Variable
   Repetition.  The repetition amounts correspond to how many times the
   "EBML Element" can be found in its "Parent Element".

   The "EBMLMinOccurrence" represents the minimum number of occurrences
   of this "EBML Element" within its "Parent Element".
   Each instance of
   the "Parent Element" MUST contain at least this many instances of
   this "EBML Element".  If the "EBML Element" has an empty
   "EBMLParentPath" then "EBMLMinOccurrence" refers to constraints on
   the occurrence of the "EBML Element" within the "EBML Document".  If
   "EBMLMinOccurrence" is not present then that "EBML Element" is
   considered to have an "EBMLMinOccurrence" value of 0.  The semantic
   meaning of "EBMLMinOccurrence" within an "EBML Schema" is considered
   analogous to the meaning of "minOccurs" within an "XML Schema".
   "EBML Elements" with "EBMLMinOccurrence" set to "1" that also have a
   "default" value (see Section 14.1.4.8) declared are not REQUIRED to
   be stored but are REQUIRED to be interpreted, see Section 14.1.15.
   An "EBML Element" defined with an "EBMLMinOccurrence" value greater
   than zero is called a "Mandatory EBML Element".

   The "EBMLMaxOccurrence" represents the maximum number of occurrences
   of this "EBML Element" within its "Parent Element".  Each instance of
   the "Parent Element" MUST contain at most this many instances of this
   "EBML Element".  If the "EBML Element" has an empty "EBMLParentPath"
   then "EBMLMaxOccurrence" refers to constraints on the occurrence of
   the "EBML Element" within the "EBML Document".  If
   "EBMLMaxOccurrence" is not present then that "EBML Element" is
   considered to have no maximum occurrence.  The semantic meaning of
   "EBMLMaxOccurrence" within an "EBML Schema path" is considered
   analogous to the meaning of "maxOccurs" within an "XML Schema".

   The "VariableParentOccurrence" part is interpreted as an ABNF
   Variable Repetition.  The repetition amounts correspond to the amount
   of unspecified "Parent Element" levels there can be between the
   "EBMLFixedParent" and the actual "EBMLElementPath".
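
   The occurrence defaults described above can be illustrated with a
   small Python sketch that splits an "EBMLFullPath" string into its
   "EBMLElementOccurrence" bounds and its "EBMLReferencePath".  This is
   not a full ABNF parser: the regular expression, the "PathOccurrence"
   type, and the "split_full_path" function are assumptions made for
   this example, and the "EBMLReferencePath" is returned unparsed.

```python
import re
from typing import NamedTuple, Optional


class PathOccurrence(NamedTuple):
    min_occurrence: int            # 0 when EBMLMinOccurrence is absent
    max_occurrence: Optional[int]  # None when there is no maximum
    reference_path: str            # unparsed EBMLReferencePath


# [EBMLMinOccurrence] "*" [EBMLMaxOccurrence] "(" EBMLReferencePath ")"
_FULL_PATH = re.compile(r"(\d*)\*(\d*)\((.*)\)")


def split_full_path(path: str) -> PathOccurrence:
    match = _FULL_PATH.fullmatch(path)
    if match is None:
        raise ValueError("not an EBMLFullPath: %r" % path)
    min_text, max_text, reference_path = match.groups()
    return PathOccurrence(
        int(min_text) if min_text else 0,
        int(max_text) if max_text else None,
        reference_path,
    )
```

   Applied to the path of the "DocType Element" given in Section 14.2.6,
   "1*1(\EBML\DocType)", the sketch reports a minimum and maximum
   occurrence of 1; applied to the path of the "Void Element" in
   Section 14.3.2, "*((*\)\Void)", it reports a minimum of 0 and no
   maximum.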

   If the path contains an "EBMLPathAtomRecursive" part, the "EBML
   Element" can occur within itself recursively (see
   Section 14.1.4.11).

14.1.4.3.  id

   The "Element ID" encoded as a "Variable Size Integer" expressed in
   hexadecimal notation prefixed by a "0x" that is read and stored in
   big-endian order.  To reduce the risk of false positives while
   parsing "EBML Streams", the "Element IDs" of the "Root Element" and
   "Top-Level Elements" SHOULD be at least 4 octets in length.  "Element
   IDs" defined for use at "Root Level" or directly under the "Root
   Level" MAY use shorter octet lengths to facilitate padding and
   optimize edits to "EBML Documents"; for instance, the "Void Element"
   uses an "Element ID" with a one octet length to allow its usage in
   more writing and editing scenarios.

   The "id" attribute is REQUIRED.

14.1.4.4.  minOccurs

   An integer expressing the minimum number of occurrences of this "EBML
   Element" within its "Parent Element".  The "minOccurs" value MUST be
   equal to the "EBMLMinOccurrence" value of the "path".

   The "minOccurs" attribute is OPTIONAL.  If the "minOccurs" attribute
   is not present then that "EBML Element" is considered to have a
   "minOccurs" value of 0.

14.1.4.5.  maxOccurs

   An integer expressing the maximum number of occurrences of this "EBML
   Element" within its "Parent Element".  The "maxOccurs" value MUST be
   equal to the "EBMLMaxOccurrence" value of the "path".

   The "maxOccurs" attribute is OPTIONAL.  If the "maxOccurs" attribute
   is not present then that "EBML Element" is considered to have no
   maximum occurrence, similar to "unbounded" in the XML world.

14.1.4.6.  range

   A numerical range for "EBML Elements" which are of numerical types
   (Unsigned Integer, Signed Integer, Float, and Date).  If specified
   the value of the "EBML Element" MUST be within the defined range.
   See Section 14.1.13 for rules applied to expression of range values.

   The "range" attribute is OPTIONAL.  If the "range" attribute is not
   present then any value legal for the "type" attribute is valid.

14.1.4.7.  size

   A value to express the valid length of the "Element Data" as written
   measured in octets.  The "size" provides a constraint in addition to
   the Length value of the definition of the corresponding "EBML Element
   Type".  This "size" MUST be expressed as either a non-negative
   integer or a range (see Section 14.1.13) that consists of only non-
   negative integers and valid operators.

   The "size" attribute is OPTIONAL.  If the "size" attribute is not
   present for that "EBML Element" then that "EBML Element" is only
   limited in size by the definition of the associated "EBML Element
   Type".

14.1.4.8.  default

   If an "Element" is mandatory (has an "EBMLMinOccurrence" value greater
   than zero) but not written within its "Parent Element" or stored as
   an "Empty Element", then the "EBML Reader" of the "EBML Document"
   MUST semantically interpret the "EBML Element" as present with this
   specified default value for the "EBML Element".  "EBML Elements" that
   are "Master Elements" MUST NOT declare a "default" value.  "EBML
   Elements" with a "minOccurs" value greater than 1 MUST NOT declare a
   "default" value.

   The "default" attribute is OPTIONAL.

14.1.4.9.  type

   The "type" MUST be set to one of the following values: 'integer'
   (signed integer), 'uinteger' (unsigned integer), 'float', 'string',
   'date', 'utf-8', 'master', or 'binary'.  The content of each "type"
   is defined within Section 9.

   The "type" attribute is REQUIRED.

14.1.4.10.  unknownsizeallowed

   A boolean to express if an "EBML Element" MAY be used as an "Unknown-
   Sized Element" (having all "VINT_DATA" bits of "Element Data Size"
   set to 1).  "EBML Elements" that are not "Master Elements" MUST NOT
   set "unknownsizeallowed" to true.  An "EBML Element" that is defined
   with an "unknownsizeallowed" attribute set to 1 MUST also have the
   "unknownsizeallowed" attribute of its "Parent Element" set to 1.

   The "unknownsizeallowed" attribute is OPTIONAL.  If the
   "unknownsizeallowed" attribute is not used then that "EBML Element"
   is not allowed to use an unknown "Element Data Size".

14.1.4.11.  recursive

   A boolean to express if an "EBML Element" MAY be stored recursively.
   In this case the "EBML Element" MAY be stored within another "EBML
   Element" that has the same "Element ID", which itself can be stored
   in an "EBML Element" that has the same "Element ID", and so on.
   "EBML Elements" that are not "Master Elements" MUST NOT set
   "recursive" to true.

   If the "path" contains an "EBMLPathAtomRecursive" part then the
   "recursive" value MUST be true and false otherwise.

   The "recursive" attribute is OPTIONAL.  If the "recursive" attribute
   is not present then the "EBML Element" MUST NOT be used recursively.

14.1.4.12.  minver

   The "minver" (minimum version) attribute stores a non-negative
   integer that represents the first version of the "docType" to support
   the "EBML Element".

   The "minver" attribute is OPTIONAL.  If the "minver" attribute is not
   present, then the "EBML Element" has a minimum version of "1".

14.1.4.13.  maxver

   The "maxver" (maximum version) attribute stores a non-negative
   integer that represents the last or most recent version of the
   "docType" to support the element.  "maxver" MUST be greater than or
   equal to "minver".

   The "maxver" attribute is OPTIONAL.  If the "maxver" attribute is not
   present then the "EBML Element" has a maximum version equal to the
   value stored in the "version" attribute of "<EBMLSchema>".

14.1.5.  <documentation> Element

   The "<documentation>" element provides additional information about
   the "EBML Element".

14.1.6.  <documentation> Attributes

14.1.6.1.  lang

   A "lang" attribute which is set to the [RFC5646] value of the
   language of the element's documentation.

   The "lang" attribute is OPTIONAL.

14.1.6.2.  type

   A "type" attribute distinguishes the meaning of the documentation.
   Values for the "<documentation>" sub-element's "type" attribute MUST
   include one of the following: "definition", "rationale", "usage
   notes", and "references".

   The "type" attribute is OPTIONAL.

14.1.7.  <restriction> Element

   The "<restriction>" element provides information about restrictions
   to the allowable values for the "EBML Element" which are listed in
   "<enum>" elements.

14.1.8.  <enum> Element

   The "<enum>" element stores a list of values allowed for storage in
   the "EBML Element".  The values MUST match the "type" of the "EBML
   Element" (for example "<enum value="Yes">" cannot be a valid value
   for an "EBML Element" that is defined as an unsigned integer).  An
   "<enum>" element MAY also store "<documentation>" elements to further
   describe the "<enum>".

14.1.9.  <enum> Attributes

14.1.9.1.  label

   The "label" provides a concise expression for human consumption that
   describes what the "value" of the "<enum>" represents.

   The "label" attribute is OPTIONAL.

14.1.9.2.  value

   The "value" represents data that MAY be stored within the "EBML
   Element".

   The "value" attribute is REQUIRED.

14.1.10.  XML Schema for EBML Schema

   [XML Schema listing: markup not recoverable.]

14.1.11.  EBML Schema Example

   [XML listing: markup not recoverable.  The surviving documentation
   strings, in order, are:]

   o  Container of data and attributes representing one or many files.

   o  An attached file.

   o  Filename of the attached file.

   o  MIME type of the file.

   o  Modification timestamp of the file.

   o  The data of the file.

14.1.12.  Identically Recurring Elements

   An "Identically Recurring Element" is an "EBML Element" that MAY
   occur within its "Parent Element" more than once, provided that each
   recurrence within that "Parent Element" is identical both in
   storage and semantics.  "Identically Recurring Elements" are
   permitted to be stored multiple times within the same "Parent
   Element" in order to increase data resilience and optimize the use of
   "EBML" in transmission.  For instance a pertinent "Top-Level Element"
   could be periodically resent within a data stream so that an "EBML
   Reader" which starts reading the stream from the middle could better
   interpret the contents.  "Identically Recurring Elements" SHOULD
   include a "CRC-32 Element" as a "Child Element"; this is especially
   recommended when "EBML" is used for long-term storage or
   transmission.  If a "Parent Element" contains more than one copy of
   an "Identically Recurring Element" which includes a "CRC-32 Element"
   as a "Child Element" then the first instance of the "Identically
   Recurring Element" with a valid CRC-32 value should be used for
   interpretation.  If a "Parent Element" contains more than one copy of
   an "Identically Recurring Element" which does not contain a "CRC-32
   Element" or if "CRC-32 Elements" are present but none are valid then
   the first instance of the "Identically Recurring Element" should be
   used for interpretation.

14.1.13.  Expression of range

   The "range" attribute MUST only be used with "EBML Elements" that are
   either "signed integer", "unsigned integer", "float", or "date".  The
   "range" expression may contain whitespace for readability but
   whitespace within a "range" expression MUST NOT convey meaning.
The 1370 expression of the "range" MUST adhere to one of the following forms: 1372 o "x-y" where x and y are integers or floats and "y" MUST be greater 1373 than "x", meaning that the value MUST be greater than or equal to 1374 "x" and less than or equal to "y". "x" MUST be less than "y". 1376 o ">x" where "x" is an integer or float, meaning that the value MUST 1377 be greater than "x". 1379 o ">=x" where "x" is an integer or float, meaning that the value 1380 MUST be greater than or equal to "x". 1382 o "=4 1538 default: 4 1540 type: Unsigned Integer 1542 description: The "EBMLMaxIDLength Element" stores the maximum length 1543 in octets of the "Element IDs" to be found within the "EBML Body". 1544 An "EBMLMaxIDLength Element" value of four is RECOMMENDED, though 1545 larger values are allowed. 1547 14.2.5. EBMLMaxSizeLength Element 1549 name: "EBMLMaxSizeLength" 1551 path: "1*1(\EBML\EBMLMaxSizeLength)" 1553 id "0x42F3" 1555 minOccurs: 1 1557 maxOccurs: 1 1559 range: not 0 1561 default: 8 1563 type: Unsigned Integer 1565 description: The "EBMLMaxSizeLength Element" stores the maximum 1566 length in octets of the expression of all "Element Data Sizes" to be 1567 found within the "EBML Body". To be clear the "EBMLMaxSizeLength 1568 Element" documents the maximum 'length' of all "Element Data Size" 1569 expressions within the "EBML Body" and not the maximum 'value' of all 1570 "Element Data Size" expressions within the "EBML Body". "EBML 1571 Elements" that have an "Element Data Size" expression which is larger 1572 in octets than what is expressed by "EBMLMaxSizeLength ELEMENT" SHALL 1573 be considered invalid. 1575 14.2.6. DocType Element 1577 name: "DocType" 1579 path: "1*1(\EBML\DocType)" 1581 id "0x4282" 1583 minOccurs: 1 1585 maxOccurs: 1 1587 size: >0 1589 type: String 1591 description: A string that describes and identifies the content of 1592 the "EBML Body" that follows this "EBML Header". 1594 14.2.7. 
DocTypeVersion Element 1596 name: "DocTypeVersion" 1598 path: "1*1(\EBML\DocTypeVersion)" 1600 id "0x4287" 1602 minOccurs: 1 1604 maxOccurs: 1 1606 default: 1 1608 type: Unsigned Integer 1610 description: The version of "DocType" interpreter used to create the 1611 "EBML Document". 1613 14.2.8. DocTypeReadVersion Element 1615 name: DocTypeReadVersion 1617 path: "1*1(\EBML\DocTypeReadVersion)" 1619 id "0x4285" 1621 minOccurs: 1 1623 maxOccurs: 1 1625 default: 1 1627 type: Unsigned Integer 1629 description: The minimum "DocType" version an "EBML Reader" has to 1630 support to read this "EBML Document". The value of the 1631 "DocTypeReadVersion Element" MUST be less than or equal to the value 1632 of the "DocTypeVersion Element". 1634 14.3. Global Elements 1636 EBML defines these "Global Elements" which MAY be stored within any 1637 "Master Element" of an "EBML Document" as defined by their "Element 1638 Path". 1640 14.3.1. CRC-32 Element 1642 name: CRC-32 1644 path: "*1((1*\)\CRC-32)" 1646 id: "0xBF" 1648 minOccurs: 0 1650 maxOccurs: 1 1652 size: 4 1654 type: Binary 1656 description: The "CRC-32 Element" contains a 32-bit Cyclic Redundancy 1657 Check value of all the "Element Data" of the "Parent Element" as 1658 stored except for the "CRC-32 Element" itself. When the "CRC-32 1659 Element" is present, the "CRC-32 Element" MUST be the first ordered 1660 "EBML Element" within its "Parent Element" for easier reading. All 1661 "Top-Level Elements" of an "EBML Document" that are "Master Elements" 1662 SHOULD include a "CRC-32 Element" as a "Child Element". The CRC in 1663 use is the IEEE-CRC-32 algorithm as used in the [ISO.3309.1979] 1664 standard and in section 8.1.1.6.2 of [ITU.V42.1994], with initial 1665 value of "0xFFFFFFFF". The CRC value MUST be computed on a little 1666 endian bitstream and MUST use little endian storage. 1668 14.3.2. 
Void Element

1670 name: "Void"

1672 path: "*((*\)\Void)"

1674 id: "0xEC"

1676 minOccurs: 0

1678 type: Binary

1680 description: Used to void damaged data, so that using the damaged
1681 data does not cause unexpected behavior. The content is discarded.
1682 Also used to reserve space in a sub-element for later use.

1684 15. References

1686 15.1. Normative References

1688 [IEEE.754.1985]
1689 Institute of Electrical and Electronics Engineers,
1690 "Standard for Binary Floating-Point Arithmetic",
1691 IEEE Standard 754, August 1985.

1693 [ISO.3309.1979]
1694 International Organization for Standardization, "Data
1695 communication - High-level data link control procedures -
1696 Frame structure", ISO Standard 3309, 1979.

1698 [ISO.9899.2011]
1699 International Organization for Standardization,
1700 "Programming languages - C", ISO Standard 9899, 2011.

1702 [ITU.V42.1994]
1703 International Telecommunication Union, "Error-correcting
1704 Procedures for DCEs Using Asynchronous-to-Synchronous
1705 Conversion", ITU-T Recommendation V.42, 1994.

1707 [RFC0020] Cerf, V., "ASCII format for network interchange", STD 80,
1708 RFC 20, DOI 10.17487/RFC0020, October 1969.

1711 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
1712 Requirement Levels", BCP 14, RFC 2119,
1713 DOI 10.17487/RFC2119, March 1997.

1716 [RFC3339] Klyne, G. and C. Newman, "Date and Time on the Internet:
1717 Timestamps", RFC 3339, DOI 10.17487/RFC3339, July 2002.

1720 [RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO
1721 10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November
1722 2003.

1724 [RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
1725 Specifications: ABNF", STD 68, RFC 5234,
1726 DOI 10.17487/RFC5234, January 2008.

1729 [RFC5646] Phillips, A., Ed. and M. Davis, Ed., "Tags for Identifying
1730 Languages", BCP 47, RFC 5646, DOI 10.17487/RFC5646,
1731 September 2009.
1733 [RFC7405] Kyzivat, P., "Case-Sensitive String Support in ABNF",
1734 RFC 7405, DOI 10.17487/RFC7405, December 2014.

1737 [W3C.REC-xml-20081126]
1738 Bray, T., Paoli, J., Sperberg-McQueen, M., Maler, E., and
1739 F. Yergeau, "Extensible Markup Language (XML) 1.0 (Fifth
1740 Edition)", World Wide Web Consortium Recommendation REC-
1741 xml-20081126, November 2008.

1744 15.2. Informative References

1746 [W3C.REC-xmlschema-0-20010502]
1747 Fallside, D., "XML Schema Part 0: Primer", World Wide Web
1748 Consortium Recommendation REC-xmlschema-0-20010502, May
1749 2001.

1752 Authors' Addresses

1754 Steve Lhomme

1756 Email: slhomme@matroska.org

1758 Dave Rice

1760 Email: dave@dericed.com

1762 Moritz Bunkus

1764 Email: moritz@bunkus.org