idnits 2.17.1

draft-ietf-cellar-ffv1-12.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (28 January 2020) is 1522 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-20) exists of
     draft-ietf-cellar-ffv1-11

     Summary: 0 errors (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

cellar                                                    M. Niedermayer
Internet-Draft
Intended status: Informational                                   D. Rice
Expires: 31 July 2020
                                                             J. Martinez
                                                         28 January 2020

             FFV1 Video Coding Format Version 0, 1, and 3
                       draft-ietf-cellar-ffv1-12

Abstract

   This document defines FFV1, a lossless intra-frame video encoding
   format. FFV1 is designed to efficiently compress video data in a
   variety of pixel formats. Compared to uncompressed video, FFV1
   offers storage compression, frame fixity, and self-description, which
   makes FFV1 useful as a preservation or intermediate video format.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 31 July 2020.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Notation and Conventions
     2.1.  Definitions
     2.2.  Conventions
       2.2.1.  Pseudo-code
       2.2.2.  Arithmetic Operators
       2.2.3.  Assignment Operators
       2.2.4.  Comparison Operators
       2.2.5.  Mathematical Functions
       2.2.6.  Order of Operation Precedence
       2.2.7.  Range
       2.2.8.  NumBytes
       2.2.9.  Bitstream Functions
   3.  Sample Coding
     3.1.  Border
     3.2.  Samples
     3.3.  Median Predictor
     3.4.  Context
     3.5.  Quantization Table Sets
     3.6.  Quantization Table Set Indexes
     3.7.  Color spaces
       3.7.1.  YCbCr
       3.7.2.  RGB
     3.8.  Coding of the Sample Difference
       3.8.1.  Range Coding Mode
       3.8.2.  Golomb Rice Mode
   4.  Bitstream
     4.1.  Parameters
       4.1.1.  version
       4.1.2.  micro_version
       4.1.3.  coder_type
       4.1.4.  state_transition_delta
       4.1.5.  colorspace_type
       4.1.6.  chroma_planes
       4.1.7.  bits_per_raw_sample
       4.1.8.  log2_h_chroma_subsample
       4.1.9.  log2_v_chroma_subsample
       4.1.10. extra_plane
       4.1.11. num_h_slices
       4.1.12. num_v_slices
       4.1.13. quant_table_set_count
       4.1.14. states_coded
       4.1.15. initial_state_delta
       4.1.16. ec
       4.1.17. intra
     4.2.  Configuration Record
       4.2.1.  reserved_for_future_use
       4.2.2.  configuration_record_crc_parity
       4.2.3.  Mapping FFV1 into Containers
     4.3.  Frame
     4.4.  Slice
     4.5.  Slice Header
       4.5.1.  slice_x
       4.5.2.  slice_y
       4.5.3.  slice_width
       4.5.4.  slice_height
       4.5.5.  quant_table_set_index_count
       4.5.6.  quant_table_set_index
       4.5.7.  picture_structure
       4.5.8.  sar_num
       4.5.9.  sar_den
     4.6.  Slice Content
       4.6.1.  primary_color_count
       4.6.2.  plane_pixel_height
       4.6.3.  slice_pixel_height
       4.6.4.  slice_pixel_y
     4.7.  Line
       4.7.1.  plane_pixel_width
       4.7.2.  slice_pixel_width
       4.7.3.  slice_pixel_x
       4.7.4.  sample_difference
     4.8.  Slice Footer
       4.8.1.  slice_size
       4.8.2.  error_status
       4.8.3.  slice_crc_parity
     4.9.  Quantization Table Set
       4.9.1.  quant_tables
       4.9.2.  context_count
   5.  Restrictions
   6.  Security Considerations
   7.  Media Type Definition
   8.  IANA Considerations
   9.  Appendixes
     9.1.  Decoder implementation suggestions
       9.1.1.  Multi-threading Support and Independence of Slices
   10. Changelog
   11. Normative References
   12. Informative References
   Authors' Addresses

1.  Introduction

   This document describes FFV1, a lossless video encoding format. The
   design of FFV1 considers the storage of image characteristics, data
   fixity, and the optimized use of encoding time and storage
   requirements.
   FFV1 is designed to support a wide range of lossless
   video applications such as long-term audiovisual preservation,
   scientific imaging, screen recording, and other video encoding
   scenarios that seek to avoid the generational loss of lossy video
   encodings.

   This document defines versions 0, 1, and 3 of FFV1. The distinctions
   between the versions are described throughout the document, but in
   summary:

   *  Version 0 of FFV1 was the original implementation of FFV1 and has
      been in non-experimental use since April 14, 2006 [FFV1_V0].

   *  Version 1 of FFV1 adds support for more video bit depths and has
      been in use since April 24, 2009 [FFV1_V1].

   *  Version 2 of FFV1 only existed in experimental form and is not
      described by this document, but is available as a LyX file at
      <https://github.com/FFmpeg/FFV1/blob/8ad772b6d61c3dd8b0171979a2cd9f11924d5532/ffv1.lyx>.

   *  Version 3 of FFV1 adds several features such as increased
      description of the characteristics of the encoded images and
      embedded CRC data to support fixity verification of the encoding.
      Version 3 has been in non-experimental use since August 17, 2013
      [FFV1_V3].

   This document assumes familiarity with mathematical and coding
   concepts such as Range coding [range-coding] and YCbCr color spaces
   [YCbCr].

2.  Notation and Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.1.  Definitions

   "Container": Format that encapsulates "Frames" (see Section 4.3) and
   (when required) a "Configuration Record" into a bitstream.

   "Sample": The smallest addressable representation of a color
   component or a luma component in a "Frame". Examples of "Sample" are
   Luma, Blue Chrominance, Red Chrominance, Transparency, Red, Green,
   and Blue.

   "Plane": A discrete component of a static image comprised of
   "Samples" that represent a specific quantification of "Samples" of
   that image.

   "Pixel": The smallest addressable representation of a color in a
   "Frame". It is composed of 1 or more "Samples".

   "ESC": An ESCape symbol to indicate that the symbol to be stored is
   too large for normal storage and that an alternate storage method is
   used.

   "MSB": Most Significant Bit, the bit that can cause the largest
   change in magnitude of the symbol.

   "RCT": Reversible Color Transform, a near linear, exactly reversible
   integer transform that converts between RGB and YCbCr representations
   of a "Pixel".

   "VLC": Variable Length Code, a code that maps source symbols to a
   variable number of bits.

   "RGB": A reference to the method of storing the value of a "Pixel" by
   using three numeric values that represent Red, Green, and Blue.

   "YCbCr": A reference to the method of storing the value of a "Pixel"
   by using three numeric values that represent the luma of the "Pixel"
   (Y) and the chrominance of the "Pixel" (Cb and Cr). The term YCbCr
   is used for historical reasons and currently references any color
   space relying on 1 luma "Sample" and 2 chrominance "Samples", e.g.
   YCbCr, YCgCo or ICtCp. The exact meaning of the three numeric values
   is unspecified.

   "TBA": To Be Announced. Used in reference to the development of
   future iterations of the FFV1 specification.

2.2.  Conventions

2.2.1.  Pseudo-code

   The FFV1 bitstream is described in this document using pseudo-code.
   Note that the pseudo-code is used for clarity in order to illustrate
   the structure of FFV1 and is not intended to specify any particular
   implementation. The pseudo-code used is based upon the C programming
   language [ISO.9899.1990] and uses its "if/else", "while" and "for"
   keywords as well as functions defined within this document.

2.2.2.  Arithmetic Operators

   Note: the operators and the order of precedence are the same as used
   in the C programming language [ISO.9899.2018].

   "a + b" means a plus b.

   "a - b" means a minus b.

   "-a" means negation of a.

   "a * b" means a multiplied by b.

   "a / b" means a divided by b.

   "a ^ b" means a raised to the b-th power.

   "a & b" means bit-wise "and" of a and b.

   "a | b" means bit-wise "or" of a and b.

   "a >> b" means arithmetic right shift of two's complement integer
   representation of a by b binary digits.

   "a << b" means arithmetic left shift of two's complement integer
   representation of a by b binary digits.

2.2.3.  Assignment Operators

   "a = b" means a is assigned b.

   "a++" is equivalent to a is assigned a + 1.

   "a--" is equivalent to a is assigned a - 1.

   "a += b" is equivalent to a is assigned a + b.

   "a -= b" is equivalent to a is assigned a - b.

   "a *= b" is equivalent to a is assigned a * b.

2.2.4.  Comparison Operators

   "a > b" means a is greater than b.

   "a >= b" means a is greater than or equal to b.

   "a < b" means a is less than b.

   "a <= b" means a is less than or equal to b.

   "a == b" means a is equal to b.

   "a != b" means a is not equal to b.

   "a && b" means Boolean logical "and" of a and b.

   "a || b" means Boolean logical "or" of a and b.

   "!a" means Boolean logical "not" of a.

   "a ? b : c" if a is true, then b, otherwise c.

2.2.5.  Mathematical Functions

   floor(a) the largest integer less than or equal to a

   ceil(a) the smallest integer greater than or equal to a

   sign(a) extracts the sign of a number, i.e. if a < 0 then -1, else if
   a > 0 then 1, else 0

   abs(a) the absolute value of a, i.e. abs(a) = sign(a)*a

   log2(a) the base-two logarithm of a

   min(a,b) the smallest of two values a and b

   max(a,b) the largest of two values a and b

   median(a,b,c) the numerical middle value in a data set of a, b, and
   c, i.e. a+b+c-min(a,b,c)-max(a,b,c)

   a_(b) the b-th value of a sequence of a

   a_(b,c) the 'b,c'-th value of a sequence of a

2.2.6.  Order of Operation Precedence

   When order of precedence is not indicated explicitly by use of
   parentheses, operations are evaluated in the following order (from
   top to bottom, operations of same precedence being evaluated from
   left to right). This order of operations is based on the order of
   operations used in Standard C.

   a++, a--
   !a, -a
   a ^ b
   a * b, a / b, a % b
   a + b, a - b
   a << b, a >> b
   a < b, a <= b, a > b, a >= b
   a == b, a != b
   a & b
   a | b
   a && b
   a || b
   a ? b : c
   a = b, a += b, a -= b, a *= b

2.2.7.  Range

   "a...b" means any value starting from a to b, inclusive.

2.2.8.  NumBytes

   "NumBytes" is a non-negative integer that expresses the size in 8-bit
   octets of a particular FFV1 "Configuration Record" or "Frame". FFV1
   relies on its "Container" to store the "NumBytes" values; see
   Section 4.2.3.

2.2.9.  Bitstream Functions

2.2.9.1.  remaining_bits_in_bitstream

   "remaining_bits_in_bitstream( )" means the count of remaining bits
   after the pointer in that "Configuration Record" or "Frame". It is
   computed from the "NumBytes" value multiplied by 8 minus the count of
   bits of that "Configuration Record" or "Frame" already read by the
   bitstream parser.

2.2.9.2.  remaining_symbols_in_syntax

   "remaining_symbols_in_syntax( )" is true as long as the RangeCoder
   has not consumed all the given input bytes.

2.2.9.3.  byte_aligned

   "byte_aligned( )" is true if "remaining_bits_in_bitstream( NumBytes
   )" is a multiple of 8, otherwise false.

2.2.9.4.  get_bits

   "get_bits( i )" is the action to read the next "i" bits in the
   bitstream, from most significant bit to least significant bit, and to
   return the corresponding value. The pointer is increased by "i".

3.  Sample Coding

   For each "Slice" (as described in Section 4.4) of a "Frame", the
   "Planes", "Lines", and "Samples" are coded in an order determined by
   the "Color Space" (see Section 3.7). Each "Sample" is predicted by
   the median predictor as described in Section 3.3 from other "Samples"
   within the same "Plane" and the difference is stored using the method
   described in Section 3.8.

3.1.  Border

   A border is assumed for each coded "Slice" for the purpose of the
   median predictor and context according to the following rules:

   *  one column of "Samples" to the left of the coded slice is assumed
      as identical to the "Samples" of the leftmost column of the coded
      slice shifted down by one row. The value of the topmost "Sample"
      of the column of "Samples" to the left of the coded slice is
      assumed to be "0"

   *  one column of "Samples" to the right of the coded slice is assumed
      as identical to the "Samples" of the rightmost column of the coded
      slice

   *  an additional column of "Samples" to the left of the coded slice
      and two rows of "Samples" above the coded slice are assumed to be
      "0"

   Figure 1 depicts a slice of 9 "Samples" "a,b,c,d,e,f,g,h,i" in a 3x3
   arrangement along with its assumed border.

   +---+---+---+---+---+---+---+---+
   | 0 | 0 |   | 0 | 0 | 0 |   | 0 |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |   | 0 | 0 | 0 |   | 0 |
   +---+---+---+---+---+---+---+---+
   |   |   |   |   |   |   |   |   |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |   | a | b | c |   | c |
   +---+---+---+---+---+---+---+---+
   | 0 | a |   | d | e | f |   | f |
   +---+---+---+---+---+---+---+---+
   | 0 | d |   | g | h | i |   | i |
   +---+---+---+---+---+---+---+---+

   Figure 1: A depiction of FFV1's assumed border for a set of example
                               Samples.

3.2.  Samples

   Relative to any "Sample" "X", six other relatively positioned
   "Samples" from the coded "Samples" and presumed border are identified
   according to the labels used in Figure 2. The labels for these
   relatively positioned "Samples" are used within the median predictor
   and context.

   +---+---+---+---+
   |   |   | T |   |
   +---+---+---+---+
   |   |tl | t |tr |
   +---+---+---+---+
   | L | l | X |   |
   +---+---+---+---+

   Figure 2: A depiction of how relatively positioned Samples are
                   referenced within this document.

   The labels for these relative "Samples" are made of the first letters
   of the words Top, Left and Right.

3.3.  Median Predictor

   The prediction for any "Sample" value at position "X" may be computed
   based upon the relative neighboring values of "l", "t", and "tl" via
   this equation:

   "median(l, t, l + t - tl)".

   Note, this prediction template is also used in [ISO.14495-1.1999] and
   [HuffYUV].

   Exception for the median predictor: if "colorspace_type == 0 &&
   bits_per_raw_sample == 16 && ( coder_type == 1 || coder_type == 2 )",
   the following median predictor MUST be used:

   "median(left16s, top16s, left16s + top16s - diag16s)"

   where:

   left16s = l  >= 32768 ? ( l  - 65536 ) : l
   top16s  = t  >= 32768 ? ( t  - 65536 ) : t
   diag16s = tl >= 32768 ? ( tl - 65536 ) : tl

   Background: a two's complement signed 16-bit integer was used for
   storing "Sample" values in all known implementations of the FFV1
   bitstream. So in some circumstances, the most significant bit was
   wrongly interpreted (used as a sign bit instead of the 16th bit of an
   unsigned integer). Note that when the issue was discovered, the only
   impacted configuration of all known implementations was 16-bit YCbCr
   with no Pixel transformation and with the Range Coder, as other
   potentially impacted configurations (e.g. 15/16-bit JPEG2000-RCT with
   the Range Coder, or 16-bit content with the Golomb Rice coder) were
   not implemented anywhere [ISO.15444-1.2016]. In the meantime, 16-bit
   JPEG2000-RCT with the Range Coder was implemented without this issue
   in one implementation and validated by one conformance checker.
   This exception for the median predictor is expected (to be confirmed)
   to be removed in the next version of the FFV1 bitstream.

3.4.  Context

   Relative to any "Sample" "X", the Quantized Sample Differences "L-l",
   "l-tl", "tl-t", "T-t", and "t-tr" are used as context:

   context = Q_{0}[l - tl] +
             Q_{1}[tl - t] +
             Q_{2}[t - tr] +
             Q_{3}[L - l]  +
             Q_{4}[T - t]

                               Figure 3

   If "context >= 0" then "context" is used and the difference between
   the "Sample" and its predicted value is encoded as is, else
   "-context" is used and the difference between the "Sample" and its
   predicted value is encoded with a flipped sign.

3.5.  Quantization Table Sets

   The FFV1 bitstream contains 1 or more Quantization Table Sets. Each
   Quantization Table Set contains exactly 5 Quantization Tables with
   each Quantization Table corresponding to 1 of the 5 Quantized Sample
   Differences.
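
   As an illustration of Sections 3.3 and 3.4, the prediction and the
   context sum can be sketched in C as follows. This is a minimal
   sketch, not a normative implementation: the function names
   (median3, as_signed16, predict_sample, quantize, context_of) and the
   "signed16" flag are hypothetical, and the five tables are assumed to
   be already available as 256-entry arrays indexed by the 8 least
   significant bits of each difference (see Figure 4).

```c
#include <assert.h>
#include <stdint.h>

/* median(a,b,c) = a+b+c-min(a,b,c)-max(a,b,c), as in Section 2.2.5. */
static int median3(int a, int b, int c)
{
    int lo = a < b ? a : b;
    int hi = a < b ? b : a;
    if (c < lo) lo = c;
    if (c > hi) hi = c;
    return a + b + c - lo - hi;
}

/* left16s/top16s/diag16s: reinterpret a 16-bit value as signed. */
static int as_signed16(int v)
{
    assert(v >= 0 && v <= 65535);
    return v >= 32768 ? v - 65536 : v;
}

/* Prediction for X from l, t and tl. The hypothetical "signed16" flag
 * selects the 16-bit exception of Section 3.3. */
static int predict_sample(int l, int t, int tl, int signed16)
{
    if (signed16) {
        l = as_signed16(l);
        t = as_signed16(t);
        tl = as_signed16(tl);
    }
    return median3(l, t, l + t - tl);
}

/* Look up a table on the 8 least significant bits of a difference. */
static int quantize(const int16_t *table, int diff)
{
    return table[(unsigned)diff & 255u];
}

/* Context sum of Figure 3. Returns the non-negative context and sets
 * *flip to 1 when the coded difference must have its sign flipped. */
static int context_of(int16_t q[5][256], int L, int l, int tl,
                      int t, int tr, int T, int *flip)
{
    int context = quantize(q[0], l - tl) + quantize(q[1], tl - t) +
                  quantize(q[2], t - tr) + quantize(q[3], L - l) +
                  quantize(q[4], T - t);
    *flip = context < 0;
    return context < 0 ? -context : context;
}
```

   The sketch only covers the prediction and the context selection; how
   the resulting difference is reduced and coded is described in
   Section 3.8.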
   For each Quantization Table, both the number of
   quantization steps and their distribution are stored in the FFV1
   bitstream; each Quantization Table has exactly 256 entries, and the 8
   least significant bits of the Quantized Sample Difference are used as
   index:

   Q_{j}[k] = quant_tables[i][j][k&255]

                               Figure 4

   In this formula, "i" is the Quantization Table Set index, "j" is the
   Quantization Table index, and "k" is the Quantized Sample Difference.

3.6.  Quantization Table Set Indexes

   For each "Plane" of each slice, a Quantization Table Set is selected
   from an index:

   *  For Y "Plane", "quant_table_set_index[ 0 ]" index is used

   *  For Cb and Cr "Planes", "quant_table_set_index[ 1 ]" index is used

   *  For extra "Plane", "quant_table_set_index[ (version <= 3 ||
      chroma_planes) ? 2 : 1 ]" index is used

   Background: in the first implementations of the FFV1 bitstream, the
   index for Cb and Cr "Planes" was stored even if it was not used
   (chroma_planes set to 0). This index is kept for version <= 3 in
   order to keep compatibility with FFV1 bitstreams in the wild.

3.7.  Color spaces

   FFV1 supports several color spaces. The count of allowed coded
   planes and the meaning of the extra "Plane" are determined by the
   selected color space.

   The FFV1 bitstream interleaves data in an order determined by the
   color space. In YCbCr for each "Plane", each "Line" is coded from
   top to bottom and for each "Line", each "Sample" is coded from left
   to right. In JPEG2000-RCT for each "Line" from top to bottom, each
   "Plane" is coded and for each "Plane", each "Sample" is encoded from
   left to right.

3.7.1.  YCbCr

   This color space allows 1 to 4 "Planes".

   The Cb and Cr "Planes" are optional, but if used then MUST be used
   together. Omitting the Cb and Cr "Planes" codes the frames in
   grayscale without color data.

   An optional transparency "Plane" can be used to code transparency
   data.
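
   The two traversal orders described in Section 3.7 can be sketched as
   follows. This is an illustrative sketch only: the type SamplePos and
   the functions order_planar and order_line_interleaved are
   hypothetical names, coordinates are zero-based, and all "Planes" are
   assumed to have the same dimensions (that is, no chroma subsampling).

```c
#include <assert.h>
#include <stddef.h>

/* Position of one coded Sample: plane index and zero-based x, y. */
typedef struct { int plane, x, y; } SamplePos;

/* YCbCr order: each Plane in turn, each Line from top to bottom,
 * each Sample from left to right. Returns the number of positions. */
static size_t order_planar(int planes, int width, int height,
                           SamplePos *out)
{
    size_t n = 0;
    assert(out != NULL);
    for (int p = 0; p < planes; p++)
        for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++)
                out[n++] = (SamplePos){ p, x, y };
    return n;
}

/* JPEG2000-RCT order: each Line from top to bottom, then each Plane
 * within that Line, then each Sample from left to right. */
static size_t order_line_interleaved(int planes, int width, int height,
                                     SamplePos *out)
{
    size_t n = 0;
    assert(out != NULL);
    for (int y = 0; y < height; y++)
        for (int p = 0; p < planes; p++)
            for (int x = 0; x < width; x++)
                out[n++] = (SamplePos){ p, x, y };
    return n;
}
```

   The caller is expected to provide an output array of at least
   planes * width * height entries.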

   An FFV1 "Frame" using YCbCr MUST use one of the following
   arrangements:

   *  Y

   *  Y, Transparency

   *  Y, Cb, Cr

   *  Y, Cb, Cr, Transparency

   The Y "Plane" MUST be coded first. If the Cb and Cr "Planes" are
   used then they MUST be coded after the Y "Plane". If a transparency
   "Plane" is used, then it MUST be coded last.

3.7.2.  RGB

   This color space allows 3 or 4 "Planes".

   An optional transparency "Plane" can be used to code transparency
   data.

   JPEG2000-RCT is a Reversible Color Transform that codes RGB (red,
   green, blue) "Planes" losslessly in a modified YCbCr color space
   [ISO.15444-1.2016]. Reversible Pixel transformations between YCbCr
   and RGB use the following formulae. The right shift applies to the
   sum (Cb+Cr) only, as made explicit by the parentheses.

   Cb=b-g
   Cr=r-g
   Y=g+((Cb+Cr)>>2)
   g=Y-((Cb+Cr)>>2)
   r=Cr+g
   b=Cb+g

                               Figure 5

   Exception for the JPEG2000-RCT conversion: if bits_per_raw_sample is
   between 9 and 15 inclusive and extra_plane is 0, the following
   formulae for reversible conversions between YCbCr and RGB MUST be
   used instead of the ones above:

   Cb=g-b
   Cr=r-b
   Y=b+((Cb+Cr)>>2)
   b=Y-((Cb+Cr)>>2)
   r=Cr+b
   g=Cb+b

                               Figure 6

   Background: At the time of this writing, in all known implementations
   of the FFV1 bitstream, when bits_per_raw_sample was between 9 and 15
   inclusive and extra_plane was 0, GBR "Planes" were used as BGR
   "Planes" during both encoding and decoding. In the meantime, 16-bit
   JPEG2000-RCT was implemented without this issue in one implementation
   and validated by one conformance checker. Methods to address this
   exception for the transform are under consideration for the next
   version of the FFV1 bitstream.

   When FFV1 uses the JPEG2000-RCT, the horizontal "Lines" are
   interleaved to improve caching efficiency since it is most likely
   that the JPEG2000-RCT will immediately be converted to RGB during
   decoding.
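
   The reversible transform of Figures 5 and 6 can be sketched in C as
   follows. This is a minimal sketch under stated assumptions: the
   types Rgb and Ycc and the functions asr2, rct_uses_exception,
   rct_forward and rct_inverse are hypothetical names, and the
   exception of Figure 6 is modeled here as the formulae of Figure 5
   with the roles of g and b swapped, which yields the same equations
   term by term.

```c
#include <assert.h>

typedef struct { int r, g, b; } Rgb;
typedef struct { int y, cb, cr; } Ycc;

/* Arithmetic right shift by 2 (floor division by 4), written so that
 * it does not rely on how the compiler shifts negative values. */
static int asr2(int v)
{
    return v >= 0 ? v >> 2 : -((-v + 3) >> 2);
}

/* True when the exception of Figure 6 applies. */
static int rct_uses_exception(int bits_per_raw_sample, int extra_plane)
{
    return bits_per_raw_sample >= 9 && bits_per_raw_sample <= 15 &&
           !extra_plane;
}

/* RGB to YCbCr. swap_gb = 0 gives Figure 5, swap_gb = 1 Figure 6. */
static Ycc rct_forward(Rgb p, int swap_gb)
{
    int g = swap_gb ? p.b : p.g;
    int b = swap_gb ? p.g : p.b;
    Ycc o;
    o.cb = b - g;
    o.cr = p.r - g;
    o.y = g + asr2(o.cb + o.cr);
    return o;
}

/* YCbCr back to RGB, the exact inverse of rct_forward. */
static Rgb rct_inverse(Ycc p, int swap_gb)
{
    int g = p.y - asr2(p.cb + p.cr);
    int r = p.cr + g;
    int b = p.cb + g;
    Rgb o;
    o.r = r;
    o.g = swap_gb ? b : g;
    o.b = swap_gb ? g : b;
    return o;
}
```

   For example, with the formulae of Figure 5 the Pixel r=200, g=10,
   b=55 gives Cb=45, Cr=190 and Y=10+(235>>2)=68, and the inverse
   recovers g=68-58=10, r=200 and b=55.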
   The interleaved coding order is also Y, then Cb, then Cr,
   and then if used transparency.

   As an example, a "Frame" that is two "Pixels" wide and two "Pixels"
   high, could comprise the following structure:

   +------------------------+------------------------+
   | Pixel(1,1)             | Pixel(2,1)             |
   | Y(1,1) Cb(1,1) Cr(1,1) | Y(2,1) Cb(2,1) Cr(2,1) |
   +------------------------+------------------------+
   | Pixel(1,2)             | Pixel(2,2)             |
   | Y(1,2) Cb(1,2) Cr(1,2) | Y(2,2) Cb(2,2) Cr(2,2) |
   +------------------------+------------------------+

   In JPEG2000-RCT, the coding order would be left to right and then top
   to bottom, with values interleaved by "Lines" and stored in this
   order:

   Y(1,1) Y(2,1) Cb(1,1) Cb(2,1) Cr(1,1) Cr(2,1) Y(1,2) Y(2,2) Cb(1,2)
   Cb(2,2) Cr(1,2) Cr(2,2)

3.8.  Coding of the Sample Difference

   Instead of coding the n+1 bits of the Sample Difference with Huffman
   or Range coding (or n+2 bits, in the case of JPEG2000-RCT), only the
   n (or n+1, in the case of JPEG2000-RCT) least significant bits are
   used, since this is sufficient to recover the original "Sample". In
   the equation below, the term "bits" represents bits_per_raw_sample+1
   for JPEG2000-RCT or bits_per_raw_sample otherwise:

   coder_input =
       [(sample_difference + 2^(bits-1)) & (2^bits - 1)] - 2^(bits-1)

                               Figure 7

3.8.1.  Range Coding Mode

   Early experimental versions of FFV1 used the CABAC Arithmetic coder
   from H.264 as defined in [ISO.14496-10.2014] but due to the uncertain
   patent/royalty situation, as well as its slightly worse performance,
   CABAC was replaced by a Range coder based on an algorithm defined by
   G. Nigel N. Martin in 1979 [range-coding].

3.8.1.1.  Range Binary Values

   To encode binary digits efficiently a Range coder is used. "C_{i}"
   is the i-th Context. "B_{i}" is the i-th byte of the bytestream.
   "b_{i}" is the i-th Range coded binary value, "S_{0,i}" is the i-th
   initial state. The length of the bytestream encoding n binary
   symbols is "j_{n}" bytes.

   r_{i} = floor( ( R_{i} * S_{i,C_{i}} ) / 2^8 )

                               Figure 8

   S_{i+1,C_{i}} = zero_state_{S_{i,C_{i}}} AND
             l_i = L_i                      AND
             t_i = R_i - r_i                <==
             b_i = 0                        <==>
             L_i < R_i - r_i

   S_{i+1,C_{i}} = one_state_{S_{i,C_{i}}}  AND
             l_i = L_i - R_i + r_i          AND
             t_i = r_i                      <==
             b_i = 1                        <==>
             L_i >= R_i - r_i

                               Figure 9

   S_{i+1,k} = S_{i,k} <== C_i != k

                               Figure 10

   R_{i+1} = 2^8 * t_{i}                    AND
   L_{i+1} = 2^8 * l_{i} + B_{j_{i}}        AND
   j_{i+1} = j_{i} + 1                      <==
   t_{i} < 2^8

   R_{i+1} = t_{i}                          AND
   L_{i+1} = l_{i}                          AND
   j_{i+1} = j_{i}                          <==
   t_{i} >= 2^8

                               Figure 11

   R_{0} = 65280

                               Figure 12

   L_{0} = 2^8 * B_{0} + B_{1}

                               Figure 13

   j_{0} = 2

                               Figure 14

3.8.1.1.1.  Termination

   The range coder can be used in 3 modes.

   *  In "Open mode" when decoding, every symbol the reader attempts to
      read is available. In this mode arbitrary data can be appended
      without affecting the range coder output. This mode is not used
      in FFV1.

   *  In "Closed mode" the length in bytes of the bytestream is provided
      to the range decoder. Bytes beyond the length are read as 0 by
      the range decoder. This is generally 1 byte shorter than the open
      mode.

   *  In "Sentinel mode" the exact length in bytes is not known and thus
      the range decoder MAY read into the data that follows the range
      coded bytestream by one byte. In "Sentinel mode", the end of the
      range coded bytestream is a binary symbol with state 129, whose
      value SHALL be discarded. After reading this symbol, the range
      decoder will have read one byte beyond the end of the range coded
      bytestream. This way the byte position of the end can be
      determined.
      Bytestreams written in "Sentinel mode" can be read in
      "Closed mode" if the length can be determined; in this case the
      last (sentinel) symbol will be read non-corrupted and be of value
      0.

   The above describes range decoding; encoding is defined as any
   process which produces a decodable bytestream.

   There are 3 places where range coder termination is needed in FFV1.
   The first is in the "Configuration Record"; in this case the size of
   the range coded bytestream is known and handled as "Closed mode".
   The second is the switch from the "Slice Header", which is range
   coded, to Golomb coded slices, handled as "Sentinel mode". The third
   is the end of range coded Slices, which need to terminate before the
   CRC at their end. This can be handled as "Sentinel mode" or as
   "Closed mode" if the CRC position has been determined.

3.8.1.2.  Range Non Binary Values

   To encode scalar integers, it would be possible to encode each bit
   separately and use the past bits as context. However, that would
   mean 255 contexts per 8-bit symbol, which is not only a waste of
   memory but also requires more past data to reach a reasonably good
   estimate of the probabilities. Alternatively, assuming a Laplacian
   distribution and only dealing with its variance and mean (as in
   Huffman coding) would also be possible. However, for maximum
   flexibility and simplicity, the chosen method uses a single symbol to
   encode if a number is 0, and if not, encodes the number using its
   exponent, mantissa and sign. The exact contexts used are best
   described by Figure 15, followed by some comments.
   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   void put_symbol(RangeCoder *c, uint8_t *state, int v, int \   |
   is_signed) {                                                  |
       int i;                                                    |
       put_rac(c, state+0, !v);                                  |
       if (v) {                                                  |
           int a= abs(v);                                        |
           int e= log2(a);                                       |
                                                                 |
           for (i = 0; i < e; i++) {                             |
               put_rac(c, state+1+min(i,9), 1);  //1..10         |
           }                                                     |
                                                                 |
           put_rac(c, state+1+min(i,9), 0);                      |
           for (i = e-1; i >= 0; i--) {                          |
               put_rac(c, state+22+min(i,9), (a>>i)&1); //22..31 |
           }                                                     |
                                                                 |
           if (is_signed) {                                      |
               put_rac(c, state+11 + min(e, 10), v < 0); //11..21|
           }                                                     |
       }                                                         |
   }                                                             |

     Figure 15: A pseudo-code description of the contexts of Range Non
                              Binary Values.

3.8.1.3.  Initial Values for the Context Model

   At keyframes all Range coder state variables are set to their initial
   state.

3.8.1.4.  State Transition Table

   one_state_{i} =
          default_state_transition_{i} + state_transition_delta_{i}

                                Figure 16

   zero_state_{i} = 256 - one_state_{256-i}

                                Figure 17

3.8.1.5.  default_state_transition

     0,  0,  0,  0,  0,  0,  0,  0, 20, 21, 22, 23, 24, 25, 26, 27,

    28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 37, 38, 39, 40, 41, 42,

    43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 56, 57,

    58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73,

    74, 75, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88,

    89, 90, 91, 92, 93, 94, 94, 95, 96, 97, 98, 99,100,101,102,103,

   104,105,106,107,108,109,110,111,112,113,114,114,115,116,117,118,

   119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,133,

   134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,

   150,151,152,152,153,154,155,156,157,158,159,160,161,162,163,164,

   165,166,167,168,169,170,171,171,172,173,174,175,176,177,178,179,

   180,181,182,183,184,185,186,187,188,189,190,190,191,192,194,194,

   195,196,197,198,199,200,201,202,202,204,205,206,207,208,209,209,

   210,211,212,213,215,215,216,217,218,219,220,220,222,223,224,225,

   226,227,227,229,229,230,231,232,234,234,235,236,237,238,239,240,

   241,242,243,244,245,246,247,248,248,  0,  0,  0,  0,  0,  0,  0,

3.8.1.6.  Alternative State Transition Table

   The alternative state transition table has been built using iterative
   minimization of frame sizes and generally performs better than the
   default.  To use it, the coder_type (see Section 4.1.3) MUST be set
   to 2 and the difference to the default MUST be stored in the
   "Parameters", see Section 4.1.  The reference implementation of FFV1
   in FFmpeg uses Figure 18 by default at the time of this writing when
   Range coding is used.
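   Figures 16 and 17 can be read together as a two-step table
   construction: first add the stored deltas to the default table, then
   mirror the result to obtain the zero-bit transitions.  The Python
   sketch below shows this under stated assumptions: the function name
   "build_transition_tables" is hypothetical, index 0 is left unused,
   no range checking or wrapping is applied to the sums, and the input
   used in the example is a toy table rather than the default table of
   this document.

```python
def build_transition_tables(default_state_transition, state_transition_delta):
    """Derive one_state and zero_state for states 1..255 (index 0 unused)."""
    one_state = [0] * 256
    zero_state = [0] * 256
    # Figure 16: the custom table is the default plus the coded delta.
    for i in range(1, 256):
        one_state[i] = default_state_transition[i] + state_transition_delta[i]
    # Figure 17: the zero-bit table is the mirror image of one_state.
    for i in range(1, 256):
        zero_state[i] = 256 - one_state[256 - i]
    return one_state, zero_state


# Toy input, NOT the real table: an identity "default" plus a single delta.
toy_default = list(range(256))
toy_delta = [0] * 256
toy_delta[10] = 5
one_state, zero_state = build_transition_tables(toy_default, toy_delta)
```

   With all deltas equal to 0, the result corresponds to coder_type 1;
   with coder_type 2, the deltas are those read from the "Parameters".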
     0, 10, 10, 10, 10, 16, 16, 16, 28, 16, 16, 29, 42, 49, 20, 49,

    59, 25, 26, 26, 27, 31, 33, 33, 33, 34, 34, 37, 67, 38, 39, 39,

    40, 40, 41, 79, 43, 44, 45, 45, 48, 48, 64, 50, 51, 52, 88, 52,

    53, 74, 55, 57, 58, 58, 74, 60,101, 61, 62, 84, 66, 66, 68, 69,

    87, 82, 71, 97, 73, 73, 82, 75,111, 77, 94, 78, 87, 81, 83, 97,

    85, 83, 94, 86, 99, 89, 90, 99,111, 92, 93,134, 95, 98,105, 98,

   105,110,102,108,102,118,103,106,106,113,109,112,114,112,116,125,

   115,116,117,117,126,119,125,121,121,123,145,124,126,131,127,129,

   165,130,132,138,133,135,145,136,137,139,146,141,143,142,144,148,

   147,155,151,149,151,150,152,157,153,154,156,168,158,162,161,160,

   172,163,169,164,166,184,167,170,177,174,171,173,182,176,180,178,

   175,189,179,181,186,183,192,185,200,187,191,188,190,197,193,196,

   197,194,195,196,198,202,199,201,210,203,207,204,205,206,208,214,

   209,211,221,212,213,215,224,216,217,218,219,220,222,228,223,225,

   226,224,227,229,240,230,231,232,233,234,235,236,238,239,237,242,

   241,243,242,244,245,246,247,248,249,250,251,252,252,253,254,255,

      Figure 18: Alternative state transition table for Range coding.

3.8.2.  Golomb Rice Mode

   The end of the bitstream of the "Frame" is padded with 0-bits until
   the bitstream contains a multiple of 8 bits.

3.8.2.1.  Signed Golomb Rice Codes

   This coding mode uses Golomb Rice codes.  The VLC is split into 2
   parts: the prefix stores the most significant bits, and the suffix
   stores the k least significant bits or, in the ESC case, the whole
   number.
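   The prefix and suffix structure just described can be exercised with
   a small runnable Python sketch.  The helper names "read_bits" and
   "bit_string", the use of an iterator of 0/1 integers as the bit
   source, and passing the ESC suffix length "bits" as an explicit
   argument are assumptions made for illustration; the checks in the
   usage note reuse the rows of Table 3 with an 8-bit ESC suffix.

```python
def read_bits(bit_iter, n):
    """Read n bits, MSB first, from an iterator of 0/1 integers."""
    value = 0
    for _ in range(n):
        value = (value << 1) | next(bit_iter)
    return value


def get_ur_golomb(bit_iter, k, bits):
    """Decode one unsigned code; 'bits' is the length of the ESC suffix."""
    for prefix in range(12):
        if next(bit_iter):                      # a 1 bit ends the prefix
            return read_bits(bit_iter, k) + (prefix << k)
    return read_bits(bit_iter, bits) + 11       # ESC: twelve 0 bits were read


def get_sr_golomb(bit_iter, k, bits):
    """Map the unsigned code to a signed value (even: >= 0, odd: < 0)."""
    v = get_ur_golomb(bit_iter, k, bits)
    return -(v >> 1) - 1 if v & 1 else v >> 1


def bit_string(s):
    """Hypothetical helper: turn a string such as '01 01' into bits."""
    return iter(int(ch) for ch in s if ch in "01")
```

   With k = 2, the bits "01 01" decode to 5 (prefix 1, suffix 01), and
   twelve 0 bits followed by "10000000" decode to 128 + 11 = 139.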
   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   int get_ur_golomb(k) {                                        |
       for (prefix = 0; prefix < 12; prefix++) {                 |
           if (get_bits(1)) {                                    |
               return get_bits(k) + (prefix << k);               |
           }                                                     |
       }                                                         |
       return get_bits(bits) + 11;                               |
   }                                                             |
                                                                 |
   int get_sr_golomb(k) {                                        |
       v = get_ur_golomb(k);                                     |
       if (v & 1) return - (v >> 1) - 1;                         |
       else       return   (v >> 1);                             |
   }                                                             |

3.8.2.1.1.  Prefix

   +----------------+-------+
   | bits           | value |
   +================+=======+
   | 1              | 0     |
   +----------------+-------+
   | 01             | 1     |
   +----------------+-------+
   | ...            | ...   |
   +----------------+-------+
   | 0000 0000 0001 | 11    |
   +----------------+-------+
   | 0000 0000 0000 | ESC   |
   +----------------+-------+

                Table 1

3.8.2.1.2.  Suffix

   +---------+--------------------------------------------------+
   +=========+==================================================+
   | non ESC | the k least significant bits MSB first           |
   +---------+--------------------------------------------------+
   | ESC     | the value - 11, in MSB first order, ESC may only |
   |         | be used if the value cannot be coded as non ESC  |
   +---------+--------------------------------------------------+

                                Table 2

3.8.2.1.3.  Examples

   +-----+-------------------------+-------+
   | k   | bits                    | value |
   +=====+=========================+=======+
   | 0   | "1"                     | 0     |
   +-----+-------------------------+-------+
   | 0   | "001"                   | 2     |
   +-----+-------------------------+-------+
   | 2   | "1 00"                  | 0     |
   +-----+-------------------------+-------+
   | 2   | "1 10"                  | 2     |
   +-----+-------------------------+-------+
   | 2   | "01 01"                 | 5     |
   +-----+-------------------------+-------+
   | any | "000000000000 10000000" | 139   |
   +-----+-------------------------+-------+

                    Table 3

3.8.2.2.  Run Mode

   Run mode is entered when the context is 0 and left as soon as a non-0
   difference is found.  The level is identical to the predicted one.
   The run and the first different level are coded.

3.8.2.2.1.  Run Length Coding

   The run value is encoded in 2 parts.  The prefix part stores the more
   significant part of the run and also adjusts the run_index, which
   determines the number of bits in the less significant part of the
   run.  The second part of the value stores the less significant part
   of the run as it is.  The run_index is reset to 0 for each "Plane"
   and slice.

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   log2_run[41]={                                                |
    0, 0, 0, 0, 1, 1, 1, 1,                                      |
    2, 2, 2, 2, 3, 3, 3, 3,                                      |
    4, 4, 5, 5, 6, 6, 7, 7,                                      |
    8, 9,10,11,12,13,14,15,                                      |
   16,17,18,19,20,21,22,23,                                      |
   24,                                                           |
   };                                                            |
                                                                 |
   if (run_count == 0 && run_mode == 1) {                        |
       if (get_bits(1)) {                                        |
           run_count = 1 << log2_run[run_index];                 |
           if (x + run_count <= w) {                             |
               run_index++;                                      |
           }                                                     |
       } else {                                                  |
           if (log2_run[run_index]) {                            |
               run_count = get_bits(log2_run[run_index]);        |
           } else {                                              |
               run_count = 0;                                    |
           }                                                     |
           if (run_index) {                                      |
               run_index--;                                      |
           }                                                     |
           run_mode = 2;                                         |
       }                                                         |
   }                                                             |

   The log2_run array is also used within [ISO.14495-1.1999].

3.8.2.2.2.  Level Coding

   Level coding is identical to the normal difference coding with the
   exception that the 0 value is removed as it cannot occur:

       diff = get_vlc_symbol(context_state);
       if (diff >= 0) {
           diff++;
       }

   Note, this is different from JPEG-LS, which doesn't use prediction in
   run mode and uses a different encoding and context model for the last
   difference.  On a small set of test "Samples", the use of prediction
   slightly improved the compression rate.

3.8.2.3.  Scalar Mode

   Each difference is coded with the per context mean prediction removed
   and a per context value for k.
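   The per context value for k is not stored; it is recomputed from the
   context's running "count" and "error_sum" each time a symbol is
   coded, as the first loop of get_vlc_symbol does.  The Python sketch
   below isolates that step.  The function name "golomb_parameter" is
   hypothetical; the loop itself follows get_vlc_symbol and yields the
   smallest k for which count * 2^k is at least error_sum.

```python
def golomb_parameter(count, error_sum):
    """Return the Golomb Rice parameter k for one context state."""
    k = 0
    i = count
    while i < error_sum:   # double i until it reaches error_sum
        k += 1
        i += i
    return k


# Initial context state from Section 3.8.2.4: count = 1, error_sum = 4.
initial_k = golomb_parameter(1, 4)
```

   With the initial state (count = 1, error_sum = 4), i takes the
   values 1, 2, 4, so the first symbol of a context is coded with
   k = 2.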
   get_vlc_symbol(state) {
       i = state->count;
       k = 0;
       while (i < state->error_sum) {
           k++;
           i += i;
       }

       v = get_sr_golomb(k);

       if (2 * state->drift < -state->count) {
           v = -1 - v;
       }

       ret = sign_extend(v + state->bias, bits);

       state->error_sum += abs(v);
       state->drift     += v;

       if (state->count == 128) {
           state->count     >>= 1;
           state->drift     >>= 1;
           state->error_sum >>= 1;
       }
       state->count++;
       if (state->drift <= -state->count) {
           state->bias = max(state->bias - 1, -128);

           state->drift = max(state->drift + state->count,
                              -state->count + 1);
       } else if (state->drift > 0) {
           state->bias = min(state->bias + 1, 127);

           state->drift = min(state->drift - state->count, 0);
       }

       return ret;
   }

3.8.2.4.  Initial Values for the VLC context state

   At keyframes all coder state variables are set to their initial
   state.

       drift     = 0;
       error_sum = 4;
       bias      = 0;
       count     = 1;

4.  Bitstream

   An FFV1 bitstream is composed of a series of 1 or more "Frames" and
   (when required) a "Configuration Record".

   Within the following sub-sections, pseudo-code is used to explain the
   structure of each FFV1 bitstream component, as described in
   Section 2.2.1.  Table 4 lists symbols used to annotate that pseudo-
   code in order to define the storage of the data referenced in that
   line of pseudo-code.
   +--------+----------------------------------------------+
   | Symbol | Definition                                   |
   +========+==============================================+
   | u(n)   | unsigned big endian integer using n bits     |
   +--------+----------------------------------------------+
   | sg     | Golomb Rice coded signed scalar symbol coded |
   |        | with the method described in Section 3.8.2   |
   +--------+----------------------------------------------+
   | br     | Range coded Boolean (1-bit) symbol with the  |
   |        | method described in Section 3.8.1.1          |
   +--------+----------------------------------------------+
   | ur     | Range coded unsigned scalar symbol coded     |
   |        | with the method described in Section 3.8.1.2 |
   +--------+----------------------------------------------+
   | sr     | Range coded signed scalar symbol coded with  |
   |        | the method described in Section 3.8.1.2      |
   +--------+----------------------------------------------+

      Table 4: Definition of pseudo-code symbols for this document.

   The same context, which is initialized to 128, is used for all fields
   in the header.

   The following MUST be provided by external means during
   initialization of the decoder:

   "frame_pixel_width" is defined as "Frame" width in "Pixels".

   "frame_pixel_height" is defined as "Frame" height in "Pixels".

   Default values at the decoder initialization phase:

   "ConfigurationRecordIsPresent" is set to 0.

4.1.  Parameters

   The "Parameters" section contains significant characteristics about
   the decoding configuration used for all instances of "Frame" (in FFV1
   version 0 and 1) or the whole FFV1 bitstream (other versions),
   including the stream version, color configuration, and quantization
   tables.  Figure 19 describes the contents of the bitstream.
   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   Parameters( ) {                                               |
       version                                                   | ur
       if (version >= 3) {                                       |
           micro_version                                         | ur
       }                                                         |
       coder_type                                                | ur
       if (coder_type > 1) {                                     |
           for (i = 1; i < 256; i++) {                           |
               state_transition_delta[ i ]                       | sr
           }                                                     |
       }                                                         |
       colorspace_type                                           | ur
       if (version >= 1) {                                       |
           bits_per_raw_sample                                   | ur
       }                                                         |
       chroma_planes                                             | br
       log2_h_chroma_subsample                                   | ur
       log2_v_chroma_subsample                                   | ur
       extra_plane                                               | br
       if (version >= 3) {                                       |
           num_h_slices - 1                                      | ur
           num_v_slices - 1                                      | ur
           quant_table_set_count                                 | ur
       }                                                         |
       for (i = 0; i < quant_table_set_count; i++) {             |
           QuantizationTableSet( i )                             |
       }                                                         |
       if (version >= 3) {                                       |
           for (i = 0; i < quant_table_set_count; i++) {         |
               states_coded                                      | br
               if (states_coded) {                               |
                   for (j = 0; j < context_count[ i ]; j++) {    |
                       for (k = 0; k < CONTEXT_SIZE; k++) {      |
                           initial_state_delta[ i ][ j ][ k ]    | sr
                       }                                         |
                   }                                             |
               }                                                 |
           }                                                     |
           ec                                                    | ur
           intra                                                 | ur
       }                                                         |
   }                                                             |

     Figure 19: A pseudo-code description of the bitstream contents.

   CONTEXT_SIZE is 32.

4.1.1.  version

   "version" specifies the version of the FFV1 bitstream.

   Each version is incompatible with other versions: decoders SHOULD
   reject a file due to an unknown version.

   Decoders SHOULD reject a file with version <= 1 &&
   ConfigurationRecordIsPresent == 1.

   Decoders SHOULD reject a file with version >= 3 &&
   ConfigurationRecordIsPresent == 0.
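   The three rejection rules above can be combined into a single
   predicate, sketched here in Python.  The names "KNOWN_VERSIONS" and
   "should_reject" are hypothetical, and treating the reserved value 2
   as an unknown version is an assumption of this sketch.  Because the
   rules are stated with SHOULD, a decoder may have reasons to be more
   lenient than this predicate.

```python
# Versions with an assigned meaning in Table 5; 2 and all others are reserved.
KNOWN_VERSIONS = {0, 1, 3}


def should_reject(version, configuration_record_is_present):
    """Return True if the version rules recommend rejecting the file."""
    if version not in KNOWN_VERSIONS:
        return True    # unknown (or reserved) version
    if version <= 1 and configuration_record_is_present:
        return True    # versions 0 and 1 carry "Parameters" in each keyframe
    if version >= 3 and not configuration_record_is_present:
        return True    # version 3 requires a "Configuration Record"
    return False
```

   For instance, a version 3 bitstream found without a "Configuration
   Record" is rejected, while a version 1 bitstream without one is
   accepted.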
   +-------+-------------------------+
   | value | version                 |
   +=======+=========================+
   | 0     | FFV1 version 0          |
   +-------+-------------------------+
   | 1     | FFV1 version 1          |
   +-------+-------------------------+
   | 2     | reserved*               |
   +-------+-------------------------+
   | 3     | FFV1 version 3          |
   +-------+-------------------------+
   | Other | reserved for future use |
   +-------+-------------------------+

                 Table 5

   * Version 2 was never enabled in the encoder; thus, version 2 files
   SHOULD NOT exist, and this document does not describe them, to keep
   the text simpler.

4.1.2.  micro_version

   "micro_version" specifies the micro-version of the FFV1 bitstream.

   After a version is considered stable (a micro-version value is
   assigned to be the first stable variant of a specific version), each
   new micro-version after this first stable variant is compatible with
   the previous micro-version: decoders SHOULD NOT reject a file due to
   an unknown micro-version equal to or above the micro-version
   considered as stable.

   Meaning of micro_version for version 3:

   +-------+-------------------------+
   | value | micro_version           |
   +=======+=========================+
   | 0...3 | reserved*               |
   +-------+-------------------------+
   | 4     | first stable variant    |
   +-------+-------------------------+
   | Other | reserved for future use |
   +-------+-------------------------+

     Table 6: The definitions for
         micro_version values.

   * Development versions may be incompatible with the stable variants.

4.1.3.  coder_type

   "coder_type" specifies the coder used.
   +-------+-------------------------------------------------+
   | value | coder used                                      |
   +=======+=================================================+
   | 0     | Golomb Rice                                     |
   +-------+-------------------------------------------------+
   | 1     | Range Coder with default state transition table |
   +-------+-------------------------------------------------+
   | 2     | Range Coder with custom state transition table  |
   +-------+-------------------------------------------------+
   | Other | reserved for future use                         |
   +-------+-------------------------------------------------+

                                Table 7

4.1.4.  state_transition_delta

   "state_transition_delta" specifies the Range coder custom state
   transition table.

   If state_transition_delta is not present in the FFV1 bitstream, all
   Range coder custom state transition table elements are assumed to be
   0.

4.1.5.  colorspace_type

   "colorspace_type" specifies the color space encoded, the pixel
   transformation used by the encoder, the extra plane content, as well
   as interleave method.
   +-------+-------------+----------------+--------------+-------------+
   | value | color space | pixel          | extra plane  | interleave  |
   |       | encoded     | transformation | content      | method      |
   +=======+=============+================+==============+=============+
   | 0     | YCbCr       | None           | Transparency | "Plane"     |
   |       |             |                |              | then        |
   |       |             |                |              | "Line"      |
   +-------+-------------+----------------+--------------+-------------+
   | 1     | RGB         | JPEG2000-RCT   | Transparency | "Line"      |
   |       |             |                |              | then        |
   |       |             |                |              | "Plane"     |
   +-------+-------------+----------------+--------------+-------------+
   | Other | reserved    | reserved for   | reserved for | reserved    |
   |       | for future  | future use     | future use   | for future  |
   |       | use         |                |              | use         |
   +-------+-------------+----------------+--------------+-------------+

                                  Table 8

   Restrictions:

   If "colorspace_type" is 1, then "chroma_planes" MUST be 1,
   "log2_h_chroma_subsample" MUST be 0, and "log2_v_chroma_subsample"
   MUST be 0.

4.1.6.  chroma_planes

   "chroma_planes" indicates if chroma (color) "Planes" are present.

   +-------+---------------------------------+
   | value | presence                        |
   +=======+=================================+
   | 0     | chroma "Planes" are not present |
   +-------+---------------------------------+
   | 1     | chroma "Planes" are present     |
   +-------+---------------------------------+

                     Table 9

4.1.7.  bits_per_raw_sample

   "bits_per_raw_sample" indicates the number of bits for each "Sample".
   Inferred to be 8 if not present.
   +-------+-----------------------------------+
   | value | bits for each sample              |
   +=======+===================================+
   | 0     | reserved*                         |
   +-------+-----------------------------------+
   | Other | the actual bits for each "Sample" |
   +-------+-----------------------------------+

                      Table 10

   * Encoders MUST NOT store bits_per_raw_sample = 0.  Decoders SHOULD
   accept and interpret bits_per_raw_sample = 0 as 8.

4.1.8.  log2_h_chroma_subsample

   "log2_h_chroma_subsample" indicates the subsample factor, stored as
   the power to which the number 2 must be raised, between luma and
   chroma width ("chroma_width = 2^(-log2_h_chroma_subsample) *
   luma_width").

4.1.9.  log2_v_chroma_subsample

   "log2_v_chroma_subsample" indicates the subsample factor, stored as
   the power to which the number 2 must be raised, between luma and
   chroma height ("chroma_height = 2^(-log2_v_chroma_subsample) *
   luma_height").

4.1.10.  extra_plane

   "extra_plane" indicates if an extra "Plane" is present.

   +-------+------------------------------+
   | value | presence                     |
   +=======+==============================+
   | 0     | extra "Plane" is not present |
   +-------+------------------------------+
   | 1     | extra "Plane" is present     |
   +-------+------------------------------+

                   Table 11

4.1.11.  num_h_slices

   "num_h_slices" indicates the number of horizontal elements of the
   slice raster.

   Inferred to be 1 if not present.

4.1.12.  num_v_slices

   "num_v_slices" indicates the number of vertical elements of the slice
   raster.

   Inferred to be 1 if not present.

4.1.13.  quant_table_set_count

   "quant_table_set_count" indicates the number of Quantization
   Table Sets. "quant_table_set_count" MUST be less than or equal to 8.

   Inferred to be 1 if not present.

   MUST NOT be 0.

4.1.14.  states_coded

   "states_coded" indicates if the respective Quantization Table Set has
   the initial states coded.

   Inferred to be 0 if not present.

   +-------+--------------------------------+
   | value | initial states                 |
   +=======+================================+
   | 0     | initial states are not present |
   |       | and are assumed to be all 128  |
   +-------+--------------------------------+
   | 1     | initial states are present     |
   +-------+--------------------------------+

                    Table 12

4.1.15.  initial_state_delta

   "initial_state_delta[ i ][ j ][ k ]" indicates the initial Range
   coder state.  It is encoded using "k" as context index and

   pred = j ? initial_states[ i ][j - 1][ k ] : 128

                                Figure 20

   initial_state[ i ][ j ][ k ] =
          ( pred + initial_state_delta[ i ][ j ][ k ] ) & 255

                                Figure 21

4.1.16.  ec

   "ec" indicates the error detection/correction type.

   +-------+--------------------------------------------+
   | value | error detection/correction type            |
   +=======+============================================+
   | 0     | 32-bit CRC on the global header            |
   +-------+--------------------------------------------+
   | 1     | 32-bit CRC per slice and the global header |
   +-------+--------------------------------------------+
   | Other | reserved for future use                    |
   +-------+--------------------------------------------+

                          Table 13

4.1.17.  intra

   "intra" indicates the relationship between the instances of "Frame".

   Inferred to be 0 if not present.
   +-------+-------------------------------------+
   | value | relationship                        |
   +=======+=====================================+
   | 0     | Frames are independent or dependent |
   |       | (keyframes and non keyframes)       |
   +-------+-------------------------------------+
   | 1     | Frames are independent (keyframes   |
   |       | only)                               |
   +-------+-------------------------------------+
   | Other | reserved for future use             |
   +-------+-------------------------------------+

                       Table 14

4.2.  Configuration Record

   In the case of an FFV1 bitstream with "version >= 3", a
   "Configuration Record" is stored in the underlying "Container", at
   the track header level.  It contains the "Parameters" used for all
   instances of "Frame".  The size of the "Configuration Record",
   "NumBytes", is supplied by the underlying "Container".

   pseudo-code                                                | type
   -----------------------------------------------------------|-----
   ConfigurationRecord( NumBytes ) {                          |
       ConfigurationRecordIsPresent = 1                       |
       Parameters( )                                          |
       while (remaining_symbols_in_syntax(NumBytes - 4)) {    |
           reserved_for_future_use                            | br/ur/sr
       }                                                      |
       configuration_record_crc_parity                        | u(32)
   }                                                          |

4.2.1.  reserved_for_future_use

   "reserved_for_future_use" has semantics that are reserved for future
   use.

   Encoders conforming to this version of this specification SHALL NOT
   write this value.

   Decoders conforming to this version of this specification SHALL
   ignore its value.

4.2.2.  configuration_record_crc_parity

   "configuration_record_crc_parity" is 32 bits that are chosen so that
   the "Configuration Record" as a whole has a CRC remainder of 0.

   This is equivalent to storing the CRC remainder in the 32-bit parity.

   The CRC generator polynomial used is the standard IEEE CRC polynomial
   (0x104C11DB7) with initial value 0.

4.2.3.  Mapping FFV1 into Containers

   This "Configuration Record" can be placed in any file format
   supporting "Configuration Records", fitting as closely as possible
   with how the file format stores "Configuration Records".  The
   "Configuration Record" storage place and "NumBytes" are currently
   defined and supported by this version of this specification for the
   following formats:

4.2.3.1.  AVI File Format

   The "Configuration Record" extends the stream format chunk ("AVI ",
   "hdlr", "strl", "strf") with the ConfigurationRecord bitstream.

   See [AVI] for more information about chunks.

   "NumBytes" is defined as the size, in bytes, of the strf chunk
   indicated in the chunk header minus the size of the stream format
   structure.

4.2.3.2.  ISO Base Media File Format

   The "Configuration Record" extends the sample description box
   ("moov", "trak", "mdia", "minf", "stbl", "stsd") with a "glbl" box
   that contains the ConfigurationRecord bitstream.  See
   [ISO.14496-12.2015] for more information about boxes.

   "NumBytes" is defined as the size, in bytes, of the "glbl" box
   indicated in the box header minus the size of the box header.

4.2.3.3.  NUT File Format

   The codec_specific_data element (in "stream_header" packet) contains
   the ConfigurationRecord bitstream.  See [NUT] for more information
   about elements.

   "NumBytes" is defined as the size, in bytes, of the
   codec_specific_data element as indicated in the "length" field of
   codec_specific_data.

4.2.3.4.  Matroska File Format

   FFV1 SHOULD use "V_FFV1" as the Matroska "Codec ID".  For FFV1
   versions 2 or less, the Matroska "CodecPrivate" Element SHOULD NOT be
   used.  For FFV1 versions 3 or greater, the Matroska "CodecPrivate"
   Element MUST contain the FFV1 "Configuration Record" structure and no
   other data.  See [Matroska] for more information about elements.
   "NumBytes" is defined as the "Element Data Size" of the
   "CodecPrivate" Element.

4.3.  Frame

   A "Frame" is an encoded representation of a complete static image.
   The whole "Frame" is provided by the underlying container.

   A "Frame" consists of the keyframe field, "Parameters" (if version
   <= 1), and a sequence of independent slices.  The pseudo-code below
   describes the contents of a "Frame".

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   Frame( NumBytes ) {                                           |
       keyframe                                                  | br
       if (keyframe && !ConfigurationRecordIsPresent) {          |
           Parameters( )                                         |
       }                                                         |
       while (remaining_bits_in_bitstream( NumBytes )) {         |
           Slice( )                                              |
       }                                                         |
   }                                                             |

   Architecture overview of slices in a "Frame":

   +-----------------------------------------------------------------+
   +=================================================================+
   | first slice header                                              |
   +-----------------------------------------------------------------+
   | first slice content                                             |
   +-----------------------------------------------------------------+
   | first slice footer                                              |
   +-----------------------------------------------------------------+
   | --------------------------------------------------------------- |
   +-----------------------------------------------------------------+
   | second slice header                                             |
   +-----------------------------------------------------------------+
   | second slice content                                            |
   +-----------------------------------------------------------------+
   | second slice footer                                             |
   +-----------------------------------------------------------------+
   | --------------------------------------------------------------- |
   +-----------------------------------------------------------------+
   | ...                                                             |
   +-----------------------------------------------------------------+
   | --------------------------------------------------------------- |
   +-----------------------------------------------------------------+
   | last slice header                                               |
   +-----------------------------------------------------------------+
   | last slice content                                              |
   +-----------------------------------------------------------------+
   | last slice footer                                               |
   +-----------------------------------------------------------------+

                                 Table 15

4.4.  Slice

   A "Slice" is an independent spatial sub-section of a "Frame" that is
   encoded separately from any other region of the same "Frame".  The
   use of more than one "Slice" per "Frame" can be useful for taking
   advantage of the opportunities of multithreaded encoding and
   decoding.

   A "Slice" consists of a "Slice Header" (when relevant), a "Slice
   Content", and a "Slice Footer" (when relevant).  The pseudo-code
   below describes the contents of a "Slice".

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   Slice( ) {                                                    |
       if (version >= 3) {                                       |
           SliceHeader( )                                        |
       }                                                         |
       SliceContent( )                                           |
       if (coder_type == 0) {                                    |
           while (!byte_aligned()) {                             |
               padding                                           | u(1)
           }                                                     |
       }                                                         |
       if (version <= 1) {                                       |
           while (remaining_bits_in_bitstream( NumBytes ) != 0) {|
               reserved                                          | u(1)
           }                                                     |
       }                                                         |
       if (version >= 3) {                                       |
           SliceFooter( )                                        |
       }                                                         |
   }                                                             |

   "padding" specifies a bit without any significance and used only for
   byte alignment.  MUST be 0.

   "reserved" specifies a bit without any significance in this revision
   of the specification and may have a significance in a later revision
   of this specification.

   Encoders SHOULD NOT fill these bits.

   Decoders SHOULD ignore these bits.
   Note in case these bits are used in a later revision of this
   specification: any revision of this specification SHOULD avoid adding
   40 bits of content after "SliceContent" for version 0 and 1 of the
   bitstream.  Background: Due to some non-conforming encoders, some
   bitstreams were found with 40 extra bits corresponding to
   "error_status" and "slice_crc_parity".  If a later revision added 40
   bits there, a decoder conforming to the revised specification could
   not distinguish between a revised bitstream and a buggy bitstream.

4.5.  Slice Header

   A "Slice Header" provides information about the decoding
   configuration of the "Slice", such as its spatial position, size, and
   aspect ratio.  The pseudo-code below describes the contents of the
   "Slice Header".

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   SliceHeader( ) {                                              |
       slice_x                                                   | ur
       slice_y                                                   | ur
       slice_width - 1                                           | ur
       slice_height - 1                                          | ur
       for (i = 0; i < quant_table_set_index_count; i++) {       |
           quant_table_set_index[ i ]                            | ur
       }                                                         |
       picture_structure                                         | ur
       sar_num                                                   | ur
       sar_den                                                   | ur
   }                                                             |

4.5.1.  slice_x

   "slice_x" indicates the x position on the slice raster formed by
   num_h_slices.

   Inferred to be 0 if not present.

4.5.2.  slice_y

   "slice_y" indicates the y position on the slice raster formed by
   num_v_slices.

   Inferred to be 0 if not present.

4.5.3.  slice_width

   "slice_width" indicates the width on the slice raster formed by
   num_h_slices.

   Inferred to be 1 if not present.

4.5.4.  slice_height

   "slice_height" indicates the height on the slice raster formed by
   num_v_slices.

   Inferred to be 1 if not present.

4.5.5.  quant_table_set_index_count

   "quant_table_set_index_count" is defined as "1 + ( ( chroma_planes ||
   version <= 3 ) ? 1 : 0 ) + ( extra_plane ? 1 : 0 )".

4.5.6.  quant_table_set_index

   "quant_table_set_index" indicates the Quantization Table Set index to
   select the Quantization Table Set and the initial states for the
   slice.

   Inferred to be 0 if not present.

4.5.7.  picture_structure

   "picture_structure" specifies the temporal and spatial relationship
   of each "Line" of the "Frame".

   Inferred to be 0 if not present.

   +-------+-------------------------+
   | value | picture structure used  |
   +=======+=========================+
   | 0     | unknown                 |
   +-------+-------------------------+
   | 1     | top field first         |
   +-------+-------------------------+
   | 2     | bottom field first      |
   +-------+-------------------------+
   | 3     | progressive             |
   +-------+-------------------------+
   | Other | reserved for future use |
   +-------+-------------------------+

                 Table 16

4.5.8.  sar_num

   "sar_num" specifies the "Sample" aspect ratio numerator.

   Inferred to be 0 if not present.

   A value of 0 means that aspect ratio is unknown.

   Encoders MUST write 0 if "Sample" aspect ratio is unknown.

   If "sar_den" is 0, decoders SHOULD ignore the encoded value and
   consider that "sar_num" is 0.

4.5.9.  sar_den

   "sar_den" specifies the "Sample" aspect ratio denominator.

   Inferred to be 0 if not present.

   A value of 0 means that aspect ratio is unknown.

   Encoders MUST write 0 if "Sample" aspect ratio is unknown.

   If "sar_num" is 0, decoders SHOULD ignore the encoded value and
   consider that "sar_den" is 0.

4.6.  Slice Content

   A "Slice Content" contains all "Line" elements that are part of the
   "Slice".

   Depending on the configuration, "Line" elements are ordered by
   "Plane" then by row (YCbCr) or by row then by "Plane" (RGB).
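   The two orderings can be made concrete by enumerating the (plane,
   row) pairs in the order in which their "Line" elements are stored.
   The Python sketch below does this for a tiny slice.  The function
   name "line_order" and its arguments are hypothetical; plane heights
   are passed in directly rather than derived from the chroma
   subsampling parameters.

```python
def line_order(colorspace_type, plane_heights, slice_pixel_height):
    """List (plane, row) pairs in storage order for one slice."""
    if colorspace_type == 0:
        # YCbCr: all rows of plane 0, then all rows of plane 1, and so on.
        return [(p, y)
                for p, height in enumerate(plane_heights)
                for y in range(height)]
    if colorspace_type == 1:
        # RGB: for each row, one "Line" per plane.
        return [(p, y)
                for y in range(slice_pixel_height)
                for p in range(len(plane_heights))]
    raise ValueError("colorspace_type is reserved for future use")


# A 2-row slice with vertically subsampled chroma (toy values).
ycbcr_order = line_order(0, [2, 1, 1], 2)
# A 2-row slice with three full-height planes (toy values).
rgb_order = line_order(1, [2, 2, 2], 2)
```

   In the YCbCr case, a decoder therefore finishes a whole "Plane"
   before starting the next one, while in the RGB case it needs one row
   of every "Plane" before moving to the next row.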

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   SliceContent( ) {                                             |
       if (colorspace_type == 0) {                               |
           for (p = 0; p < primary_color_count; p++) {           |
               for (y = 0; y < plane_pixel_height[ p ]; y++) {   |
                   Line( p, y )                                  |
               }                                                 |
           }                                                     |
       } else if (colorspace_type == 1) {                        |
           for (y = 0; y < slice_pixel_height; y++) {            |
               for (p = 0; p < primary_color_count; p++) {       |
                   Line( p, y )                                  |
               }                                                 |
           }                                                     |
       }                                                         |
   }                                                             |

4.6.1.  primary_color_count

   "primary_color_count" is defined as "1 + ( chroma_planes ? 2 : 0 ) +
   ( extra_plane ? 1 : 0 )".

4.6.2.  plane_pixel_height

   "plane_pixel_height[ p ]" is the height in "Pixels" of "Plane" p of
   the slice.

   The value of "plane_pixel_height[ 0 ]" and of "plane_pixel_height[ 1
   + ( chroma_planes ? 2 : 0 ) ]" is "slice_pixel_height".

   If "chroma_planes" is set to 1, the value of "plane_pixel_height[ 1
   ]" and of "plane_pixel_height[ 2 ]" is "ceil( slice_pixel_height / (1
   << log2_v_chroma_subsample) )".

4.6.3.  slice_pixel_height

   "slice_pixel_height" is the height in "Pixels" of the slice.

   Its value is "floor( ( slice_y + slice_height ) * frame_pixel_height
   / num_v_slices ) - slice_pixel_y".

4.6.4.  slice_pixel_y

   "slice_pixel_y" is the slice vertical position in "Pixels".

   Its value is "floor( slice_y * frame_pixel_height / num_v_slices )".

4.7.  Line

   A "Line" is a list of the sample differences (relative to the
   predictor) of primary color components. The pseudo-code below
   describes the contents of the "Line".

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   Line( p, y ) {                                                |
       if (colorspace_type == 0) {                               |
           for (x = 0; x < plane_pixel_width[ p ]; x++) {        |
               sample_difference[ p ][ y ][ x ]                  |
           }                                                     |
       } else if (colorspace_type == 1) {                        |
           for (x = 0; x < slice_pixel_width; x++) {             |
               sample_difference[ p ][ y ][ x ]                  |
           }                                                     |
       }                                                         |
   }                                                             |

4.7.1.  plane_pixel_width

   "plane_pixel_width[ p ]" is the width in "Pixels" of "Plane" p of the
   slice.

   The value of "plane_pixel_width[ 0 ]" and of "plane_pixel_width[ 1 +
   ( chroma_planes ? 2 : 0 ) ]" is "slice_pixel_width".

   If "chroma_planes" is set to 1, the value of "plane_pixel_width[ 1 ]"
   and of "plane_pixel_width[ 2 ]" is "ceil( slice_pixel_width / (1 <<
   log2_h_chroma_subsample) )".

4.7.2.  slice_pixel_width

   "slice_pixel_width" is the width in "Pixels" of the slice.

   Its value is "floor( ( slice_x + slice_width ) * frame_pixel_width /
   num_h_slices ) - slice_pixel_x".

4.7.3.  slice_pixel_x

   "slice_pixel_x" is the slice horizontal position in "Pixels".

   Its value is "floor( slice_x * frame_pixel_width / num_h_slices )".

4.7.4.  sample_difference

   "sample_difference[ p ][ y ][ x ]" is the sample difference for the
   "Sample" at "Plane" "p", y position "y", and x position "x". The
   "Sample" value is computed based on the median predictor and the
   context described in Section 3.2.

4.8.  Slice Footer

   A "Slice Footer" provides information about slice size and
   (optionally) parity. The pseudo-code below describes the contents of
   the "Slice Footer".

   Note: "Slice Footer" is always byte aligned.

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   SliceFooter( ) {                                              |
       slice_size                                                | u(24)
       if (ec) {                                                 |
           error_status                                          | u(8)
           slice_crc_parity                                      | u(32)
       }                                                         |
   }                                                             |

4.8.1.  slice_size

   "slice_size" indicates the size of the slice in bytes.

   Note: this allows finding the start of slices before previous slices
   have been fully decoded, and allows parallel decoding as well as
   error resilience.

4.8.2.  error_status

   "error_status" specifies the error status.

   +-------+---------------------------------------+
   | value | error status                          |
   +=======+=======================================+
   | 0     | no error                              |
   +-------+---------------------------------------+
   | 1     | slice contains a correctable error    |
   +-------+---------------------------------------+
   | 2     | slice contains an uncorrectable error |
   +-------+---------------------------------------+
   | Other | reserved for future use               |
   +-------+---------------------------------------+

                        Table 17

4.8.3.  slice_crc_parity

   "slice_crc_parity" is 32 bits that are chosen so that the slice as a
   whole has a CRC remainder of 0.

   This is equivalent to storing the CRC remainder in the 32-bit parity.

   The CRC generator polynomial used is the standard IEEE CRC polynomial
   (0x104C11DB7), with initial value 0, without pre-inversion and
   without post-inversion.

4.9.  Quantization Table Set

   The Quantization Table Sets are stored by storing, for each run of
   equal entries in the first half of the table, the length of the run
   minus 1 (represented as "len - 1" in the pseudo-code below) using the
   method described in Section 3.8.1.2. The second half does not need to
   be stored, as it is identical to the first with flipped sign. "scale"
   and "len_count[ i ][ j ]" are temporary values used for the computing
   of "context_count[ i ]" and are not used outside the Quantization
   Table Set pseudo-code.
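
   As a non-normative illustration, the run-length storage described
   above can be sketched in C. The function names
   "runs_from_half_table" and "table_from_runs", the generic half-size
   "n" (128 in FFV1), and the fixed scale of 1 are assumptions of this
   sketch. The coding of each "len - 1" value with the method of
   Section 3.8.1.2 is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: for each run of equal entries in half[0..n-1],
 * writes "run length - 1" to stored[] and returns the number of runs. */
size_t runs_from_half_table(const int *half, size_t n, int *stored)
{
    size_t runs = 0;
    size_t start = 0;

    for (size_t k = 1; k <= n; k++) {
        /* A run ends at the end of the half or when the value changes. */
        if (k == n || half[k] != half[start]) {
            stored[runs++] = (int)(k - start) - 1;
            start = k;
        }
    }
    return runs;
}

/* Hypothetical helper: rebuilds a full table of 2 * n entries from the
 * stored runs, mirroring the QuantizationTable pseudo-code with
 * scale = 1 and n in place of 128.
 * Returns 0 on success, -1 if the runs do not cover exactly n entries. */
int table_from_runs(const int *stored, size_t runs, size_t n, int *table)
{
    size_t k = 0;
    int v = 0;

    assert(n > 0);
    for (size_t r = 0; r < runs; r++, v++) {
        if (stored[r] < 0)
            return -1;
        size_t len = (size_t)stored[r] + 1;
        if (len > n - k)
            return -1;          /* run would overrun the first half */
        for (size_t a = 0; a < len; a++)
            table[k++] = v;
    }
    if (k != n)
        return -1;              /* runs do not fill the first half */

    /* The second half is the first half with flipped sign. */
    for (k = 1; k < n; k++)
        table[2 * n - k] = -table[k];
    table[n] = -table[n - 1];
    return 0;
}
```

   With n = 8 and a first half of 0 0 1 1 1 1 2 2, the first function
   yields the runs 1, 3, 1, and the second rebuilds the second half as
   -2 -2 -2 -1 -1 -1 -1 0. Rejecting runs that overrun the first half
   is a design choice of the sketch, made so that a malformed list of
   runs cannot write past the table.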

   Example:

   Table: 0 0 1 1 1 1 2 2 -2 -2 -2 -1 -1 -1 -1 0

   Stored values: 1, 3, 1

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   QuantizationTableSet( i ) {                                   |
       scale = 1                                                 |
       for (j = 0; j < MAX_CONTEXT_INPUTS; j++) {                |
           QuantizationTable( i, j, scale )                      |
           scale *= 2 * len_count[ i ][ j ] - 1                  |
       }                                                         |
       context_count[ i ] = ceil( scale / 2 )                    |
   }                                                             |

   MAX_CONTEXT_INPUTS is 5.

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   QuantizationTable(i, j, scale) {                              |
       v = 0                                                     |
       for (k = 0; k < 128;) {                                   |
           len - 1                                               | ur
           for (a = 0; a < len; a++) {                           |
               quant_tables[ i ][ j ][ k ] = scale * v           |
               k++                                               |
           }                                                     |
           v++                                                   |
       }                                                         |
       for (k = 1; k < 128; k++) {                               |
           quant_tables[ i ][ j ][ 256 - k ] = \                 |
               -quant_tables[ i ][ j ][ k ]                      |
       }                                                         |
       quant_tables[ i ][ j ][ 128 ] = \                         |
           -quant_tables[ i ][ j ][ 127 ]                        |
       len_count[ i ][ j ] = v                                   |
   }                                                             |

4.9.1.  quant_tables

   "quant_tables[ i ][ j ][ k ]" indicates the quantization table value
   of the Quantized Sample Difference "k" of the Quantization Table "j"
   of the Quantization Table Set "i".

4.9.2.  context_count

   "context_count[ i ]" indicates the count of contexts for Quantization
   Table Set "i". "context_count[ i ]" MUST be less than or equal to
   32768.

5.  Restrictions

   To ensure that fast multithreaded decoding is possible, starting with
   version 3 and if "frame_pixel_width * frame_pixel_height" is more
   than 101376, "slice_width * slice_height" MUST be less than or equal
   to "num_h_slices * num_v_slices / 4". Note: 101376 is the frame size
   in "Pixels" of a 352x288 frame, also known as the CIF ("Common
   Intermediate Format") frame size.
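
   As a non-normative illustration, the size restriction above can be
   checked as sketched below. The function name "slice_size_allowed",
   the constant name "FFV1_CIF_PIXELS", and the choice of integer types
   are assumptions of this sketch. Because "slice_width * slice_height"
   is an integer, comparing it against the integer quotient gives the
   same result as comparing it against the exact quotient.

```c
#include <stdbool.h>
#include <stdint.h>

/* 352 * 288, the CIF frame size named in the restriction. */
#define FFV1_CIF_PIXELS 101376u

/* Hypothetical helper: returns true if a slice of slice_width by
 * slice_height raster positions is permitted on a raster of
 * num_h_slices by num_v_slices positions. */
bool slice_size_allowed(unsigned version,
                        uint32_t frame_pixel_width,
                        uint32_t frame_pixel_height,
                        uint32_t slice_width, uint32_t slice_height,
                        uint32_t num_h_slices, uint32_t num_v_slices)
{
    /* 64-bit products cannot overflow for 32-bit inputs. */
    uint64_t frame_pixels = (uint64_t)frame_pixel_width * frame_pixel_height;

    /* The restriction applies only from version 3 on and only to
     * frames larger than CIF. */
    if (version < 3 || frame_pixels <= FFV1_CIF_PIXELS)
        return true;

    uint64_t slice_cells  = (uint64_t)slice_width * slice_height;
    uint64_t raster_cells = (uint64_t)num_h_slices * num_v_slices;

    return slice_cells <= raster_cells / 4;
}
```

   For a 1920x1080 version 3 frame on a 2x2 slice raster, a slice
   covering one raster position passes the check, while a slice
   covering two positions does not.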

   For each "Frame", each position in the slice raster MUST be filled by
   one and only one slice of the "Frame" (no missing slice position, no
   slice overlapping).

   For each "Frame" with a keyframe value of 0, each slice MUST have the
   same values of "slice_x", "slice_y", "slice_width", and
   "slice_height" as a slice in the previous "Frame".

6.  Security Considerations

   Like any other codec (such as [RFC6716]), FFV1 should not be used
   with insecure ciphers or cipher modes that are vulnerable to
   known-plaintext attacks. Some of the header bits as well as the
   padding are easily predictable.

   Implementations of the FFV1 codec need to take appropriate security
   considerations into account, as outlined in [RFC4732]. It is
   extremely important for the decoder to be robust against malicious
   payloads. Malicious payloads must not cause the decoder to overrun
   its allocated memory or to take an excessive amount of resources to
   decode. The same applies to the encoder, even though problems in
   encoders are typically rarer. Malicious video streams must not cause
   the encoder to misbehave because this would allow an attacker to
   attack transcoding gateways. A frequent security problem in image
   and video codecs is failing to check for integer overflows in "Pixel"
   count computations, that is, allocating width * height without
   considering that the multiplication result may have overflowed the
   range of the arithmetic type. The range coder could, if implemented
   naively, read one byte over the end. The implementation must ensure
   that no read outside allocated and initialized memory occurs.

   The reference implementation [REFIMPL] contains no known buffer
   overflows or cases where a specially crafted packet or video segment
   could cause a significant increase in CPU load.
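
   As a non-normative illustration of the integer overflow concern
   raised above for "Pixel" count computations, an allocation can be
   guarded as sketched below. The function names "checked_mul" and
   "alloc_plane" and the parameter "sample_size" are assumptions of
   this sketch and do not come from the reference implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: stores a * b in *out and returns true, or
 * returns false if the product does not fit in size_t. */
bool checked_mul(size_t a, size_t b, size_t *out)
{
    if (b != 0 && a > SIZE_MAX / b)
        return false;           /* a * b would overflow */
    *out = a * b;
    return true;
}

/* Hypothetical helper: allocates width * height samples of
 * sample_size bytes each. Returns NULL for empty dimensions, on
 * overflow of the size computation, or if malloc() fails. */
void *alloc_plane(size_t width, size_t height, size_t sample_size)
{
    size_t pixels, bytes;

    if (width == 0 || height == 0 || sample_size == 0)
        return NULL;
    if (!checked_mul(width, height, &pixels))
        return NULL;
    if (!checked_mul(pixels, sample_size, &bytes))
        return NULL;
    return malloc(bytes);
}
```

   Checking before multiplying, rather than inspecting the product
   afterwards, keeps the test independent of how the wrapped result
   would look. A real decoder would typically also bound width and
   height by limits of its own before reaching this point.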

   The reference implementation [REFIMPL] was validated in the following
   conditions:

   *  Sending the decoder valid packets generated by the reference
      encoder and verifying that the decoder's output matches the
      encoder's input.

   *  Sending the decoder packets generated by the reference encoder and
      then subjected to random corruption.

   *  Sending the decoder random packets that are not FFV1.

   In all of the conditions above, the decoder and encoder were run
   inside the [VALGRIND] memory debugger as well as Clang's
   AddressSanitizer [Address-Sanitizer], which track reads and writes to
   invalid memory regions as well as the use of uninitialized memory.
   There were no errors reported on any of the tested conditions.

7.  Media Type Definition

   This registration is done using the template defined in [RFC6838] and
   following [RFC4855].

   Type name: video

   Subtype name: FFV1

   Required parameters: None.

   Optional parameters:

   These parameters are used to signal the capabilities of a receiver
   implementation. These parameters MUST NOT be used for any other
   purpose.

   version: The version of the FFV1 encoding as defined by
   Section 4.1.1.

   micro_version: The micro_version of the FFV1 encoding as defined by
   Section 4.1.2.

   coder_type: The coder_type of the FFV1 encoding as defined by
   Section 4.1.3.

   colorspace_type: The colorspace_type of the FFV1 encoding as defined
   by Section 4.1.5.

   bits_per_raw_sample: The bits_per_raw_sample of the FFV1 encoding as
   defined by Section 4.1.7.

   max-slices: The value of max-slices is an integer indicating the
   maximum count of slices within a frame of the FFV1 encoding.

   Encoding considerations:

   This media type is defined for encapsulation in several audiovisual
   container formats and contains binary data; see Section 4.2.3. This
   media type is framed binary data; see Section 4.8 of [RFC6838].

   Security considerations:

   See Section 6 of this document.

   Interoperability considerations: None.

   Published specification:

   [I-D.ietf-cellar-ffv1] and RFC XXXX.

   [RFC Editor: Upon publication as an RFC, please replace "XXXX" with
   the number assigned to this document and remove this note.]

   Applications which use this media type:

   Any application that requires the transport of lossless video can use
   this media type. Examples include, but are not limited to, screen
   recording, scientific imaging, and digital video preservation.

   Fragment identifier considerations: N/A.

   Additional information: None.

   Person & email address to contact for further information: Michael
   Niedermayer (michael@niedermayer.cc)

   Intended usage: COMMON

   Restrictions on usage: None.

   Author: Dave Rice (dave@dericed.com)

   Change controller: IETF cellar working group delegated from the IESG.

8.  IANA Considerations

   The IANA is requested to register the following values:

   *  Media type registration as described in Section 7.

9.  Appendixes

9.1.  Decoder implementation suggestions

9.1.1.  Multi-threading Support and Independence of Slices

   The FFV1 bitstream is parsable in two ways: in sequential order as
   described in this document or with the pre-analysis of the footer of
   each slice. Each slice footer contains a slice_size field, so the
   boundary of each slice is computable without having to parse the
   slice content. That allows multi-threading as well as independence
   of slice content (a bitstream error in a slice header or slice
   content has no impact on the decoding of the other slices).
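
   As a non-normative illustration, the pre-analysis of slice footers
   described above can be sketched as a backward walk over the bytes of
   one "Frame". The function names "read_u24" and "find_slice_starts"
   are hypothetical. The sketch assumes a version 3 "Frame" made up
   only of complete slices, a footer of 3 bytes (or 8 bytes when "ec"
   is set) as laid out in Section 4.8, a big-endian slice_size, and a
   slice_size value that does not count the footer itself. The last
   assumption is not stated in this document and should be verified
   against the bitstreams being handled.

```c
#include <stddef.h>
#include <stdint.h>

/* Reads a 24-bit big-endian unsigned integer. */
static uint32_t read_u24(const uint8_t *p)
{
    return ((uint32_t)p[0] << 16) | ((uint32_t)p[1] << 8) | (uint32_t)p[2];
}

/* Hypothetical helper: walks the slice footers from the end of the
 * frame toward its start and records the byte offset at which each
 * slice begins, last slice first.
 * Returns the number of slices found, or -1 if the footers are not
 * coherent or more than max_slices slices would be needed. */
int find_slice_starts(const uint8_t *frame, size_t frame_size, int ec,
                      size_t *starts, int max_slices)
{
    /* slice_size u(24), plus error_status u(8) and
     * slice_crc_parity u(32) when ec is set. */
    const size_t footer = ec ? 8 : 3;
    size_t end = frame_size;
    int count = 0;

    while (end > 0) {
        if (count == max_slices || end < footer)
            return -1;
        /* Assumption: slice_size excludes the footer bytes. */
        size_t total = (size_t)read_u24(frame + end - footer) + footer;
        if (total > end)
            return -1;          /* slice would start before the frame */
        end -= total;
        starts[count++] = end;
    }
    return count;
}
```

   Each recorded offset, together with the offset recorded before it,
   delimits one slice that can be handed to its own decoding thread. A
   return value of -1 is the situation in which a decoder might prefer
   to parse the "Frame" in sequential order instead.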

   After having checked the keyframe field, a decoder SHOULD parse the
   slice_size fields, from the slice_size of the last slice at the end
   of the "Frame" up to the slice_size of the first slice at the
   beginning of the "Frame", before parsing slices, in order to obtain
   the slice boundaries. A decoder MAY fall back to sequential order,
   e.g., in case of a corrupted "Frame" (frame size unknown, slice_size
   of slices not coherent...) or if there is no possibility of seeking
   into the stream.

10.  Changelog

   See https://github.com/FFmpeg/FFV1/commits/master

11.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4732]  Handley, M., Ed., Rescorla, E., Ed., and IAB, "Internet
              Denial-of-Service Considerations", RFC 4732,
              DOI 10.17487/RFC4732, December 2006.

   [I-D.ietf-cellar-ffv1]
              Niedermayer, M., Rice, D., and J. Martinez, "FFV1 Video
              Coding Format Version 0, 1, and 3", Work in Progress,
              Internet-Draft, draft-ietf-cellar-ffv1-11, 23 October
              2019.

   [ISO.15444-1.2016]
              International Organization for Standardization,
              "Information technology -- JPEG 2000 image coding system:
              Core coding system", October 2016.

   [ISO.9899.1990]
              International Organization for Standardization,
              "Programming languages - C", 1990.

   [RFC6838]  Freed, N., Klensin, J., and T. Hansen, "Media Type
              Specifications and Registration Procedures", BCP 13,
              RFC 6838, DOI 10.17487/RFC6838, January 2013.

   [RFC4855]  Casner, S., "Media Type Registration of RTP Payload
              Formats", RFC 4855, DOI 10.17487/RFC4855, February 2007.

   [RFC6716]  Valin, JM., Vos, K., and T. Terriberry, "Definition of the
              Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716,
              September 2012.

   [Matroska] IETF, "Matroska", 2019.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [ISO.9899.2018]
              International Organization for Standardization,
              "Programming languages - C", 2018.

12.  Informative References

   [FFV1_V1]  Niedermayer, M., "Commit to release FFV1 version 1", April
              2009.

   [ISO.14495-1.1999]
              International Organization for Standardization,
              "Information technology -- Lossless and near-lossless
              compression of continuous-tone still images: Baseline",
              December 1999.

   [Address-Sanitizer]
              The Clang Team, "ASAN AddressSanitizer website", undated.

   [AVI]      Microsoft, "AVI RIFF File Reference", undated.

   [HuffYUV]  Rudiak-Gould, B., "HuffYUV", December 2003.

   [ISO.14496-10.2014]
              International Organization for Standardization,
              "Information technology -- Coding of audio-visual objects
              -- Part 10: Advanced Video Coding", September 2014.

   [FFV1_V0]  Niedermayer, M., "Commit to mark FFV1 version 0 as non-
              experimental", April 2006.

   [FFV1_V3]  Niedermayer, M., "Commit to mark FFV1 version 3 as non-
              experimental", August 2013.

   [VALGRIND] Valgrind Developers, "Valgrind website", undated.

   [REFIMPL]  Niedermayer, M., "The reference FFV1 implementation / the
              FFV1 codec in FFmpeg", undated.

   [ISO.14496-12.2015]
              International Organization for Standardization,
              "Information technology -- Coding of audio-visual objects
              -- Part 12: ISO base media file format", December 2015.

   [range-coding]
              Martin, G. N. N., "Range encoding: an algorithm for
              removing redundancy from a digitised message", July 1979.

   [YCbCr]    Wikipedia, "YCbCr", undated.

   [NUT]      Niedermayer, M., "NUT Open Container Format", December
              2013.

Authors' Addresses

   Michael Niedermayer

   Email: michael@niedermayer.cc


   Dave Rice

   Email: dave@dericed.com


   Jerome Martinez

   Email: jerome@mediaarea.net