cellar                                                    M. Niedermayer
Internet-Draft
Intended status: Informational                                   D. Rice
Expires: 3 January 2021
                                                             J. Martinez
                                                             2 July 2020

              FFV1 Video Coding Format Version 0, 1, and 3
                       draft-ietf-cellar-ffv1-16

Abstract

   This document defines FFV1, a lossless intra-frame video encoding
   format. FFV1 is designed to efficiently compress video data in a
   variety of pixel formats. Compared to uncompressed video, FFV1
   offers storage compression, frame fixity, and self-description, which
   makes FFV1 useful as a preservation or intermediate video format.
Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 3 January 2021.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Notation and Conventions
     2.1. Definitions
     2.2. Conventions
       2.2.1. Pseudo-code
       2.2.2. Arithmetic Operators
       2.2.3. Assignment Operators
       2.2.4. Comparison Operators
       2.2.5. Mathematical Functions
       2.2.6. Order of Operation Precedence
       2.2.7. Range
       2.2.8. NumBytes
       2.2.9. Bitstream Functions
   3. Sample Coding
     3.1. Border
     3.2. Samples
     3.3. Median Predictor
     3.4. Context
     3.5. Quantization Table Sets
     3.6. Quantization Table Set Indexes
     3.7. Color spaces
       3.7.1. YCbCr
       3.7.2. RGB
     3.8. Coding of the Sample Difference
       3.8.1. Range Coding Mode
       3.8.2. Golomb Rice Mode
   4. Bitstream
     4.1. Quantization Table Set
       4.1.1. quant_tables
       4.1.2. context_count
     4.2. Parameters
       4.2.1. version
       4.2.2. micro_version
       4.2.3. coder_type
       4.2.4. state_transition_delta
       4.2.5. colorspace_type
       4.2.6. chroma_planes
       4.2.7. bits_per_raw_sample
       4.2.8. log2_h_chroma_subsample
       4.2.9. log2_v_chroma_subsample
       4.2.10. extra_plane
       4.2.11. num_h_slices
       4.2.12. num_v_slices
       4.2.13. quant_table_set_count
       4.2.14. states_coded
       4.2.15. initial_state_delta
       4.2.16. ec
       4.2.17. intra
     4.3. Configuration Record
       4.3.1. reserved_for_future_use
       4.3.2. configuration_record_crc_parity
       4.3.3. Mapping FFV1 into Containers
     4.4. Frame
     4.5. Slice
     4.6. Slice Header
       4.6.1. slice_x
       4.6.2. slice_y
       4.6.3. slice_width
       4.6.4. slice_height
       4.6.5. quant_table_set_index_count
       4.6.6. quant_table_set_index
       4.6.7. picture_structure
       4.6.8. sar_num
       4.6.9. sar_den
     4.7. Slice Content
       4.7.1. primary_color_count
       4.7.2. plane_pixel_height
       4.7.3. slice_pixel_height
       4.7.4. slice_pixel_y
     4.8. Line
       4.8.1. plane_pixel_width
       4.8.2. slice_pixel_width
       4.8.3. slice_pixel_x
       4.8.4. sample_difference
     4.9. Slice Footer
       4.9.1. slice_size
       4.9.2. error_status
       4.9.3. slice_crc_parity
   5. Restrictions
   6. Security Considerations
   7. Media Type Definition
   8. IANA Considerations
   9. Changelog
   10. Normative References
   11. Informative References
   Appendix A. Multi-threaded decoder implementation suggestions
   Appendix B. Future handling of some streams created by non
           conforming encoders
   Authors' Addresses

1. Introduction

   This document describes FFV1, a lossless video encoding format. The
   design of FFV1 considers the storage of image characteristics, data
   fixity, and the optimized use of encoding time and storage
   requirements. FFV1 is designed to support a wide range of lossless
   video applications such as long-term audiovisual preservation,
   scientific imaging, screen recording, and other video encoding
   scenarios that seek to avoid the generational loss of lossy video
   encodings.
   This document defines versions 0, 1, and 3 of FFV1. The distinctions
   between the versions are provided throughout the document, but in
   summary:

   * Version 0 of FFV1 was the original implementation of FFV1 and has
     been in non-experimental use since April 14, 2006 [FFV1_V0].

   * Version 1 of FFV1 adds support of more video bit depths and has
     been in use since April 24, 2009 [FFV1_V1].

   * Version 2 of FFV1 only existed in experimental form and is not
     described by this document, but is available as a LyX file at
     https://github.com/FFmpeg/FFV1/blob/8ad772b6d61c3dd8b0171979a2cd9f11924d5532/ffv1.lyx.

   * Version 3 of FFV1 adds several features such as increased
     description of the characteristics of the encoding images and
     embedded CRC data to support fixity verification of the encoding.
     Version 3 has been in non-experimental use since August 17, 2013
     [FFV1_V3].

   This document assumes familiarity with mathematical and coding
   concepts such as Range coding [range-coding] and YCbCr color spaces
   [YCbCr].

   This specification describes the valid bitstream and how to decode
   such a valid bitstream. Bitstreams that do not conform to this
   specification, and how they are handled, are outside the scope of
   this specification. A decoder could reject every invalid bitstream,
   attempt to perform error concealment, re-download or use a redundant
   copy of the invalid part, or take any other action it deems
   appropriate.

2. Notation and Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.1. Definitions

   "Container": Format that encapsulates "Frames" (see Section 4.4) and
   (when required) a "Configuration Record" into a bitstream.

   "Sample": The smallest addressable representation of a color
   component or a luma component in a "Frame". Examples of "Sample" are
   Luma (Y), Blue-difference Chroma (Cb), Red-difference Chroma (Cr),
   Transparency, Red, Green, and Blue.

   "Plane": A discrete component of a static image comprised of
   "Samples" that represent a specific quantification of "Samples" of
   that image.

   "Pixel": The smallest addressable representation of a color in a
   "Frame". It is composed of one or more "Samples".

   "ESC": An ESCape symbol to indicate that the symbol to be stored is
   too large for normal storage and that an alternate storage method is
   used.

   "MSB": Most Significant Bit, the bit that can cause the largest
   change in magnitude of the symbol.

   "VLC": Variable Length Code, a code that maps source symbols to a
   variable number of bits.

   "RGB": A reference to the method of storing the value of a "Pixel" by
   using three numeric values that represent Red, Green, and Blue.

   "YCbCr": A reference to the method of storing the value of a "Pixel"
   by using three numeric values that represent the luma of the "Pixel"
   (Y) and the chroma of the "Pixel" (Cb and Cr). The term YCbCr is
   used for historical reasons and currently references any color space
   relying on 1 luma "Sample" and 2 chroma "Samples", e.g. YCbCr, YCgCo
   or ICtCp. The exact meaning of the three numeric values is
   unspecified.

2.2. Conventions

2.2.1. Pseudo-code

   The FFV1 bitstream is described in this document using pseudo-code.
   Note that the pseudo-code is used for clarity in order to illustrate
   the structure of FFV1 and is not intended to specify any particular
   implementation.
   The pseudo-code used is based upon the C programming
   language [ISO.9899.1990] and uses its "if/else", "while" and "for"
   keywords as well as functions defined within this document.

   In some instances, pseudo-code is presented in a two-column format
   such as shown in Figure 1. In this form the "type" column provides a
   symbol as defined in Table 4 that defines the storage of the data
   referenced in that same line of pseudo-code.

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   ExamplePseudoCode( ) {                                        |
       value                                                     | ur
   }                                                             |

      Figure 1: A depiction of type-labelled pseudo-code used within
                              this document.

2.2.2. Arithmetic Operators

   Note: the operators and the order of precedence are the same as used
   in the C programming language [ISO.9899.2018], with the exception of
   the ">>" (removal of implementation-defined behavior) and "^" (power
   instead of XOR) operators, which are redefined within this section.

   "a + b" means a plus b.

   "a - b" means a minus b.

   "-a" means negation of a.

   "a * b" means a multiplied by b.

   "a / b" means a divided by b.

   "a ^ b" means a raised to the b-th power.

   "a & b" means bit-wise "and" of a and b.

   "a | b" means bit-wise "or" of a and b.

   "a >> b" means arithmetic right shift of two's complement integer
   representation of a by b binary digits. This is equivalent to
   dividing a by 2, b times, with rounding toward negative infinity.

   "a << b" means arithmetic left shift of two's complement integer
   representation of a by b binary digits.

2.2.3. Assignment Operators

   "a = b" means a is assigned b.

   "a++" is equivalent to a is assigned a + 1.

   "a--" is equivalent to a is assigned a - 1.

   "a += b" is equivalent to a is assigned a + b.

   "a -= b" is equivalent to a is assigned a - b.

   "a *= b" is equivalent to a is assigned a * b.

2.2.4. Comparison Operators

   "a > b" means a is greater than b.

   "a >= b" means a is greater than or equal to b.

   "a < b" means a is less than b.

   "a <= b" means a is less than or equal to b.

   "a == b" means a is equal to b.

   "a != b" means a is not equal to b.

   "a && b" means Boolean logical "and" of a and b.

   "a || b" means Boolean logical "or" of a and b.

   "!a" means Boolean logical "not" of a.

   "a ? b : c" means b if a is true, otherwise c.

2.2.5. Mathematical Functions

   "floor(a)" means the largest integer less than or equal to a.

   "ceil(a)" means the smallest integer greater than or equal to a.

   "sign(a)" extracts the sign of a number, i.e. if a < 0 then -1, else
   if a > 0 then 1, else 0.

   "abs(a)" means the absolute value of a, i.e. "abs(a)" = "sign(a) *
   a".

   "log2(a)" means the base-two logarithm of a.

   "min(a,b)" means the smaller of two values a and b.

   "max(a,b)" means the larger of two values a and b.

   "median(a,b,c)" means the numerical middle value in a data set of a,
   b, and c, i.e. a+b+c-min(a,b,c)-max(a,b,c).

   "A <== B" means B implies A.

   "A <==> B" means A <== B , B <== A.

2.2.6. Order of Operation Precedence

   When order of precedence is not indicated explicitly by use of
   parentheses, operations are evaluated in the following order (from
   top to bottom, operations of same precedence being evaluated from
   left to right). This order of operations is based on the order of
   operations used in Standard C.

      a++, a--
      !a, -a
      a ^ b
      a * b, a / b, a % b
      a + b, a - b
      a << b, a >> b
      a < b, a <= b, a > b, a >= b
      a == b, a != b
      a & b
      a | b
      a && b
      a || b
      a ? b : c
      a = b, a += b, a -= b, a *= b

2.2.7. Range

   "a...b" means any value starting from a to b, inclusive.

2.2.8. NumBytes

   "NumBytes" is a non-negative integer that expresses the size in 8-bit
   octets of a particular FFV1 "Configuration Record" or "Frame". FFV1
   relies on its "Container" to store the "NumBytes" values; see
   Section 4.3.3.

2.2.9. Bitstream Functions

2.2.9.1. remaining_bits_in_bitstream

   "remaining_bits_in_bitstream( )" means the count of remaining bits
   after the pointer in that "Configuration Record" or "Frame". It is
   computed from the "NumBytes" value multiplied by 8 minus the count of
   bits of that "Configuration Record" or "Frame" already read by the
   bitstream parser.

2.2.9.2. remaining_symbols_in_syntax

   "remaining_symbols_in_syntax( )" is true as long as the RangeCoder
   has not consumed all the given input bytes.

2.2.9.3. byte_aligned

   "byte_aligned( )" is true if "remaining_bits_in_bitstream( NumBytes
   )" is a multiple of 8, otherwise false.

2.2.9.4. get_bits

   "get_bits( i )" is the action to read the next "i" bits in the
   bitstream, from most significant bit to least significant bit, and to
   return the corresponding value. The pointer is increased by "i".

3. Sample Coding

   For each "Slice" (as described in Section 4.5) of a "Frame", the
   "Planes", "Lines", and "Samples" are coded in an order determined by
   the "Color Space" (see Section 3.7). Each "Sample" is predicted by
   the median predictor as described in Section 3.3 from other "Samples"
   within the same "Plane" and the difference is stored using the method
   described in Section 3.8.

3.1. Border

   A border is assumed for each coded "Slice" for the purpose of the
   median predictor and context according to the following rules:

   * one column of "Samples" to the left of the coded slice is assumed
     as identical to the "Samples" of the leftmost column of the coded
     slice shifted down by one row.
     The value of the topmost "Sample"
     of the column of "Samples" to the left of the coded slice is
     assumed to be "0"

   * one column of "Samples" to the right of the coded slice is assumed
     as identical to the "Samples" of the rightmost column of the coded
     slice

   * an additional column of "Samples" to the left of the coded slice
     and two rows of "Samples" above the coded slice are assumed to be
     "0"

   Figure 2 depicts a slice of 9 "Samples" "a,b,c,d,e,f,g,h,i" in a 3x3
   arrangement along with its assumed border.

   +---+---+---+---+---+---+---+---+
   | 0 | 0 |   | 0 | 0 | 0 |   | 0 |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |   | 0 | 0 | 0 |   | 0 |
   +---+---+---+---+---+---+---+---+
   |   |   |   |   |   |   |   |   |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |   | a | b | c |   | c |
   +---+---+---+---+---+---+---+---+
   | 0 | a |   | d | e | f |   | f |
   +---+---+---+---+---+---+---+---+
   | 0 | d |   | g | h | i |   | i |
   +---+---+---+---+---+---+---+---+

    Figure 2: A depiction of FFV1's assumed border for a set of example
                                 Samples.

3.2. Samples

   Relative to any "Sample" "X", six other relatively positioned
   "Samples" from the coded "Samples" and presumed border are identified
   according to the labels used in Figure 3. The labels for these
   relatively positioned "Samples" are used within the median predictor
   and context.

   +---+---+---+---+
   |   |   | T |   |
   +---+---+---+---+
   |   |tl | t |tr |
   +---+---+---+---+
   | L | l | X |   |
   +---+---+---+---+

     Figure 3: A depiction of how relatively positioned Samples are
                     referenced within this document.

   The labels for these relative "Samples" are made of the first letters
   of the words Top, Left and Right.

3.3. Median Predictor

   The prediction for any "Sample" value at position "X" may be computed
   based upon the relative neighboring values of "l", "t", and "tl" via
   this equation:

      median(l, t, l + t - tl)

   Note, this prediction template is also used in [ISO.14495-1.1999] and
   [HuffYUV].

   Exception for the median predictor: if "colorspace_type == 0 &&
   bits_per_raw_sample == 16 && ( coder_type == 1 || coder_type == 2 )",
   the following median predictor MUST be used:

      median(left16s, top16s, left16s + top16s - diag16s)

   where:

      left16s = l  >= 32768 ? ( l  - 65536 ) : l
      top16s  = t  >= 32768 ? ( t  - 65536 ) : t
      diag16s = tl >= 32768 ? ( tl - 65536 ) : tl

   Background: a two's complement 16-bit signed integer was used for
   storing "Sample" values in all known implementations of FFV1
   bitstream. So in some circumstances, the most significant bit was
   wrongly interpreted (used as a sign bit instead of the 16th bit of an
   unsigned integer). Note that when the issue was discovered, the only
   impacted configuration among all known implementations was 16-bit
   YCbCr with no Pixel transformation and the Range Coder, as other
   potentially impacted configurations (e.g. 15/16-bit JPEG2000-RCT
   [ISO.15444-1.2016] with the Range Coder, or 16-bit content with the
   Golomb Rice coder) had not been implemented anywhere. In the
   meantime, 16-bit JPEG2000-RCT with the Range Coder was implemented
   without this issue in one implementation and validated by one
   conformance checker. This exception for the median predictor is
   expected (to be confirmed) to be removed in the next version of the
   FFV1 bitstream.

3.4. Context

   Relative to any "Sample" "X", the Quantized Sample Differences "L-l",
   "l-tl", "tl-t", "T-t", and "t-tr" are used as context:

      context = Q_{0}[l - tl] +
                Q_{1}[tl - t] +
                Q_{2}[t - tr] +
                Q_{3}[L - l]  +
                Q_{4}[T - t]

                                 Figure 4

   If "context >= 0" then "context" is used and the difference between
   the "Sample" and its predicted value is encoded as is, else
   "-context" is used and the difference between the "Sample" and its
   predicted value is encoded with a flipped sign.

3.5. Quantization Table Sets

   The FFV1 bitstream contains one or more Quantization Table Sets.
   Each Quantization Table Set contains exactly 5 Quantization Tables
   with each Quantization Table corresponding to one of the five
   Quantized Sample Differences. For each Quantization Table, both the
   number of quantization steps and their distribution are stored in the
   FFV1 bitstream; each Quantization Table has exactly 256 entries, and
   the 8 least significant bits of the Quantized Sample Difference are
   used as index:

      Q_{j}[k] = quant_tables[i][j][k&255]

                                 Figure 5

   In this formula, "i" is the Quantization Table Set index, "j" is the
   Quantization Table index, and "k" is the Quantized Sample Difference.

3.6. Quantization Table Set Indexes

   For each "Plane" of each slice, a Quantization Table Set is selected
   from an index:

   * For Y "Plane", "quant_table_set_index[ 0 ]" index is used

   * For Cb and Cr "Planes", "quant_table_set_index[ 1 ]" index is used

   * For extra "Plane", "quant_table_set_index[ (version <= 3 ||
     chroma_planes) ? 2 : 1 ]" index is used

   Background: in the first implementations of FFV1 bitstream, the index
   for Cb and Cr "Planes" was stored even if it was not used
   (chroma_planes set to 0). This index is kept for "version" <= 3 in
   order to keep compatibility with FFV1 bitstreams in the wild.

3.7. Color spaces

   FFV1 supports several color spaces.
   The count of allowed coded
   planes and the meaning of the extra "Plane" are determined by the
   selected color space.

   The FFV1 bitstream interleaves data in an order determined by the
   color space. In YCbCr for each "Plane", each "Line" is coded from
   top to bottom and for each "Line", each "Sample" is coded from left
   to right. In JPEG2000-RCT for each "Line" from top to bottom, each
   "Plane" is coded and for each "Plane", each "Sample" is encoded from
   left to right.

3.7.1. YCbCr

   This color space allows 1 to 4 "Planes".

   The Cb and Cr "Planes" are optional, but if used then MUST be used
   together. Omitting the Cb and Cr "Planes" codes the frames in
   grayscale without color data.

   An optional transparency "Plane" can be used to code transparency
   data.

   An FFV1 "Frame" using YCbCr MUST use one of the following
   arrangements:

   * Y

   * Y, Transparency

   * Y, Cb, Cr

   * Y, Cb, Cr, Transparency

   The Y "Plane" MUST be coded first. If the Cb and Cr "Planes" are
   used then they MUST be coded after the Y "Plane". If a transparency
   "Plane" is used, then it MUST be coded last.

3.7.2. RGB

   This color space allows 3 or 4 "Planes".

   An optional transparency "Plane" can be used to code transparency
   data.

   JPEG2000-RCT is a Reversible Color Transform that codes RGB (red,
   green, blue) "Planes" losslessly in a modified YCbCr color space
   [ISO.15444-1.2016]. Reversible Pixel transformations between YCbCr
   and RGB use the following formulae.
      Cb = b - g
      Cr = r - g
      Y = g + ((Cb + Cr) >> 2)
      g = Y - ((Cb + Cr) >> 2)
      r = Cr + g
      b = Cb + g

                                 Figure 6

   Exception for the JPEG2000-RCT conversion: if "bits_per_raw_sample"
   is between 9 and 15 inclusive and "extra_plane" is 0, the following
   formulae for reversible conversions between YCbCr and RGB MUST be
   used instead of the ones above:

      Cb = g - b
      Cr = r - b
      Y = b + ((Cb + Cr) >> 2)
      b = Y - ((Cb + Cr) >> 2)
      r = Cr + b
      g = Cb + b

                                 Figure 7

   Background: At the time of this writing, in all known implementations
   of FFV1 bitstream, when "bits_per_raw_sample" was between 9 and 15
   inclusive and "extra_plane" was 0, GBR "Planes" were used as BGR
   "Planes" during both encoding and decoding. In the meantime, 16-bit
   JPEG2000-RCT was implemented without this issue in one implementation
   and validated by one conformance checker. Methods to address this
   exception for the transform are under consideration for the next
   version of the FFV1 bitstream.

   Cb and Cr are positively offset by "1 << bits_per_raw_sample" after
   the conversion from RGB to the modified YCbCr and are negatively
   offset by the same value before the conversion from the modified
   YCbCr to RGB, in order to have only non-negative values after the
   conversion.

   When FFV1 uses the JPEG2000-RCT, the horizontal "Lines" are
   interleaved to improve caching efficiency since it is most likely
   that the JPEG2000-RCT will immediately be converted to RGB during
   decoding. The interleaved coding order is also Y, then Cb, then Cr,
   and then, if used, transparency.
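   As a non-normative illustration, the transform of Figure 6 and the
   Cb/Cr offset described above can be sketched in C. The sketch assumes
   that the shift applies to the sum Cb + Cr before that sum is added to
   or subtracted from the green value. The type and function names
   ("RgbPixel", "RctPixel", "rgb_to_rct", "rct_to_rgb", "shift_right_2")
   are hypothetical and do not appear in this document, and the
   exception of Figure 7 is not covered.

```c
#include <assert.h>

/* Arithmetic right shift by 2 with rounding toward negative infinity
 * (Section 2.2.2), written without relying on C's implementation-defined
 * right shift of negative values. */
static int shift_right_2(int v) {
    return v >= 0 ? v >> 2 : -((-v + 3) >> 2);
}

/* Hypothetical helper types, for illustration only. */
typedef struct { int r, g, b; } RgbPixel;
typedef struct { int y, cb, cr; } RctPixel;

/* RGB -> modified YCbCr (Figure 6), then offset Cb and Cr. */
static RctPixel rgb_to_rct(RgbPixel in, int bits_per_raw_sample) {
    assert(bits_per_raw_sample >= 1 && bits_per_raw_sample <= 16);
    int offset = 1 << bits_per_raw_sample;
    int cb = in.b - in.g;
    int cr = in.r - in.g;
    RctPixel out;
    out.y  = in.g + shift_right_2(cb + cr);
    out.cb = cb + offset;   /* keeps the stored value non-negative */
    out.cr = cr + offset;
    return out;
}

/* Modified YCbCr -> RGB: remove the offset, then invert Figure 6. */
static RgbPixel rct_to_rgb(RctPixel in, int bits_per_raw_sample) {
    assert(bits_per_raw_sample >= 1 && bits_per_raw_sample <= 16);
    int offset = 1 << bits_per_raw_sample;
    int cb = in.cb - offset;
    int cr = in.cr - offset;
    RgbPixel out;
    out.g = in.y - shift_right_2(cb + cr);
    out.r = cr + out.g;
    out.b = cb + out.g;
    return out;
}
```

   Because the same floor-rounded term is added in one direction and
   subtracted in the other, converting a "Pixel" with "rgb_to_rct" and
   back with "rct_to_rgb" returns the original R, G, and B values.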
   As an example, a "Frame" that is two "Pixels" wide and two "Pixels"
   high could comprise the following structure:

   +------------------------+------------------------+
   | Pixel(1,1)             | Pixel(2,1)             |
   | Y(1,1) Cb(1,1) Cr(1,1) | Y(2,1) Cb(2,1) Cr(2,1) |
   +------------------------+------------------------+
   | Pixel(1,2)             | Pixel(2,2)             |
   | Y(1,2) Cb(1,2) Cr(1,2) | Y(2,2) Cb(2,2) Cr(2,2) |
   +------------------------+------------------------+

   In JPEG2000-RCT, the coding order would be left to right and then top
   to bottom, with values interleaved by "Lines" and stored in this
   order:

   Y(1,1) Y(2,1) Cb(1,1) Cb(2,1) Cr(1,1) Cr(2,1) Y(1,2) Y(2,2) Cb(1,2)
   Cb(2,2) Cr(1,2) Cr(2,2)

3.8. Coding of the Sample Difference

   Instead of coding the n+1 bits of the Sample Difference with Huffman
   or Range coding (or n+2 bits, in the case of JPEG2000-RCT), only the
   n (or n+1, in the case of JPEG2000-RCT) least significant bits are
   used, since this is sufficient to recover the original "Sample". In
   the equation below, the term "bits" represents "bits_per_raw_sample +
   1" for JPEG2000-RCT or "bits_per_raw_sample" otherwise:

      coder_input = [(sample_difference + 2 ^ (bits - 1)) &
                     (2 ^ bits - 1)] - 2 ^ (bits - 1)

      Figure 8: Description of the coding of the Sample Difference in
                              the bitstream.

3.8.1. Range Coding Mode

   Early experimental versions of FFV1 used the CABAC Arithmetic coder
   from H.264 as defined in [ISO.14496-10.2014] but due to the uncertain
   patent/royalty situation, as well as its slightly worse performance,
   CABAC was replaced by a Range coder based on an algorithm defined by
   G. Nigel N. Martin in 1979 [range-coding].

3.8.1.1. Range Binary Values

   To encode binary digits efficiently a Range coder is used. "C(i)" is
   the i-th Context. "B(i)" is the i-th byte of the bytestream. "b(i)"
   is the i-th Range coded binary value, "S(0,i)" is the i-th initial
   state.
   The length of the bytestream encoding n binary symbols is
   "j(n)" bytes.

      r_{i} = floor( ( R_{i} * S_{i,C_{i}} ) / 2 ^ 8 )

                                 Figure 9

      S_{i+1,C_{i}} = zero_state_{S_{i,C_{i}}} AND
                l_i = L_i AND
                t_i = R_i - r_i <==
                b_i = 0 <==>
                L_i < R_i - r_i

      S_{i+1,C_{i}} = one_state_{S_{i,C_{i}}} AND
                l_i = L_i - R_i + r_i AND
                t_i = r_i <==
                b_i = 1 <==>
                L_i >= R_i - r_i

                                Figure 10

      S_{i+1,k} = S_{i,k} <== C_i != k

                                Figure 11

      R_{i+1} = 2 ^ 8 * t_{i} AND
      L_{i+1} = 2 ^ 8 * l_{i} + B_{j_{i}} AND
      j_{i+1} = j_{i} + 1 <==
        t_{i} < 2 ^ 8

      R_{i+1} = t_{i} AND
      L_{i+1} = l_{i} AND
      j_{i+1} = j_{i} <==
        t_{i} >= 2 ^ 8

                                Figure 12

      R_{0} = 65280

                                Figure 13

      L_{0} = 2 ^ 8 * B_{0} + B_{1}

                                Figure 14

      j_{0} = 2

                                Figure 15

3.8.1.1.1. Termination

   The range coder can be used in three modes.

   * In "Open mode" when decoding, every symbol the reader attempts to
     read is available. In this mode arbitrary data may have been
     appended without affecting the range coder output. This mode is
     not used in FFV1.

   * In "Closed mode" the length in bytes of the bytestream is provided
     to the range decoder. Bytes beyond the length are read as 0 by
     the range decoder. This is generally one byte shorter than the
     open mode.

   * In "Sentinel mode" the exact length in bytes is not known and thus
     the range decoder MAY read into the data that follows the range
     coded bytestream by one byte. In "Sentinel mode", the end of the
     range coded bytestream is a binary symbol with state 129, whose
     value SHALL be discarded. After reading this symbol, the range
     decoder will have read one byte beyond the end of the range coded
     bytestream. This way the byte position of the end can be
     determined.
     Bytestreams written in "Sentinel mode" can be read in
     "Closed mode" if the length can be determined; in this case the
     last (sentinel) symbol will be read non-corrupted and be of value
     0.

   The above describes range decoding. Encoding is defined as any
   process which produces a decodable bytestream.

   There are three places where range coder termination is needed in
   FFV1. The first is in the "Configuration Record"; in this case the
   size of the range coded bytestream is known and handled as "Closed
   mode". The second is the switch from the "Slice Header", which is
   range coded, to Golomb coded slices; this is handled as "Sentinel
   mode". The third is the end of range coded Slices, which need to
   terminate before the CRC at their end. This can be handled as
   "Sentinel mode" or as "Closed mode" if the CRC position has been
   determined.

3.8.1.2. Range Non Binary Values

   To encode scalar integers, it would be possible to encode each bit
   separately and use the past bits as context. However, that would
   mean 255 contexts per 8-bit symbol, which is not only a waste of
   memory but also requires more past data to reach a reasonably good
   estimate of the probabilities. Alternatively, assuming a Laplacian
   distribution and only dealing with its variance and mean (as in
   Huffman coding) would also be possible; however, for maximum
   flexibility and simplicity, the chosen method uses a single symbol to
   encode whether a number is 0 and, if not, encodes the number using
   its exponent, mantissa, and sign. The exact contexts used are best
   described by Figure 16.

   int get_symbol(RangeCoder *c, uint8_t *state, int is_signed) {
       if (get_rac(c, state + 0)) {
           return 0;
       }

       int e = 0;
       while (get_rac(c, state + 1 + min(e, 9))) { // 1..10
           e++;
       }

       int a = 1;
       for (int i = e - 1; i >= 0; i--) {
           a = a * 2 + get_rac(c, state + 22 + min(i, 9));  // 22..31
       }

       if (!is_signed) {
           return a;
       }

       if (get_rac(c, state + 11 + min(e, 10))) { // 11..21
           return -a;
       } else {
           return a;
       }
   }

     Figure 16: A pseudo-code description of the contexts of Range Non
                              Binary Values.

   "get_symbol" is used for the read out of "sample_difference"
   indicated in Figure 8.

   "get_rac" is the process described in Section 3.8.1.1.

3.8.1.3.  Initial Values for the Context Model

   At keyframes, all Range coder state variables are set to their
   initial state.

3.8.1.4.  State Transition Table

   one_state_{i} =
          default_state_transition_{i} + state_transition_delta_{i}

                                 Figure 17

   zero_state_{i} = 256 - one_state_{256-i}

                                 Figure 18

3.8.1.5.
default_state_transition 860 0, 0, 0, 0, 0, 0, 0, 0, 20, 21, 22, 23, 24, 25, 26, 27, 862 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 37, 38, 39, 40, 41, 42, 864 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 56, 57, 866 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 868 74, 75, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 870 89, 90, 91, 92, 93, 94, 94, 95, 96, 97, 98, 99,100,101,102,103, 872 104,105,106,107,108,109,110,111,112,113,114,114,115,116,117,118, 874 119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,133, 876 134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149, 878 150,151,152,152,153,154,155,156,157,158,159,160,161,162,163,164, 880 165,166,167,168,169,170,171,171,172,173,174,175,176,177,178,179, 882 180,181,182,183,184,185,186,187,188,189,190,190,191,192,194,194, 884 195,196,197,198,199,200,201,202,202,204,205,206,207,208,209,209, 886 210,211,212,213,215,215,216,217,218,219,220,220,222,223,224,225, 888 226,227,227,229,229,230,231,232,234,234,235,236,237,238,239,240, 890 241,242,243,244,245,246,247,248,248, 0, 0, 0, 0, 0, 0, 0, 892 3.8.1.6. Alternative State Transition Table 894 The alternative state transition table has been built using iterative 895 minimization of frame sizes and generally performs better than the 896 default. To use it, the "coder_type" (see Section 4.2.3) MUST be set 897 to 2 and the difference to the default MUST be stored in the 898 "Parameters", see Section 4.2. The reference implementation of FFV1 899 in FFmpeg uses Figure 19 by default at the time of this writing when 900 Range coding is used. 
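
   The relationship between the default table, the stored difference,
   and the two transition tables (Figures 17 and 18) can be sketched
   in C as follows.  This is a minimal sketch, not the reference
   implementation: the function name "build_state_tables", the
   argument layout, and the choice to leave index 0 unused (the
   difference is only coded for indices 1 to 255) are assumptions made
   for illustration, as is the range check on each sum.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: derive one_state and zero_state from the default
 * transition table and the coded differences (Figures 17 and 18).
 * Index 0 is left at 0 here; that is an assumption of this sketch. */
static void build_state_tables(const uint8_t default_transition[256],
                               const int16_t delta[256],
                               uint8_t one_state[256],
                               uint8_t zero_state[256]) {
    one_state[0] = 0;
    zero_state[0] = 0;

    /* Figure 17: one_state_i = default_state_transition_i + delta_i */
    for (int i = 1; i < 256; i++) {
        int v = default_transition[i] + delta[i];
        assert(v >= 0 && v <= 255);  /* sketch-only sanity check */
        one_state[i] = (uint8_t)v;
    }

    /* Figure 18: zero_state_i = 256 - one_state_(256 - i) */
    for (int i = 1; i < 256; i++) {
        zero_state[i] = (uint8_t)(256 - one_state[256 - i]);
    }
}
```

   With all differences set to 0, the sketch reproduces the behaviour
   of "coder_type" 1; with non-zero differences, it corresponds to
   "coder_type" 2.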

     0, 10, 10, 10, 10, 16, 16, 16, 28, 16, 16, 29, 42, 49, 20, 49,
    59, 25, 26, 26, 27, 31, 33, 33, 33, 34, 34, 37, 67, 38, 39, 39,
    40, 40, 41, 79, 43, 44, 45, 45, 48, 48, 64, 50, 51, 52, 88, 52,
    53, 74, 55, 57, 58, 58, 74, 60,101, 61, 62, 84, 66, 66, 68, 69,
    87, 82, 71, 97, 73, 73, 82, 75,111, 77, 94, 78, 87, 81, 83, 97,
    85, 83, 94, 86, 99, 89, 90, 99,111, 92, 93,134, 95, 98,105, 98,
   105,110,102,108,102,118,103,106,106,113,109,112,114,112,116,125,
   115,116,117,117,126,119,125,121,121,123,145,124,126,131,127,129,
   165,130,132,138,133,135,145,136,137,139,146,141,143,142,144,148,
   147,155,151,149,151,150,152,157,153,154,156,168,158,162,161,160,
   172,163,169,164,166,184,167,170,177,174,171,173,182,176,180,178,
   175,189,179,181,186,183,192,185,200,187,191,188,190,197,193,196,
   197,194,195,196,198,202,199,201,210,203,207,204,205,206,208,214,
   209,211,221,212,213,215,224,216,217,218,219,220,222,228,223,225,
   226,224,227,229,240,230,231,232,233,234,235,236,238,239,237,242,
   241,243,242,244,245,246,247,248,249,250,251,252,252,253,254,255,

      Figure 19: Alternative state transition table for Range coding.

3.8.2.  Golomb Rice Mode

   The end of the bitstream of the "Frame" is padded with 0-bits until
   the bitstream contains a multiple of 8 bits.

3.8.2.1.  Signed Golomb Rice Codes

   This coding mode uses Golomb Rice codes.  The VLC is split into two
   parts.  The prefix stores the most significant bits, and the suffix
   stores the k least significant bits or, in the ESC case, the whole
   number.

   int get_ur_golomb(k) {
       for (prefix = 0; prefix < 12; prefix++) {
           if (get_bits(1)) {
               return get_bits(k) + (prefix << k);
           }
       }
       return get_bits(bits) + 11;
   }

      Figure 20: A pseudo-code description of the read of an unsigned
                       integer in Golomb Rice mode.
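
   The read in Figure 20 can be made concrete with a small bit reader.
   The following is a minimal sketch in C, not the reference
   implementation: the "BitReader" type, the explicit "bits" argument,
   the convention that bits past the end of the buffer read as 0, and
   the "golomb_self_check" helper are all assumptions introduced for
   illustration.  The self-check replays some of the bit patterns
   listed in Table 3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSB-first bit reader; not part of the specification. */
typedef struct {
    const uint8_t *buf;  /* input bytes */
    size_t size;         /* number of bytes in buf */
    size_t pos;          /* current read position, in bits */
} BitReader;

/* Read n bits (0 <= n <= 32), MSB first.
 * Assumption of this sketch: bits past the end of buf read as 0. */
static uint32_t get_bits(BitReader *br, int n) {
    uint32_t v = 0;
    for (int i = 0; i < n; i++) {
        size_t byte = br->pos >> 3;
        uint32_t bit = 0;
        if (byte < br->size) {
            bit = (br->buf[byte] >> (7 - (br->pos & 7))) & 1u;
        }
        v = (v << 1) | bit;
        br->pos++;
    }
    return v;
}

/* Unsigned Golomb Rice read following Figure 20.
 * k is the suffix length; bits is the width of an escaped value. */
static uint32_t get_ur_golomb(BitReader *br, int k, int bits) {
    for (uint32_t prefix = 0; prefix < 12; prefix++) {
        if (get_bits(br, 1)) {
            return get_bits(br, k) + (prefix << k);
        }
    }
    return get_bits(br, bits) + 11;  /* ESC: twelve 0-bits were read */
}

/* Replays bit patterns from Table 3 (k, bits -> value). */
static void golomb_self_check(void) {
    const uint8_t p0[] = { 0x80 };              /* k=0: "1"     */
    const uint8_t p1[] = { 0x20 };              /* k=0: "001"   */
    const uint8_t p2[] = { 0x50 };              /* k=2: "01 01" */
    const uint8_t p3[] = { 0x00, 0x08, 0x00 };  /* ESC, 8 bits  */
    BitReader r0 = { p0, sizeof p0, 0 };
    BitReader r1 = { p1, sizeof p1, 0 };
    BitReader r2 = { p2, sizeof p2, 0 };
    BitReader r3 = { p3, sizeof p3, 0 };
    assert(get_ur_golomb(&r0, 0, 8) == 0);
    assert(get_ur_golomb(&r1, 0, 8) == 2);
    assert(get_ur_golomb(&r2, 2, 8) == 5);
    assert(get_ur_golomb(&r3, 2, 8) == 139);
}
```

   Passing "bits" explicitly keeps the sketch free of global state; in
   the pseudo-code of this document, "bits" is taken from the decoding
   configuration instead.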
960 int get_sr_golomb(k) { 961 v = get_ur_golomb(k); 962 if (v & 1) return - (v >> 1) - 1; 963 else return (v >> 1); 964 } 966 Figure 21: A pseudo-code description of the read of a signed 967 integer in Golomb Rice mode. 969 3.8.2.1.1. Prefix 971 +================+=======+ 972 | bits | value | 973 +================+=======+ 974 | 1 | 0 | 975 +----------------+-------+ 976 | 01 | 1 | 977 +----------------+-------+ 978 | ... | ... | 979 +----------------+-------+ 980 | 0000 0000 01 | 9 | 981 +----------------+-------+ 982 | 0000 0000 001 | 10 | 983 +----------------+-------+ 984 | 0000 0000 0001 | 11 | 985 +----------------+-------+ 986 | 0000 0000 0000 | ESC | 987 +----------------+-------+ 989 Table 1 991 3.8.2.1.2. Suffix 993 +=========+========================================+ 994 +=========+========================================+ 995 | non ESC | the k least significant bits MSB first | 996 +---------+----------------------------------------+ 997 | ESC | the value - 11, in MSB first order | 998 +---------+----------------------------------------+ 1000 Table 2 1002 "ESC" MUST NOT be used if the value can be coded as "non ESC". 1004 3.8.2.1.3. Examples 1006 +=====+=======================+=======+ 1007 | k | bits | value | 1008 +=====+=======================+=======+ 1009 | 0 | 1 | 0 | 1010 +-----+-----------------------+-------+ 1011 | 0 | 001 | 2 | 1012 +-----+-----------------------+-------+ 1013 | 2 | 1 00 | 0 | 1014 +-----+-----------------------+-------+ 1015 | 2 | 1 10 | 2 | 1016 +-----+-----------------------+-------+ 1017 | 2 | 01 01 | 5 | 1018 +-----+-----------------------+-------+ 1019 | any | 000000000000 10000000 | 139 | 1020 +-----+-----------------------+-------+ 1022 Table 3 1024 3.8.2.2. Run Mode 1026 Run mode is entered when the context is 0 and left as soon as a non-0 1027 difference is found. The level is identical to the predicted one. 1028 The run and the first different level are coded. 1030 3.8.2.2.1. 
Run Length Coding

   The run value is encoded in two parts.  The prefix part stores the
   more significant part of the run and also adjusts the "run_index",
   which determines the number of bits in the less significant part of
   the run.  The second part of the value stores the less significant
   part of the run as it is.  The "run_index" is reset to 0 for each
   "Plane" and slice.

   log2_run[41] = {
        0, 0, 0, 0, 1, 1, 1, 1,
        2, 2, 2, 2, 3, 3, 3, 3,
        4, 4, 5, 5, 6, 6, 7, 7,
        8, 9,10,11,12,13,14,15,
       16,17,18,19,20,21,22,23,
       24,
   };

   if (run_count == 0 && run_mode == 1) {
       if (get_bits(1)) {
           run_count = 1 << log2_run[run_index];
           if (x + run_count <= w) {
               run_index++;
           }
       } else {
           if (log2_run[run_index]) {
               run_count = get_bits(log2_run[run_index]);
           } else {
               run_count = 0;
           }
           if (run_index) {
               run_index--;
           }
           run_mode = 2;
       }
   }

   The "log2_run" array is also used within [ISO.14495-1.1999].

3.8.2.3.  Sign extension

   "sign_extend" is the function that increases the number of bits of
   an input binary number in two's complement signed number
   representation while preserving the input number's sign
   (positive/negative) and value, in order to fit in the output bit
   width.  It MAY be computed with:

   sign_extend(input_number, input_bits) {
       negative_bias = 1 << (input_bits - 1);
       bits_mask = negative_bias - 1;
       output_number = input_number & bits_mask; // Remove negative bit
       is_negative = input_number & negative_bias; // Test negative bit
       if (is_negative)
           output_number -= negative_bias;
       return output_number;
   }

3.8.2.4.  Scalar Mode

   Each difference is coded with the per context mean prediction removed
   and a per context value for k.
1092 get_vlc_symbol(state) { 1093 i = state->count; 1094 k = 0; 1095 while (i < state->error_sum) { 1096 k++; 1097 i += i; 1098 } 1100 v = get_sr_golomb(k); 1102 if (2 * state->drift < -state->count) { 1103 v = -1 - v; 1104 } 1106 ret = sign_extend(v + state->bias, bits); 1108 state->error_sum += abs(v); 1109 state->drift += v; 1111 if (state->count == 128) { 1112 state->count >>= 1; 1113 state->drift >>= 1; 1114 state->error_sum >>= 1; 1115 } 1116 state->count++; 1117 if (state->drift <= -state->count) { 1118 state->bias = max(state->bias - 1, -128); 1120 state->drift = max(state->drift + state->count, 1121 -state->count + 1); 1122 } else if (state->drift > 0) { 1123 state->bias = min(state->bias + 1, 127); 1125 state->drift = min(state->drift - state->count, 0); 1126 } 1128 return ret; 1129 } 1131 3.8.2.4.1. Level Coding 1133 Level coding is identical to the normal difference coding with the 1134 exception that the 0 value is removed as it cannot occur: 1136 diff = get_vlc_symbol(context_state); 1137 if (diff >= 0) { 1138 diff++; 1139 } 1141 Note, this is different from JPEG-LS, which doesn't use prediction in 1142 run mode and uses a different encoding and context model for the last 1143 difference. On a small set of test "Samples" the use of prediction 1144 slightly improved the compression rate. 1146 3.8.2.5. Initial Values for the VLC context state 1148 At keyframes all coder state variables are set to their initial 1149 state. 1151 drift = 0; 1152 error_sum = 4; 1153 bias = 0; 1154 count = 1; 1156 4. Bitstream 1158 An FFV1 bitstream is composed of a series of one or more "Frames" and 1159 (when required) a "Configuration Record". 1161 Within the following sub-sections, pseudo-code is used to explain the 1162 structure of each FFV1 bitstream component, as described in 1163 Section 2.2.1. Table 4 lists symbols used to annotate that pseudo- 1164 code in order to define the storage of the data referenced in that 1165 line of pseudo-code. 

   +========+==============================================+
   | Symbol | Definition                                   |
   +========+==============================================+
   | u(n)   | unsigned big endian integer using n bits     |
   +--------+----------------------------------------------+
   | sg     | Golomb Rice coded signed scalar symbol coded |
   |        | with the method described in Section 3.8.2   |
   +--------+----------------------------------------------+
   | br     | Range coded Boolean (1-bit) symbol with the  |
   |        | method described in Section 3.8.1.1          |
   +--------+----------------------------------------------+
   | ur     | Range coded unsigned scalar symbol coded     |
   |        | with the method described in Section 3.8.1.2 |
   +--------+----------------------------------------------+
   | sr     | Range coded signed scalar symbol coded with  |
   |        | the method described in Section 3.8.1.2      |
   +--------+----------------------------------------------+
   | sd     | Sample difference coded with the method      |
   |        | described in Section 3.8                     |
   +--------+----------------------------------------------+

            Table 4: Definition of pseudo-code symbols for this
                                 document.

   The following MUST be provided by external means during
   initialization of the decoder:

   "frame_pixel_width" is defined as "Frame" width in "Pixels".

   "frame_pixel_height" is defined as "Frame" height in "Pixels".

   Default values at the decoder initialization phase:

   "ConfigurationRecordIsPresent" is set to 0.

4.1.  Quantization Table Set

   Each Quantization Table Set is stored as run lengths: for each run
   of equal entries in the first half of a table, the number of equal
   entries minus 1 is stored (represented as "len - 1" in the pseudo-
   code below) using the method described in Section 3.8.1.2.  The
   second half doesn't need to be stored as it is identical to the
   first with flipped sign.  "scale" and "len_count[ i ][ j ]" are
   temporary values used for computing "context_count[ i ]" and are
   not used outside the Quantization Table Set pseudo-code.

   Example:

   Table: 0 0 1 1 1 1 2 2 -2 -2 -2 -1 -1 -1 -1 0

   Stored values: 1, 3, 1

   "QuantizationTableSet" has its own initial states, all set to 128.

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   QuantizationTableSet( i ) {                                   |
       scale = 1                                                 |
       for (j = 0; j < MAX_CONTEXT_INPUTS; j++) {                |
           QuantizationTable( i, j, scale )                      |
           scale *= 2 * len_count[ i ][ j ] - 1                  |
       }                                                         |
       context_count[ i ] = ceil( scale / 2 )                    |
   }                                                             |

   "MAX_CONTEXT_INPUTS" is 5.

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   QuantizationTable(i, j, scale) {                              |
       v = 0                                                     |
       for (k = 0; k < 128;) {                                   |
           len - 1                                               | ur
           for (n = 0; n < len; n++) {                           |
               quant_tables[ i ][ j ][ k ] = scale * v           |
               k++                                               |
           }                                                     |
           v++                                                   |
       }                                                         |
       for (k = 1; k < 128; k++) {                               |
           quant_tables[ i ][ j ][ 256 - k ] = \                 |
           -quant_tables[ i ][ j ][ k ]                          |
       }                                                         |
       quant_tables[ i ][ j ][ 128 ] = \                         |
       -quant_tables[ i ][ j ][ 127 ]                            |
       len_count[ i ][ j ] = v                                   |
   }                                                             |

4.1.1.  quant_tables

   "quant_tables[ i ][ j ][ k ]" indicates the quantization table
   value of the Quantized Sample Difference "k" of the Quantization
   Table "j" of the Quantization Table Set "i".

4.1.2.  context_count

   "context_count[ i ]" indicates the count of contexts for Quantization
   Table Set "i".  "context_count[ i ]" MUST be less than or equal to
   32768.

4.2.
Parameters 1269 The "Parameters" section contains significant characteristics about 1270 the decoding configuration used for all instances of "Frame" (in FFV1 1271 version 0 and 1) or the whole FFV1 bitstream (other versions), 1272 including the stream version, color configuration, and quantization 1273 tables. Figure 22 describes the contents of the bitstream. 1275 "Parameters" has its own initial states, all set to 128. 1277 pseudo-code | type 1278 --------------------------------------------------------------|----- 1279 Parameters( ) { | 1280 version | ur 1281 if (version >= 3) { | 1282 micro_version | ur 1283 } | 1284 coder_type | ur 1285 if (coder_type > 1) { | 1286 for (i = 1; i < 256; i++) { | 1287 state_transition_delta[ i ] | sr 1288 } | 1289 } | 1290 colorspace_type | ur 1291 if (version >= 1) { | 1292 bits_per_raw_sample | ur 1293 } | 1294 chroma_planes | br 1295 log2_h_chroma_subsample | ur 1296 log2_v_chroma_subsample | ur 1297 extra_plane | br 1298 if (version >= 3) { | 1299 num_h_slices - 1 | ur 1300 num_v_slices - 1 | ur 1301 quant_table_set_count | ur 1302 } | 1303 for (i = 0; i < quant_table_set_count; i++) { | 1304 QuantizationTableSet( i ) | 1305 } | 1306 if (version >= 3) { | 1307 for (i = 0; i < quant_table_set_count; i++) { | 1308 states_coded | br 1309 if (states_coded) { | 1310 for (j = 0; j < context_count[ i ]; j++) { | 1311 for (k = 0; k < CONTEXT_SIZE; k++) { | 1312 initial_state_delta[ i ][ j ][ k ] | sr 1313 } | 1314 } | 1315 } | 1316 } | 1317 ec | ur 1318 intra | ur 1319 } | 1320 } | 1322 Figure 22: A pseudo-code description of the bitstream contents. 1324 CONTEXT_SIZE is 32. 1326 4.2.1. version 1328 "version" specifies the version of the FFV1 bitstream. 1330 Each version is incompatible with other versions: decoders SHOULD 1331 reject FFV1 bitstreams due to an unknown version. 1333 Decoders SHOULD reject FFV1 bitstreams with version <= 1 && 1334 ConfigurationRecordIsPresent == 1. 
1336 Decoders SHOULD reject FFV1 bitstreams with version >= 3 && 1337 ConfigurationRecordIsPresent == 0. 1339 +=======+=========================+ 1340 | value | version | 1341 +=======+=========================+ 1342 | 0 | FFV1 version 0 | 1343 +-------+-------------------------+ 1344 | 1 | FFV1 version 1 | 1345 +-------+-------------------------+ 1346 | 2 | reserved* | 1347 +-------+-------------------------+ 1348 | 3 | FFV1 version 3 | 1349 +-------+-------------------------+ 1350 | Other | reserved for future use | 1351 +-------+-------------------------+ 1353 Table 5 1355 * Version 2 was experimental and this document does not describe it. 1357 4.2.2. micro_version 1359 "micro_version" specifies the micro-version of the FFV1 bitstream. 1361 After a version is considered stable (a micro-version value is 1362 assigned to be the first stable variant of a specific version), each 1363 new micro-version after this first stable variant is compatible with 1364 the previous micro-version: decoders SHOULD NOT reject FFV1 1365 bitstreams due to an unknown micro-version equal or above the micro- 1366 version considered as stable. 1368 Meaning of "micro_version" for "version" 3: 1370 +=======+=========================+ 1371 | value | micro_version | 1372 +=======+=========================+ 1373 | 0...3 | reserved* | 1374 +-------+-------------------------+ 1375 | 4 | first stable variant | 1376 +-------+-------------------------+ 1377 | Other | reserved for future use | 1378 +-------+-------------------------+ 1380 Table 6: The definitions for 1381 "micro_version" values for FFV1 1382 version 3. 1384 * development versions may be incompatible with the stable variants. 1386 4.2.3. coder_type 1388 "coder_type" specifies the coder used. 

   +=======+=================================================+
   | value | coder used                                      |
   +=======+=================================================+
   | 0     | Golomb Rice                                     |
   +-------+-------------------------------------------------+
   | 1     | Range Coder with default state transition table |
   +-------+-------------------------------------------------+
   | 2     | Range Coder with custom state transition table  |
   +-------+-------------------------------------------------+
   | Other | reserved for future use                         |
   +-------+-------------------------------------------------+

                                 Table 7

   Restrictions:

   If "coder_type" is 0, then "bits_per_raw_sample" SHOULD NOT be > 8.

   Background: At the time of this writing, there is no known
   implementation of FFV1 bitstream supporting the Golomb Rice
   algorithm with "bits_per_raw_sample" greater than 8, and the Range
   Coder is preferred.

4.2.4.  state_transition_delta

   "state_transition_delta" specifies the Range coder custom state
   transition table.

   If "state_transition_delta" is not present in the FFV1 bitstream, all
   Range coder custom state transition table elements are assumed to be
   0.

4.2.5.  colorspace_type

   "colorspace_type" specifies the color space encoded, the pixel
   transformation used by the encoder, the extra plane content, as well
   as the interleave method.
1428 +=======+=============+================+==============+=============+ 1429 | value | color space | pixel | extra plane | interleave | 1430 | | encoded | transformation | content | method | 1431 +=======+=============+================+==============+=============+ 1432 | 0 | YCbCr | None | Transparency | "Plane" | 1433 | | | | | then | 1434 | | | | | "Line" | 1435 +-------+-------------+----------------+--------------+-------------+ 1436 | 1 | RGB | JPEG2000-RCT | Transparency | "Line" | 1437 | | | | | then | 1438 | | | | | "Plane" | 1439 +-------+-------------+----------------+--------------+-------------+ 1440 | Other | reserved | reserved for | reserved for | reserved | 1441 | | for future | future use | future use | for future | 1442 | | use | | | use | 1443 +-------+-------------+----------------+--------------+-------------+ 1445 Table 8 1447 FFV1 bitstreams with "colorspace_type" == 1 && ("chroma_planes" != 1448 1 || "log2_h_chroma_subsample" != 0 || "log2_v_chroma_subsample" != 1449 0) are not part of this specification. 1451 4.2.6. chroma_planes 1453 "chroma_planes" indicates if chroma (color) "Planes" are present. 1455 +=======+=================================+ 1456 | value | presence | 1457 +=======+=================================+ 1458 | 0 | chroma "Planes" are not present | 1459 +-------+---------------------------------+ 1460 | 1 | chroma "Planes" are present | 1461 +-------+---------------------------------+ 1463 Table 9 1465 4.2.7. bits_per_raw_sample 1467 "bits_per_raw_sample" indicates the number of bits for each "Sample". 1468 Inferred to be 8 if not present. 1470 +=======+===================================+ 1471 | value | bits for each sample | 1472 +=======+===================================+ 1473 | 0 | reserved* | 1474 +-------+-----------------------------------+ 1475 | Other | the actual bits for each "Sample" | 1476 +-------+-----------------------------------+ 1478 Table 10 1480 * Encoders MUST NOT store "bits_per_raw_sample" = 0. 
Decoders SHOULD 1481 accept and interpret "bits_per_raw_sample" = 0 as 8. 1483 4.2.8. log2_h_chroma_subsample 1485 "log2_h_chroma_subsample" indicates the subsample factor, stored in 1486 powers to which the number 2 must be raised, between luma and chroma 1487 width ("chroma_width = 2 ^ -log2_h_chroma_subsample * luma_width"). 1489 4.2.9. log2_v_chroma_subsample 1491 "log2_v_chroma_subsample" indicates the subsample factor, stored in 1492 powers to which the number 2 must be raised, between luma and chroma 1493 height ("chroma_height = 2 ^ -log2_v_chroma_subsample * 1494 luma_height"). 1496 4.2.10. extra_plane 1498 "extra_plane" indicates if an extra "Plane" is present. 1500 +=======+==============================+ 1501 | value | presence | 1502 +=======+==============================+ 1503 | 0 | extra "Plane" is not present | 1504 +-------+------------------------------+ 1505 | 1 | extra "Plane" is present | 1506 +-------+------------------------------+ 1508 Table 11 1510 4.2.11. num_h_slices 1512 "num_h_slices" indicates the number of horizontal elements of the 1513 slice raster. 1515 Inferred to be 1 if not present. 1517 4.2.12. num_v_slices 1519 "num_v_slices" indicates the number of vertical elements of the slice 1520 raster. 1522 Inferred to be 1 if not present. 1524 4.2.13. quant_table_set_count 1526 "quant_table_set_count" indicates the number of Quantization 1527 Table Sets. "quant_table_set_count" MUST be less than or equal to 8. 1529 Inferred to be 1 if not present. 1531 MUST NOT be 0. 1533 4.2.14. states_coded 1535 "states_coded" indicates if the respective Quantization Table Set has 1536 the initial states coded. 1538 Inferred to be 0 if not present. 

   +=======+================================+
   | value | initial states                 |
   +=======+================================+
   | 0     | initial states are not present |
   |       | and are assumed to be all 128  |
   +-------+--------------------------------+
   | 1     | initial states are present     |
   +-------+--------------------------------+

                                 Table 12

4.2.15.  initial_state_delta

   "initial_state_delta[ i ][ j ][ k ]" indicates the initial Range
   coder state.  It is encoded using "k" as context index and the
   prediction "pred":

   pred = j ? initial_state[ i ][ j - 1 ][ k ] : 128

                                 Figure 23

   initial_state[ i ][ j ][ k ] =
          ( pred + initial_state_delta[ i ][ j ][ k ] ) & 255

                                 Figure 24

4.2.16.  ec

   "ec" indicates the error detection/correction type.

   +=======+=================================================+
   | value | error detection/correction type                 |
   +=======+=================================================+
   | 0     | 32-bit CRC in "ConfigurationRecord"             |
   +-------+-------------------------------------------------+
   | 1     | 32-bit CRC in "Slice" and "ConfigurationRecord" |
   +-------+-------------------------------------------------+
   | Other | reserved for future use                         |
   +-------+-------------------------------------------------+

                                 Table 13

4.2.17.  intra

   "intra" indicates the constraint on "keyframe" in each instance of
   "Frame".

   Inferred to be 0 if not present.

   +=======+=======================================================+
   | value | relationship                                          |
   +=======+=======================================================+
   | 0     | "keyframe" can be 0 or 1 (non keyframes or keyframes) |
   +-------+-------------------------------------------------------+
   | 1     | "keyframe" MUST be 1 (keyframes only)                 |
   +-------+-------------------------------------------------------+
   | Other | reserved for future use                               |
   +-------+-------------------------------------------------------+

                                 Table 14

4.3.  Configuration Record

   In the case of an FFV1 bitstream with "version >= 3", a
   "Configuration Record" is stored in the underlying "Container" as
   described in Section 4.3.3.  It contains the "Parameters" used for
   all instances of "Frame".  The size of the "Configuration Record",
   "NumBytes", is supplied by the underlying "Container".

   pseudo-code                                                | type
   -----------------------------------------------------------|-----
   ConfigurationRecord( NumBytes ) {                          |
       ConfigurationRecordIsPresent = 1                       |
       Parameters( )                                          |
       while (remaining_symbols_in_syntax(NumBytes - 4)) {    |
           reserved_for_future_use                            | br/ur/sr
       }                                                      |
       configuration_record_crc_parity                        | u(32)
   }                                                          |

4.3.1.  reserved_for_future_use

   "reserved_for_future_use" has semantics that are reserved for future
   use.

   Encoders conforming to this version of this specification SHALL NOT
   write this value.

   Decoders conforming to this version of this specification SHALL
   ignore its value.

4.3.2.  configuration_record_crc_parity

   "configuration_record_crc_parity" is 32 bits that are chosen so that
   the "Configuration Record" as a whole has a CRC remainder of 0.

   This is equivalent to storing the CRC remainder in the 32-bit parity.

   The CRC generator polynomial used is described in Section 4.9.3.

4.3.3.
Mapping FFV1 into Containers

   This "Configuration Record" can be placed in any file format that
   supports "Configuration Records", following as closely as possible
   the way that file format stores "Configuration Records".  The
   "Configuration Record" storage place and "NumBytes" are currently
   defined and supported by this version of this specification for the
   following formats:

4.3.3.1.  AVI File Format

   The "Configuration Record" extends the stream format chunk ("AVI ",
   "hdlr", "strl", "strf") with the ConfigurationRecord bitstream.

   See [AVI] for more information about chunks.

   "NumBytes" is defined as the size, in bytes, of the strf chunk
   indicated in the chunk header minus the size of the stream format
   structure.

4.3.3.2.  ISO Base Media File Format

   The "Configuration Record" extends the sample description box
   ("moov", "trak", "mdia", "minf", "stbl", "stsd") with a "glbl" box
   that contains the ConfigurationRecord bitstream.  See
   [ISO.14496-12.2015] for more information about boxes.

   "NumBytes" is defined as the size, in bytes, of the "glbl" box
   indicated in the box header minus the size of the box header.

4.3.3.3.  NUT File Format

   The "codec_specific_data" element (in "stream_header" packet)
   contains the ConfigurationRecord bitstream.  See [NUT] for more
   information about elements.

   "NumBytes" is defined as the size, in bytes, of the
   "codec_specific_data" element as indicated in the "length" field of
   "codec_specific_data".

4.3.3.4.  Matroska File Format

   FFV1 SHOULD use "V_FFV1" as the Matroska "Codec ID".  For FFV1
   versions 2 or less, the Matroska "CodecPrivate" Element SHOULD NOT be
   used.  For FFV1 versions 3 or greater, the Matroska "CodecPrivate"
   Element MUST contain the FFV1 "Configuration Record" structure and no
   other data.  See [Matroska] for more information about elements.

   "NumBytes" is defined as the "Element Data Size" of the
   "CodecPrivate" Element.

4.4.  Frame

   A "Frame" is an encoded representation of a complete static image.
   The whole "Frame" is provided by the underlying container.

   A "Frame" consists of the "keyframe" field, "Parameters" (if
   "version" <= 1), and a sequence of independent slices.  The pseudo-
   code below describes the contents of a "Frame".

   The "keyframe" field has its own initial state, set to 128.

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   Frame( NumBytes ) {                                           |
       keyframe                                                  | br
       if (keyframe && !ConfigurationRecordIsPresent) {          |
           Parameters( )                                         |
       }                                                         |
       while (remaining_bits_in_bitstream( NumBytes )) {         |
           Slice( )                                              |
       }                                                         |
   }                                                             |

   Architecture overview of slices in a "Frame":

   +=================================================================+
   +=================================================================+
   | first slice header                                              |
   +-----------------------------------------------------------------+
   | first slice content                                             |
   +-----------------------------------------------------------------+
   | first slice footer                                              |
   +-----------------------------------------------------------------+
   | --------------------------------------------------------------- |
   +-----------------------------------------------------------------+
   | second slice header                                             |
   +-----------------------------------------------------------------+
   | second slice content                                            |
   +-----------------------------------------------------------------+
   | second slice footer                                             |
   +-----------------------------------------------------------------+
   | --------------------------------------------------------------- |
   +-----------------------------------------------------------------+
   | ...                                                             |
   +-----------------------------------------------------------------+
   | --------------------------------------------------------------- |
   +-----------------------------------------------------------------+
   | last slice header                                               |
   +-----------------------------------------------------------------+
   | last slice content                                              |
   +-----------------------------------------------------------------+
   | last slice footer                                               |
   +-----------------------------------------------------------------+

                                 Table 15

4.5.  Slice

   A "Slice" is an independent spatial sub-section of a "Frame" that is
   encoded separately from the other regions of the same "Frame".  The
   use of more than one "Slice" per "Frame" allows multithreaded
   encoding and decoding.

   A "Slice" consists of a "Slice Header" (when relevant), a "Slice
   Content", and a "Slice Footer" (when relevant).  The pseudo-code
   below describes the contents of a "Slice".

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   Slice( ) {                                                    |
       if (version >= 3) {                                       |
           SliceHeader( )                                        |
       }                                                         |
       SliceContent( )                                           |
       if (coder_type == 0) {                                    |
           while (!byte_aligned()) {                             |
               padding                                           | u(1)
           }                                                     |
       }                                                         |
       if (version <= 1) {                                       |
           while (remaining_bits_in_bitstream( NumBytes ) != 0) {|
               reserved                                          | u(1)
           }                                                     |
       }                                                         |
       if (version >= 3) {                                       |
           SliceFooter( )                                        |
       }                                                         |
   }                                                             |

   "padding" specifies a bit without any significance that is used only
   for byte alignment.  It MUST be 0.

   "reserved" specifies a bit without any significance in this revision
   of the specification; it may have a significance in a later revision
   of this specification.

   Encoders SHOULD NOT fill these bits.

   Decoders SHOULD ignore these bits.

4.6.
Slice Header 1792 A "Slice Header" provides information about the decoding 1793 configuration of the "Slice", such as its spatial position, size, and 1794 aspect ratio. The pseudo-code below describes the contents of the 1795 "Slice Header". 1797 "Slice Header" has its own initial states, all set to 128. 1799 pseudo-code | type 1800 --------------------------------------------------------------|----- 1801 SliceHeader( ) { | 1802 slice_x | ur 1803 slice_y | ur 1804 slice_width - 1 | ur 1805 slice_height - 1 | ur 1806 for (i = 0; i < quant_table_set_index_count; i++) { | 1807 quant_table_set_index[ i ] | ur 1808 } | 1809 picture_structure | ur 1810 sar_num | ur 1811 sar_den | ur 1812 } | 1814 4.6.1. slice_x 1816 "slice_x" indicates the x position on the slice raster formed by 1817 num_h_slices. 1819 Inferred to be 0 if not present. 1821 4.6.2. slice_y 1823 "slice_y" indicates the y position on the slice raster formed by 1824 num_v_slices. 1826 Inferred to be 0 if not present. 1828 4.6.3. slice_width 1830 "slice_width" indicates the width on the slice raster formed by 1831 num_h_slices. 1833 Inferred to be 1 if not present. 1835 4.6.4. slice_height 1837 "slice_height" indicates the height on the slice raster formed by 1838 num_v_slices. 1840 Inferred to be 1 if not present. 1842 4.6.5. quant_table_set_index_count 1844 "quant_table_set_index_count" is defined as: 1846 1 + ( ( chroma_planes || version <= 3 ) ? 1 : 0 ) 1847 + ( extra_plane ? 1 : 0 ) 1849 4.6.6. quant_table_set_index 1851 "quant_table_set_index" indicates the Quantization Table Set index to 1852 select the Quantization Table Set and the initial states for the 1853 "Slice Content". 1855 Inferred to be 0 if not present. 1857 4.6.7. picture_structure 1859 "picture_structure" specifies the temporal and spatial relationship 1860 of each "Line" of the "Frame". 1862 Inferred to be 0 if not present. 

   +=======+=========================+
   | value | picture structure used  |
   +=======+=========================+
   | 0     | unknown                 |
   +-------+-------------------------+
   | 1     | top field first         |
   +-------+-------------------------+
   | 2     | bottom field first      |
   +-------+-------------------------+
   | 3     | progressive             |
   +-------+-------------------------+
   | Other | reserved for future use |
   +-------+-------------------------+

                 Table 16

4.6.8.  sar_num

   "sar_num" specifies the "Sample" aspect ratio numerator.

   Inferred to be 0 if not present.

   A value of 0 means that the aspect ratio is unknown.

   Encoders MUST write 0 if the "Sample" aspect ratio is unknown.

   If "sar_den" is 0, decoders SHOULD ignore the encoded value and
   consider that "sar_num" is 0.

4.6.9.  sar_den

   "sar_den" specifies the "Sample" aspect ratio denominator.

   Inferred to be 0 if not present.

   A value of 0 means that the aspect ratio is unknown.

   Encoders MUST write 0 if the "Sample" aspect ratio is unknown.

   If "sar_num" is 0, decoders SHOULD ignore the encoded value and
   consider that "sar_den" is 0.

4.7.  Slice Content

   A "Slice Content" contains all "Line" elements that are part of the
   "Slice".

   Depending on the configuration, "Line" elements are ordered by
   "Plane" then by row (YCbCr) or by row then by "Plane" (RGB).

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   SliceContent( ) {                                             |
       if (colorspace_type == 0) {                               |
           for (p = 0; p < primary_color_count; p++) {           |
               for (y = 0; y < plane_pixel_height[ p ]; y++) {   |
                   Line( p, y )                                  |
               }                                                 |
           }                                                     |
       } else if (colorspace_type == 1) {                        |
           for (y = 0; y < slice_pixel_height; y++) {            |
               for (p = 0; p < primary_color_count; p++) {       |
                   Line( p, y )                                  |
               }                                                 |
           }                                                     |
       }                                                         |
   }                                                             |

4.7.1.
primary_color_count

   "primary_color_count" is defined as:

   1 + ( chroma_planes ? 2 : 0 ) + ( extra_plane ? 1 : 0 )

4.7.2.  plane_pixel_height

   "plane_pixel_height[ p ]" is the height in "Pixels" of "Plane" p of
   the "Slice".  It is defined as:

   chroma_planes == 1 && (p == 1 || p == 2)
       ? ceil(slice_pixel_height / (1 << log2_v_chroma_subsample))
       : slice_pixel_height

4.7.3.  slice_pixel_height

   "slice_pixel_height" is the height in "Pixels" of the slice.  It is
   defined as:

   floor(
          ( slice_y + slice_height )
          * frame_pixel_height
          / num_v_slices
        ) - slice_pixel_y

4.7.4.  slice_pixel_y

   "slice_pixel_y" is the slice vertical position in "Pixels".  It is
   defined as:

   floor( slice_y * frame_pixel_height / num_v_slices )

4.8.  Line

   A "Line" is a list of the sample differences (relative to the
   predictor) of primary color components.  The pseudo-code below
   describes the contents of the "Line".

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   Line( p, y ) {                                                |
       if (colorspace_type == 0) {                               |
           for (x = 0; x < plane_pixel_width[ p ]; x++) {        |
               sample_difference[ p ][ y ][ x ]                  | sd
           }                                                     |
       } else if (colorspace_type == 1) {                        |
           for (x = 0; x < slice_pixel_width; x++) {             |
               sample_difference[ p ][ y ][ x ]                  | sd
           }                                                     |
       }                                                         |
   }                                                             |

4.8.1.  plane_pixel_width

   "plane_pixel_width[ p ]" is the width in "Pixels" of "Plane" p of the
   "Slice".  It is defined as:

   chroma_planes == 1 && (p == 1 || p == 2)
       ? ceil( slice_pixel_width / (1 << log2_h_chroma_subsample) )
       : slice_pixel_width

4.8.2.  slice_pixel_width

   "slice_pixel_width" is the width in "Pixels" of the slice.  It is
   defined as:

   floor(
          ( slice_x + slice_width )
          * frame_pixel_width
          / num_h_slices
        ) - slice_pixel_x

4.8.3.
slice_pixel_x

   "slice_pixel_x" is the slice horizontal position in "Pixels".  It is
   defined as:

   floor( slice_x * frame_pixel_width / num_h_slices )

4.8.4.  sample_difference

   "sample_difference[ p ][ y ][ x ]" is the sample difference for
   "Sample" at "Plane" "p", y position "y", and x position "x".  The
   "Sample" value is computed based on the median predictor and context
   described in Section 3.2.

4.9.  Slice Footer

   A "Slice Footer" provides information about slice size and
   (optionally) parity.  The pseudo-code below describes the contents of
   the "Slice Footer".

   Note: "Slice Footer" is always byte aligned.

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   SliceFooter( ) {                                              |
       slice_size                                                | u(24)
       if (ec) {                                                 |
           error_status                                          | u(8)
           slice_crc_parity                                      | u(32)
       }                                                         |
   }                                                             |

4.9.1.  slice_size

   "slice_size" indicates the size of the slice in bytes.

   Note: this allows finding the start of slices before previous slices
   have been fully decoded, and allows parallel decoding as well as
   error resilience.

4.9.2.  error_status

   "error_status" specifies the error status.

   +=======+=======================================+
   | value | error status                          |
   +=======+=======================================+
   | 0     | no error                              |
   +-------+---------------------------------------+
   | 1     | slice contains a correctable error    |
   +-------+---------------------------------------+
   | 2     | slice contains an uncorrectable error |
   +-------+---------------------------------------+
   | Other | reserved for future use               |
   +-------+---------------------------------------+

                        Table 17

4.9.3.  slice_crc_parity

   "slice_crc_parity" is a 32-bit value chosen so that the slice as a
   whole has a CRC remainder of 0.

   This is equivalent to storing the CRC remainder in the 32-bit parity.
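
   The parity property can be illustrated with a minimal C sketch.  It
   is not taken from the reference implementation.  It assumes the CRC
   parameters this section gives (generator polynomial 0x104C11DB7,
   initial value 0, no pre-inversion, no post-inversion), processes
   bits most significant first, and stores the parity big-endian in the
   last 4 bytes of the slice.  The function names "ffv1_crc32",
   "ffv1_write_parity", and "ffv1_slice_crc_ok" are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32, most significant bit first.  The 33-bit generator
 * 0x104C11DB7 is used without its implicit top bit (0x04C11DB7).
 * Initial value 0, no pre-inversion, no post-inversion. */
static uint32_t ffv1_crc32(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint32_t)buf[i] << 24;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x80000000u) ? (crc << 1) ^ 0x04C11DB7u
                                      : (crc << 1);
    }
    return crc;
}

/* Hypothetical encoder-side helper: fill the last 4 bytes of the slice
 * with the CRC of everything before them, stored big-endian. */
static void ffv1_write_parity(uint8_t *slice, size_t len)
{
    assert(len >= 4);
    uint32_t crc = ffv1_crc32(slice, len - 4);
    slice[len - 4] = (uint8_t)(crc >> 24);
    slice[len - 3] = (uint8_t)(crc >> 16);
    slice[len - 2] = (uint8_t)(crc >> 8);
    slice[len - 1] = (uint8_t)crc;
}

/* Hypothetical decoder-side check: an intact slice, parity included,
 * leaves a CRC remainder of 0. */
static int ffv1_slice_crc_ok(const uint8_t *slice, size_t len)
{
    return ffv1_crc32(slice, len) == 0;
}
```

   With these parameters, running the CRC over the slice including its
   parity yields 0 exactly when the stored parity equals the CRC of the
   preceding bytes, so a decoder needs only one pass and one comparison
   against 0.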

   The CRC generator polynomial used is the standard IEEE CRC polynomial
   (0x104C11DB7), with initial value 0, without pre-inversion and
   without post-inversion.

5.  Restrictions

   To ensure that fast multithreaded decoding is possible, starting with
   version 3 and if "frame_pixel_width * frame_pixel_height" is more
   than 101376, "slice_width * slice_height" MUST be less than or equal
   to "num_h_slices * num_v_slices / 4".  Note: 101376 is the frame size
   in "Pixels" of a 352x288 frame, also known as the CIF ("Common
   Intermediate Format") frame size.

   For each "Frame", each position in the slice raster MUST be filled by
   one and only one slice of the "Frame" (no missing slice position, no
   slice overlapping).

   For each "Frame" with a "keyframe" value of 0, each slice MUST have
   the same value of "slice_x", "slice_y", "slice_width", and
   "slice_height" as a slice in the previous "Frame".

6.  Security Considerations

   Like any other codec (such as [RFC6716]), FFV1 should not be used
   with insecure ciphers or cipher modes that are vulnerable to known
   plaintext attacks.  Some of the header bits as well as the padding
   are easily predictable.

   Implementations of the FFV1 codec need to take appropriate security
   considerations into account, as outlined in [RFC4732].  It is
   extremely important for the decoder to be robust against malicious
   payloads.  Malicious payloads must not cause the decoder to overrun
   its allocated memory or to take an excessive amount of resources to
   decode.  The same applies to the encoder, even though problems in
   encoders are typically rarer.  Malicious video streams must not cause
   the encoder to misbehave because this would allow an attacker to
   attack transcoding gateways.
   A frequent security problem in image and video codecs is failing to
   check for integer overflows in "Pixel" count computations, that is,
   allocating width * height without considering that the multiplication
   result may have overflowed the range of the arithmetic type.  The
   range coder could, if implemented naively, read one byte over the
   end.  The implementation must ensure that no read outside allocated
   and initialized memory occurs.

   The reference implementation [REFIMPL] contains no known buffer
   overflows or cases where a specially crafted packet or video segment
   could cause a significant increase in CPU load.

   The reference implementation [REFIMPL] was validated in the following
   conditions:

   *  Sending the decoder valid packets generated by the reference
      encoder and verifying that the decoder's output matches the
      encoder's input.

   *  Sending the decoder packets generated by the reference encoder and
      then subjected to random corruption.

   *  Sending the decoder random packets that are not FFV1.

   In all of the conditions above, the decoder and encoder were run
   inside the [VALGRIND] memory debugger as well as Clang's address
   sanitizer [Address-Sanitizer], which track reads and writes to
   invalid memory regions as well as the use of uninitialized memory.
   There were no errors reported on any of the tested conditions.

7.  Media Type Definition

   This registration is done using the template defined in [RFC6838] and
   following [RFC4855].

   Type name: video

   Subtype name: FFV1

   Required parameters: None.

   Optional parameters: These parameters are used to signal the
   capabilities of a receiver implementation.  These parameters MUST NOT
   be used for any other purpose.

   *  "version": The "version" of the FFV1 encoding as defined by
      Section 4.2.1.

   *  "micro_version": The "micro_version" of the FFV1 encoding as
      defined by Section 4.2.2.

   *  "coder_type": The "coder_type" of the FFV1 encoding as defined by
      Section 4.2.3.

   *  "colorspace_type": The "colorspace_type" of the FFV1 encoding as
      defined by Section 4.2.5.

   *  "bits_per_raw_sample": The "bits_per_raw_sample" of the FFV1
      encoding as defined by Section 4.2.7.

   *  "max_slices": The value of "max_slices" is an integer indicating
      the maximum count of slices within a frame of the FFV1 encoding.

   Encoding considerations: This media type is defined for encapsulation
   in several audiovisual container formats and contains binary data;
   see Section 4.3.3.  This media type is framed binary data; see
   Section 4.8 of [RFC6838].

   Security considerations: See Section 6 of this document.

   Interoperability considerations: None.

   Published specification: RFC XXXX.

   [RFC Editor: Upon publication as an RFC, please replace "XXXX" with
   the number assigned to this document and remove this note.]

   Applications which use this media type: Any application that requires
   the transport of lossless video can use this media type.  Examples
   include, but are not limited to, screen recording, scientific
   imaging, and digital video preservation.

   Fragment identifier considerations: N/A.

   Additional information: None.

   Person & email address to contact for further information: Michael
   Niedermayer (michael@niedermayer.cc)

   Intended usage: COMMON

   Restrictions on usage: None.

   Author: Dave Rice (dave@dericed.com)

   Change controller: IETF cellar working group delegated from the IESG.

8.  IANA Considerations

   The IANA is requested to register the following values:

   *  Media type registration as described in Section 7.

9.
Changelog

   See https://github.com/FFmpeg/FFV1/commits/master

   [RFC Editor: Please remove this Changelog section prior to
   publication.]

10.  Normative References

   [RFC6838]  Freed, N., Klensin, J., and T. Hansen, "Media Type
              Specifications and Registration Procedures", BCP 13,
              RFC 6838, DOI 10.17487/RFC6838, January 2013.

   [ISO.9899.2018]
              International Organization for Standardization,
              "Programming languages - C", 2018.

   [RFC6716]  Valin, JM., Vos, K., and T. Terriberry, "Definition of the
              Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716,
              September 2012.

   [ISO.9899.1990]
              International Organization for Standardization,
              "Programming languages - C", 1990.

   [ISO.15444-1.2016]
              International Organization for Standardization,
              "Information technology -- JPEG 2000 image coding system:
              Core coding system", October 2016.

   [Matroska] IETF, "Matroska", 2019.

   [RFC4732]  Handley, M., Ed., Rescorla, E., Ed., and IAB, "Internet
              Denial-of-Service Considerations", RFC 4732,
              DOI 10.17487/RFC4732, December 2006.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4855]  Casner, S., "Media Type Registration of RTP Payload
              Formats", RFC 4855, DOI 10.17487/RFC4855, February 2007.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

11.  Informative References

   [FFV1_V3]  Niedermayer, M., "Commit to mark FFV1 version 3 as non-
              experimental", August 2013.

   [ISO.14496-12.2015]
              International Organization for Standardization,
              "Information technology -- Coding of audio-visual objects
              -- Part 12: ISO base media file format", December 2015.

   [NUT]      Niedermayer, M., "NUT Open Container Format", December
              2013.

   [range-coding]
              Nigel, G. and N. Martin, "Range encoding: an algorithm for
              removing redundancy from a digitised message.", July 1979.

   [AVI]      Microsoft, "AVI RIFF File Reference", undated.

   [FFV1_V0]  Niedermayer, M., "Commit to mark FFV1 version 0 as non-
              experimental", April 2006.

   [YCbCr]    Wikipedia, "YCbCr", undated.

   [HuffYUV]  Rudiak-Gould, B., "HuffYUV", December 2003.

   [REFIMPL]  Niedermayer, M., "The reference FFV1 implementation / the
              FFV1 codec in FFmpeg", undated.

   [FFV1_V1]  Niedermayer, M., "Commit to release FFV1 version 1", April
              2009.

   [VALGRIND] Valgrind Developers, "Valgrind website", undated.

   [ISO.14496-10.2014]
              International Organization for Standardization,
              "Information technology -- Coding of audio-visual objects
              -- Part 10: Advanced Video Coding", September 2014.

   [Address-Sanitizer]
              The Clang Team, "ASAN AddressSanitizer website", undated.

   [ISO.14495-1.1999]
              International Organization for Standardization,
              "Information technology -- Lossless and near-lossless
              compression of continuous-tone still images: Baseline",
              December 1999.

Appendix A.  Multi-threaded decoder implementation suggestions

   This appendix is informative.

   The FFV1 bitstream is parsable in two ways: in sequential order as
   described in this document or with the pre-analysis of the footer of
   each slice.  Each slice footer contains a "slice_size" field so the
   boundary of each slice is computable without having to parse the
   slice content.  That allows multi-threading as well as independence
   of slice content (a bitstream error in a slice header or slice
   content has no impact on the decoding of the other slices).
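
   The footer pre-analysis can be sketched in C as follows.  This is a
   minimal illustration under stated assumptions, not the reference
   decoder: it assumes that "slice_size" counts the bytes of a slice
   that precede its footer, that the footer is 3 bytes long without
   "ec" and 8 bytes long with it (per the "SliceFooter" pseudo-code),
   and that the buffer passed in consists only of back-to-back slices.
   The type "SliceSpan" and the function "ffv1_find_slices" are
   hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: byte range of one slice, footer included. */
typedef struct {
    size_t offset;
    size_t size;
} SliceSpan;

/* Walk the slice footers from the end of the buffer to its start.
 * Returns the number of slices found (out[0] is the first slice), or
 * -1 if the sizes are not coherent or max_slices would be exceeded. */
static int ffv1_find_slices(const uint8_t *frame, size_t frame_len,
                            int ec, SliceSpan *out, int max_slices)
{
    const size_t footer_len = ec ? 8 : 3; /* u(24) [+ u(8) + u(32)] */
    size_t end = frame_len;
    int count = 0;

    while (end > 0) {
        if (end < footer_len || count == max_slices)
            return -1;
        /* slice_size is the first 24 bits of the footer, big-endian. */
        const uint8_t *f = frame + end - footer_len;
        size_t body = ((size_t)f[0] << 16) | ((size_t)f[1] << 8) | f[2];
        size_t total = body + footer_len; /* assumption: size excludes footer */
        if (total > end)
            return -1; /* slice would start before the buffer */
        out[count].offset = end - total;
        out[count].size = total;
        count++;
        end -= total;
    }

    /* Slices were collected last to first; restore bitstream order. */
    for (int i = 0, j = count - 1; i < j; i++, j--) {
        SliceSpan tmp = out[i];
        out[i] = out[j];
        out[j] = tmp;
    }
    return count;
}
```

   Each span returned this way can be handed to a separate worker.  A
   return value of -1 corresponds to the incoherent "slice_size" case,
   where falling back to sequential parsing is an option.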

   After checking the "keyframe" field, a decoder SHOULD parse the
   "slice_size" fields before parsing any slice, starting from the
   "slice_size" of the last slice at the end of the "Frame" and working
   back to the "slice_size" of the first slice at the beginning of the
   "Frame", in order to obtain the slice boundaries.  A decoder MAY fall
   back to sequential order, e.g., in case of a corrupted "Frame" (frame
   size unknown, "slice_size" of slices not coherent...) or if there is
   no possibility of seeking into the stream.

Appendix B.  Future handling of some streams created by non-conforming
             encoders

   This appendix is informative.

   Some bitstreams were found with 40 extra bits corresponding to
   "error_status" and "slice_crc_parity" in the "reserved" bits of
   "Slice()".  Any revision of this specification SHOULD avoid adding
   40 bits of content after "SliceContent" if "version" == 0 or
   "version" == 1.  Otherwise, a decoder conforming to the revised
   specification could not distinguish between a revised bitstream and
   such a buggy bitstream in the wild.

Authors' Addresses

   Michael Niedermayer

   Email: michael@niedermayer.cc


   Dave Rice

   Email: dave@dericed.com


   Jerome Martinez

   Email: jerome@mediaarea.net