cellar                                                    M. Niedermayer
Internet-Draft                                                   D. Rice
Intended status: Informational                               J. Martinez
Expires: March 9, 2020                                 September 6, 2019


              FFV1 Video Coding Format Version 0, 1, and 3
                       draft-ietf-cellar-ffv1-09

Abstract

   This document defines FFV1, a lossless intra-frame video encoding
   format.  FFV1 is designed to efficiently compress video data in a
   variety of pixel formats.  Compared to uncompressed video, FFV1
   offers storage compression, frame fixity, and self-description, which
   makes FFV1 useful as a preservation or intermediate video format.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on March 9, 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Notation and Conventions
     2.1.  Definitions
     2.2.  Conventions
       2.2.1.  Pseudo-code
       2.2.2.  Arithmetic Operators
       2.2.3.  Assignment Operators
       2.2.4.  Comparison Operators
       2.2.5.  Mathematical Functions
       2.2.6.  Order of Operation Precedence
       2.2.7.  Range
       2.2.8.  NumBytes
       2.2.9.  Bitstream Functions
   3.  Sample Coding
     3.1.  Border
     3.2.  Samples
     3.3.  Median Predictor
     3.4.  Context
     3.5.  Quantization Table Sets
     3.6.  Quantization Table Set Indexes
     3.7.  Color spaces
       3.7.1.  YCbCr
       3.7.2.  RGB
     3.8.  Coding of the Sample Difference
       3.8.1.  Range Coding Mode
       3.8.2.  Golomb Rice Mode
   4.  Bitstream
     4.1.  Parameters
       4.1.1.  version
       4.1.2.  micro_version
       4.1.3.  coder_type
       4.1.4.  state_transition_delta
       4.1.5.  colorspace_type
       4.1.6.  chroma_planes
       4.1.7.  bits_per_raw_sample
       4.1.8.  log2_h_chroma_subsample
       4.1.9.  log2_v_chroma_subsample
       4.1.10. extra_plane
       4.1.11. num_h_slices
       4.1.12. num_v_slices
       4.1.13. quant_table_set_count
       4.1.14. states_coded
       4.1.15. initial_state_delta
       4.1.16. ec
       4.1.17. intra
     4.2.  Configuration Record
       4.2.1.  reserved_for_future_use
       4.2.2.  configuration_record_crc_parity
       4.2.3.  Mapping FFV1 into Containers
     4.3.  Frame
     4.4.  Slice
     4.5.  Slice Header
       4.5.1.  slice_x
       4.5.2.  slice_y
       4.5.3.  slice_width
       4.5.4.  slice_height
       4.5.5.  quant_table_set_index_count
       4.5.6.  quant_table_set_index
       4.5.7.  picture_structure
       4.5.8.  sar_num
       4.5.9.  sar_den
     4.6.  Slice Content
       4.6.1.  primary_color_count
       4.6.2.  plane_pixel_height
       4.6.3.  slice_pixel_height
       4.6.4.  slice_pixel_y
     4.7.  Line
       4.7.1.  plane_pixel_width
       4.7.2.  slice_pixel_width
       4.7.3.  slice_pixel_x
       4.7.4.  sample_difference
     4.8.  Slice Footer
       4.8.1.  slice_size
       4.8.2.  error_status
       4.8.3.  slice_crc_parity
     4.9.  Quantization Table Set
       4.9.1.  quant_tables
       4.9.2.  context_count
   5.  Restrictions
   6.  Security Considerations
   7.  Media Type Definition
   8.  IANA Considerations
   9.  Appendixes
     9.1.  Decoder implementation suggestions
       9.1.1.  Multi-threading Support and Independence of Slices
   10. Changelog
   11. Normative References
   12. Informative References
   Authors' Addresses

1.  Introduction

   This document describes FFV1, a lossless video encoding format.  The
   design of FFV1 considers the storage of image characteristics, data
   fixity, and the optimized use of encoding time and storage
   requirements.
   FFV1 is designed to support a wide range of lossless video
   applications such as long-term audiovisual preservation, scientific
   imaging, screen recording, and other video encoding scenarios that
   seek to avoid the generational loss of lossy video encodings.

   This document defines versions 0, 1, and 3 of FFV1.  The distinctions
   between the versions are provided throughout the document, but in
   summary:

   *  Version 0 of FFV1 was the original implementation of FFV1 and has
      been in non-experimental use since April 14, 2006 [FFV1_V0].

   *  Version 1 of FFV1 adds support for more video bit depths and has
      been in use since April 24, 2009 [FFV1_V1].

   *  Version 2 of FFV1 only existed in experimental form and is not
      described by this document, but is available as a LyX file at
      https://github.com/FFmpeg/FFV1/blob/
      8ad772b6d61c3dd8b0171979a2cd9f11924d5532/ffv1.lyx.

   *  Version 3 of FFV1 adds several features such as increased
      description of the characteristics of the encoding images and
      embedded CRC data to support fixity verification of the encoding.
      Version 3 has been in non-experimental use since August 17, 2013
      [FFV1_V3].

   The latest version of this document is available at
   https://raw.github.com/FFmpeg/FFV1/master/ffv1.md.

   This document assumes familiarity with mathematical and coding
   concepts such as Range coding [range-coding] and YCbCr color spaces
   [YCbCr].

2.  Notation and Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.1.  Definitions

   "Container": Format that encapsulates "Frames" (see Section 4.3) and
   (when required) a "Configuration Record" into a bitstream.

   "Sample": The smallest addressable representation of a color
   component or a luma component in a "Frame".  Examples of "Sample" are
   Luma, Blue Chrominance, Red Chrominance, Transparency, Red, Green,
   and Blue.

   "Plane": A discrete component of a static image comprised of
   "Samples" that represent a specific quantification of "Samples" of
   that image.

   "Pixel": The smallest addressable representation of a color in a
   "Frame".  It is composed of 1 or more "Samples".

   "ESC": An ESCape symbol to indicate that the symbol to be stored is
   too large for normal storage and that an alternate storage method is
   used.

   "MSB": Most Significant Bit, the bit that can cause the largest
   change in magnitude of the symbol.

   "RCT": Reversible Color Transform, a near linear, exactly reversible
   integer transform that converts between RGB and YCbCr representations
   of a "Pixel".

   "VLC": Variable Length Code, a code that maps source symbols to a
   variable number of bits.

   "RGB": A reference to the method of storing the value of a "Pixel" by
   using three numeric values that represent Red, Green, and Blue.

   "YCbCr": A reference to the method of storing the value of a "Pixel"
   by using three numeric values that represent the luma of the "Pixel"
   (Y) and the chrominance of the "Pixel" (Cb and Cr).  The term YCbCr
   is used for historical reasons and currently references any color
   space relying on 1 luma "Sample" and 2 chrominance "Samples", e.g.
   YCbCr, YCgCo, or ICtCp.  The exact meaning of the three numeric
   values is unspecified.

   "TBA": To Be Announced.  Used in reference to the development of
   future iterations of the FFV1 specification.

2.2.  Conventions

2.2.1.  Pseudo-code

   The FFV1 bitstream is described in this document using pseudo-code.
   Note that the pseudo-code is used for clarity in order to illustrate
   the structure of FFV1 and is not intended to specify any particular
   implementation.  The pseudo-code used is based upon the C programming
   language [ISO.9899.1990] and uses its "if/else", "while", and "for"
   keywords as well as functions defined within this document.

2.2.2.  Arithmetic Operators

   Note: the operators and the order of precedence are the same as used
   in the C programming language [ISO.9899.1990], with the exception of
   "^", which denotes exponentiation in this document rather than
   bit-wise exclusive "or".

   "a + b" means a plus b.

   "a - b" means a minus b.

   "-a" means negation of a.

   "a * b" means a multiplied by b.

   "a / b" means a divided by b.

   "a % b" means the remainder of the integer division of a by b.

   "a ^ b" means a raised to the b-th power.

   "a & b" means bit-wise "and" of a and b.

   "a | b" means bit-wise "or" of a and b.

   "a >> b" means arithmetic right shift of two's complement integer
   representation of a by b binary digits.

   "a << b" means arithmetic left shift of two's complement integer
   representation of a by b binary digits.

2.2.3.  Assignment Operators

   "a = b" means a is assigned b.

   "a++" is equivalent to a is assigned a + 1.

   "a--" is equivalent to a is assigned a - 1.

   "a += b" is equivalent to a is assigned a + b.

   "a -= b" is equivalent to a is assigned a - b.

   "a *= b" is equivalent to a is assigned a * b.

2.2.4.  Comparison Operators

   "a > b" means a is greater than b.

   "a >= b" means a is greater than or equal to b.

   "a < b" means a is less than b.

   "a <= b" means a is less than or equal to b.

   "a == b" means a is equal to b.

   "a != b" means a is not equal to b.

   "a && b" means Boolean logical "and" of a and b.

   "a || b" means Boolean logical "or" of a and b.

   "!a" means Boolean logical "not" of a.

   "a ? b : c" if a is true, then b, otherwise c.

2.2.5.  Mathematical Functions

   floor(a) the largest integer less than or equal to a

   ceil(a) the smallest integer greater than or equal to a

   sign(a) extracts the sign of a number, i.e. if a < 0 then -1, else if
   a > 0 then 1, else 0

   abs(a) the absolute value of a, i.e. abs(a) = sign(a)*a

   log2(a) the base-two logarithm of a

   min(a,b) the smallest of two values a and b

   max(a,b) the largest of two values a and b

   median(a,b,c) the numerical middle value in a data set of a, b, and
   c, i.e. a+b+c-min(a,b,c)-max(a,b,c)

   a_(b) the b-th value of a sequence of a

   a_(b,c) the 'b,c'-th value of a sequence of a

2.2.6.  Order of Operation Precedence

   When order of precedence is not indicated explicitly by use of
   parentheses, operations are evaluated in the following order (from
   top to bottom, operations of same precedence being evaluated from
   left to right).  This order of operations is based on the order of
   operations used in Standard C.

   a++, a--
   !a, -a
   a ^ b
   a * b, a / b, a % b
   a + b, a - b
   a << b, a >> b
   a < b, a <= b, a > b, a >= b
   a == b, a != b
   a & b
   a | b
   a && b
   a || b
   a ? b : c
   a = b, a += b, a -= b, a *= b

2.2.7.  Range

   "a...b" means any value starting from a to b, inclusive.

2.2.8.  NumBytes

   "NumBytes" is a non-negative integer that expresses the size in 8-bit
   octets of a particular FFV1 "Configuration Record" or "Frame".  FFV1
   relies on its "Container" to store the "NumBytes" values, see
   Section 4.2.3.

2.2.9.  Bitstream Functions

2.2.9.1.  remaining_bits_in_bitstream

   "remaining_bits_in_bitstream( )" means the count of remaining bits
   after the pointer in that "Configuration Record" or "Frame".
   It is computed from the "NumBytes" value multiplied by 8 minus the
   count of bits of that "Configuration Record" or "Frame" already read
   by the bitstream parser.

2.2.9.2.  remaining_symbols_in_syntax

   "remaining_symbols_in_syntax( )" is true as long as the RangeCoder
   has not consumed all the given input bytes.

2.2.9.3.  byte_aligned

   "byte_aligned( )" is true if "remaining_bits_in_bitstream( NumBytes
   )" is a multiple of 8, otherwise false.

2.2.9.4.  get_bits

   "get_bits( i )" is the action to read the next "i" bits in the
   bitstream, from most significant bit to least significant bit, and to
   return the corresponding value.  The pointer is increased by "i".

3.  Sample Coding

   For each "Slice" (see Section 4.4) of a "Frame", the "Planes",
   "Lines", and "Samples" are coded in an order determined by the "Color
   Space" (see Section 3.7).  Each "Sample" is predicted by the median
   predictor described in Section 3.3 from other "Samples" within the
   same "Plane", and the difference is stored using the method described
   in Section 3.8.

3.1.  Border

   A border is assumed for each coded "Slice" for the purpose of the
   median predictor and context according to the following rules:

   *  one column of "Samples" to the left of the coded slice is assumed
      as identical to the "Samples" of the leftmost column of the coded
      slice shifted down by one row.  The value of the topmost "Sample"
      of the column of "Samples" to the left of the coded slice is
      assumed to be "0"

   *  one column of "Samples" to the right of the coded slice is assumed
      as identical to the "Samples" of the rightmost column of the coded
      slice

   *  an additional column of "Samples" to the left of the coded slice
      and two rows of "Samples" above the coded slice are assumed to be
      "0"

   The following table depicts a slice of 9 "Samples"
   "a,b,c,d,e,f,g,h,i" in a 3x3 arrangement along with its assumed
   border.

   +---+---+---+---+---+---+---+---+
   | 0 | 0 |   | 0 | 0 | 0 |   | 0 |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |   | 0 | 0 | 0 |   | 0 |
   +---+---+---+---+---+---+---+---+
   |   |   |   |   |   |   |   |   |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |   | a | b | c |   | c |
   +---+---+---+---+---+---+---+---+
   | 0 | a |   | d | e | f |   | f |
   +---+---+---+---+---+---+---+---+
   | 0 | d |   | g | h | i |   | i |
   +---+---+---+---+---+---+---+---+

3.2.  Samples

   Relative to any "Sample" "X", six other relatively positioned
   "Samples" from the coded "Samples" and presumed border are identified
   according to the labels used in the following diagram.  The labels
   for these relatively positioned "Samples" are used within the median
   predictor and context.

   +---+---+---+---+
   |   |   | T |   |
   +---+---+---+---+
   |   |tl | t |tr |
   +---+---+---+---+
   | L | l | X |   |
   +---+---+---+---+

   The labels for these relative "Samples" are made of the first letters
   of the words Top, Left, and Right.

3.3.  Median Predictor

   The prediction for any "Sample" value at position "X" may be computed
   based upon the relative neighboring values of "l", "t", and "tl" via
   this equation:

   "median(l, t, l + t - tl)".

   Note that this prediction template is also used in [ISO.14495-1.1999]
   and [HuffYUV].

   Exception for the median predictor: if "colorspace_type == 0 &&
   bits_per_raw_sample == 16 && ( coder_type == 1 || coder_type == 2 )",
   the following median predictor MUST be used:

   "median(left16s, top16s, left16s + top16s - diag16s)"

   where:

   left16s = l  >= 32768 ? ( l  - 65536 ) : l
   top16s  = t  >= 32768 ? ( t  - 65536 ) : t
   diag16s = tl >= 32768 ? ( tl - 65536 ) : tl

   Background: a two's complement 16-bit signed integer was used for
   storing "Sample" values in all known implementations of the FFV1
   bitstream.  As a result, in some circumstances the most significant
   bit was wrongly interpreted (used as a sign bit instead of the 16th
   bit of an unsigned integer).  Note that when the issue was
   discovered, the only impacted configuration among all known
   implementations was 16-bit YCbCr with no Pixel transformation and the
   Range coder, as other potentially impacted configurations (e.g.
   15/16-bit JPEG2000-RCT with the Range coder, or 16-bit content with
   the Golomb Rice coder) were not implemented anywhere
   [ISO.15444-1.2016].  Meanwhile, 16-bit JPEG2000-RCT with the Range
   coder was implemented without this issue in one implementation and
   validated by one conformance checker.  It is expected (to be
   confirmed) that this exception for the median predictor will be
   removed in the next version of the FFV1 bitstream.

3.4.  Context

   Relative to any "Sample" "X", the Quantized Sample Differences "L-l",
   "l-tl", "tl-t", "T-t", and "t-tr" are used as context:

   context = Q_{0}[l - tl] +
             Q_{1}[tl - t] +
             Q_{2}[t - tr] +
             Q_{3}[L - l] +
             Q_{4}[T - t]

                                Figure 1

   If "context >= 0" then "context" is used and the difference between
   the "Sample" and its predicted value is encoded as is, else
   "-context" is used and the difference between the "Sample" and its
   predicted value is encoded with a flipped sign.

3.5.  Quantization Table Sets

   The FFV1 bitstream contains 1 or more Quantization Table Sets.  Each
   Quantization Table Set contains exactly 5 Quantization Tables, with
   each Quantization Table corresponding to 1 of the 5 Quantized Sample
   Differences.  For each Quantization Table, both the number of
   quantization steps and their distribution are stored in the FFV1
   bitstream; each Quantization Table has exactly 256 entries, and the 8
   least significant bits of the Quantized Sample Difference are used as
   index:

   Q_{j}[k] = quant_tables[i][j][k&255]

                                Figure 2

   In this formula, "i" is the Quantization Table Set index, "j" is the
   Quantization Table index, and "k" is the Quantized Sample Difference.

3.6.  Quantization Table Set Indexes

   For each "Plane" of each slice, a Quantization Table Set is selected
   from an index:

   *  For the Y "Plane", the "quant_table_set_index[ 0 ]" index is used

   *  For the Cb and Cr "Planes", the "quant_table_set_index[ 1 ]" index
      is used

   *  For the extra "Plane", the "quant_table_set_index[ (version <= 3
      || chroma_planes) ? 2 : 1 ]" index is used

   Background: in the first implementations of the FFV1 bitstream, the
   index for the Cb and Cr "Planes" was stored even if it was not used
   (chroma_planes set to 0).  This index is kept for version <= 3 in
   order to keep compatibility with FFV1 bitstreams in the wild.

3.7.  Color spaces

   FFV1 supports several color spaces.  The count of allowed coded
   planes and the meaning of the extra "Plane" are determined by the
   selected color space.

   The FFV1 bitstream interleaves data in an order determined by the
   color space.  In YCbCr, for each "Plane", each "Line" is coded from
   top to bottom, and for each "Line", each "Sample" is coded from left
   to right.  In JPEG2000-RCT, for each "Line" from top to bottom, each
   "Plane" is coded, and for each "Plane", each "Sample" is encoded from
   left to right.

3.7.1.  YCbCr

   This color space allows 1 to 4 "Planes".

   The Cb and Cr "Planes" are optional, but if used then MUST be used
   together.  Omitting the Cb and Cr "Planes" codes the frames in
   grayscale without color data.

   An optional transparency "Plane" can be used to code transparency
   data.

   An FFV1 "Frame" using YCbCr MUST use one of the following
   arrangements:

   *  Y

   *  Y, Transparency

   *  Y, Cb, Cr

   *  Y, Cb, Cr, Transparency

   The Y "Plane" MUST be coded first.  If the Cb and Cr "Planes" are
   used then they MUST be coded after the Y "Plane".  If a transparency
   "Plane" is used, then it MUST be coded last.

3.7.2.  RGB

   This color space allows 3 or 4 "Planes".

   An optional transparency "Plane" can be used to code transparency
   data.

   JPEG2000-RCT is a Reversible Color Transform that codes RGB (red,
   green, blue) "Planes" losslessly in a modified YCbCr color space
   [ISO.15444-1.2016].  Reversible Pixel transformations between YCbCr
   and RGB use the following formulae.  The right shift applies to the
   sum (Cb+Cr) only, which is made explicit here with parentheses.

   Cb=b-g
   Cr=r-g
   Y=g+((Cb+Cr)>>2)
   g=Y-((Cb+Cr)>>2)
   r=Cr+g
   b=Cb+g

                                Figure 3

   Exception for the JPEG2000-RCT conversion: if bits_per_raw_sample is
   between 9 and 15 inclusive and extra_plane is 0, the following
   formulae for reversible conversions between YCbCr and RGB MUST be
   used instead of the ones above:

   Cb=g-b
   Cr=r-b
   Y=b+((Cb+Cr)>>2)
   b=Y-((Cb+Cr)>>2)
   r=Cr+b
   g=Cb+b

                                Figure 4

   Background: At the time of this writing, in all known implementations
   of the FFV1 bitstream, when bits_per_raw_sample was between 9 and 15
   inclusive and extra_plane was 0, GBR "Planes" were used as BGR
   "Planes" during both encoding and decoding.  Meanwhile, 16-bit
   JPEG2000-RCT was implemented without this issue in one implementation
   and validated by one conformance checker.  Methods to address this
   exception for the transform are under consideration for the next
   version of the FFV1 bitstream.
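
   The transform of Figure 3 can be sketched in C as shown below.  This
   is a minimal illustration, not a reference implementation: the type
   names "Rgb" and "Ycc" and the function names "rct_forward" and
   "rct_inverse" are assumptions of this sketch.  It also assumes that
   ">>" on a negative "int" is an arithmetic shift, as Section 2.2.2
   defines it; in C this is implementation-defined.  The shift is
   parenthesized explicitly because "+" binds more tightly than ">>" in
   C.

```c
#include <assert.h>

/* Hypothetical containers for one Pixel in each representation. */
typedef struct { int r, g, b; } Rgb;
typedef struct { int y, cb, cr; } Ycc;

/* Forward JPEG2000-RCT as in Figure 3: RGB to modified YCbCr. */
static Ycc rct_forward(Rgb p)
{
    Ycc o;
    o.cb = p.b - p.g;
    o.cr = p.r - p.g;
    o.y  = p.g + ((o.cb + o.cr) >> 2);
    return o;
}

/* Inverse JPEG2000-RCT as in Figure 3: modified YCbCr back to RGB.
 * The same shifted term is subtracted again, so g is recovered
 * exactly and the round trip is lossless. */
static Rgb rct_inverse(Ycc p)
{
    Rgb o;
    o.g = p.y - ((p.cb + p.cr) >> 2);
    o.r = p.cr + o.g;
    o.b = p.cb + o.g;
    return o;
}
```

   For example, the "Pixel" r = 200, g = 100, b = 50 gives Cb = -50,
   Cr = 100, and Y = 100 + (50 >> 2) = 112, and the inverse transform
   returns the original three values.  The variant of Figure 4 follows
   the same pattern with the roles of g and b exchanged.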

   When FFV1 uses the JPEG2000-RCT, the horizontal "Lines" are
   interleaved to improve caching efficiency, since it is most likely
   that the JPEG2000-RCT will immediately be converted to RGB during
   decoding.  The interleaved coding order is also Y, then Cb, then Cr,
   and then, if used, transparency.

   As an example, a "Frame" that is two "Pixels" wide and two "Pixels"
   high could be composed of the following structure:

   +------------------------+------------------------+
   | Pixel(1,1)             | Pixel(2,1)             |
   | Y(1,1) Cb(1,1) Cr(1,1) | Y(2,1) Cb(2,1) Cr(2,1) |
   +------------------------+------------------------+
   | Pixel(1,2)             | Pixel(2,2)             |
   | Y(1,2) Cb(1,2) Cr(1,2) | Y(2,2) Cb(2,2) Cr(2,2) |
   +------------------------+------------------------+

   In JPEG2000-RCT, the coding order would be left to right and then top
   to bottom, with values interleaved by "Lines" and stored in this
   order:

   Y(1,1) Y(2,1) Cb(1,1) Cb(2,1) Cr(1,1) Cr(2,1) Y(1,2) Y(2,2) Cb(1,2)
   Cb(2,2) Cr(1,2) Cr(2,2)

3.8.  Coding of the Sample Difference

   Instead of coding the n+1 bits of the Sample Difference with Huffman
   or Range coding (or n+2 bits, in the case of JPEG2000-RCT), only the
   n (or n+1, in the case of JPEG2000-RCT) least significant bits are
   used, since this is sufficient to recover the original "Sample".  In
   the equation below, the term "bits" represents bits_per_raw_sample+1
   for JPEG2000-RCT or bits_per_raw_sample otherwise:

   coder_input =
       [(sample_difference + 2^(bits-1)) & (2^bits - 1)] - 2^(bits-1)

                                Figure 5

3.8.1.  Range Coding Mode

   Early experimental versions of FFV1 used the CABAC Arithmetic coder
   from H.264 as defined in [ISO.14496-10.2014], but due to the
   uncertain patent/royalty situation, as well as its slightly worse
   performance, CABAC was replaced by a Range coder based on an
   algorithm defined by G. Nigel N. Martin in 1979 [range-coding].

3.8.1.1.  Range Binary Values

   To encode binary digits efficiently, a Range coder is used.  "C_{i}"
   is the i-th Context.  "B_{i}" is the i-th byte of the bytestream.
   "b_{i}" is the i-th Range coded binary value, and "S_{0,i}" is the
   i-th initial state.  The length of the bytestream encoding n binary
   symbols is "j_{n}" bytes.

   r_{i} = floor( ( R_{i} * S_{i,C_{i}} ) / 2^8 )

                                Figure 6

   S_{i+1,C_{i}} = zero_state_{S_{i,C_{i}}} AND
             l_{i} = L_{i}                  AND
             t_{i} = R_{i} - r_{i}          <==
             b_{i} = 0                      <==>
             L_{i} < R_{i} - r_{i}

   S_{i+1,C_{i}} = one_state_{S_{i,C_{i}}}  AND
             l_{i} = L_{i} - R_{i} + r_{i}  AND
             t_{i} = r_{i}                  <==
             b_{i} = 1                      <==>
             L_{i} >= R_{i} - r_{i}

                                Figure 7

   S_{i+1,k} = S_{i,k} <== C_{i} != k

                                Figure 8

   R_{i+1} = 2^8 * t_{i}                    AND
   L_{i+1} = 2^8 * l_{i} + B_{j_{i}}        AND
   j_{i+1} = j_{i} + 1                      <==
   t_{i} < 2^8

   R_{i+1} = t_{i}                          AND
   L_{i+1} = l_{i}                          AND
   j_{i+1} = j_{i}                          <==
   t_{i} >= 2^8

                                Figure 9

   R_{0} = 65280

                                Figure 10

   L_{0} = 2^8 * B_{0} + B_{1}

                                Figure 11

   j_{0} = 2

                                Figure 12

3.8.1.1.1.  Termination

   The range coder can be used in 3 modes.

   *  In "Open mode" when decoding, every symbol the reader attempts to
      read is available.  In this mode, arbitrary data can have been
      appended without affecting the range coder output.  This mode is
      not used in FFV1.

   *  In "Closed mode", the length in bytes of the bytestream is
      provided to the range decoder.  Bytes beyond the length are read
      as 0 by the range decoder.  This is generally 1 byte shorter than
      the open mode.

   *  In "Sentinel mode", the exact length in bytes is not known and
      thus the range decoder MAY read into the data that follows the
      range coded bytestream by one byte.  In "Sentinel mode", the end
      of the range coded bytestream is a binary symbol with state 129,
      whose value SHALL be discarded.  After reading this symbol, the
      range decoder will have read one byte beyond the end of the range
      coded bytestream.  This way the byte position of the end can be
      determined.  Bytestreams written in "Sentinel mode" can be read in
      "Closed mode" if the length can be determined; in this case the
      last (sentinel) symbol will be read non-corrupted and be of value
      0.

   The above describes the range decoding.  Encoding is defined as any
   process which produces a decodable bytestream.

   There are 3 places where range coder termination is needed in FFV1.
   The first is in the "Configuration Record"; in this case the size of
   the range coded bytestream is known and handled as "Closed mode".
   The second is the switch from the "Slice Header", which is range
   coded, to Golomb coded slices, handled as "Sentinel mode".  The third
   is the end of range coded Slices, which need to terminate before the
   CRC at their end.  This can be handled as "Sentinel mode" or as
   "Closed mode" if the CRC position has been determined.

3.8.1.2.  Range Non Binary Values

   To encode scalar integers, it would be possible to encode each bit
   separately and use the past bits as context.  However, that would
   mean 255 contexts per 8-bit symbol, which is not only a waste of
   memory but also requires more past data to reach a reasonably good
   estimate of the probabilities.  Alternatively, assuming a Laplacian
   distribution and only dealing with its variance and mean (as in
   Huffman coding) would also be possible.  However, for maximum
   flexibility and simplicity, the chosen method uses a single symbol to
   encode if a number is 0, and if not, encodes the number using its
   exponent, mantissa, and sign.  The exact contexts used are best
   described by the following code, followed by some comments.
786 pseudo-code | type --------------------------------------------------------------|----- void put_symbol(RangeCoder *c, uint8_t *state, int v, int \ | is_signed) { | int i; | put_rac(c, state+0, !v); | if (v) { | int a= abs(v); | int e= log2(a); | | for (i = 0; i < e; i++) { | put_rac(c, state+1+min(i,9), 1); //1..10 | } | | put_rac(c, state+1+min(i,9), 0); | for (i = e-1; i >= 0; i--) { | put_rac(c, state+22+min(i,9), (a>>i)&1); //22..31 | } | | if (is_signed) { | put_rac(c, state+11 + min(e, 10), v < 0); //11..21| } | } | } | 788 3.8.1.3. Initial Values for the Context Model 790 At keyframes all Range coder state variables are set to their initial 791 state. 793 3.8.1.4. State Transition Table 795 one_state_{i} = 796 default_state_transition_{i} + state_transition_delta_{i} 798 Figure 13 800 zero_state_{i} = 256 - one_state_{256-i} 801 Figure 14 803 3.8.1.5. default_state_transition 805 0, 0, 0, 0, 0, 0, 0, 0, 20, 21, 22, 23, 24, 25, 26, 27, 807 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 37, 38, 39, 40, 41, 42, 809 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 56, 57, 811 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 813 74, 75, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 815 89, 90, 91, 92, 93, 94, 94, 95, 96, 97, 98, 99,100,101,102,103, 817 104,105,106,107,108,109,110,111,112,113,114,114,115,116,117,118, 819 119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,133, 821 134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149, 823 150,151,152,152,153,154,155,156,157,158,159,160,161,162,163,164, 825 165,166,167,168,169,170,171,171,172,173,174,175,176,177,178,179, 827 180,181,182,183,184,185,186,187,188,189,190,190,191,192,194,194, 829 195,196,197,198,199,200,201,202,202,204,205,206,207,208,209,209, 831 210,211,212,213,215,215,216,217,218,219,220,220,222,223,224,225, 833 226,227,227,229,229,230,231,232,234,234,235,236,237,238,239,240, 835 241,242,243,244,245,246,247,248,248, 0, 0, 0, 0, 0, 0, 0, 837 3.8.1.6. 
Alternative State Transition Table 839 The alternative state transition table has been built using iterative 840 minimization of frame sizes and generally performs better than the 841 default. To use it, the coder_type (see the section on coder_type 842 (#codertype)) MUST be set to 2 and the difference to the default MUST 843 be stored in the "Parameters", see the section on Parameters 844 (#parameters). The reference implementation of FFV1 in FFmpeg uses 845 this table by default at the time of this writing when Range coding 846 is used. 848 0, 10, 10, 10, 10, 16, 16, 16, 28, 16, 16, 29, 42, 49, 20, 49, 850 59, 25, 26, 26, 27, 31, 33, 33, 33, 34, 34, 37, 67, 38, 39, 39, 852 40, 40, 41, 79, 43, 44, 45, 45, 48, 48, 64, 50, 51, 52, 88, 52, 854 53, 74, 55, 57, 58, 58, 74, 60,101, 61, 62, 84, 66, 66, 68, 69, 856 87, 82, 71, 97, 73, 73, 82, 75,111, 77, 94, 78, 87, 81, 83, 97, 858 85, 83, 94, 86, 99, 89, 90, 99,111, 92, 93,134, 95, 98,105, 98, 860 105,110,102,108,102,118,103,106,106,113,109,112,114,112,116,125, 862 115,116,117,117,126,119,125,121,121,123,145,124,126,131,127,129, 864 165,130,132,138,133,135,145,136,137,139,146,141,143,142,144,148, 866 147,155,151,149,151,150,152,157,153,154,156,168,158,162,161,160, 868 172,163,169,164,166,184,167,170,177,174,171,173,182,176,180,178, 870 175,189,179,181,186,183,192,185,200,187,191,188,190,197,193,196, 872 197,194,195,196,198,202,199,201,210,203,207,204,205,206,208,214, 874 209,211,221,212,213,215,224,216,217,218,219,220,222,228,223,225, 876 226,224,227,229,240,230,231,232,233,234,235,236,238,239,237,242, 878 241,243,242,244,245,246,247,248,249,250,251,252,252,253,254,255, 880 3.8.2. Golomb Rice Mode 882 The end of the bitstream of the "Frame" is filled with 0-bits until 883 that the bitstream contains a multiple of 8 bits. 885 3.8.2.1. Signed Golomb Rice Codes 887 This coding mode uses Golomb Rice codes. 
   The VLC is split into 2 parts: the prefix stores the most significant
   bits, and the suffix stores the k least significant bits or stores
   the whole number in the ESC case.

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   int get_ur_golomb(k) {                                        |
       for (prefix = 0; prefix < 12; prefix++) {                 |
           if (get_bits(1)) {                                    |
               return get_bits(k) + (prefix << k);               |
           }                                                     |
       }                                                         |
       return get_bits(bits) + 11;                               |
   }                                                             |
                                                                 |
   int get_sr_golomb(k) {                                        |
       v = get_ur_golomb(k);                                     |
       if (v & 1) return - (v >> 1) - 1;                         |
       else return (v >> 1);                                     |
   }                                                             |

3.8.2.1.1.  Prefix

   +----------------+-------+
   | bits           | value |
   +================+=======+
   | 1              | 0     |
   +----------------+-------+
   | 01             | 1     |
   +----------------+-------+
   | ...            | ...   |
   +----------------+-------+
   | 0000 0000 0001 | 11    |
   +----------------+-------+
   | 0000 0000 0000 | ESC   |
   +----------------+-------+

                                  Table 1

3.8.2.1.2.  Suffix

   +---------+--------------------------------------------------+
   +=========+==================================================+
   | non ESC | the k least significant bits MSB first           |
   +---------+--------------------------------------------------+
   | ESC     | the value - 11, in MSB first order, ESC may only |
   |         | be used if the value cannot be coded as non ESC  |
   +---------+--------------------------------------------------+

                                  Table 2

3.8.2.1.3.
Examples 926 +-----+-------------------------+-------+ 927 | k | bits | value | 928 +=====+=========================+=======+ 929 | 0 | "1" | 0 | 930 +-----+-------------------------+-------+ 931 | 0 | "001" | 2 | 932 +-----+-------------------------+-------+ 933 | 2 | "1 00" | 0 | 934 +-----+-------------------------+-------+ 935 | 2 | "1 10" | 2 | 936 +-----+-------------------------+-------+ 937 | 2 | "01 01" | 5 | 938 +-----+-------------------------+-------+ 939 | any | "000000000000 10000000" | 139 | 940 +-----+-------------------------+-------+ 942 Table 3 944 3.8.2.2. Run Mode 946 Run mode is entered when the context is 0 and left as soon as a non-0 947 difference is found. The level is identical to the predicted one. 948 The run and the first different level are coded. 950 3.8.2.2.1. Run Length Coding 952 The run value is encoded in 2 parts, the prefix part stores the more 953 significant part of the run as well as adjusting the run_index that 954 determines the number of bits in the less significant part of the 955 run. The 2nd part of the value stores the less significant part of 956 the run as it is. The run_index is reset for each "Plane" and slice 957 to 0. 959 pseudo-code | type --------------------------------------------------------------|----- log2_run[41]={ | 0, 0, 0, 0, 1, 1, 1, 1, | 2, 2, 2, 2, 3, 3, 3, 3, | 4, 4, 5, 5, 6, 6, 7, 7, | 8, 9,10,11,12,13,14,15, | 16,17,18,19,20,21,22,23, | 24, | }; | | if (run_count == 0 && run_mode == 1) { | if (get_bits(1)) { | run_count = 1 << log2_run[run_index]; | if (x + run_count <= w) { | run_index++; | } | } else { | if (log2_run[run_index]) { | run_count = get_bits(log2_run[run_index]); | } else { | run_count = 0; | } | if (run_index) { | run_index--; | } | run_mode = 2; | } | } | 961 The log2_run function is also used within [ISO.14495-1.1999]. 963 3.8.2.2.2. 
Level Coding

   Level coding is identical to the normal difference coding with the
   exception that the 0 value is removed as it cannot occur:

       diff = get_vlc_symbol(context_state);
       if (diff >= 0) {
           diff++;
       }

   Note, this is different from JPEG-LS, which doesn't use prediction in
   run mode and uses a different encoding and context model for the last
   difference.  On a small set of test "Samples" the use of prediction
   slightly improved the compression rate.

3.8.2.3.  Scalar Mode

   Each difference is coded with the per context mean prediction removed
   and a per context value for k.

       get_vlc_symbol(state) {
           i = state->count;
           k = 0;
           while (i < state->error_sum) {
               k++;
               i += i;
           }

           v = get_sr_golomb(k);

           if (2 * state->drift < -state->count) {
               v = -1 - v;
           }

           ret = sign_extend(v + state->bias, bits);

           state->error_sum += abs(v);
           state->drift     += v;

           if (state->count == 128) {
               state->count     >>= 1;
               state->drift     >>= 1;
               state->error_sum >>= 1;
           }
           state->count++;
           if (state->drift <= -state->count) {
               state->bias = max(state->bias - 1, -128);
               state->drift = max(state->drift + state->count,
                                  -state->count + 1);
           } else if (state->drift > 0) {
               state->bias = min(state->bias + 1, 127);
               state->drift = min(state->drift - state->count, 0);
           }

           return ret;
       }

3.8.2.4.  Initial Values for the VLC context state

   At keyframes all coder state variables are set to their initial
   state.

       drift     = 0;
       error_sum = 4;
       bias      = 0;
       count     = 1;

4.  Bitstream

   An FFV1 bitstream is composed of a series of 1 or more "Frames" and
   (when required) a "Configuration Record".

   Within the following sub-sections, pseudo-code is used to explain the
   structure of each FFV1 bitstream component, as described in the
   section on Pseudo-Code (#pseudocode).  The following table lists
   symbols used to annotate that pseudo-code in order to define the
   storage of the data referenced in that line of pseudo-code.
1000 +--------+-------------------------------------------+ 1001 | Symbol | Definition | 1002 +========+===========================================+ 1003 | u(n) | unsigned big endian integer using n bits | 1004 +--------+-------------------------------------------+ 1005 | sg | Golomb Rice coded signed scalar symbol | 1006 | | coded with the method described in Signed | 1007 | | Golomb Rice Codes (#golomb-rice-mode) | 1008 +--------+-------------------------------------------+ 1009 | br | Range coded Boolean (1-bit) symbol with | 1010 | | the method described in Range binary | 1011 | | values (#range-binary-values) | 1012 +--------+-------------------------------------------+ 1013 | ur | Range coded unsigned scalar symbol coded | 1014 | | with the method described in Range non | 1015 | | binary values (#range-non-binary-values) | 1016 +--------+-------------------------------------------+ 1017 | sr | Range coded signed scalar symbol coded | 1018 | | with the method described in Range non | 1019 | | binary values (#range-non-binary-values) | 1020 +--------+-------------------------------------------+ 1022 Table 4 1024 The same context that is initialized to 128 is used for all fields in 1025 the header. 1027 The following MUST be provided by external means during 1028 initialization of the decoder: 1030 "frame_pixel_width" is defined as "Frame" width in "Pixels". 1032 "frame_pixel_height" is defined as "Frame" height in "Pixels". 1034 Default values at the decoder initialization phase: 1036 "ConfigurationRecordIsPresent" is set to 0. 1038 4.1. Parameters 1040 The "Parameters" section contains significant characteristics about 1041 the decoding configuration used for all instances of "Frame" (in FFV1 1042 version 0 and 1) or the whole FFV1 bitstream (other versions), 1043 including the stream version, color configuration, and quantization 1044 tables. The pseudo-code below describes the contents of the 1045 bitstream. 
1047 pseudo-code | type --------------------------------------------------------------|----- Parameters( ) { | version | ur if (version >= 3) { | micro_version | ur } | coder_type | ur if (coder_type > 1) { | for (i = 1; i < 256; i++) { | state_transition_delta[ i ] | sr } | } | colorspace_type | ur if (version >= 1) { | bits_per_raw_sample | ur } | chroma_planes | br log2_h_chroma_subsample | ur log2_v_chroma_subsample | ur extra_plane | br if (version >= 3) { | num_h_slices - 1 | ur num_v_slices - 1 | ur quant_table_set_count | ur } | for (i = 0; i < quant_table_set_count; i++) { | QuantizationTableSet( i ) | } | if (version >= 3) { | for (i = 0; i < quant_table_set_count; i++) { | states_coded | br if (states_coded) { | for (j = 0; j < context_count[ i ]; j++) { | for (k = 0; k < CONTEXT_SIZE; k++) { | initial_state_delta[ i ][ j ][ k ] | sr } | } | } | } | ec | ur intra | ur } | } | 1049 4.1.1. version 1051 "version" specifies the version of the FFV1 bitstream. 1053 Each version is incompatible with other versions: decoders SHOULD 1054 reject a file due to an unknown version. 1056 Decoders SHOULD reject a file with version <= 1 && 1057 ConfigurationRecordIsPresent == 1. 1059 Decoders SHOULD reject a file with version >= 3 && 1060 ConfigurationRecordIsPresent == 0. 1062 +-------+-------------------------+ 1063 | value | version | 1064 +=======+=========================+ 1065 | 0 | FFV1 version 0 | 1066 +-------+-------------------------+ 1067 | 1 | FFV1 version 1 | 1068 +-------+-------------------------+ 1069 | 2 | reserved* | 1070 +-------+-------------------------+ 1071 | 3 | FFV1 version 3 | 1072 +-------+-------------------------+ 1073 | Other | reserved for future use | 1074 +-------+-------------------------+ 1076 Table 5 1078 * Version 2 was never enabled in the encoder thus version 2 files 1079 SHOULD NOT exist, and this document does not describe them to keep 1080 the text simpler. 1082 4.1.2. 
micro_version 1084 "micro_version" specifies the micro-version of the FFV1 bitstream. 1086 After a version is considered stable (a micro-version value is 1087 assigned to be the first stable variant of a specific version), each 1088 new micro-version after this first stable variant is compatible with 1089 the previous micro-version: decoders SHOULD NOT reject a file due to 1090 an unknown micro-version equal or above the micro-version considered 1091 as stable. 1093 Meaning of micro_version for version 3: 1095 +-------+-------------------------+ 1096 | value | micro_version | 1097 +=======+=========================+ 1098 | 0...3 | reserved* | 1099 +-------+-------------------------+ 1100 | 4 | first stable variant | 1101 +-------+-------------------------+ 1102 | Other | reserved for future use | 1103 +-------+-------------------------+ 1105 Table 6 1107 * development versions may be incompatible with the stable variants. 1109 4.1.3. coder_type 1111 "coder_type" specifies the coder used. 1113 +-------+-------------------------------------------------+ 1114 | value | coder used | 1115 +=======+=================================================+ 1116 | 0 | Golomb Rice | 1117 +-------+-------------------------------------------------+ 1118 | 1 | Range Coder with default state transition table | 1119 +-------+-------------------------------------------------+ 1120 | 2 | Range Coder with custom state transition table | 1121 +-------+-------------------------------------------------+ 1122 | Other | reserved for future use | 1123 +-------+-------------------------------------------------+ 1125 Table 7 1127 4.1.4. state_transition_delta 1129 "state_transition_delta" specifies the Range coder custom state 1130 transition table. 1132 If state_transition_delta is not present in the FFV1 bitstream, all 1133 Range coder custom state transition table elements are assumed to be 1134 0. 1136 4.1.5. 
colorspace_type 1138 "colorspace_type" specifies the color space encoded, the pixel 1139 transformation used by the encoder, the extra plane content, as well 1140 as interleave method. 1142 +-------+-------------+----------------+--------------+-------------+ 1143 | value | color space | pixel | extra plane | interleave | 1144 | | encoded | transformation | content | method | 1145 +=======+=============+================+==============+=============+ 1146 | 0 | YCbCr | None | Transparency | "Plane" | 1147 | | | | | then | 1148 | | | | | "Line" | 1149 +-------+-------------+----------------+--------------+-------------+ 1150 | 1 | RGB | JPEG2000-RCT | Transparency | "Line" | 1151 | | | | | then | 1152 | | | | | "Plane" | 1153 +-------+-------------+----------------+--------------+-------------+ 1154 | Other | reserved | reserved for | reserved for | reserved | 1155 | | for future | future use | future use | for future | 1156 | | use | | | use | 1157 +-------+-------------+----------------+--------------+-------------+ 1159 Table 8 1161 Restrictions: 1163 If "colorspace_type" is 1, then "chroma_planes" MUST be 1, 1164 "log2_h_chroma_subsample" MUST be 0, and "log2_v_chroma_subsample" 1165 MUST be 0. 1167 4.1.6. chroma_planes 1169 "chroma_planes" indicates if chroma (color) "Planes" are present. 1171 +-------+---------------------------------+ 1172 | value | presence | 1173 +=======+=================================+ 1174 | 0 | chroma "Planes" are not present | 1175 +-------+---------------------------------+ 1176 | 1 | chroma "Planes" are present | 1177 +-------+---------------------------------+ 1179 Table 9 1181 4.1.7. bits_per_raw_sample 1183 "bits_per_raw_sample" indicates the number of bits for each "Sample". 1184 Inferred to be 8 if not present. 
1186 +-------+-----------------------------------+ 1187 | value | bits for each sample | 1188 +=======+===================================+ 1189 | 0 | reserved* | 1190 +-------+-----------------------------------+ 1191 | Other | the actual bits for each "Sample" | 1192 +-------+-----------------------------------+ 1194 Table 10 1196 * Encoders MUST NOT store bits_per_raw_sample = 0 Decoders SHOULD 1197 accept and interpret bits_per_raw_sample = 0 as 8. 1199 4.1.8. log2_h_chroma_subsample 1201 "log2_h_chroma_subsample" indicates the subsample factor, stored in 1202 powers to which the number 2 must be raised, between luma and chroma 1203 width ("chroma_width = 2^-log2_h_chroma_subsample^ * luma_width"). 1205 4.1.9. log2_v_chroma_subsample 1207 "log2_v_chroma_subsample" indicates the subsample factor, stored in 1208 powers to which the number 2 must be raised, between luma and chroma 1209 height ("chroma_height=2^-log2_v_chroma_subsample^ * luma_height"). 1211 4.1.10. extra_plane 1213 "extra_plane" indicates if an extra "Plane" is present. 1215 +-------+------------------------------+ 1216 | value | presence | 1217 +=======+==============================+ 1218 | 0 | extra "Plane" is not present | 1219 +-------+------------------------------+ 1220 | 1 | extra "Plane" is present | 1221 +-------+------------------------------+ 1223 Table 11 1225 4.1.11. num_h_slices 1227 "num_h_slices" indicates the number of horizontal elements of the 1228 slice raster. 1230 Inferred to be 1 if not present. 1232 4.1.12. num_v_slices 1234 "num_v_slices" indicates the number of vertical elements of the slice 1235 raster. 1237 Inferred to be 1 if not present. 1239 4.1.13. quant_table_set_count 1241 "quant_table_set_count" indicates the number of Quantization 1242 Table Sets. 1244 Inferred to be 1 if not present. 1246 MUST NOT be 0. 1248 4.1.14. states_coded 1250 "states_coded" indicates if the respective Quantization Table Set has 1251 the initial states coded. 
   Inferred to be 0 if not present.

   +-------+--------------------------------+
   | value | initial states                 |
   +=======+================================+
   | 0     | initial states are not present |
   |       | and are assumed to be all 128  |
   +-------+--------------------------------+
   | 1     | initial states are present     |
   +-------+--------------------------------+

                                  Table 12

4.1.15.  initial_state_delta

   "initial_state_delta[ i ][ j ][ k ]" indicates the initial Range
   coder state.  It is encoded using "k" as context index and

   pred = j ? initial_states[ i ][j - 1][ k ] : 128

                                  Figure 15

   initial_state[ i ][ j ][ k ] =
          ( pred + initial_state_delta[ i ][ j ][ k ] ) & 255

                                  Figure 16

4.1.16.  ec

   "ec" indicates the error detection/correction type.

   +-------+--------------------------------------------+
   | value | error detection/correction type            |
   +=======+============================================+
   | 0     | 32-bit CRC on the global header            |
   +-------+--------------------------------------------+
   | 1     | 32-bit CRC per slice and the global header |
   +-------+--------------------------------------------+
   | Other | reserved for future use                    |
   +-------+--------------------------------------------+

                                  Table 13

4.1.17.  intra

   "intra" indicates the relationship between the instances of "Frame".

   Inferred to be 0 if not present.

   +-------+-------------------------------------+
   | value | relationship                        |
   +=======+=====================================+
   | 0     | Frames are independent or dependent |
   |       | (keyframes and non keyframes)       |
   +-------+-------------------------------------+
   | 1     | Frames are independent (keyframes   |
   |       | only)                               |
   +-------+-------------------------------------+
   | Other | reserved for future use             |
   +-------+-------------------------------------+

                                  Table 14

4.2.
Configuration Record

   In the case of an FFV1 bitstream with "version >= 3", a
   "Configuration Record" is stored in the underlying "Container", at
   the track header level.  It contains the "Parameters" used for all
   instances of "Frame".  The size of the "Configuration Record",
   "NumBytes", is supplied by the underlying "Container".

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   ConfigurationRecord( NumBytes ) {                             |
       ConfigurationRecordIsPresent = 1                          |
       Parameters( )                                             |
       while (remaining_symbols_in_syntax(NumBytes - 4)) {       |
           reserved_for_future_use                               | br/ur/sr
       }                                                         |
       configuration_record_crc_parity                           | u(32)
   }                                                             |

4.2.1.  reserved_for_future_use

   "reserved_for_future_use" has semantics that are reserved for future
   use.

   Encoders conforming to this version of this specification SHALL NOT
   write this value.

   Decoders conforming to this version of this specification SHALL
   ignore its value.

4.2.2.  configuration_record_crc_parity

   "configuration_record_crc_parity" is 32 bits that are chosen so that
   the "Configuration Record" as a whole has a CRC remainder of 0.

   This is equivalent to storing the CRC remainder in the 32-bit parity.

   The CRC generator polynomial used is the standard IEEE CRC polynomial
   (0x104C11DB7) with initial value 0.

4.2.3.  Mapping FFV1 into Containers

   This "Configuration Record" can be placed in any file format
   supporting "Configuration Records", fitting as closely as possible
   with how the file format stores "Configuration Records".  The
   "Configuration Record" storage place and "NumBytes" are currently
   defined and supported by this version of this specification for the
   following formats:

4.2.3.1.  AVI File Format

   The "Configuration Record" extends the stream format chunk ("AVI ",
   "hdlr", "strl", "strf") with the ConfigurationRecord bitstream.

   See [AVI] for more information about chunks.
   "NumBytes" is defined as the size, in bytes, of the strf chunk
   indicated in the chunk header minus the size of the stream format
   structure.

4.2.3.2.  ISO Base Media File Format

   The "Configuration Record" extends the sample description box
   ("moov", "trak", "mdia", "minf", "stbl", "stsd") with a "glbl" box
   that contains the ConfigurationRecord bitstream.  See
   [ISO.14496-12.2015] for more information about boxes.

   "NumBytes" is defined as the size, in bytes, of the "glbl" box
   indicated in the box header minus the size of the box header.

4.2.3.3.  NUT File Format

   The codec_specific_data element (in "stream_header" packet) contains
   the ConfigurationRecord bitstream.  See [NUT] for more information
   about elements.

   "NumBytes" is defined as the size, in bytes, of the
   codec_specific_data element as indicated in the "length" field of
   codec_specific_data.

4.2.3.4.  Matroska File Format

   FFV1 SHOULD use "V_FFV1" as the Matroska "Codec ID".  For FFV1
   versions 2 or less, the Matroska "CodecPrivate" Element SHOULD NOT be
   used.  For FFV1 versions 3 or greater, the Matroska "CodecPrivate"
   Element MUST contain the FFV1 "Configuration Record" structure and no
   other data.  See [Matroska] for more information about elements.

   "NumBytes" is defined as the "Element Data Size" of the
   "CodecPrivate" Element.

4.3.  Frame

   A "Frame" is an encoded representation of a complete static image.
   The whole "Frame" is provided by the underlying container.

   A "Frame" consists of the keyframe field, "Parameters" (if version
   <= 1), and a sequence of independent slices.  The pseudo-code below
   describes the contents of a "Frame".
   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   Frame( NumBytes ) {                                           |
       keyframe                                                  | br
       if (keyframe && !ConfigurationRecordIsPresent) {          |
           Parameters( )                                         |
       }                                                         |
       while (remaining_bits_in_bitstream( NumBytes )) {         |
           Slice( )                                              |
       }                                                         |
   }                                                             |

   Architecture overview of slices in a "Frame":

   +-----------------------------------------------------------------+
   +=================================================================+
   | first slice header                                              |
   +-----------------------------------------------------------------+
   | first slice content                                             |
   +-----------------------------------------------------------------+
   | first slice footer                                              |
   +-----------------------------------------------------------------+
   | --------------------------------------------------------------- |
   +-----------------------------------------------------------------+
   | second slice header                                             |
   +-----------------------------------------------------------------+
   | second slice content                                            |
   +-----------------------------------------------------------------+
   | second slice footer                                             |
   +-----------------------------------------------------------------+
   | --------------------------------------------------------------- |
   +-----------------------------------------------------------------+
   | ...                                                             |
   +-----------------------------------------------------------------+
   | --------------------------------------------------------------- |
   +-----------------------------------------------------------------+
   | last slice header                                               |
   +-----------------------------------------------------------------+
   | last slice content                                              |
   +-----------------------------------------------------------------+
   | last slice footer                                               |
   +-----------------------------------------------------------------+

                                  Table 15

4.4.
Slice

   A "Slice" is an independent spatial sub-section of a "Frame" that is
   encoded separately from any other region of the same "Frame".  The
   use of more than one "Slice" per "Frame" can be useful for taking
   advantage of the opportunities of multithreaded encoding and
   decoding.

   A "Slice" consists of a "Slice Header" (when relevant), a "Slice
   Content", and a "Slice Footer" (when relevant).  The pseudo-code
   below describes the contents of a "Slice".

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   Slice( ) {                                                    |
       if (version >= 3) {                                       |
           SliceHeader( )                                        |
       }                                                         |
       SliceContent( )                                           |
       if (coder_type == 0) {                                    |
           while (!byte_aligned()) {                             |
               padding                                           | u(1)
           }                                                     |
       }                                                         |
       if (version <= 1) {                                       |
           while (remaining_bits_in_bitstream( NumBytes ) != 0) {|
               reserved                                          | u(1)
           }                                                     |
       }                                                         |
       if (version >= 3) {                                       |
           SliceFooter( )                                        |
       }                                                         |
   }                                                             |

   "padding" specifies a bit without any significance and used only for
   byte alignment.  MUST be 0.

   "reserved" specifies a bit without any significance in this revision
   of the specification and may have a significance in a later revision
   of this specification.

   Encoders SHOULD NOT fill these bits.

   Decoders SHOULD ignore these bits.

   Note in case these bits are used in a later revision of this
   specification: any revision of this specification SHOULD avoid adding
   40 bits of content after "SliceContent" for version 0 and 1 of the
   bitstream.  Background: due to some non-conforming encoders, some
   bitstreams were found with 40 extra bits corresponding to
   "error_status" and "slice_crc_parity"; a decoder conforming to the
   revised specification could not tell the difference between a revised
   bitstream and a buggy bitstream.

4.5.  Slice Header

   A "Slice Header" provides information about the decoding
   configuration of the "Slice", such as its spatial position, size, and
   aspect ratio.
The pseudo-code below describes the contents of the 1481 "Slice Header". 1483 pseudo-code | type --------------------------------------------------------------|----- SliceHeader( ) { | slice_x | ur slice_y | ur slice_width - 1 | ur slice_height - 1 | ur for (i = 0; i < quant_table_set_index_count; i++) { | quant_table_set_index[ i ] | ur } | picture_structure | ur sar_num | ur sar_den | ur } | 1485 4.5.1. slice_x 1487 "slice_x" indicates the x position on the slice raster formed by 1488 num_h_slices. 1490 Inferred to be 0 if not present. 1492 4.5.2. slice_y 1494 "slice_y" indicates the y position on the slice raster formed by 1495 num_v_slices. 1497 Inferred to be 0 if not present. 1499 4.5.3. slice_width 1501 "slice_width" indicates the width on the slice raster formed by 1502 num_h_slices. 1504 Inferred to be 1 if not present. 1506 4.5.4. slice_height 1508 "slice_height" indicates the height on the slice raster formed by 1509 num_v_slices. 1511 Inferred to be 1 if not present. 1513 4.5.5. quant_table_set_index_count 1515 "quant_table_set_index_count" is defined as "1 + ( ( chroma_planes || 1516 version <= 3 ) ? 1 : 0 ) + ( extra_plane ? 1 : 0 )". 1518 4.5.6. quant_table_set_index 1520 "quant_table_set_index" indicates the Quantization Table Set index to 1521 select the Quantization Table Set and the initial states for the 1522 slice. 1524 Inferred to be 0 if not present. 1526 4.5.7. picture_structure 1528 "picture_structure" specifies the temporal and spatial relationship 1529 of each "Line" of the "Frame". 1531 Inferred to be 0 if not present. 
1533 +-------+-------------------------+ 1534 | value | picture structure used | 1535 +=======+=========================+ 1536 | 0 | unknown | 1537 +-------+-------------------------+ 1538 | 1 | top field first | 1539 +-------+-------------------------+ 1540 | 2 | bottom field first | 1541 +-------+-------------------------+ 1542 | 3 | progressive | 1543 +-------+-------------------------+ 1544 | Other | reserved for future use | 1545 +-------+-------------------------+ 1547 Table 16 1549 4.5.8. sar_num 1551 "sar_num" specifies the "Sample" aspect ratio numerator. 1553 Inferred to be 0 if not present. 1555 A value of 0 means that aspect ratio is unknown. 1557 Encoders MUST write 0 if "Sample" aspect ratio is unknown. 1559 If "sar_den" is 0, decoders SHOULD ignore the encoded value and 1560 consider that "sar_num" is 0. 1562 4.5.9. sar_den 1564 "sar_den" specifies the "Sample" aspect ratio denominator. 1566 Inferred to be 0 if not present. 1568 A value of 0 means that aspect ratio is unknown. 1570 Encoders MUST write 0 if "Sample" aspect ratio is unknown. 1572 If "sar_num" is 0, decoders SHOULD ignore the encoded value and 1573 consider that "sar_den" is 0. 1575 4.6. Slice Content 1577 A "Slice Content" contains all "Line" elements part of the "Slice". 1579 Depending on the configuration, "Line" elements are ordered by 1580 "Plane" then by row (YCbCr) or by row then by "Plane" (RGB). 1582 pseudo-code | type --------------------------------------------------------------|----- SliceContent( ) { | if (colorspace_type == 0) { | for (p = 0; p < primary_color_count; p++) { | for (y = 0; y < plane_pixel_height[ p ]; y++) { | Line( p, y ) | } | } | } else if (colorspace_type == 1) { | for (y = 0; y < slice_pixel_height; y++) { | for (p = 0; p < primary_color_count; p++) { | Line( p, y ) | } | } | } | } | 1584 4.6.1. primary_color_count 1586 "primary_color_count" is defined as "1 + ( chroma_planes ? 2 : 0 ) + 1587 ( extra_plane ? 1 : 0 )". 1589 4.6.2. 
plane_pixel_height

   "plane_pixel_height[ p ]" is the height in "Pixels" of "Plane" p of
   the slice.

   "plane_pixel_height[ 0 ]" and "plane_pixel_height[ 1 + (
   chroma_planes ? 2 : 0 ) ]" value is "slice_pixel_height".

   If "chroma_planes" is set to 1, "plane_pixel_height[ 1 ]" and
   "plane_pixel_height[ 2 ]" value is "ceil( slice_pixel_height / (1 <<
   log2_v_chroma_subsample) )".

4.6.3.  slice_pixel_height

   "slice_pixel_height" is the height in "Pixels" of the slice.

   Its value is "floor( ( slice_y + slice_height ) * frame_pixel_height
   / num_v_slices ) - slice_pixel_y".

4.6.4.  slice_pixel_y

   "slice_pixel_y" is the slice vertical position in "Pixels".

   Its value is "floor( slice_y * frame_pixel_height / num_v_slices )".

4.7.  Line

   A "Line" is a list of the sample differences (relative to the
   predictor) of primary color components.  The pseudo-code below
   describes the contents of the "Line".

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   Line( p, y ) {                                                |
       if (colorspace_type == 0) {                               |
           for (x = 0; x < plane_pixel_width[ p ]; x++) {        |
               sample_difference[ p ][ y ][ x ]                  |
           }                                                     |
       } else if (colorspace_type == 1) {                        |
           for (x = 0; x < slice_pixel_width; x++) {             |
               sample_difference[ p ][ y ][ x ]                  |
           }                                                     |
       }                                                         |
   }                                                             |

4.7.1.  plane_pixel_width

   "plane_pixel_width[ p ]" is the width in "Pixels" of "Plane" p of the
   slice.

   "plane_pixel_width[ 0 ]" and "plane_pixel_width[ 1 + ( chroma_planes
   ? 2 : 0 ) ]" value is "slice_pixel_width".

   If "chroma_planes" is set to 1, "plane_pixel_width[ 1 ]" and
   "plane_pixel_width[ 2 ]" value is "ceil( slice_pixel_width / (1 <<
   log2_h_chroma_subsample) )".

4.7.2.  slice_pixel_width

   "slice_pixel_width" is the width in "Pixels" of the slice.

   Its value is "floor( ( slice_x + slice_width ) * frame_pixel_width /
   num_h_slices ) - slice_pixel_x".

4.7.3.
slice_pixel_x 1643 "slice_pixel_x" is the slice horizontal position in "Pixels". 1645 Its value is "floor( slice_x * frame_pixel_width / num_h_slices )". 1647 4.7.4. sample_difference 1649 "sample_difference[ p ][ y ][ x ]" is the sample difference for 1650 "Sample" at "Plane" "p", y position "y", and x position "x". The 1651 "Sample" value is computed based on median predictor and context 1652 described in the section on Samples (#samples). 1654 4.8. Slice Footer 1656 A "Slice Footer" provides information about slice size and 1657 (optionally) parity. The pseudo-code below describes the contents of 1658 the "Slice Footer". 1660 Note: "Slice Footer" is always byte aligned. 1662 pseudo-code | type --------------------------------------------------------------|----- SliceFooter( ) { | slice_size | u(24) if (ec) { | error_status | u(8) slice_crc_parity | u(32) } | } | 1664 4.8.1. slice_size 1666 "slice_size" indicates the size of the slice in bytes. 1668 Note: this allows finding the start of slices before previous slices 1669 have been fully decoded, and allows parallel decoding as well as 1670 error resilience. 1672 4.8.2. error_status 1674 "error_status" specifies the error status. 1676 +-------+--------------------------------------+ 1677 | value | error status | 1678 +=======+======================================+ 1679 | 0 | no error | 1680 +-------+--------------------------------------+ 1681 | 1 | slice contains a correctable error | 1682 +-------+--------------------------------------+ 1683 | 2 | slice contains a uncorrectable error | 1684 +-------+--------------------------------------+ 1685 | Other | reserved for future use | 1686 +-------+--------------------------------------+ 1688 Table 17 1690 4.8.3. slice_crc_parity 1692 "slice_crc_parity" 32 bits that are chosen so that the slice as a 1693 whole has a crc remainder of 0. 1695 This is equivalent to storing the crc remainder in the 32-bit parity. 
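
   The parity property of "slice_crc_parity" can be illustrated with a
   minimal sketch.  It uses the IEEE polynomial 0x04C11DB7 (0x104C11DB7
   without its implicit top bit) and an initial value of 0, as this
   section specifies.  The MSB-first bit order, the absence of a final
   XOR, the big-endian storage of the parity, and the function names
   are assumptions made for illustration only; they are not taken from
   this document.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 over buf, polynomial 0x04C11DB7, initial value 0.
 * Assumed here: MSB-first processing, no reflection, no final XOR. */
static uint32_t sketch_crc32(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint32_t)buf[i] << 24;
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x80000000u)
                crc = (crc << 1) ^ 0x04C11DB7u;
            else
                crc = crc << 1;
        }
    }
    return crc;
}

/* Fill the last 4 bytes of the slice with the CRC remainder of the
 * preceding bytes (assumed big-endian), so that the CRC of the whole
 * slice becomes 0. len must be at least 4. */
static void sketch_write_parity(uint8_t *slice, size_t len)
{
    uint32_t crc = sketch_crc32(slice, len - 4);
    slice[len - 4] = (uint8_t)(crc >> 24);
    slice[len - 3] = (uint8_t)(crc >> 16);
    slice[len - 2] = (uint8_t)(crc >> 8);
    slice[len - 1] = (uint8_t)crc;
}

/* A slice passes the check when the CRC over all its bytes,
 * parity included, is 0. */
static int sketch_slice_crc_ok(const uint8_t *slice, size_t len)
{
    return sketch_crc32(slice, len) == 0;
}
```

   Under these assumptions, a decoder can validate a slice with a
   single pass over its bytes, and any single-bit corruption yields a
   nonzero remainder.
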
   The CRC generator polynomial used is the standard IEEE CRC polynomial
   (0x104C11DB7) with initial value 0.

4.9.  Quantization Table Set

   Each Quantization Table Set is stored as run lengths: for each run
   of equal entries in the first half of the table, the run length
   minus 1 (represented as "len - 1" in the pseudo-code below) is
   stored using the method described in Range Non Binary Values
   (#range-non-binary-values).  The second half does not need to be
   stored, as it is identical to the first with flipped sign.  "scale"
   and "len_count[ i ][ j ]" are temporary values used for computing
   "context_count[ i ]" and are not used outside the Quantization Table
   Set pseudo-code.

   Example:

   Table: 0 0 1 1 1 1 2 2 -2 -2 -2 -1 -1 -1 -1 0

   Stored values: 1, 3, 1

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   QuantizationTableSet( i ) {                                   |
       scale = 1                                                 |
       for (j = 0; j < MAX_CONTEXT_INPUTS; j++) {                |
           QuantizationTable( i, j, scale )                      |
           scale *= 2 * len_count[ i ][ j ] - 1                  |
       }                                                         |
       context_count[ i ] = ceil( scale / 2 )                    |
   }                                                             |

   MAX_CONTEXT_INPUTS is 5.

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   QuantizationTable(i, j, scale) {                              |
       v = 0                                                     |
       for (k = 0; k < 128;) {                                   |
           len - 1                                               | ur
           for (a = 0; a < len; a++) {                           |
               quant_tables[ i ][ j ][ k ] = scale * v           |
               k++                                               |
           }                                                     |
           v++                                                   |
       }                                                         |
       for (k = 1; k < 128; k++) {                               |
           quant_tables[ i ][ j ][ 256 - k ] = \                 |
           -quant_tables[ i ][ j ][ k ]                          |
       }                                                         |
       quant_tables[ i ][ j ][ 128 ] = \                         |
       -quant_tables[ i ][ j ][ 127 ]                            |
       len_count[ i ][ j ] = v                                   |
   }                                                             |

4.9.1.  quant_tables

   "quant_tables[ i ][ j ][ k ]" indicates the quantization table value
   of the Quantized Sample Difference "k" of the Quantization Table "j"
   of the Quantization Table Set "i".

4.9.2.  context_count

   "context_count[ i ]" indicates the count of contexts for Quantization
   Table Set "i".

5.  Restrictions

   To ensure that fast multithreaded decoding is possible, starting
   with version 3 and if "frame_pixel_width * frame_pixel_height" is
   more than 101376, "slice_width * slice_height" MUST be less than or
   equal to "num_h_slices * num_v_slices / 4".  Note: 101376 is the
   frame size in "Pixels" of a 352x288 frame, also known as the CIF
   ("Common Intermediate Format") frame size.

   For each "Frame", each position in the slice raster MUST be filled
   by one and only one slice of the "Frame" (no missing slice position,
   no slice overlapping).

   For each "Frame" with a keyframe value of 0, each slice MUST have
   the same value of "slice_x, slice_y, slice_width, slice_height" as a
   slice in the previous "Frame".

6.  Security Considerations

   Like any other codec (such as [RFC6716]), FFV1 should not be used
   with insecure ciphers or cipher modes that are vulnerable to known
   plaintext attacks.  Some of the header bits as well as the padding
   are easily predictable.

   Implementations of the FFV1 codec need to take appropriate security
   considerations into account, as outlined in [RFC4732].  It is
   extremely important for the decoder to be robust against malicious
   payloads.  Malicious payloads must not cause the decoder to overrun
   its allocated memory or to take an excessive amount of resources to
   decode.  The same applies to the encoder, even though problems in
   encoders are typically rarer.  Malicious video streams must not
   cause the encoder to misbehave because this would allow an attacker
   to attack transcoding gateways.  A frequent security problem in
   image and video codecs is the failure to check for integer overflows
   in "Pixel" count computations, that is, allocating width * height
   without considering that the multiplication result may have
   overflowed the arithmetic type's range.
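
   The integer overflow concern above can be sketched as a checked size
   computation performed before any allocation.  This is a minimal
   illustration, not part of the reference implementation; the function
   name "sketch_plane_size" and its parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Computes width * height * bytes_per_sample without wrapping.
 * Returns 1 and stores the byte count in *out on success; returns 0
 * if any dimension is zero or the result does not fit in size_t.
 * Hypothetical helper for illustration only. */
static int sketch_plane_size(uint32_t width, uint32_t height,
                             size_t bytes_per_sample, size_t *out)
{
    if (width == 0 || height == 0 || bytes_per_sample == 0)
        return 0;

    /* The product of two 32-bit values always fits in 64 bits. */
    uint64_t pixels = (uint64_t)width * height;

    /* Reject sizes whose byte count would exceed SIZE_MAX. */
    if (pixels > SIZE_MAX / bytes_per_sample)
        return 0;

    *out = (size_t)pixels * bytes_per_sample;
    return 1;
}
```

   A decoder following this pattern would refuse the frame when the
   function returns 0 instead of passing a wrapped size to its
   allocator.
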
   The range coder could, if implemented naively, read one byte over
   the end.  The implementation must ensure that no read outside
   allocated and initialized memory occurs.

   The reference implementation [REFIMPL] contains no known buffer
   overflows or cases where a specially crafted packet or video segment
   could cause a significant increase in CPU load.

   The reference implementation [REFIMPL] was validated in the
   following conditions:

   *  Sending the decoder valid packets generated by the reference
      encoder and verifying that the decoder's output matches the
      encoder's input.

   *  Sending the decoder packets generated by the reference encoder
      and then subjected to random corruption.

   *  Sending the decoder random packets that are not FFV1.

   In all of the conditions above, the decoder and encoder were run
   inside the [VALGRIND] memory debugger as well as Clang's
   AddressSanitizer [Address-Sanitizer], which track reads and writes
   to invalid memory regions as well as the use of uninitialized
   memory.  No errors were reported in any of the tested conditions.

7.  Media Type Definition

   This registration is done using the template defined in [RFC6838]
   and following [RFC4855].

   Type name: video

   Subtype name: FFV1

   Required parameters: None.

   Optional parameters:

   These parameters are used to signal the capabilities of a receiver
   implementation.  These parameters MUST NOT be used for any other
   purpose.

   version: The version of the FFV1 encoding as defined by the section
   on version (#version).

   micro_version: The micro_version of the FFV1 encoding as defined by
   the section on micro_version (#micro-version).

   coder_type: The coder_type of the FFV1 encoding as defined by the
   section on coder_type (#coder-type).

   colorspace_type: The colorspace_type of the FFV1 encoding as defined
   by the section on colorspace_type (#colorspace-type).

   bits_per_raw_sample: The bits_per_raw_sample of the FFV1 encoding as
   defined by the section on bits_per_raw_sample
   (#bits-per-raw-sample).

   max-slices: The value of max-slices is an integer indicating the
   maximum count of slices within a frame of the FFV1 encoding.

   Encoding considerations:

   This media type is defined for encapsulation in several audiovisual
   container formats and contains binary data; see the section on
   "Mapping FFV1 into Containers" (#mapping-ffv1-into-containers).
   This media type is framed binary data; see Section 4.8 of [RFC6838].

   Security considerations:

   See the "Security Considerations" section (#security-considerations)
   of this document.

   Interoperability considerations: None.

   Published specification:

   [I-D.ietf-cellar-ffv1] and RFC XXXX.

   [RFC Editor: Upon publication as an RFC, please replace "XXXX" with
   the number assigned to this document and remove this note.]

   Applications which use this media type:

   Any application that requires the transport of lossless video can
   use this media type.  Examples include, but are not limited to,
   screen recording, scientific imaging, and digital video
   preservation.

   Fragment identifier considerations: N/A.

   Additional information: None.

   Person & email address to contact for further information: Michael
   Niedermayer, michael@niedermayer.cc

   Intended usage: COMMON

   Restrictions on usage: None.

   Author: Dave Rice, dave@dericed.com

   Change controller: IETF cellar working group delegated from the
   IESG.

8.  IANA Considerations

   The IANA is requested to register the following values:

   *  Media type registration as described in Media Type Definition
      (#media-type-definition).

9.  Appendixes

9.1.  Decoder implementation suggestions

9.1.1.  Multi-threading Support and Independence of Slices

   The FFV1 bitstream is parsable in two ways: in sequential order as
   described in this document or with the pre-analysis of the footer of
   each slice.  Each slice footer contains a slice_size field so the
   boundary of each slice is computable without having to parse the
   slice content.  That allows multi-threading as well as independence
   of slice content (a bitstream error in a slice header or slice
   content has no impact on the decoding of the other slices).

   After having checked the keyframe field, a decoder SHOULD parse the
   slice_size fields, from the slice_size of the last slice at the end
   of the "Frame" up to the slice_size of the first slice at the
   beginning of the "Frame", before parsing slices, in order to obtain
   the slice boundaries.  A decoder MAY fall back to sequential order,
   e.g., in case of a corrupted "Frame" (frame size unknown, slice_size
   of slices not coherent...) or if there is no possibility of seeking
   into the stream.

10.  Changelog

   See https://github.com/FFmpeg/FFV1/commits/master

11.  Normative References

   [I-D.ietf-cellar-ffv1]
              Niedermayer, M., Rice, D., and J. Martinez, "FFV1 Video
              Coding Format Version 0, 1, and 3",
              draft-ietf-cellar-ffv1-08 (work in progress),
              August 13, 2019.

   [ISO.15444-1.2016]
              International Organization for Standardization,
              "Information technology -- JPEG 2000 image coding
              system: Core coding system", October 2016.

   [ISO.9899.1990]
              International Organization for Standardization,
              "Programming languages - C", 1990.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4732]  Handley, M., Ed., Rescorla, E., Ed., and IAB, "Internet
              Denial-of-Service Considerations", RFC 4732,
              DOI 10.17487/RFC4732, December 2006.

   [RFC4855]  Casner, S., "Media Type Registration of RTP Payload
              Formats", RFC 4855, DOI 10.17487/RFC4855, February 2007.

   [RFC6716]  Valin, JM., Vos, K., and T. Terriberry, "Definition of
              the Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716,
              September 2012.

   [RFC6838]  Freed, N., Klensin, J., and T. Hansen, "Media Type
              Specifications and Registration Procedures", BCP 13,
              RFC 6838, DOI 10.17487/RFC6838, January 2013.

12.  Informative References

   [Address-Sanitizer]
              The Clang Team, "ASAN AddressSanitizer website",
              September 2019.

   [AVI]      Microsoft, "AVI RIFF File Reference", September 2019.

   [FFV1_V0]  Niedermayer, M., "Commit to mark FFV1 version 0 as non-
              experimental", April 2006.

   [FFV1_V1]  Niedermayer, M., "Commit to release FFV1 version 1",
              April 2009.

   [FFV1_V3]  Niedermayer, M., "Commit to mark FFV1 version 3 as non-
              experimental", August 2013.

   [HuffYUV]  Rudiak-Gould, B., "HuffYUV", December 2003.

   [ISO.14495-1.1999]
              International Organization for Standardization,
              "Information technology -- Lossless and near-lossless
              compression of continuous-tone still images: Baseline",
              December 1999.

   [ISO.14496-10.2014]
              International Organization for Standardization,
              "Information technology -- Coding of audio-visual
              objects -- Part 10: Advanced Video Coding",
              September 2014.

   [ISO.14496-12.2015]
              International Organization for Standardization,
              "Information technology -- Coding of audio-visual
              objects -- Part 12: ISO base media file format",
              December 2015.

   [Matroska] IETF, "Matroska", 2016.

   [NUT]      Niedermayer, M., "NUT Open Container Format",
              December 2013.

   [range-coding]
              Martin, G. N. N., "Range encoding: an algorithm for
              removing redundancy from a digitised message", July 1979.

   [REFIMPL]  Niedermayer, M., "The reference FFV1 implementation / the
              FFV1 codec in FFmpeg", September 2019.

   [VALGRIND] Valgrind Developers, "Valgrind website", September 2019.

   [YCbCr]    Wikipedia, "YCbCr", September 2019.

Authors' Addresses

   Michael Niedermayer

   Email: michael@niedermayer.cc

   Dave Rice

   Email: dave@dericed.com

   Jerome Martinez

   Email: jerome@mediaarea.net