cellar                                                    M. Niedermayer
Internet-Draft                                                   D. Rice
Intended status: Standards Track                             J. Martinez
Expires: 25 April 2020                                   23 October 2019


                   FFV1 Video Coding Format Version 4
                      draft-ietf-cellar-ffv1-v4-08

Abstract

   This document defines FFV1, a lossless intra-frame video encoding
   format.
   FFV1 is designed to efficiently compress video data in a
   variety of pixel formats.  Compared to uncompressed video, FFV1
   offers storage compression, frame fixity, and self-description, which
   makes FFV1 useful as a preservation or intermediate video format.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 25 April 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Notation and Conventions
     2.1.  Definitions
     2.2.  Conventions
       2.2.1.  Pseudo-code
       2.2.2.  Arithmetic Operators
       2.2.3.  Assignment Operators
       2.2.4.  Comparison Operators
       2.2.5.  Mathematical Functions
       2.2.6.  Order of Operation Precedence
       2.2.7.  Range
       2.2.8.  NumBytes
       2.2.9.  Bitstream Functions
   3.  Sample Coding
     3.1.  Border
     3.2.  Samples
     3.3.  Median Predictor
     3.4.  Context
     3.5.  Quantization Table Sets
     3.6.  Quantization Table Set Indexes
     3.7.  Color spaces
       3.7.1.  YCbCr
       3.7.2.  RGB
     3.8.  Coding of the Sample Difference
       3.8.1.  Range Coding Mode
       3.8.2.  Golomb Rice Mode
   4.  Bitstream
     4.1.  Parameters
       4.1.1.  version
       4.1.2.  micro_version
       4.1.3.  coder_type
       4.1.4.  state_transition_delta
       4.1.5.  colorspace_type
       4.1.6.  chroma_planes
       4.1.7.  bits_per_raw_sample
       4.1.8.  log2_h_chroma_subsample
       4.1.9.  log2_v_chroma_subsample
       4.1.10. extra_plane
       4.1.11. num_h_slices
       4.1.12. num_v_slices
       4.1.13. quant_table_set_count
       4.1.14. states_coded
       4.1.15. initial_state_delta
       4.1.16. ec
       4.1.17. intra
     4.2.  Configuration Record
       4.2.1.  reserved_for_future_use
       4.2.2.  configuration_record_crc_parity
       4.2.3.  Mapping FFV1 into Containers
     4.3.  Frame
     4.4.  Slice
     4.5.  Slice Header
       4.5.1.  slice_x
       4.5.2.  slice_y
       4.5.3.  slice_width
       4.5.4.  slice_height
       4.5.5.  quant_table_set_index_count
       4.5.6.  quant_table_set_index
       4.5.7.  picture_structure
       4.5.8.  sar_num
       4.5.9.  sar_den
       4.5.10. reset_contexts
       4.5.11. slice_coding_mode
     4.6.  Slice Content
       4.6.1.  primary_color_count
       4.6.2.  plane_pixel_height
       4.6.3.  slice_pixel_height
       4.6.4.  slice_pixel_y
     4.7.  Line
       4.7.1.  plane_pixel_width
       4.7.2.  slice_pixel_width
       4.7.3.  slice_pixel_x
       4.7.4.  sample_difference
     4.8.  Slice Footer
       4.8.1.  slice_size
       4.8.2.  error_status
       4.8.3.  slice_crc_parity
     4.9.  Quantization Table Set
       4.9.1.  quant_tables
       4.9.2.  context_count
   5.  Restrictions
   6.  Security Considerations
   7.  Media Type Definition
   8.  IANA Considerations
   9.  Appendixes
     9.1.  Decoder implementation suggestions
       9.1.1.  Multi-threading Support and Independence of Slices
   10. Changelog
   11. Normative References
   12. Informative References
   Authors' Addresses

1.  Introduction

   This document describes FFV1, a lossless video encoding format.  The
   design of FFV1 considers the storage of image characteristics, data
   fixity, and the optimized use of encoding time and storage
   requirements.  FFV1 is designed to support a wide range of lossless
   video applications such as long-term audiovisual preservation,
   scientific imaging, screen recording, and other video encoding
   scenarios that seek to avoid the generational loss of lossy video
   encodings.

   This document defines version 4 of FFV1.  Prior versions of FFV1
   are defined within [I-D.ietf-cellar-ffv1].

   The latest version of this document is available at
   https://raw.github.com/FFmpeg/FFV1/master/ffv1.md

   This document assumes familiarity with mathematical and coding
   concepts such as Range coding [range-coding] and YCbCr color spaces
   [YCbCr].

2.  Notation and Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.1.  Definitions

   "Container": Format that encapsulates "Frames" (see Section 4.3)
   and (when required) a "Configuration Record" into a bitstream.

   "Sample": The smallest addressable representation of a color
   component or a luma component in a "Frame".  Examples of "Sample" are
   Luma, Blue Chrominance, Red Chrominance, Transparency, Red, Green,
   and Blue.

   "Plane": A discrete component of a static image comprised of
   "Samples" that represent a specific quantification of "Samples" of
   that image.

   "Pixel": The smallest addressable representation of a color in a
   "Frame".  It is composed of 1 or more "Samples".
   "ESC": An ESCape symbol to indicate that the symbol to be stored is
   too large for normal storage and that an alternate storage method is
   used.

   "MSB": Most Significant Bit, the bit that can cause the largest
   change in magnitude of the symbol.

   "RCT": Reversible Color Transform, a near linear, exactly reversible
   integer transform that converts between RGB and YCbCr representations
   of a "Pixel".

   "VLC": Variable Length Code, a code that maps source symbols to a
   variable number of bits.

   "RGB": A reference to the method of storing the value of a "Pixel" by
   using three numeric values that represent Red, Green, and Blue.

   "YCbCr": A reference to the method of storing the value of a "Pixel"
   by using three numeric values that represent the luma of the "Pixel"
   (Y) and the chrominance of the "Pixel" (Cb and Cr).  The term YCbCr
   is used for historical reasons and currently references any color
   space relying on 1 luma "Sample" and 2 chrominance "Samples", e.g.
   YCbCr, YCgCo or ICtCp.  The exact meaning of the three numeric values
   is unspecified.

   "TBA": To Be Announced.  Used in reference to the development of
   future iterations of the FFV1 specification.

2.2.  Conventions

2.2.1.  Pseudo-code

   The FFV1 bitstream is described in this document using pseudo-code.
   Note that the pseudo-code is used for clarity in order to illustrate
   the structure of FFV1 and is not intended to specify any particular
   implementation.  The pseudo-code used is based upon the C programming
   language [ISO.9899.1990] and uses its "if/else", "while" and "for"
   statements as well as functions defined within this document.

2.2.2.  Arithmetic Operators

   Note: the operators and the order of precedence are the same as used
   in the C programming language [ISO.9899.1990].

   "a + b" means a plus b.

   "a - b" means a minus b.

   "-a" means negation of a.

   "a * b" means a multiplied by b.
   "a / b" means a divided by b.

   "a ^ b" means a raised to the b-th power.

   "a & b" means bit-wise "and" of a and b.

   "a | b" means bit-wise "or" of a and b.

   "a >> b" means arithmetic right shift of two's complement integer
   representation of a by b binary digits.

   "a << b" means arithmetic left shift of two's complement integer
   representation of a by b binary digits.

2.2.3.  Assignment Operators

   "a = b" means a is assigned b.

   "a++" is equivalent to a is assigned a + 1.

   "a--" is equivalent to a is assigned a - 1.

   "a += b" is equivalent to a is assigned a + b.

   "a -= b" is equivalent to a is assigned a - b.

   "a *= b" is equivalent to a is assigned a * b.

2.2.4.  Comparison Operators

   "a > b" means a is greater than b.

   "a >= b" means a is greater than or equal to b.

   "a < b" means a is less than b.

   "a <= b" means a is less than or equal to b.

   "a == b" means a is equal to b.

   "a != b" means a is not equal to b.

   "a && b" means Boolean logical "and" of a and b.

   "a || b" means Boolean logical "or" of a and b.

   "!a" means Boolean logical "not" of a.

   "a ? b : c" if a is true, then b, otherwise c.

2.2.5.  Mathematical Functions

   floor(a)  the largest integer less than or equal to a

   ceil(a)  the smallest integer greater than or equal to a

   sign(a)  extracts the sign of a number, i.e. if a < 0 then -1, else
      if a > 0 then 1, else 0

   abs(a)  the absolute value of a, i.e. abs(a) = sign(a)*a

   log2(a)  the base-two logarithm of a

   min(a,b)  the smallest of two values a and b

   max(a,b)  the largest of two values a and b

   median(a,b,c)  the numerical middle value in a data set of a, b, and
      c, i.e. a+b+c-min(a,b,c)-max(a,b,c)

   a_(b)  the b-th value of a sequence of a

   a_(b,c)  the 'b,c'-th value of a sequence of a

2.2.6.  Order of Operation Precedence

   When order of precedence is not indicated explicitly by use of
   parentheses, operations are evaluated in the following order (from
   top to bottom, operations of same precedence being evaluated from
   left to right).  This order of operations is based on the order of
   operations used in Standard C.

   a++, a--
   !a, -a
   a ^ b
   a * b, a / b, a % b
   a + b, a - b
   a << b, a >> b
   a < b, a <= b, a > b, a >= b
   a == b, a != b
   a & b
   a | b
   a && b
   a || b
   a ? b : c
   a = b, a += b, a -= b, a *= b

2.2.7.  Range

   "a...b" means any value starting from a to b, inclusive.

2.2.8.  NumBytes

   "NumBytes" is a non-negative integer that expresses the size in 8-bit
   octets of a particular FFV1 "Configuration Record" or "Frame".  FFV1
   relies on its "Container" to store the "NumBytes" values, see
   Section 4.2.3.

2.2.9.  Bitstream Functions

2.2.9.1.  remaining_bits_in_bitstream

   "remaining_bits_in_bitstream( )" means the count of remaining bits
   after the pointer in that "Configuration Record" or "Frame".  It is
   computed from the "NumBytes" value multiplied by 8 minus the count of
   bits of that "Configuration Record" or "Frame" already read by the
   bitstream parser.

2.2.9.2.  remaining_symbols_in_syntax

   "remaining_symbols_in_syntax( )" is true as long as the RangeCoder
   has not consumed all the given input bytes.

2.2.9.3.  byte_aligned

   "byte_aligned( )" is true if "remaining_bits_in_bitstream( NumBytes
   )" is a multiple of 8, otherwise false.

2.2.9.4.  get_bits

   "get_bits( i )" is the action to read the next "i" bits in the
   bitstream, from most significant bit to least significant bit, and to
   return the corresponding value.  The pointer is increased by "i".

3.  Sample Coding

   For each "Slice" (as described in Section 4.4) of a "Frame", the
   "Planes", "Lines", and "Samples" are coded in an order determined by
   the "Color Space" (see Section 3.7).  Each "Sample" is predicted by
   the median predictor as described in Section 3.3 from other "Samples"
   within the same "Plane" and the difference is stored using the method
   described in Section 3.8.

3.1.  Border

   A border is assumed for each coded "Slice" for the purpose of the
   median predictor and context according to the following rules:

   *  one column of "Samples" to the left of the coded slice is assumed
      as identical to the "Samples" of the leftmost column of the coded
      slice shifted down by one row.  The value of the topmost "Sample"
      of the column of "Samples" to the left of the coded slice is
      assumed to be "0"

   *  one column of "Samples" to the right of the coded slice is assumed
      as identical to the "Samples" of the rightmost column of the coded
      slice

   *  an additional column of "Samples" to the left of the coded slice
      and two rows of "Samples" above the coded slice are assumed to be
      "0"

   The following table depicts a slice of 9 "Samples"
   "a,b,c,d,e,f,g,h,i" in a 3x3 arrangement along with its assumed
   border.

   +---+---+---+---+---+---+---+---+
   | 0 | 0 |   | 0 | 0 | 0 |   | 0 |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |   | 0 | 0 | 0 |   | 0 |
   +---+---+---+---+---+---+---+---+
   |   |   |   |   |   |   |   |   |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |   | a | b | c |   | c |
   +---+---+---+---+---+---+---+---+
   | 0 | a |   | d | e | f |   | f |
   +---+---+---+---+---+---+---+---+
   | 0 | d |   | g | h | i |   | i |
   +---+---+---+---+---+---+---+---+

3.2.  Samples

   Relative to any "Sample" "X", six other relatively positioned
   "Samples" from the coded "Samples" and presumed border are identified
   according to the labels used in the following diagram.  The labels
   for these relatively positioned "Samples" are used within the median
   predictor and context.

   +---+---+---+---+
   |   |   | T |   |
   +---+---+---+---+
   |   |tl | t |tr |
   +---+---+---+---+
   | L | l | X |   |
   +---+---+---+---+

   The labels for these relative "Samples" are made of the first letters
   of the words Top, Left and Right.

3.3.  Median Predictor

   The prediction for any "Sample" value at position "X" may be computed
   based upon the relative neighboring values of "l", "t", and "tl" via
   this equation:

   "median(l, t, l + t - tl)".

   Note, this prediction template is also used in [ISO.14495-1.1999] and
   [HuffYUV].

   Exception for the median predictor: if "colorspace_type == 0 &&
   bits_per_raw_sample == 16 && ( coder_type == 1 || coder_type == 2 )",
   the following median predictor MUST be used:

   "median(left16s, top16s, left16s + top16s - diag16s)"

   where:

   left16s = l  >= 32768 ? ( l  - 65536 ) : l
   top16s  = t  >= 32768 ? ( t  - 65536 ) : t
   diag16s = tl >= 32768 ? ( tl - 65536 ) : tl

   Background: a two's complement 16-bit signed integer was used for
   storing "Sample" values in all known implementations of the FFV1
   bitstream.  So in some circumstances, the most significant bit was
   wrongly interpreted (used as a sign bit instead of the 16th bit of an
   unsigned integer).  Note that when the issue was discovered, the only
   impacted configuration in all known implementations was 16-bit YCbCr
   with no Pixel transformation and the Range Coder, as the other
   potentially impacted configurations (e.g. 15/16-bit JPEG2000-RCT with
   the Range Coder, or 16-bit content with the Golomb Rice coder) had
   not been implemented anywhere [ISO.15444-1.2016].
   In the meantime, 16-bit JPEG2000-RCT with the Range Coder was
   implemented without this issue in one implementation and validated by
   one conformance checker.  This exception for the median predictor is
   expected (to be confirmed) to be removed in the next version of the
   FFV1 bitstream.

3.4.  Context

   Relative to any "Sample" "X", the Quantized Sample Differences "L-l",
   "l-tl", "tl-t", "T-t", and "t-tr" are used as context:

   context = Q_{0}[l - tl] +
             Q_{1}[tl - t] +
             Q_{2}[t - tr] +
             Q_{3}[L - l]  +
             Q_{4}[T - t]

                                Figure 1

   If "context >= 0" then "context" is used and the difference between
   the "Sample" and its predicted value is encoded as is, else
   "-context" is used and the difference between the "Sample" and its
   predicted value is encoded with a flipped sign.

3.5.  Quantization Table Sets

   The FFV1 bitstream contains 1 or more Quantization Table Sets.  Each
   Quantization Table Set contains exactly 5 Quantization Tables with
   each Quantization Table corresponding to 1 of the 5 Quantized Sample
   Differences.  For each Quantization Table, both the number of
   quantization steps and their distribution are stored in the FFV1
   bitstream; each Quantization Table has exactly 256 entries, and the 8
   least significant bits of the Quantized Sample Difference are used as
   index:

   Q_{j}[k] = quant_tables[i][j][k&255]

                                Figure 2

   In this formula, "i" is the Quantization Table Set index, "j" is the
   Quantization Table index, and "k" is the Quantized Sample Difference.

3.6.  Quantization Table Set Indexes

   For each "Plane" of each slice, a Quantization Table Set is selected
   from an index:

   *  For Y "Plane", "quant_table_set_index[ 0 ]" index is used

   *  For Cb and Cr "Planes", "quant_table_set_index[ 1 ]" index is used

   *  For extra "Plane", "quant_table_set_index[ (version <= 3 ||
      chroma_planes) ? 2 : 1 ]" index is used

   Background: in the first implementations of the FFV1 bitstream, the
   index for Cb and Cr "Planes" was stored even if it was not used
   (chroma_planes set to 0).  This index is kept for version <= 3 in
   order to keep compatibility with FFV1 bitstreams in the wild.

3.7.  Color spaces

   FFV1 supports several color spaces.  The count of allowed coded
   planes and the meaning of the extra "Plane" are determined by the
   selected color space.

   The FFV1 bitstream interleaves data in an order determined by the
   color space.  In YCbCr, for each "Plane", each "Line" is coded from
   top to bottom and, for each "Line", each "Sample" is coded from left
   to right.  In JPEG2000-RCT, for each "Line" from top to bottom, each
   "Plane" is coded and, for each "Plane", each "Sample" is encoded from
   left to right.

3.7.1.  YCbCr

   This color space allows 1 to 4 "Planes".

   The Cb and Cr "Planes" are optional, but if used then MUST be used
   together.  Omitting the Cb and Cr "Planes" codes the frames in
   grayscale without color data.

   An optional transparency "Plane" can be used to code transparency
   data.

   An FFV1 "Frame" using YCbCr MUST use one of the following
   arrangements:

   *  Y

   *  Y, Transparency

   *  Y, Cb, Cr

   *  Y, Cb, Cr, Transparency

   The Y "Plane" MUST be coded first.  If the Cb and Cr "Planes" are
   used then they MUST be coded after the Y "Plane".  If a transparency
   "Plane" is used, then it MUST be coded last.

3.7.2.  RGB

   This color space allows 3 or 4 "Planes".

   An optional transparency "Plane" can be used to code transparency
   data.

   JPEG2000-RCT is a Reversible Color Transform that codes RGB (red,
   green, blue) "Planes" losslessly in a modified YCbCr color space
   [ISO.15444-1.2016].  Reversible Pixel transformations between YCbCr
   and RGB use the following formulae.

   Cb = b - g
   Cr = r - g
   Y  = g + ((Cb + Cr) >> 2)
   g  = Y - ((Cb + Cr) >> 2)
   r  = Cr + g
   b  = Cb + g

                                Figure 3

   Exception for the JPEG2000-RCT conversion: if bits_per_raw_sample is
   between 9 and 15 inclusive and extra_plane is 0, the following
   formulae for reversible conversions between YCbCr and RGB MUST be
   used instead of the ones above:

   Cb = g - b
   Cr = r - b
   Y  = b + ((Cb + Cr) >> 2)
   b  = Y - ((Cb + Cr) >> 2)
   r  = Cr + b
   g  = Cb + b

                                Figure 4

   Background: At the time of this writing, in all known implementations
   of the FFV1 bitstream, when bits_per_raw_sample was between 9 and 15
   inclusive and extra_plane was 0, GBR "Planes" were used as BGR
   "Planes" during both encoding and decoding.  In the meantime, 16-bit
   JPEG2000-RCT was implemented without this issue in one implementation
   and validated by one conformance checker.  Methods to address this
   exception for the transform are under consideration for the next
   version of the FFV1 bitstream.

   When FFV1 uses the JPEG2000-RCT, the horizontal "Lines" are
   interleaved to improve caching efficiency since it is most likely
   that the JPEG2000-RCT will immediately be converted to RGB during
   decoding.  The interleaved coding order is also Y, then Cb, then Cr,
   and then, if used, transparency.
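   The transform of Figure 3 can be sketched in C as follows.  This is
   a minimal illustration and not part of the format definition.  The
   names "asr2", "rct_forward", "rct_inverse" and "rct_selfcheck" and
   the two structs are hypothetical.  The sketch assumes that the right
   shift applies only to the sum (Cb + Cr), as in the JPEG2000 RCT, and
   that ">>" rounds toward negative infinity as described in
   Section 2.2.2.  Since C leaves the right shift of negative values
   implementation-defined, the shift is written out explicitly.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { int32_t r, g, b; } Rgb;     /* hypothetical helper type */
typedef struct { int32_t y, cb, cr; } Ycc;   /* hypothetical helper type */

/* Arithmetic right shift by 2, i.e. floor(v / 4), written portably. */
static int32_t asr2(int32_t v)
{
    return (v >= 0) ? (v >> 2) : -((-v + 3) >> 2);
}

/* Forward transform of Figure 3 (default case, not the 9..15-bit
 * exception of Figure 4). */
static Ycc rct_forward(Rgb p)
{
    Ycc o;
    o.cb = p.b - p.g;
    o.cr = p.r - p.g;
    o.y  = p.g + asr2(o.cb + o.cr);
    return o;
}

/* Inverse transform of Figure 3. */
static Rgb rct_inverse(Ycc p)
{
    Rgb o;
    o.g = p.y - asr2(p.cb + p.cr);
    o.r = p.cr + o.g;
    o.b = p.cb + o.g;
    return o;
}

/* Small self-check: every 4-bit RGB triple survives the round trip. */
static void rct_selfcheck(void)
{
    for (int32_t r = 0; r < 16; r++)
        for (int32_t g = 0; g < 16; g++)
            for (int32_t b = 0; b < 16; b++) {
                Rgb in = { r, g, b };
                Rgb out = rct_inverse(rct_forward(in));
                assert(out.r == r && out.g == g && out.b == b);
            }
}
```

   The encoder adds floor((Cb + Cr) / 4) to g and the decoder subtracts
   the same term from Y, so the round trip is exact for any integer
   input.  Note that Cb and Cr need one more bit than the RGB "Samples".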
   As an example, a "Frame" that is two "Pixels" wide and two "Pixels"
   high could have the following structure:

   +------------------------+------------------------+
   | Pixel(1,1)             | Pixel(2,1)             |
   | Y(1,1) Cb(1,1) Cr(1,1) | Y(2,1) Cb(2,1) Cr(2,1) |
   +------------------------+------------------------+
   | Pixel(1,2)             | Pixel(2,2)             |
   | Y(1,2) Cb(1,2) Cr(1,2) | Y(2,2) Cb(2,2) Cr(2,2) |
   +------------------------+------------------------+

   In JPEG2000-RCT, the coding order would be left to right and then top
   to bottom, with values interleaved by "Lines" and stored in this
   order:

   Y(1,1) Y(2,1) Cb(1,1) Cb(2,1) Cr(1,1) Cr(2,1) Y(1,2) Y(2,2) Cb(1,2)
   Cb(2,2) Cr(1,2) Cr(2,2)

3.8.  Coding of the Sample Difference

   Instead of coding the n+1 bits of the Sample Difference with Huffman
   or Range coding (or n+2 bits, in the case of JPEG2000-RCT), only the
   n (or n+1, in the case of JPEG2000-RCT) least significant bits are
   used, since this is sufficient to recover the original "Sample".  In
   the equation below, the term "bits" represents bits_per_raw_sample+1
   for JPEG2000-RCT or bits_per_raw_sample otherwise:

   coder_input =
       [(sample_difference + 2^(bits-1)) & (2^bits - 1)] - 2^(bits-1)

                                Figure 5

3.8.1.  Range Coding Mode

   Early experimental versions of FFV1 used the CABAC Arithmetic coder
   from H.264 as defined in [ISO.14496-10.2014] but due to the uncertain
   patent/royalty situation, as well as its slightly worse performance,
   CABAC was replaced by a Range coder based on an algorithm defined by
   G. Nigel N. Martin in 1979 [range-coding].

3.8.1.1.  Range Binary Values

   To encode binary digits efficiently a Range coder is used.  "C_{i}"
   is the i-th Context.  "B_{i}" is the i-th byte of the bytestream.
   "b_{i}" is the i-th Range coded binary value, "S_{0,i}" is the i-th
   initial state.  The length of the bytestream encoding n binary
   symbols is "j_{n}" bytes.
   r_{i} = floor( ( R_{i} * S_{i,C_{i}} ) / 2^8 )

                                Figure 6

   S_{i+1,C_{i}} = zero_state_{S_{i,C_{i}}} AND
           l_{i} = L_{i}                    AND
           t_{i} = R_{i} - r_{i}            <==
           b_{i} = 0                        <==>
           L_{i} < R_{i} - r_{i}

   S_{i+1,C_{i}} = one_state_{S_{i,C_{i}}}  AND
           l_{i} = L_{i} - R_{i} + r_{i}    AND
           t_{i} = r_{i}                    <==
           b_{i} = 1                        <==>
           L_{i} >= R_{i} - r_{i}

                                Figure 7

   S_{i+1,k} = S_{i,k} <== C_{i} != k

                                Figure 8

   R_{i+1} = 2^8 * t_{i}                    AND
   L_{i+1} = 2^8 * l_{i} + B_{j_{i}}        AND
   j_{i+1} = j_{i} + 1                      <==
   t_{i} < 2^8

   R_{i+1} = t_{i}                          AND
   L_{i+1} = l_{i}                          AND
   j_{i+1} = j_{i}                          <==
   t_{i} >= 2^8

                                Figure 9

   R_{0} = 65280

                                Figure 10

   L_{0} = 2^8 * B_{0} + B_{1}

                                Figure 11

   j_{0} = 2

                                Figure 12

3.8.1.1.1.  Termination

   The range coder can be used in 3 modes.

   *  In "Open mode" when decoding, every symbol the reader attempts to
      read is available.  In this mode arbitrary data may have been
      appended without affecting the range coder output.  This mode is
      not used in FFV1.

   *  In "Closed mode" the length in bytes of the bytestream is provided
      to the range decoder.  Bytes beyond the length are read as 0 by
      the range decoder.  This is generally 1 byte shorter than the open
      mode.

   *  In "Sentinel mode" the exact length in bytes is not known and thus
      the range decoder MAY read into the data that follows the range
      coded bytestream by one byte.  In "Sentinel mode", the end of the
      range coded bytestream is a binary symbol with state 129, whose
      value SHALL be discarded.  After reading this symbol, the range
      decoder will have read one byte beyond the end of the range coded
      bytestream.  This way the byte position of the end can be
      determined.  Bytestreams written in "Sentinel mode" can be read in
      "Closed mode" if the length can be determined; in this case the
      last (sentinel) symbol will be read non-corrupted and be of value
      0.
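   The recurrences of Figures 6 to 12, combined with the "Closed mode"
   rule that bytes beyond the given length are read as 0, can be
   sketched as the following C decoder for one binary value.  It is an
   illustration under stated assumptions rather than a reference
   implementation: the type "RangeDecoder" and the functions
   "rd_next_byte", "rd_init" and "rd_get_bit" are hypothetical names,
   and the two 256-entry transition tables are assumed to be supplied by
   the caller (built as described in the section on the State
   Transition Table).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *buf;        /* range coded bytestream B            */
    size_t len;                /* its length in bytes ("Closed mode") */
    size_t pos;                /* j: index of the next byte to read   */
    uint32_t range;            /* R                                   */
    uint32_t low;              /* L                                   */
    const uint8_t *zero_state; /* caller-supplied, 256 entries        */
    const uint8_t *one_state;  /* caller-supplied, 256 entries        */
} RangeDecoder;

/* "Closed mode": bytes beyond the known length are read as 0. */
static uint8_t rd_next_byte(RangeDecoder *d)
{
    uint8_t v = (d->pos < d->len) ? d->buf[d->pos] : 0;
    d->pos++;
    return v;
}

/* Figures 10 to 12: R_0 = 65280, L_0 = 2^8 * B_0 + B_1, j_0 = 2. */
static void rd_init(RangeDecoder *d, const uint8_t *buf, size_t len,
                    const uint8_t *zero_state, const uint8_t *one_state)
{
    assert(buf != NULL && zero_state != NULL && one_state != NULL);
    d->buf = buf;
    d->len = len;
    d->pos = 0;
    d->zero_state = zero_state;
    d->one_state = one_state;
    d->low = (uint32_t)rd_next_byte(d) << 8;
    d->low += rd_next_byte(d);
    d->range = 65280;
}

/* Decodes one binary value; *state is the state of the current
 * context and is updated in place. */
static int rd_get_bit(RangeDecoder *d, uint8_t *state)
{
    uint32_t r = (d->range * (uint32_t)*state) >> 8;  /* Figure 6 */
    uint32_t t = d->range - r;
    int bit;

    if (d->low < t) {                 /* Figure 7, b_i = 0 */
        bit = 0;
        *state = d->zero_state[*state];
        d->range = t;
    } else {                          /* Figure 7, b_i = 1 */
        bit = 1;
        d->low -= t;
        *state = d->one_state[*state];
        d->range = r;
    }
    if (d->range < 256) {             /* Figure 9, renormalization */
        d->range <<= 8;
        d->low = (d->low << 8) + rd_next_byte(d);
    }
    return bit;
}
```

   A caller would keep one state byte per context and pass the state
   selected by the current context to "rd_get_bit".  The states of all
   other contexts stay untouched, which corresponds to Figure 8.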
   The above describes range decoding.  Encoding is defined as any
   process that produces a decodable bytestream.

   There are 3 places where range coder termination is needed in FFV1.
   The first is in the "Configuration Record"; in this case the size of
   the range coded bytestream is known and handled as "Closed mode".
   The second is the switch from the "Slice Header", which is range
   coded, to Golomb coded slices; it is handled as "Sentinel mode".  The
   third is the end of range coded Slices, which need to terminate
   before the CRC at their end.  This can be handled as "Sentinel mode"
   or as "Closed mode" if the CRC position has been determined.

3.8.1.2.  Range Non Binary Values

   To encode scalar integers, it would be possible to encode each bit
   separately and use the past bits as context.  However, that would
   mean 255 contexts per 8-bit symbol, which is not only a waste of
   memory but also requires more past data to reach a reasonably good
   estimate of the probabilities.  Alternatively, assuming a Laplacian
   distribution and only dealing with its variance and mean (as in
   Huffman coding) would also be possible.  However, for maximum
   flexibility and simplicity, the chosen method uses a single symbol to
   encode whether a number is 0 and, if not, encodes the number using
   its exponent, mantissa and sign.  The exact contexts used are best
   described by the following code, followed by some comments.
770 pseudo-code | type 771 --------------------------------------------------------------|----- 772 void put_symbol(RangeCoder *c, uint8_t *state, int v, int \ | 773 is_signed) { | 774 int i; | 775 put_rac(c, state+0, !v); | 776 if (v) { | 777 int a= abs(v); | 778 int e= log2(a); | 779 | 780 for (i = 0; i < e; i++) { | 781 put_rac(c, state+1+min(i,9), 1); //1..10 | 782 } | 783 | 784 put_rac(c, state+1+min(i,9), 0); | 785 for (i = e-1; i >= 0; i--) { | 786 put_rac(c, state+22+min(i,9), (a>>i)&1); //22..31 | 787 } | 788 | 789 if (is_signed) { | 790 put_rac(c, state+11 + min(e, 10), v < 0); //11..21| 791 } | 792 } | 793 } | 795 3.8.1.3. Initial Values for the Context Model 797 At keyframes all Range coder state variables are set to their initial 798 state. 800 3.8.1.4. State Transition Table 802 one_state_{i} = 803 default_state_transition_{i} + state_transition_delta_{i} 805 Figure 13 807 zero_state_{i} = 256 - one_state_{256-i} 809 Figure 14 811 3.8.1.5. default_state_transition 812 0, 0, 0, 0, 0, 0, 0, 0, 20, 21, 22, 23, 24, 25, 26, 27, 814 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 37, 38, 39, 40, 41, 42, 816 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 56, 57, 818 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 820 74, 75, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 822 89, 90, 91, 92, 93, 94, 94, 95, 96, 97, 98, 99,100,101,102,103, 824 104,105,106,107,108,109,110,111,112,113,114,114,115,116,117,118, 826 119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,133, 828 134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149, 830 150,151,152,152,153,154,155,156,157,158,159,160,161,162,163,164, 832 165,166,167,168,169,170,171,171,172,173,174,175,176,177,178,179, 834 180,181,182,183,184,185,186,187,188,189,190,190,191,192,194,194, 836 195,196,197,198,199,200,201,202,202,204,205,206,207,208,209,209, 838 210,211,212,213,215,215,216,217,218,219,220,220,222,223,224,225, 840 
226,227,227,229,229,230,231,232,234,234,235,236,237,238,239,240, 842 241,242,243,244,245,246,247,248,248, 0, 0, 0, 0, 0, 0, 0, 844 3.8.1.6. Alternative State Transition Table 846 The alternative state transition table has been built using iterative 847 minimization of frame sizes and generally performs better than the 848 default. To use it, the coder_type (see the section on coder_type 849 (#codertype)) MUST be set to 2 and the difference to the default MUST 850 be stored in the "Parameters", see the section on Parameters 851 (#parameters). The reference implementation of FFV1 in FFmpeg uses 852 this table by default at the time of this writing when Range coding 853 is used. 855 0, 10, 10, 10, 10, 16, 16, 16, 28, 16, 16, 29, 42, 49, 20, 49, 857 59, 25, 26, 26, 27, 31, 33, 33, 33, 34, 34, 37, 67, 38, 39, 39, 859 40, 40, 41, 79, 43, 44, 45, 45, 48, 48, 64, 50, 51, 52, 88, 52, 861 53, 74, 55, 57, 58, 58, 74, 60,101, 61, 62, 84, 66, 66, 68, 69, 863 87, 82, 71, 97, 73, 73, 82, 75,111, 77, 94, 78, 87, 81, 83, 97, 865 85, 83, 94, 86, 99, 89, 90, 99,111, 92, 93,134, 95, 98,105, 98, 867 105,110,102,108,102,118,103,106,106,113,109,112,114,112,116,125, 869 115,116,117,117,126,119,125,121,121,123,145,124,126,131,127,129, 871 165,130,132,138,133,135,145,136,137,139,146,141,143,142,144,148, 873 147,155,151,149,151,150,152,157,153,154,156,168,158,162,161,160, 875 172,163,169,164,166,184,167,170,177,174,171,173,182,176,180,178, 877 175,189,179,181,186,183,192,185,200,187,191,188,190,197,193,196, 879 197,194,195,196,198,202,199,201,210,203,207,204,205,206,208,214, 881 209,211,221,212,213,215,224,216,217,218,219,220,222,228,223,225, 883 226,224,227,229,240,230,231,232,233,234,235,236,238,239,237,242, 885 241,243,242,244,245,246,247,248,249,250,251,252,252,253,254,255, 887 3.8.2. Golomb Rice Mode 889 The end of the bitstream of the "Frame" is filled with 0-bits until 890 that the bitstream contains a multiple of 8 bits. 892 3.8.2.1. 
Signed Golomb Rice Codes

   This coding mode uses Golomb Rice codes. The VLC is split into two
   parts: the prefix stores the most significant bits, and the suffix
   stores the k least significant bits or, in the ESC case, the whole
   number.

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   int get_ur_golomb(k) {                                        |
       for (prefix = 0; prefix < 12; prefix++) {                 |
           if (get_bits(1)) {                                    |
               return get_bits(k) + (prefix << k);               |
           }                                                     |
       }                                                         |
       return get_bits(bits) + 11;                               |
   }                                                             |
                                                                 |
   int get_sr_golomb(k) {                                        |
       v = get_ur_golomb(k);                                     |
       if (v & 1) return - (v >> 1) - 1;                         |
       else return (v >> 1);                                     |
   }                                                             |

3.8.2.1.1.  Prefix

   +----------------+-------+
   | bits           | value |
   +================+=======+
   | 1              | 0     |
   +----------------+-------+
   | 01             | 1     |
   +----------------+-------+
   | ...            | ...   |
   +----------------+-------+
   | 0000 0000 0001 | 11    |
   +----------------+-------+
   | 0000 0000 0000 | ESC   |
   +----------------+-------+

                               Table 1

3.8.2.1.2.  Suffix

   +---------+--------------------------------------------------+
   | case    | suffix content                                   |
   +=========+==================================================+
   | non ESC | the k least significant bits MSB first           |
   +---------+--------------------------------------------------+
   | ESC     | the value - 11, in MSB first order, ESC may only |
   |         | be used if the value cannot be coded as non ESC  |
   +---------+--------------------------------------------------+

                               Table 2

3.8.2.1.3.
Examples

   +-----+-------------------------+-------+
   | k   | bits                    | value |
   +=====+=========================+=======+
   | 0   | "1"                     | 0     |
   +-----+-------------------------+-------+
   | 0   | "001"                   | 2     |
   +-----+-------------------------+-------+
   | 2   | "1 00"                  | 0     |
   +-----+-------------------------+-------+
   | 2   | "1 10"                  | 2     |
   +-----+-------------------------+-------+
   | 2   | "01 01"                 | 5     |
   +-----+-------------------------+-------+
   | any | "000000000000 10000000" | 139   |
   +-----+-------------------------+-------+

                               Table 3

3.8.2.2.  Run Mode

   Run mode is entered when the context is 0 and left as soon as a
   non-0 difference is found. Within a run, the level is identical to
   the predicted one. The run and the first different level are coded.

3.8.2.2.1.  Run Length Coding

   The run value is encoded in two parts. The prefix part stores the
   more significant part of the run and also adjusts the run_index
   that determines the number of bits in the less significant part of
   the run. The second part of the value stores the less significant
   part of the run as it is. The run_index is reset to 0 for each
   "Plane" and slice.

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   log2_run[41]={                                                |
    0, 0, 0, 0, 1, 1, 1, 1,                                      |
    2, 2, 2, 2, 3, 3, 3, 3,                                      |
    4, 4, 5, 5, 6, 6, 7, 7,                                      |
    8, 9,10,11,12,13,14,15,                                      |
   16,17,18,19,20,21,22,23,                                      |
   24,                                                           |
   };                                                            |
                                                                 |
   if (run_count == 0 && run_mode == 1) {                        |
       if (get_bits(1)) {                                        |
           run_count = 1 << log2_run[run_index];                 |
           if (x + run_count <= w) {                             |
               run_index++;                                      |
           }                                                     |
       } else {                                                  |
           if (log2_run[run_index]) {                            |
               run_count = get_bits(log2_run[run_index]);        |
           } else {                                              |
               run_count = 0;                                    |
           }                                                     |
           if (run_index) {                                      |
               run_index--;                                      |
           }                                                     |
           run_mode = 2;                                         |
       }                                                         |
   }                                                             |

   The log2_run array is also used within [ISO.14495-1.1999].

3.8.2.2.2.
Level Coding

   Level coding is identical to the normal difference coding with the
   exception that the 0 value is removed as it cannot occur:

       diff = get_vlc_symbol(context_state);
       if (diff >= 0) {
           diff++;
       }

   Note, this is different from JPEG-LS, which doesn't use prediction in
   run mode and uses a different encoding and context model for the last
   difference. On a small set of test "Samples", the use of prediction
   slightly improved the compression rate.

3.8.2.3.  Scalar Mode

   Each difference is coded with the per context mean prediction removed
   and a per context value for k.

   get_vlc_symbol(state) {
       i = state->count;
       k = 0;
       while (i < state->error_sum) {
           k++;
           i += i;
       }

       v = get_sr_golomb(k);

       if (2 * state->drift < -state->count) {
           v = -1 - v;
       }

       ret = sign_extend(v + state->bias, bits);

       state->error_sum += abs(v);
       state->drift     += v;

       if (state->count == 128) {
           state->count     >>= 1;
           state->drift     >>= 1;
           state->error_sum >>= 1;
       }
       state->count++;
       if (state->drift <= -state->count) {
           state->bias = max(state->bias - 1, -128);

           state->drift = max(state->drift + state->count,
                              -state->count + 1);
       } else if (state->drift > 0) {
           state->bias = min(state->bias + 1, 127);

           state->drift = min(state->drift - state->count, 0);
       }

       return ret;
   }

3.8.2.4.  Initial Values for the VLC context state

   At keyframes all coder state variables are set to their initial
   state.

       drift     = 0;
       error_sum = 4;
       bias      = 0;
       count     = 1;

4.  Bitstream

   An FFV1 bitstream is composed of a series of 1 or more "Frames" and
   (when required) a "Configuration Record".

   Within the following sub-sections, pseudo-code is used to explain the
   structure of each FFV1 bitstream component, as described in the
   section on Pseudo-Code (#pseudocode).
The following table lists 1090 symbols used to annotate that pseudo-code in order to define the 1091 storage of the data referenced in that line of pseudo-code. 1093 +--------+-------------------------------------------+ 1094 | Symbol | Definition | 1095 +========+===========================================+ 1096 | u(n) | unsigned big endian integer using n bits | 1097 +--------+-------------------------------------------+ 1098 | sg | Golomb Rice coded signed scalar symbol | 1099 | | coded with the method described in Signed | 1100 | | Golomb Rice Codes (#golomb-rice-mode) | 1101 +--------+-------------------------------------------+ 1102 | br | Range coded Boolean (1-bit) symbol with | 1103 | | the method described in Range binary | 1104 | | values (#range-binary-values) | 1105 +--------+-------------------------------------------+ 1106 | ur | Range coded unsigned scalar symbol coded | 1107 | | with the method described in Range non | 1108 | | binary values (#range-non-binary-values) | 1109 +--------+-------------------------------------------+ 1110 | sr | Range coded signed scalar symbol coded | 1111 | | with the method described in Range non | 1112 | | binary values (#range-non-binary-values) | 1113 +--------+-------------------------------------------+ 1115 Table 4 1117 The same context that is initialized to 128 is used for all fields in 1118 the header. 1120 The following MUST be provided by external means during 1121 initialization of the decoder: 1123 "frame_pixel_width" is defined as "Frame" width in "Pixels". 1125 "frame_pixel_height" is defined as "Frame" height in "Pixels". 1127 Default values at the decoder initialization phase: 1129 "ConfigurationRecordIsPresent" is set to 0. 1131 4.1. 
Parameters 1133 The "Parameters" section contains significant characteristics about 1134 the decoding configuration used for all instances of "Frame" (in FFV1 1135 version 0 and 1) or the whole FFV1 bitstream (other versions), 1136 including the stream version, color configuration, and quantization 1137 tables. The pseudo-code below describes the contents of the 1138 bitstream. 1140 pseudo-code | type 1141 --------------------------------------------------------------|----- 1142 Parameters( ) { | 1143 version | ur 1144 if (version >= 3) { | 1145 micro_version | ur 1146 } | 1147 coder_type | ur 1148 if (coder_type > 1) { | 1149 for (i = 1; i < 256; i++) { | 1150 state_transition_delta[ i ] | sr 1151 } | 1152 } | 1153 colorspace_type | ur 1154 if (version >= 1) { | 1155 bits_per_raw_sample | ur 1156 } | 1157 chroma_planes | br 1158 log2_h_chroma_subsample | ur 1159 log2_v_chroma_subsample | ur 1160 extra_plane | br 1161 if (version >= 3) { | 1162 num_h_slices - 1 | ur 1163 num_v_slices - 1 | ur 1164 quant_table_set_count | ur 1165 } | 1166 for (i = 0; i < quant_table_set_count; i++) { | 1167 QuantizationTableSet( i ) | 1168 } | 1169 if (version >= 3) { | 1170 for (i = 0; i < quant_table_set_count; i++) { | 1171 states_coded | br 1172 if (states_coded) { | 1173 for (j = 0; j < context_count[ i ]; j++) { | 1174 for (k = 0; k < CONTEXT_SIZE; k++) { | 1175 initial_state_delta[ i ][ j ][ k ] | sr 1176 } | 1177 } | 1178 } | 1179 } | 1180 ec | ur 1181 intra | ur 1182 } | 1183 } | 1185 CONTEXT_SIZE is 32. 1187 4.1.1. version 1189 "version" specifies the version of the FFV1 bitstream. 1191 Each version is incompatible with other versions: decoders SHOULD 1192 reject a file due to an unknown version. 1194 Decoders SHOULD reject a file with version <= 1 && 1195 ConfigurationRecordIsPresent == 1. 1197 Decoders SHOULD reject a file with version >= 3 && 1198 ConfigurationRecordIsPresent == 0. 
1200 +-------+-------------------------+ 1201 | value | version | 1202 +=======+=========================+ 1203 | 0 | FFV1 version 0 | 1204 +-------+-------------------------+ 1205 | 1 | FFV1 version 1 | 1206 +-------+-------------------------+ 1207 | 2 | reserved* | 1208 +-------+-------------------------+ 1209 | 3 | FFV1 version 3 | 1210 +-------+-------------------------+ 1211 | 4 | FFV1 version 4 | 1212 +-------+-------------------------+ 1213 | Other | reserved for future use | 1214 +-------+-------------------------+ 1216 Table 5 1218 * Version 2 was never enabled in the encoder thus version 2 files 1219 SHOULD NOT exist, and this document does not describe them to keep 1220 the text simpler. 1222 4.1.2. micro_version 1224 "micro_version" specifies the micro-version of the FFV1 bitstream. 1226 After a version is considered stable (a micro-version value is 1227 assigned to be the first stable variant of a specific version), each 1228 new micro-version after this first stable variant is compatible with 1229 the previous micro-version: decoders SHOULD NOT reject a file due to 1230 an unknown micro-version equal or above the micro-version considered 1231 as stable. 1233 Meaning of micro_version for version 3: 1235 +-------+-------------------------+ 1236 | value | micro_version | 1237 +=======+=========================+ 1238 | 0...3 | reserved* | 1239 +-------+-------------------------+ 1240 | 4 | first stable variant | 1241 +-------+-------------------------+ 1242 | Other | reserved for future use | 1243 +-------+-------------------------+ 1245 Table 6 1247 * development versions may be incompatible with the stable variants. 
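
   The acceptance rules for "version" and "micro_version" stated so
   far can be gathered into a single check. The following C fragment
   is a minimal, non-normative sketch of that check. The function
   name, the enum, and its values are hypothetical and do not come
   from this specification. Treating every version 4 bitstream as
   experimental is an assumption of the sketch, made because this
   document does not yet announce a stable micro_version for version
   4.

```c
#include <stdbool.h>

/* Hypothetical result codes; the names are illustrative only. */
enum ffv1_version_check {
    FFV1_VERSION_OK,           /* defined version, stable micro_version  */
    FFV1_VERSION_EXPERIMENTAL, /* defined version, development variant   */
    FFV1_VERSION_REJECT        /* the decoder should refuse the bitstream */
};

/*
 * micro_version is only coded when version >= 3, so it is ignored
 * for versions 0 and 1.
 */
enum ffv1_version_check
ffv1_check_version(unsigned version, unsigned micro_version,
                   bool configuration_record_is_present)
{
    /* Only 0, 1, 3 and 4 are defined; 2 and values above 4 are reserved. */
    if (version == 2 || version > 4)
        return FFV1_VERSION_REJECT;

    /* Versions 0 and 1 carry "Parameters" in the "Frame" itself. */
    if (version <= 1 && configuration_record_is_present)
        return FFV1_VERSION_REJECT;

    /* Versions 3 and later require a "Configuration Record". */
    if (version >= 3 && !configuration_record_is_present)
        return FFV1_VERSION_REJECT;

    /* For version 3, micro_version 4 is the first stable variant. */
    if (version == 3 && micro_version < 4)
        return FFV1_VERSION_EXPERIMENTAL;

    /* Assumption: no stable micro_version exists yet for version 4. */
    if (version == 4)
        return FFV1_VERSION_EXPERIMENTAL;

    /* Unknown micro-versions at or above the stable one are accepted. */
    return FFV1_VERSION_OK;
}
```

   A decoder built on this sketch could decode
   FFV1_VERSION_EXPERIMENTAL input on a best-effort basis while
   warning the user, since development variants may be incompatible
   with the stable ones.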
1249 Meaning of micro_version for version 4 (note: at the time of writing 1250 of this specification, version 4 is not considered stable so the 1251 first stable version value is to be announced in the future): 1253 +---------+-------------------------+ 1254 | value | micro_version | 1255 +=========+=========================+ 1256 | 0...TBA | reserved* | 1257 +---------+-------------------------+ 1258 | TBA | first stable variant | 1259 +---------+-------------------------+ 1260 | Other | reserved for future use | 1261 +---------+-------------------------+ 1263 Table 7 1265 * development versions which may be incompatible with the stable 1266 variants. 1268 4.1.3. coder_type 1270 "coder_type" specifies the coder used. 1272 +-------+-------------------------------------------------+ 1273 | value | coder used | 1274 +=======+=================================================+ 1275 | 0 | Golomb Rice | 1276 +-------+-------------------------------------------------+ 1277 | 1 | Range Coder with default state transition table | 1278 +-------+-------------------------------------------------+ 1279 | 2 | Range Coder with custom state transition table | 1280 +-------+-------------------------------------------------+ 1281 | Other | reserved for future use | 1282 +-------+-------------------------------------------------+ 1284 Table 8 1286 4.1.4. state_transition_delta 1288 "state_transition_delta" specifies the Range coder custom state 1289 transition table. 1291 If state_transition_delta is not present in the FFV1 bitstream, all 1292 Range coder custom state transition table elements are assumed to be 1293 0. 1295 4.1.5. colorspace_type 1297 "colorspace_type" specifies the color space encoded, the pixel 1298 transformation used by the encoder, the extra plane content, as well 1299 as interleave method. 
1301 +-------+-------------+----------------+--------------+-------------+ 1302 | value | color space | pixel | extra plane | interleave | 1303 | | encoded | transformation | content | method | 1304 +=======+=============+================+==============+=============+ 1305 | 0 | YCbCr | None | Transparency | "Plane" | 1306 | | | | | then | 1307 | | | | | "Line" | 1308 +-------+-------------+----------------+--------------+-------------+ 1309 | 1 | RGB | JPEG2000-RCT | Transparency | "Line" | 1310 | | | | | then | 1311 | | | | | "Plane" | 1312 +-------+-------------+----------------+--------------+-------------+ 1313 | Other | reserved | reserved for | reserved for | reserved | 1314 | | for future | future use | future use | for future | 1315 | | use | | | use | 1316 +-------+-------------+----------------+--------------+-------------+ 1318 Table 9 1320 Restrictions: 1322 If "colorspace_type" is 1, then "chroma_planes" MUST be 1, 1323 "log2_h_chroma_subsample" MUST be 0, and "log2_v_chroma_subsample" 1324 MUST be 0. 1326 4.1.6. chroma_planes 1328 "chroma_planes" indicates if chroma (color) "Planes" are present. 1330 +-------+---------------------------------+ 1331 | value | presence | 1332 +=======+=================================+ 1333 | 0 | chroma "Planes" are not present | 1334 +-------+---------------------------------+ 1335 | 1 | chroma "Planes" are present | 1336 +-------+---------------------------------+ 1338 Table 10 1340 4.1.7. bits_per_raw_sample 1342 "bits_per_raw_sample" indicates the number of bits for each "Sample". 1343 Inferred to be 8 if not present. 
   +-------+-----------------------------------+
   | value | bits for each sample              |
   +=======+===================================+
   | 0     | reserved*                         |
   +-------+-----------------------------------+
   | Other | the actual bits for each "Sample" |
   +-------+-----------------------------------+

                               Table 11

   * Encoders MUST NOT store bits_per_raw_sample = 0. Decoders SHOULD
   accept and interpret bits_per_raw_sample = 0 as 8.

4.1.8.  log2_h_chroma_subsample

   "log2_h_chroma_subsample" indicates the subsample factor, stored as
   the power to which the number 2 must be raised, between luma and
   chroma width ("chroma_width = 2^(-log2_h_chroma_subsample) *
   luma_width").

4.1.9.  log2_v_chroma_subsample

   "log2_v_chroma_subsample" indicates the subsample factor, stored as
   the power to which the number 2 must be raised, between luma and
   chroma height ("chroma_height = 2^(-log2_v_chroma_subsample) *
   luma_height").

4.1.10.  extra_plane

   "extra_plane" indicates if an extra "Plane" is present.

   +-------+------------------------------+
   | value | presence                     |
   +=======+==============================+
   | 0     | extra "Plane" is not present |
   +-------+------------------------------+
   | 1     | extra "Plane" is present     |
   +-------+------------------------------+

                               Table 12

4.1.11.  num_h_slices

   "num_h_slices" indicates the number of horizontal elements of the
   slice raster.

   Inferred to be 1 if not present.

4.1.12.  num_v_slices

   "num_v_slices" indicates the number of vertical elements of the slice
   raster.

   Inferred to be 1 if not present.

4.1.13.  quant_table_set_count

   "quant_table_set_count" indicates the number of Quantization
   Table Sets. "quant_table_set_count" MUST be less than or equal to 8.

   Inferred to be 1 if not present.

   MUST NOT be 0.

4.1.14.
states_coded

   "states_coded" indicates if the respective Quantization Table Set has
   the initial states coded.

   Inferred to be 0 if not present.

   +-------+--------------------------------+
   | value | initial states                 |
   +=======+================================+
   | 0     | initial states are not present |
   |       | and are assumed to be all 128  |
   +-------+--------------------------------+
   | 1     | initial states are present     |
   +-------+--------------------------------+

                               Table 13

4.1.15.  initial_state_delta

   "initial_state_delta[ i ][ j ][ k ]" indicates the initial Range
   coder state. It is encoded using "k" as context index, as a
   difference from the prediction "pred":

   pred = j ? initial_state[ i ][ j - 1 ][ k ] : 128

                               Figure 15

   initial_state[ i ][ j ][ k ] =
          ( pred + initial_state_delta[ i ][ j ][ k ] ) & 255

                               Figure 16

4.1.16.  ec

   "ec" indicates the error detection/correction type.

   +-------+--------------------------------------------+
   | value | error detection/correction type            |
   +=======+============================================+
   | 0     | 32-bit CRC on the global header            |
   +-------+--------------------------------------------+
   | 1     | 32-bit CRC per slice and the global header |
   +-------+--------------------------------------------+
   | Other | reserved for future use                    |
   +-------+--------------------------------------------+

                               Table 14

4.1.17.  intra

   "intra" indicates the relationship between the instances of "Frame".

   Inferred to be 0 if not present.
   +-------+-------------------------------------+
   | value | relationship                        |
   +=======+=====================================+
   | 0     | Frames are independent or dependent |
   |       | (keyframes and non keyframes)       |
   +-------+-------------------------------------+
   | 1     | Frames are independent (keyframes   |
   |       | only)                               |
   +-------+-------------------------------------+
   | Other | reserved for future use             |
   +-------+-------------------------------------+

                               Table 15

4.2.  Configuration Record

   In the case of an FFV1 bitstream with "version >= 3", a
   "Configuration Record" is stored in the underlying "Container", at
   the track header level. It contains the "Parameters" used for all
   instances of "Frame". The size of the "Configuration Record",
   "NumBytes", is supplied by the underlying "Container".

   pseudo-code                                                | type
   -----------------------------------------------------------|-----
   ConfigurationRecord( NumBytes ) {                          |
       ConfigurationRecordIsPresent = 1                       |
       Parameters( )                                          |
       while (remaining_symbols_in_syntax(NumBytes - 4)) {    |
           reserved_for_future_use                            | br/ur/sr
       }                                                      |
       configuration_record_crc_parity                        | u(32)
   }                                                          |

4.2.1.  reserved_for_future_use

   "reserved_for_future_use" has semantics that are reserved for future
   use.

   Encoders conforming to this version of this specification SHALL NOT
   write this value.

   Decoders conforming to this version of this specification SHALL
   ignore its value.

4.2.2.  configuration_record_crc_parity

   "configuration_record_crc_parity" is 32 bits that are chosen so that
   the "Configuration Record" as a whole has a CRC remainder of 0.

   This is equivalent to storing the CRC remainder in the 32-bit parity.

   The CRC generator polynomial used is the standard IEEE CRC polynomial
   (0x104C11DB7) with initial value 0.

4.2.3.
Mapping FFV1 into Containers

   This "Configuration Record" can be placed in any file format
   supporting "Configuration Records", following as closely as possible
   the way the file format stores "Configuration Records". The
   "Configuration Record" storage place and "NumBytes" are currently
   defined and supported by this version of this specification for the
   following formats:

4.2.3.1.  AVI File Format

   The "Configuration Record" extends the stream format chunk ("AVI ",
   "hdlr", "strl", "strf") with the ConfigurationRecord bitstream.

   See [AVI] for more information about chunks.

   "NumBytes" is defined as the size, in bytes, of the strf chunk
   indicated in the chunk header minus the size of the stream format
   structure.

4.2.3.2.  ISO Base Media File Format

   The "Configuration Record" extends the sample description box
   ("moov", "trak", "mdia", "minf", "stbl", "stsd") with a "glbl" box
   that contains the ConfigurationRecord bitstream. See
   [ISO.14496-12.2015] for more information about boxes.

   "NumBytes" is defined as the size, in bytes, of the "glbl" box
   indicated in the box header minus the size of the box header.

4.2.3.3.  NUT File Format

   The codec_specific_data element (in "stream_header" packet) contains
   the ConfigurationRecord bitstream. See [NUT] for more information
   about elements.

   "NumBytes" is defined as the size, in bytes, of the
   codec_specific_data element as indicated in the "length" field of
   codec_specific_data.

4.2.3.4.  Matroska File Format

   FFV1 SHOULD use "V_FFV1" as the Matroska "Codec ID". For FFV1
   versions 2 or less, the Matroska "CodecPrivate" Element SHOULD NOT be
   used. For FFV1 versions 3 or greater, the Matroska "CodecPrivate"
   Element MUST contain the FFV1 "Configuration Record" structure and no
   other data. See [Matroska] for more information about elements.
   "NumBytes" is defined as the "Element Data Size" of the
   "CodecPrivate" Element.

4.3.  Frame

   A "Frame" is an encoded representation of a complete static image.
   The whole "Frame" is provided by the underlying container.

   A "Frame" consists of the keyframe field, "Parameters" (if version
   <= 1), and a sequence of independent slices. The pseudo-code below
   describes the contents of a "Frame".

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   Frame( NumBytes ) {                                           |
       keyframe                                                  | br
       if (keyframe && !ConfigurationRecordIsPresent) {          |
           Parameters( )                                         |
       }                                                         |
       while (remaining_bits_in_bitstream( NumBytes )) {         |
           Slice( )                                              |
       }                                                         |
   }                                                             |

   Architecture overview of slices in a "Frame":

   +-----------------------------------------------------------------+
   | first slice header                                              |
   +-----------------------------------------------------------------+
   | first slice content                                             |
   +-----------------------------------------------------------------+
   | first slice footer                                              |
   +-----------------------------------------------------------------+
   | --------------------------------------------------------------- |
   +-----------------------------------------------------------------+
   | second slice header                                             |
   +-----------------------------------------------------------------+
   | second slice content                                            |
   +-----------------------------------------------------------------+
   | second slice footer                                             |
   +-----------------------------------------------------------------+
   | --------------------------------------------------------------- |
   +-----------------------------------------------------------------+
   | ...                                                             |
   +-----------------------------------------------------------------+
   | --------------------------------------------------------------- |
   +-----------------------------------------------------------------+
   | last slice header                                               |
   +-----------------------------------------------------------------+
   | last slice content                                              |
   +-----------------------------------------------------------------+
   | last slice footer                                               |
   +-----------------------------------------------------------------+

                               Table 16

4.4.  Slice

   A "Slice" is an independent spatial sub-section of a "Frame" that is
   encoded separately from any other region of the same "Frame". The use
   of more than one "Slice" per "Frame" can be useful for taking
   advantage of the opportunities of multithreaded encoding and
   decoding.

   A "Slice" consists of a "Slice Header" (when relevant), a "Slice
   Content", and a "Slice Footer" (when relevant). The pseudo-code
   below describes the contents of a "Slice".

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   Slice( ) {                                                    |
       if (version >= 3) {                                       |
           SliceHeader( )                                        |
       }                                                         |
       SliceContent( )                                           |
       if (coder_type == 0) {                                    |
           while (!byte_aligned()) {                             |
               padding                                           | u(1)
           }                                                     |
       }                                                         |
       if (version <= 1) {                                       |
           while (remaining_bits_in_bitstream( NumBytes ) != 0) {|
               reserved                                          | u(1)
           }                                                     |
       }                                                         |
       if (version >= 3) {                                       |
           SliceFooter( )                                        |
       }                                                         |
   }                                                             |

   "padding" specifies a bit without any significance and used only for
   byte alignment. MUST be 0.

   "reserved" specifies a bit without any significance in this revision
   of the specification and may have a significance in a later revision
   of this specification.

   Encoders SHOULD NOT fill these bits.

   Decoders SHOULD ignore these bits.
   Note in case these bits are used in a later revision of this
   specification: any revision of this specification SHOULD avoid
   adding 40 bits of content after "SliceContent" for versions 0 and 1
   of the bitstream. Background: due to some non-conforming encoders,
   some bitstreams were found with 40 extra bits corresponding to
   "error_status" and "slice_crc_parity". A decoder conforming to the
   revised specification could not tell the difference between a
   revised bitstream and a buggy bitstream.

4.5.  Slice Header

   A "Slice Header" provides information about the decoding
   configuration of the "Slice", such as its spatial position, size, and
   aspect ratio. The pseudo-code below describes the contents of the
   "Slice Header".

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   SliceHeader( ) {                                              |
       slice_x                                                   | ur
       slice_y                                                   | ur
       slice_width - 1                                           | ur
       slice_height - 1                                          | ur
       for (i = 0; i < quant_table_set_index_count; i++) {       |
           quant_table_set_index[ i ]                            | ur
       }                                                         |
       picture_structure                                         | ur
       sar_num                                                   | ur
       sar_den                                                   | ur
       if (version >= 4) {                                       |
           reset_contexts                                        | br
           slice_coding_mode                                     | ur
       }                                                         |
   }                                                             |

4.5.1.  slice_x

   "slice_x" indicates the x position on the slice raster formed by
   num_h_slices.

   Inferred to be 0 if not present.

4.5.2.  slice_y

   "slice_y" indicates the y position on the slice raster formed by
   num_v_slices.

   Inferred to be 0 if not present.

4.5.3.  slice_width

   "slice_width" indicates the width on the slice raster formed by
   num_h_slices.

   Inferred to be 1 if not present.

4.5.4.  slice_height

   "slice_height" indicates the height on the slice raster formed by
   num_v_slices.

   Inferred to be 1 if not present.

4.5.5.
quant_table_set_index_count 1730 "quant_table_set_index_count" is defined as "1 + ( ( chroma_planes || 1731 version <= 3 ) ? 1 : 0 ) + ( extra_plane ? 1 : 0 )". 1733 4.5.6. quant_table_set_index 1735 "quant_table_set_index" indicates the Quantization Table Set index to 1736 select the Quantization Table Set and the initial states for the 1737 slice. 1739 Inferred to be 0 if not present. 1741 4.5.7. picture_structure 1743 "picture_structure" specifies the temporal and spatial relationship 1744 of each "Line" of the "Frame". 1746 Inferred to be 0 if not present. 1748 +-------+-------------------------+ 1749 | value | picture structure used | 1750 +=======+=========================+ 1751 | 0 | unknown | 1752 +-------+-------------------------+ 1753 | 1 | top field first | 1754 +-------+-------------------------+ 1755 | 2 | bottom field first | 1756 +-------+-------------------------+ 1757 | 3 | progressive | 1758 +-------+-------------------------+ 1759 | Other | reserved for future use | 1760 +-------+-------------------------+ 1762 Table 17 1764 4.5.8. sar_num 1766 "sar_num" specifies the "Sample" aspect ratio numerator. 1768 Inferred to be 0 if not present. 1770 A value of 0 means that aspect ratio is unknown. 1772 Encoders MUST write 0 if "Sample" aspect ratio is unknown. 1774 If "sar_den" is 0, decoders SHOULD ignore the encoded value and 1775 consider that "sar_num" is 0. 1777 4.5.9. sar_den 1779 "sar_den" specifies the "Sample" aspect ratio denominator. 1781 Inferred to be 0 if not present. 1783 A value of 0 means that aspect ratio is unknown. 1785 Encoders MUST write 0 if "Sample" aspect ratio is unknown. 1787 If "sar_num" is 0, decoders SHOULD ignore the encoded value and 1788 consider that "sar_den" is 0. 1790 4.5.10. reset_contexts 1792 "reset_contexts" indicates if slice contexts must be reset. 1794 Inferred to be 0 if not present. 1796 4.5.11. slice_coding_mode 1798 "slice_coding_mode" indicates the slice coding mode. 1800 Inferred to be 0 if not present. 
   +-------+-----------------------------+
   | value | slice coding mode           |
   +=======+=============================+
   | 0     | Range Coding or Golomb Rice |
   +-------+-----------------------------+
   | 1     | raw PCM                     |
   +-------+-----------------------------+
   | Other | reserved for future use     |
   +-------+-----------------------------+

                               Table 18

4.6.  Slice Content

   A "Slice Content" contains all "Line" elements that are part of the
   "Slice".

   Depending on the configuration, "Line" elements are ordered by
   "Plane" then by row (YCbCr) or by row then by "Plane" (RGB).

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   SliceContent( ) {                                             |
       if (colorspace_type == 0) {                               |
           for (p = 0; p < primary_color_count; p++) {           |
               for (y = 0; y < plane_pixel_height[ p ]; y++) {   |
                   Line( p, y )                                  |
               }                                                 |
           }                                                     |
       } else if (colorspace_type == 1) {                        |
           for (y = 0; y < slice_pixel_height; y++) {            |
               for (p = 0; p < primary_color_count; p++) {       |
                   Line( p, y )                                  |
               }                                                 |
           }                                                     |
       }                                                         |
   }                                                             |

4.6.1.  primary_color_count

   "primary_color_count" is defined as "1 + ( chroma_planes ? 2 : 0 ) +
   ( extra_plane ? 1 : 0 )".

4.6.2.  plane_pixel_height

   "plane_pixel_height[ p ]" is the height in "Pixels" of "Plane" p of
   the slice.

   The value of "plane_pixel_height[ 0 ]" and of "plane_pixel_height[ 1
   + ( chroma_planes ? 2 : 0 ) ]" is "slice_pixel_height".

   If "chroma_planes" is set to 1, the value of "plane_pixel_height[ 1
   ]" and of "plane_pixel_height[ 2 ]" is "ceil( slice_pixel_height /
   (1 << log2_v_chroma_subsample) )".

4.6.3.  slice_pixel_height

   "slice_pixel_height" is the height in "Pixels" of the slice.

   Its value is "floor( ( slice_y + slice_height ) * frame_pixel_height
   / num_v_slices ) - slice_pixel_y".

4.6.4.  slice_pixel_y

   "slice_pixel_y" is the slice vertical position in "Pixels".

   Its value is "floor( slice_y * frame_pixel_height / num_v_slices )".

4.7.  Line

   A "Line" is a list of the sample differences (relative to the
   predictor) of primary color components.  The pseudo-code below
   describes the contents of the "Line".

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   Line( p, y ) {                                                |
       if (colorspace_type == 0) {                               |
           for (x = 0; x < plane_pixel_width[ p ]; x++) {        |
               sample_difference[ p ][ y ][ x ]                  |
           }                                                     |
       } else if (colorspace_type == 1) {                        |
           for (x = 0; x < slice_pixel_width; x++) {             |
               sample_difference[ p ][ y ][ x ]                  |
           }                                                     |
       }                                                         |
   }                                                             |

4.7.1.  plane_pixel_width

   "plane_pixel_width[ p ]" is the width in "Pixels" of "Plane" p of the
   slice.

   The value of "plane_pixel_width[ 0 ]" and of "plane_pixel_width[ 1 +
   ( chroma_planes ? 2 : 0 ) ]" is "slice_pixel_width".

   If "chroma_planes" is set to 1, the value of "plane_pixel_width[ 1 ]"
   and of "plane_pixel_width[ 2 ]" is "ceil( slice_pixel_width / (1 <<
   log2_h_chroma_subsample) )".

4.7.2.  slice_pixel_width

   "slice_pixel_width" is the width in "Pixels" of the slice.

   Its value is "floor( ( slice_x + slice_width ) * frame_pixel_width /
   num_h_slices ) - slice_pixel_x".

4.7.3.  slice_pixel_x

   "slice_pixel_x" is the slice horizontal position in "Pixels".

   Its value is "floor( slice_x * frame_pixel_width / num_h_slices )".

4.7.4.  sample_difference

   "sample_difference[ p ][ y ][ x ]" is the sample difference for the
   "Sample" at "Plane" "p", y position "y", and x position "x".  The
   "Sample" value is computed based on the median predictor and context
   described in the section on Samples (#samples).

4.8.  Slice Footer

   A "Slice Footer" provides information about slice size and
   (optionally) parity.  The pseudo-code below describes the contents of
   the "Slice Footer".

   Note: "Slice Footer" is always byte aligned.

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   SliceFooter( ) {                                              |
       slice_size                                                | u(24)
       if (ec) {                                                 |
           error_status                                          | u(8)
           slice_crc_parity                                      | u(32)
       }                                                         |
   }                                                             |

4.8.1.  slice_size

   "slice_size" indicates the size of the slice in bytes.

   Note: This allows finding the start of slices before previous slices
   have been fully decoded, and allows parallel decoding as well as
   error resilience.

4.8.2.  error_status

   "error_status" specifies the error status.

   +-------+---------------------------------------+
   | value | error status                          |
   +=======+=======================================+
   | 0     | no error                              |
   +-------+---------------------------------------+
   | 1     | slice contains a correctable error    |
   +-------+---------------------------------------+
   | 2     | slice contains an uncorrectable error |
   +-------+---------------------------------------+
   | Other | reserved for future use               |
   +-------+---------------------------------------+

                        Table 19

4.8.3.  slice_crc_parity

   "slice_crc_parity" is 32 bits that are chosen so that the slice as a
   whole has a CRC remainder of 0.

   This is equivalent to storing the CRC remainder in the 32-bit parity.

   The CRC generator polynomial used is the standard IEEE CRC polynomial
   (0x104C11DB7), with initial value 0, without pre-inversion and
   without post-inversion.

4.9.  Quantization Table Set

   Each Quantization Table Set is stored as a series of run lengths: for
   each run of equal entries in the first half of a table, the number of
   entries minus 1 (represented as "len - 1" in the pseudo-code below)
   is stored using the method described in Range Non Binary Values
   (#range-non-binary-values).  The second half does not need to be
   stored, as it is identical to the first half with flipped sign.
   "scale" and "len_count[ i ][ j ]" are temporary values used for
   computing "context_count[ i ]" and are not used outside the
   Quantization Table Set pseudo-code.

   Example:

   Table: 0 0 1 1 1 1 2 2 -2 -2 -2 -1 -1 -1 -1 0

   Stored values: 1, 3, 1

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   QuantizationTableSet( i ) {                                   |
       scale = 1                                                 |
       for (j = 0; j < MAX_CONTEXT_INPUTS; j++) {                |
           QuantizationTable( i, j, scale )                      |
           scale *= 2 * len_count[ i ][ j ] - 1                  |
       }                                                         |
       context_count[ i ] = ceil( scale / 2 )                    |
   }                                                             |

   MAX_CONTEXT_INPUTS is 5.

   pseudo-code                                                   | type
   --------------------------------------------------------------|-----
   QuantizationTable(i, j, scale) {                              |
       v = 0                                                     |
       for (k = 0; k < 128;) {                                   |
           len - 1                                               | ur
           for (a = 0; a < len; a++) {                           |
               quant_tables[ i ][ j ][ k ] = scale * v           |
               k++                                               |
           }                                                     |
           v++                                                   |
       }                                                         |
       for (k = 1; k < 128; k++) {                               |
           quant_tables[ i ][ j ][ 256 - k ] = \                 |
               -quant_tables[ i ][ j ][ k ]                      |
       }                                                         |
       quant_tables[ i ][ j ][ 128 ] = \                         |
           -quant_tables[ i ][ j ][ 127 ]                        |
       len_count[ i ][ j ] = v                                   |
   }                                                             |

4.9.1.  quant_tables

   "quant_tables[ i ][ j ][ k ]" indicates the quantization table value
   of the Quantized Sample Difference "k" of the Quantization Table "j"
   of the Quantization Table Set "i".

4.9.2.  context_count

   "context_count[ i ]" indicates the count of contexts for Quantization
   Table Set "i".  "context_count[ i ]" MUST be less than or equal to
   32768.

5.  Restrictions

   To ensure that fast multithreaded decoding is possible, starting with
   version 3 and if "frame_pixel_width * frame_pixel_height" is more
   than 101376, "slice_width * slice_height" MUST be less than or equal
   to "num_h_slices * num_v_slices / 4".
   Note: 101376 is the frame size in "Pixels" of a 352x288 frame, also
   known as the CIF ("Common Intermediate Format") frame size format.

   For each "Frame", each position in the slice raster MUST be filled by
   one and only one slice of the "Frame" (no missing slice position, no
   slice overlapping).

   For each "Frame" with a keyframe value of 0, each slice MUST have the
   same value of "slice_x, slice_y, slice_width, slice_height" as a
   slice in the previous "Frame", except if "reset_contexts" is 1.

6.  Security Considerations

   Like any other codec (such as [RFC6716]), FFV1 should not be used
   with insecure ciphers or cipher modes that are vulnerable to
   known-plaintext attacks.  Some of the header bits as well as the
   padding are easily predictable.

   Implementations of the FFV1 codec need to take appropriate security
   considerations into account, as outlined in [RFC4732].  It is
   extremely important for the decoder to be robust against malicious
   payloads.  Malicious payloads must not cause the decoder to overrun
   its allocated memory or to take an excessive amount of resources to
   decode.  The same applies to the encoder, even though problems in
   encoders are typically rarer.  Malicious video streams must not cause
   the encoder to misbehave because this would allow an attacker to
   attack transcoding gateways.  Another frequent security problem in
   image and video codecs is the failure to check for integer overflows
   in "Pixel" count computations, that is, allocating width * height
   without considering that the multiplication result may have
   overflowed the range of the arithmetic type.  The range coder could,
   if implemented naively, read one byte over the end.  The
   implementation must ensure that no read outside allocated and
   initialized memory occurs.
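
   The integer-overflow concern above can be illustrated with a minimal
   C sketch.  The helper names "checked_plane_size" and "alloc_plane"
   are hypothetical; they are not part of FFV1 or of the reference
   implementation [REFIMPL].  The sketch assumes that "size_t" is at
   least 32 bits wide and rejects any dimensions whose product would not
   fit in it, instead of allocating a silently truncated buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (an assumption for illustration, not from the
 * specification). Stores width * height * bytes_per_sample in *out and
 * returns 1, or returns 0 if an operand is zero or if the product
 * would overflow size_t. Assumes size_t is at least 32 bits wide. */
static int checked_plane_size(uint32_t width, uint32_t height,
                              size_t bytes_per_sample, size_t *out)
{
    size_t w = width;
    size_t h = height;
    size_t pixels;

    assert(out != NULL);

    if (w == 0 || h == 0 || bytes_per_sample == 0)
        return 0;

    /* w * h overflows exactly when w > SIZE_MAX / h. */
    if (w > SIZE_MAX / h)
        return 0;
    pixels = w * h;

    /* Same division-based check for the per-sample byte count. */
    if (pixels > SIZE_MAX / bytes_per_sample)
        return 0;

    *out = pixels * bytes_per_sample;
    return 1;
}

/* Hypothetical allocator: returns zero-initialized memory, or NULL if
 * the size computation overflows or the allocation fails. */
static void *alloc_plane(uint32_t width, uint32_t height,
                         size_t bytes_per_sample)
{
    size_t size;

    if (!checked_plane_size(width, height, bytes_per_sample, &size))
        return NULL;
    return calloc(1, size);
}
```

   The division-based comparison is used because it cannot itself
   overflow, and zero-initializing the buffer also avoids later reads of
   uninitialized memory.  A real decoder would additionally bound the
   dimensions by whatever resource limits its deployment requires.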

   The reference implementation [REFIMPL] contains no known buffer
   overflows or cases where a specially crafted packet or video segment
   could cause a significant increase in CPU load.

   The reference implementation [REFIMPL] was validated under the
   following conditions:

   *  Sending the decoder valid packets generated by the reference
      encoder and verifying that the decoder's output matches the
      encoder's input.

   *  Sending the decoder packets generated by the reference encoder and
      then subjected to random corruption.

   *  Sending the decoder random packets that are not FFV1.

   In all of the conditions above, the decoder and encoder were run
   inside the [VALGRIND] memory debugger as well as Clang's
   AddressSanitizer [Address-Sanitizer], which track reads and writes to
   invalid memory regions as well as the use of uninitialized memory.
   No errors were reported in any of the tested conditions.

7.  Media Type Definition

   This registration is done using the template defined in [RFC6838] and
   following [RFC4855].

   Type name: video

   Subtype name: FFV1

   Required parameters: None.

   Optional parameters:

   These parameters are used to signal the capabilities of a receiver
   implementation.  These parameters MUST NOT be used for any other
   purpose.

   version: The version of the FFV1 encoding as defined by the section
   on version (#version).

   micro_version: The micro_version of the FFV1 encoding as defined by
   the section on micro_version (#micro-version).

   coder_type: The coder_type of the FFV1 encoding as defined by the
   section on coder_type (#coder-type).

   colorspace_type: The colorspace_type of the FFV1 encoding as defined
   by the section on colorspace_type (#colorspace-type).

   bits_per_raw_sample: The bits_per_raw_sample of the FFV1 encoding as
   defined by the section on bits_per_raw_sample (#bits-per-raw-sample).

   max-slices: The value of max-slices is an integer indicating the
   maximum count of slices within a "Frame" of the FFV1 encoding.

   Encoding considerations:

   This media type is defined for encapsulation in several audiovisual
   container formats and contains binary data; see the section on
   "Mapping FFV1 into Containers" (#mapping-ffv1-into-containers).  This
   media type is framed binary data; see Section 4.8 of [RFC6838].

   Security considerations:

   See the "Security Considerations" section (#security-considerations)
   of this document.

   Interoperability considerations: None.

   Published specification:

   [I-D.ietf-cellar-ffv1] and RFC XXXX.

   [RFC Editor: Upon publication as an RFC, please replace "XXXX" with
   the number assigned to this document and remove this note.]

   Applications which use this media type:

   Any application that requires the transport of lossless video can use
   this media type.  Examples include, but are not limited to, screen
   recording, scientific imaging, and digital video preservation.

   Fragment identifier considerations: N/A.

   Additional information: None.

   Person & email address to contact for further information: Michael
   Niedermayer <michael@niedermayer.cc>

   Intended usage: COMMON

   Restrictions on usage: None.

   Author: Dave Rice <dave@dericed.com>

   Change controller: IETF cellar working group delegated from the IESG.

8.  IANA Considerations

   The IANA is requested to register the following values:

   *  Media type registration as described in Media Type Definition
      (#media-type-definition).

9.  Appendixes

9.1.  Decoder implementation suggestions

9.1.1.  Multi-threading Support and Independence of Slices

   The FFV1 bitstream is parsable in two ways: in sequential order as
   described in this document or with the pre-analysis of the footer of
   each slice.
   Each slice footer contains a slice_size field, so the boundary of
   each slice is computable without having to parse the slice content.
   That allows multi-threading as well as independence of slice content
   (a bitstream error in a slice header or slice content has no impact
   on the decoding of the other slices).

   After having checked the keyframe field, a decoder SHOULD parse the
   slice_size fields, from the slice_size of the last slice at the end
   of the "Frame" up to the slice_size of the first slice at the
   beginning of the "Frame", before parsing slices, in order to obtain
   the slice boundaries.  A decoder MAY fall back on sequential order,
   e.g., in case of a corrupted "Frame" (frame size unknown, slice_size
   of slices not coherent...) or if there is no possibility of seeking
   into the stream.

10.  Changelog

   See https://github.com/FFmpeg/FFV1/commits/master

11.  Normative References

   [I-D.ietf-cellar-ffv1]
              Niedermayer, M., Rice, D., and J. Martinez, "FFV1 Video
              Coding Format Version 0, 1, and 3", Work in Progress,
              Internet-Draft, draft-ietf-cellar-ffv1-09, 6 September
              2019.

   [ISO.15444-1.2016]
              International Organization for Standardization,
              "Information technology -- JPEG 2000 image coding system:
              Core coding system", October 2016.

   [ISO.9899.1990]
              International Organization for Standardization,
              "Programming languages - C", 1990.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4732]  Handley, M., Ed., Rescorla, E., Ed., and IAB, "Internet
              Denial-of-Service Considerations", RFC 4732,
              DOI 10.17487/RFC4732, December 2006.

   [RFC4855]  Casner, S., "Media Type Registration of RTP Payload
              Formats", RFC 4855, DOI 10.17487/RFC4855, February 2007.

   [RFC6716]  Valin, JM., Vos, K., and T. Terriberry, "Definition of the
              Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716,
              September 2012.

   [RFC6838]  Freed, N., Klensin, J., and T. Hansen, "Media Type
              Specifications and Registration Procedures", BCP 13,
              RFC 6838, DOI 10.17487/RFC6838, January 2013.

12.  Informative References

   [Address-Sanitizer]
              The Clang Team, "ASAN AddressSanitizer website", undated.

   [AVI]      Microsoft, "AVI RIFF File Reference", undated.

   [HuffYUV]  Rudiak-Gould, B., "HuffYUV", December 2003.

   [ISO.14495-1.1999]
              International Organization for Standardization,
              "Information technology -- Lossless and near-lossless
              compression of continuous-tone still images: Baseline",
              December 1999.

   [ISO.14496-10.2014]
              International Organization for Standardization,
              "Information technology -- Coding of audio-visual objects
              -- Part 10: Advanced Video Coding", September 2014.

   [ISO.14496-12.2015]
              International Organization for Standardization,
              "Information technology -- Coding of audio-visual objects
              -- Part 12: ISO base media file format", December 2015.

   [Matroska] IETF, "Matroska", 2016.

   [NUT]      Niedermayer, M., "NUT Open Container Format", December
              2013.

   [range-coding]
              Martin, G. N. N., "Range encoding: an algorithm for
              removing redundancy from a digitised message", July 1979.

   [REFIMPL]  Niedermayer, M., "The reference FFV1 implementation / the
              FFV1 codec in FFmpeg", undated.

   [VALGRIND] Valgrind Developers, "Valgrind website", undated.

   [YCbCr]    Wikipedia, "YCbCr", undated.

Authors' Addresses

   Michael Niedermayer

   Email: michael@niedermayer.cc

   Dave Rice

   Email: dave@dericed.com

   Jerome Martinez

   Email: jerome@mediaarea.net