cellar                                                    M. Niedermayer
Internet-Draft
Intended status: Standards Track                                 D. Rice
Expires: November 10, 2017
                                                             J. Martinez
                                                             May 9, 2017


                            FF Video Codec 1
                    draft-niedermayer-cellar-ffv1-02

Abstract

   This document defines FFV1, a lossless intra-frame video encoding
   format.  FFV1 is designed to efficiently compress video data in a
   variety of pixel formats.  Compared to uncompressed video, FFV1
   offers storage compression, frame fixity, and self-description, which
   makes FFV1 useful as a preservation or intermediate video format.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 10, 2017.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Notation and Conventions
     2.1.  Definitions
     2.2.  Conventions
       2.2.1.  Arithmetic operators
       2.2.2.  Assignment operators
       2.2.3.  Comparison operators
       2.2.4.  Mathematical functions
       2.2.5.  Order of operation precedence
       2.2.6.  Range
       2.2.7.  NumBytes
       2.2.8.  Bitstream functions
   3.  General Description
     3.1.  Border
     3.2.  Median predictor
     3.3.  Context
     3.4.  Quantization
     3.5.  Color space
       3.5.1.  YCbCr
       3.5.2.  JPEG2000-RCT
     3.6.  Coding of the sample difference
       3.6.1.  Range coding mode
       3.6.2.  Huffman coding mode
   4.  Bitstream
     4.1.  Configuration Record
       4.1.1.  reserved_for_future_use
       4.1.2.  configuration_record_crc_parity
       4.1.3.  Mapping FFV1 into Containers
     4.2.  Frame
     4.3.  Slice
     4.4.  Slice Header
       4.4.1.  slice_x
       4.4.2.  slice_y
       4.4.3.  slice_width
       4.4.4.  slice_height
       4.4.5.  quant_table_index_count
       4.4.6.  quant_table_index
       4.4.7.  picture_structure
       4.4.8.  sar_num
       4.4.9.  sar_den
       4.4.10. reset_contexts
       4.4.11. slice_coding_mode
     4.5.  Slice Content
       4.5.1.  primary_color_count
       4.5.2.  plane_pixel_height
       4.5.3.  slice_pixel_height
       4.5.4.  slice_pixel_y
     4.6.  Line
       4.6.1.  plane_pixel_width
       4.6.2.  slice_pixel_width
       4.6.3.  slice_pixel_x
     4.7.  Slice Footer
       4.7.1.  slice_size
       4.7.2.  error_status
       4.7.3.  slice_crc_parity
     4.8.  Parameters
       4.8.1.  version
       4.8.2.  micro_version
       4.8.3.  coder_type
       4.8.4.  state_transition_delta
       4.8.5.  colorspace_type
       4.8.6.  chroma_planes
       4.8.7.  bits_per_raw_sample
       4.8.8.  h_chroma_subsample
       4.8.9.  v_chroma_subsample
       4.8.10. alpha_plane
       4.8.11. num_h_slices
       4.8.12. num_v_slices
       4.8.13. quant_table_count
       4.8.14. states_coded
       4.8.15. initial_state_delta
       4.8.16. ec
       4.8.17. intra
     4.9.  Quantization Tables
       4.9.1.  quant_tables
       4.9.2.  context_count
       4.9.3.  Restrictions
   5.  Security Considerations
   6.  Appendixes
     6.1.  Decoder implementation suggestions
       6.1.1.  Multi-threading support and independence of slices
   7.  Changelog
   8.  ToDo
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses

1.  Introduction

   The FFV1 video codec is a simple and efficient lossless intra-frame
   only codec.

   The latest version of this document is available at

   This document assumes familiarity with mathematical and coding
   concepts such as Range coding [range-coding] and YCbCr color spaces
   [YCbCr].

2.  Notation and Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.1.  Definitions

   "ESC": An ESCape symbol to indicate that the symbol to be stored is
   too large for normal storage and that an alternate storage method is
   used.

   "MSB": Most Significant Bit, the bit that can cause the largest
   change in magnitude of the symbol.

   "RCT": Reversible Color Transform, a near linear, exactly reversible
   integer transform that converts between RGB and YCbCr representations
   of a sample.

   "VLC": Variable Length Code.

   "RGB": A reference to the method of storing the value of a sample by
   using three numeric values that represent Red, Green, and Blue.

   "YCbCr": A reference to the method of storing the value of a sample
   by using three numeric values that represent the luminance of the
   sample (Y) and the chrominance of the sample (Cb and Cr).

   "TBA": To Be Announced.  Used in reference to the development of
   future iterations of the FFV1 specification.

2.2.  Conventions

   Note: the operators and the order of precedence are the same as used
   in the C programming language [ISO.9899.1990].

2.2.1.  Arithmetic operators

   "a + b" means a plus b.

   "a - b" means a minus b.

   "-a" means negation of a.

   "a * b" means a multiplied by b.

   "a / b" means a divided by b.

   "a & b" means bit-wise "and" of a and b.

   "a | b" means bit-wise "or" of a and b.

   "a >> b" means arithmetic right shift of two's complement integer
   representation of a by b binary digits.

   "a << b" means arithmetic left shift of two's complement integer
   representation of a by b binary digits.

2.2.2.  Assignment operators

   "a = b" means a is assigned b.

   "a++" is equivalent to a is assigned a + 1.

   "a--" is equivalent to a is assigned a - 1.

   "a += b" is equivalent to a is assigned a + b.

   "a -= b" is equivalent to a is assigned a - b.

   "a *= b" is equivalent to a is assigned a * b.

2.2.3.  Comparison operators

   "a > b" means a is greater than b.

   "a >= b" means a is greater than or equal to b.

   "a < b" means a is less than b.

   "a <= b" means a is less than or equal to b.

   "a == b" means a is equal to b.

   "a != b" means a is not equal to b.

   "a && b" means Boolean logical "and" of a and b.

   "a || b" means Boolean logical "or" of a and b.

   "!a" means Boolean logical "not" of a.

   "a ? b : c" if a is true, then b, otherwise c.

2.2.4.  Mathematical functions

   floor(a)  the largest integer less than or equal to a

   ceil(a)  the smallest integer greater than or equal to a

   abs(a)  the absolute value of a, i.e. abs(a) = sign(a)*a

   log2(a)  the base-two logarithm of a

   min(a,b)  the smallest of two values a and b

   a_{b}  the b-th value of a sequence of a

   a_{b,c}  the 'b,c'-th value of a sequence of a

2.2.5.  Order of operation precedence

   When order of precedence is not indicated explicitly by use of
   parentheses, operations are evaluated in the following order (from
   top to bottom, operations of same precedence being evaluated from
   left to right).  This order of operations is based on the order of
   operations used in Standard C.

   a++, a--
   !a, -a
   a * b, a / b, a % b
   a + b, a - b
   a << b, a >> b
   a < b, a <= b, a > b, a >= b
   a == b, a != b
   a & b
   a | b
   a && b
   a || b
   a ? b : c
   a = b, a += b, a -= b, a *= b

2.2.6.  Range

   "a...b" means any value starting from a to b, inclusive.

2.2.7.  NumBytes

   NumBytes is a non-negative integer that expresses the size in 8-bit
   octets of particular FFV1 components such as the Configuration Record
   and Frame.  FFV1 relies on its container to store the NumBytes
   values, see Section 4.1.3.

2.2.8.  Bitstream functions

2.2.8.1.  remaining_bits_in_bitstream

   "remaining_bits_in_bitstream( )" means the count of remaining bits
   after the current position in that bitstream component.  It is
   computed from the NumBytes value multiplied by 8 minus the count of
   bits of that component already read by the bitstream parser.

2.2.8.2.  byte_aligned

   "byte_aligned( )" is true if "remaining_bits_in_bitstream( NumBytes
   )" is a multiple of 8, otherwise false.

3.  General Description

   Samples within a plane are coded in raster scan order (left->right,
   top->bottom).  Each sample is predicted by the median predictor from
   samples in the same plane and the difference is stored, see
   Section 3.6.

3.1.  Border

   For the purpose of the predictor and context, samples above the coded
   slice are assumed to be 0; samples to the right of the coded slice
   are identical to the closest left sample; samples to the left of the
   coded slice are identical to the top right sample (if there is one),
   otherwise 0.

   +---+---+---+---+---+---+---+---+
   | 0 | 0 |   | 0 | 0 | 0 |   | 0 |
   | 0 | 0 |   | 0 | 0 | 0 |   | 0 |
   |   |   |   |   |   |   |   |   |
   | 0 | 0 |   | a | b | c |   | c |
   | 0 | a |   | d |   | e |   | e |
   | 0 | d |   | f | g | h |   | h |
   +---+---+---+---+---+---+---+---+

3.2.  Median predictor

   median(left, top, left + top - diag)

   left, top, diag are the left, top and left-top samples

   Note, this is also used in [ISO.14495-1.1999] and [HuffYUV].
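
   The predictor above can be sketched in C as follows.  This is an
   illustration only: the function names median3 and predict are
   hypothetical and do not appear in this specification, and the sketch
   covers only the general formula, not any configuration-specific
   variant.

```c
/* Median of three integers. */
static int median3(int a, int b, int c)
{
    if (a > b) { int t = a; a = b; b = t; }  /* ensure a <= b */
    if (b > c) b = c;                         /* b = min(b, c) */
    return a > b ? a : b;                     /* larger of a and min(b, c) */
}

/* Hypothetical helper: predicted value of the current sample from its
 * left, top and left-top (diag) neighbours in the same plane. */
static int predict(int left, int top, int diag)
{
    return median3(left, top, left + top - diag);
}
```

   For example, with left = 10, top = 20 and diag = 15, the gradient
   term left + top - diag is 15 and the predicted value is
   median(10, 20, 15) = 15.
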

   Exception for the median predictor: if colorspace_type == 0 &&
   bits_per_raw_sample == 16 && ( coder_type == 1 || coder_type == 2 ),
   the following median predictor MUST be used:

   median(left16s, top16s, left16s + top16s - diag16s)

   with:

   o  left16s = left >= 32768 ? ( left - 65536 ) : left

   o  top16s = top >= 32768 ? ( top - 65536 ) : top

   o  diag16s = diag >= 32768 ? ( diag - 65536 ) : diag

   Background: a two's complement 16-bit signed integer was used for
   storing pixel values in all known implementations of the FFV1
   bitstream.  So in some circumstances, the most significant bit was
   wrongly interpreted (used as a sign bit instead of the 16th bit of an
   unsigned integer).  Note that when the issue was discovered, the only
   impacted configuration in all known implementations was 16-bit YCbCr
   color space with the Range coder, as the other potentially impacted
   configurations (e.g. 15/16-bit JPEG2000-RCT color space with the
   Range coder, or any 16-bit color space with the Golomb Rice coder)
   were not implemented anywhere.  Meanwhile, 16-bit JPEG2000-RCT color
   space with the Range coder was implemented without this issue in one
   implementation and validated by one conformance checker.  This
   exception for the median predictor is expected (to be confirmed) to
   be removed in the next version of the bitstream.

3.3.  Context

   +---+---+---+---+
   |   |   | T |   |
   +---+---+---+---+
   |   |tl | t |tr |
   +---+---+---+---+
   | L | l | X |   |
   +---+---+---+---+

   The quantized sample differences L-l, l-tl, tl-t, t-T, t-tr are used
   as context:

   context = Q_0[l-tl] +
             abs(Q_0) * ( Q_1[tl-t] +
             abs(Q_1) * ( Q_2[t-tr] +
             abs(Q_2) * ( Q_3[L-l] +
             abs(Q_3) * Q_4[T-t] )))

   If the context is smaller than 0 then -context is used and the
   difference between the sample and its predicted value is encoded with
   a flipped sign.

3.4.  Quantization

   There are 5 quantization tables for the 5 sample differences; both
   the number of quantization steps and their distribution are stored in
   the bitstream.  Each quantization table has exactly 256 entries, and
   the 8 least significant bits of the sample difference are used as
   index:

   Q_{i}[a - b] = Table_{i}[(a - b)&255]

3.5.  Color space

   FFV1 supports two color spaces: YCbCr and JPEG2000-RCT.  Both color
   spaces allow an optional Alpha plane that can be used to code
   transparency data.

3.5.1.  YCbCr

   In YCbCr color space, the Cb and Cr planes are optional, but if used
   then MUST be used together.  Omitting the Cb and Cr planes codes the
   frames in grayscale without color data.  An FFV1 frame using YCbCr
   MUST use one of the following arrangements:

   o  Y

   o  Y, Alpha

   o  Y, Cb, Cr

   o  Y, Cb, Cr, Alpha

   When FFV1 uses the YCbCr color space, the Y plane MUST be coded
   first.  If the Cb and Cr planes are used then they MUST be coded
   after the Y plane.  If an Alpha (transparency) plane is used, then it
   MUST be coded last.

3.5.2.  JPEG2000-RCT

   JPEG2000-RCT is a Reversible Color Transform that codes RGB (red,
   green, blue) planes losslessly in a modified YCbCr color space.
   Reversible conversions between YCbCr and RGB use the following
   formulae.

   Cb=b-g

   Cr=r-g

   Y=g+((Cb+Cr)>>2)

   g=Y-((Cb+Cr)>>2)

   r=Cr+g

   b=Cb+g

   Exception for the reversible conversions between YCbCr and RGB: if
   bits_per_raw_sample is between 9 and 15 inclusive, the following
   formulae for reversible conversions between YCbCr and RGB MUST be
   used instead of the ones above:

   Cb=g-b

   Cr=r-b

   Y=b+((Cb+Cr)>>2)

   b=Y-((Cb+Cr)>>2)

   r=Cr+b

   g=Cb+b

   Background: At the time of this writing, in all known implementations
   of the FFV1 bitstream, when bits_per_raw_sample was between 9 and 15
   inclusive, GBR planes were used as BGR planes during both encoding
   and decoding.
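
   As a hedged illustration, the two sets of JPEG2000-RCT formulae can
   be sketched in C as shown below.  The function names are
   hypothetical, and the sketch assumes that the shift applies only to
   the sum (Cb+Cr), as in the JPEG 2000 reversible transform.  Because
   right-shifting a negative value is implementation-defined in C, the
   arithmetic shift by 2 is written as a floor division by 4.

```c
/* Floor division by 4: the arithmetic ">> 2" of Section 2.2.1,
 * written without shifting negative values. */
static int asr2(int v)
{
    return v >= 0 ? v / 4 : -((-v + 3) / 4);
}

/* Default formulae (bits_per_raw_sample outside 9...15). */
static void rct_forward(int r, int g, int b, int *y, int *cb, int *cr)
{
    *cb = b - g;
    *cr = r - g;
    *y  = g + asr2(*cb + *cr);
}

static void rct_inverse(int y, int cb, int cr, int *r, int *g, int *b)
{
    *g = y - asr2(cb + cr);
    *r = cr + *g;
    *b = cb + *g;
}

/* Exception for bits_per_raw_sample in 9...15: the same arithmetic
 * with the roles of g and b exchanged. */
static void rct_forward_9_15(int r, int g, int b, int *y, int *cb, int *cr)
{
    rct_forward(r, b, g, y, cb, cr);
}

static void rct_inverse_9_15(int y, int cb, int cr, int *r, int *g, int *b)
{
    rct_inverse(y, cb, cr, r, b, g);
}
```

   With r = 200, g = 100, b = 50, the default formulae give Cb = -50,
   Cr = 100 and Y = 112, and the inverse recovers the original r, g and
   b exactly.
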

   Meanwhile, 16-bit JPEG2000-RCT color space was implemented without
   this issue in one implementation and validated by one conformance
   checker.  Methods to address this exception for the transform are
   under consideration for the next version of the bitstream.

   [ISO.15444-1.2016]

   An FFV1 frame using JPEG2000-RCT MUST use one of the following
   arrangements:

   o  Y, Cb, Cr

   o  Y, Cb, Cr, Alpha

   When FFV1 uses the JPEG2000-RCT color space, the horizontal lines are
   interleaved to improve caching efficiency since it is most likely
   that the RCT will immediately be converted to RGB during decoding.
   The interleaved coding order is also Y, then Cb, then Cr, and then if
   used Alpha.

   As an example, a frame that is two pixels wide and two pixels high
   could be composed of the following structure:

   +------------------------+------------------------+
   | Pixel[1,1]             | Pixel[2,1]             |
   | Y[1,1] Cb[1,1] Cr[1,1] | Y[2,1] Cb[2,1] Cr[2,1] |
   +------------------------+------------------------+
   | Pixel[1,2]             | Pixel[2,2]             |
   | Y[1,2] Cb[1,2] Cr[1,2] | Y[2,2] Cb[2,2] Cr[2,2] |
   +------------------------+------------------------+

   In JPEG2000-RCT color space, the coding order would be left to right
   and then top to bottom, with values interleaved by lines and stored
   in this order:

   Y[1,1] Y[2,1] Cb[1,1] Cb[2,1] Cr[1,1] Cr[2,1] Y[1,2] Y[2,2] Cb[1,2]
   Cb[2,2] Cr[1,2] Cr[2,2]

3.6.  Coding of the sample difference

   Instead of coding the n+1 bits of the sample difference with Huffman
   or Range coding (or n+2 bits, in the case of RCT), only the n (or
   n+1) least significant bits are used, since this is sufficient to
   recover the original sample.  In the equation below, the term "bits"
   represents bits_per_raw_sample+1 for RCT or bits_per_raw_sample
   otherwise:

   coder_input =
       [(sample_difference + 2^(bits-1)) & (2^bits - 1)] - 2^(bits-1)

3.6.1.  Range coding mode

   Early experimental versions of FFV1 used the CABAC Arithmetic coder
   from H.264 as defined in [ISO.14496-10.2014] but due to the uncertain
   patent/royalty situation, as well as its slightly worse performance,
   CABAC was replaced by a Range coder based on an algorithm defined by
   G. Nigel N. Martin in 1979 [range-coding].

3.6.1.1.  Range binary values

   To encode binary digits efficiently a Range coder is used.  "C_{i}"
   is the i-th Context.  "B_{i}" is the i-th byte of the bytestream.
   "b_{i}" is the i-th Range coded binary value, "S_{0,i}" is the i-th
   initial state, which is 128.  The length of the bytestream encoding n
   binary symbols is "j_{n}" bytes.

   r_{i} = floor( ( R_{i} * S_{i,C_{i}} ) / 2^8 )

   S_{i+1,C_{i}} = zero_state_{S_{i,C_{i}}} AND
             l_i = L_i                      AND
             t_i = R_i - r_i                <==
             b_i = 0                        <==>
             L_i < R_i - r_i

   S_{i+1,C_{i}} = one_state_{S_{i,C_{i}}}  AND
             l_i = L_i - R_i + r_i          AND
             t_i = r_i                      <==
             b_i = 1                        <==>
             L_i >= R_i - r_i

   S_{i+1,k} = S_{i,k} <== C_i != k

   R_{i+1} = 2^8 * t_{i}                    AND
   L_{i+1} = 2^8 * l_{i} + B_{j_{i}}        AND
   j_{i+1} = j_{i} + 1                      <==
     t_{i} < 2^8

   R_{i+1} = t_{i}                          AND
   L_{i+1} = l_{i}                          AND
   j_{i+1} = j_{i}                          <==
     t_{i} >= 2^8

   R_{0} = 65280

   L_{0} = 2^8 * B_{0} + B_{1}

   j_{0} = 2

3.6.1.2.  Range non binary values

   To encode scalar integers, it would be possible to encode each bit
   separately and use the past bits as context.  However, that would
   mean 255 contexts per 8-bit symbol, which is not only a waste of
   memory but also requires more past data to reach a reasonably good
   estimate of the probabilities.  Alternatively, assuming a Laplacian
   distribution and dealing only with its variance and mean (as in
   Huffman coding) would also be possible.  However, for maximum
   flexibility and simplicity, the chosen method uses a single symbol to
   encode whether a number is 0 and, if it is not, encodes the number
   using its exponent, mantissa and sign.  The exact contexts used are
   best described by the following code, followed by some comments.

   function                                                      | type
   --------------------------------------------------------------|-----
   void put_symbol(RangeCoder *c, uint8_t *state, int v, int \   |
   is_signed) {                                                  |
       int i;                                                    |
       put_rac(c, state+0, !v);                                  |
       if (v) {                                                  |
           int a= abs(v);                                        |
           int e= log2(a);                                       |
                                                                 |
           for (i=0; i<e; i++)                                   |
               put_rac(c, state+1+min(i,9), 1);  //1..10         |
                                                                 |
           put_rac(c, state+1+min(i,9), 0);                      |
           for (i=e-1; i>=0; i--)                                |
               put_rac(c, state+22+min(i,9), (a>>i)&1); //22..31 |
                                                                 |
           if (is_signed)                                        |
               put_rac(c, state+11 + min(e, 10), v < 0); //11..21|
       }                                                         |
   }                                                             |

3.6.1.3.  Initial values for the context model

   At keyframes all Range coder state variables are set to their initial
   state.

3.6.1.4.  State transition table

   one_state_{i} =
       default_state_transition_{i} + state_transition_delta_{i}

   zero_state_{i} = 256 - one_state_{256-i}

3.6.1.5.  default_state_transition

     0,  0,  0,  0,  0,  0,  0,  0, 20, 21, 22, 23, 24, 25, 26, 27,
    28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 37, 38, 39, 40, 41, 42,
    43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 56, 57,
    58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73,
    74, 75, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88,
    89, 90, 91, 92, 93, 94, 94, 95, 96, 97, 98, 99,100,101,102,103,
   104,105,106,107,108,109,110,111,112,113,114,114,115,116,117,118,
   119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,133,
   134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,
   150,151,152,152,153,154,155,156,157,158,159,160,161,162,163,164,
   165,166,167,168,169,170,171,171,172,173,174,175,176,177,178,179,
   180,181,182,183,184,185,186,187,188,189,190,190,191,192,194,194,
   195,196,197,198,199,200,201,202,202,204,205,206,207,208,209,209,
   210,211,212,213,215,215,216,217,218,219,220,220,222,223,224,225,
   226,227,227,229,229,230,231,232,234,234,235,236,237,238,239,240,
   241,242,243,244,245,246,247,248,248,  0,  0,  0,  0,  0,  0,  0,

3.6.1.6.  alternative state transition table

   The alternative state transition table has been built using iterative
   minimization of frame sizes and generally performs better than the
   default.  To use it, the coder_type MUST be set to 2 and the
   difference to the default MUST be stored in the parameters.  The
   reference implementation of FFV1 in FFmpeg uses this table by default
   at the time of this writing when Range coding is used.
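
   As a hedged sketch of what "the difference to the default" means,
   the C fragment below derives per-state deltas from the first 16
   entries of the default table (Section 3.6.1.5) and of the alternative
   table of this section, then rebuilds one_state with the formula of
   Section 3.6.1.4.  The names compute_delta, apply_delta, default_head
   and alternative_head are hypothetical, and the sketch does not show
   how the deltas are written to the bitstream.

```c
enum { HEAD = 16 };  /* only the first 16 of the 256 entries are used */

/* First 16 entries of default_state_transition. */
static const int default_head[HEAD] = {
    0, 0, 0, 0, 0, 0, 0, 0, 20, 21, 22, 23, 24, 25, 26, 27
};

/* First 16 entries of the alternative state transition table. */
static const int alternative_head[HEAD] = {
    0, 10, 10, 10, 10, 16, 16, 16, 28, 16, 16, 29, 42, 49, 20, 49
};

/* Encoder side: difference to the default, one value per state.
 * The result can be negative. */
static void compute_delta(const int *def, const int *alt, int *delta, int n)
{
    for (int i = 0; i < n; i++)
        delta[i] = alt[i] - def[i];
}

/* Decoder side:
 * one_state_{i} = default_state_transition_{i} + delta_{i}. */
static void apply_delta(const int *def, const int *delta, int *one_state,
                        int n)
{
    for (int i = 0; i < n; i++)
        one_state[i] = def[i] + delta[i];
}
```

   For these entries the deltas start 0, 10, 10, 10, 10, 16, 16, 16, 8,
   -5, so a signed representation is needed for the difference.
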

     0, 10, 10, 10, 10, 16, 16, 16, 28, 16, 16, 29, 42, 49, 20, 49,
    59, 25, 26, 26, 27, 31, 33, 33, 33, 34, 34, 37, 67, 38, 39, 39,
    40, 40, 41, 79, 43, 44, 45, 45, 48, 48, 64, 50, 51, 52, 88, 52,
    53, 74, 55, 57, 58, 58, 74, 60,101, 61, 62, 84, 66, 66, 68, 69,
    87, 82, 71, 97, 73, 73, 82, 75,111, 77, 94, 78, 87, 81, 83, 97,
    85, 83, 94, 86, 99, 89, 90, 99,111, 92, 93,134, 95, 98,105, 98,
   105,110,102,108,102,118,103,106,106,113,109,112,114,112,116,125,
   115,116,117,117,126,119,125,121,121,123,145,124,126,131,127,129,
   165,130,132,138,133,135,145,136,137,139,146,141,143,142,144,148,
   147,155,151,149,151,150,152,157,153,154,156,168,158,162,161,160,
   172,163,169,164,166,184,167,170,177,174,171,173,182,176,180,178,
   175,189,179,181,186,183,192,185,200,187,191,188,190,197,193,196,
   197,194,195,196,198,202,199,201,210,203,207,204,205,206,208,214,
   209,211,221,212,213,215,224,216,217,218,219,220,222,228,223,225,
   226,224,227,229,240,230,231,232,233,234,235,236,238,239,237,242,
   241,243,242,244,245,246,247,248,249,250,251,252,252,253,254,255,

3.6.2.  Huffman coding mode

   This coding mode uses Golomb Rice codes.  The VLC is split into 2
   parts: the prefix stores the most significant bits, and the suffix
   stores the k least significant bits or stores the whole number in the
   ESC case.  The end of the bitstream (of the frame) is filled with
   0-bits until the bitstream contains a multiple of 8 bits.

3.6.2.1.  Prefix

   +----------------+-------+
   | bits           | value |
   +----------------+-------+
   | 1              | 0     |
   | 01             | 1     |
   | ...            | ...   |
   | 0000 0000 0001 | 11    |
   | 0000 0000 0000 | ESC   |
   +----------------+-------+

3.6.2.2.  Suffix

   +-------+-----------------------------------------------------------+
   | non   | the k least significant bits MSB first                    |
   | ESC   |                                                           |
   | ESC   | the value - 11, in MSB first order, ESC may only be used  |
   |       | if the value cannot be coded as non ESC                   |
   +-------+-----------------------------------------------------------+

3.6.2.3.  Examples

   +-----+-------------------------+-------+
   | k   | bits                    | value |
   +-----+-------------------------+-------+
   | 0   | "1"                     | 0     |
   | 0   | "001"                   | 2     |
   | 2   | "1 00"                  | 0     |
   | 2   | "1 10"                  | 2     |
   | 2   | "01 01"                 | 5     |
   | any | "000000000000 10000000" | 139   |
   +-----+-------------------------+-------+

3.6.2.4.  Run mode

   Run mode is entered when the context is 0 and left as soon as a non-0
   difference is found.  The level is identical to the predicted one.
   The run and the first different level are coded.

3.6.2.5.  Run length coding

   The run value is encoded in 2 parts: the prefix part stores the more
   significant part of the run and also adjusts the run_index, which
   determines the number of bits in the less significant part of the
   run.  The 2nd part of the value stores the less significant part of
   the run as it is.  The run_index is reset to 0 for each plane and
   slice.
741 function | type 742 --------------------------------------------------------------|----- 743 log2_run[41]={ | 744 0, 0, 0, 0, 1, 1, 1, 1, | 745 2, 2, 2, 2, 3, 3, 3, 3, | 746 4, 4, 5, 5, 6, 6, 7, 7, | 747 8, 9,10,11,12,13,14,15, | 748 16,17,18,19,20,21,22,23, | 749 24, | 750 }; | 751 | 752 if (run_count == 0 && run_mode == 1) { | 753 if (get_bits1()) { | 754 run_count = 1 << log2_run[run_index]; | 755 if (x + run_count <= w) | 756 run_index++; | 757 } else { | 758 if (log2_run[run_index]) | 759 run_count = get_bits(log2_run[run_index]); | 760 else | 761 run_count = 0; | 762 if (run_index) | 763 run_index--; | 764 run_mode = 2; | 765 } | 766 } | 768 The log2_run function is also used within [ISO.14495-1.1999]. 770 3.6.2.6. Level coding 772 Level coding is identical to the normal difference coding with the 773 exception that the 0 value is removed as it cannot occur: 775 if (diff>0) diff--; 776 encode(diff); 778 Note, this is different from JPEG-LS, which doesn't use prediction in 779 run mode and uses a different encoding and context model for the last 780 difference On a small set of test samples the use of prediction 781 slightly improved the compression rate. 783 4. 
Bitstream 784 +--------+----------------------------------------------------------+ 785 | Symbol | Definition | 786 +--------+----------------------------------------------------------+ 787 | u(n) | unsigned big endian integer using n bits | 788 | sg | Golomb Rice coded signed scalar symbol coded with the | 789 | | method described in Section 3.6.2 | 790 | br | Range coded Boolean (1-bit) symbol with the method | 791 | | described in Section 3.6.1.1 | 792 | ur | Range coded unsigned scalar symbol coded with the method | 793 | | described in Section 3.6.1.2 | 794 | sr | Range coded signed scalar symbol coded with the method | 795 | | described in Section 3.6.1.2 | 796 +--------+----------------------------------------------------------+ 798 The same context which is initialized to 128 is used for all fields 799 in the header. 801 The following MUST be provided by external means during 802 initialization of the decoder: 804 "frame_pixel_width" is defined as frame width in pixels. 806 "frame_pixel_height" is defined as frame height in pixels. 808 Default values at the decoder initialization phase: 810 "ConfigurationRecordIsPresent" is set to 0. 812 4.1. Configuration Record 814 In the case of a bitstream with "version >= 3", a Configuration 815 Record is stored in the underlying container, at the track header 816 level. It contains the parameters used for all frames. The size of 817 the Configuration Record, NumBytes, is supplied by the underlying 818 container. 820 function | type 821 --------------------------------------------------------------|----- 822 ConfigurationRecord( NumBytes ) { | 823 ConfigurationRecordIsPresent = 1 | 824 Parameters( ) | 825 while( remaining_bits_in_bitstream( NumBytes ) > 32 ) | 826 reserved_for_future_use | u(1) 827 configuration_record_crc_parity | u(32) 828 } | 830 4.1.1. reserved_for_future_use 832 "reserved_for_future_use" has semantics that are reserved for future 833 use. 
Encoders conforming to this version of this specification SHALL NOT
write this value.  Decoders conforming to this version of this
specification SHALL ignore its value.

4.1.2.  configuration_record_crc_parity

"configuration_record_crc_parity" is 32 bits that are chosen so that
the Configuration Record as a whole has a CRC remainder of 0.  This
is equivalent to storing the CRC remainder in the 32-bit parity.  The
CRC generator polynomial used is the standard IEEE CRC polynomial
(0x104C11DB7) with initial value 0.

4.1.3.  Mapping FFV1 into Containers

This Configuration Record can be placed in any file format supporting
Configuration Records, following as closely as possible the way that
file format stores Configuration Records.  The Configuration Record
storage place and NumBytes are currently defined and supported by
this version of this specification for the following container
formats:

4.1.3.1.  In AVI File Format

The Configuration Record extends the stream format chunk ("AVI ",
"hdlr", "strl", "strf") with the ConfigurationRecord bitstream.  See
[AVI] for more information about chunks.

"NumBytes" is defined as the size, in bytes, of the strf chunk
indicated in the chunk header minus the size of the stream format
structure.

4.1.3.2.  In ISO/IEC 14496-12 (MP4 File Format)

The Configuration Record extends the sample description box ("moov",
"trak", "mdia", "minf", "stbl", "stsd") with a "glbl" box which
contains the ConfigurationRecord bitstream.  See [ISO.14496-12.2015]
for more information about boxes.

"NumBytes" is defined as the size, in bytes, of the "glbl" box
indicated in the box header minus the size of the box header.

4.1.3.3.  In NUT File Format

The codec_specific_data element (in "stream_header" packet) contains
the ConfigurationRecord bitstream.  See [NUT] for more information
about elements.

"NumBytes" is defined as the size, in bytes, of the
codec_specific_data element as indicated in the "length" field of
codec_specific_data.

4.1.3.4.  In Matroska File Format

FFV1 SHOULD use "V_FFV1" as the Matroska "Codec ID".  For FFV1
versions 2 or less, the Matroska "CodecPrivate" Element SHOULD NOT be
used.  For FFV1 versions 3 or greater, the Matroska "CodecPrivate"
Element MUST contain the FFV1 Configuration Record structure and no
other data.  See [Matroska] for more information about elements.

4.2.  Frame

A frame consists of the keyframe field, parameters (if version <= 1),
and a sequence of independent slices.

function                                                      | type
--------------------------------------------------------------|-----
Frame( NumBytes ) {                                           |
    keyframe                                                  | br
    if (keyframe && !ConfigurationRecordIsPresent)            |
        Parameters( )                                         |
    while ( remaining_bits_in_bitstream( NumBytes ) )         |
        Slice( )                                              |
}                                                             |

4.3.  Slice

function                                                      | type
--------------------------------------------------------------|-----
Slice( ) {                                                    |
    if (version >= 3)                                         |
        SliceHeader( )                                        |
    SliceContent( )                                           |
    if (coder_type == 0)                                      |
        while (!byte_aligned())                               |
            padding                                           | u(1)
    if (version >= 3)                                         |
        SliceFooter( )                                        |
}                                                             |

"padding" specifies a bit without any significance that is used only
for byte alignment.  It MUST be 0.

4.4.  Slice Header

function                                                      | type
--------------------------------------------------------------|-----
SliceHeader( ) {                                              |
    slice_x                                                   | ur
    slice_y                                                   | ur
    slice_width - 1                                           | ur
    slice_height - 1                                          | ur
    for( i = 0; i < quant_table_index_count; i++ )            |
        quant_table_index [ i ]                               | ur
    picture_structure                                         | ur
    sar_num                                                   | ur
    sar_den                                                   | ur
    if (version >= 4) {                                       |
        reset_contexts                                        | br
        slice_coding_mode                                     | ur
    }                                                         |
}                                                             |

4.4.1.  slice_x

"slice_x" indicates the x position on the slice raster formed by
num_h_slices.  Inferred to be 0 if not present.

4.4.2.  slice_y

"slice_y" indicates the y position on the slice raster formed by
num_v_slices.  Inferred to be 0 if not present.

4.4.3.  slice_width

"slice_width" indicates the width on the slice raster formed by
num_h_slices.  Inferred to be 1 if not present.

4.4.4.  slice_height

"slice_height" indicates the height on the slice raster formed by
num_v_slices.  Inferred to be 1 if not present.

4.4.5.  quant_table_index_count

"quant_table_index_count" is defined as 1 + ( ( chroma_planes ||
version <= 3 ) ? 1 : 0 ) + ( alpha_plane ? 1 : 0 ).

4.4.6.  quant_table_index

"quant_table_index" indicates the index to select the quantization
table set and the initial states for the slice.  Inferred to be 0 if
not present.

4.4.7.  picture_structure

"picture_structure" specifies the picture structure.  Inferred to be
0 if not present.

+-------+-------------------------+
| value | picture structure used  |
+-------+-------------------------+
| 0     | unknown                 |
| 1     | top field first         |
| 2     | bottom field first      |
| 3     | progressive             |
| Other | reserved for future use |
+-------+-------------------------+

4.4.8.  sar_num

"sar_num" specifies the sample aspect ratio numerator.  Inferred to
be 0 if not present.  MUST be 0 if sample aspect ratio is unknown.

4.4.9.  sar_den

"sar_den" specifies the sample aspect ratio denominator.  Inferred to
be 0 if not present.  MUST be 0 if sample aspect ratio is unknown.

4.4.10.  reset_contexts

"reset_contexts" indicates if slice contexts must be reset.  Inferred
to be 0 if not present.

4.4.11.  slice_coding_mode

"slice_coding_mode" indicates the slice coding mode.  Inferred to be
0 if not present.

+-------+----------------------------+
| value | slice coding mode          |
+-------+----------------------------+
| 0     | normal Range Coding or VLC |
| 1     | raw PCM                    |
| Other | reserved for future use    |
+-------+----------------------------+

4.5.  Slice Content

function                                                      | type
--------------------------------------------------------------|-----
SliceContent( ) {                                             |
    if (colorspace_type == 0) {                               |
        for( p = 0; p < primary_color_count; p++ )            |
            for( y = 0; y < plane_pixel_height[ p ]; y++ )    |
                Line( p, y )                                  |
    } else if (colorspace_type == 1) {                        |
        for( y = 0; y < slice_pixel_height; y++ )             |
            for( p = 0; p < primary_color_count; p++ )        |
                Line( p, y )                                  |
    }                                                         |
}                                                             |

4.5.1.  primary_color_count

"primary_color_count" is defined as 1 + ( chroma_planes ? 2 : 0 ) + (
alpha_plane ? 1 : 0 ).

4.5.2.  plane_pixel_height

"plane_pixel_height[ p ]" is the height in pixels of plane p of the
slice.  The value of "plane_pixel_height[ 0 ]" and
"plane_pixel_height[ 1 + ( chroma_planes ? 2 : 0 ) ]" is
"slice_pixel_height".  If "chroma_planes" is set to 1, the value of
"plane_pixel_height[ 1 ]" and "plane_pixel_height[ 2 ]" is
"ceil(slice_pixel_height / v_chroma_subsample)".

4.5.3.  slice_pixel_height

"slice_pixel_height" is the height in pixels of the slice.  Its value
is "floor(( slice_y + slice_height ) * frame_pixel_height /
num_v_slices) - slice_pixel_y".

4.5.4.  slice_pixel_y

"slice_pixel_y" is the slice vertical position in pixels.  Its value
is "floor(slice_y * frame_pixel_height / num_v_slices)".

4.6.  Line

function                                                      | type
--------------------------------------------------------------|-----
Line( p, y ) {                                                |
    if (colorspace_type == 0) {                               |
        for( x = 0; x < plane_pixel_width[ p ]; x++ )         |
            Pixel( p, y, x )                                  |
    } else if (colorspace_type == 1) {                        |
        for( x = 0; x < slice_pixel_width; x++ )              |
            Pixel( p, y, x )                                  |
    }                                                         |
}                                                             |

4.6.1.  plane_pixel_width

"plane_pixel_width[ p ]" is the width in pixels of plane p of the
slice.  The value of "plane_pixel_width[ 0 ]" and
"plane_pixel_width[ 1 + ( chroma_planes ? 2 : 0 ) ]" is
"slice_pixel_width".  If "chroma_planes" is set to 1, the value of
"plane_pixel_width[ 1 ]" and "plane_pixel_width[ 2 ]" is
"ceil(slice_pixel_width / h_chroma_subsample)".

4.6.2.  slice_pixel_width

"slice_pixel_width" is the width in pixels of the slice.  Its value
is "floor(( slice_x + slice_width ) * frame_pixel_width /
num_h_slices) - slice_pixel_x".

4.6.3.  slice_pixel_x

"slice_pixel_x" is the slice horizontal position in pixels.  Its
value is "floor(slice_x * frame_pixel_width / num_h_slices)".

4.7.  Slice Footer

Note: slice footer is always byte aligned.

function                                                      | type
--------------------------------------------------------------|-----
SliceFooter( ) {                                              |
    slice_size                                                | u(24)
    if (ec) {                                                 |
        error_status                                          | u(8)
        slice_crc_parity                                      | u(32)
    }                                                         |
}                                                             |

4.7.1.  slice_size

"slice_size" indicates the size of the slice in bytes.  Note: this
allows finding the start of slices before previous slices have been
fully decoded, which in turn allows parallel decoding as well as
error resilience.

4.7.2.  error_status

"error_status" specifies the error status.

+-------+---------------------------------------+
| value | error status                          |
+-------+---------------------------------------+
| 0     | no error                              |
| 1     | slice contains a correctable error    |
| 2     | slice contains an uncorrectable error |
| Other | reserved for future use               |
+-------+---------------------------------------+

4.7.3.  slice_crc_parity

"slice_crc_parity" is 32 bits that are chosen so that the slice as a
whole has a CRC remainder of 0.  This is equivalent to storing the
CRC remainder in the 32-bit parity.  The CRC generator polynomial
used is the standard IEEE CRC polynomial (0x104C11DB7) with initial
value 0.

4.8.  Parameters

function                                                      | type
--------------------------------------------------------------|-----
Parameters( ) {                                               |
    version                                                   | ur
    if (version >= 3)                                         |
        micro_version                                         | ur
    coder_type                                                | ur
    if (coder_type > 1)                                       |
        for (i = 1; i < 256; i++)                             |
            state_transition_delta[ i ]                       | sr
    colorspace_type                                           | ur
    if (version >= 1)                                         |
        bits_per_raw_sample                                   | ur
    chroma_planes                                             | br
    log2( h_chroma_subsample )                                | ur
    log2( v_chroma_subsample )                                | ur
    alpha_plane                                               | br
    if (version >= 3) {                                       |
        num_h_slices - 1                                      | ur
        num_v_slices - 1                                      | ur
        quant_table_count                                     | ur
    }                                                         |
    for( i = 0; i < quant_table_count; i++ )                  |
        QuantizationTable( i )                                |
    if (version >= 3) {                                       |
        for( i = 0; i < quant_table_count; i++ ) {            |
            states_coded                                      | br
            if (states_coded)                                 |
                for( j = 0; j < context_count[ i ]; j++ )     |
                    for( k = 0; k < CONTEXT_SIZE; k++ )       |
                        initial_state_delta[ i ][ j ][ k ]    | sr
        }                                                     |
        ec                                                    | ur
        intra                                                 | ur
    }                                                         |
}                                                             |

4.8.1.  version

"version" specifies the version of the bitstream.  Each version is
incompatible with other versions: decoders SHOULD reject a file due
to unknown version.
Decoders SHOULD reject a file with version <= 1 &&
ConfigurationRecordIsPresent == 1.  Decoders SHOULD reject a file
with version >= 3 && ConfigurationRecordIsPresent == 0.

+-------+-------------------------+
| value | version                 |
+-------+-------------------------+
| 0     | FFV1 version 0          |
| 1     | FFV1 version 1          |
| 2     | reserved*               |
| 3     | FFV1 version 3          |
| Other | reserved for future use |
+-------+-------------------------+

* Version 2 was never enabled in the encoder, thus version 2 files
SHOULD NOT exist, and this document does not describe them to keep
the text simpler.

4.8.2.  micro_version

"micro_version" specifies the micro-version of the bitstream.  After
a version is considered stable (a micro-version value is assigned to
be the first stable variant of a specific version), each new micro-
version after this first stable variant is compatible with the
previous micro-version: decoders SHOULD NOT reject a file due to an
unknown micro-version equal to or above the micro-version considered
as stable.

Meaning of micro_version for version 3:

+-------+-------------------------+
| value | micro_version           |
+-------+-------------------------+
| 0...3 | reserved*               |
| 4     | first stable variant    |
| Other | reserved for future use |
+-------+-------------------------+

* These were development versions, which may be incompatible with the
stable variants.
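
The acceptance rules of Sections 4.8.1 and 4.8.2 can be sketched as a
small C predicate.  This is only an illustration and not part of the
bitstream definition.  The function name ffv1_version_acceptable and
the constant FFV1_V3_FIRST_STABLE_MICRO are hypothetical.  The sketch
assumes a decoder that implements only the versions described by this
document (0, 1 and 3), and it assumes that development micro-versions
of version 3 are rejected, which is one possible reading of "may be
incompatible".

```c
#include <stdbool.h>

/* Hypothetical constant: first stable micro_version of version 3
 * (value 4, taken from the table in Section 4.8.2). */
#define FFV1_V3_FIRST_STABLE_MICRO 4

/* Returns true if a decoder limited to versions 0, 1 and 3 would go on
 * to decode the stream.  config_record_present mirrors
 * ConfigurationRecordIsPresent. */
static bool ffv1_version_acceptable(unsigned version,
                                    unsigned micro_version,
                                    bool config_record_present)
{
    switch (version) {
    case 0:
    case 1:
        /* version <= 1 carries Parameters( ) inside keyframes. */
        return !config_record_present;
    case 3:
        if (!config_record_present)
            return false;
        /* Assumption: development micro-versions are rejected;
         * unknown later micro-versions are accepted. */
        return micro_version >= FFV1_V3_FIRST_STABLE_MICRO;
    default:
        /* 2 is reserved; anything else is unknown to this sketch. */
        return false;
    }
}
```

Since the rejections in Section 4.8.1 are SHOULD-level, an
implementation may choose to warn instead of failing.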

Meaning of micro_version for version 4 (note: at the time of writing
of this specification, version 4 is not considered stable so the
first stable version value is to be announced in the future):

+---------+-------------------------+
| value   | micro_version           |
+---------+-------------------------+
| 0...TBA | reserved*               |
| TBA     | first stable variant    |
| Other   | reserved for future use |
+---------+-------------------------+

* These were development versions, which may be incompatible with the
stable variants.

4.8.3.  coder_type

"coder_type" specifies the coder used.

+-------+-------------------------------------------------+
| value | coder used                                      |
+-------+-------------------------------------------------+
| 0     | Golomb Rice                                     |
| 1     | Range Coder with default state transition table |
| 2     | Range Coder with custom state transition table  |
| Other | reserved for future use                         |
+-------+-------------------------------------------------+

4.8.4.  state_transition_delta

"state_transition_delta" specifies the Range coder custom state
transition table.  If state_transition_delta is not present in the
bitstream, all Range coder custom state transition table elements are
assumed to be 0.

4.8.5.  colorspace_type

"colorspace_type" specifies the color space.

+-------+-------------------------+
| value | color space used        |
+-------+-------------------------+
| 0     | YCbCr                   |
| 1     | JPEG2000-RCT            |
| Other | reserved for future use |
+-------+-------------------------+

4.8.6.  chroma_planes

"chroma_planes" indicates if chroma (color) planes are present.

+-------+-------------------------------+
| value | presence                      |
+-------+-------------------------------+
| 0     | chroma planes are not present |
| 1     | chroma planes are present     |
+-------+-------------------------------+

4.8.7.  bits_per_raw_sample

"bits_per_raw_sample" indicates the number of bits for each luma and
chroma sample.  Inferred to be 8 if not present.

+-------+-------------------------------------------------+
| value | bits for each luma and chroma sample            |
+-------+-------------------------------------------------+
| 0     | reserved*                                       |
| Other | the actual bits for each luma and chroma sample |
+-------+-------------------------------------------------+

* Encoders MUST NOT store bits_per_raw_sample = 0.  Decoders SHOULD
accept and interpret bits_per_raw_sample = 0 as 8.

4.8.8.  h_chroma_subsample

"h_chroma_subsample" indicates the subsample factor between luma and
chroma width ("chroma_width = 2^(-log2_h_chroma_subsample) *
luma_width").

4.8.9.  v_chroma_subsample

"v_chroma_subsample" indicates the subsample factor between luma and
chroma height ("chroma_height = 2^(-log2_v_chroma_subsample) *
luma_height").

4.8.10.  alpha_plane

"alpha_plane" indicates if a transparency plane is present.

+-------+-----------------------------------+
| value | presence                          |
+-------+-----------------------------------+
| 0     | transparency plane is not present |
| 1     | transparency plane is present     |
+-------+-----------------------------------+

4.8.11.  num_h_slices

"num_h_slices" indicates the number of horizontal elements of the
slice raster.  Inferred to be 1 if not present.

4.8.12.  num_v_slices

"num_v_slices" indicates the number of vertical elements of the slice
raster.  Inferred to be 1 if not present.

4.8.13.  quant_table_count

"quant_table_count" indicates the number of quantization table sets.
Inferred to be 1 if not present.

4.8.14.  states_coded

"states_coded" indicates if the respective quantization table set has
the initial states coded.  Inferred to be 0 if not present.

+-------+-----------------------------------------------------------+
| value | initial states                                            |
+-------+-----------------------------------------------------------+
| 0     | initial states are not present and are assumed to be all  |
|       | 128                                                       |
| 1     | initial states are present                                |
+-------+-----------------------------------------------------------+

4.8.15.  initial_state_delta

"initial_state_delta" [ i ][ j ][ k ] indicates the initial Range
coder state, it is encoded using k as context index and

    pred = j ? initial_states[ i ][j - 1][ k ] : 128

    initial_state[ i ][ j ][ k ] =
        ( pred + initial_state_delta[ i ][ j ][ k ] ) & 255

4.8.16.  ec

"ec" indicates the error detection/correction type.

+-------+--------------------------------------------+
| value | error detection/correction type            |
+-------+--------------------------------------------+
| 0     | 32-bit CRC on the global header            |
| 1     | 32-bit CRC per slice and the global header |
| Other | reserved for future use                    |
+-------+--------------------------------------------+

4.8.17.  intra

"intra" indicates the relationship between frames.  Inferred to be 0
if not present.

+-------+-----------------------------------------------------------+
| value | relationship                                              |
+-------+-----------------------------------------------------------+
| 0     | frames are independent or dependent (keyframes and non    |
|       | keyframes)                                                |
| 1     | frames are independent (keyframes only)                   |
| Other | reserved for future use                                   |
+-------+-----------------------------------------------------------+

4.9.  Quantization Tables

The quantization tables are stored by storing the number of equal
entries -1 of the first half of the table using the method described
in Section 3.6.1.2.  The second half doesn't need to be stored as it
is identical to the first with flipped sign.
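
The run-length storage described in this section can be sketched in C
as follows.  The sketch assumes that the run lengths (each stored
value plus one) have already been decoded by the Range coder.  The
helper name expand_quant_table, its parameters, and the
generalisation to an arbitrary even table size are hypothetical; they
exist only so that the 16-entry illustration given in this section
can be reproduced.  Real FFV1 tables have 256 entries, and the scale
factor is taken as 1 here.

```c
#include <stddef.h>

/* Hypothetical helper: expands run lengths into a table of `size`
 * entries (size even; 256 in FFV1).  lens[] holds the run lengths,
 * that is, each stored value plus one.  Returns the number of distinct
 * levels (len_count), or -1 if the runs do not cover exactly the first
 * half of the table. */
static int expand_quant_table(const unsigned *lens, size_t n_lens,
                              int *table, size_t size)
{
    size_t half = size / 2, k = 0;
    int v = 0;

    for (size_t r = 0; r < n_lens; r++) {
        if (lens[r] == 0 || lens[r] > half - k)
            return -1;               /* run overruns the first half */
        for (unsigned a = 0; a < lens[r]; a++)
            table[k++] = v;
        v++;
    }
    if (k != half)
        return -1;                   /* first half not fully covered */

    /* Second half: mirror of the first half with flipped sign. */
    for (k = 1; k < half; k++)
        table[size - k] = -table[k];
    table[half] = -table[half - 1];
    return v;
}
```

With run lengths 2, 4, 2 (stored values 1, 3, 1) and a size of 16,
the helper produces the table 0 0 1 1 1 1 2 2 -2 -2 -2 -1 -1 -1 -1 0
and returns 3.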

example:

Table: 0 0 1 1 1 1 2 2 -2 -2 -2 -1 -1 -1 -1 0

Stored values: 1, 3, 1

function                                                      | type
--------------------------------------------------------------|-----
QuantizationTable( i ) {                                      |
    scale = 1                                                 |
    for( j = 0; j < MAX_CONTEXT_INPUTS; j++ ) {               |
        QuantizationTablePerContext( i, j, scale )            |
        scale *= 2 * len_count[ i ][ j ] - 1                  |
    }                                                         |
    context_count[ i ] = ( scale + 1 ) / 2                    |
}                                                             |

MAX_CONTEXT_INPUTS is 5.

function                                                      | type
--------------------------------------------------------------|-----
QuantizationTablePerContext(i, j, scale) {                    |
    v = 0                                                     |
    for( k = 0; k < 128; ) {                                  |
        len - 1                                               | sr
        for( a = 0; a < len; a++ ) {                          |
            quant_tables[ i ][ j ][ k ] = scale * v           |
            k++                                               |
        }                                                     |
        v++                                                   |
    }                                                         |
    for( k = 1; k < 128; k++ ) {                              |
        quant_tables[ i ][ j ][ 256 - k ] = \                 |
            -quant_tables[ i ][ j ][ k ]                      |
    }                                                         |
    quant_tables[ i ][ j ][ 128 ] = \                         |
        -quant_tables[ i ][ j ][ 127 ]                        |
    len_count[ i ][ j ] = v                                   |
}                                                             |

4.9.1.  quant_tables

"quant_tables" indicates the quantization table values.

4.9.2.  context_count

"context_count" indicates the count of contexts.

4.9.3.  Restrictions

To ensure that fast multithreaded decoding is possible, starting with
version 3 and if frame_pixel_width * frame_pixel_height is more than
101376, slice_width * slice_height MUST be less than or equal to
num_h_slices * num_v_slices / 4.  Note: 101376 is the frame size in
pixels of a 352x288 frame, also known as the CIF ("Common
Intermediate Format") frame size.

For each frame, each position in the slice raster MUST be filled by
one and only one slice of the frame (no missing slice position, no
slice overlapping).

For each Frame with keyframe value of 0, each slice MUST have the
same value of slice_x, slice_y, slice_width, slice_height as a slice
in the previous frame, except if reset_contexts is 1.
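
The slice raster restriction (each position filled by one and only
one slice) can be checked with a sketch like the following.  The
struct slice_rect, the function slices_tile_raster, and the fixed
MAX_RASTER bound are hypothetical.  The struct fields correspond to
slice_x, slice_y, slice_width and slice_height in slice raster units,
as decoded from the Slice Header.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_RASTER 256  /* hypothetical bound on the raster dimensions */

struct slice_rect {     /* slice position and size in raster units */
    unsigned x, y, w, h;
};

/* Returns true if the n slices cover the num_h x num_v slice raster
 * exactly once: no missing position and no overlap. */
static bool slices_tile_raster(const struct slice_rect *s, size_t n,
                               unsigned num_h, unsigned num_v)
{
    static unsigned char seen[MAX_RASTER][MAX_RASTER];
    size_t covered = 0;

    if (num_h == 0 || num_v == 0 ||
        num_h > MAX_RASTER || num_v > MAX_RASTER)
        return false;
    memset(seen, 0, sizeof(seen));

    for (size_t i = 0; i < n; i++) {
        /* Reject empty slices and slices leaving the raster; the
         * comparisons are written to avoid overflow in x + w. */
        if (s[i].w == 0 || s[i].h == 0 ||
            s[i].x >= num_h || s[i].w > num_h - s[i].x ||
            s[i].y >= num_v || s[i].h > num_v - s[i].y)
            return false;
        for (unsigned y = s[i].y; y < s[i].y + s[i].h; y++) {
            for (unsigned x = s[i].x; x < s[i].x + s[i].w; x++) {
                if (seen[y][x])
                    return false;    /* two slices overlap here */
                seen[y][x] = 1;
                covered++;
            }
        }
    }
    return covered == (size_t)num_h * num_v;
}
```

A decoder could run such a check once all slice headers of a frame
are known and treat a failure as a damaged frame.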

5.  Security Considerations

Like any other codec (such as [RFC6716]), FFV1 should not be used
with insecure ciphers or cipher-modes that are vulnerable to known
plaintext attacks.  Some of the header bits as well as the padding
are easily predictable.

Implementations of the FFV1 codec need to take appropriate security
considerations into account, as outlined in [RFC4732].  It is
extremely important for the decoder to be robust against malicious
payloads.  Malicious payloads must not cause the decoder to overrun
its allocated memory or to take an excessive amount of resources to
decode.  Although problems in encoders are typically rarer, the same
applies to the encoder.  Malicious video streams must not cause the
encoder to misbehave because this would allow an attacker to attack
transcoding gateways.  A frequent security problem in image and video
codecs is also the failure to check for integer overflows in pixel
count computations, that is, allocating width * height without
considering that the multiplication result may have overflowed the
arithmetic type's range.

The reference implementation [REFIMPL] contains no known buffer
overflow or cases where a specially crafted packet or video segment
could cause a significant increase in CPU load.

The reference implementation [REFIMPL] was validated in the following
conditions:

o  Sending the decoder valid packets generated by the reference
   encoder and verifying that the decoder's output matches the
   encoder's input.

o  Sending the decoder packets generated by the reference encoder and
   then subjected to random corruption.

o  Sending the decoder random packets that are not FFV1.
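
Separately from the validation conditions listed above, the integer
overflow concern raised earlier in this section can be handled by
checking the multiplication before allocating.  The following is a
minimal sketch; the helper name alloc_plane and its parameters are
hypothetical and are not taken from the reference implementation.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates width * height samples of
 * bytes_per_sample bytes each.  Returns NULL, instead of a buffer that
 * is too small, when the size computation would overflow size_t. */
static void *alloc_plane(size_t width, size_t height,
                         size_t bytes_per_sample)
{
    if (width == 0 || height == 0 || bytes_per_sample == 0)
        return NULL;
    if (width > SIZE_MAX / height)
        return NULL;                 /* width * height overflows */
    size_t samples = width * height;
    if (samples > SIZE_MAX / bytes_per_sample)
        return NULL;                 /* total byte count overflows */
    return malloc(samples * bytes_per_sample);
}
```

The caller owns the returned buffer and releases it with free().  A
decoder would typically also apply an upper limit on the frame
dimensions it accepts, so that valid but very large sizes do not
exhaust memory.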

In all of the conditions above, the decoder and encoder were run
inside the [VALGRIND] memory debugger as well as Clang's address
sanitizer [Address-Sanitizer], which track reads and writes to
invalid memory regions as well as the use of uninitialized memory.
There were no errors reported on any of the tested conditions.

6.  Appendixes

6.1.  Decoder implementation suggestions

6.1.1.  Multi-threading support and independence of slices

The bitstream is parsable in two ways: in sequential order as
described in this document or with the pre-analysis of the footer of
each slice.  Each slice footer contains a slice_size field so the
boundary of each slice is computable without having to parse the
slice content.  That allows multi-threading as well as independence
of slice content (a bitstream error in a slice header or slice
content has no impact on the decoding of the other slices).

After having checked the keyframe field, a decoder SHOULD parse the
slice_size fields, from the slice_size of the last slice at the end
of the frame up to the slice_size of the first slice at the beginning
of the frame, before parsing slices, in order to obtain the slice
boundaries.  A decoder MAY fall back on sequential order, e.g., in
case of a corrupted frame (frame size unknown, slice_size of slices
not coherent...) or if seeking into the stream is not possible.

Architecture overview of slices in a frame:

+-----------------------------------------------------------------+
| first slice header                                              |
| first slice content                                             |
| first slice footer                                              |
| --------------------------------------------------------------- |
| second slice header                                             |
| second slice content                                            |
| second slice footer                                             |
| --------------------------------------------------------------- |
| ...                                                             |
| --------------------------------------------------------------- |
| last slice header                                               |
| last slice content                                              |
| last slice footer                                               |
+-----------------------------------------------------------------+

7.  Changelog

See

8.  ToDo

o  mean,k estimation for the Golomb Rice codes

9.  References

9.1.  Normative References

[ISO.15444-1.2016]
           International Organization for Standardization,
           "Information technology -- JPEG 2000 image coding system:
           Core coding system", October 2016.

[ISO.9899.1990]
           International Organization for Standardization,
           "Programming languages - C", ISO Standard 9899, 1990.

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", BCP 14, RFC 2119,
           DOI 10.17487/RFC2119, March 1997.

9.2.  Informative References

[Address-Sanitizer]
           The Clang Team, "ASAN AddressSanitizer website", undated.

[AVI]      Microsoft, "AVI RIFF File Reference", undated.

[HuffYUV]  Rudiak-Gould, B., "HuffYUV", December 2003.

[ISO.14495-1.1999]
           International Organization for Standardization,
           "Information technology -- Lossless and near-lossless
           compression of continuous-tone still images: Baseline",
           December 1999.

[ISO.14496-10.2014]
           International Organization for Standardization,
           "Information technology -- Coding of audio-visual objects
           -- Part 10: Advanced Video Coding", September 2014.

[ISO.14496-12.2015]
           International Organization for Standardization,
           "Information technology -- Coding of audio-visual objects
           -- Part 12: ISO base media file format", December 2015.

[Matroska]
           IETF, "Matroska", 2016.

[NUT]      Niedermayer, M., "NUT Open Container Format", December
           2013.

[range-coding]
           Nigel, G. and N. Martin, "Range encoding: an algorithm for
           removing redundancy from a digitised message.", Proc.
           Institution of Electronic and Radio Engineers
           International Conference on Video and Data Recording,
           July 1979.

[REFIMPL]  Niedermayer, M., "The reference FFV1 implementation / the
           FFV1 codec in FFmpeg", undated.

[RFC4732]  Handley, M., Ed., Rescorla, E., Ed., and IAB, "Internet
           Denial-of-Service Considerations", RFC 4732,
           DOI 10.17487/RFC4732, December 2006.

[RFC6716]  Valin, JM., Vos, K., and T. Terriberry, "Definition of the
           Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716,
           September 2012.

[VALGRIND]
           Valgrind Developers, "Valgrind website", undated.

[YCbCr]    Wikipedia, "YCbCr", undated.

Authors' Addresses

Michael Niedermayer

Email: michael@niedermayer.cc

Dave Rice

Email: dave@dericed.com

Jerome Martinez

Email: jerome@mediaarea.net