cellar                                                    M. Niedermayer
Internet-Draft
Intended status: Informational                                   D. Rice
Expires: 10 April 2021
                                                             J. Martinez
                                                          7 October 2020

              FFV1 Video Coding Format Version 0, 1, and 3
                       draft-ietf-cellar-ffv1-18
12 Abstract
14 This document defines FFV1, a lossless intra-frame video encoding
15 format. FFV1 is designed to efficiently compress video data in a
16 variety of pixel formats. Compared to uncompressed video, FFV1
17 offers storage compression, frame fixity, and self-description, which
18 makes FFV1 useful as a preservation or intermediate video format.
20 Status of This Memo
22 This Internet-Draft is submitted in full conformance with the
23 provisions of BCP 78 and BCP 79.
25 Internet-Drafts are working documents of the Internet Engineering
26 Task Force (IETF). Note that other groups may also distribute
27 working documents as Internet-Drafts. The list of current Internet-
28 Drafts is at https://datatracker.ietf.org/drafts/current/.
30 Internet-Drafts are draft documents valid for a maximum of six months
31 and may be updated, replaced, or obsoleted by other documents at any
32 time. It is inappropriate to use Internet-Drafts as reference
33 material or to cite them other than as "work in progress."
35 This Internet-Draft will expire on 10 April 2021.
37 Copyright Notice
39 Copyright (c) 2020 IETF Trust and the persons identified as the
40 document authors. All rights reserved.
42 This document is subject to BCP 78 and the IETF Trust's Legal
43 Provisions Relating to IETF Documents (https://trustee.ietf.org/
44 license-info) in effect on the date of publication of this document.
45 Please review these documents carefully, as they describe your rights
46 and restrictions with respect to this document. Code Components
47 extracted from this document must include Simplified BSD License text
48 as described in Section 4.e of the Trust Legal Provisions and are
49 provided without warranty as described in the Simplified BSD License.
Table of Contents

1.  Introduction
2.  Notation and Conventions
  2.1.  Definitions
  2.2.  Conventions
    2.2.1.  Pseudo-code
    2.2.2.  Arithmetic Operators
    2.2.3.  Assignment Operators
    2.2.4.  Comparison Operators
    2.2.5.  Mathematical Functions
    2.2.6.  Order of Operation Precedence
    2.2.7.  Range
    2.2.8.  NumBytes
    2.2.9.  Bitstream Functions
3.  Sample Coding
  3.1.  Border
  3.2.  Samples
  3.3.  Median Predictor
  3.4.  Quantization Table Sets
  3.5.  Context
  3.6.  Quantization Table Set Indexes
  3.7.  Color spaces
    3.7.1.  YCbCr
    3.7.2.  RGB
  3.8.  Coding of the Sample Difference
    3.8.1.  Range Coding Mode
    3.8.2.  Golomb Rice Mode
4.  Bitstream
  4.1.  Quantization Table Set
    4.1.1.  quant_tables
    4.1.2.  context_count
  4.2.  Parameters
    4.2.1.  version
    4.2.2.  micro_version
    4.2.3.  coder_type
    4.2.4.  state_transition_delta
    4.2.5.  colorspace_type
    4.2.6.  chroma_planes
    4.2.7.  bits_per_raw_sample
    4.2.8.  log2_h_chroma_subsample
    4.2.9.  log2_v_chroma_subsample
    4.2.10. extra_plane
    4.2.11. num_h_slices
    4.2.12. num_v_slices
    4.2.13. quant_table_set_count
    4.2.14. states_coded
    4.2.15. initial_state_delta
    4.2.16. ec
    4.2.17. intra
  4.3.  Configuration Record
    4.3.1.  reserved_for_future_use
    4.3.2.  configuration_record_crc_parity
    4.3.3.  Mapping FFV1 into Containers
  4.4.  Frame
  4.5.  Slice
  4.6.  Slice Header
    4.6.1.  slice_x
    4.6.2.  slice_y
    4.6.3.  slice_width
    4.6.4.  slice_height
    4.6.5.  quant_table_set_index_count
    4.6.6.  quant_table_set_index
    4.6.7.  picture_structure
    4.6.8.  sar_num
    4.6.9.  sar_den
  4.7.  Slice Content
    4.7.1.  primary_color_count
    4.7.2.  plane_pixel_height
    4.7.3.  slice_pixel_height
    4.7.4.  slice_pixel_y
  4.8.  Line
    4.8.1.  plane_pixel_width
    4.8.2.  slice_pixel_width
    4.8.3.  slice_pixel_x
    4.8.4.  sample_difference
  4.9.  Slice Footer
    4.9.1.  slice_size
    4.9.2.  error_status
    4.9.3.  slice_crc_parity
5.  Restrictions
6.  Security Considerations
7.  IANA Considerations
  7.1.  Media Type Definition
8.  Changelog
9.  Normative References
10. Informative References
Appendix A.  Multi-threaded decoder implementation suggestions
Appendix B.  Future handling of some streams created by non
        conforming encoders
Authors' Addresses
143 1. Introduction
145 This document describes FFV1, a lossless video encoding format. The
146 design of FFV1 considers the storage of image characteristics, data
147 fixity, and the optimized use of encoding time and storage
148 requirements. FFV1 is designed to support a wide range of lossless
149 video applications such as long-term audiovisual preservation,
150 scientific imaging, screen recording, and other video encoding
151 scenarios that seek to avoid the generational loss of lossy video
152 encodings.
This document defines versions 0, 1, and 3 of FFV1. The distinctions
between the versions are noted throughout the document, but in summary:

* Version 0 of FFV1 was the original implementation of FFV1 and was
  flagged as stable on April 14, 2006 [FFV1_V0].

* Version 1 of FFV1 adds support for more video bit depths and was
  flagged as stable on April 24, 2009 [FFV1_V1].

* Version 2 of FFV1 only existed in experimental form and is not
  described by this document, but is available as a LyX file at
  https://github.com/FFmpeg/FFV1/blob/8ad772b6d61c3dd8b0171979a2cd9f11924d5532/ffv1.lyx

* Version 3 of FFV1 adds several features, such as a fuller
  description of the characteristics of the encoded images and
  embedded CRC data to support fixity verification of the encoding.
  Version 3 was flagged as stable on August 17, 2013 [FFV1_V3].
175 This document assumes familiarity with mathematical and coding
176 concepts such as Range coding [range-coding] and YCbCr color spaces
177 [YCbCr].
This specification describes the valid bitstream and how to decode
it. Bitstreams that do not conform to this specification, and how
they are handled, are outside the scope of this specification. A
decoder could reject every invalid bitstream, attempt error
concealment, re-download or use a redundant copy of the invalid
part, or take any other action it deems appropriate.
186 2. Notation and Conventions
188 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
189 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
190 "OPTIONAL" in this document are to be interpreted as described in BCP
191 14 [RFC2119] [RFC8174] when, and only when, they appear in all
192 capitals, as shown here.
194 2.1. Definitions
"FFV1": chosen name of this video encoding format, short for
"FF Video 1". The letters "FF" come from "FFmpeg", the name of the
reference decoder, whose first letters originally meant "Fast
Forward".
201 "Container": Format that encapsulates Frames (see Section 4.4) and
202 (when required) a "Configuration Record" into a bitstream.
204 "Sample": The smallest addressable representation of a color
205 component or a luma component in a Frame. Examples of Sample are
206 Luma (Y), Blue-difference Chroma (Cb), Red-difference Chroma (Cr),
207 Transparency, Red, Green, and Blue.
209 "Symbol": A value stored in the bitstream, which is defined and
210 decoded through one of the methods described in Table 4.
212 "Line": A discrete component of a static image composed of Samples
213 that represent a specific quantification of Samples of that image.
215 "Plane": A discrete component of a static image composed of Lines
216 that represent a specific quantification of Lines of that image.
218 "Pixel": The smallest addressable representation of a color in a
219 Frame. It is composed of one or more Samples.
221 "ESC": An ESCape Symbol to indicate that the Symbol to be stored is
222 too large for normal storage and that an alternate storage method is
223 used.
225 "MSB": Most Significant Bit, the bit that can cause the largest
226 change in magnitude of the Symbol.
228 "VLC": Variable Length Code, a code that maps source symbols to a
229 variable number of bits.
231 "RGB": A reference to the method of storing the value of a Pixel by
232 using three numeric values that represent Red, Green, and Blue.
234 "YCbCr": A reference to the method of storing the value of a Pixel by
235 using three numeric values that represent the luma of the Pixel (Y)
and the chroma of the Pixel (Cb and Cr). The term YCbCr is used for
historical reasons and currently refers to any color space relying
on 1 luma Sample and 2 chroma Samples, e.g. YCbCr, YCgCo or ICtCp.
The exact meaning of the three numeric values is unspecified.
241 2.2. Conventions
243 2.2.1. Pseudo-code
245 The FFV1 bitstream is described in this document using pseudo-code.
Note that the pseudo-code is used for clarity in order to illustrate
the structure of FFV1 and is not intended to specify any particular
implementation. The pseudo-code used is based upon the C programming
249 language [ISO.9899.2018] and uses its "if/else", "while" and "for"
250 keywords as well as functions defined within this document.
252 In some instances, pseudo-code is presented in a two-column format
253 such as shown in Figure 1. In this form the "type" column provides a
254 Symbol as defined in Table 4 that defines the storage of the data
255 referenced in that same line of pseudo-code.
pseudo-code                                                   | type
--------------------------------------------------------------|-----
ExamplePseudoCode( ) {                                        |
    value                                                     | ur
}                                                             |
263 Figure 1: A depiction of type-labelled pseudo-code used within
264 this document.
266 2.2.2. Arithmetic Operators
268 Note: the operators and the order of precedence are the same as used
269 in the C programming language [ISO.9899.2018], with the exception of
270 ">>" (removal of implementation defined behavior) and "^" (power
271 instead of XOR) operators which are re-defined within this section.
273 "a + b" means a plus b.
275 "a - b" means a minus b.
277 "-a" means negation of a.
279 "a * b" means a multiplied by b.
281 "a / b" means a divided by b.
283 "a ^ b" means a raised to the b-th power.
285 "a & b" means bit-wise "and" of a and b.
287 "a | b" means bit-wise "or" of a and b.
289 "a >> b" means arithmetic right shift of two's complement integer
290 representation of a by b binary digits. This is equivalent to
291 dividing a by 2, b times, with rounding toward negative infinity.
293 "a << b" means arithmetic left shift of two's complement integer
294 representation of a by b binary digits.
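
As a non-normative illustration of the ">>" definition above, the
following Python sketch spells out "dividing a by 2, b times, with
rounding toward negative infinity". Python integers happen to follow
the same rounding rule, so the result can be compared with Python's
built-in operator. The helper name "shift_right" is hypothetical and
not part of FFV1.

```python
def shift_right(a: int, b: int) -> int:
    """Divide a by 2, b times, rounding toward negative infinity each time."""
    for _ in range(b):
        a = a // 2  # floor division: rounds toward negative infinity
    return a
```

For negative inputs this differs from truncating division:
"shift_right(-5, 1)" is -3, whereas truncation toward zero would
give -2.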
296 2.2.3. Assignment Operators
298 "a = b" means a is assigned b.
300 "a++" is equivalent to a is assigned a + 1.
302 "a--" is equivalent to a is assigned a - 1.
304 "a += b" is equivalent to a is assigned a + b.
306 "a -= b" is equivalent to a is assigned a - b.
308 "a *= b" is equivalent to a is assigned a * b.
310 2.2.4. Comparison Operators
312 "a > b" is true when a is greater than b.
314 "a >= b" is true when a is greater than or equal to b.
316 "a < b" is true when a is less than b.
"a <= b" is true when a is less than or equal to b.
320 "a == b" is true when a is equal to b.
322 "a != b" is true when a is not equal to b.
324 "a && b" is true when both a is true and b is true.
326 "a || b" is true when either a is true or b is true.
328 "!a" is true when a is not true.
330 "a ? b : c" if a is true, then b, otherwise c.
332 2.2.5. Mathematical Functions
334 "floor(a)" means the largest integer less than or equal to a.
336 "ceil(a)" means the smallest integer greater than or equal to a.
338 "sign(a)" extracts the sign of a number, i.e. if a < 0 then -1, else
339 if a > 0 then 1, else 0.
341 "abs(a)" means the absolute value of a, i.e. "abs(a)" = "sign(a) *
342 a".
344 "log2(a)" means the base-two logarithm of a.
346 "min(a,b)" means the smaller of two values a and b.
348 "max(a,b)" means the larger of two values a and b.
350 "median(a,b,c)" means the numerical middle value in a data set of a,
351 b, and c, i.e. a+b+c-min(a,b,c)-max(a,b,c).
353 "A <== B" means B implies A.
355 "A <==> B" means A <== B , B <== A.
357 a_(b) means the b-th value of a sequence of a
359 a_(b,c) means the 'b,c'-th value of a sequence of a
361 2.2.6. Order of Operation Precedence
363 When order of precedence is not indicated explicitly by use of
364 parentheses, operations are evaluated in the following order (from
365 top to bottom, operations of same precedence being evaluated from
366 left to right). This order of operations is based on the order of
367 operations used in Standard C.
369 a++, a--
370 !a, -a
371 a ^ b
372 a * b, a / b
373 a + b, a - b
374 a << b, a >> b
375 a < b, a <= b, a > b, a >= b
376 a == b, a != b
377 a & b
378 a | b
379 a && b
380 a || b
381 a ? b : c
382 a = b, a += b, a -= b, a *= b
384 2.2.7. Range
386 "a...b" means any value from a to b, inclusive.
388 2.2.8. NumBytes
390 "NumBytes" is a non-negative integer that expresses the size in 8-bit
391 octets of a particular FFV1 "Configuration Record" or "Frame". FFV1
392 relies on its Container to store the "NumBytes" values; see
393 Section 4.3.3.
395 2.2.9. Bitstream Functions
397 2.2.9.1. remaining_bits_in_bitstream
399 "remaining_bits_in_bitstream( NumBytes )" means the count of
400 remaining bits after the pointer in that "Configuration Record" or
401 "Frame". It is computed from the "NumBytes" value multiplied by 8
402 minus the count of bits of that "Configuration Record" or "Frame"
403 already read by the bitstream parser.
405 2.2.9.2. remaining_symbols_in_syntax
407 "remaining_symbols_in_syntax( )" is true as long as the RangeCoder
408 has not consumed all the given input bytes.
410 2.2.9.3. byte_aligned
412 "byte_aligned( )" is true if "remaining_bits_in_bitstream( NumBytes
413 )" is a multiple of 8, otherwise false.
415 2.2.9.4. get_bits
417 "get_bits( i )" is the action to read the next "i" bits in the
418 bitstream, from most significant bit to least significant bit, and to
419 return the corresponding value. The pointer is increased by "i".
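
The bitstream functions above can be sketched, non-normatively, as a
small Python class. The class name "BitReader" and its layout are
assumptions of this sketch; "NumBytes" is taken to be the length of
the byte string handed to the reader.

```python
class BitReader:
    """Minimal MSB-first bit reader over one Configuration Record or Frame."""

    def __init__(self, data: bytes):
        self.data = data      # NumBytes == len(data)
        self.pos = 0          # the pointer, counted in bits already read

    def remaining_bits_in_bitstream(self) -> int:
        # NumBytes * 8 minus the count of bits already read
        return len(self.data) * 8 - self.pos

    def byte_aligned(self) -> bool:
        return self.remaining_bits_in_bitstream() % 8 == 0

    def get_bits(self, i: int) -> int:
        if i > self.remaining_bits_in_bitstream():
            raise EOFError("not enough bits left")
        value = 0
        for _ in range(i):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1  # most significant bit first
            value = (value << 1) | bit
            self.pos += 1     # the pointer is increased by i in total
        return value
```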
421 3. Sample Coding
423 For each "Slice" (as described in Section 4.5) of a Frame, the
424 Planes, Lines, and Samples are coded in an order determined by the
425 color space (see Section 3.7). Each Sample is predicted by the
426 median predictor as described in Section 3.3 from other Samples
427 within the same Plane and the difference is stored using the method
428 described in Section 3.8.
430 3.1. Border
432 A border is assumed for each coded "Slice" for the purpose of the
433 median predictor and context according to the following rules:
435 * one column of Samples to the left of the coded slice is assumed as
436 identical to the Samples of the leftmost column of the coded slice
437 shifted down by one row. The value of the topmost Sample of the
438 column of Samples to the left of the coded slice is assumed to be
439 "0"
441 * one column of Samples to the right of the coded slice is assumed
442 as identical to the Samples of the rightmost column of the coded
443 slice
445 * an additional column of Samples to the left of the coded slice and
446 two rows of Samples above the coded slice are assumed to be "0"
448 Figure 2 depicts a slice of 9 Samples "a,b,c,d,e,f,g,h,i" in a 3x3
449 arrangement along with its assumed border.
451 +---+---+---+---+---+---+---+---+
452 | 0 | 0 | | 0 | 0 | 0 | | 0 |
453 +---+---+---+---+---+---+---+---+
454 | 0 | 0 | | 0 | 0 | 0 | | 0 |
455 +---+---+---+---+---+---+---+---+
456 | | | | | | | | |
457 +---+---+---+---+---+---+---+---+
458 | 0 | 0 | | a | b | c | | c |
459 +---+---+---+---+---+---+---+---+
460 | 0 | a | | d | e | f | | f |
461 +---+---+---+---+---+---+---+---+
462 | 0 | d | | g | h | i | | i |
463 +---+---+---+---+---+---+---+---+
Figure 2: A depiction of FFV1's assumed border for a set of example
Samples.
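
The border rules can be sketched, non-normatively, as a lookup that
returns either a coded Sample or its assumed border value. The
function name "bordered_sample" and the "rows[y][x]" layout are
assumptions of this sketch; x = -1 and x = -2 denote the two assumed
columns to the left, x = width the assumed column to the right, and
y = -1 and y = -2 the two assumed rows above.

```python
def bordered_sample(rows, x, y):
    """Return the Sample at column x, row y of a slice, including its border."""
    height, width = len(rows), len(rows[0])
    if not (-2 <= x <= width and -2 <= y < height):
        raise IndexError("outside the slice and its assumed border")
    if y < 0:
        return 0                    # two rows above the slice are 0
    if x == -2:
        return 0                    # additional left column is 0
    if x == -1:
        # leftmost column shifted down by one row; topmost Sample is 0
        return rows[y - 1][0] if y > 0 else 0
    if x == width:
        return rows[y][width - 1]   # right column repeats the rightmost column
    return rows[y][x]
```

With the 3x3 slice of Figure 2, the left neighbor of "d" resolves to
"a" and the right neighbor of "c" resolves to "c".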
468 3.2. Samples
470 Relative to any Sample "X", six other relatively positioned Samples
471 from the coded Samples and presumed border are identified according
472 to the labels used in Figure 3. The labels for these relatively
473 positioned Samples are used within the median predictor and context.
475 +---+---+---+---+
476 | | | T | |
477 +---+---+---+---+
478 | |tl | t |tr |
479 +---+---+---+---+
480 | L | l | X | |
481 +---+---+---+---+
483 Figure 3: A depiction of how relatively positioned Samples are
484 referenced within this document.
486 The labels for these relative Samples are made of the first letters
487 of the words Top, Left and Right.
489 3.3. Median Predictor
491 The prediction for any Sample value at position "X" may be computed
492 based upon the relative neighboring values of "l", "t", and "tl" via
493 this equation:
495 median(l, t, l + t - tl)
497 Note, this prediction template is also used in [ISO.14495-1.1999] and
498 [HuffYUV].
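
A non-normative Python sketch of the median predictor follows. It
uses the "median(a,b,c)" identity given in Section 2.2.5; the
function names are hypothetical.

```python
def median3(a: int, b: int, c: int) -> int:
    """Numerical middle value of a, b, and c."""
    return a + b + c - min(a, b, c) - max(a, b, c)


def predict(l: int, t: int, tl: int) -> int:
    """Median prediction for Sample X from its left, top, and top-left neighbors."""
    return median3(l, t, l + t - tl)
```

When the gradient term "l + t - tl" falls between "l" and "t" it is
the prediction; otherwise the prediction is whichever of "l" and "t"
lies closer to it.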
Exception for the median predictor: if "colorspace_type == 0 &&
bits_per_raw_sample == 16 && ( coder_type == 1 || coder_type == 2 )"
(see Section 4.2.5, Section 4.2.7, and Section 4.2.3), the following
median predictor MUST be used:
505 median(left16s, top16s, left16s + top16s - diag16s)
507 where:
509 left16s = l >= 32768 ? ( l - 65536 ) : l
510 top16s = t >= 32768 ? ( t - 65536 ) : t
511 diag16s = tl >= 32768 ? ( tl - 65536 ) : tl
Background: all known implementations of the FFV1 bitstream stored
Sample values in a two's complement 16-bit signed integer, so in
some circumstances the most significant bit was wrongly interpreted
(used as a sign bit instead of the 16th bit of an unsigned integer).
When the issue was discovered, the only affected configuration among
all known implementations was 16-bit YCbCr with no Pixel
transformation and the Range Coder, because the other potentially
affected configurations (e.g. 15/16-bit JPEG2000-RCT
[ISO.15444-1.2016] with the Range Coder, or 16-bit content with the
Golomb Rice coder) had not been implemented anywhere. Meanwhile,
16-bit JPEG2000-RCT with the Range Coder was implemented without
this issue in one implementation and validated by one conformance
checker. This exception for the median predictor is expected (to be
confirmed) to be removed in the next version of the FFV1 bitstream.
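
The exception can be sketched, non-normatively, in Python as follows.
The function names are hypothetical; the reinterpretation step
mirrors the "left16s", "top16s", and "diag16s" definitions above.

```python
def as_signed16(v: int) -> int:
    """Reinterpret an unsigned 16-bit value as a two's complement signed value."""
    return v - 65536 if v >= 32768 else v


def median3(a: int, b: int, c: int) -> int:
    return a + b + c - min(a, b, c) - max(a, b, c)


def predict_16bit_exception(l: int, t: int, tl: int) -> int:
    """Median predictor applied to the signed reinterpretation of the neighbors."""
    left16s, top16s, diag16s = as_signed16(l), as_signed16(t), as_signed16(tl)
    return median3(left16s, top16s, left16s + top16s - diag16s)
```

The two predictors can disagree even modulo 65536. For l = 40000,
t = 30000, and tl = 35000, the regular predictor yields 35000, while
the exception yields 30000.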
528 3.4. Quantization Table Sets
530 The FFV1 bitstream contains one or more Quantization Table Sets.
531 Each Quantization Table Set contains exactly 5 Quantization Tables
532 with each Quantization Table corresponding to one of the five
533 Quantized Sample Differences. For each Quantization Table, both the
534 number of quantization steps and their distribution are stored in the
535 FFV1 bitstream; each Quantization Table has exactly 256 entries, and
536 the 8 least significant bits of the Quantized Sample Difference are
537 used as index:
539 Q_(j)[k] = quant_tables[i][j][k&255]
541 Figure 4
In this formula, "i" is the Quantization Table Set index, "j" is the
Quantization Table index, and "k" is the Quantized Sample Difference.
546 3.5. Context
548 Relative to any Sample "X", the Quantized Sample Differences "L-l",
549 "l-tl", "tl-t", "T-t", and "t-tr" are used as context:
551 context = Q_(0)[l - tl] +
552 Q_(1)[tl - t] +
553 Q_(2)[t - tr] +
554 Q_(3)[L - l] +
555 Q_(4)[T - t]
557 Figure 5
559 If "context >= 0" then "context" is used and the difference between
560 the Sample and its predicted value is encoded as is, else "-context"
561 is used and the difference between the Sample and its predicted value
562 is encoded with a flipped sign.
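
The context computation and the sign rule can be sketched,
non-normatively, in Python. All function names are hypothetical.
"toy_table" builds placeholder tables purely for illustration; real
Quantization Tables are read from the FFV1 bitstream.

```python
def quantize(table, k):
    """Q[k]: index a 256-entry table with the 8 least significant bits of k."""
    return table[k & 255]


def compute_context(tables, L, l, t, tl, tr, T):
    """tables holds the 5 Quantization Tables of one Quantization Table Set."""
    return (quantize(tables[0], l - tl) +
            quantize(tables[1], tl - t) +
            quantize(tables[2], t - tr) +
            quantize(tables[3], L - l) +
            quantize(tables[4], T - t))


def fold_context(context):
    """Return (context to use, True if the coded difference has a flipped sign)."""
    if context >= 0:
        return context, False
    return -context, True


def toy_table(weight):
    """Placeholder table (assumption): weight times the sign of the signed index."""
    table = []
    for index in range(256):
        signed = index - 256 if index >= 128 else index
        table.append(weight * ((signed > 0) - (signed < 0)))
    return table
```

Indexing with "k & 255" maps a negative difference such as -5 to
entry 251, which is why each table has exactly 256 entries.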
564 3.6. Quantization Table Set Indexes
566 For each Plane of each slice, a Quantization Table Set is selected
567 from an index:
569 * For Y Plane, "quant_table_set_index[ 0 ]" index is used
571 * For Cb and Cr Planes, "quant_table_set_index[ 1 ]" index is used
573 * For extra Plane, "quant_table_set_index[ (version <= 3 ||
574 chroma_planes) ? 2 : 1 ]" index is used
Background: in the first implementations of the FFV1 bitstream, the
index for the Cb and Cr Planes was stored even when it was not used
(chroma_planes set to 0). This index is kept for "version" <= 3 in
order to remain compatible with FFV1 bitstreams in the wild.
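
The selection rules above can be sketched, non-normatively, as a
Python function. The function name and the string labels for the
Planes are assumptions of this sketch.

```python
def quant_table_set_for_plane(plane, quant_table_set_index, version, chroma_planes):
    """Return the Quantization Table Set index used for one Plane of a slice."""
    if plane == "Y":
        return quant_table_set_index[0]
    if plane in ("Cb", "Cr"):
        return quant_table_set_index[1]
    if plane == "extra":
        position = 2 if (version <= 3 or chroma_planes) else 1
        return quant_table_set_index[position]
    raise ValueError("unknown plane label: %r" % (plane,))
```

For the versions defined in this document (0, 1, and 3), the
condition "version <= 3" always holds, so the extra Plane always
uses position 2.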
581 3.7. Color spaces
583 FFV1 supports several color spaces. The count of allowed coded
584 planes and the meaning of the extra Plane are determined by the
585 selected color space.
587 The FFV1 bitstream interleaves data in an order determined by the
588 color space. In YCbCr for each Plane, each Line is coded from top to
589 bottom and for each Line, each Sample is coded from left to right.
590 In JPEG2000-RCT for each Line from top to bottom, each Plane is coded
591 and for each Plane, each Sample is encoded from left to right.
593 3.7.1. YCbCr
595 This color space allows 1 to 4 Planes.
597 The Cb and Cr Planes are optional, but if used then MUST be used
598 together. Omitting the Cb and Cr Planes codes the frames in
599 grayscale without color data.
601 An optional transparency Plane can be used to code transparency data.
603 An FFV1 Frame using YCbCr MUST use one of the following arrangements:
605 * Y
607 * Y, Transparency
609 * Y, Cb, Cr
611 * Y, Cb, Cr, Transparency
613 The Y Plane MUST be coded first. If the Cb and Cr Planes are used
614 then they MUST be coded after the Y Plane. If a transparency Plane
615 is used, then it MUST be coded last.
617 3.7.2. RGB
619 This color space allows 3 or 4 Planes.
621 An optional transparency Plane can be used to code transparency data.
623 JPEG2000-RCT is a Reversible Color Transform that codes RGB (red,
624 green, blue) Planes losslessly in a modified YCbCr color space
625 [ISO.15444-1.2016]. Reversible Pixel transformations between YCbCr
626 and RGB use the following formulae.
Cb = b - g
Cr = r - g
Y  = g + ((Cb + Cr) >> 2)
g  = Y - ((Cb + Cr) >> 2)
r  = Cr + g
b  = Cb + g
635 Figure 6
637 Exception for the JPEG2000-RCT conversion: if "bits_per_raw_sample"
638 is between 9 and 15 inclusive and "extra_plane" is 0, the following
639 formulae for reversible conversions between YCbCr and RGB MUST be
640 used instead of the ones above:
Cb = g - b
Cr = r - b
Y  = b + ((Cb + Cr) >> 2)
b  = Y - ((Cb + Cr) >> 2)
r  = Cr + b
g  = Cb + b
649 Figure 7
Background: at the time of this writing, in all known implementations
of the FFV1 bitstream, when "bits_per_raw_sample" is between 9 and 15
inclusive and "extra_plane" is 0, GBR Planes are used as BGR Planes
during both encoding and decoding. Meanwhile, 16-bit JPEG2000-RCT was
implemented without this issue in one implementation and validated by
one conformance checker. Methods to address this exception for the
transform are under consideration for the next version of the FFV1
bitstream.
Cb and Cr are positively offset by "1 << bits_per_raw_sample" after
the conversion from RGB to the modified YCbCr and are negatively
offset by the same value before the conversion from the modified
YCbCr to RGB, in order to have only non-negative values after the
conversion.
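
The following non-normative Python sketch combines the JPEG2000-RCT
formulae, the 9 to 15 bit exception, and the offset described above.
The function names are hypothetical. The shift is applied to
"(Cb + Cr)" alone, which is the grouping the transform needs in order
to be reversible.

```python
def uses_rct_exception(bits_per_raw_sample: int, extra_plane: int) -> bool:
    return 9 <= bits_per_raw_sample <= 15 and extra_plane == 0


def rgb_to_rct(r, g, b, bits_per_raw_sample, extra_plane=0):
    """Forward transform; returns (Y, Cb, Cr) with Cb and Cr offset."""
    offset = 1 << bits_per_raw_sample
    if uses_rct_exception(bits_per_raw_sample, extra_plane):
        cb, cr = g - b, r - b
        y = b + ((cb + cr) >> 2)
    else:
        cb, cr = b - g, r - g
        y = g + ((cb + cr) >> 2)
    return y, cb + offset, cr + offset


def rct_to_rgb(y, cb, cr, bits_per_raw_sample, extra_plane=0):
    """Inverse transform; removes the offset first, then returns (r, g, b)."""
    offset = 1 << bits_per_raw_sample
    cb, cr = cb - offset, cr - offset
    if uses_rct_exception(bits_per_raw_sample, extra_plane):
        b = y - ((cb + cr) >> 2)
        return cr + b, cb + b, b
    g = y - ((cb + cr) >> 2)
    return cr + g, g, cb + g
```

Because the same floored shift is used in both directions, the
rounding in Y cancels exactly and the round trip is lossless.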
666 When FFV1 uses the JPEG2000-RCT, the horizontal Lines are interleaved
667 to improve caching efficiency since it is most likely that the
668 JPEG2000-RCT will immediately be converted to RGB during decoding.
669 The interleaved coding order is also Y, then Cb, then Cr, and then,
670 if used, transparency.
As an example, a Frame that is two Pixels wide and two Pixels high
could comprise the following structure:
675 +------------------------+------------------------+
676 | Pixel(1,1) | Pixel(2,1) |
677 | Y(1,1) Cb(1,1) Cr(1,1) | Y(2,1) Cb(2,1) Cr(2,1) |
678 +------------------------+------------------------+
679 | Pixel(1,2) | Pixel(2,2) |
680 | Y(1,2) Cb(1,2) Cr(1,2) | Y(2,2) Cb(2,2) Cr(2,2) |
681 +------------------------+------------------------+
683 In JPEG2000-RCT, the coding order would be left to right and then top
684 to bottom, with values interleaved by Lines and stored in this order:
686 Y(1,1) Y(2,1) Cb(1,1) Cb(2,1) Cr(1,1) Cr(2,1) Y(1,2) Y(2,2) Cb(1,2)
687 Cb(2,2) Cr(1,2) Cr(2,2)
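
The two coding orders described in Section 3.7 can be sketched,
non-normatively, in Python. The function name and the textual labels
are assumptions of this sketch; coordinates are 1-based to match the
example above.

```python
def coding_order(width, height, planes, line_interleaved):
    """List Sample labels in the order in which they are coded."""
    order = []
    if line_interleaved:
        # JPEG2000-RCT: for each Line, each Plane, then each Sample
        for y in range(1, height + 1):
            for plane in planes:
                for x in range(1, width + 1):
                    order.append("%s(%d,%d)" % (plane, x, y))
    else:
        # YCbCr: for each Plane, each Line, then each Sample
        for plane in planes:
            for y in range(1, height + 1):
                for x in range(1, width + 1):
                    order.append("%s(%d,%d)" % (plane, x, y))
    return order
```

Calling "coding_order(2, 2, ["Y", "Cb", "Cr"], True)" reproduces the
sequence listed above for the 2x2 example.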
689 3.8. Coding of the Sample Difference
691 Instead of coding the n+1 bits of the Sample Difference with Huffman
692 or Range coding (or n+2 bits, in the case of JPEG2000-RCT), only the
693 n (or n+1, in the case of JPEG2000-RCT) least significant bits are
694 used, since this is sufficient to recover the original Sample. In
695 the equation below, the term "bits" represents "bits_per_raw_sample +
696 1" for JPEG2000-RCT or "bits_per_raw_sample" otherwise:
698 coder_input = [(sample_difference + 2 ^ (bits - 1)) &
699 (2 ^ bits - 1)] - 2 ^ (bits - 1)
701 Figure 8: Description of the coding of the Sample Difference in
702 the bitstream.
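
A non-normative Python sketch of Figure 8 follows. The function names
are hypothetical. The recovery step, which masks the sum of the
prediction and "coder_input" to "bits" bits, is this sketch's reading
of why the least significant bits are sufficient; it assumes that
both the Sample and its prediction lie in the range 0...2 ^ bits - 1.

```python
def coder_input(sample_difference: int, bits: int) -> int:
    """Fold a Sample Difference into the range -2^(bits-1) ... 2^(bits-1) - 1."""
    half = 1 << (bits - 1)
    mask = (1 << bits) - 1
    return ((sample_difference + half) & mask) - half


def recover_sample(prediction: int, coded: int, bits: int) -> int:
    """Assumed inverse: add the coded value to the prediction, modulo 2^bits."""
    return (prediction + coded) & ((1 << bits) - 1)
```

For example, with bits = 8, a Sample of 220 predicted as 20 has a
Sample Difference of 200, which is coded as -56; adding -56 to 20
modulo 256 gives 220 again.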
704 3.8.1. Range Coding Mode
706 Early experimental versions of FFV1 used the CABAC Arithmetic coder
707 from H.264 as defined in [ISO.14496-10.2014] but due to the uncertain
708 patent/royalty situation, as well as its slightly worse performance,
709 CABAC was replaced by a Range coder based on an algorithm defined by
710 G. Nigel N. Martin in 1979 [range-coding].
712 3.8.1.1. Range Binary Values
714 To encode binary digits efficiently a Range coder is used. C_(i) is
715 the i-th Context. B_(i) is the i-th byte of the bytestream. b_(i) is
716 the i-th Range coded binary value, S_(0, i) is the i-th initial
717 state. The length of the bytestream encoding n binary symbols is
718 j_(n) bytes.
720 r_(i) = floor( ( R_(i) * S_(i, C_(i)) ) / 2 ^ 8 )
722 Figure 9: A formula of the read of a binary value in Range Binary
723 mode.
725 S_(i + 1, C_(i)) = zero_state_(S_(i, C_(i))) AND
726 l_(i) = L_(i) AND
727 t_(i) = R_(i) - r_(i) <==
728 b_(i) = 0 <==>
729 L_(i) < R_(i) - r_(i)
731 S_(i + 1, C_(i)) = one_state_(S_(i, C_(i))) AND
732 l_(i) = L_(i) - R_(i) + r_(i) AND
733 t_(i) = r_(i) <==
734 b_(i) = 1 <==>
735 L_(i) >= R_(i) - r_(i)
736 Figure 10
738 S_(i + 1, k) = S_(i, k) <== C_(i) != k
740 Figure 11
742 R_(i + 1) = 2 ^ 8 * t_(i) AND
743 L_(i + 1) = 2 ^ 8 * l_(i) + B_(j_(i)) AND
744 j_(i + 1) = j_(i) + 1 <==
745 t_(i) < 2 ^ 8
747 R_(i + 1) = t_(i) AND
748 L_(i + 1) = l_(i) AND
749 j_(i + 1) = j_(i) <==
750 t_(i) >= 2 ^ 8
752 Figure 12
754 R_(0) = 65280
756 Figure 13
758 L_(0) = 2 ^ 8 * B_(0) + B_(1)
760 Figure 14
762 j_(0) = 2
764 Figure 15
range = 0xFF00;
end = 0;
low = get_bits(16);
if (low >= range) {
    low = range;
    end = 1;
}
774 Figure 16: A pseudo-code description of the initial states in
775 Range Binary mode.
refill() {
    if (range < 256) {
        range = range * 256;
        low = low * 256;
        if (!end) {
            low += get_bits(8);
            if (remaining_bits_in_bitstream( NumBytes ) == 0) {
                end = 1;
            }
        }
    }
}
790 Figure 17: A pseudo-code description of refilling the Range
791 Binary Value coder buffer.
get_rac(state) {
    rangeoff = (range * state) / 256;
    range -= rangeoff;
    if (low < range) {
        state = zero_state[state];
        refill();
        return 0;
    } else {
        low -= range;
        state = one_state[state];
        range = rangeoff;
        refill();
        return 1;
    }
}
809 Figure 18: A pseudo-code description of the read of a binary
810 value in Range Binary mode.
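The three pseudo-code fragments above can be combined into one small decoder. The following is a minimal C sketch, not the reference implementation: the "RangeDecoder" structure, the "rac_" function names, and the byte-buffer input are hypothetical, and the transition tables are supplied by the caller. Reading past the end of the buffer yields 0, which is an assumption modeled on "Closed mode".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder context; field names mirror the pseudo-code. */
typedef struct {
    uint32_t range, low;
    int end;                     /* set once all input bytes are consumed */
    const uint8_t *buf;
    size_t pos, len;
    const uint8_t *zero_state;   /* 256-entry transition tables, */
    const uint8_t *one_state;    /* supplied by the caller       */
} RangeDecoder;

/* Initial states: range = 0xFF00, low = first 16 bits (big endian). */
static void rac_init(RangeDecoder *c, const uint8_t *buf, size_t len,
                     const uint8_t *zero_state, const uint8_t *one_state) {
    assert(len >= 2);
    c->buf = buf;
    c->len = len;
    c->pos = 2;
    c->zero_state = zero_state;
    c->one_state = one_state;
    c->range = 0xFF00;
    c->end = 0;
    c->low = ((uint32_t)buf[0] << 8) | buf[1];
    if (c->low >= c->range) {
        c->low = c->range;
        c->end = 1;
    }
}

/* Renormalization: shift in one more byte when range drops below 256. */
static void rac_refill(RangeDecoder *c) {
    if (c->range < 256) {
        c->range *= 256;
        c->low *= 256;
        if (!c->end) {
            if (c->pos < c->len)        /* assumed: missing bytes read as 0 */
                c->low += c->buf[c->pos++];
            if (c->pos >= c->len)
                c->end = 1;
        }
    }
}

/* Reads one binary value and updates the state of its context. */
static int rac_get(RangeDecoder *c, uint8_t *state) {
    uint32_t rangeoff = (c->range * *state) / 256;
    c->range -= rangeoff;
    if (c->low < c->range) {
        *state = c->zero_state[*state];
        rac_refill(c);
        return 0;
    }
    c->low -= c->range;
    *state = c->one_state[*state];
    c->range = rangeoff;
    rac_refill(c);
    return 1;
}
```

A caller would invoke "rac_init" once per range coded bytestream and then "rac_get" once per binary value, passing a pointer to the state of the context in use.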
812 3.8.1.1.1. Termination
814 The range coder can be used in three modes.
816 * In "Open mode", every Symbol the reader attempts to read when
817 decoding is available. In this mode, arbitrary data may be
818 appended without affecting the range coder output. This mode is
819 not used in FFV1.
821 * In "Closed mode" the length in bytes of the bytestream is provided
822 to the range decoder. Bytes beyond the length are read as 0 by
823 the range decoder. The bytestream is generally one byte shorter
824 than in "Open mode".
826 * In "Sentinel mode" the exact length in bytes is not known and thus
827 the range decoder MAY read into the data that follows the range
828 coded bytestream by one byte. In "Sentinel mode", the end of the
829 range coded bytestream is a binary Symbol with state 129, whose
830 value SHALL be discarded. After reading this Symbol, the range
831 decoder will have read one byte beyond the end of the range coded
832 bytestream. This way the byte position of the end can be
833 determined. Bytestreams written in "Sentinel mode" can be read in
834 "Closed mode" if the length can be determined; in this case, the
835 last (sentinel) Symbol will be read non-corrupted and be of value
836 0.
838 The above describes range decoding. Encoding is defined as any
839 process which produces a decodable bytestream.
841 There are three places where range coder termination is needed in
842 FFV1. The first is in the "Configuration Record"; in this case, the
843 size of the range coded bytestream is known and it is handled as
844 "Closed mode". The second is the switch from the range coded "Slice
845 Header" to Golomb coded slices, handled as "Sentinel mode". The third
846 is the end of range coded Slices, which need to terminate before the
847 CRC at their end. This can be handled as "Sentinel mode" or as
848 "Closed mode" if the CRC position has been determined.
850 3.8.1.2. Range Non Binary Values
852 To encode scalar integers, it would be possible to encode each bit
853 separately and use the past bits as context. However, that would mean
854 255 contexts per 8-bit Symbol, which is not only a waste of memory but
855 also requires more past data to reach a reasonably good estimate of
856 the probabilities. Alternatively, assuming a Laplacian distribution
857 and only dealing with its variance and mean (as in Huffman coding)
858 would also be possible. However, for maximum flexibility and
859 simplicity, the chosen method uses a single Symbol to encode if a
860 number is 0, and if not, encodes the number using its exponent,
861 mantissa and sign. The exact contexts used are best described by
862 Figure 19.
864 int get_symbol(RangeCoder *c, uint8_t *state, int is_signed) {
865 if (get_rac(c, state + 0)) {
866 return 0;
867 }
869 int e = 0;
870 while (get_rac(c, state + 1 + min(e, 9))) { //1..10
871 e++;
872 }
874 int a = 1;
875 for (int i = e - 1; i >= 0; i--) {
876 a = a * 2 + get_rac(c, state + 22 + min(i, 9)); // 22..31
877 }
879 if (!is_signed) {
880 return a;
881 }
883 if (get_rac(c, state + 11 + min(e, 10))) { //11..21
884 return -a;
885 } else {
886 return a;
887 }
888 }
890 Figure 19: A pseudo-code description of the contexts of Range Non
891 Binary Values.
893 "get_symbol" is used for the read out of "sample_difference"
894 indicated in Figure 8.
896 "get_rac" returns a boolean, computed from the bytestream as
897 described in Figure 9 as a formula and in Figure 18 as pseudo-code.
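To make the binarization concrete, the following C sketch rebuilds a value from binary decisions that have already been decoded. It is an illustration only: "symbol_from_bits" and its array input are hypothetical stand-ins for successive "get_rac" calls, and the context selection of Figure 19 is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Rebuilds a value from binary decisions in the order get_symbol() reads
 * them: zero flag, unary exponent, mantissa bits (most significant
 * first), then the sign when is_signed is set.
 * `bits` and `n` are hypothetical; they replace the range decoder. */
static int symbol_from_bits(const int *bits, size_t n, int is_signed) {
    size_t pos = 0;
    assert(bits != NULL && n >= 1);

    if (bits[pos++])
        return 0;                      /* first decision: value is 0 */

    int e = 0;
    while (pos < n && bits[pos++])     /* unary exponent, ends on a 0 */
        e++;

    int a = 1;                         /* implicit leading one */
    for (int i = e - 1; i >= 0 && pos < n; i--)
        a = a * 2 + bits[pos++];

    if (!is_signed)
        return a;
    return (pos < n && bits[pos]) ? -a : a;
}
```

For example, the decisions 0, 1, 1, 0, 0, 1, 1 describe a non-zero value with exponent 2, mantissa bits 0 and 1 (giving 5), and a set sign bit.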
899 3.8.1.3. Initial Values for the Context Model
901 When "keyframe" (see Section 4.4) value is 1, all Range coder state
902 variables are set to their initial state.
904 3.8.1.4. State Transition Table
906 one_state_(i) =
907 default_state_transition_(i) + state_transition_delta_(i)
909 Figure 20
911 zero_state_(i) = 256 - one_state_(256-i)
912 Figure 21
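The two formulas above can be applied as follows. This is a minimal C sketch under stated assumptions: the function name is hypothetical, the tables are held in "int" arrays so that no truncation is implied, and index 0 is skipped because the formula for "zero_state" would otherwise need a non-existent entry 256 of "one_state".

```c
#include <assert.h>

/* Fills one_state[1..255] and zero_state[1..255] from a default table
 * and per-entry deltas (all zero when no custom table is stored). */
static void build_state_tables(const int default_transition[256],
                               const int delta[256],
                               int one_state[256], int zero_state[256]) {
    assert(default_transition && delta && one_state && zero_state);
    for (int i = 1; i < 256; i++)
        one_state[i] = default_transition[i] + delta[i];
    for (int i = 1; i < 256; i++)
        zero_state[i] = 256 - one_state[256 - i];
}
```

With the default table of Section 3.8.1.5 and all deltas set to 0, entry 128 of "one_state" is 134, so entry 128 of "zero_state" is 256 - 134 = 122.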
914 3.8.1.5. default_state_transition
916 0, 0, 0, 0, 0, 0, 0, 0, 20, 21, 22, 23, 24, 25, 26, 27,
918 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 37, 38, 39, 40, 41, 42,
920 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 56, 57,
922 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73,
924 74, 75, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88,
926 89, 90, 91, 92, 93, 94, 94, 95, 96, 97, 98, 99,100,101,102,103,
928 104,105,106,107,108,109,110,111,112,113,114,114,115,116,117,118,
930 119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,133,
932 134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,
934 150,151,152,152,153,154,155,156,157,158,159,160,161,162,163,164,
936 165,166,167,168,169,170,171,171,172,173,174,175,176,177,178,179,
938 180,181,182,183,184,185,186,187,188,189,190,190,191,192,194,194,
940 195,196,197,198,199,200,201,202,202,204,205,206,207,208,209,209,
942 210,211,212,213,215,215,216,217,218,219,220,220,222,223,224,225,
944 226,227,227,229,229,230,231,232,234,234,235,236,237,238,239,240,
946 241,242,243,244,245,246,247,248,248, 0, 0, 0, 0, 0, 0, 0,
948 3.8.1.6. Alternative State Transition Table
950 The alternative state transition table has been built using iterative
951 minimization of frame sizes and generally performs better than the
952 default. To use it, the "coder_type" (see Section 4.2.3) MUST be set
953 to 2 and the difference from the default MUST be stored in the
954 "Parameters", see Section 4.2. The reference implementation of FFV1
955 in FFmpeg uses Figure 22 by default at the time of this writing when
956 Range coding is used.
958 0, 10, 10, 10, 10, 16, 16, 16, 28, 16, 16, 29, 42, 49, 20, 49,
960 59, 25, 26, 26, 27, 31, 33, 33, 33, 34, 34, 37, 67, 38, 39, 39,
962 40, 40, 41, 79, 43, 44, 45, 45, 48, 48, 64, 50, 51, 52, 88, 52,
964 53, 74, 55, 57, 58, 58, 74, 60,101, 61, 62, 84, 66, 66, 68, 69,
966 87, 82, 71, 97, 73, 73, 82, 75,111, 77, 94, 78, 87, 81, 83, 97,
968 85, 83, 94, 86, 99, 89, 90, 99,111, 92, 93,134, 95, 98,105, 98,
970 105,110,102,108,102,118,103,106,106,113,109,112,114,112,116,125,
972 115,116,117,117,126,119,125,121,121,123,145,124,126,131,127,129,
974 165,130,132,138,133,135,145,136,137,139,146,141,143,142,144,148,
976 147,155,151,149,151,150,152,157,153,154,156,168,158,162,161,160,
978 172,163,169,164,166,184,167,170,177,174,171,173,182,176,180,178,
980 175,189,179,181,186,183,192,185,200,187,191,188,190,197,193,196,
982 197,194,195,196,198,202,199,201,210,203,207,204,205,206,208,214,
984 209,211,221,212,213,215,224,216,217,218,219,220,222,228,223,225,
986 226,224,227,229,240,230,231,232,233,234,235,236,238,239,237,242,
988 241,243,242,244,245,246,247,248,249,250,251,252,252,253,254,255,
990 Figure 22: Alternative state transition table for Range coding.
992 3.8.2. Golomb Rice Mode
994 The end of the bitstream of the Frame is padded with 0-bits until the
995 bitstream contains a multiple of 8 bits.
997 3.8.2.1. Signed Golomb Rice Codes
999 This coding mode uses Golomb Rice codes. The VLC is split into two
1000 parts. The prefix stores the most significant bits and the suffix
1001 stores the k least significant bits or stores the whole number in the
1002 ESC case.
1004 int get_ur_golomb(k) {
1005 for (prefix = 0; prefix < 12; prefix++) {
1006 if (get_bits(1)) {
1007 return get_bits(k) + (prefix << k);
1008 }
1009 }
1010 return get_bits(bits) + 11;
1011 }
1013 Figure 23: A pseudo-code description of the read of an unsigned
1014 integer in Golomb Rice mode.
1016 int get_sr_golomb(k) {
1017 v = get_ur_golomb(k);
1018 if (v & 1) return - (v >> 1) - 1;
1019 else return (v >> 1);
1020 }
1022 Figure 24: A pseudo-code description of the read of a signed
1023 integer in Golomb Rice mode.
1025 3.8.2.1.1. Prefix
1027 +================+=======+
1028 | bits | value |
1029 +================+=======+
1030 | 1 | 0 |
1031 +----------------+-------+
1032 | 01 | 1 |
1033 +----------------+-------+
1034 | ... | ... |
1035 +----------------+-------+
1036 | 0000 0000 01 | 9 |
1037 +----------------+-------+
1038 | 0000 0000 001 | 10 |
1039 +----------------+-------+
1040 | 0000 0000 0001 | 11 |
1041 +----------------+-------+
1042 | 0000 0000 0000 | ESC |
1043 +----------------+-------+
1045 Table 1
1047 3.8.2.1.2. Suffix
1049 +=========+========================================+
1050 +=========+========================================+
1051 | non ESC | the k least significant bits MSB first |
1052 +---------+----------------------------------------+
1053 | ESC | the value - 11, in MSB first order |
1054 +---------+----------------------------------------+
1056 Table 2
1058 ESC MUST NOT be used if the value can be coded as non ESC.
1060 3.8.2.1.3. Examples
1062 Table 3 shows practical examples of how Signed Golomb Rice Codes are
1063 decoded based on the series of bits extracted from the bitstream as
1064 described by the method above:
1066 +=====+=======================+=======+
1067 | k | bits | value |
1068 +=====+=======================+=======+
1069 | 0 | 1 | 0 |
1070 +-----+-----------------------+-------+
1071 | 0 | 001 | 2 |
1072 +-----+-----------------------+-------+
1073 | 2 | 1 00 | 0 |
1074 +-----+-----------------------+-------+
1075 | 2 | 1 10 | 2 |
1076 +-----+-----------------------+-------+
1077 | 2 | 01 01 | 5 |
1078 +-----+-----------------------+-------+
1079 | any | 000000000000 10000000 | 139 |
1080 +-----+-----------------------+-------+
1082 Table 3: Examples of decoded Signed
1083 Golomb Rice Codes.
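The rows of Table 3 can be reproduced with a small self-contained C sketch of Figure 23 and Figure 24. The "BitReader" type, which reads from a string of '0' and '1' characters, and the "esc_bits" parameter, which plays the role of "bits" in the pseudo-code, are hypothetical and introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit source: a string of '0'/'1' characters read from
 * left to right, standing in for the bitstream. */
typedef struct {
    const char *bits;
    size_t pos;
} BitReader;

static unsigned get_bits(BitReader *br, int n) {
    unsigned v = 0;
    while (n-- > 0) {
        assert(br->bits[br->pos] == '0' || br->bits[br->pos] == '1');
        v = (v << 1) | (unsigned)(br->bits[br->pos++] - '0');
    }
    return v;
}

/* Unsigned read: unary prefix, then k suffix bits, or the ESC case. */
static unsigned get_ur_golomb(BitReader *br, int k, int esc_bits) {
    for (unsigned prefix = 0; prefix < 12; prefix++) {
        if (get_bits(br, 1))
            return get_bits(br, k) + (prefix << k);
    }
    return get_bits(br, esc_bits) + 11;   /* ESC: value - 11 is stored */
}

/* Signed read: odd codes map to negative values, even to non-negative. */
static int get_sr_golomb(BitReader *br, int k, int esc_bits) {
    unsigned v = get_ur_golomb(br, k, esc_bits);
    return (v & 1) ? -(int)(v >> 1) - 1 : (int)(v >> 1);
}
```

Reading the bits "01 01" with k = 2 through "get_ur_golomb" gives 5, as in Table 3; the same bits read through "get_sr_golomb" give -3.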
1085 3.8.2.2. Run Mode
1087 Run mode is entered when the context is 0 and left as soon as a non-0
1088 difference is found. The sample difference is identical to the
1089 predicted one. The run and the first different sample difference are
1090 coded as defined in Section 3.8.2.4.1.
1092 3.8.2.2.1. Run Length Coding
1094 The run value is encoded in two parts. The prefix part stores the
1095 more significant part of the run as well as adjusting the "run_index"
1096 that determines the number of bits in the less significant part of
1097 the run. The second part of the value stores the less significant
1098 part of the run as it is. The "run_index" is reset to 0 for each
1099 Plane and slice.
1101 log2_run[41] = {
1102 0, 0, 0, 0, 1, 1, 1, 1,
1103 2, 2, 2, 2, 3, 3, 3, 3,
1104 4, 4, 5, 5, 6, 6, 7, 7,
1105 8, 9,10,11,12,13,14,15,
1106 16,17,18,19,20,21,22,23,
1107 24,
1108 };
1110 if (run_count == 0 && run_mode == 1) {
1111 if (get_bits(1)) {
1112 run_count = 1 << log2_run[run_index];
1113 if (x + run_count <= w) {
1114 run_index++;
1115 }
1116 } else {
1117 if (log2_run[run_index]) {
1118 run_count = get_bits(log2_run[run_index]);
1119 } else {
1120 run_count = 0;
1121 }
1122 if (run_index) {
1123 run_index--;
1124 }
1125 run_mode = 2;
1126 }
1127 }
1129 The "log2_run" array is also used within [ISO.14495-1.1999].
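The run length fragment above can be exercised in isolation. The following C sketch wraps it in a function; the "BitReader" and "RunState" types and the function name "read_run" are hypothetical, and the bit source is a string of '0' and '1' characters chosen only to keep the example readable.

```c
#include <assert.h>
#include <stddef.h>

static const int log2_run[41] = {
     0,  0,  0,  0,  1,  1,  1,  1,
     2,  2,  2,  2,  3,  3,  3,  3,
     4,  4,  5,  5,  6,  6,  7,  7,
     8,  9, 10, 11, 12, 13, 14, 15,
    16, 17, 18, 19, 20, 21, 22, 23,
    24,
};

/* Hypothetical bit source and run state, for illustration only. */
typedef struct { const char *bits; size_t pos; } BitReader;
typedef struct { int run_index, run_count, run_mode; } RunState;

static unsigned get_bits(BitReader *br, int n) {
    unsigned v = 0;
    while (n-- > 0) {
        assert(br->bits[br->pos] == '0' || br->bits[br->pos] == '1');
        v = (v << 1) | (unsigned)(br->bits[br->pos++] - '0');
    }
    return v;
}

/* One step of run length decoding at horizontal position x of width w. */
static void read_run(BitReader *br, RunState *s, int x, int w) {
    assert(s->run_index >= 0 && s->run_index < 41);
    if (s->run_count == 0 && s->run_mode == 1) {
        if (get_bits(br, 1)) {
            /* Full run: its length doubles as run_index grows. */
            s->run_count = 1 << log2_run[s->run_index];
            if (x + s->run_count <= w)
                s->run_index++;
        } else {
            /* Partial run: the remainder is stored explicitly. */
            if (log2_run[s->run_index])
                s->run_count = (int)get_bits(br, log2_run[s->run_index]);
            else
                s->run_count = 0;
            if (s->run_index)
                s->run_index--;
            s->run_mode = 2;
        }
    }
}
```

Starting from "run_index" 4, the bits "01" yield a partial run of length 1, decrement "run_index" to 3, and set "run_mode" to 2.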
1131 3.8.2.3. Sign extension
1133 "sign_extend" is the function of increasing the number of bits of an
1134 input binary number in two's complement signed number representation
1135 while preserving the input number's sign (positive/negative) and
1136 value, in order to fit in the output bit width. It MAY be computed
1137 with:
1139 sign_extend(input_number, input_bits) {
1140 negative_bias = 1 << (input_bits - 1);
1141 bits_mask = negative_bias - 1;
1142 output_number = input_number & bits_mask; // Remove negative bit
1143 is_negative = input_number & negative_bias; // Test negative bit
1144 if (is_negative)
1145 output_number -= negative_bias;
1146 return output_number;
1147 }
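A typed C version of the same computation is sketched below. The fixed-width types and the restriction of "input_bits" to the range 1 to 31 are assumptions made for this example; bits above "input_bits" in the input are ignored, as in the pseudo-code.

```c
#include <assert.h>
#include <stdint.h>

/* Interprets the low input_bits of input_number as a two's complement
 * value and returns it widened to 32 bits. */
static int32_t sign_extend(uint32_t input_number, int input_bits) {
    assert(input_bits >= 1 && input_bits <= 31);
    uint32_t negative_bias = UINT32_C(1) << (input_bits - 1);
    uint32_t bits_mask = negative_bias - 1;
    int32_t output_number = (int32_t)(input_number & bits_mask);
    if (input_number & negative_bias)            /* negative bit is set */
        output_number -= (int32_t)negative_bias;
    return output_number;
}
```

For an 8-bit input, 0x7F stays 127, 0x80 becomes -128, and 0xFF becomes -1.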
1149 3.8.2.4. Scalar Mode
1151 Each difference is coded with the per context mean prediction removed
1152 and a per context value for k.
1154 get_vlc_symbol(state) {
1155 i = state->count;
1156 k = 0;
1157 while (i < state->error_sum) {
1158 k++;
1159 i += i;
1160 }
1162 v = get_sr_golomb(k);
1164 if (2 * state->drift < -state->count) {
1165 v = -1 - v;
1166 }
1168 ret = sign_extend(v + state->bias, bits);
1170 state->error_sum += abs(v);
1171 state->drift += v;
1173 if (state->count == 128) {
1174 state->count >>= 1;
1175 state->drift >>= 1;
1176 state->error_sum >>= 1;
1177 }
1178 state->count++;
1179 if (state->drift <= -state->count) {
1180 state->bias = max(state->bias - 1, -128);
1182 state->drift = max(state->drift + state->count,
1183 -state->count + 1);
1184 } else if (state->drift > 0) {
1185 state->bias = min(state->bias + 1, 127);
1187 state->drift = min(state->drift - state->count, 0);
1188 }
1190 return ret;
1191 }
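The adaptive part of "get_vlc_symbol" can be separated from the bit reading. The C sketch below isolates the selection of k and the state update; the "VlcState" type and the function names are hypothetical, and "v" is the value after the optional "v = -1 - v" step. The right shift of a negative "drift" relies on arithmetic shifting, which C leaves implementation-defined.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int drift, error_sum, bias, count; } VlcState;

static int imin(int a, int b) { return a < b ? a : b; }
static int imax(int a, int b) { return a > b ? a : b; }

/* Smallest k such that count * 2^k >= error_sum. */
static int vlc_k(const VlcState *s) {
    assert(s->count > 0);
    int i = s->count, k = 0;
    while (i < s->error_sum) {
        k++;
        i += i;
    }
    return k;
}

/* State update performed after each decoded value v. */
static void vlc_update(VlcState *s, int v) {
    s->error_sum += abs(v);
    s->drift += v;
    if (s->count == 128) {          /* halve the history periodically */
        s->count >>= 1;
        s->drift >>= 1;
        s->error_sum >>= 1;
    }
    s->count++;
    if (s->drift <= -s->count) {
        s->bias = imax(s->bias - 1, -128);
        s->drift = imax(s->drift + s->count, -s->count + 1);
    } else if (s->drift > 0) {
        s->bias = imin(s->bias + 1, 127);
        s->drift = imin(s->drift - s->count, 0);
    }
}
```

With "count" = 1 and "error_sum" = 4, the first value is read with k = 2.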
1193 3.8.2.4.1. Golomb Rice Sample Difference Coding
1195 Level coding is identical to the normal difference coding with the
1196 exception that the 0 value is removed as it cannot occur:
1198 diff = get_vlc_symbol(context_state);
1199 if (diff >= 0) {
1200 diff++;
1201 }
1203 Note, this is different from JPEG-LS, which doesn't use prediction in
1204 run mode and uses a different encoding and context model for the last
1205 difference. On a small set of test Samples the use of prediction
1206 slightly improved the compression rate.
1208 3.8.2.5. Initial Values for the VLC context state
1210 When "keyframe" (see Section 4.4) value is 1, all coder state
1211 variables are set to their initial state.
1213 drift = 0;
1214 error_sum = 4;
1215 bias = 0;
1216 count = 1;
1218 4. Bitstream
1220 An FFV1 bitstream is composed of a series of one or more Frames and
1221 (when required) a "Configuration Record".
1223 Within the following sub-sections, pseudo-code is used, as described
1224 in Section 2.2.1, to explain the structure of each FFV1 bitstream
1225 component. Table 4 lists symbols used to annotate that pseudo-code
1226 in order to define the storage of the data referenced in that line of
1227 pseudo-code.
1229 +========+=================================================+
1230 | Symbol | Definition |
1231 +========+=================================================+
1232 | u(n) | unsigned big endian integer Symbol using n bits |
1233 +--------+-------------------------------------------------+
1234 | sg | Golomb Rice coded signed scalar Symbol coded |
1235 | | with the method described in Section 3.8.2 |
1236 +--------+-------------------------------------------------+
1237 | br | Range coded Boolean (1-bit) Symbol with the |
1238 | | method described in Section 3.8.1.1 |
1239 +--------+-------------------------------------------------+
1240 | ur | Range coded unsigned scalar Symbol coded with |
1241 | | the method described in Section 3.8.1.2 |
1242 +--------+-------------------------------------------------+
1243 | sr | Range coded signed scalar Symbol coded with the |
1244 | | method described in Section 3.8.1.2 |
1245 +--------+-------------------------------------------------+
1246 | sd | Sample difference Symbol coded with the method |
1247 | | described in Section 3.8 |
1248 +--------+-------------------------------------------------+
1250 Table 4: Definition of pseudo-code symbols for this
1251 document.
1253 The following MUST be provided by external means during
1254 initialization of the decoder:
1256 "frame_pixel_width" is defined as Frame width in Pixels.
1258 "frame_pixel_height" is defined as Frame height in Pixels.
1260 Default values at the decoder initialization phase:
1262 "ConfigurationRecordIsPresent" is set to 0.
1264 4.1. Quantization Table Set
1266 The Quantization Table Sets are stored by storing the number of equal
1267 entries -1 of the first half of the table (represented as "len - 1"
1268 in the pseudo-code below) using the method described in
1269 Section 3.8.1.2. The second half doesn't need to be stored as it is
1270 identical to the first with flipped sign. "scale" and "len_count[ i
1271 ][ j ]" are temporary values used for the computing of
1272 "context_count[ i ]" and are not used outside Quantization Table Set
1273 pseudo-code.
1275 Example:
1277 Table: 0 0 1 1 1 1 2 2 -2 -2 -2 -1 -1 -1 -1 0
1279 Stored values: 1, 3, 1
1281 "QuantizationTableSet" has its own initial states, all set to 128.
1283 pseudo-code | type
1284 --------------------------------------------------------------|-----
1285 QuantizationTableSet( i ) { |
1286 scale = 1 |
1287 for (j = 0; j < MAX_CONTEXT_INPUTS; j++) { |
1288 QuantizationTable( i, j, scale ) |
1289 scale *= 2 * len_count[ i ][ j ] - 1 |
1290 } |
1291 context_count[ i ] = ceil( scale / 2 ) |
1292 } |
1294 "MAX_CONTEXT_INPUTS" is 5.
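The computation of "context_count[ i ]" can be checked with a few lines of C. In this sketch the function name is hypothetical and a 64-bit accumulator is used as a precaution, since the product of five factors can exceed 32 bits for large tables.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CONTEXT_INPUTS 5

/* Multiplies the number of signed levels (2 * len_count - 1) of each
 * table and halves the result, rounding up. */
static int64_t context_count_from_len(const int len_count[MAX_CONTEXT_INPUTS]) {
    int64_t scale = 1;
    for (int j = 0; j < MAX_CONTEXT_INPUTS; j++) {
        assert(len_count[j] >= 1);
        scale *= 2 * len_count[j] - 1;
    }
    return (scale + 1) / 2;     /* ceil(scale / 2) for positive scale */
}
```

For instance, three tables with 6 distinct non-negative levels each and two tables with a single level give 11 * 11 * 11 = 1331 combinations and therefore 666 contexts.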
1296 pseudo-code | type
1297 --------------------------------------------------------------|-----
1298 QuantizationTable(i, j, scale) { |
1299 v = 0 |
1300 for (k = 0; k < 128;) { |
1301 len - 1 | ur
1302 for (n = 0; n < len; n++) { |
1303 quant_tables[ i ][ j ][ k ] = scale * v |
1304 k++ |
1305 } |
1306 v++ |
1307 } |
1308 for (k = 1; k < 128; k++) { |
1309 quant_tables[ i ][ j ][ 256 - k ] = \ |
1310 -quant_tables[ i ][ j ][ k ] |
1311 } |
1312 quant_tables[ i ][ j ][ 128 ] = \ |
1313 -quant_tables[ i ][ j ][ 127 ] |
1314 len_count[ i ][ j ] = v |
1315 } |
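The table expansion can be tried on the 16-entry example given earlier in this section. The C sketch below generalizes the pseudo-code with a "half" parameter (128 in FFV1, 8 for the example); this parameter, the function name, and the array input holding the stored "len - 1" values are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Expands stored run lengths into a table of 2 * half entries and
 * returns the number of distinct levels (len_count in the pseudo-code).
 * The extra bounds checks are a precaution not present in the original. */
static int build_quant_table(const int *len_minus1, size_t n_runs,
                             int half, int scale, int *table) {
    int v = 0, k = 0;
    for (size_t r = 0; r < n_runs && k < half; r++) {
        int len = len_minus1[r] + 1;
        for (int n = 0; n < len && k < half; n++)
            table[k++] = scale * v;
        v++;
    }
    assert(k == half);              /* the runs must cover the first half */

    /* The second half mirrors the first with flipped sign. */
    for (k = 1; k < half; k++)
        table[2 * half - k] = -table[k];
    table[half] = -table[half - 1];
    return v;
}
```

With the stored values 1, 3, 1, "half" = 8, and "scale" = 1, the function returns 3 and produces the table 0 0 1 1 1 1 2 2 -2 -2 -2 -1 -1 -1 -1 0.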
1317 4.1.1. quant_tables
1319 "quant_tables[ i ][ j ][ k ]" indicates the quantization table
1320 value of the Quantized Sample Difference "k" of the Quantization
1321 Table "j" of the Quantization Table Set "i".
1323 4.1.2. context_count
1325 "context_count[ i ]" indicates the count of contexts for Quantization
1326 Table Set "i". "context_count[ i ]" MUST be less than or equal to
1327 32768.
1329 4.2. Parameters
1331 The "Parameters" section contains significant characteristics about
1332 the decoding configuration used for all instances of Frame (in FFV1
1333 version 0 and 1) or the whole FFV1 bitstream (other versions),
1334 including the stream version, color configuration, and quantization
1335 tables. Figure 25 describes the contents of the bitstream.
1337 "Parameters" has its own initial states, all set to 128.
1339 pseudo-code | type
1340 --------------------------------------------------------------|-----
1341 Parameters( ) { |
1342 version | ur
1343 if (version >= 3) { |
1344 micro_version | ur
1345 } |
1346 coder_type | ur
1347 if (coder_type > 1) { |
1348 for (i = 1; i < 256; i++) { |
1349 state_transition_delta[ i ] | sr
1350 } |
1351 } |
1352 colorspace_type | ur
1353 if (version >= 1) { |
1354 bits_per_raw_sample | ur
1355 } |
1356 chroma_planes | br
1357 log2_h_chroma_subsample | ur
1358 log2_v_chroma_subsample | ur
1359 extra_plane | br
1360 if (version >= 3) { |
1361 num_h_slices - 1 | ur
1362 num_v_slices - 1 | ur
1363 quant_table_set_count | ur
1364 } |
1365 for (i = 0; i < quant_table_set_count; i++) { |
1366 QuantizationTableSet( i ) |
1367 } |
1368 if (version >= 3) { |
1369 for (i = 0; i < quant_table_set_count; i++) { |
1370 states_coded | br
1371 if (states_coded) { |
1372 for (j = 0; j < context_count[ i ]; j++) { |
1373 for (k = 0; k < CONTEXT_SIZE; k++) { |
1374 initial_state_delta[ i ][ j ][ k ] | sr
1375 } |
1376 } |
1377 } |
1378 } |
1379 ec | ur
1380 intra | ur
1381 } |
1382 } |
1384 Figure 25: A pseudo-code description of the bitstream contents.
1386 CONTEXT_SIZE is 32.
1388 4.2.1. version
1390 "version" specifies the version of the FFV1 bitstream.
1392 Each version is incompatible with other versions: decoders SHOULD
1393 reject FFV1 bitstreams due to an unknown version.
1395 Decoders SHOULD reject FFV1 bitstreams with version <= 1 &&
1396 ConfigurationRecordIsPresent == 1.
1398 Decoders SHOULD reject FFV1 bitstreams with version >= 3 &&
1399 ConfigurationRecordIsPresent == 0.
1401 +=======+=========================+
1402 | value | version |
1403 +=======+=========================+
1404 | 0 | FFV1 version 0 |
1405 +-------+-------------------------+
1406 | 1 | FFV1 version 1 |
1407 +-------+-------------------------+
1408 | 2 | reserved* |
1409 +-------+-------------------------+
1410 | 3 | FFV1 version 3 |
1411 +-------+-------------------------+
1412 | Other | reserved for future use |
1413 +-------+-------------------------+
1415 Table 5
1417 * Version 2 was experimental and this document does not describe it.
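A decoder might apply the rules of this section along the following lines. This C sketch is an illustration only: the function name is hypothetical, and it treats every SHOULD as a hard rejection, which is one possible policy rather than a requirement.

```c
#include <assert.h>

/* Returns 1 when the combination of version and Configuration Record
 * presence is one that this document describes, 0 otherwise. */
static int version_is_acceptable(int version,
                                 int configuration_record_is_present) {
    assert(configuration_record_is_present == 0 ||
           configuration_record_is_present == 1);
    switch (version) {
    case 0:
    case 1:
        return !configuration_record_is_present;
    case 3:
        return configuration_record_is_present;
    default:
        return 0;   /* version 2 and all other values are reserved */
    }
}
```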
1419 4.2.2. micro_version
1421 "micro_version" specifies the micro-version of the FFV1 bitstream.
1423 After a version is considered stable (a micro-version value is
1424 assigned to be the first stable variant of a specific version), each
1425 new micro-version after this first stable variant is compatible with
1426 the previous micro-version: decoders SHOULD NOT reject FFV1
1427 bitstreams due to an unknown micro-version equal to or above the
1428 micro-version considered as stable.
1430 Meaning of "micro_version" for "version" 3:
1432 +=======+=========================+
1433 | value | micro_version |
1434 +=======+=========================+
1435 | 0...3 | reserved* |
1436 +-------+-------------------------+
1437 | 4 | first stable variant |
1438 +-------+-------------------------+
1439 | Other | reserved for future use |
1440 +-------+-------------------------+
1442 Table 6: The definitions for
1443 "micro_version" values for FFV1
1444 version 3.
1446 * Development versions may be incompatible with the stable variants.
1448 4.2.3. coder_type
1450 "coder_type" specifies the coder used.
1452 +=======+=================================================+
1453 | value | coder used |
1454 +=======+=================================================+
1455 | 0 | Golomb Rice |
1456 +-------+-------------------------------------------------+
1457 | 1 | Range Coder with default state transition table |
1458 +-------+-------------------------------------------------+
1459 | 2 | Range Coder with custom state transition table |
1460 +-------+-------------------------------------------------+
1461 | Other | reserved for future use |
1462 +-------+-------------------------------------------------+
1464 Table 7
1466 Restrictions:
1468 If "coder_type" is 0, then "bits_per_raw_sample" SHOULD NOT be > 8.
1470 Background: At the time of this writing, there is no known
1471 implementation of an FFV1 bitstream supporting the Golomb Rice
1472 algorithm with "bits_per_raw_sample" greater than 8, and Range Coder
1473 is preferred.
1475 4.2.4. state_transition_delta
1477 "state_transition_delta" specifies the Range coder custom state
1478 transition table.
1480 If "state_transition_delta" is not present in the FFV1 bitstream, all
1481 Range coder custom state transition table elements are assumed to be
1482 0.
1484 4.2.5. colorspace_type
1486 "colorspace_type" specifies the color space encoded, the pixel
1487 transformation used by the encoder, the extra plane content, as well
1488 as interleave method.
1490 +=======+==============+================+==============+============+
1491 | value | color space | pixel | extra plane | interleave |
1492 | | encoded | transformation | content | method |
1493 +=======+==============+================+==============+============+
1494 | 0 | YCbCr | None | Transparency | Plane then |
1495 | | | | | Line |
1496 +-------+--------------+----------------+--------------+------------+
1497 | 1 | RGB | JPEG2000-RCT | Transparency | Line then |
1498 | | | | | Plane |
1499 +-------+--------------+----------------+--------------+------------+
1500 | Other | reserved | reserved for | reserved for | reserved |
1501 | | for future | future use | future use | for future |
1502 | | use | | | use |
1503 +-------+--------------+----------------+--------------+------------+
1505 Table 8
1507 FFV1 bitstreams with "colorspace_type" == 1 && ("chroma_planes" !=
1508 1 || "log2_h_chroma_subsample" != 0 || "log2_v_chroma_subsample" !=
1509 0) are not part of this specification.
1511 4.2.6. chroma_planes
1513 "chroma_planes" indicates if chroma (color) Planes are present.
1515 +=======+===============================+
1516 | value | presence |
1517 +=======+===============================+
1518 | 0 | chroma Planes are not present |
1519 +-------+-------------------------------+
1520 | 1 | chroma Planes are present |
1521 +-------+-------------------------------+
1523 Table 9
1525 4.2.7. bits_per_raw_sample
1527 "bits_per_raw_sample" indicates the number of bits for each Sample.
1528 Inferred to be 8 if not present.
1530 +=======+=================================+
1531 | value | bits for each sample |
1532 +=======+=================================+
1533 | 0 | reserved* |
1534 +-------+---------------------------------+
1535 | Other | the actual bits for each Sample |
1536 +-------+---------------------------------+
1538 Table 10
1540 * Encoders MUST NOT store "bits_per_raw_sample" = 0. Decoders SHOULD
1541 accept and interpret "bits_per_raw_sample" = 0 as 8.
1543 4.2.8. log2_h_chroma_subsample
1545 "log2_h_chroma_subsample" indicates the subsample factor, stored in
1546 powers to which the number 2 is raised, between luma and chroma width
1547 ("chroma_width = 2 ^ -log2_h_chroma_subsample * luma_width").
1549 4.2.9. log2_v_chroma_subsample
1551 "log2_v_chroma_subsample" indicates the subsample factor, stored in
1552 powers to which the number 2 is raised, between luma and chroma
1553 height ("chroma_height = 2 ^ -log2_v_chroma_subsample *
1554 luma_height").
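The two subsample factors translate into chroma Plane dimensions as sketched below in C. The formulas above do not state how a luma dimension that is not a multiple of the subsample factor is rounded; rounding up is an assumption made for this example, as is the function name.

```c
#include <assert.h>

/* Chroma width or height for a given luma width or height.
 * Rounding up on odd sizes is an assumption of this sketch. */
static int chroma_dimension(int luma_dimension, int log2_subsample) {
    assert(luma_dimension >= 0);
    assert(log2_subsample >= 0 && log2_subsample < 16);
    int step = 1 << log2_subsample;
    return (luma_dimension + step - 1) >> log2_subsample;
}
```

A 1920 x 1080 luma Plane with both factors set to 1 gives 960 x 540 chroma Planes.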
1556 4.2.10. extra_plane
1558 "extra_plane" indicates if an extra Plane is present.
1560 +=======+============================+
1561 | value | presence |
1562 +=======+============================+
1563 | 0 | extra Plane is not present |
1564 +-------+----------------------------+
1565 | 1 | extra Plane is present |
1566 +-------+----------------------------+
1568 Table 11
1570 4.2.11. num_h_slices
1572 "num_h_slices" indicates the number of horizontal elements of the
1573 slice raster.
1575 Inferred to be 1 if not present.
1577 4.2.12. num_v_slices
1579 "num_v_slices" indicates the number of vertical elements of the slice
1580 raster.
1582 Inferred to be 1 if not present.
1584 4.2.13. quant_table_set_count
1586 "quant_table_set_count" indicates the number of Quantization
1587 Table Sets. "quant_table_set_count" MUST be less than or equal to 8.
1589 Inferred to be 1 if not present.
1591 MUST NOT be 0.
1593 4.2.14. states_coded
1595 "states_coded" indicates if the respective Quantization Table Set has
1596 the initial states coded.
1598 Inferred to be 0 if not present.
1600 +=======+================================+
1601 | value | initial states |
1602 +=======+================================+
1603 | 0 | initial states are not present |
1604 | | and are assumed to be all 128 |
1605 +-------+--------------------------------+
1606 | 1 | initial states are present |
1607 +-------+--------------------------------+
1609 Table 12
1611 4.2.15. initial_state_delta
1613 "initial_state_delta[ i ][ j ][ k ]" indicates the initial Range coder
1614 state; it is encoded using "k" as context index and the predictor:
1616 pred = j ? initial_state[ i ][ j - 1 ][ k ] : 128
1617 Figure 26
1619 initial_state[ i ][ j ][ k ] =
1620 ( pred + initial_state_delta[ i ][ j ][ k ] ) & 255
1622 Figure 27
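For one Quantization Table Set, the reconstruction of the initial states from their deltas could look like the following C sketch. The function name and the array-based interface are hypothetical; the deltas are assumed to have been read already.

```c
#include <assert.h>
#include <stdint.h>

#define CONTEXT_SIZE 32

/* Each state is predicted from the same position k of the previous
 * context (128 for the first context) and wraps modulo 256. */
static void decode_initial_states(int context_count,
                                  int delta[][CONTEXT_SIZE],
                                  uint8_t initial_state[][CONTEXT_SIZE]) {
    assert(context_count >= 0);
    for (int j = 0; j < context_count; j++) {
        for (int k = 0; k < CONTEXT_SIZE; k++) {
            int pred = j ? initial_state[j - 1][k] : 128;
            initial_state[j][k] = (uint8_t)((pred + delta[j][k]) & 255);
        }
    }
}
```

A delta of -5 in the first context gives a state of 123; a delta of 10 at the same position in the next context gives 133.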
1624 4.2.16. ec
1626 "ec" indicates the error detection/correction type.
1628 +=======+=================================================+
1629 | value | error detection/correction type |
1630 +=======+=================================================+
1631 | 0 | 32-bit CRC in "ConfigurationRecord" |
1632 +-------+-------------------------------------------------+
1633 | 1 | 32-bit CRC in "Slice" and "ConfigurationRecord" |
1634 +-------+-------------------------------------------------+
1635 | Other | reserved for future use |
1636 +-------+-------------------------------------------------+
1638 Table 13
1640 4.2.17. intra
1642 "intra" indicates the constraint on "keyframe" in each instance of
1643 Frame.
1645 Inferred to be 0 if not present.
1647 +=======+=======================================================+
1648 | value | relationship |
1649 +=======+=======================================================+
1650 | 0 | "keyframe" can be 0 or 1 (non keyframes or keyframes) |
1651 +-------+-------------------------------------------------------+
1652 | 1 | "keyframe" MUST be 1 (keyframes only) |
1653 +-------+-------------------------------------------------------+
1654 | Other | reserved for future use |
1655 +-------+-------------------------------------------------------+
1657 Table 14
1659 4.3. Configuration Record
1661 In the case of an FFV1 bitstream with "version >= 3", a "Configuration
1662 Record" is stored in the underlying Container as described in
1663 Section 4.3.3. It contains the "Parameters" used for all instances
1664 of Frame. The size of the "Configuration Record", "NumBytes", is
1665 supplied by the underlying Container.
1667 pseudo-code | type
1668 -----------------------------------------------------------|-----
1669 ConfigurationRecord( NumBytes ) { |
1670 ConfigurationRecordIsPresent = 1 |
1671 Parameters( ) |
1672 while (remaining_symbols_in_syntax(NumBytes - 4)) { |
1673 reserved_for_future_use | br/ur/sr
1674 } |
1675 configuration_record_crc_parity | u(32)
1676 } |
1678 4.3.1. reserved_for_future_use
1680 "reserved_for_future_use" is a placeholder for future updates of this
1681 specification.
1683 Encoders conforming to this version of this specification SHALL NOT
1684 write "reserved_for_future_use".
1686 Decoders conforming to this version of this specification SHALL
1687 ignore "reserved_for_future_use".
1689 4.3.2. configuration_record_crc_parity
1691 "configuration_record_crc_parity" is 32 bits that are chosen so that
1692 the "Configuration Record" as a whole has a CRC remainder of 0.
1694 This is equivalent to storing the CRC remainder in the 32-bit parity.
1696 The CRC generator polynomial used is described in Section 4.9.3.
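The zero-remainder property can be demonstrated with a bitwise CRC. The C sketch below assumes the parameters given in Section 4.9.3, which lies outside this section: the standard IEEE polynomial (0x04C11DB7), an initial value of 0, and no pre-inversion or post-inversion. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32, most significant bit first, initial value 0, no final
 * inversion (assumed parameters, see the lead-in). */
static uint32_t crc32_msb_first(const uint8_t *data, size_t len) {
    uint32_t crc = 0;
    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint32_t)data[i] << 24;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x80000000u) ? (crc << 1) ^ 0x04C11DB7u : crc << 1;
    }
    return crc;
}

/* Stores the CRC of the first payload_len bytes, big endian, in the
 * four bytes that follow them. */
static void append_crc_parity(uint8_t *record, size_t payload_len) {
    uint32_t crc = crc32_msb_first(record, payload_len);
    record[payload_len + 0] = (uint8_t)(crc >> 24);
    record[payload_len + 1] = (uint8_t)(crc >> 16);
    record[payload_len + 2] = (uint8_t)(crc >> 8);
    record[payload_len + 3] = (uint8_t)crc;
}
```

A decoder can then verify a "Configuration Record" by running the same CRC over all "NumBytes" bytes, parity included, and checking that the result is 0.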
1698 4.3.3. Mapping FFV1 into Containers
1700 This "Configuration Record" can be placed in any file format
1701 supporting "Configuration Records", fitting as closely as possible
1702 with how the file format stores "Configuration Records". The
1703 "Configuration Record" storage place and "NumBytes" are currently
1704 defined and supported by this version of this specification for the
1705 following formats:
1707 4.3.3.1. AVI File Format
1709 The "Configuration Record" extends the stream format chunk ("AVI ",
1710 "hdrl", "strl", "strf") with the ConfigurationRecord bitstream.
1712 See [AVI] for more information about chunks.
1714 "NumBytes" is defined as the size, in bytes, of the strf chunk
1715 indicated in the chunk header minus the size of the stream format
1716 structure.
1718 4.3.3.2. ISO Base Media File Format
1720 The "Configuration Record" extends the sample description box
1721 ("moov", "trak", "mdia", "minf", "stbl", "stsd") with a "glbl" box
1722 that contains the ConfigurationRecord bitstream. See
1723 [ISO.14496-12.2015] for more information about boxes.
1725 "NumBytes" is defined as the size, in bytes, of the "glbl" box
1726 indicated in the box header minus the size of the box header.
1728 4.3.3.3. NUT File Format
1730 The "codec_specific_data" element (in "stream_header" packet)
1731 contains the ConfigurationRecord bitstream. See [NUT] for more
1732 information about elements.
1734 "NumBytes" is defined as the size, in bytes, of the
1735 "codec_specific_data" element as indicated in the "length" field of
1736 "codec_specific_data".
1738 4.3.3.4. Matroska File Format
1740 FFV1 SHOULD use "V_FFV1" as the Matroska "Codec ID". For FFV1
1741 versions 2 or less, the Matroska "CodecPrivate" Element SHOULD NOT be
1742 used. For FFV1 versions 3 or greater, the Matroska "CodecPrivate"
1743 Element MUST contain the FFV1 "Configuration Record" structure and no
1744 other data. See [Matroska] for more information about elements.
1746 "NumBytes" is defined as the "Element Data Size" of the
1747 "CodecPrivate" Element.
1749 4.4. Frame
1751 A Frame is an encoded representation of a complete static image. The
1752 whole Frame is provided by the underlying container.
1754 A Frame consists of the "keyframe" field, "Parameters" (if "version"
1755 <= 1), and a sequence of independent slices. The pseudo-code below
1756 describes the contents of a Frame.
1758 The "keyframe" field has its own initial state, set to 128.
1760 pseudo-code                                                   | type
1761 --------------------------------------------------------------|-----
1762 Frame( NumBytes ) {                                           |
1763     keyframe                                                  | br
1764     if (keyframe && !ConfigurationRecordIsPresent) {          |
1765         Parameters( )                                         |
1766     }                                                         |
1767     while (remaining_bits_in_bitstream( NumBytes )) {         |
1768         Slice( )                                              |
1769     }                                                         |
1770 }                                                             |
1772 Architecture overview of slices in a Frame:
1774 +=================================================================+
1775 +=================================================================+
1776 | first slice header                                              |
1777 +-----------------------------------------------------------------+
1778 | first slice content                                             |
1779 +-----------------------------------------------------------------+
1780 | first slice footer                                              |
1781 +-----------------------------------------------------------------+
1782 | --------------------------------------------------------------- |
1783 +-----------------------------------------------------------------+
1784 | second slice header                                             |
1785 +-----------------------------------------------------------------+
1786 | second slice content                                            |
1787 +-----------------------------------------------------------------+
1788 | second slice footer                                             |
1789 +-----------------------------------------------------------------+
1790 | --------------------------------------------------------------- |
1791 +-----------------------------------------------------------------+
1792 | ...                                                             |
1793 +-----------------------------------------------------------------+
1794 | --------------------------------------------------------------- |
1795 +-----------------------------------------------------------------+
1796 | last slice header                                               |
1797 +-----------------------------------------------------------------+
1798 | last slice content                                              |
1799 +-----------------------------------------------------------------+
1800 | last slice footer                                               |
1801 +-----------------------------------------------------------------+
1803 Table 15
1805 4.5. Slice
1807 A "Slice" is an independent spatial sub-section of a Frame that is
1808 encoded separately from another region of the same Frame. The use of
1809 more than one "Slice" per Frame can be useful for taking advantage of
1810 the opportunities of multithreaded encoding and decoding.
1812 A "Slice" consists of a "Slice Header" (when relevant), a "Slice
1813 Content", and a "Slice Footer" (when relevant). The pseudo-code
1814 below describes the contents of a "Slice".
1816 pseudo-code                                                   | type
1817 --------------------------------------------------------------|-----
1818 Slice( ) {                                                    |
1819     if (version >= 3) {                                       |
1820         SliceHeader( )                                        |
1821     }                                                         |
1822     SliceContent( )                                           |
1823     if (coder_type == 0) {                                    |
1824         while (!byte_aligned()) {                             |
1825             padding                                           | u(1)
1826         }                                                     |
1827     }                                                         |
1828     if (version <= 1) {                                       |
1829         while (remaining_bits_in_bitstream( NumBytes ) != 0) {|
1830             reserved                                          | u(1)
1831         }                                                     |
1832     }                                                         |
1833     if (version >= 3) {                                       |
1834         SliceFooter( )                                        |
1835     }                                                         |
1836 }                                                             |
1838 "padding" specifies a bit without any significance, used only for
1839 byte alignment. It MUST be 0.
1841 "reserved" specifies a bit without any significance in this revision
1842 of the specification and may have a significance in a later revision
1843 of this specification.
1845 Encoders SHOULD NOT fill "reserved".
1847 Decoders SHOULD ignore "reserved".
1849 4.6. Slice Header
1851 A "Slice Header" provides information about the decoding
1852 configuration of the "Slice", such as its spatial position, size, and
1853 aspect ratio. The pseudo-code below describes the contents of the
1854 "Slice Header".
1856 "Slice Header" has its own initial states, all set to 128.
1858 pseudo-code                                                   | type
1859 --------------------------------------------------------------|-----
1860 SliceHeader( ) {                                              |
1861     slice_x                                                   | ur
1862     slice_y                                                   | ur
1863     slice_width - 1                                           | ur
1864     slice_height - 1                                          | ur
1865     for (i = 0; i < quant_table_set_index_count; i++) {       |
1866         quant_table_set_index[ i ]                            | ur
1867     }                                                         |
1868     picture_structure                                         | ur
1869     sar_num                                                   | ur
1870     sar_den                                                   | ur
1871 }                                                             |
1873 4.6.1. slice_x
1875 "slice_x" indicates the x position on the slice raster formed by
1876 num_h_slices.
1878 Inferred to be 0 if not present.
1880 4.6.2. slice_y
1882 "slice_y" indicates the y position on the slice raster formed by
1883 num_v_slices.
1885 Inferred to be 0 if not present.
1887 4.6.3. slice_width
1889 "slice_width" indicates the width on the slice raster formed by
1890 num_h_slices.
1892 Inferred to be 1 if not present.
1894 4.6.4. slice_height
1896 "slice_height" indicates the height on the slice raster formed by
1897 num_v_slices.
1899 Inferred to be 1 if not present.
1901 4.6.5. quant_table_set_index_count
1903 "quant_table_set_index_count" is defined as:
1905 1 + ( ( chroma_planes || version <= 3 ) ? 1 : 0 )
1906 + ( extra_plane ? 1 : 0 )
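As an illustration only, the definition above can be written as a small Python function. The function name and the use of Python booleans for "chroma_planes" and "extra_plane" are assumptions of this sketch, not part of the format.

```python
def quant_table_set_index_count(chroma_planes: bool, extra_plane: bool,
                                version: int) -> int:
    """Number of quant_table_set_index values coded in a Slice Header."""
    count = 1  # constant term of the definition
    if chroma_planes or version <= 3:
        count += 1  # second term: ( chroma_planes || version <= 3 ) ? 1 : 0
    if extra_plane:
        count += 1  # third term: extra_plane ? 1 : 0
    return count
```

For example, a version 3 stream without chroma Planes and without an extra Plane still yields a count of 2, because "version <= 3" holds.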
1908 4.6.6. quant_table_set_index
1910 "quant_table_set_index" indicates the Quantization Table Set index to
1911 select the Quantization Table Set and the initial states for the
1912 "Slice Content".
1914 Inferred to be 0 if not present.
1916 4.6.7. picture_structure
1918 "picture_structure" specifies the temporal and spatial relationship
1919 of each Line of the Frame.
1921 Inferred to be 0 if not present.
1923 +=======+=========================+
1924 | value | picture structure used  |
1925 +=======+=========================+
1926 | 0     | unknown                 |
1927 +-------+-------------------------+
1928 | 1     | top field first         |
1929 +-------+-------------------------+
1930 | 2     | bottom field first      |
1931 +-------+-------------------------+
1932 | 3     | progressive             |
1933 +-------+-------------------------+
1934 | Other | reserved for future use |
1935 +-------+-------------------------+
1937 Table 16
1939 4.6.8. sar_num
1941 "sar_num" specifies the Sample aspect ratio numerator.
1943 Inferred to be 0 if not present.
1945 A value of 0 means that aspect ratio is unknown.
1947 Encoders MUST write 0 if Sample aspect ratio is unknown.
1949 If "sar_den" is 0, decoders SHOULD ignore the encoded value and
1950 consider that "sar_num" is 0.
1952 4.6.9. sar_den
1954 "sar_den" specifies the Sample aspect ratio denominator.
1956 Inferred to be 0 if not present.
1958 A value of 0 means that aspect ratio is unknown.
1960 Encoders MUST write 0 if Sample aspect ratio is unknown.
1962 If "sar_num" is 0, decoders SHOULD ignore the encoded value and
1963 consider that "sar_den" is 0.
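The two decoder rules for "sar_num" and "sar_den" amount to treating the pair as unknown whenever either value is 0. A minimal sketch follows; the helper name "effective_sar" is hypothetical and chosen for this example.

```python
from typing import Tuple


def effective_sar(sar_num: int, sar_den: int) -> Tuple[int, int]:
    """Return the Sample aspect ratio a decoder would act on.

    If either coded value is 0, the other one is ignored and the
    aspect ratio is reported as unknown, i.e. (0, 0).
    """
    if sar_num == 0 or sar_den == 0:
        return (0, 0)
    return (sar_num, sar_den)
```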
1965 4.7. Slice Content
1967 A "Slice Content" contains all Line elements belonging to the "Slice".
1969 Depending on the configuration, Line elements are ordered by Plane
1970 then by row (YCbCr) or by row then by Plane (RGB).
1972 pseudo-code                                                   | type
1973 --------------------------------------------------------------|-----
1974 SliceContent( ) {                                             |
1975     if (colorspace_type == 0) {                               |
1976         for (p = 0; p < primary_color_count; p++) {           |
1977             for (y = 0; y < plane_pixel_height[ p ]; y++) {   |
1978                 Line( p, y )                                  |
1979             }                                                 |
1980         }                                                     |
1981     } else if (colorspace_type == 1) {                        |
1982         for (y = 0; y < slice_pixel_height; y++) {            |
1983             for (p = 0; p < primary_color_count; p++) {       |
1984                 Line( p, y )                                  |
1985             }                                                 |
1986         }                                                     |
1987     }                                                         |
1988 }                                                             |
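To make the two orderings concrete, the following Python sketch lists the ( p, y ) arguments in the order in which Line elements would appear. The function name "line_order" and passing "plane_pixel_height" as a list are assumptions made for the example.

```python
def line_order(colorspace_type, primary_color_count,
               slice_pixel_height, plane_pixel_height):
    """Return the (p, y) pairs of Line( p, y ) in bitstream order."""
    order = []
    if colorspace_type == 0:
        # YCbCr: all rows of Plane 0, then all rows of Plane 1, ...
        for p in range(primary_color_count):
            for y in range(plane_pixel_height[p]):
                order.append((p, y))
    elif colorspace_type == 1:
        # RGB: for each row, one Line per Plane
        for y in range(slice_pixel_height):
            for p in range(primary_color_count):
                order.append((p, y))
    return order
```

With three Planes, a slice of 2 rows, and vertically subsampled chroma, "line_order(0, 3, 2, [2, 1, 1])" gives Plane-major order, while "line_order(1, 3, 2, [2, 2, 2])" interleaves the Planes row by row.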
1990 4.7.1. primary_color_count
1992 "primary_color_count" is defined as:
1994 1 + ( chroma_planes ? 2 : 0 ) + ( extra_plane ? 1 : 0 )
1996 4.7.2. plane_pixel_height
1998 "plane_pixel_height[ p ]" is the height in Pixels of Plane p of the
1999 "Slice". It is defined as:
2001 chroma_planes == 1 && (p == 1 || p == 2)
2002 ? ceil(slice_pixel_height / (1 << log2_v_chroma_subsample))
2003 : slice_pixel_height
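The "ceil" in this definition matters when the slice height is not a multiple of the subsampling factor. A possible integer-only rendering is sketched below; the function signature is an assumption of the example.

```python
def plane_pixel_height(p, slice_pixel_height, chroma_planes,
                       log2_v_chroma_subsample):
    """Height in Pixels of Plane p of a Slice."""
    if chroma_planes and p in (1, 2):
        step = 1 << log2_v_chroma_subsample
        # ceil(a / b) for positive integers, without floating point
        return (slice_pixel_height + step - 1) // step
    return slice_pixel_height
```

For a slice of 135 rows with 4:2:0 subsampling ("log2_v_chroma_subsample" of 1), the chroma Planes have 68 rows, not 67.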
2005 4.7.3. slice_pixel_height
2007 "slice_pixel_height" is the height in pixels of the slice. It is
2008 defined as:
2010 floor(
2011     ( slice_y + slice_height )
2012     * frame_pixel_height
2013     / num_v_slices
2014 ) - slice_pixel_y
2016 4.7.4. slice_pixel_y
2018 "slice_pixel_y" is the slice vertical position in pixels. It is
2019 defined as:
2021 floor( slice_y * frame_pixel_height / num_v_slices )
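The following Python sketch works through the vertical slice geometry. It assumes that the product in "slice_pixel_height" is taken with "frame_pixel_height", mirroring the definition of "slice_pixel_y"; the function names simply reuse the field names.

```python
def slice_pixel_y(slice_y, frame_pixel_height, num_v_slices):
    """Vertical position of the slice in pixels (floor division)."""
    return slice_y * frame_pixel_height // num_v_slices


def slice_pixel_height(slice_y, slice_height, frame_pixel_height,
                       num_v_slices):
    """Height of the slice in pixels.

    Assumption of this sketch: the bottom edge is scaled by
    frame_pixel_height, exactly as slice_pixel_y scales the top edge.
    """
    bottom = (slice_y + slice_height) * frame_pixel_height // num_v_slices
    return bottom - slice_pixel_y(slice_y, frame_pixel_height, num_v_slices)


# Worked example: a 1080-row Frame split into 7 rows of slices.
heights = [slice_pixel_height(y, 1, 1080, 7) for y in range(7)]
```

Because each slice ends where the next one starts, the heights (a mix of 154 and 155 rows here) always add up to the Frame height, even when it is not divisible by "num_v_slices".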
2023 4.8. Line
2025 A Line is a list of the sample differences (relative to the
2026 predictor) of primary color components. The pseudo-code below
2027 describes the contents of the Line.
2029 pseudo-code                                                   | type
2030 --------------------------------------------------------------|-----
2031 Line( p, y ) {                                                |
2032     if (colorspace_type == 0) {                               |
2033         for (x = 0; x < plane_pixel_width[ p ]; x++) {        |
2034             sample_difference[ p ][ y ][ x ]                  | sd
2035         }                                                     |
2036     } else if (colorspace_type == 1) {                        |
2037         for (x = 0; x < slice_pixel_width; x++) {             |
2038             sample_difference[ p ][ y ][ x ]                  | sd
2039         }                                                     |
2040     }                                                         |
2041 }                                                             |
2043 4.8.1. plane_pixel_width
2045 "plane_pixel_width[ p ]" is the width in Pixels of Plane p of the
2046 "Slice". It is defined as:
2048 chroma_planes == 1 && (p == 1 || p == 2)
2049     ? ceil( slice_pixel_width / (1 << log2_h_chroma_subsample) )
2050     : slice_pixel_width
2052 4.8.2. slice_pixel_width
2054 "slice_pixel_width" is the width in Pixels of the slice. It is
2055 defined as:
2057 floor(
2058     ( slice_x + slice_width )
2059     * frame_pixel_width
2060     / num_h_slices
2061 ) - slice_pixel_x
2063 4.8.3. slice_pixel_x
2065 "slice_pixel_x" is the slice horizontal position in Pixels. It is
2066 defined as:
2068 floor( slice_x * frame_pixel_width / num_h_slices )
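Taken together, the width definitions determine how many "sample_difference" values a Line carries. The sketch below is illustrative only: it assumes "slice_pixel_width" scales by "frame_pixel_width" (mirroring "slice_pixel_x"), and "samples_in_line" is a hypothetical helper name.

```python
def slice_pixel_x(slice_x, frame_pixel_width, num_h_slices):
    """Horizontal position of the slice in Pixels (floor division)."""
    return slice_x * frame_pixel_width // num_h_slices


def slice_pixel_width(slice_x, slice_width, frame_pixel_width,
                      num_h_slices):
    """Width of the slice in Pixels (assumes frame_pixel_width scaling)."""
    right = (slice_x + slice_width) * frame_pixel_width // num_h_slices
    return right - slice_pixel_x(slice_x, frame_pixel_width, num_h_slices)


def samples_in_line(p, colorspace_type, width, chroma_planes,
                    log2_h_chroma_subsample):
    """Count of sample_difference values coded in Line( p, y )."""
    if colorspace_type == 0 and chroma_planes and p in (1, 2):
        step = 1 << log2_h_chroma_subsample
        return (width + step - 1) // step  # plane_pixel_width, with ceil
    return width  # luma or extra Plane, or any Plane when colorspace_type is 1
```

For a 1998-Pixel-wide Frame with 4 columns of slices, the second column starts at Pixel 499 and is 500 Pixels wide; with horizontally subsampled chroma, its chroma Lines carry 250 values each.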
2070 4.8.4. sample_difference
2072 "sample_difference[ p ][ y ][ x ]" is the sample difference for
2073 Sample at Plane "p", y position "y", and x position "x". The Sample
2074 value is computed based on the median predictor and context described
2075 in Section 3.2.
2077 4.9. Slice Footer
2079 A "Slice Footer" provides information about slice size and
2080 (optionally) parity. The pseudo-code below describes the contents of
2081 the "Slice Footer".
2083 Note: "Slice Footer" is always byte aligned.
2085 pseudo-code                                                   | type
2086 --------------------------------------------------------------|-----
2087 SliceFooter( ) {                                              |
2088     slice_size                                                | u(24)
2089     if (ec) {                                                 |
2090         error_status                                          | u(8)
2091         slice_crc_parity                                      | u(32)
2092     }                                                         |
2093 }                                                             |
2095 4.9.1. slice_size
2097 "slice_size" indicates the size of the slice in bytes.
2099 Note: this allows finding the start of slices before previous slices
2100 have been fully decoded, and allows parallel decoding as well as
2101 error resilience.
2103 4.9.2. error_status
2105 "error_status" specifies the error status.
2107 +=======+=======================================+
2108 | value | error status                          |
2109 +=======+=======================================+
2110 | 0     | no error                              |
2111 +-------+---------------------------------------+
2112 | 1     | slice contains a correctable error    |
2113 +-------+---------------------------------------+
2114 | 2     | slice contains an uncorrectable error |
2115 +-------+---------------------------------------+
2116 | Other | reserved for future use               |
2117 +-------+---------------------------------------+
2119 Table 17
2121 4.9.3. slice_crc_parity
2123 "slice_crc_parity" specifies 32 bits that are chosen so that the
2124 slice as a whole has a CRC remainder of 0.
2126 This is equivalent to storing the CRC remainder in the 32-bit parity.
2128 The CRC generator polynomial used is the standard IEEE CRC polynomial
2129 (0x104C11DB7), with initial value 0, without pre-inversion and
2130 without post-inversion.
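The "remainder of 0" property can be checked with a short bitwise CRC. This is a sketch under stated assumptions: bits are processed most significant bit first (non-reflected) and the parity is stored as a big-endian 32-bit value. The text above only fixes the polynomial, the initial value, and the absence of inversions; the function name is hypothetical.

```python
def crc32_ffv1(data: bytes, crc: int = 0) -> int:
    """CRC with polynomial 0x104C11DB7, initial value 0, no inversion.

    Assumption of this sketch: MSB-first (non-reflected) bit order.
    """
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                # the x^32 term is dropped by the 32-bit mask
                crc = ((crc << 1) ^ 0x04C11DB7) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc


payload = b"example slice bytes"
parity = crc32_ffv1(payload)
protected = payload + parity.to_bytes(4, "big")
```

Running the CRC over "protected" (payload followed by its parity) returns 0, which is what a decoder can test for without extracting the parity field separately.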
2132 5. Restrictions
2134 To ensure that fast multithreaded decoding is possible, starting with
2135 version 3 and if "frame_pixel_width * frame_pixel_height" is more
2136 than 101376, "slice_width * slice_height" MUST be less than or equal
2137 to "num_h_slices * num_v_slices / 4". Note: 101376 is the frame size
2138 in Pixels of a 352x288 frame also known as CIF ("Common Intermediate
2139 Format") frame size format.
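A validator might express this restriction as follows. The function name and argument order are assumptions of the sketch; the comparison is multiplied through by 4 so that no fractional division is needed.

```python
CIF_PIXELS = 352 * 288  # 101376


def slice_size_allowed(version, frame_pixel_width, frame_pixel_height,
                       slice_width, slice_height,
                       num_h_slices, num_v_slices):
    """Check the slice size restriction for one slice."""
    if version < 3:
        return True  # the restriction starts with version 3
    if frame_pixel_width * frame_pixel_height <= CIF_PIXELS:
        return True  # applies only to frames larger than CIF
    # slice_width * slice_height <= num_h_slices * num_v_slices / 4
    return 4 * slice_width * slice_height <= num_h_slices * num_v_slices
```

In practice this means a version 3 Frame larger than CIF needs at least 4 slices, since a single raster position must not cover more than a quarter of the raster.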
2141 For each Frame, each position in the slice raster MUST be filled by
2142 one and only one slice of the Frame (no missing slice position, no
2143 slice overlapping).
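The coverage rule can be verified by counting how many slices claim each raster position. The sketch below uses a hypothetical helper, "raster_fully_covered", that takes slices as (slice_x, slice_y, slice_width, slice_height) tuples.

```python
def raster_fully_covered(slices, num_h_slices, num_v_slices):
    """True if every raster position is filled by exactly one slice."""
    hits = [[0] * num_h_slices for _ in range(num_v_slices)]
    for sx, sy, sw, sh in slices:
        if sx < 0 or sy < 0 or sw < 1 or sh < 1:
            return False
        if sx + sw > num_h_slices or sy + sh > num_v_slices:
            return False  # slice extends outside the raster
        for y in range(sy, sy + sh):
            for x in range(sx, sx + sw):
                hits[y][x] += 1
    # 0 means a missing slice position, 2 or more means overlapping
    return all(count == 1 for row in hits for count in row)
```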
2145 For each Frame with "keyframe" value of 0, each slice MUST have the
2146 same value of "slice_x", "slice_y", "slice_width", "slice_height" as
2147 a slice in the previous Frame.
2149 6. Security Considerations
2151 Like any other codec (such as [RFC6716]), FFV1 should not be used
2152 with insecure ciphers or cipher-modes that are vulnerable to known
2153 plaintext attacks. Some of the header bits as well as the padding
2154 are easily predictable.
2156 Implementations of the FFV1 codec need to take appropriate security
2157 considerations into account, as outlined in [RFC4732]. It is
2158 extremely important for the decoder to be robust against malicious
2159 payloads. Malicious payloads MUST NOT cause the decoder to overrun
2160 its allocated memory or to take an excessive amount of resources to
2161 decode. The same applies to the encoder, even though problems in
2162 encoders are typically rarer. Malicious video streams MUST NOT cause
2163 the encoder to misbehave because this would allow an attacker to
2164 attack transcoding gateways. A frequent security problem in image
2165 and video codecs is failure to check for integer overflows. An
2166 example is allocating "frame_pixel_width * frame_pixel_height" in
2167 Pixel count computations without considering that the multiplication
2168 result may have overflowed the arithmetic type's range. The range
2169 coder could, if implemented naively, read one byte over the end. The
2170 implementation MUST ensure that no read outside allocated and
2171 initialized memory occurs.
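The overflow check mentioned above can be performed before multiplying. Python integers do not overflow, so this sketch checks explicitly against the limit that a fixed-width implementation would have; the 32-bit signed limit and the function name are assumptions chosen for illustration.

```python
INT32_MAX = 2**31 - 1  # assumed limit of the target arithmetic type


def checked_pixel_count(frame_pixel_width, frame_pixel_height,
                        limit=INT32_MAX):
    """Return width * height, refusing values that would exceed limit."""
    if frame_pixel_width <= 0 or frame_pixel_height <= 0:
        raise ValueError("frame dimensions must be positive")
    # width * height > limit  <=>  width > limit // height
    if frame_pixel_width > limit // frame_pixel_height:
        raise OverflowError("Pixel count exceeds the arithmetic type")
    return frame_pixel_width * frame_pixel_height
```

The division-based test never forms the oversized product, which is the form that carries over to languages with fixed-width integers.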
2173 None of the content carried in FFV1 is intended to be executable.
2175 The reference implementation [REFIMPL] contains no known buffer
2176 overflow or cases where a specially crafted packet or video segment
2177 could cause a significant increase in CPU load.
2179 The reference implementation [REFIMPL] was validated in the following
2180 conditions:
2182 * Sending the decoder valid packets generated by the reference
2183 encoder and verifying that the decoder's output matches the
2184 encoder's input.
2186 * Sending the decoder packets generated by the reference encoder and
2187 then subjected to random corruption.
2189 * Sending the decoder random packets that are not FFV1.
2191 In all of the conditions above, the decoder and encoder were run
2192 inside the [VALGRIND] memory debugger as well as Clang's
2193 AddressSanitizer [Address-Sanitizer], which track reads and writes to
2194 invalid memory regions as well as the use of uninitialized memory.
2195 There were no errors reported on any of the tested conditions.
2197 7. IANA Considerations
2199 The IANA is requested to register the following values:
2201 7.1. Media Type Definition
2203 This registration is done using the template defined in [RFC6838] and
2204 following [RFC4855].
2206 Type name: video
2208 Subtype name: FFV1
2210 Required parameters: None.
2212 Optional parameters: These parameters are used to signal the
2213 capabilities of a receiver implementation. These parameters MUST NOT
2214 be used for any other purpose.
2216 * "version": The "version" of the FFV1 encoding as defined by
2217 Section 4.2.1.
2219 * "micro_version": The "micro_version" of the FFV1 encoding as
2220 defined by Section 4.2.2.
2222 * "coder_type": The "coder_type" of the FFV1 encoding as defined by
2223 Section 4.2.3.
2225 * "colorspace_type": The "colorspace_type" of the FFV1 encoding as
2226 defined by Section 4.2.5.
2228 * "bits_per_raw_sample": The "bits_per_raw_sample" of the FFV1
2229 encoding as defined by Section 4.2.7.
2231 * "max_slices": The value of "max_slices" is an integer indicating
2232 the maximum count of slices within a frame of the FFV1 encoding.
2234 Encoding considerations: This media type is defined for encapsulation
2235 in several audiovisual container formats and contains binary data;
2236 see Section 4.3.3. This media type is framed binary data; see
2237 Section 4.8 of [RFC6838].
2239 Security considerations: See Section 6 of this document.
2241 Interoperability considerations: None.
2243 Published specification: RFC XXXX.
2245 [RFC Editor: Upon publication as an RFC, please replace "XXXX" with
2246 the number assigned to this document and remove this note.]
2248 Applications which use this media type: Any application that requires
2249 the transport of lossless video can use this media type. Some
2250 examples include, but are not limited to, screen recording,
2251 scientific imaging, and digital video preservation.
2253 Fragment identifier considerations: N/A.
2255 Additional information: None.
2257 Person & email address to contact for further information: Michael
2258 Niedermayer michael@niedermayer.cc
2260 Intended usage: COMMON
2262 Restrictions on usage: None.
2264 Author: Dave Rice dave@dericed.com
2266 Change controller: IETF cellar working group delegated from the IESG.
2268 8. Changelog
2270 See https://github.com/FFmpeg/FFV1/commits/master
2273 [RFC Editor: Please remove this Changelog section prior to
2274 publication.]
2276 9. Normative References
2278 [ISO.15444-1.2016]
2279 International Organization for Standardization,
2280 "Information technology -- JPEG 2000 image coding system:
2281 Core coding system", October 2016.
2283 [ISO.9899.2018]
2284 International Organization for Standardization,
2285 "Programming languages - C", ISO Standard 9899, 2018.
2287 [Matroska] IETF, "Matroska", 2019.
2290 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
2291 Requirement Levels", BCP 14, RFC 2119,
2292 DOI 10.17487/RFC2119, March 1997.
2295 [RFC4732] Handley, M., Ed., Rescorla, E., Ed., and IAB, "Internet
2296 Denial-of-Service Considerations", RFC 4732,
2297 DOI 10.17487/RFC4732, December 2006.
2300 [RFC4855] Casner, S., "Media Type Registration of RTP Payload
2301 Formats", RFC 4855, DOI 10.17487/RFC4855, February 2007.
2304 [RFC6716] Valin, JM., Vos, K., and T. Terriberry, "Definition of the
2305 Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716,
2306 September 2012.
2308 [RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type
2309 Specifications and Registration Procedures", BCP 13,
2310 RFC 6838, DOI 10.17487/RFC6838, January 2013.
2313 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2314 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
2315 May 2017.
2317 10. Informative References
2319 [Address-Sanitizer]
2320 The Clang Team, "ASAN AddressSanitizer website", undated.
2323 [AVI] Microsoft, "AVI RIFF File Reference", undated.
2327 [FFV1_V0] Niedermayer, M., "Commit to mark FFV1 version 0 as non-
2328 experimental", April 2006.
2332 [FFV1_V1] Niedermayer, M., "Commit to release FFV1 version 1", April
2333 2009.
2336 [FFV1_V3] Niedermayer, M., "Commit to mark FFV1 version 3 as non-
2337 experimental", August 2013.
2341 [HuffYUV] Rudiak-Gould, B., "HuffYUV", December 2003.
2345 [ISO.14495-1.1999]
2346 International Organization for Standardization,
2347 "Information technology -- Lossless and near-lossless
2348 compression of continuous-tone still images: Baseline",
2349 December 1999.
2351 [ISO.14496-10.2014]
2352 International Organization for Standardization,
2353 "Information technology -- Coding of audio-visual objects
2354 -- Part 10: Advanced Video Coding", September 2014.
2356 [ISO.14496-12.2015]
2357 International Organization for Standardization,
2358 "Information technology -- Coding of audio-visual objects
2359 -- Part 12: ISO base media file format", December 2015.
2361 [NUT] Niedermayer, M., "NUT Open Container Format", December
2362 2013.
2364 [range-coding]
2365 Martin, G. N. N., "Range encoding: an algorithm for
2366 removing redundancy from a digitised message", Proceedings
2367 of the Conference on Video and Data Recording. Institution
2368 of Electronic and Radio Engineers, Hampshire, England,
2369 July 1979.
2371 [REFIMPL] Niedermayer, M., "The reference FFV1 implementation / the
2372 FFV1 codec in FFmpeg", undated.
2374 [VALGRIND] Valgrind Developers, "Valgrind website", undated.
2377 [YCbCr] Wikipedia, "YCbCr", undated.
2380 Appendix A. Multi-threaded decoder implementation suggestions
2382 This appendix is informative.
2384 The FFV1 bitstream is parsable in two ways: in sequential order as
2385 described in this document or with the pre-analysis of the footer of
2386 each slice. Each slice footer contains a "slice_size" field so the
2387 boundary of each slice is computable without having to parse the
2388 slice content. That allows multi-threading as well as independence
2389 of slice content (a bitstream error in a slice header or slice
2390 content has no impact on the decoding of the other slices).
2392 After having checked the "keyframe" field, a decoder SHOULD parse
2393 "slice_size" fields, from "slice_size" of the last slice at the end
2394 of the "Frame" up to "slice_size" of the first slice at the beginning
2395 of the "Frame", before parsing slices, in order to have slice
2396 boundaries. A decoder MAY fall back on sequential order, e.g., in
2397 case of a corrupted "Frame" (frame size unknown, "slice_size" of
2398 slices not coherent...) or if there is no possibility of seeking into
2399 the stream.
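The backward walk over the slice footers can be sketched as below. Several points are assumptions of this example rather than statements of the specification: "slice_size" is taken to count the bytes of a slice that precede its footer, the Frame buffer is taken to consist only of slices (version 3 or later), and the function name is hypothetical.

```python
def slice_boundaries(frame: bytes, ec: bool):
    """Return (start, end) byte offsets of each slice, first slice first.

    Assumption: slice_size excludes the Slice Footer itself.
    """
    # slice_size u(24), plus error_status u(8) and slice_crc_parity u(32)
    footer_len = 3 + (5 if ec else 0)
    end = len(frame)
    bounds = []
    while end > 0:
        if end < footer_len:
            raise ValueError("truncated Slice Footer")
        size_pos = end - footer_len
        slice_size = int.from_bytes(frame[size_pos:size_pos + 3], "big")
        start = size_pos - slice_size
        if start < 0:
            raise ValueError("slice_size points before the Frame start")
        bounds.append((start, end))
        end = start  # continue with the preceding slice
    bounds.reverse()
    return bounds
```

A "ValueError" here corresponds to the "slice_size not coherent" case, where a decoder may fall back on sequential parsing. Once the boundaries are known, each (start, end) range can be handed to a separate worker.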
2401 Appendix B. Future handling of some streams created by non-conforming
2402 encoders
2404 This appendix is informative.
2406 Some bitstreams were found with 40 extra bits corresponding to
2407 "error_status" and "slice_crc_parity" in the "reserved" bits of
2408 "Slice()". Any revision of this specification SHOULD avoid adding
2409 40 bits of content after "SliceContent" if "version" == 0 or
2410 "version" == 1. Otherwise, a decoder conforming to the revised
2411 specification could not distinguish between a revised bitstream and
2412 such buggy bitstreams in the wild.
2414 Authors' Addresses
2416 Michael Niedermayer
2418 Email: michael@niedermayer.cc
2420 Dave Rice
2422 Email: dave@dericed.com
2424 Jerome Martinez
2425 Email: jerome@mediaarea.net