idnits 2.17.1 

draft-ietf-cellar-flac-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** There are 56 instances of too long lines in the document, the longest
     one being 4 characters in excess of 72.

  == There are 2 instances of lines with non-RFC2606-compliant FQDNs in the
     document.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (29 October 2021) is 909 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

     No issues found here.

     Summary: 2 errors (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	cellar                                                     M. Richardson
3	Internet-Draft
4	Intended status: Informational                                 A. Weaver
5	Expires: 2 May 2022                                      29 October 2021

7	                       Free Lossless Audio Codec
8	                       draft-ietf-cellar-flac-02

10	Abstract

12	   This document defines FLAC, which stands for Free Lossless Audio
13	   Codec, a free, open source codec for lossless audio compression and
14	   decompression.

16	Status of This Memo

18	   This Internet-Draft is submitted in full conformance with the
19	   provisions of BCP 78 and BCP 79.

21	   Internet-Drafts are working documents of the Internet Engineering
22	   Task Force (IETF).  Note that other groups may also distribute
23	   working documents as Internet-Drafts.  The list of current Internet-
24	   Drafts is at https://datatracker.ietf.org/drafts/current/.

26	   Internet-Drafts are draft documents valid for a maximum of six months
27	   and may be updated, replaced, or obsoleted by other documents at any
28	   time.  It is inappropriate to use Internet-Drafts as reference
29	   material or to cite them other than as "work in progress."

31	   This Internet-Draft will expire on 2 May 2022.

33	Copyright Notice

35	   Copyright (c) 2021 IETF Trust and the persons identified as the
36	   document authors.  All rights reserved.

38	   This document is subject to BCP 78 and the IETF Trust's Legal
39	   Provisions Relating to IETF Documents (https://trustee.ietf.org/
40	   license-info) in effect on the date of publication of this document.
41	   Please review these documents carefully, as they describe your rights
42	   and restrictions with respect to this document.  Code Components
43	   extracted from this document must include Revised BSD License text as
44	   described in Section 4.e of the Trust Legal Provisions and are
45	   provided without warranty as described in the Revised BSD License.

47	Table of Contents

49	   1.  Introduction
50	   2.  Notation and Conventions
51	   3.  Acknowledgments
52	   4.  Scope
53	   5.  Architecture
54	   6.  Definitions
55	   7.  Blocking
56	   8.  Interchannel Decorrelation
57	   9.  Prediction
58	   10. Residual Coding
59	   11. Format
60	     11.1.  Principles
61	     11.2.  Overview
62	     11.3.  Subset
63	     11.4.  Conventions
64	     11.5.  STREAM
65	     11.6.  METADATA_BLOCK
66	     11.7.  METADATA_BLOCK_HEADER
67	     11.8.  BLOCK_TYPE
68	     11.9.  METADATA_BLOCK_DATA
69	     11.10. METADATA_BLOCK_STREAMINFO
70	     11.11. METADATA_BLOCK_PADDING
71	     11.12. METADATA_BLOCK_APPLICATION
72	     11.13. METADATA_BLOCK_SEEKTABLE
73	     11.14. SEEKPOINT
74	     11.15. METADATA_BLOCK_VORBIS_COMMENT
75	     11.16. METADATA_BLOCK_CUESHEET
76	     11.17. CUESHEET_TRACK
77	     11.18. CUESHEET_TRACK_INDEX
78	     11.19. METADATA_BLOCK_PICTURE
79	     11.20. PICTURE_TYPE
80	     11.21. FRAME
81	     11.22. FRAME_HEADER
82	       11.22.1.  FRAME HEADER RESERVED
83	       11.22.2.  BLOCKING STRATEGY
84	       11.22.3.  INTERCHANNEL SAMPLE BLOCK SIZE
85	       11.22.4.  SAMPLE RATE
86	       11.22.5.  CHANNEL ASSIGNMENT
87	       11.22.6.  SAMPLE SIZE
88	       11.22.7.  FRAME HEADER RESERVED2
89	       11.22.8.  CODED NUMBER
90	       11.22.9.  BLOCK SIZE INT
91	       11.22.10. SAMPLE RATE INT
92	       11.22.11. FRAME CRC
93	     11.23. FRAME_FOOTER
94	     11.24. SUBFRAME
95	     11.25. SUBFRAME_HEADER
96	       11.25.1.  SUBFRAME TYPE
97	       11.25.2.  WASTED BITS PER SAMPLE FLAG
98	     11.26. SUBFRAME_CONSTANT
99	     11.27. SUBFRAME_FIXED
100	     11.28. SUBFRAME_LPC
101	     11.29. SUBFRAME_VERBATIM
102	     11.30. RESIDUAL
103	       11.30.1.  RESIDUAL_CODING_METHOD
104	       11.30.2.  RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB
105	       11.30.3.  RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB2
106	       11.30.4.  ENCODED RESIDUAL
107	   12. Security Considerations
108	   13. Normative References
109	   14. Informative References
110	   Authors' Addresses

112	1.  Introduction

114	   This is a detailed description of the FLAC format.  There is also a
115	   companion document that describes FLAC-to-Ogg mapping
116	   (https://xiph.org/flac/ogg_mapping.html).

118	   For a user-oriented overview, see About the FLAC Format
119	   (https://xiph.org/flac/documentation_format_overview.html).

121	2.  Notation and Conventions

123	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
124	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
125	   "OPTIONAL" in this document are to be interpreted as described in BCP
126	   14 [RFC2119] [RFC8174] when, and only when, they appear in all
127	   capitals, as shown here.

129	3.  Acknowledgments

   FLAC owes much to the many people who have advanced the audio
   compression field so freely.  For instance:

   *  A. J. Robinson (http://svr-www.eng.cam.ac.uk/~ajr/) for his work
      on Shorten (http://svr-www.eng.cam.ac.uk/reports/abstracts/
      robinson_tr156.html); his paper is a good starting point on some
      of the basic methods used by FLAC.  FLAC trivially extends and
      improves the fixed predictors, LPC coefficient quantization, and
      Exponential-Golomb coding used in Shorten.

   *  S. W. Golomb
      (https://web.archive.org/web/20040215005354/http://csi.usc.edu/
      faculty/golomb.html) and Robert F. Rice; their universal codes
      are used by FLAC's entropy coder.

   *  N. Levinson and J. Durbin; the reference encoder uses an
      algorithm developed and refined by them for determining the LPC
      coefficients from the autocorrelation coefficients.

   *  And of course, Claude Shannon
      (http://en.wikipedia.org/wiki/Claude_Shannon).

147	4.  Scope

149	   FLAC stands for Free Lossless Audio Codec: it is designed to reduce
150	   the amount of computer storage space needed to store digital audio
151	   signals without needing to remove information in doing so (i.e.
152	   lossless).  FLAC is free in the sense that its specification is open,
153	   its reference implementation is open-source and it is not encumbered
154	   by any known patent.

156	   FLAC is able to achieve lossless compression because samples in audio
157	   signals tend to be highly correlated with their close neighbors.  In
158	   contrast with general purpose compressors, which often use
159	   dictionaries, do run-length coding or exploit long-term repetition,
160	   FLAC removes redundancy solely in the very short term, looking back
161	   at most 32 samples.

163	   The FLAC format is suited for pulse-code modulated (PCM) audio with 1
164	   to 8 channels, sample rates from 1 to 1048576 Hertz and bit depths
165	   between 4 and 32 bits.  Most tools for reading and writing the FLAC
166	   format have been optimized for CD-audio, which is PCM audio with 2
167	   channels, a sample rate of 44.1 kHz and a bit depth of 16 bits.

169	   Compared to other lossless (audio) coding formats, FLAC is a format
170	   with low complexity and can be encoded and decoded with few
171	   computing resources.  Decoding of FLAC has seen many independent
172	   implementations on many different platforms, and both encoding and
173	   decoding can be implemented without needing floating-point
174	   arithmetic.

176	   The coding methods provided by the FLAC format work best on PCM
177	   audio signals of which the samples have a signed representation and
178	   are centered around zero.  Audio signals in which samples have an
179	   unsigned representation must be transformed to a signed
180	   representation as described in this document in order to achieve
181	   reasonable compression.  The FLAC format is not suited to compress
182	   audio that is not PCM.  Pulse-density modulated audio, e.g.  DSD,
183	   cannot be compressed by FLAC.

185	5.  Architecture

187	   Similar to many audio coders, a FLAC encoder has the following
188	   stages:

190	   *  Blocking (see section on Blocking (#blocking)).  The input is
191	      broken up into many contiguous blocks.  With FLAC, the blocks MAY
192	      vary in size.  The optimal size of the block is usually affected
193	      by many factors, including the sample rate, spectral
194	      characteristics over time, etc.  Though FLAC allows the block size
195	      to vary within a stream, the reference encoder uses a fixed block
196	      size.

198	   *  Interchannel Decorrelation (see section on Interchannel
199	      Decorrelation (#interchannel-decorrelation)).  In the case of
200	      stereo streams, the encoder will create mid and side signals based
201	      on the average and difference (respectively) of the left and right
202	      channels.  The encoder will then pass the best form of the signal
203	      to the next stage.

205	   *  Prediction (see section on Prediction (#prediction)).  The block
206	      is passed through a prediction stage where the encoder tries to
207	      find a mathematical description (usually an approximate one) of
208	      the signal.  This description is typically much smaller than the
209	      raw signal itself.  Since the methods of prediction are known to
210	      both the encoder and decoder, only the parameters of the predictor
211	      need be included in the compressed stream.  FLAC currently uses
212	      four different classes of predictors, but the format has reserved
213	      space for additional methods.  FLAC allows the class of predictor
214	      to change from block to block, or even within the channels of a
215	      block.

217	   *  Residual Coding (see section on Residual Coding (#residual-
218	      coding)).  If the predictor does not describe the signal exactly,
219	      the difference between the original signal and the predicted
220	      signal (called the error or residual signal) MUST be coded
221	      losslessly.  If the predictor is effective, the residual signal
222	      will require fewer bits per sample than the original signal.  FLAC
223	      currently uses only one method for encoding the residual, but the
224	      format has reserved space for additional methods.  FLAC allows the
225	      residual coding method to change from block to block, or even
226	      within the channels of a block.

228	   In addition, FLAC specifies a metadata system, which allows arbitrary
229	   information about the stream to be included at the beginning of the
230	   stream.

232	6.  Definitions

234	   *  *Block*: A (short) section of linear pulse-code modulated audio,
235	      with one or more channels.

237	   *  *Subblock*: All samples within a corresponding block for 1
238	      channel.  One or more subblocks form a block, and all subblocks in
239	      a certain block contain the same number of samples.

241	   *  *Frame*: A frame header plus one or more subframes.  It encodes
242	      the contents of a corresponding block.

244	   *  *Subframe*: An encoded subblock.  All subframes within a frame
245	      code for the same number of samples.  A subframe MAY correspond to
246	      a subblock; otherwise it corresponds to either the addition or
247	      subtraction of two subblocks (see section on interchannel
248	      decorrelation (#interchannel-decorrelation)).

250	   *  *Blocksize*: The total number of samples contained in a block or
251	      coded in a frame, divided by the number of channels.  In other
252	      words, the number of samples in any subblock of a block, or any
253	      subframe of a frame.  This is also called *interchannel samples*.

255	   *  *Bit depth* or *bits per sample*: the number of bits used to
256	      contain each sample.  This MUST be the same for all subblocks in a
257	      block but MAY be different for different subframes in a frame
258	      because of interchannel decorrelation (#interchannel-
259	      decorrelation).

261	   *  *Predictor*: a model used to predict samples in an audio signal
262	      based on past samples.  FLAC uses such predictors to remove
263	      redundancy in a signal in order to be able to compress it.

265	   *  *Linear predictor*: a predictor using linear prediction
266	      (https://en.wikipedia.org/wiki/Linear_prediction).  This is also
267	      called *linear predictive coding (LPC)*. With a linear predictor
268	      each prediction is a linear combination of past samples, hence the
269	      name.  A linear predictor has a causal discrete-time finite
270	      impulse response (https://en.wikipedia.org/wiki/
271	      Finite_impulse_response).

273	   *  *Fixed predictor*: a linear predictor in which the model
274	      parameters are the same across all FLAC files, and thus need not
275	      be stored.

277	   *  *Predictor order*: the number of past samples that a predictor
278	      uses.  For example, a 4th order predictor uses the 4 samples
279	      directly preceding a certain sample to predict it.  In FLAC,
280	      samples used in a predictor are always consecutive, and are always
281	      the samples directly before the sample that is being predicted

283	   *  *Residual*: The audio signal that remains after a predictor has
284	      been subtracted from a subblock.  If the predictor has been able
285	      to remove redundancy from the signal, the samples of the remaining
286	      signal (the *residual samples*) will have, on average, a smaller
287	      numerical value than the original signal.

289	   *  *Rice code*: A variable-length code
290	      (https://en.wikipedia.org/wiki/Variable-length_code) which
291	      compresses data by making use of the observation that, after using
292	      an effective predictor, most residual samples are closer to zero
293	      than the original samples, while still allowing for a small part
294	      of the samples to be much larger.

296	7.  Blocking

298	   The size used for blocking the audio data has a direct effect on the
299	   compression ratio.  If the block size is too small, the resulting
300	   large number of frames means that excess bits will be wasted on frame
301	   headers.  If the block size is too large, the characteristics of the
302	   signal MAY vary so much that the encoder will be unable to find a
303	   good predictor.  In order to simplify encoder/decoder design, FLAC
304	   imposes a minimum block size of 16 samples, and a maximum block size
305	   of 65535 samples.  This range covers the optimal size for all of the
306	   audio data FLAC supports.

308	   Currently the reference encoder uses a fixed block size, optimized on
309	   the sample rate of the input.  Future versions MAY vary the block
310	   size depending on the characteristics of the signal.

312	   Blocked data is passed to the predictor stage one subblock (channel)
313	   at a time.  Each subblock is independently coded into a subframe, and
314	   the subframes are concatenated into a frame.  Because each channel is
315	   coded separately, one channel of a stereo frame MAY be encoded as a
316	   constant subframe, and the other an LPC subframe.
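
   As an illustration of the blocking step described above, the
   following Python sketch splits per-channel sample lists into
   fixed-size blocks, each holding one subblock per channel.  The
   function and constant names are hypothetical.  How a short final
   block is treated is not covered in this section, so the sketch
   simply emits it as is.

```python
# Block size limits as stated in the text above.
MIN_BLOCK_SIZE = 16
MAX_BLOCK_SIZE = 65535


def split_into_blocks(channels, block_size):
    """Split per-channel sample lists into a list of blocks.

    `channels` holds one list of samples per channel.  Each returned
    block is a list of subblocks, one subblock per channel, and all
    subblocks in a block contain the same number of samples.
    """
    if not MIN_BLOCK_SIZE <= block_size <= MAX_BLOCK_SIZE:
        raise ValueError("block size outside the range given in the text")
    total = len(channels[0])
    if any(len(ch) != total for ch in channels):
        raise ValueError("all channels must hold the same number of samples")
    blocks = []
    for start in range(0, total, block_size):
        # One subblock per channel, covering the same sample positions.
        blocks.append([ch[start:start + block_size] for ch in channels])
    return blocks


if __name__ == "__main__":
    stereo = [list(range(40)), list(range(100, 140))]
    for block in split_into_blocks(stereo, 16):
        print([len(subblock) for subblock in block])
```

   Each subblock of a block would then be handed to the prediction
   stage independently, as the text describes.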

318	8.  Interchannel Decorrelation

320	   In many audio files, channels are correlated.  The FLAC format can
321	   exploit this correlation in stereo files by not directly coding
322	   subblocks into subframes, but instead coding an average of all
323	   samples in both subblocks (a mid channel) or the difference between
324	   all samples in both subblocks (a side channel).  The following
325	   combinations are possible:

327	   *  *Independent*. All channels are coded independently.  All non-
328	      stereo files MUST be encoded this way.

330	   *  *Mid-side*. A left and right subblock are converted to mid and
331	      side subframes.  To calculate a sample for a mid subframe, the
332	      corresponding left and right samples are summed and the result is
333	      shifted right by 1 bit.  To calculate a sample for a side
334	      subframe, the corresponding right sample is subtracted from the
335	      corresponding left sample.  On decoding, the mid channel has to be
336	      shifted left by 1 bit.  Also, if the side sample is odd, 1 has
337	      to be added to the mid channel after the left shift.  To
338	      reconstruct the left channel, the corresponding samples in the mid
339	      and side subframes are added and the result shifted right by 1
340	      bit, while for the right channel the side channel has to be
341	      subtracted from the mid channel and the result shifted right by 1
342	      bit.

344	   *  *Left-side*. The left subblock is coded and the left and right
345	      subblock are used to code a side subframe.  The side subframe is
346	      constructed in the same way as for mid-side.  To decode, the right
347	      subblock is restored by subtracting the samples in the side
348	      subframe from the corresponding samples in the left subframe.

350	   *  *Right-side*. The right subblock is coded and the left and right
351	      subblock are used to code a side subframe.  Note that the actual
352	      coded subframe order is side-right.  The side subframe is
353	      constructed in the same way as for mid-side.  To decode, the left
354	      subblock is restored by adding the samples in the side subframe to
355	      the corresponding samples in the right subframe.

357	   The side channel needs one extra bit of bit depth as the subtraction
358	   can produce sample values twice as large as the maximum possible in
359	   any given bit depth.  The mid channel in mid-side stereo does not
360	   need one extra bit, as it is shifted right one bit.  The right shift
361	   of the mid channel does not lead to lossy behavior, because an odd
362	   sum of the left and right samples is always accompanied by an odd
363	   sample in the side subframe, which means the lost least significant
364	   bit can be restored by taking it from the sample in the side
365	   subframe.
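
   The stereo transforms in this section can be sketched in Python as
   follows.  This is a minimal illustration, not the reference
   implementation: the function names are hypothetical, samples are
   plain Python integers, and the code relies on Python's ">>" being
   an arithmetic (flooring) shift for negative values.

```python
def encode_mid_side(left, right):
    """Return (mid, side) sample lists for a left/right subblock pair."""
    mid = [(l + r) >> 1 for l, r in zip(left, right)]   # sum, shifted right
    side = [l - r for l, r in zip(left, right)]         # left minus right
    return mid, side


def decode_mid_side(mid, side):
    """Rebuild (left, right) from mid and side samples."""
    left, right = [], []
    for m, s in zip(mid, side):
        # Shift left and restore the dropped bit: add 1 when side is odd.
        total = (m << 1) | (s & 1)
        left.append((total + s) >> 1)
        right.append((total - s) >> 1)
    return left, right


def decode_left_side(left, side):
    """Rebuild the right channel from left and side samples."""
    return [l - s for l, s in zip(left, side)]


def decode_right_side(side, right):
    """Rebuild the left channel from side and right samples."""
    return [r + s for s, r in zip(side, right)]


if __name__ == "__main__":
    left = [5, -3, 7, 0, -32768]
    right = [2, -4, -7, 1, 32767]
    mid, side = encode_mid_side(left, right)
    assert decode_mid_side(mid, side) == (left, right)
    assert decode_left_side(left, side) == right
    assert decode_right_side(side, right) == left
```

   The last sample pair shows why the side channel needs one extra
   bit: with 16-bit input, -32768 minus 32767 does not fit in 16 bits.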

367	9.  Prediction

369	   FLAC uses four methods for modeling the input signal:

371	   1.  *Verbatim*. This is essentially a zero-order predictor of the
372	       signal.  The predicted signal is zero, meaning the residual is
373	       the signal itself, and the compression is zero.  This is the
374	       baseline against which the other predictors are measured.  If you
375	       feed random data to the encoder, the verbatim predictor will
376	       probably be used for every subblock.  Since the raw signal is not
377	       actually passed through the residual coding stage (it is added to
378	       the stream 'verbatim'), the encoding results will not be the same
379	       as a zero-order linear predictor.

381	   2.  *Constant*. This predictor is used whenever the subblock is pure
382	       DC ("digital silence"), i.e. a constant value throughout.  The
383	       signal is run-length encoded and added to the stream.

385	   3.  *Fixed linear predictor*. FLAC uses a class of computationally-
386	       efficient fixed linear predictors (for a good description, see
387	       audiopak (http://www.hpl.hp.com/techreports/1999/HPL-
388	       1999-144.pdf) and shorten (http://svr-
389	       www.eng.cam.ac.uk/reports/abstracts/robinson_tr156.html)).  FLAC
390	       adds a fourth-order predictor to the zero-to-third-order
391	       predictors used by Shorten.  Since the predictors are fixed, the
392	       predictor order is the only parameter that needs to be stored in
393	       the compressed stream.  The error signal is then passed to the
394	       residual coder.

396	   4.  *FIR Linear prediction*. For more accurate modeling (at a cost of
397	       slower encoding), FLAC supports up to 32nd order FIR linear
398	       prediction (again, for information on linear prediction, see
399	       audiopak (http://www.hpl.hp.com/techreports/1999/HPL-
400	       1999-144.pdf) and shorten (http://svr-
401	       www.eng.cam.ac.uk/reports/abstracts/robinson_tr156.html)).  The
402	       reference encoder uses the Levinson-Durbin method for calculating
403	       the LPC coefficients from the autocorrelation coefficients, and
404	       the coefficients are quantized before computing the residual.
405	       Whereas encoders such as Shorten used a fixed quantization for
406	       the entire input, FLAC allows the quantized coefficient precision
407	       to vary from subframe to subframe.  The FLAC reference encoder
408	       estimates the optimal precision to use based on the block size
409	       and dynamic range of the original signal.
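
   To make the fixed linear predictors more concrete, the Python
   sketch below computes the residual for orders 0 through 4 and
   restores the signal from it.  The coefficient table is an
   assumption: it holds the usual polynomial-extrapolation (finite
   difference) coefficients, which are not listed in this section.
   All names are hypothetical, and the order-selection helper ignores
   the cost of storing warm-up samples.

```python
# Assumed coefficients; entry i multiplies the sample i + 1 positions back.
FIXED_COEFFS = {
    0: [],
    1: [1],
    2: [2, -1],
    3: [3, -3, 1],
    4: [4, -6, 4, -1],
}


def fixed_residual(samples, order):
    """Return (warm-up samples, residual) for a fixed predictor order."""
    coeffs = FIXED_COEFFS[order]
    residual = []
    for n in range(order, len(samples)):
        prediction = sum(c * samples[n - 1 - i] for i, c in enumerate(coeffs))
        residual.append(samples[n] - prediction)
    return samples[:order], residual


def fixed_restore(warmup, residual, order):
    """Invert fixed_residual: rebuild the samples from the residual."""
    coeffs = FIXED_COEFFS[order]
    out = list(warmup)
    for r in residual:
        n = len(out)
        prediction = sum(c * out[n - 1 - i] for i, c in enumerate(coeffs))
        out.append(prediction + r)
    return out


def best_fixed_order(samples):
    """Pick the order with the smallest sum of absolute residuals."""
    return min(FIXED_COEFFS,
               key=lambda o: sum(abs(r) for r in fixed_residual(samples, o)[1]))


if __name__ == "__main__":
    signal = [i * i for i in range(20)]          # a smooth test signal
    order = best_fixed_order(signal)
    warmup, residual = fixed_residual(signal, order)
    assert fixed_restore(warmup, residual, order) == signal
    print(order, residual)
```

   Only the order and the warm-up samples need to accompany the
   residual, since the coefficients themselves are fixed.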

411	10.  Residual Coding

413	   FLAC uses Exponential-Golomb (a variant of Rice) coding as its
414	   residual encoder.  You can learn more about exp-golomb coding
415	   (https://en.wikipedia.org/wiki/Exponential-Golomb_coding) on
416	   Wikipedia.

418	   FLAC currently defines two similar methods for the coding of the
419	   error signal from the prediction stage.  The error signal is coded
420	   using Exponential-Golomb codes in one of two ways:

422	   1.  the encoder estimates a single exp-golomb parameter based on the
423	       variance of the residual and exp-golomb codes the entire residual
424	       using this parameter;

426	   2.  the residual is partitioned into several equal-length regions of
427	       contiguous samples, and each region is coded with its own exp-
428	       golomb parameter based on the region's mean.

430	   (Note that the first method is a special case of the second method
431	   with one partition, except the exp-golomb parameter is based on the
432	   residual variance instead of the mean.)
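
   The partitioned scheme can be illustrated with a small Python
   sketch.  It uses a Rice-style code as described in the Definitions
   section (a unary quotient followed by k low-order bits).  The sign
   folding, the bit layout, and the parameter estimate from the
   partition mean are simplifying assumptions chosen for this
   example, and all names are hypothetical.  The exact bitstream
   layout is specified in the Format section.

```python
def fold(value):
    """Map signed values to unsigned ones: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return (value << 1) if value >= 0 else ((-value << 1) - 1)


def unfold(u):
    """Inverse of fold."""
    return (u >> 1) if u & 1 == 0 else -((u + 1) >> 1)


def rice_encode(value, k):
    """Encode one residual sample as a string of '0'/'1' characters."""
    u = fold(value)
    quotient, remainder = u >> k, u & ((1 << k) - 1)
    low_bits = format(remainder, "0{}b".format(k)) if k else ""
    return "0" * quotient + "1" + low_bits


def rice_decode(bits, pos, k):
    """Decode one sample starting at bits[pos]; return (value, new pos)."""
    quotient = 0
    while bits[pos] == "0":
        quotient += 1
        pos += 1
    pos += 1                                    # skip the terminating '1'
    remainder = int(bits[pos:pos + k], 2) if k else 0
    return unfold((quotient << k) | remainder), pos + k


def estimate_parameter(partition):
    """Heuristic: bit length of the mean absolute value (an assumption)."""
    mean = sum(abs(x) for x in partition) // max(len(partition), 1)
    return mean.bit_length()


def encode_partitioned(residual, partitions):
    """Code equal-length partitions, each with its own parameter."""
    size = len(residual) // partitions
    if size * partitions != len(residual):
        raise ValueError("residual must split into equal-length partitions")
    coded = []
    for p in range(partitions):
        part = residual[p * size:(p + 1) * size]
        k = estimate_parameter(part)
        coded.append((k, "".join(rice_encode(x, k) for x in part)))
    return coded


def decode_partitioned(coded, size):
    """Inverse of encode_partitioned for partitions of `size` samples."""
    residual = []
    for k, bits in coded:
        pos = 0
        for _ in range(size):
            value, pos = rice_decode(bits, pos, k)
            residual.append(value)
    return residual


if __name__ == "__main__":
    residual = [0, -1, 2, -3, 100, -120, 90, -80]
    coded = encode_partitioned(residual, 2)
    assert decode_partitioned(coded, 4) == residual
    print([k for k, _ in coded])
```

   Calling encode_partitioned with a single partition corresponds to
   the first method, apart from how the parameter is estimated.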

434	   The FLAC format has reserved space for other coding methods.  Some
435	   possibilities for volunteers would be to explore better context-
436	   modeling of the exp-golomb parameter, or Huffman coding.  See LOCO-I
437	   (http://www.hpl.hp.com/techreports/98/HPL-98-193.html) and pucrunch
438	   (http://web.archive.org/web/20140827133312/http://www.cs.tut.fi/
439	   ~albert/Dev/pucrunch/packing.html) for descriptions of several
440	   universal codes.

442	11.  Format

444	   This section specifies the FLAC bitstream format.

446	11.1.  Principles

448	   FLAC has no format version information, but it does contain reserved
449	   space in several places.  Future versions of the format MAY use this
450	   reserved space safely without breaking the format of older streams.
451	   Older decoders MAY choose to abort decoding or skip data encoded with
452	   newer methods.  Apart from reserved patterns, in places the format
453	   specifies invalid patterns, meaning that the patterns MUST NOT
454	   appear in any valid bitstream, in any prior, present, or future
455	   versions of the format.  These invalid patterns are usually used to
456	   make the synchronization mechanism more robust.

458	   All numbers used in a FLAC bitstream MUST be integers; there are no
459	   floating-point representations.  All numbers MUST be big-endian
460	   coded, except the length field used in Vorbis comments, which MUST be
461	   little-endian coded.  All numbers MUST be unsigned except linear
462	   predictor coefficients, the linear prediction shift and numbers which
463	   directly represent samples, which MUST be signed.  None of these
464	   restrictions apply to application metadata blocks.
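
   The byte-order and signedness rules above can be illustrated with
   Python's built-in int.from_bytes.  This is only a sketch: the
   helper names are hypothetical, the 4-byte width used for the
   Vorbis comment length is an assumption not stated in this
   paragraph, and the helpers handle whole bytes only, whereas many
   fields in the bitstream are not byte-aligned.

```python
def read_uint_be(data, offset, size):
    """Read an unsigned big-endian integer of `size` bytes."""
    return int.from_bytes(data[offset:offset + size], "big")


def read_uint_le(data, offset, size):
    """Read an unsigned little-endian integer of `size` bytes."""
    return int.from_bytes(data[offset:offset + size], "little")


def read_int_be(data, offset, size):
    """Read a signed (two's complement) big-endian integer."""
    return int.from_bytes(data[offset:offset + size], "big", signed=True)


if __name__ == "__main__":
    # The stream marker read as one big-endian number.
    assert read_uint_be(b"fLaC", 0, 4) == 0x664C6143
    # A little-endian length, as used inside Vorbis comments.
    assert read_uint_le(bytes([0x20, 0x00, 0x00, 0x00]), 0, 4) == 32
    # A signed big-endian value such as a sample.
    assert read_int_be(b"\xff\xfe", 0, 2) == -2
```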

466	   All samples encoded to and decoded from the FLAC format MUST be in a
467	   signed representation.

469	   There are several ways to convert unsigned sample representations to
470	   signed sample representations, but the coding methods provided by the
471	   FLAC format work best on audio signals of which the numerical values
472	   of the samples are centered around zero, i.e. have no DC offset.  In
473	   most unsigned audio formats, signals are centered around the midpoint
474	   of the range of the unsigned integer type used.  If that is the case,
475	   all sample representations SHOULD be converted by first copying the
476	   number to a signed integer with sufficient range and then subtracting
477	   half of the range of the unsigned integer type, which should result
478	   in a signal with samples centered around 0.
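
   The conversion described above amounts to subtracting half of the
   unsigned range from every sample.  A minimal Python sketch follows;
   the function names are hypothetical.  Python integers have
   unbounded range, so the step of first copying to a sufficiently
   wide signed type is implicit here, but it matters in languages
   with fixed-width integers.

```python
def unsigned_to_signed(samples, bits):
    """Re-center unsigned samples of the given bit depth around zero."""
    offset = 1 << (bits - 1)          # half of the unsigned range
    return [s - offset for s in samples]


def signed_to_unsigned(samples, bits):
    """Inverse conversion, for writing decoded audio back out."""
    offset = 1 << (bits - 1)
    return [s + offset for s in samples]


if __name__ == "__main__":
    unsigned_8bit = [0, 128, 255]
    signed = unsigned_to_signed(unsigned_8bit, 8)
    assert signed == [-128, 0, 127]
    assert signed_to_unsigned(signed, 8) == unsigned_8bit
```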

480	11.2.  Overview

482	   Before the formal description of the stream, an overview might be
483	   helpful.

485	   *  A FLAC bitstream consists of the "fLaC" (i.e. 0x664C6143) marker
486	      at the beginning of the stream, followed by a mandatory metadata
487	      block (called the STREAMINFO block), any number of other metadata
488	      blocks, then the audio frames.

490	   *  FLAC supports up to 128 kinds of metadata blocks; currently the
491	      following are defined:

493	      -  STREAMINFO: This block has information about the whole stream,
494	         like sample rate, number of channels, total number of samples,
495	         etc.  It MUST be present as the first metadata block in the
496	         stream.  Other metadata blocks MAY follow; a decoder skips any
497	         block it does not understand.

499	      -  PADDING: This block allows for an arbitrary amount of padding.
500	         The contents of a PADDING block have no meaning.  This block is
501	         useful when it is known that metadata will be edited after
502	         encoding; the user can instruct the encoder to reserve a
503	         PADDING block of sufficient size so that when metadata is
504	         added, it will simply overwrite the padding (which is
505	         relatively quick) instead of having to insert it into the right
506	         place in the existing file (which would normally require
507	         rewriting the entire file).

509	      -  APPLICATION: This block is for use by third-party applications.
510	         The only mandatory field is a 32-bit identifier.  This ID is
511	         granted upon request to an application by the FLAC maintainers.
512	         The remainder of the block is defined by the registered
513	         application.  Visit the registration page
514	         (https://xiph.org/flac/id.html) if you would like to register
515	         an ID for your application with FLAC.

517	      -  SEEKTABLE: This is an OPTIONAL block for storing seek points.
518	         It is possible to seek to any given sample in a FLAC stream
519	         without a seek table, but the delay can be unpredictable since
520	         the bitrate MAY vary widely within a stream.  By adding seek
521	         points to a stream, this delay can be significantly reduced.
522	         Each seek point takes 18 bytes, so 1% resolution within a
523	         stream adds less than 2K.  There can be only one SEEKTABLE in a
524	         stream, but the table can have any number of seek points.
525	         There is also a special 'placeholder' seekpoint which will be
526	         ignored by decoders but which can be used to reserve space for
527	         future seek point insertion.

529	      -  VORBIS_COMMENT: This block is for storing a list of human-
530	         readable name/value pairs.  Values are encoded using UTF-8.  It
531	         is an implementation of the Vorbis comment specification
532	         (http://xiph.org/vorbis/doc/v-comment.html) (without the
533	         framing bit).  This is the only officially supported tagging
534	         mechanism in FLAC.  There MUST NOT be more than one
535	         VORBIS_COMMENT block in a stream.  In some external
536	         documentation, Vorbis comments are called FLAC tags to lessen
537	         confusion.

539	      -  CUESHEET: This block is for storing various information that
540	         can be used in a cue sheet.  It supports track and index
541	         points, compatible with Red Book CD digital audio discs, as
542	         well as other CD-DA metadata such as media catalog number and
543	         track ISRCs.  The CUESHEET block is especially useful for
544	         backing up CD-DA discs, but it can be used as a general purpose
545	         cueing mechanism for playback.

547	      -  PICTURE: This block is for storing pictures associated with the
548	         file, most commonly cover art from CDs.  There MAY be more than
549	         one PICTURE block in a file.  The picture format is similar to
550	         the APIC frame in ID3v2 (http://www.id3.org/id3v2.4.0-frames).
551	         The PICTURE block has a type, MIME type, and UTF-8 description
552	         like ID3v2, and supports external linking via URL (though this
553	         is discouraged).  The differences are that there is no
554	         uniqueness constraint on the description field, and the MIME
555	         type is mandatory.  The FLAC PICTURE block also includes the
556	         resolution, color depth, and palette size so that the client
557	         can search for a suitable picture without having to scan them
558	         all.

560	   *  The audio data is composed of one or more audio frames.  Each
561	      frame consists of a frame header, which contains a sync code,
562	      information about the frame like the block size, sample rate,
563	      number of channels, et cetera, and an 8-bit CRC.  The frame header
564	      also contains either the sample number of the first sample in the
565	      frame (for variable-blocksize streams), or the frame number (for
566	      fixed-blocksize streams).  This allows for fast, sample-accurate
567	      seeking to be performed.  Following the frame header are encoded
568	      subframes, one for each channel, and finally, the frame is zero-
569	      padded to a byte boundary.  Each subframe has its own header that
570	      specifies how the subframe is encoded.

572	   *  Since a decoder MAY start decoding in the middle of a stream,
573	      there MUST be a method to determine the start of a frame.  A
574	      14-bit sync code begins each frame.  The sync code will not appear
575	      anywhere else in the frame header.  However, since it MAY appear
576	      in the subframes, the decoder has two other ways of ensuring a
577	      correct sync.  The first is to check that the rest of the frame
578	      header contains no invalid data.  Even this is not foolproof since
579	      valid header patterns can still occur within the subframes.  The
580	      decoder's final check is to generate an 8-bit CRC of the frame
581	      header and compare this to the CRC stored at the end of the frame
582	      header.

584	   *  Again, since a decoder MAY start decoding at an arbitrary frame in
585	      the stream, each frame header MUST contain some basic information
586	      about the stream because the decoder might not have access to
587	      the STREAMINFO metadata block at the start of the stream.  This
588	      information includes sample rate, bits per sample, number of
589	      channels, etc.  Since the frame header is pure overhead, it has a
590	      direct effect on the compression ratio.  To keep the frame header
591	      as small as possible, FLAC uses lookup tables for the most
592	      commonly used values for frame parameters.  For instance, the
593	      sample rate part of the frame header is specified using 4 bits.
594	      Eight of the bit patterns correspond to the commonly used sample
595	      rates of 8, 16, 22.05, 24, 32, 44.1, 48 or 96 kHz.  However, odd
596	      sample rates can be specified by using one of the 'hint' bit
597	      patterns, directing the decoder to find the exact sample rate at
598	      the end of the frame header.  The same method is used for
599	      specifying the block size and bits per sample.  In this way, the
600	      frame header size stays small for all of the most common forms of
601	      audio data.

603	   *  Individual subframes (one for each channel) are coded separately
604	      within a frame, and appear serially in the stream.  In other
605	      words, the encoded audio data is NOT channel-interleaved.  This
606	      reduces decoder complexity at the cost of requiring larger decode
607	      buffers.  Each subframe has its own header specifying the
608	      attributes of the subframe, like prediction method and order,
609	      residual coding parameters, etc.  The header is followed by the
610	      encoded audio data for that channel.
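
   The final sync check described above, comparing a computed 8-bit CRC
   with the one stored at the end of the frame header, can be sketched
   as follows.  This is an illustration only, not the reference
   implementation.  It assumes the CRC-8 polynomial
   x^8 + x^2 + x^1 + x^0 (0x07) with an initial value of 0, which is
   not stated in this overview; the names crc8 and header_crc_ok are
   hypothetical.

```python
def crc8(data: bytes) -> int:
    """CRC-8 with polynomial 0x07, initial value 0, processed MSB first.

    Assumption: these parameters are not given in the overview text.
    """
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def header_crc_ok(header: bytes) -> bool:
    """Check a candidate frame header whose last byte is the stored CRC."""
    return len(header) >= 2 and crc8(header[:-1]) == header[-1]
```

   A decoder hunting for a frame start would apply such a check only
   after the sync code and the remaining header fields have already
   passed their own validity tests.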

612	11.3.  Subset

614	   FLAC specifies a subset of itself as the Subset format.  The purpose
615	   of this is to ensure that any streams encoded according to the Subset
616	   are truly "streamable", meaning that a decoder that cannot seek
617	   within the stream can still pick up in the middle of the stream and
618	   start decoding.  It also makes hardware decoder implementations more
619	   practical by limiting the encoding parameters such that decoder
620	   buffer sizes and other resource requirements can be easily
621	   determined.  *flac* generates Subset streams by default unless the
622	   "--lax" command-line option is used.  The Subset imposes the
623	   following limitations on what MAY be used in the stream:

625	   *  The blocksize bits in the FRAME_HEADER (see FRAME_HEADER section
626	      (#frameheader)) MUST be 0b0001-0b1110.  The blocksize MUST be <=
627	      16384; if the sample rate is <= 48000 Hz, the blocksize MUST be <=
628	      4608 = 2^9 * 3^2.

630	   *  The sample rate bits in the FRAME_HEADER MUST be 0b0001-0b1110.

632	   *  The bits-per-sample bits in the FRAME_HEADER MUST be 0b001-0b111.

634	   *  If the sample rate is <= 48000 Hz, the filter order in LPC
635	      subframes (see SUBFRAME_LPC section (#subframelpc)) MUST be less
636	      than or equal to 12, i.e. the subframe type bits in the
637	      SUBFRAME_HEADER (see SUBFRAME_HEADER section (#subframeheader))
638	      MUST NOT be 0b101100-0b111111.

640	   *  The Rice partition order (see Coded residual section (#coded-
641	      residual)) MUST be less than or equal to 8.
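
   The numeric Subset limits listed above can be expressed as a few
   small predicates.  This is a minimal sketch that checks only the
   decoded values (block size, LPC order, Rice partition order), not
   the header bit patterns; all function names are hypothetical.

```python
def subset_blocksize_ok(blocksize: int, sample_rate: int) -> bool:
    """Block size limit: 4608 up to 48000 Hz, 16384 above that."""
    limit = 4608 if sample_rate <= 48000 else 16384
    return blocksize <= limit


def subset_lpc_order_ok(order: int, sample_rate: int) -> bool:
    """LPC order is capped at 12 only for sample rates up to 48000 Hz."""
    return order <= 12 if sample_rate <= 48000 else True


def subset_rice_partition_order_ok(order: int) -> bool:
    """Rice partition order must not exceed 8."""
    return order <= 8
```

   For example, a 44100 Hz stream with a block size of 4608 passes the
   first check, while the same stream with a block size of 8192 does
   not.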

643	11.4.  Conventions

645	   The following tables constitute a formal description of the FLAC
646	   format.  Values expressed as u(n) represent an unsigned big-endian
647	   integer using n bits.  n may be expressed as an equation using *
648	   (multiplication), / (division), + (addition), or - (subtraction).  An
649	   inclusive range of the number of bits expressed may be represented
650	   with an ellipsis, such as u(m...n).  The name of a value followed by
651	   an asterisk * indicates zero or more occurrences of the value.  The
652	   name of a value followed by a plus sign + indicates one or more
653	   occurrences of the value.

655	11.5.  STREAM

657	    +===========================+=====================================+
658	    | Data                      | Description                         |
659	    +===========================+=====================================+
660	    | u(32)                     | "fLaC", the FLAC stream marker in   |
661	    |                           | ASCII, meaning byte 0 of the stream |
662	    |                           | is 0x66, followed by 0x4C 0x61 0x43 |
663	    +---------------------------+-------------------------------------+
664	    | METADATA_BLOCK_STREAMINFO | This is the mandatory STREAMINFO    |
665	    |                           | metadata block that has the basic   |
666	    |                           | properties of the stream.           |
667	    +---------------------------+-------------------------------------+
668	    | METADATA_BLOCK*           | Zero or more metadata blocks        |
669	    +---------------------------+-------------------------------------+
670	    | FRAME+                    | One or more audio frames            |
671	    +---------------------------+-------------------------------------+

673	                                  Table 1

675	11.6.  METADATA_BLOCK

677	    +=======================+========================================+
678	    | Data                  | Description                            |
679	    +=======================+========================================+
680	    | METADATA_BLOCK_HEADER | A block header that specifies the type |
681	    |                       | and size of the metadata block data.   |
682	    +-----------------------+----------------------------------------+
683	    | METADATA_BLOCK_DATA   |                                        |
684	    +-----------------------+----------------------------------------+

686	                                 Table 2

688	11.7.  METADATA_BLOCK_HEADER

690	    +=======+=========================================================+
691	    | Data  | Description                                             |
692	    +=======+=========================================================+
693	    | u(1)  | Last-metadata-block flag: '1' if this block is the last |
694	    |       | metadata block before the audio frames, '0' otherwise.  |
695	    +-------+---------------------------------------------------------+
696	    | u(7)  | BLOCK_TYPE                                              |
697	    +-------+---------------------------------------------------------+
698	    | u(24) | Length (in bytes) of metadata to follow (does not       |
699	    |       | include the size of the METADATA_BLOCK_HEADER)          |
700	    +-------+---------------------------------------------------------+

702	                                  Table 3
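
   As an illustration of Tables 1 to 3, the following sketch checks the
   stream marker and then walks the metadata blocks using the 4-byte
   METADATA_BLOCK_HEADER.  It operates on a byte string already held in
   memory; the function names are hypothetical and error handling is
   kept to a minimum.

```python
def parse_block_header(header: bytes):
    """Split a METADATA_BLOCK_HEADER into (is_last, block_type, length)."""
    if len(header) != 4:
        raise ValueError("METADATA_BLOCK_HEADER is exactly 4 bytes")
    is_last = bool(header[0] & 0x80)             # u(1) last-block flag
    block_type = header[0] & 0x7F                # u(7) BLOCK_TYPE
    length = int.from_bytes(header[1:4], "big")  # u(24), big-endian
    return is_last, block_type, length


def iter_metadata_blocks(stream: bytes):
    """Yield (block_type, block_data) for each metadata block in order."""
    if stream[:4] != b"fLaC":
        raise ValueError("missing fLaC stream marker")
    pos = 4
    while True:
        is_last, block_type, length = parse_block_header(stream[pos:pos + 4])
        pos += 4
        yield block_type, stream[pos:pos + length]
        pos += length
        if is_last:
            break
```

   The first block yielded is expected to have type 0 (STREAMINFO);
   iteration stops at the block whose last-metadata-block flag is set,
   so the audio frames that follow are never interpreted as metadata.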

704	11.8.  BLOCK_TYPE

706	     +=========+====================================================+
707	     | Value   | Description                                        |
708	     +=========+====================================================+
709	     | 0       | STREAMINFO                                         |
710	     +---------+----------------------------------------------------+
711	     | 1       | PADDING                                            |
712	     +---------+----------------------------------------------------+
713	     | 2       | APPLICATION                                        |
714	     +---------+----------------------------------------------------+
715	     | 3       | SEEKTABLE                                          |
716	     +---------+----------------------------------------------------+
717	     | 4       | VORBIS_COMMENT                                     |
718	     +---------+----------------------------------------------------+
719	     | 5       | CUESHEET                                           |
720	     +---------+----------------------------------------------------+
721	     | 6       | PICTURE                                            |
722	     +---------+----------------------------------------------------+
723	     | 7 - 126 | reserved                                           |
724	     +---------+----------------------------------------------------+
725	     | 127     | invalid, to avoid confusion with a frame sync code |
726	     +---------+----------------------------------------------------+

728	                                 Table 4

730	11.9.  METADATA_BLOCK_DATA

732	   +===================================================+==============+
733	   | Data                                              | Description  |
734	   +===================================================+==============+
735	   | METADATA_BLOCK_STREAMINFO ||                      | The block    |
736	   | METADATA_BLOCK_PADDING ||                         | data MUST    |
737	   | METADATA_BLOCK_APPLICATION ||                     | match the    |
738	   | METADATA_BLOCK_SEEKTABLE ||                       | block type   |
739	   | METADATA_BLOCK_VORBIS_COMMENT ||                  | in the block |
740	   | METADATA_BLOCK_CUESHEET || METADATA_BLOCK_PICTURE | header.      |
741	   +---------------------------------------------------+--------------+

743	                                 Table 5

745	11.10.  METADATA_BLOCK_STREAMINFO

747	       +========+=================================================+
748	       | Data   | Description                                     |
749	       +========+=================================================+
750	       | u(16)  | The minimum block size (in samples) used in the |
751	       |        | stream.                                         |
752	       +--------+-------------------------------------------------+
753	       | u(16)  | The maximum block size (in samples) used in the |
754	       |        | stream.  (Minimum blocksize == maximum          |
755	       |        | blocksize) implies a fixed-blocksize stream.    |
756	       +--------+-------------------------------------------------+
757	       | u(24)  | The minimum frame size (in bytes) used in the   |
758	       |        | stream.  A value of 0 signifies that the value  |
759	       |        | is not known.                                   |
760	       +--------+-------------------------------------------------+
761	       | u(24)  | The maximum frame size (in bytes) used in the   |
762	       |        | stream.  A value of 0 signifies that the value  |
763	       |        | is not known.                                   |
764	       +--------+-------------------------------------------------+
765	       | u(20)  | Sample rate in Hz.  Though 20 bits are          |
766	       |        | available, the maximum sample rate is limited   |
767	       |        | by the structure of frame headers to 655350 Hz. |
768	       |        | Also, a value of 0 is invalid.                  |
769	       +--------+-------------------------------------------------+
770	       | u(3)   | (number of channels)-1.  FLAC supports from 1   |
771	       |        | to 8 channels                                   |
772	       +--------+-------------------------------------------------+
773	       | u(5)   | (bits per sample)-1.  FLAC supports from 4 to   |
774	       |        | 32 bits per sample.  Currently the reference    |
775	       |        | encoder and decoders only support up to 24 bits |
776	       |        | per sample.                                     |
777	       +--------+-------------------------------------------------+
778	       | u(36)  | Total samples in stream.  'Samples' means       |
779	       |        | inter-channel sample, i.e. one second of 44.1   |
780	       |        | kHz audio will have 44100 samples regardless of |
781	       |        | the number of channels.  A value of zero here   |
782	       |        | means the number of total samples is unknown.   |
783	       +--------+-------------------------------------------------+
784	       | u(128) | MD5 signature of the unencoded audio data.      |
785	       |        | This allows the decoder to determine if an      |
786	       |        | error exists in the audio data even when the    |
787	       |        | error does not result in an invalid bitstream.  |
788	       +--------+-------------------------------------------------+

790	                                 Table 6

792	   FLAC specifies a minimum block size of 16 and a maximum block size of
793	   65535, meaning the bit patterns corresponding to the numbers 0-15 in
794	   the minimum blocksize and maximum blocksize fields are invalid.
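
   The layout in Table 6 totals 272 bits (34 bytes).  The sketch below
   shows one way the fields might be unpacked, including the block size
   and sample rate validity rules stated above.  The function name and
   the dictionary keys are hypothetical.

```python
def parse_streaminfo(data: bytes) -> dict:
    """Unpack the 34-byte body of a STREAMINFO metadata block."""
    if len(data) != 34:
        raise ValueError("STREAMINFO body is 34 bytes")
    min_block = int.from_bytes(data[0:2], "big")
    max_block = int.from_bytes(data[2:4], "big")
    # The next 64 bits hold u(20) + u(3) + u(5) + u(36), after two u(24).
    packed = int.from_bytes(data[10:18], "big")
    info = {
        "min_blocksize": min_block,
        "max_blocksize": max_block,
        "min_framesize": int.from_bytes(data[4:7], "big"),   # 0 = unknown
        "max_framesize": int.from_bytes(data[7:10], "big"),  # 0 = unknown
        "sample_rate": packed >> 44,
        "channels": ((packed >> 41) & 0x7) + 1,
        "bits_per_sample": ((packed >> 36) & 0x1F) + 1,
        "total_samples": packed & ((1 << 36) - 1),           # 0 = unknown
        "md5": data[18:34],
    }
    if min_block < 16 or max_block < 16:
        raise ValueError("block sizes 0-15 are invalid")
    if info["sample_rate"] == 0:
        raise ValueError("a sample rate of 0 is invalid")
    return info
```

   A stream is fixed-blocksize when the returned min_blocksize and
   max_blocksize are equal.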

796	   The MD5 signature is made by performing an MD5 transformation on the
797	   samples of all channels interleaved, represented in signed, little-
798	   endian form.  This interleaving is on a per-sample basis, so for a
799	   stereo file this means first the first sample of the first channel,
800	   then the first sample of the second channel, then the second sample
801	   of the first channel etc.  Before performing the MD5 transformation,
802	   all samples must be byte-aligned.  So, in case the bit depth is not a
803	   whole number of bytes, additional zero bits are inserted at the most-
804	   significant position until each sample representation is a whole
805	   number of bytes.
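
   The interleaving described above can be sketched as follows.  To
   stay clear of the padding rule, this example handles only bit depths
   that are a whole number of bytes (8, 16, 24, 32) and rejects others.
   The function name and the list-of-lists input format are assumptions
   made for illustration.

```python
import hashlib


def md5_signature(channels, bits_per_sample: int) -> bytes:
    """MD5 over samples interleaved per sample, signed little-endian.

    channels: one sequence of signed integer samples per channel,
    all of equal length.
    """
    if bits_per_sample % 8 != 0:
        raise ValueError("sketch covers whole-byte bit depths only")
    width = bits_per_sample // 8
    md5 = hashlib.md5()
    for frame in zip(*channels):   # one sample from each channel in turn
        for sample in frame:
            md5.update(sample.to_bytes(width, "little", signed=True))
    return md5.digest()
```

   For a stereo stream, the input would be the decoded left and right
   channels; the resulting 16 bytes are what a decoder would compare
   with the u(128) field of STREAMINFO.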

807	11.11.  METADATA_BLOCK_PADDING

809	             +======+========================================+
810	             | Data | Description                            |
811	             +======+========================================+
812	             | u(n) | n '0' bits (n MUST be a multiple of 8) |
813	             +------+----------------------------------------+

815	                                  Table 7

817	11.12.  METADATA_BLOCK_APPLICATION

819	           +=======+===========================================+
820	           | Data  | Description                               |
821	           +=======+===========================================+
822	           | u(32) | Registered application ID.  (Visit the    |
823	           |       | registration page (https://xiph.org/flac/ |
824	           |       | id.html) to register an ID with FLAC.)    |
825	           +-------+-------------------------------------------+
826	           | u(n)  | Application data (n MUST be a multiple of |
827	           |       | 8)                                        |
828	           +-------+-------------------------------------------+

830	                                  Table 8

832	11.13.  METADATA_BLOCK_SEEKTABLE

834	                 +============+==========================+
835	                 | Data       | Description              |
836	                 +============+==========================+
837	                 | SEEKPOINT+ | One or more seek points. |
838	                 +------------+--------------------------+

840	                                  Table 9

842	   NOTE - The number of seek points is implied by the metadata header
843	   'length' field, i.e. equal to length / 18.

845	11.14.  SEEKPOINT

847	   +=======+==========================================================+
848	   | Data  | Description                                              |
849	   +=======+==========================================================+
850	   | u(64) | Sample number of first sample in the target frame, or    |
851	   |       | 0xFFFFFFFFFFFFFFFF for a placeholder point.              |
852	   +-------+----------------------------------------------------------+
853	   | u(64) | Offset (in bytes) from the first byte of the first frame |
854	   |       | header to the first byte of the target frame's header.   |
855	   +-------+----------------------------------------------------------+
856	   | u(16) | Number of samples in the target frame.                   |
857	   +-------+----------------------------------------------------------+

859	                                 Table 10

861	   NOTES

863	   *  For placeholder points, the second and third field values are
864	      undefined.

866	   *  Seek points within a table MUST be sorted in ascending order by
867	      sample number.

869	   *  Seek points within a table MUST be unique by sample number, with
870	      the exception of placeholder points.

872	   *  The previous two notes imply that there MAY be any number of
873	      placeholder points, but they MUST all occur at the end of the
874	      table.
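
   The SEEKPOINT layout and the ordering rules in the notes above can
   be sketched as follows.  Each point is unpacked as a
   (sample_number, byte_offset, frame_samples) tuple; the function
   names are hypothetical.

```python
import struct

PLACEHOLDER = 0xFFFFFFFFFFFFFFFF


def parse_seektable(data: bytes):
    """Return a list of (sample_number, byte_offset, frame_samples)."""
    if len(data) % 18 != 0:
        raise ValueError("SEEKTABLE length must be a multiple of 18")
    # ">QQH" is big-endian u(64), u(64), u(16): 18 bytes, no padding.
    return [struct.unpack_from(">QQH", data, off)
            for off in range(0, len(data), 18)]


def seektable_is_valid(points) -> bool:
    """Ascending and unique by sample number; placeholders may repeat."""
    numbers = [p[0] for p in points]
    for a, b in zip(numbers, numbers[1:]):
        if a > b or (a == b and a != PLACEHOLDER):
            return False
    return True
```

   Because the placeholder value is the largest possible u(64), the
   ascending-order test alone forces every placeholder to the end of
   the table.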

876	11.15.  METADATA_BLOCK_VORBIS_COMMENT

878	   +======+===========================================================+
879	   | Data | Description                                               |
880	   +======+===========================================================+
881	   | u(n) | Also known as FLAC tags, the contents of a vorbis comment |
882	   |      | packet as specified here (http://www.xiph.org/vorbis/doc/ |
883	   |      | v-comment.html) (without the framing bit).  Note that the |
884	   |      | vorbis comment spec allows for on the order of 2^64 bytes |
885	   |      | of data whereas the FLAC metadata block is limited to     |
886	   |      | 2^24 bytes.  Given the stated purpose of vorbis comments, |
887	   |      | i.e. human-readable textual information, this limit is    |
888	   |      | unlikely to be restrictive.  Also note that the 32-bit    |
889	   |      | field lengths are little-endian coded according to the    |
890	   |      | vorbis spec, as opposed to the usual big-endian coding of |
891	   |      | fixed-length integers in the rest of FLAC.                |
892	   +------+-----------------------------------------------------------+

894	                                 Table 11
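
   The little-endian length fields mentioned in Table 11 are easy to
   get wrong, so a short sketch may help.  It assumes the layout given
   in the Vorbis comment specification (a length-prefixed vendor
   string, a 32-bit comment count, then length-prefixed "NAME=value"
   entries), which is not reproduced in this document.  The function
   name is hypothetical.

```python
import struct


def parse_vorbis_comment(data: bytes):
    """Return (vendor_string, [(name, value), ...]) from the block body."""
    pos = 0

    def read_u32_le() -> int:
        nonlocal pos
        (value,) = struct.unpack_from("<I", data, pos)  # little-endian
        pos += 4
        return value

    def read_string() -> str:
        nonlocal pos
        length = read_u32_le()
        if pos + length > len(data):
            raise ValueError("truncated VORBIS_COMMENT block")
        raw = data[pos:pos + length]
        pos += length
        return raw.decode("utf-8")

    vendor = read_string()
    tags = []
    for _ in range(read_u32_le()):
        name, sep, value = read_string().partition("=")
        if not sep:
            raise ValueError("comment entry without '='")
        tags.append((name, value))
    return vendor, tags
```

   Note that the enclosing METADATA_BLOCK_HEADER length is still
   big-endian; only the fields inside the block body use little-endian
   coding.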

896	11.16.  METADATA_BLOCK_CUESHEET

898	   +=================+================================================+
899	   | Data            | Description                                    |
900	   +=================+================================================+
901	   | u(128*8)        | Media catalog number, in ASCII printable       |
902	   |                 | characters 0x20-0x7E.  In general, the media   |
903	   |                 | catalog number SHOULD be 0 to 128 bytes long;  |
904	   |                 | any unused characters SHOULD be right-padded   |
905	   |                 | with NUL characters.  For CD-DA, this is a     |
906	   |                 | thirteen digit number, followed by 115 NUL     |
907	   |                 | bytes.                                         |
908	   +-----------------+------------------------------------------------+
909	   | u(64)           | The number of lead-in samples.  This field has |
910	   |                 | meaning only for CD-DA cuesheets; for other    |
911	   |                 | uses it SHOULD be 0.  For CD-DA, the lead-in   |
912	   |                 | is the TRACK 00 area where the table of        |
913	   |                 | contents is stored; more precisely, it is the  |
914	   |                 | number of samples from the first sample of the |
915	   |                 | media to the first sample of the first index   |
916	   |                 | point of the first track.  According to the    |
917	   |                 | Red Book, the lead-in MUST be silence and CD   |
918	   |                 | grabbing software does not usually store it;   |
919	   |                 | additionally, the lead-in MUST be at least two |
920	   |                 | seconds but MAY be longer.  For these reasons  |
921	   |                 | the lead-in length is stored here so that the  |
922	   |                 | absolute position of the first track can be    |
923	   |                 | computed.  Note that the lead-in stored here   |
924	   |                 | is the number of samples up to the first index |
925	   |                 | point of the first track, not necessarily to   |
926	   |                 | INDEX 01 of the first track; even the first    |
927	   |                 | track MAY have INDEX 00 data.                  |
928	   +-----------------+------------------------------------------------+
929	   | u(1)            | 1 if the CUESHEET corresponds to a Compact     |
930	   |                 | Disc, else 0.                                  |
931	   +-----------------+------------------------------------------------+
932	   | u(7+258*8)      | Reserved.  All bits MUST be set to zero.       |
933	   +-----------------+------------------------------------------------+
934	   | u(8)            | The number of tracks.  MUST be at least 1      |
935	   |                 | (because of the requisite lead-out track).     |
936	   |                 | For CD-DA, this number MUST be no more than    |
937	   |                 | 100 (99 regular tracks and one lead-out        |
938	   |                 | track).                                        |
939	   +-----------------+------------------------------------------------+
940	   | CUESHEET_TRACK+ | One or more tracks.  A CUESHEET block is       |
941	   |                 | REQUIRED to have a lead-out track; it is       |
942	   |                 | always the last track in the CUESHEET.  For    |
943	   |                 | CD-DA, the lead-out track number MUST be 170   |
944	   |                 | as specified by the Red Book, otherwise it     |
945	   |                 | MUST be 255.                                   |
946	   +-----------------+------------------------------------------------+

948	                                 Table 12

950	11.17.  CUESHEET_TRACK

952	   +=====================+=================================================+
953	   |Data                 |Description                                      |
954	   +=====================+=================================================+
955	   |u(64)                |Track offset in samples, relative to the         |
956	   |                     |beginning of the FLAC audio stream.  It is the   |
957	   |                     |offset to the first index point of the track.    |
958	   |                     |(Note how this differs from CD-DA, where the     |
959	   |                     |track's offset in the TOC is that of the track's |
960	   |                     |INDEX 01 even if there is an INDEX 00.)  For CD- |
961	   |                     |DA, the offset MUST be evenly divisible by 588   |
962	   |                     |samples (588 samples = 44100 samples/s * 1/75 s).|
963	   +---------------------+-------------------------------------------------+
964	   |u(8)                 |Track number.  A track number of 0 is not allowed|
965	   |                     |to avoid conflicting with the CD-DA spec, which  |
966	   |                     |reserves this for the lead-in.  For CD-DA the    |
967	   |                     |number MUST be 1-99, or 170 for the lead-out; for|
968	   |                     |non-CD-DA, the track number MUST be 255 for the  |
969	   |                     |lead-out.  It is not REQUIRED but encouraged to  |
970	   |                     |start with track 1 and increase sequentially.    |
971	   |                     |Track numbers MUST be unique within a CUESHEET.  |
972	   +---------------------+-------------------------------------------------+
973	   |u(12*8)              |Track ISRC.  This is a 12-digit alphanumeric     |
974	   |                     |code; see here (http://isrc.ifpi.org/) and here  |
975	   |                     |(http://www.disctronics.co.uk/technology/cdaudio/|
976	   |                     |cdaud_isrc.htm).  A value of 12 ASCII NUL        |
977	   |                     |characters MAY be used to denote absence of an   |
978	   |                     |ISRC.                                            |
979	   +---------------------+-------------------------------------------------+
980	   |u(1)                 |The track type: 0 for audio, 1 for non-audio.    |
981	   |                     |This corresponds to the CD-DA Q-channel control  |
982	   |                     |bit 3.                                           |
983	   +---------------------+-------------------------------------------------+
984	   |u(1)                 |The pre-emphasis flag: 0 for no pre-emphasis, 1  |
985	   |                     |for pre-emphasis.  This corresponds to the CD-DA |
986	   |                     |Q-channel control bit 5; see here                |
987	   |                     |(http://www.chipchapin.com/CDMedia/cdda9.php3).  |
988	   +---------------------+-------------------------------------------------+
989	   |u(6+13*8)            |Reserved.  All bits MUST be set to zero.         |
990	   +---------------------+-------------------------------------------------+
991	   |u(8)                 |The number of track index points.  There MUST be |
992	   |                     |at least one index in every track in a CUESHEET  |
993	   |                     |except for the lead-out track, which MUST have   |
994	   |                     |zero.  For CD-DA, this number SHOULD NOT be more |
995	   |                     |than 100.                                        |
996	   +---------------------+-------------------------------------------------+
997	   |CUESHEET_TRACK_INDEX+|For all tracks except the lead-out track, one or |
998	   |                     |more track index points.                         |
999	   +---------------------+-------------------------------------------------+

1001	                                  Table 13

1003	11.18.  CUESHEET_TRACK_INDEX

1005	   +========+=========================================================+
1006	   | Data   | Description                                             |
1007	   +========+=========================================================+
1008	   | u(64)  | Offset in samples, relative to the track offset, of the |
1009	   |        | index point.  For CD-DA, the offset MUST be evenly      |
1010	   |        | divisible by 588 samples (588 samples = 44100 samples/s |
1011	   |        | * 1/75 s).  Note that the offset is from the beginning  |
1012	   |        | of the track, not the beginning of the audio data.      |
1013	   +--------+---------------------------------------------------------+
1014	   | u(8)   | The index point number.  For CD-DA, an index number of  |
1015	   |        | 0 corresponds to the track pre-gap.  The first index in |
1016	   |        | a track MUST have a number of 0 or 1, and subsequently, |
1017	   |        | index numbers MUST increase by 1.  Index numbers MUST   |
1018	   |        | be unique within a track.                               |
1019	   +--------+---------------------------------------------------------+
1020	   | u(3*8) | Reserved.  All bits MUST be set to zero.                |
1021	   +--------+---------------------------------------------------------+

1023	                                 Table 14

1025	11.19.  METADATA_BLOCK_PICTURE

1027	       +========+==================================================+
1028	       | Data   | Description                                      |
1029	       +========+==================================================+
1030	       | u(32)  | The PICTURE_TYPE according to the ID3v2 APIC     |
1031	       |        | frame.                                           |
1032	       +--------+--------------------------------------------------+
1033	       | u(32)  | The length of the MIME type string in bytes.     |
1034	       +--------+--------------------------------------------------+
1035	       | u(n*8) | The MIME type string, in printable ASCII         |
1036	       |        | characters 0x20-0x7E.  The MIME type MAY also be |
1037	       |        | --> to signify that the data part is a URL of    |
1038	       |        | the picture instead of the picture data itself.  |
1039	       +--------+--------------------------------------------------+
1040	       | u(32)  | The length of the description string in bytes.   |
1041	       +--------+--------------------------------------------------+
1042	       | u(n*8) | The description of the picture, in UTF-8.        |
1043	       +--------+--------------------------------------------------+
1044	       | u(32)  | The width of the picture in pixels.              |
1045	       +--------+--------------------------------------------------+
1046	       | u(32)  | The height of the picture in pixels.             |
1047	       +--------+--------------------------------------------------+
1048	       | u(32)  | The color depth of the picture in bits-per-      |
1049	       |        | pixel.                                           |
1050	       +--------+--------------------------------------------------+
1051	       | u(32)  | For indexed-color pictures (e.g. GIF), the       |
1052	       |        | number of colors used, or 0 for non-indexed      |
1053	       |        | pictures.                                        |
1054	       +--------+--------------------------------------------------+
1055	       | u(32)  | The length of the picture data in bytes.         |
1056	       +--------+--------------------------------------------------+
1057	       | u(n*8) | The binary picture data.                         |
1058	       +--------+--------------------------------------------------+

1060	                                  Table 15

1062	11.20.  PICTURE_TYPE

1064	              +=======+=====================================+
1065	              | Value | Description                         |
1066	              +=======+=====================================+
1067	              |     0 | Other                               |
1068	              +-------+-------------------------------------+
1069	              |     1 | 32x32 pixels 'file icon' (PNG only) |
1070	              +-------+-------------------------------------+
1071	              |     2 | Other file icon                     |
1072	              +-------+-------------------------------------+
1073	              |     3 | Cover (front)                       |
1074	              +-------+-------------------------------------+
1075	              |     4 | Cover (back)                        |
1076	              +-------+-------------------------------------+
1077	              |     5 | Leaflet page                        |
1078	              +-------+-------------------------------------+
1079	              |     6 | Media (e.g. label side of CD)       |
1080	              +-------+-------------------------------------+
1081	              |     7 | Lead artist/lead performer/soloist  |
1082	              +-------+-------------------------------------+
1083	              |     8 | Artist/performer                    |
1084	              +-------+-------------------------------------+
1085	              |     9 | Conductor                           |
1086	              +-------+-------------------------------------+
1087	              |    10 | Band/Orchestra                      |
1088	              +-------+-------------------------------------+
1089	              |    11 | Composer                            |
1090	              +-------+-------------------------------------+
1091	              |    12 | Lyricist/text writer                |
1092	              +-------+-------------------------------------+
1093	              |    13 | Recording Location                  |
1094	              +-------+-------------------------------------+
1095	              |    14 | During recording                    |
1096	              +-------+-------------------------------------+
1097	              |    15 | During performance                  |
1098	              +-------+-------------------------------------+
1099	              |    16 | Movie/video screen capture          |
1100	              +-------+-------------------------------------+
1101	              |    17 | A bright colored fish               |
1102	              +-------+-------------------------------------+
1103	              |    18 | Illustration                        |
1104	              +-------+-------------------------------------+
1105	              |    19 | Band/artist logotype                |
1106	              +-------+-------------------------------------+
1107	              |    20 | Publisher/Studio logotype           |
1108	              +-------+-------------------------------------+
1109	                                  Table 16

1111	   Other values are reserved and SHOULD NOT be used.  There MAY be at
1112	   most one picture each of type 1 and type 2 in a file.

1114	11.21.  FRAME

1116	            +==============+=================================+
1117	            | Data         | Description                     |
1118	            +==============+=================================+
1119	            | FRAME_HEADER |                                 |
1120	            +--------------+---------------------------------+
1121	            | SUBFRAME+    | One SUBFRAME per channel.       |
1122	            +--------------+---------------------------------+
1123	            | u(?)         | Zero-padding to byte alignment. |
1124	            +--------------+---------------------------------+
1125	            | FRAME_FOOTER |                                 |
1126	            +--------------+---------------------------------+

1128	                                 Table 17

1130	11.22.  FRAME_HEADER

1132	                +=======+================================+
1133	                | Data  | Description                    |
1134	                +=======+================================+
1135	                | u(14) | Sync code '0b11111111111110'   |
1136	                +-------+--------------------------------+
1137	                | u(1)  | FRAME HEADER RESERVED          |
1138	                +-------+--------------------------------+
1139	                | u(1)  | BLOCKING STRATEGY              |
1140	                +-------+--------------------------------+
1141	                | u(4)  | INTERCHANNEL SAMPLE BLOCK SIZE |
1142	                +-------+--------------------------------+
1143	                | u(4)  | SAMPLE RATE                    |
1144	                +-------+--------------------------------+
1145	                | u(4)  | CHANNEL ASSIGNMENT             |
1146	                +-------+--------------------------------+
1147	                | u(3)  | SAMPLE SIZE                    |
1148	                +-------+--------------------------------+
1149	                | u(1)  | FRAME HEADER RESERVED2         |
1150	                +-------+--------------------------------+
1151	                | u(?)  | CODED NUMBER                   |
1152	                +-------+--------------------------------+
1153	                | u(?)  | BLOCK SIZE INT                 |
1154	                +-------+--------------------------------+
1155	                | u(?)  | SAMPLE RATE INT                |
1156	                +-------+--------------------------------+
1157	                | u(8)  | FRAME CRC                      |
1158	                +-------+--------------------------------+

1160	                                 Table 18

1162	11.22.1.  FRAME HEADER RESERVED

1164	                    +=======+=========================+
1165	                    | Value | Description             |
1166	                    +=======+=========================+
1167	                    |     0 | mandatory value         |
1168	                    +-------+-------------------------+
1169	                    |     1 | reserved for future use |
1170	                    +-------+-------------------------+

1172	                                  Table 19

1174	   FRAME HEADER RESERVED MUST remain 0 in order for a FLAC frame's
1175	   initial 15 bits to be distinguishable from the start of an
1176	   MPEG audio frame (see also http://lists.xiph.org/pipermail/flac-
1177	   dev/2008-December/002607.html).

1179	11.22.2.  BLOCKING STRATEGY

1181	               +=======+==================================+
1182	               | Value | Description                      |
1183	               +=======+==================================+
1184	               |     0 | fixed-blocksize stream; frame    |
1185	               |       | header encodes the frame number  |
1186	               +-------+----------------------------------+
1187	               |     1 | variable-blocksize stream; frame |
1188	               |       | header encodes the sample number |
1189	               +-------+----------------------------------+

1191	                                 Table 20

1193	   The BLOCKING STRATEGY bit MUST be the same throughout the entire
1194	   stream.

1196	   The BLOCKING STRATEGY bit determines how to calculate the sample
1197	   number of the first sample in the frame.  If the bit is 0 (fixed-
1198	   blocksize), the frame header encodes the frame number as above, and
1199	   the frame's starting sample number will be the frame number times the
1200	   blocksize.  If it is 1 (variable-blocksize), the frame header encodes
1201	   the frame's starting sample number itself.  (In the case of a fixed-
1202	   blocksize stream, only the last block MAY be shorter than the stream
1203	   blocksize; its starting sample number will be calculated as the frame
1204	   number times the previous frame's blocksize, or zero if it is the
1205	   first frame.)
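
   As a minimal sketch of the rule above, the starting sample number
   of a frame could be derived as follows.  The function and parameter
   names are illustrative and not part of this specification; the
   stream blocksize is assumed to be known already (for example, from
   STREAMINFO).

```python
def first_sample_number(blocking_strategy: int,
                        coded_number: int,
                        stream_blocksize: int) -> int:
    """Return the sample number of the first sample in a frame.

    blocking_strategy: the BLOCKING STRATEGY bit (0 or 1).
    coded_number: the decoded CODED NUMBER field of the frame header.
    stream_blocksize: blocksize of a fixed-blocksize stream (assumed
        known to the caller; ignored for variable-blocksize streams).
    """
    if blocking_strategy == 1:
        # Variable-blocksize: the header carries the sample number itself.
        return coded_number
    # Fixed-blocksize: the header carries the frame number.
    return coded_number * stream_blocksize
```

   Note that for the last, possibly shorter, frame of a fixed-blocksize
   stream the multiplication still uses the stream blocksize, not the
   blocksize of that last frame.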

1207	11.22.3.  INTERCHANNEL SAMPLE BLOCK SIZE

1209	       +=================+=========================================+
1210	       |           Value | Description                             |
1211	       +=================+=========================================+
1212	       |          0b0000 | reserved                                |
1213	       +-----------------+-----------------------------------------+
1214	       |          0b0001 | 192 samples                             |
1215	       +-----------------+-----------------------------------------+
1216	       | 0b0010 - 0b0101 | 576 * (2^(n-2)) samples, i.e. 576,      |
1217	       |                 | 1152, 2304 or 4608                      |
1218	       +-----------------+-----------------------------------------+
1219	       |          0b0110 | get 8 bit (blocksize-1) from end of     |
1220	       |                 | header                                  |
1221	       +-----------------+-----------------------------------------+
1222	       |          0b0111 | get 16 bit (blocksize-1) from end of    |
1223	       |                 | header                                  |
1224	       +-----------------+-----------------------------------------+
1225	       | 0b1000 - 0b1111 | 256 * (2^(n-8)) samples, i.e. 256, 512, |
1226	       |                 | 1024, 2048, 4096, 8192, 16384 or 32768  |
1227	       +-----------------+-----------------------------------------+

1229	                                  Table 21
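
   The mapping in Table 21 can be sketched as a small lookup routine.
   This is an illustration only: the function name and the
   "tail_value" parameter (the raw 8-bit or 16-bit BLOCK SIZE INT
   field, read by the caller from the end of the header) are
   assumptions made for the example.

```python
from typing import Optional


def decode_block_size(code: int, tail_value: Optional[int] = None) -> int:
    """Map the 4-bit block size code to a block size in samples."""
    if code == 0b0000:
        raise ValueError("block size code 0b0000 is reserved")
    if code == 0b0001:
        return 192
    if 0b0010 <= code <= 0b0101:
        return 576 << (code - 2)       # 576, 1152, 2304, 4608
    if code in (0b0110, 0b0111):
        if tail_value is None:
            raise ValueError("BLOCK SIZE INT is required for this code")
        return tail_value + 1          # field stores (blocksize - 1)
    if 0b1000 <= code <= 0b1111:
        return 256 << (code - 8)       # 256 ... 32768
    raise ValueError("block size code must fit in 4 bits")
```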

1231	11.22.4.  SAMPLE RATE

1233	     +========+=====================================================+
1234	     |  Value | Description                                         |
1235	     +========+=====================================================+
1236	     | 0b0000 | get from STREAMINFO metadata block                  |
1237	     +--------+-----------------------------------------------------+
1238	     | 0b0001 | 88.2 kHz                                            |
1239	     +--------+-----------------------------------------------------+
1240	     | 0b0010 | 176.4 kHz                                           |
1241	     +--------+-----------------------------------------------------+
1242	     | 0b0011 | 192 kHz                                             |
1243	     +--------+-----------------------------------------------------+
1244	     | 0b0100 | 8 kHz                                               |
1245	     +--------+-----------------------------------------------------+
1246	     | 0b0101 | 16 kHz                                              |
1247	     +--------+-----------------------------------------------------+
1248	     | 0b0110 | 22.05 kHz                                           |
1249	     +--------+-----------------------------------------------------+
1250	     | 0b0111 | 24 kHz                                              |
1251	     +--------+-----------------------------------------------------+
1252	     | 0b1000 | 32 kHz                                              |
1253	     +--------+-----------------------------------------------------+
1254	     | 0b1001 | 44.1 kHz                                            |
1255	     +--------+-----------------------------------------------------+
1256	     | 0b1010 | 48 kHz                                              |
1257	     +--------+-----------------------------------------------------+
1258	     | 0b1011 | 96 kHz                                              |
1259	     +--------+-----------------------------------------------------+
1260	     | 0b1100 | get 8 bit sample rate (in kHz) from end of header   |
1261	     +--------+-----------------------------------------------------+
1262	     | 0b1101 | get 16 bit sample rate (in Hz) from end of header   |
1263	     +--------+-----------------------------------------------------+
1264	     | 0b1110 | get 16 bit sample rate (in daHz) from end of header |
1265	     +--------+-----------------------------------------------------+
1266	     | 0b1111 | invalid, to prevent sync-fooling string of 1s       |
1267	     +--------+-----------------------------------------------------+

1269	                                 Table 22
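
   A decoder might resolve the 4-bit sample rate code along the
   following lines.  This is a hedged sketch: the names are
   hypothetical, "streaminfo_rate" stands for the sample rate taken
   from the STREAMINFO metadata block, and "tail_value" stands for the
   raw SAMPLE RATE INT field read from the end of the header.

```python
from typing import Optional

# Fixed sample rates from Table 22, in Hz.
_FIXED_RATES = {
    0b0001: 88200, 0b0010: 176400, 0b0011: 192000, 0b0100: 8000,
    0b0101: 16000, 0b0110: 22050, 0b0111: 24000, 0b1000: 32000,
    0b1001: 44100, 0b1010: 48000, 0b1011: 96000,
}


def decode_sample_rate(code: int, streaminfo_rate: int,
                       tail_value: Optional[int] = None) -> int:
    """Return the sample rate in Hz for a 4-bit sample rate code."""
    if code == 0b0000:
        return streaminfo_rate
    if code in _FIXED_RATES:
        return _FIXED_RATES[code]
    if code == 0b1111:
        raise ValueError("sample rate code 0b1111 is invalid")
    if tail_value is None:
        raise ValueError("SAMPLE RATE INT is required for this code")
    if code == 0b1100:
        return tail_value * 1000   # 8-bit value in kHz
    if code == 0b1101:
        return tail_value          # 16-bit value in Hz
    if code == 0b1110:
        return tail_value * 10     # 16-bit value in daHz (tens of Hz)
    raise ValueError("sample rate code must fit in 4 bits")
```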

1271	11.22.5.  CHANNEL ASSIGNMENT

1273	   Values 0b0000-0b0111 represent (number of channels)-1, with each
1274	   channel coded independently; channel order follows SMPTE/ITU-R
1275	   recommendations.  Values 0b1000-0b1010 represent 2-channel (stereo)
1276	   audio where the signal has been mapped to a different
1277	   representation; see the section on Interchannel Decorrelation
1278	   (#interchannel-decorrelation).

1280	    +==========+======================================================+
1281	    |    Value | Description                                          |
1282	    +==========+======================================================+
1283	    |   0b0000 | 1 channel: mono                                      |
1284	    +----------+------------------------------------------------------+
1285	    |   0b0001 | 2 channels: left, right                              |
1286	    +----------+------------------------------------------------------+
1287	    |   0b0010 | 3 channels: left, right, center                      |
1288	    +----------+------------------------------------------------------+
1289	    |   0b0011 | 4 channels: front left, front right, back left, back |
1290	    |          | right                                                |
1291	    +----------+------------------------------------------------------+
1292	    |   0b0100 | 5 channels: front left, front right, front center,   |
1293	    |          | back/surround left, back/surround right              |
1294	    +----------+------------------------------------------------------+
1295	    |   0b0101 | 6 channels: front left, front right, front center,   |
1296	    |          | LFE, back/surround left, back/surround right         |
1297	    +----------+------------------------------------------------------+
1298	    |   0b0110 | 7 channels: front left, front right, front center,   |
1299	    |          | LFE, back center, side left, side right              |
1300	    +----------+------------------------------------------------------+
1301	    |   0b0111 | 8 channels: front left, front right, front center,   |
1302	    |          | LFE, back left, back right, side left, side right    |
1303	    +----------+------------------------------------------------------+
1304	    |   0b1000 | left/side stereo: channel 0 is the left channel,     |
1305	    |          | channel 1 is the side(difference) channel            |
1306	    +----------+------------------------------------------------------+
1307	    |   0b1001 | right/side stereo: channel 0 is the side(difference) |
1308	    |          | channel, channel 1 is the right channel              |
1309	    +----------+------------------------------------------------------+
1310	    |   0b1010 | mid/side stereo: channel 0 is the mid(average)       |
1311	    |          | channel, channel 1 is the side(difference) channel   |
1312	    +----------+------------------------------------------------------+
1313	    | 0b1011 - | reserved                                             |
1314	    |   0b1111 |                                                      |
1315	    +----------+------------------------------------------------------+

1317	                                  Table 23

1319	   Please note that the actual coded subframe order for right/side
1320	   stereo is side-right.
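
   For illustration, the channel assignment code in Table 23 could be
   interpreted as shown here.  The function name and the mode strings
   are assumptions chosen for this sketch.

```python
from typing import Tuple

# Stereo decorrelation modes; the string names the coded subframe order.
_STEREO_MODES = {
    0b1000: "left/side",   # channel 0 = left, channel 1 = side
    0b1001: "side/right",  # channel 0 = side, channel 1 = right
    0b1010: "mid/side",    # channel 0 = mid,  channel 1 = side
}


def decode_channel_assignment(code: int) -> Tuple[int, str]:
    """Return (number of channels, mode) for a 4-bit channel code."""
    if 0b0000 <= code <= 0b0111:
        return code + 1, "independent"
    if code in _STEREO_MODES:
        return 2, _STEREO_MODES[code]
    raise ValueError("channel assignment code is reserved")
```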

1322	11.22.6.  SAMPLE SIZE

1324	              +=======+====================================+
1325	              | Value | Description                        |
1326	              +=======+====================================+
1327	              | 0b000 | get from STREAMINFO metadata block |
1328	              +-------+------------------------------------+
1329	              | 0b001 | 8 bits per sample                  |
1330	              +-------+------------------------------------+
1331	              | 0b010 | 12 bits per sample                 |
1332	              +-------+------------------------------------+
1333	              | 0b011 | reserved                           |
1334	              +-------+------------------------------------+
1335	              | 0b100 | 16 bits per sample                 |
1336	              +-------+------------------------------------+
1337	              | 0b101 | 20 bits per sample                 |
1338	              +-------+------------------------------------+
1339	              | 0b110 | 24 bits per sample                 |
1340	              +-------+------------------------------------+
1341	              | 0b111 | reserved                           |
1342	              +-------+------------------------------------+

1344	                                 Table 24

1346	   For subframes that encode a difference channel, the sample size is
1347	   one bit larger than the sample size of the frame, in order to be able
1348	   to encode the difference between extreme values.
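
   The following sketch shows one way to combine the sample size code
   with the channel assignment to obtain the sample size of an
   individual subframe.  All names are hypothetical;
   "streaminfo_bps" stands for the bits-per-sample value from the
   STREAMINFO metadata block.

```python
# Sample sizes from Table 24, in bits per sample.
_SAMPLE_SIZES = {0b001: 8, 0b010: 12, 0b100: 16, 0b101: 20, 0b110: 24}

# Index of the side (difference) subframe for each stereo mode.
_SIDE_CHANNEL = {0b1000: 1, 0b1001: 0, 0b1010: 1}


def frame_bits_per_sample(code: int, streaminfo_bps: int) -> int:
    """Resolve the 3-bit sample size code of a frame header."""
    if code == 0b000:
        return streaminfo_bps
    if code in _SAMPLE_SIZES:
        return _SAMPLE_SIZES[code]
    raise ValueError("sample size code is reserved")


def subframe_bits_per_sample(frame_bps: int, channel_assignment: int,
                             channel_index: int) -> int:
    """A side (difference) subframe needs one extra bit per sample."""
    if _SIDE_CHANNEL.get(channel_assignment) == channel_index:
        return frame_bps + 1
    return frame_bps
```

   For example, in a 16-bit right/side frame the first subframe (the
   side channel) is coded with 17 bits per sample and the second with
   16.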

1350	11.22.7.  FRAME HEADER RESERVED2

1352	                    +=======+=========================+
1353	                    | Value | Description             |
1354	                    +=======+=========================+
1355	                    |     0 | mandatory value         |
1356	                    +-------+-------------------------+
1357	                    |     1 | reserved for future use |
1358	                    +-------+-------------------------+

1360	                                  Table 25

1362	11.22.8.  CODED NUMBER

1364	   Frame/sample numbers are encoded using the variable-length scheme of
1365	   UTF-8, but extended beyond the 4-byte limit imposed by RFC 3629 to a
1366	   maximum of 7 bytes.

1368	   Note to implementors: Unicode-compliant UTF-8 decoders and encoders
1369	   are limited to 4 bytes, so a standard UTF-8 library cannot be used
1370	   for this field; a purpose-built encoder and decoder is needed.

1372	  if(variable blocksize)
1373	    `u(8...56)`: "UTF-8" coded sample number (decoded number is 36 bits)
1374	  else
1375	    `u(8...48)`: "UTF-8" coded frame number (decoded number is 31 bits)
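
   Because general-purpose UTF-8 libraries cannot be used for this
   field, a hand-written decoder is needed.  The sketch below is one
   possible approach, not a reference implementation; the function
   name and the (value, next position) return convention are
   assumptions.  It accepts up to 7 bytes and does not check whether a
   frame number stays within the 6-byte limit.

```python
from typing import Tuple


def decode_coded_number(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode a "UTF-8"-style coded number starting at data[pos].

    Returns (decoded value, position of the byte after the number).
    """
    first = data[pos]
    if first < 0x80:
        return first, pos + 1          # single byte, 7 payload bits

    # The count of leading 1 bits gives the total length in bytes.
    length = 0
    mask = 0x80
    while first & mask:
        length += 1
        mask >>= 1
    if length == 1 or length > 7:
        raise ValueError("invalid leading byte in coded number")

    value = first & (mask - 1)         # payload bits of the first byte
    for offset in range(1, length):
        byte = data[pos + offset]
        if byte & 0xC0 != 0x80:        # continuation bytes are 10xxxxxx
            raise ValueError("invalid continuation byte in coded number")
        value = (value << 6) | (byte & 0x3F)
    return value, pos + length
```

   A 7-byte sequence has the leading byte 0xFE followed by six
   continuation bytes, which yields the 36-bit maximum noted above.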

1377	11.22.9.  BLOCK SIZE INT

1379	   if(`INTERCHANNEL SAMPLE BLOCK SIZE` == 0b0110)
1380	     8 bit (blocksize-1)
1381	   else if(`INTERCHANNEL SAMPLE BLOCK SIZE` == 0b0111)
1382	     16 bit (blocksize-1)

1384	11.22.10.  SAMPLE RATE INT

1386	   if(`SAMPLE RATE` == 0b1100)
1387	     8 bit sample rate (in kHz)
1388	   else if(`SAMPLE RATE` == 0b1101)
1389	     16 bit sample rate (in Hz)
1390	   else if(`SAMPLE RATE` == 0b1110)
1391	     16 bit sample rate (in daHz)

1393	11.22.11.  FRAME CRC

1395	   CRC-8 (polynomial = x^8 + x^2 + x^1 + x^0, initialized with 0) of
1396	   everything before the CRC, including the sync code.
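
   A bitwise computation of this CRC-8 can be sketched as follows.
   The polynomial x^8 + x^2 + x^1 + x^0 corresponds to the constant
   0x07.  The sketch assumes most-significant-bit-first processing
   with no bit reflection and no final XOR, which this section does
   not state explicitly; the function name is illustrative.

```python
def crc8(data: bytes) -> int:
    """CRC-8 with polynomial 0x07, initial value 0, MSB first."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc
```

   Under these assumptions, running the function over a complete frame
   header including its FRAME CRC byte yields 0, which gives a simple
   validity check.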

1398	11.23.  FRAME_FOOTER

1400	       +=======+===================================================+
1401	       | Data  | Description                                       |
1402	       +=======+===================================================+
1403	       | u(16) | CRC-16 (polynomial = x^16 + x^15 + x^2 + x^0,     |
1404	       |       | initialized with 0) of everything before the CRC, |
1405	       |       | back to and including the frame header sync code  |
1406	       +-------+---------------------------------------------------+

1408	                                  Table 26
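
   The CRC-16 of the frame footer can be computed in the same bitwise
   manner.  The polynomial x^16 + x^15 + x^2 + x^0 corresponds to the
   constant 0x8005.  As an assumption not stated in the table, the
   sketch processes bits most-significant-bit first, without
   reflection or final XOR, and treats the stored CRC as big-endian;
   the function name is illustrative.

```python
def crc16(data: bytes) -> int:
    """CRC-16 with polynomial 0x8005, initial value 0, MSB first."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8005) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc
```

   Under these assumptions, running the function over a whole frame
   including its 16-bit footer yields 0 for an undamaged frame.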

1410	11.24.  SUBFRAME

1412	     +========================================+======================+
1413	     | Data                                   | Description          |
1414	     +========================================+======================+
1415	     | SUBFRAME_HEADER                        |                      |
1416	     +----------------------------------------+----------------------+
1417	     | SUBFRAME_CONSTANT || SUBFRAME_FIXED || | The SUBFRAME_HEADER  |
1418	     | SUBFRAME_LPC || SUBFRAME_VERBATIM      | specifies which one. |
1419	     +----------------------------------------+----------------------+

1421	                                  Table 27

1423	11.25.  SUBFRAME_HEADER

1425	    +========+========================================================+
1426	    | Data   | Description                                            |
1427	    +========+========================================================+
1428	    | u(1)   | Zero bit padding, to prevent sync-fooling string of 1s |
1429	    +--------+--------------------------------------------------------+
1430	    | u(6)   | SUBFRAME TYPE (see section on SUBFRAME TYPE            |
1431	    |        | (#subframe-type))                                      |
1432	    +--------+--------------------------------------------------------+
1433	    | u(1+k) | WASTED BITS PER SAMPLE FLAG (see section on WASTED     |
1434	    |        | BITS PER SAMPLE FLAG (#wasted-bits-per-sample-flag))   |
1435	    +--------+--------------------------------------------------------+

1437	                                  Table 28

1439	11.25.1.  SUBFRAME TYPE

1441	   +==========+=======================================================+
1442	   |    Value | Description                                           |
1443	   +==========+=======================================================+
1444	   | 0b000000 | SUBFRAME_CONSTANT                                     |
1445	   +----------+-------------------------------------------------------+
1446	   | 0b000001 | SUBFRAME_VERBATIM                                     |
1447	   +----------+-------------------------------------------------------+
1448	   | 0b00001x | reserved                                              |
1449	   +----------+-------------------------------------------------------+
1450	   | 0b0001xx | reserved                                              |
1451	   +----------+-------------------------------------------------------+
1452	   | 0b001xxx | if(xxx <= 4) SUBFRAME_FIXED, xxx=order; else reserved |
1453	   +----------+-------------------------------------------------------+
1454	   | 0b01xxxx | reserved                                              |
1455	   +----------+-------------------------------------------------------+
1456	   | 0b1xxxxx | SUBFRAME_LPC, xxxxx=order-1                           |
1457	   +----------+-------------------------------------------------------+
1458	                                 Table 29
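
   The bit patterns in Table 29 can be classified with a few
   comparisons.  The sketch below is illustrative only; the function
   name and the returned (kind, order) tuple are assumptions, and the
   order is reported as 0 for subframe types that have no predictor.

```python
from typing import Tuple


def decode_subframe_type(code: int) -> Tuple[str, int]:
    """Classify a 6-bit SUBFRAME TYPE value as (kind, predictor order)."""
    if not 0 <= code <= 0b111111:
        raise ValueError("subframe type must fit in 6 bits")
    if code == 0b000000:
        return "CONSTANT", 0
    if code == 0b000001:
        return "VERBATIM", 0
    if code & 0b100000:
        return "LPC", (code & 0b011111) + 1   # field stores order - 1
    if 0b001000 <= code <= 0b001100:
        return "FIXED", code & 0b000111       # orders 0 through 4
    raise ValueError("subframe type is reserved")
```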

1460	11.25.2.  WASTED BITS PER SAMPLE FLAG

1462	   Certain file formats, like AIFF, can store audio samples with a bit
1463	   depth that is not an integer number of bytes by padding them with
1464	   least significant zero bits to a bit depth that is an integer number
1465	   of bytes.  For example, shifting a 14-bit sample left by 2 pads it to
1466	   a 16-bit sample, which then has two zero least-significant bits.
1467	   In this specification, these least-significant zero bits are referred
1468	   to as wasted bits-per-sample or simply wasted bits.  They are wasted
1469	   in the sense that they contain no information, but are stored anyway.

1471	   The wasted bits-per-sample flag in a subframe header is set to 1 if a
1472	   certain number of least-significant bits of all samples in the
1473	   current subframe are zero.  If this is the case, the number of wasted
1474	   bits-per-sample (k) minus 1 follows the flag in a unary encoding.
1475	   For example, if k is 3, 0b001 follows.  If k = 0, the wasted bits-
1476	   per-sample flag is 0 and no unary coded k follows.

1478	   In case k is not equal to 0, samples are coded ignoring k least-
1479	   significant bits.  For example, if the preceding frame header
1480	   specified a sample size of 16 bits per sample and k is 3, samples in
1481	   the subframe are coded as 13 bits per sample.  A decoder MUST add k
1482	   least-significant zero bits by shifting left (padding) after decoding
1483	   a subframe sample.  In case the frame has left/side, right/side or
1484	   mid/side stereo, padding MUST happen to a sample before it is used to
1485	   reconstruct a left or right sample.
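
   The flag, the unary-coded count, and the padding step can be
   sketched as follows.  This is a simplified illustration: the bit
   source is modeled as a Python iterator yielding 0 or 1, and both
   function names are hypothetical.

```python
from typing import Iterator


def read_wasted_bits(bits: Iterator[int]) -> int:
    """Read the wasted-bits flag and, if set, the unary-coded k - 1."""
    if next(bits) == 0:
        return 0                 # flag clear: k = 0, nothing follows
    k = 1
    while next(bits) == 0:       # each 0 bit before the closing 1 adds one
        k += 1
    return k


def restore_sample(decoded: int, k: int) -> int:
    """Re-insert k zero least-significant bits into a decoded sample."""
    return decoded << k
```

   For k = 3 the bits read are the flag 1 followed by 0b001, matching
   the example above.  With stereo decorrelation, restore_sample would
   be applied to each subframe sample before left and right are
   reconstructed.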

1487	   Besides audio files that have a certain number of wasted bits for the
1488	   whole file, there exist audio files in which the number of wasted
1489	   bits varies.  There are DVD-Audio discs in which blocks of samples
1490	   have had their least-significant bits selectively zeroed, so as to
1491	   slightly improve the compression of their otherwise lossless Meridian
1492	   Lossless Packing codec.  There are also audio processors like
1493	   lossyWAV that enable users to improve compression of their files by a
1494	   lossless audio codec in a non-lossless way.  Because of this, the
1495	   number of wasted bits k MAY change between frames and MAY differ
1496	   between subframes.

1498	11.26.  SUBFRAME_CONSTANT

1500	             +======+========================================+
1501	             | Data | Description                            |
1502	             +======+========================================+
1503	             | u(n) | Unencoded constant value of the        |
1504	             |      | subblock, n = frame's bits-per-sample. |
1505	             +------+----------------------------------------+
1506	                                  Table 30

1508	11.27.  SUBFRAME_FIXED

1510	           +==========+========================================+
1511	           | Data     | Description                            |
1512	           +==========+========================================+
1513	           | u(n)     | Unencoded warm-up samples (n = frame's |
1514	           |          | bits-per-sample * predictor order).    |
1515	           +----------+----------------------------------------+
1516	           | RESIDUAL | Encoded residual                       |
1517	           +----------+----------------------------------------+

1519	                                  Table 31

1521	11.28.  SUBFRAME_LPC

1523	   +==========+========================================================+
1524	   | Data     | Description                                            |
1525	   +==========+========================================================+
1526	   | u(n)     | Unencoded warm-up samples (n = frame's bits-           |
1527	   |          | per-sample * lpc order).                               |
1528	   +----------+--------------------------------------------------------+
1529	   | u(4)     | (quantized linear predictor coefficients'              |
1530	   |          | precision in bits)-1 (NOTE: 0b1111 is invalid).        |
1531	   +----------+--------------------------------------------------------+
1532	   | u(5)     | Quantized linear predictor coefficient shift           |
1533	   |          | needed in bits (NOTE: this number is signed            |
1534	   |          | two's-complement).                                     |
1535	   +----------+--------------------------------------------------------+
1536	   | u(n)     | Unencoded predictor coefficients (n = qlp coeff        |
1537	   |          | precision * lpc order) (NOTE: the coefficients         |
1538	   |          | are signed two's-complement).                          |
1539	   +----------+--------------------------------------------------------+
1540	   | RESIDUAL | Encoded residual                                       |
1541	   +----------+--------------------------------------------------------+

1543	                                  Table 32
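
   Several fields in Table 32 are read as unsigned bit strings but
   interpreted as signed two's-complement numbers.  The helper
   functions below sketch that conversion and the handling of the
   precision field; their names are assumptions made for this example.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret an unsigned bit field of the given width as signed."""
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def decode_lpc_precision(code: int) -> int:
    """Return the coefficient precision in bits from the 4-bit field."""
    if code == 0b1111:
        raise ValueError("precision code 0b1111 is invalid")
    return code + 1              # field stores (precision - 1)
```

   For instance, the 5-bit shift field 0b11111 is interpreted as -1,
   and a coefficient read with a precision of 12 bits would be passed
   through sign_extend(raw, 12).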

1545	11.29.  SUBFRAME_VERBATIM

1547	         +=========+=============================================+
1548	         | Data    | Description                                 |
1549	         +=========+=============================================+
1550	         | u(n*i)  | Unencoded subblock, where n is frame's      |
1551	         |         | bits-per-sample and i is frame's blocksize. |
1552	         +---------+---------------------------------------------+
1553	                                  Table 33

1555	11.30.  RESIDUAL

1557	   +================================================+======================+
1558	   |Data                                            |Description           |
1559	   +================================================+======================+
1560	   |u(2)                                            |RESIDUAL_CODING_METHOD|
1561	   +------------------------------------------------+----------------------+
1562	   |RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB |||                      |
1563	   |RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB2  |                      |
1564	   +------------------------------------------------+----------------------+

1566	                                  Table 34

1568	11.30.1.  RESIDUAL_CODING_METHOD

1570	    +=======+========================================================+
1571	    | Value | Description                                            |
1572	    +=======+========================================================+
1573	    |  0b00 | partitioned Exp-Golomb coding with 4-bit Exp-Golomb    |
1574	    |       | parameter;                                             |
1575	    |       | RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB follows  |
1576	    +-------+--------------------------------------------------------+
1577	    |  0b01 | partitioned Exp-Golomb coding with 5-bit Exp-Golomb    |
1578	    |       | parameter;                                             |
1579	    |       | RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB2 follows |
1580	    +-------+--------------------------------------------------------+
1581	    |  0b10 | reserved                                               |
1582	    |     - |                                                        |
1583	    |  0b11 |                                                        |
1584	    +-------+--------------------------------------------------------+

1586	                                 Table 35

1588	11.30.2.  RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB

1590	       +=======================+===================================+
1591	       | Data                  | Description                       |
1592	       +=======================+===================================+
1593	       | u(4)                  | Partition order.                  |
1594	       +-----------------------+-----------------------------------+
1595	       | EXP_GOLOMB_PARTITION+ | There will be 2^order partitions. |
1596	       +-----------------------+-----------------------------------+

1598	                                  Table 36

1600	11.30.2.1.  EXP_GOLOMB_PARTITION

1602	     +==========+====================================================+
1603	     | Data     | Description                                        |
1604	     +==========+====================================================+
1605	     | u(4(+5)) | EXP-GOLOMB PARTITION ENCODING PARAMETER (see       |
1606	     |          | section on EXP-GOLOMB PARTITION ENCODING PARAMETER |
1607	     |          | (#exp-golomb-partition-encoding-parameter))        |
1608	     +----------+----------------------------------------------------+
1609	     | u(?)     | ENCODED RESIDUAL (see section on ENCODED RESIDUAL  |
1610	     |          | (#encoded-residual))                               |
1611	     +----------+----------------------------------------------------+

1613	                                  Table 37

1615	11.30.2.2.  EXP-GOLOMB PARTITION ENCODING PARAMETER

1617	          +==========+==========================================+
1618	          |    Value | Description                              |
1619	          +==========+==========================================+
1620	          | 0b0000 - | Exp-Golomb parameter.                    |
1621	          |   0b1110 |                                          |
1622	          +----------+------------------------------------------+
1623	          |   0b1111 | Escape code, meaning the partition is in |
1624	          |          | unencoded binary form using n bits per   |
1625	          |          | sample; n follows as a 5-bit number.     |
1626	          +----------+------------------------------------------+

1628	                                  Table 38

1630	11.30.3.  RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB2

1632	      +========================+===================================+
1633	      | Data                   | Description                       |
1634	      +========================+===================================+
1635	      | u(4)                   | Partition order.                  |
1636	      +------------------------+-----------------------------------+
1637	      | EXP_GOLOMB2_PARTITION+ | There will be 2^order partitions. |
1638	      +------------------------+-----------------------------------+

1640	                                 Table 39

1642	11.30.3.1.  EXP_GOLOMB2_PARTITION

1644	    +==========+=====================================================+
1645	    | Data     | Description                                         |
1646	    +==========+=====================================================+
1647	    | u(5(+5)) | EXP-GOLOMB2 PARTITION ENCODING PARAMETER (see       |
1648	    |          | Section 11.30.3.2)                                  |
1649	    +----------+-----------------------------------------------------+
1650	    | u(?)     | ENCODED RESIDUAL (see Section 11.30.4)              |
1651	    +----------+-----------------------------------------------------+

1655	                                 Table 40

1657	11.30.3.2.  EXP-GOLOMB2 PARTITION ENCODING PARAMETER

1659	         +===========+==========================================+
1660	         |     Value | Description                              |
1661	         +===========+==========================================+
1662	         | 0b00000 - | Exp-Golomb parameter.                    |
1663	         |   0b11110 |                                          |
1664	         +-----------+------------------------------------------+
1665	         |   0b11111 | Escape code, meaning the partition is in |
1666	         |           | unencoded binary form using n bits per   |
1667	         |           | sample; n follows as a 5-bit number.     |
1668	         +-----------+------------------------------------------+

1670	                                 Table 41
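
   The two parameter tables above differ only in field width (4 bits
   versus 5 bits), and in both the all-ones value is the escape code
   that is followed by a 5-bit count of bits per sample.  The following
   is a minimal sketch of reading that field, not a normative decoder.
   The BitReader class and the read_partition_parameter function are
   hypothetical names introduced for illustration, and the sketch
   assumes that bits are packed most-significant bit first.

```python
class BitReader:
    """Illustrative MSB-first bit reader over a bytes object (assumed helper)."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0  # current position, counted in bits

    def read(self, nbits: int) -> int:
        # Refuse to read past the end of the buffer instead of overrunning it.
        if self.pos + nbits > len(self.data) * 8:
            raise ValueError("attempt to read past end of data")
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def read_partition_parameter(reader: BitReader, param_bits: int):
    """Read one partition encoding parameter.

    param_bits is 4 for the first coding method and 5 for the second.
    Returns (escaped, value): if escaped is True, value is the number of
    bits per unencoded sample; otherwise value is the coding parameter.
    """
    if param_bits not in (4, 5):
        raise ValueError("parameter width must be 4 or 5 bits")
    escape_code = (1 << param_bits) - 1  # 0b1111 or 0b11111
    param = reader.read(param_bits)
    if param == escape_code:
        return True, reader.read(5)
    return False, param
```

   For example, the bit sequence 1111 01100 read with a 4-bit width
   gives the escape code followed by n = 12, so the call returns
   (True, 12).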

1672	11.30.4.  ENCODED RESIDUAL

1674	   The number of samples (n) in the partition is determined as follows:

1676	   *  if the partition order is zero, n = frame's blocksize - predictor
1677	      order

1679	   *  else if this is not the first partition of the subframe, n =
1680	      (frame's blocksize / (2^partition order))

1682	   *  else n = (frame's blocksize / (2^partition order)) - predictor
1683	      order
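
   The three cases above can be sketched as a single function.  This is
   an illustration only: partition_sample_count and its argument names
   are hypothetical, and the sketch assumes that the blocksize is an
   exact multiple of 2^partition order so that the division is exact.

```python
def partition_sample_count(blocksize: int, predictor_order: int,
                           partition_order: int, partition_index: int) -> int:
    """Return n, the number of residual samples in one partition.

    partition_index is 0 for the first partition of the subframe.
    """
    if partition_order == 0:
        # A single partition holds every sample except the warm-up samples.
        return blocksize - predictor_order
    samples_per_partition = blocksize >> partition_order  # blocksize / 2^order
    if partition_index != 0:
        return samples_per_partition
    # The first partition is shortened by the predictor order.
    return samples_per_partition - predictor_order
```

   With a blocksize of 4096, a predictor order of 8, and a partition
   order of 3, the first partition holds 504 samples and each of the
   remaining seven holds 512, for a total of 4096 - 8 = 4088.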

1685	12.  Security Considerations

1687	   Like any other codec (such as [RFC6716]), FLAC should not be used
1688	   with insecure ciphers or cipher modes that are vulnerable to
1689	   known-plaintext attacks.  Some of the header bits, as well as the
1690	   padding, are easily predictable.

1692	   Implementations of the FLAC codec need to take appropriate security
1693	   considerations into account.  Those related to denial of service are
1694	   outlined in Section 2.1 of [RFC4732].  It is extremely important for
1695	   the decoder to be robust against malicious payloads.  Malicious
1696	   payloads MUST NOT cause the decoder to overrun its allocated memory
1697	   or to take an excessive amount of resources to decode.  An overrun in
1698	   allocated memory could lead to arbitrary code execution by an
1699	   attacker.  The same applies to the encoder, even though problems in
1700	   encoders are typically rarer.  Malicious audio streams MUST NOT cause
1701	   the encoder to misbehave because this would allow an attacker to
1702	   attack transcoding gateways.  Examples of such misbehavior include
1703	   allocating more memory than is available (especially with blocksizes
1704	   above 10000 or with large metadata blocks) and not allocating enough
1705	   memory before copying data.  Such flaws have led to execution of
1706	   malicious code, crashes, freezes, or reboots in some known
1707	   implementations.  See the FLAC decoder testbench
1708	   (https://wiki.hydrogenaud.io/
1709	   index.php?title=FLAC_decoder_testbench) for a non-exhaustive list of
1710	   FLAC files with extreme configurations that trigger such failures.
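
   One way a decoder might guard against the memory problems described
   above is to check the residual layout before allocating or copying
   anything.  The sketch below is an assumption-laden illustration, not
   a requirement of this document: validate_residual_layout and
   limit_blocksize are hypothetical names, the limit is chosen by the
   caller, and the divisibility and minimum-size checks are inferred
   from the sample-count rules in Section 11.30.4.

```python
def validate_residual_layout(blocksize: int, predictor_order: int,
                             partition_order: int, limit_blocksize: int) -> int:
    """Return the samples per partition, or raise ValueError if the layout
    could not produce a sensible, non-negative sample count."""
    # limit_blocksize is a caller-chosen cap (an assumption of this sketch).
    if not 0 < blocksize <= limit_blocksize:
        raise ValueError("blocksize outside the accepted range")
    # The partition order is carried in a 4-bit field.
    if not 0 <= partition_order <= 15:
        raise ValueError("partition order does not fit in 4 bits")
    if not 0 <= predictor_order <= blocksize:
        raise ValueError("predictor order exceeds blocksize")
    partitions = 1 << partition_order
    # Inferred check: the division by 2^order should be exact.
    if blocksize % partitions != 0:
        raise ValueError("blocksize is not a multiple of the partition count")
    samples_per_partition = blocksize // partitions
    # Inferred check: the first partition must not have a negative length.
    if samples_per_partition < predictor_order:
        raise ValueError("first partition would have a negative sample count")
    return samples_per_partition
```

   Rejecting a stream at this point means that later buffer sizes are
   derived only from values that have already been bounded.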

1712	   None of the content carried in FLAC is intended to be executable.

1714	13.  Normative References

1716	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1717	              Requirement Levels", BCP 14, RFC 2119,
1718	              DOI 10.17487/RFC2119, March 1997,
1719	              <https://www.rfc-editor.org/info/rfc2119>.

1721	   [RFC4732]  Handley, M., Ed., Rescorla, E., Ed., and IAB, "Internet
1722	              Denial-of-Service Considerations", RFC 4732,
1723	              DOI 10.17487/RFC4732, December 2006,
1724	              <https://www.rfc-editor.org/info/rfc4732>.

1726	   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
1727	              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
1728	              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

1730	14.  Informative References

1732	   [RFC6716]  Valin, JM., Vos, K., and T. Terriberry, "Definition of the
1733	              Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716,
1734	              September 2012, <https://www.rfc-editor.org/info/rfc6716>.

1736	Authors' Addresses

1738	   Michael Richardson

1740	   Email: mcr@sandelman.ca

1742	   Andrew Weaver

1744	   Email: theandrewjw@gmail.com