idnits 2.17.1 

draft-xiph-cellar-flac-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (June 5, 2017) is 2516 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Missing reference section? '1' on line 1268 looks like a reference

  -- Missing reference section? '2' on line 1270 looks like a reference

  -- Missing reference section? '3' on line 1272 looks like a reference

  -- Missing reference section? '4' on line 1274 looks like a reference

  -- Missing reference section? '5' on line 1277 looks like a reference

  -- Missing reference section? '6' on line 1280 looks like a reference

  -- Missing reference section? '7' on line 1282 looks like a reference

  -- Missing reference section? '8' on line 1284 looks like a reference

  -- Missing reference section? '9' on line 1286 looks like a reference

  -- Missing reference section? '10' on line 1289 looks like a reference

  -- Missing reference section? '11' on line 1291 looks like a reference

  -- Missing reference section? '12' on line 1294 looks like a reference

  -- Missing reference section? '13' on line 1296 looks like a reference

  -- Missing reference section? '14' on line 1298 looks like a reference

  -- Missing reference section? '15' on line 1301 looks like a reference

  -- Missing reference section? '16' on line 1303 looks like a reference

  -- Missing reference section? '17' on line 1305 looks like a reference

  -- Missing reference section? '18' on line 1307 looks like a reference

  -- Missing reference section? '19' on line 1309 looks like a reference

  -- Missing reference section? '20' on line 1311 looks like a reference

  -- Missing reference section? '21' on line 1313 looks like a reference

  -- Missing reference section? '22' on line 1315 looks like a reference

  -- Missing reference section? '23' on line 1317 looks like a reference


     Summary: 3 errors (**), 0 flaws (~~), 1 warning (==), 24 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	cellar                                                        J. Coalson
3	Internet-Draft
4	Intended status: Standards Track
5	Expires: December 7, 2017                            Xiph.Org Foundation
6	                                                            June 5, 2017

8	                       Free Lossless Audio Codec
9	                       draft-xiph-cellar-flac-00

11	Abstract

13	   This document defines FLAC, which stands for Free Lossless Audio
14	   Codec, a free, open source codec for lossless audio compression and
15	   decompression.

17	Status of This Memo

19	   This Internet-Draft is submitted in full conformance with the
20	   provisions of BCP 78 and BCP 79.

22	   Internet-Drafts are working documents of the Internet Engineering
23	   Task Force (IETF).  Note that other groups may also distribute
24	   working documents as Internet-Drafts.  The list of current Internet-
25	   Drafts is at http://datatracker.ietf.org/drafts/current/.

27	   Internet-Drafts are draft documents valid for a maximum of six months
28	   and may be updated, replaced, or obsoleted by other documents at any
29	   time.  It is inappropriate to use Internet-Drafts as reference
30	   material or to cite them other than as "work in progress."

32	   This Internet-Draft will expire on December 7, 2017.

34	Copyright Notice

36	   Copyright (c) 2017 IETF Trust and the persons identified as the
37	   document authors.  All rights reserved.

39	   This document is subject to BCP 78 and the IETF Trust's Legal
40	   Provisions Relating to IETF Documents
41	   (http://trustee.ietf.org/license-info) in effect on the date of
42	   publication of this document.  Please review these documents
43	   carefully, as they describe your rights and restrictions with respect
44	   to this document.  Code Components extracted from this document must
45	   include Simplified BSD License text as described in Section 4.e of
46	   the Trust Legal Provisions and are provided without warranty as
47	   described in the Simplified BSD License.

49	Table of Contents

51	   1.  Introduction
52	   2.  Acknowledgments
53	   3.  Scope
54	   4.  Architecture
55	   5.  Definitions
56	   6.  Blocking
57	   7.  Interchannel Decorrelation
58	   8.  Prediction
59	   9.  Residual Coding
60	   10. Format
61	     10.1.  Conventions
62	     10.2.  STREAM
63	     10.3.  METADATA_BLOCK
64	     10.4.  METADATA_BLOCK_HEADER
65	     10.5.  BLOCK_TYPE
66	     10.6.  METADATA_BLOCK_DATA
67	     10.7.  METADATA_BLOCK_STREAMINFO
68	     10.8.  METADATA_BLOCK_PADDING
69	     10.9.  METADATA_BLOCK_APPLICATION
70	     10.10. METADATA_BLOCK_SEEKTABLE
71	     10.11. SEEKPOINT
72	     10.12. METADATA_BLOCK_VORBIS_COMMENT
73	     10.13. METADATA_BLOCK_CUESHEET
74	     10.14. CUESHEET_TRACK
75	     10.15. CUESHEET_TRACK_INDEX
76	     10.16. METADATA_BLOCK_PICTURE
77	     10.17. PICTURE_TYPE
78	     10.18. FRAME
79	     10.19. FRAME_HEADER
80	       10.19.1.  FRAME HEADER RESERVED
81	       10.19.2.  BLOCKING STRATEGY
82	       10.19.3.  INTERCHANNEL SAMPLE BLOCK SIZE
83	       10.19.4.  SAMPLE RATE
84	       10.19.5.  CHANNEL ASSIGNMENT
85	       10.19.6.  SAMPLE SIZE
86	       10.19.7.  FRAME HEADER RESERVED2
87	       10.19.8.  CODED NUMBER
88	       10.19.9.  BLOCK SIZE INT
89	       10.19.10. SAMPLE RATE INT
90	       10.19.11. FRAME CRC
91	     10.20. FRAME_FOOTER
92	     10.21. SUBFRAME
93	     10.22. SUBFRAME_HEADER
94	       10.22.1.  SUBFRAME TYPE
95	       10.22.2.  WASTED BITS PER SAMPLE FLAG
96	     10.23. SUBFRAME_CONSTANT
97	     10.24. SUBFRAME_FIXED
98	     10.25. SUBFRAME_LPC
99	     10.26. SUBFRAME_VERBATIM
100	     10.27. RESIDUAL
101	       10.27.1.  RESIDUAL_CODING_METHOD
102	       10.27.2.  RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB
103	       10.27.3.  RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB2
104	       10.27.4.  ENCODED RESIDUAL
105	     11.1.  URIs
106	   Authors' Addresses

108	1.  Introduction

110	   This is a detailed description of the FLAC format.  There is also a
111	   companion document that describes FLAC-to-Ogg mapping [1].

113	   For a user-oriented overview, see About the FLAC Format [2].

115	2.  Acknowledgments

117	   FLAC owes much to the many people who have advanced the audio
118	   compression field so freely.  For instance: - A. J. Robinson [3]
119	   for his work on Shorten [4]; his paper is a good starting point on
120	   some of the basic methods used by FLAC.  FLAC trivially extends and
121	   improves the fixed predictors, LPC coefficient quantization, and
122	   Exponential-Golomb coding used in Shorten.  - S. W. Golomb [5] and
123	   Robert F. Rice; their universal codes are used by FLAC's entropy
124	   coder.  - N. Levinson and J. Durbin; the reference encoder uses an
125	   algorithm developed and refined by them for determining the LPC
126	   coefficients from the autocorrelation coefficients.  - And of course,
127	   Claude Shannon [6].

129	3.  Scope

131	   It is a known fact that no algorithm can losslessly compress all
132	   possible input, so most compressors restrict themselves to a useful
133	   domain and try to work as well as possible within that domain.
134	   FLAC's domain is audio data.  Though it can losslessly *code* any
135	   input, only certain kinds of input will get smaller.  FLAC exploits
136	   the fact that audio data typically has a high degree of sample-to-
137	   sample correlation.

139	   Within the audio domain, there are many possible subdomains.  For
140	   example: low bitrate speech, high-bitrate multi-channel music, etc.
141	   FLAC itself does not target a specific subdomain but many of the
142	   default parameters of the reference encoder are tuned to CD-quality
143	   music data (i.e. 44.1 kHz, 2 channel, 16 bits per sample).  The
144	   effect of the encoding parameters on different kinds of audio data
145	   will be examined later.

147	4.  Architecture

149	   Similar to many audio coders, a FLAC encoder has the following
150	   stages:

152	   o  "Blocking" (see Section 6).  The input is broken up into many
153	      contiguous blocks.  With FLAC, the blocks may vary in size.  The
154	      optimal size of the block is usually affected by many factors,
155	      including the sample rate, spectral characteristics over time,
156	      etc.  Though FLAC allows the block size to vary within a stream,
157	      the reference encoder uses a fixed block size.

159	   o  "Interchannel Decorrelation" (see Section 7).  In the case of
160	      stereo streams, the encoder will create mid and side signals based
161	      on the average and difference (respectively) of the left and right
162	      channels.  The encoder will then pass the best form of the signal
163	      to the next stage.

165	   o  "Prediction" (see Section 8).  The block is passed through a
166	      prediction stage where the encoder tries to find a mathematical
167	      description (usually an approximate one) of the signal.  This
168	      description is typically much smaller than the raw signal itself.
169	      Since the methods of prediction are known to both the encoder and
170	      decoder, only the parameters of the predictor need be included in
171	      the compressed stream.  FLAC currently uses four different classes
172	      of predictors, but the format has reserved space for additional
173	      methods.  FLAC allows the class of predictor to change from block
174	      to block, or even within the channels of a block.

176	   o  "Residual Coding" (see Section 9).  If the predictor does not
177	      describe the signal exactly, the difference between the original
178	      signal and the predicted signal (called the error or residual
179	      signal) must be coded losslessly.  If the predictor is effective,
180	      the residual signal will require fewer bits per sample than the
181	      original signal.  FLAC currently uses only one method for encoding
182	      the residual, but the format has reserved space for additional
183	      methods.  FLAC allows the residual coding method to change from
184	      block to block, or even within the channels of a block.

186	   In addition, FLAC specifies a metadata system, which allows arbitrary
187	   information about the stream to be included at the beginning of the
188	   stream.

190	5.  Definitions

192	   Many terms like "block" and "frame" are used to mean different things
193	   in different encoding schemes.  For example, a frame in MP3
194	   corresponds to many samples across several channels, whereas an S/
195	   PDIF frame represents just one sample for each channel.  The
196	   definitions we use for FLAC follow.  Note that when we talk about
197	   blocks and subblocks we are referring to the raw unencoded audio data
198	   that is the input to the encoder, and when we talk about frames and
199	   subframes, we are referring to the FLAC-encoded data.

201	   o  *Block*: One or more audio samples that span several channels.

203	   o  *Subblock*: One or more audio samples within a channel.  So a
204	      block contains one subblock for each channel, and all subblocks
205	      contain the same number of samples.

207	   o  *Blocksize*: The number of samples in any of a block's subblocks.
208	      For example, a one second block sampled at 44.1 kHz has a
209	      blocksize of 44100, regardless of the number of channels.

211	   o  *Frame*: A frame header plus one or more subframes.

213	   o  *Subframe*: A subframe header plus one or more encoded samples
214	      from a given channel.  All subframes within a frame will contain
215	      the same number of samples.

217	   o  *Exponential-Golomb coding*: One of Robert Rice's universal coding
218	      schemes and the basis of FLAC's residual coder.  It compresses
219	      data by writing the number of bits to be read, minus 1, before
220	      writing the actual value.

222	   o  *LPC*: Linear predictive coding [7].

224	6.  Blocking

226	   The size used for blocking the audio data has a direct effect on the
227	   compression ratio.  If the block size is too small, the resulting
228	   large number of frames means that excess bits will be wasted on frame
229	   headers.  If the block size is too large, the characteristics of the
230	   signal may vary so much that the encoder will be unable to find a
231	   good predictor.  In order to simplify encoder/decoder design, FLAC
232	   imposes a minimum block size of 16 samples, and a maximum block size
233	   of 65535 samples.  This range covers the optimal size for all of the
234	   audio data FLAC supports.

236	   Currently the reference encoder uses a fixed block size optimized
237	   for the sample rate of the input.  Future versions may vary the block
238	   size depending on the characteristics of the signal.

240	   Blocked data is passed to the predictor stage one subblock (channel)
241	   at a time.  Each subblock is independently coded into a subframe, and
242	   the subframes are concatenated into a frame.  Because each channel is
243	   coded separately, one channel of a stereo frame may be encoded as a
244	   constant subframe and the other as an LPC subframe.

246	7.  Interchannel Decorrelation

248	   In stereo streams, there is often an exploitable amount of
249	   correlation between the left and right channels.  FLAC allows the
250	   frames of stereo streams to have different channel assignments, and
251	   an encoder may choose to use the best representation on a frame-by-
252	   frame basis.

254	   o  *Independent*. The left and right channels are coded
255	      independently.

257	   o  *Mid-side*. The left and right channels are transformed into mid
258	      and side channels.  The mid channel is the midpoint (average) of
259	      the left and right signals, and the side is the difference signal
260	      (left minus right).

262	   o  *Left-side*. The left channel and side channel are coded.

264	   o  *Right-side*. The right channel and side channel are coded.

266	   Surprisingly, the left-side and right-side forms can be the most
267	   efficient in many frames, even though the raw number of bits per
268	   sample needed for the original signal is slightly more than that
269	   needed for independent or mid-side coding.
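
   As an illustration, the mid-side transform described above can be
   sketched in Python.  This is a minimal sketch, not the reference
   encoder: the function names are hypothetical, and it assumes the
   integer convention in which the mid channel is the floor of the
   average and the bit dropped by that division is recovered from the
   parity of the side channel.

```python
def to_mid_side(left, right):
    """Transform left/right sample lists into (mid, side) lists."""
    mid = [(l + r) >> 1 for l, r in zip(left, right)]  # floor of the average
    side = [l - r for l, r in zip(left, right)]        # left minus right
    return mid, side


def from_mid_side(mid, side):
    """Invert to_mid_side exactly (assumed lossless convention)."""
    left, right = [], []
    for m, s in zip(mid, side):
        # l + r and l - r share the same parity, so the low bit of the
        # side sample restores the bit that ">> 1" discarded from the sum.
        total = (m << 1) | (s & 1)
        left.append((total + s) >> 1)
        right.append((total - s) >> 1)
    return left, right
```

   Because the sum and the difference of two integers always have the
   same parity, the round trip is exact even though the mid channel is
   stored with one bit fewer than the sum.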

271	8.  Prediction

273	   FLAC uses four methods for modeling the input signal:

275	   o  *Verbatim*. This is essentially a zero-order predictor of the
276	      signal.  The predicted signal is zero, meaning the residual is the
277	      signal itself, and the compression is zero.  This is the baseline
278	      against which the other predictors are measured.  If you feed
279	      random data to the encoder, the verbatim predictor will probably
280	      be used for every subblock.  Since the raw signal is not actually
281	      passed through the residual coding stage (it is added to the
282	      stream 'verbatim'), the encoding results will not be the same as a
283	      zero-order linear predictor.

285	   o  *Constant*. This predictor is used whenever the subblock is pure
286	      DC ("digital silence"), i.e. a constant value throughout.  The
287	      signal is run-length encoded and added to the stream.

289	   o  *Fixed linear predictor*. FLAC uses a class of computationally-
290	      efficient fixed linear predictors (for a good description, see
291	      audiopak [8] and shorten [9]).  FLAC adds a fourth-order predictor
292	      to the zero-to-third-order predictors used by Shorten.  Since the
293	      predictors are fixed, the predictor order is the only parameter
294	      that needs to be stored in the compressed stream.  The error
295	      signal is then passed to the residual coder.

297	   o  *FIR Linear prediction*. For more accurate modeling (at a cost of
298	      slower encoding), FLAC supports up to 32nd order FIR linear
299	      prediction (again, for information on linear prediction, see
300	      audiopak [10] and shorten [11]).  The reference encoder uses the
301	      Levinson-Durbin method for calculating the LPC coefficients from
302	      the autocorrelation coefficients, and the coefficients are
303	      quantized before computing the residual.  Whereas encoders such as
304	      Shorten used a fixed quantization for the entire input, FLAC
305	      allows the quantized coefficient precision to vary from subframe
306	      to subframe.  The FLAC reference encoder estimates the optimal
307	      precision to use based on the block size and dynamic range of the
308	      original signal.
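
   The fixed linear predictors of orders 0 through 4 amount to repeated
   differencing of the signal.  The sketch below is illustrative only:
   the function names and the coefficient table layout are hypothetical,
   and the first "order" samples are assumed to be stored unencoded as
   warm-up samples, so they produce no residual.

```python
# Coefficients applied to x[n-1], x[n-2], ... for each fixed predictor order.
FIXED_COEFFS = {
    0: [],
    1: [1],
    2: [2, -1],
    3: [3, -3, 1],
    4: [4, -6, 4, -1],
}


def fixed_residual(samples, order):
    """Return the residual of a subblock under a fixed predictor."""
    coeffs = FIXED_COEFFS[order]
    residual = []
    for n in range(order, len(samples)):
        prediction = sum(c * samples[n - 1 - i] for i, c in enumerate(coeffs))
        residual.append(samples[n] - prediction)
    return residual


def fixed_restore(warmup, residual, order):
    """Rebuild the samples from `order` warm-up samples and the residual."""
    coeffs = FIXED_COEFFS[order]
    samples = list(warmup)
    for r in residual:
        n = len(samples)
        prediction = sum(c * samples[n - 1 - i] for i, c in enumerate(coeffs))
        samples.append(r + prediction)
    return samples
```

   Only the order has to be signalled to the decoder, since the
   coefficient table is the same on both sides.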

310	9.  Residual Coding

312	   FLAC uses Exponential-Golomb (a variant of Rice) coding as its
313	   residual encoder.  You can learn more about exp-golomb coding on
314	   Wikipedia [12].

316	   FLAC currently defines two similar methods for the coding of the
317	   error signal from the prediction stage.  The error signal is coded
318	   using Exponential-Golomb codes in one of two ways:

320	   1.  the encoder estimates a single exp-golomb parameter based on the
321	       variance of the residual and exp-golomb codes the entire residual
322	       using this parameter;

324	   2.  the residual is partitioned into several equal-length regions of
325	       contiguous samples, and each region is coded with its own exp-
326	       golomb parameter based on the region's mean.  (Note that the
327	       first method is a special case of the second method with one
328	       partition, except the exp-golomb parameter is based on the
329	       residual variance instead of the mean.)

331	   The FLAC format has reserved space for other coding methods.  Some
332	   possibilities for volunteers would be to explore better context-
333	   modeling of the exp-golomb parameter, or Huffman coding.  See LOCO-I
334	   [13] and pucrunch [14] for descriptions of several universal codes.
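
   The partitioned method can be illustrated with a short sketch.  It
   assumes that the number of partitions is a power of two (2 raised to
   a "partition order"), that the residual length divides evenly, and
   that the parameter is estimated as roughly the base-2 logarithm of
   the partition's mean absolute value.  That estimate is a common
   heuristic chosen for illustration and is not necessarily what the
   reference encoder does.  All names are hypothetical.

```python
def partition_parameters(residual, partition_order):
    """Split the residual into 2**partition_order equal regions and
    estimate one coding parameter per region from the region's mean."""
    count = 1 << partition_order
    if len(residual) == 0 or len(residual) % count != 0:
        raise ValueError("residual length must divide evenly (sketch only)")
    size = len(residual) // count
    params = []
    for p in range(count):
        region = residual[p * size:(p + 1) * size]
        mean = sum(abs(r) for r in region) // size  # integer mean magnitude
        # Heuristic (assumption): parameter ~ floor(log2(mean)), 0 if mean < 1.
        params.append(mean.bit_length() - 1 if mean > 0 else 0)
    return params
```

   With a partition order of 0 the whole residual forms one region, which
   corresponds to the single-parameter case apart from how the parameter
   is estimated.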

336	10.  Format

338	   This section specifies the FLAC bitstream format.  FLAC has no format
339	   version information, but it does contain reserved space in several
340	   places.  Future versions of the format may use this reserved space
341	   safely without breaking the format of older streams.  Older decoders
342	   may choose to abort decoding or skip data encoded with newer methods.
343	   Apart from reserved patterns, in places the format specifies invalid
344	   patterns, meaning that the patterns may never appear in any valid
345	   bitstream, in any prior, present, or future versions of the format.
346	   These invalid patterns are usually used to make the synchronization
347	   mechanism more robust.

349	   All numbers used in a FLAC bitstream are integers; there are no
350	   floating-point representations.  All numbers are big-endian coded.
351	   All numbers are unsigned unless otherwise specified.

353	   Before the formal description of the stream, an overview might be
354	   helpful.

356	   o  A FLAC bitstream consists of the "fLaC" marker at the beginning of
357	      the stream, followed by a mandatory metadata block (called the
358	      STREAMINFO block), any number of other metadata blocks, then the
359	      audio frames.

361	   o  FLAC supports up to 128 kinds of metadata blocks; currently the
362	      following are defined:

364	      *  "STREAMINFO": This block has information about the whole
365	         stream, like sample rate, number of channels, total number of
366	         samples, etc.  It must be present as the first metadata block
367	         in the stream.  Other metadata blocks may follow; the decoder
368	         will skip any that it does not understand.

370	      *  "APPLICATION": This block is for use by third-party
371	         applications.  The only mandatory field is a 32-bit identifier.
372	         This ID is granted upon request to an application by the FLAC
373	         maintainers.  The remainder of the block is defined by the
374	         registered application.  Visit the registration page [15] if
375	         you would like to register an ID for your application with
376	         FLAC.

378	      *  "PADDING": This block allows for an arbitrary amount of
379	         padding.  The contents of a PADDING block have no meaning.
380	         This block is useful when it is known that metadata will be
381	         edited after encoding; the user can instruct the encoder to
382	         reserve a PADDING block of sufficient size so that when
383	         metadata is added, it will simply overwrite the padding (which
384	         is relatively quick) instead of having to insert it into the
385	         right place in the existing file (which would normally require
386	         rewriting the entire file).

388	      *  "SEEKTABLE": This is an optional block for storing seek points.
389	         It is possible to seek to any given sample in a FLAC stream
390	         without a seek table, but the delay can be unpredictable since
391	         the bitrate may vary widely within a stream.  By adding seek
392	         points to a stream, this delay can be significantly reduced.
393	         Each seek point takes 18 bytes, so 1% resolution within a
394	         stream adds less than 2K.  There can be only one SEEKTABLE in a
395	         stream, but the table can have any number of seek points.
396	         There is also a special 'placeholder' seekpoint which will be
397	         ignored by decoders but which can be used to reserve space for
398	         future seek point insertion.

400	      *  "VORBIS_COMMENT": This block is for storing a list of human-
401	         readable name/value pairs.  Values are encoded using UTF-8.  It
402	         is an implementation of the Vorbis comment specification [16]
403	         (without the framing bit).  This is the only officially
404	         supported tagging mechanism in FLAC.  There may be only one
405	         VORBIS_COMMENT block in a stream.  In some external
406	         documentation, Vorbis comments are called FLAC tags to lessen
407	         confusion.

409	      *  "CUESHEET": This block is for storing various information that
410	         can be used in a cue sheet.  It supports track and index
411	         points, compatible with Red Book CD digital audio discs, as
412	         well as other CD-DA metadata such as media catalog number and
413	         track ISRCs.  The CUESHEET block is especially useful for
414	         backing up CD-DA discs, but it can be used as a general purpose
415	         cueing mechanism for playback.

417	      *  "PICTURE": This block is for storing pictures associated with
418	         the file, most commonly cover art from CDs.  There may be more
419	         than one PICTURE block in a file.  The picture format is
420	         similar to the APIC frame in ID3v2 [17].  The PICTURE block has
421	         a type, MIME type, and UTF-8 description like ID3v2, and
422	         supports external linking via URL (though this is discouraged).
423	         The differences are that there is no uniqueness constraint on
424	         the description field, and the MIME type is mandatory.  The
425	         FLAC PICTURE block also includes the resolution, color depth,
426	         and palette size so that the client can search for a suitable
427	         picture without having to scan them all.

429	   o  The audio data is composed of one or more audio frames.  Each
430	      frame consists of a frame header, which contains a sync code,
431	      information about the frame like the block size, sample rate,
432	      number of channels, et cetera, and an 8-bit CRC.  The frame header
433	      also contains either the sample number of the first sample in the
434	      frame (for variable-blocksize streams), or the frame number (for
435	      fixed-blocksize streams).  This allows for fast, sample-accurate
436	      seeking to be performed.  Following the frame header are encoded
437	      subframes, one for each channel, and finally, the frame is zero-
438	      padded to a byte boundary.  Each subframe has its own header that
439	      specifies how the subframe is encoded.

441	   o  Since a decoder may start decoding in the middle of a stream,
442	      there must be a method to determine the start of a frame.  A
443	      14-bit sync code begins each frame.  The sync code will not appear
444	      anywhere else in the frame header.  However, since it may appear
445	      in the subframes, the decoder has two other ways of ensuring a
446	      correct sync.  The first is to check that the rest of the frame
447	      header contains no invalid data.  Even this is not foolproof since
448	      valid header patterns can still occur within the subframes.  The
449	      decoder's final check is to generate an 8-bit CRC of the frame
450	      header and compare this to the CRC stored at the end of the frame
451	      header.

453	   o  Again, since a decoder may start decoding at an arbitrary frame in
454	      the stream, each frame header must contain some basic information
455	      about the stream because the decoder may not have access to the
456	      STREAMINFO metadata block at the start of the stream.  This
457	      information includes sample rate, bits per sample, number of
458	      channels, etc.  Since the frame header is pure overhead, it has a
459	      direct effect on the compression ratio.  To keep the frame header
460	      as small as possible, FLAC uses lookup tables for the most
461	      commonly used values for frame parameters.  For instance, the
462	      sample rate part of the frame header is specified using 4 bits.
463	      Eight of the bit patterns correspond to the commonly used sample
464	      rates of 8/16/22.05/24/32/44.1/48/96 kHz.  However, odd sample
465	      rates can be specified by using one of the 'hint' bit patterns,
466	      directing the decoder to find the exact sample rate at the end of
467	      the frame header.  The same method is used for specifying the
468	      block size and bits per sample.  In this way, the frame header
469	      size stays small for all of the most common forms of audio data.

471	   o  Individual subframes (one for each channel) are coded separately
472	      within a frame, and appear serially in the stream.  In other
473	      words, the encoded audio data is NOT channel-interleaved.  This
474	      reduces decoder complexity at the cost of requiring larger decode
475	      buffers.  Each subframe has its own header specifying the
476	      attributes of the subframe, like prediction method and order,
477	      residual coding parameters, etc.  The header is followed by the
478	      encoded audio data for that channel.

480	   o  "FLAC" specifies a subset of itself as the Subset format.  The
481	      purpose of this is to ensure that any streams encoded according to
482	      the Subset are truly "streamable", meaning that a decoder that
483	      cannot seek within the stream can still pick up in the middle of
484	      the stream and start decoding.  It also makes hardware decoder
485	      implementations more practical by limiting the encoding parameters
486	      such that decoder buffer sizes and other resource requirements can
487	      be easily determined. *flac* generates Subset streams by default
488	      unless the "--lax" command-line option is used.  The Subset places
489	      the following limits on what may be used in the stream:

491	      *  The blocksize bits in the "FRAME_HEADER" (see Section 10.19)
492	         must be 0001-1110.  The blocksize must be <= 16384; if the
493	         sample rate is <= 48000 Hz, the blocksize must be <= 4608.

495	      *  The sample rate bits in the "FRAME_HEADER" must be 0001-1110.

497	      *  The bits-per-sample bits in the "FRAME_HEADER" must be 001-111.

499	      *  If the sample rate is <= 48000 Hz, the filter order in "LPC
500	         subframes" (see Section 10.25) must be less than or equal to
501	         12, i.e. the subframe type bits in the "SUBFRAME_HEADER" (see
502	         Section 10.22) may not be 101100-111111.

504	      *  The Rice partition order in an "exp-golomb coded residual
505	         section" (see Section 10.27.2) must be less than or equal to 8.
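
As a non-normative illustration, the numeric Subset limits listed above can be sketched as a single predicate. The function name and its parameters are hypothetical, and the sketch deliberately ignores the restrictions on the header bit patterns (blocksize, sample rate, and bits-per-sample codes), which would need the raw "FRAME_HEADER" fields.

```python
def within_subset(blocksize: int, sample_rate: int,
                  lpc_order: int, rice_partition_order: int) -> bool:
    """Check the numeric Subset limits for one frame's parameters.

    Hypothetical helper: only the limits on blocksize, LPC order and
    Rice partition order are checked, not the header bit patterns.
    """
    if blocksize > 16384:
        return False
    # The tighter limits apply only at sample rates up to 48000 Hz.
    if sample_rate <= 48000 and (blocksize > 4608 or lpc_order > 12):
        return False
    return rice_partition_order <= 8
```

For example, a 4096-sample block at 44100 Hz with LPC order 8 passes, while an 8192-sample block at the same rate does not.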

507	10.1.  Conventions

509	   The following tables constitute a formal description of the FLAC
510	   format.  Values expressed as "u(n)" represent an unsigned big-endian
511	   integer using "n" bits. "n" may be expressed as an equation using "*"
512	   (multiplication), "/" (division), "+" (addition), or "-"
513	   (subtraction).  An inclusive range of the number of bits expressed
514	   may be represented with an ellipsis, such as "u(m...n)".  The name of
515	   a value followed by an asterisk "*" indicates zero or more
516	   occurrences of the value.  The name of a value followed by a plus
517	   sign "+" indicates one or more occurrences of the value.

519	10.2.  STREAM
520	   +-----------------------------+-------------------------------------+
521	   | Data                        | Description                         |
522	   +-----------------------------+-------------------------------------+
523	   | "u(32)"                     | "fLaC", the FLAC stream marker in   |
524	   |                             | ASCII, meaning byte 0 of the stream |
525	   |                             | is 0x66, followed by 0x4C 0x61 0x43 |
526	   | "METADATA_BLOCK_STREAMINFO" | This is the mandatory STREAMINFO    |
527	   |                             | metadata block that has the basic   |
528	   |                             | properties of the stream.           |
529	   | "METADATA_BLOCK"*           | Zero or more metadata blocks        |
530	   | "FRAME"+                    | One or more audio frames            |
531	   +-----------------------------+-------------------------------------+

533	10.3.  METADATA_BLOCK

535	   +-------------------------+-----------------------------------------+
536	   | Data                    | Description                             |
537	   +-------------------------+-----------------------------------------+
538	   | "METADATA_BLOCK_HEADER" | A block header that specifies the type  |
539	   |                         | and size of the metadata block data.    |
540	   | "METADATA_BLOCK_DATA"   |                                         |
541	   +-------------------------+-----------------------------------------+

543	10.4.  METADATA_BLOCK_HEADER

545	   +---------+---------------------------------------------------------+
546	   | Data    | Description                                             |
547	   +---------+---------------------------------------------------------+
548	   | "u(1)"  | Last-metadata-block flag: '1' if this block is the last |
549	   |         | metadata block before the audio blocks, '0' otherwise.  |
550	   | "u(7)"  | "BLOCK_TYPE"                                            |
551	   | "u(24)" | Length (in bytes) of metadata to follow (does not       |
552	   |         | include the size of the "METADATA_BLOCK_HEADER")        |
553	   +---------+---------------------------------------------------------+
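
A minimal sketch of how the STREAM layout and the 4-byte "METADATA_BLOCK_HEADER" fit together: it checks the "fLaC" marker, then steps from header to header until the last-metadata-block flag is set. The function name and return shape are assumptions made for illustration, and the block data itself is skipped rather than interpreted.

```python
def walk_metadata_blocks(data: bytes):
    """Return ([(block_type, length), ...], offset_of_first_frame)."""
    if data[:4] != b"fLaC":
        raise ValueError("missing fLaC stream marker")
    pos = 4
    blocks = []
    while True:
        header = data[pos:pos + 4]
        if len(header) < 4:
            raise ValueError("truncated METADATA_BLOCK_HEADER")
        is_last = bool(header[0] & 0x80)             # u(1) last-block flag
        block_type = header[0] & 0x7F                # u(7) BLOCK_TYPE
        length = int.from_bytes(header[1:4], "big")  # u(24) data length
        blocks.append((block_type, length))
        pos += 4 + length                            # skip header and data
        if is_last:
            return blocks, pos
```

The returned offset is where the first "FRAME" is expected to begin.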

555	10.5.  BLOCK_TYPE
556	     +---------+----------------------------------------------------+
557	     | Value   | Description                                        |
558	     +---------+----------------------------------------------------+
559	     | 0       | STREAMINFO                                         |
560	     | 1       | PADDING                                            |
561	     | 2       | APPLICATION                                        |
562	     | 3       | SEEKTABLE                                          |
563	     | 4       | VORBIS_COMMENT                                     |
564	     | 5       | CUESHEET                                           |
565	     | 6       | PICTURE                                            |
566	     | 7 - 126 | reserved                                           |
567	     | 127     | invalid, to avoid confusion with a frame sync code |
568	     +---------+----------------------------------------------------+

570	10.6.  METADATA_BLOCK_DATA

572	   +-------------------------------------------------+-----------------+
573	   | Data                                            | Description     |
574	   +-------------------------------------------------+-----------------+
575	   | "METADATA_BLOCK_STREAMINFO" ||                  | The block data  |
576	   | "METADATA_BLOCK_PADDING" ||                     | must match the  |
577	   | "METADATA_BLOCK_APPLICATION" ||                 | block type in   |
578	   | "METADATA_BLOCK_SEEKTABLE" ||                   | the block       |
579	   | "METADATA_BLOCK_VORBIS_COMMENT" ||              | header.         |
580	   | "METADATA_BLOCK_CUESHEET" ||                    |                 |
581	   | "METADATA_BLOCK_PICTURE"                        |                 |
582	   +-------------------------------------------------+-----------------+

584	10.7.  METADATA_BLOCK_STREAMINFO
585	   +----------+--------------------------------------------------------+
586	   | Data     | Description                                            |
587	   +----------+--------------------------------------------------------+
588	   | "u(16)"  | The minimum block size (in samples) used in the        |
589	   |          | stream.                                                |
590	   | "u(16)"  | The maximum block size (in samples) used in the        |
591	   |          | stream. (Minimum blocksize == maximum blocksize)       |
592	   |          | implies a fixed-blocksize stream.                      |
593	   | "u(24)"  | The minimum frame size (in bytes) used in the stream.  |
594	   |          | May be 0 to imply the value is not known.              |
595	   | "u(24)"  | The maximum frame size (in bytes) used in the stream.  |
596	   |          | May be 0 to imply the value is not known.              |
597	   | "u(20)"  | Sample rate in Hz. Though 20 bits are available, the   |
598	   |          | maximum sample rate is limited by the structure of     |
599	   |          | frame headers to 655350 Hz. Also, a value of 0 is      |
600	   |          | invalid.                                               |
601	   | "u(3)"   | (number of channels)-1. FLAC supports from 1 to 8      |
602	   |          | channels                                               |
603	   | "u(5)"   | (bits per sample)-1. FLAC supports from 4 to 32 bits   |
604	   |          | per sample. Currently the reference encoder and        |
605	   |          | decoders only support up to 24 bits per sample.        |
606	   | "u(36)"  | Total samples in stream. 'Samples' means inter-channel |
607	   |          | sample, i.e. one second of 44.1 kHz audio will have    |
608	   |          | 44100 samples regardless of the number of channels. A  |
609	   |          | value of zero here means the number of total samples   |
610	   |          | is unknown.                                            |
611	   | "u(128)" | MD5 signature of the unencoded audio data. This allows |
612	   |          | the decoder to determine if an error exists in the     |
613	   |          | audio data even when the error does not result in an   |
614	   |          | invalid bitstream.                                     |
615	   +----------+--------------------------------------------------------+

617	   NOTE

619	   o  FLAC specifies a minimum block size of 16 and a maximum block size
620	      of 65535, meaning the bit patterns corresponding to the numbers
621	      0-15 in the minimum blocksize and maximum blocksize fields are
622	      invalid.
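
The field widths in the STREAMINFO table add up to 272 bits, i.e. 34 bytes. The following sketch, with a hypothetical function name and dictionary keys, unpacks those fields and applies the validity rules stated above (block sizes below 16 and a sample rate of 0 are rejected).

```python
def parse_streaminfo(block: bytes) -> dict:
    """Unpack the 34 bytes of METADATA_BLOCK_STREAMINFO data."""
    if len(block) != 34:
        raise ValueError("STREAMINFO block data is 34 bytes")
    min_blocksize = int.from_bytes(block[0:2], "big")
    max_blocksize = int.from_bytes(block[2:4], "big")
    if min_blocksize < 16 or max_blocksize < 16:
        raise ValueError("block sizes 0-15 are invalid")
    # Bytes 10..17 pack u(20) + u(3) + u(5) + u(36) into 64 bits.
    packed = int.from_bytes(block[10:18], "big")
    sample_rate = packed >> 44
    if sample_rate == 0:
        raise ValueError("a sample rate of 0 is invalid")
    return {
        "min_blocksize": min_blocksize,
        "max_blocksize": max_blocksize,
        "min_framesize": int.from_bytes(block[4:7], "big"),   # 0 = unknown
        "max_framesize": int.from_bytes(block[7:10], "big"),  # 0 = unknown
        "sample_rate": sample_rate,
        "channels": ((packed >> 41) & 0x7) + 1,
        "bits_per_sample": ((packed >> 36) & 0x1F) + 1,
        "total_samples": packed & ((1 << 36) - 1),            # 0 = unknown
        "md5": block[18:34],
    }
```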

624	10.8.  METADATA_BLOCK_PADDING

626	            +--------+----------------------------------------+
627	            | Data   | Description                            |
628	            +--------+----------------------------------------+
629	            | "u(n)" | n '0' bits (n must be a multiple of 8) |
630	            +--------+----------------------------------------+

632	10.9.  METADATA_BLOCK_APPLICATION

634	   +---------+---------------------------------------------------------+
635	   | Data    | Description                                             |
636	   +---------+---------------------------------------------------------+
637	   | "u(32)" | Registered application ID. (Visit the registration page |
638	   |         | [18] to register an ID with FLAC.)                      |
639	   | "u(n)"  | Application data (n must be a multiple of 8)            |
640	   +---------+---------------------------------------------------------+

642	10.10.  METADATA_BLOCK_SEEKTABLE

644	                +--------------+--------------------------+
645	                | Data         | Description              |
646	                +--------------+--------------------------+
647	                | "SEEKPOINT"+ | One or more seek points. |
648	                +--------------+--------------------------+

650	   NOTE - The number of seek points is implied by the metadata header
651	   'length' field, i.e. equal to length / 18.

653	10.11.  SEEKPOINT

655	   +---------+---------------------------------------------------------+
656	   | Data    | Description                                             |
657	   +---------+---------------------------------------------------------+
658	   | "u(64)" | Sample number of first sample in the target frame, or   |
659	   |         | "0xFFFFFFFFFFFFFFFF" for a placeholder point.           |
660	   | "u(64)" | Offset (in bytes) from the first byte of the first      |
661	   |         | frame header to the first byte of the target frame's    |
662	   |         | header.                                                 |
663	   | "u(16)" | Number of samples in the target frame.                  |
664	   +---------+---------------------------------------------------------+

666	   NOTES

668	   o  For placeholder points, the second and third field values are
669	      undefined.

671	   o  Seek points within a table must be sorted in ascending order by
672	      sample number.

674	   o  Seek points within a table must be unique by sample number, with
675	      the exception of placeholder points.

677	   o  The previous two notes imply that there may be any number of
678	      placeholder points, but they must all occur at the end of the
679	      table.
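
Because each "SEEKPOINT" occupies 64 + 64 + 16 bits, i.e. 18 bytes, a seek table can be split by block length alone. The sketch below, using hypothetical helper names, decodes the points and checks the ordering rules from the notes: real points strictly ascending by sample number, placeholder points only at the end.

```python
import struct

PLACEHOLDER = 0xFFFFFFFFFFFFFFFF


def parse_seektable(block: bytes):
    """Return a list of (sample_number, byte_offset, frame_samples)."""
    if len(block) % 18:
        raise ValueError("SEEKTABLE length must be a multiple of 18")
    return [struct.unpack(">QQH", block[i:i + 18])
            for i in range(0, len(block), 18)]


def seekpoints_are_ordered(points) -> bool:
    """Check sorting, uniqueness and placeholder placement."""
    previous = None
    seen_placeholder = False
    for sample_number, _offset, _count in points:
        if sample_number == PLACEHOLDER:
            seen_placeholder = True
            continue
        if seen_placeholder:
            return False        # real point after a placeholder
        if previous is not None and sample_number <= previous:
            return False        # not strictly ascending
        previous = sample_number
    return True
```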

681	10.12.  METADATA_BLOCK_VORBIS_COMMENT

683	   +--------+----------------------------------------------------------+
684	   | Data   | Description                                              |
685	   +--------+----------------------------------------------------------+
686	   | "u(n)" | Also known as FLAC tags, the contents of a vorbis        |
687	   |        | comment packet as specified here [19] (without the       |
688	   |        | framing bit). Note that the vorbis comment spec allows   |
689	   |        | for on the order of 2 ^ 64 bytes of data whereas the     |
690	   |        | FLAC metadata block is limited to 2 ^ 24 bytes. Given    |
691	   |        | the stated purpose of vorbis comments, i.e. human-       |
692	   |        | readable textual information, this limit is unlikely to  |
693	   |        | be restrictive. Also note that the 32-bit field lengths  |
694	   |        | are little-endian coded according to the vorbis spec, as |
695	   |        | opposed to the usual big-endian coding of fixed-length   |
696	   |        | integers in the rest of FLAC.                            |
697	   +--------+----------------------------------------------------------+
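
To illustrate the little-endian exception noted above, here is a minimal sketch of reading a VORBIS_COMMENT block. The layout it assumes (a length-prefixed vendor string, a 32-bit comment count, then length-prefixed UTF-8 comments) comes from the Vorbis comment specification rather than from this document, and the function name is hypothetical.

```python
def parse_vorbis_comment(block: bytes):
    """Return (vendor_string, [comment, ...]) from VORBIS_COMMENT data."""
    pos = 0

    def read_string() -> str:
        nonlocal pos
        # Lengths are 32-bit little-endian, unlike the rest of FLAC.
        length = int.from_bytes(block[pos:pos + 4], "little")
        pos += 4
        text = block[pos:pos + length].decode("utf-8")
        pos += length
        return text

    vendor = read_string()
    count = int.from_bytes(block[pos:pos + 4], "little")
    pos += 4
    comments = [read_string() for _ in range(count)]
    return vendor, comments
```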

699	10.13.  METADATA_BLOCK_CUESHEET
700	   +-------------------+-----------------------------------------------+
701	   | Data              | Description                                   |
702	   +-------------------+-----------------------------------------------+
703	   | "u(128*8)"        | Media catalog number, in ASCII printable      |
704	   |                   | characters 0x20-0x7e. In general, the media   |
705	   |                   | catalog number may be 0 to 128 bytes long;    |
706	   |                   | any unused characters should be right-padded  |
707	   |                   | with NUL characters. For CD-DA, this is a     |
708	   |                   | thirteen digit number, followed by 115 NUL    |
709	   |                   | bytes.                                        |
710	   | "u(64)"           | The number of lead-in samples. This field has |
711	   |                   | meaning only for CD-DA cuesheets; for other   |
712	   |                   | uses it should be 0. For CD-DA, the lead-in   |
713	   |                   | is the TRACK 00 area where the table of       |
714	   |                   | contents is stored; more precisely, it is the |
715	   |                   | number of samples from the first sample of    |
716	   |                   | the media to the first sample of the first    |
717	   |                   | index point of the first track. According to  |
718	   |                   | the Red Book, the lead-in must be silence and |
719	   |                   | CD grabbing software does not usually store   |
720	   |                   | it; additionally, the lead-in must be at      |
721	   |                   | least two seconds but may be longer. For      |
722	   |                   | these reasons the lead-in length is stored    |
723	   |                   | here so that the absolute position of the     |
724	   |                   | first track can be computed. Note that the    |
725	   |                   | lead-in stored here is the number of samples  |
726	   |                   | up to the first index point of the first      |
727	   |                   | track, not necessarily to INDEX 01 of the     |
728	   |                   | first track; even the first track may have    |
729	   |                   | INDEX 00 data.                                |
730	   | "u(1)"            | "1" if the CUESHEET corresponds to a Compact  |
731	   |                   | Disc, else "0".                               |
732	   | "u(7+258*8)"      | Reserved. All bits must be set to zero.       |
733	   | "u(8)"            | The number of tracks. Must be at least 1      |
734	   |                   | (because of the requisite lead-out track).    |
735	   |                   | For CD-DA, this number must be no more than   |
736	   |                   | 100 (99 regular tracks and one lead-out       |
737	   |                   | track).                                       |
738	   | "CUESHEET_TRACK"+ | One or more tracks. A CUESHEET block is       |
739	   |                   | required to have a lead-out track; it is      |
740	   |                   | always the last track in the CUESHEET. For    |
741	   |                   | CD-DA, the lead-out track number must be 170  |
742	   |                   | as specified by the Red Book, otherwise it    |
743	   |                   | must be 255.                                  |
744	   +-------------------+-----------------------------------------------+

746	10.14.  CUESHEET_TRACK
747	   +-------------------------+-----------------------------------------+
748	   | Data                    | Description                             |
749	   +-------------------------+-----------------------------------------+
750	   | "u(64)"                 | Track offset in samples, relative to    |
751	   |                         | the beginning of the FLAC audio stream. |
752	   |                         | It is the offset to the first index     |
753	   |                         | point of the track. (Note how this      |
754	   |                         | differs from CD-DA, where the track's   |
755	   |                         | offset in the TOC is that of the        |
756	   |                         | track's INDEX 01 even if there is an    |
757	   |                         | INDEX 00.) For CD-DA, the offset must   |
758	   |                         | be evenly divisible by 588 samples (588 |
759	   |                         | samples = 44100 samples/sec * 1/75th of |
760	   |                         | a sec).                                 |
761	   | "u(8)"                  | Track number. A track number of 0 is    |
762	   |                         | not allowed to avoid conflicting with   |
763	   |                         | the CD-DA spec, which reserves this for |
764	   |                         | the lead-in. For CD-DA the number must  |
765	   |                         | be 1-99, or 170 for the lead-out; for   |
766	   |                         | non-CD-DA, the track number must be     |
767	   |                         | 255 for the lead-out. It is not         |
768	   |                         | required but encouraged to start with   |
769	   |                         | track 1 and increase sequentially.      |
770	   |                         | Track numbers must be unique within a   |
771	   |                         | CUESHEET.                               |
772	   | "u(12*8)"               | Track ISRC. This is a 12-digit          |
773	   |                         | alphanumeric code; see here [20] and    |
774	   |                         | here [21]. A value of 12 ASCII NUL      |
775	   |                         | characters may be used to denote        |
776	   |                         | absence of an ISRC.                     |
777	   | "u(1)"                  | The track type: 0 for audio, 1 for non- |
778	   |                         | audio. This corresponds to the CD-DA    |
779	   |                         | Q-channel control bit 3.                |
780	   | "u(1)"                  | The pre-emphasis flag: 0 for no pre-    |
781	   |                         | emphasis, 1 for pre-emphasis. This      |
782	   |                         | corresponds to the CD-DA Q-channel      |
783	   |                         | control bit 5; see here [22].           |
784	   | "u(6+13*8)"             | Reserved. All bits must be set to zero. |
785	   | "u(8)"                  | The number of track index points. There |
786	   |                         | must be at least one index in every     |
787	   |                         | track in a CUESHEET except for the      |
788	   |                         | lead-out track, which must have zero.   |
789	   |                         | For CD-DA, this number may be no more   |
790	   |                         | than 100.                               |
791	   | "CUESHEET_TRACK_INDEX"+ | For all tracks except the lead-out      |
792	   |                         | track, one or more track index points.  |
793	   +-------------------------+-----------------------------------------+

795	10.15.  CUESHEET_TRACK_INDEX

797	   +----------+--------------------------------------------------------+
798	   | Data     | Description                                            |
799	   +----------+--------------------------------------------------------+
800	   | "u(64)"  | Offset in samples, relative to the track offset, of    |
801	   |          | the index point. For CD-DA, the offset must be evenly  |
802	   |          | divisible by 588 samples (588 samples = 44100          |
803	   |          | samples/sec * 1/75 sec). Note that the offset is from  |
804	   |          | the beginning of the track, not the beginning of the   |
805	   |          | audio data.                                            |
806	   | "u(8)"   | The index point number. For CD-DA, an index number of  |
807	   |          | 0 corresponds to the track pre-gap. The first index in |
808	   |          | a track must have a number of 0 or 1, and              |
809	   |          | subsequently, index numbers must increase by 1. Index  |
810	   |          | numbers must be unique within a track.                 |
811	   | "u(3*8)" | Reserved. All bits must be set to zero.                |
812	   +----------+--------------------------------------------------------+

814	10.16.  METADATA_BLOCK_PICTURE

816	   +----------+--------------------------------------------------------+
817	   | Data     | Description                                            |
818	   +----------+--------------------------------------------------------+
819	   | "u(32)"  | The PICTURE_TYPE according to the ID3v2 APIC frame.    |
820	   | "u(32)"  | The length of the MIME type string in bytes.           |
821	   | "u(n*8)" | The MIME type string, in printable ASCII characters    |
822	   |          | 0x20-0x7e. The MIME type may also be "-->" to signify  |
823	   |          | that the data part is a URL of the picture instead of  |
824	   |          | the picture data itself.                               |
825	   | "u(32)"  | The length of the description string in bytes.         |
826	   | "u(n*8)" | The description of the picture, in UTF-8.              |
827	   | "u(32)"  | The width of the picture in pixels.                    |
828	   | "u(32)"  | The height of the picture in pixels.                   |
829	   | "u(32)"  | The color depth of the picture in bits-per-pixel.      |
830	   | "u(32)"  | For indexed-color pictures (e.g. GIF), the number of   |
831	   |          | colors used, or "0" for non-indexed pictures.          |
832	   | "u(32)"  | The length of the picture data in bytes.               |
833	   | "u(n*8)" | The binary picture data.                               |
834	   +----------+--------------------------------------------------------+
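
The PICTURE block interleaves 32-bit big-endian lengths with variable-length payloads, so it has to be read sequentially. A minimal sketch, with hypothetical function and key names, might look as follows.

```python
def parse_picture(block: bytes) -> dict:
    """Decode METADATA_BLOCK_PICTURE data into a dictionary."""
    pos = 0

    def u32() -> int:
        nonlocal pos
        if pos + 4 > len(block):
            raise ValueError("truncated PICTURE block")
        value = int.from_bytes(block[pos:pos + 4], "big")
        pos += 4
        return value

    def take(n: int) -> bytes:
        nonlocal pos
        chunk = block[pos:pos + n]
        if len(chunk) != n:
            raise ValueError("truncated PICTURE block")
        pos += n
        return chunk

    picture = {}
    picture["type"] = u32()
    picture["mime"] = take(u32()).decode("ascii")  # "-->" means URL
    picture["description"] = take(u32()).decode("utf-8")
    picture["width"] = u32()
    picture["height"] = u32()
    picture["depth"] = u32()       # bits per pixel
    picture["colors"] = u32()      # 0 for non-indexed pictures
    picture["data"] = take(u32())
    return picture
```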

836	10.17.  PICTURE_TYPE
837	              +-------+-------------------------------------+
838	              | Value | Description                         |
839	              +-------+-------------------------------------+
840	              |     0 | Other                               |
841	              |     1 | 32x32 pixels 'file icon' (PNG only) |
842	              |     2 | Other file icon                     |
843	              |     3 | Cover (front)                       |
844	              |     4 | Cover (back)                        |
845	              |     5 | Leaflet page                        |
846	              |     6 | Media (e.g. label side of CD)       |
847	              |     7 | Lead artist/lead performer/soloist  |
848	              |     8 | Artist/performer                    |
849	              |     9 | Conductor                           |
850	              |    10 | Band/Orchestra                      |
851	              |    11 | Composer                            |
852	              |    12 | Lyricist/text writer                |
853	              |    13 | Recording Location                  |
854	              |    14 | During recording                    |
855	              |    15 | During performance                  |
856	              |    16 | Movie/video screen capture          |
857	              |    17 | A bright colored fish               |
858	              |    18 | Illustration                        |
859	              |    19 | Band/artist logotype                |
860	              |    20 | Publisher/Studio logotype           |
861	              +-------+-------------------------------------+

863	   Other values are reserved and should not be used.  There may only be
864	   one each of picture type 1 and 2 in a file.

866	10.18.  FRAME

868	           +----------------+---------------------------------+
869	           | Data           | Description                     |
870	           +----------------+---------------------------------+
871	           | "FRAME_HEADER" |                                 |
872	           | "SUBFRAME"+    | One SUBFRAME per channel.       |
873	           | "u(?)"         | Zero-padding to byte alignment. |
874	           | "FRAME_FOOTER" |                                 |
875	           +----------------+---------------------------------+

877	10.19.  FRAME_HEADER
878	              +---------+----------------------------------+
879	              | Data    | Description                      |
880	              +---------+----------------------------------+
881	              | "u(14)" | Sync code '0b11111111111110'     |
882	              | "u(1)"  | "FRAME HEADER RESERVED"          |
883	              | "u(1)"  | "BLOCKING STRATEGY"              |
884	              | "u(4)"  | "INTERCHANNEL SAMPLE BLOCK SIZE" |
885	              | "u(4)"  | "SAMPLE RATE"                    |
886	              | "u(4)"  | "CHANNEL ASSIGNMENT"             |
887	              | "u(3)"  | "SAMPLE SIZE"                    |
888	              | "u(1)"  | "FRAME HEADER RESERVED2"         |
889	              | "u(?)"  | "CODED NUMBER"                   |
890	              | "u(?)"  | "BLOCK SIZE INT"                 |
891	              | "u(?)"  | "SAMPLE RATE INT"                |
892	              | "u(8)"  | "FRAME CRC"                      |
893	              +---------+----------------------------------+
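
The fixed-width fields at the start of the "FRAME_HEADER" total 32 bits, so they can be pulled out of one big-endian word. The sketch below uses hypothetical names and returns the raw codes only; interpreting them is the subject of the subsections that follow.

```python
def parse_frame_header_prefix(data: bytes) -> dict:
    """Split the first 4 bytes of a FRAME_HEADER into raw field codes."""
    if len(data) < 4:
        raise ValueError("need at least 4 bytes")
    word = int.from_bytes(data[:4], "big")
    if word >> 18 != 0b11111111111110:      # u(14) sync code
        raise ValueError("bad sync code")
    return {
        "reserved": (word >> 17) & 0x1,             # must be 0
        "blocking_strategy": (word >> 16) & 0x1,
        "block_size_code": (word >> 12) & 0xF,
        "sample_rate_code": (word >> 8) & 0xF,
        "channel_assignment": (word >> 4) & 0xF,
        "sample_size_code": (word >> 1) & 0x7,
        "reserved2": word & 0x1,                    # must be 0
    }
```

For instance, the bytes 0xFF 0xF8 0xC9 0x18 describe a fixed-blocksize frame with block size code 0b1100, sample rate code 0b1001, channel assignment 0b0001, and sample size code 0b100.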

895	10.19.1.  FRAME HEADER RESERVED

897	                    +-------+-------------------------+
898	                    | Value | Description             |
899	                    +-------+-------------------------+
900	                    |     0 | mandatory value         |
901	                    |     1 | reserved for future use |
902	                    +-------+-------------------------+

904	   FRAME HEADER RESERVED must remain reserved for "0" in order for a
905	   FLAC frame's initial 15 bits to be distinguishable from the start of
906	   an MPEG audio frame (see also [23]).

908	10.19.2.  BLOCKING STRATEGY

910	   +-------+-----------------------------------------------------------+
911	   | Value | Description                                               |
912	   +-------+-----------------------------------------------------------+
913	   |     0 | fixed-blocksize stream; frame header encodes the frame    |
914	   |       | number                                                    |
915	   |     1 | variable-blocksize stream; frame header encodes the       |
916	   |       | sample number                                             |
917	   +-------+-----------------------------------------------------------+

919	   The "BLOCKING STRATEGY" bit must be the same throughout the entire
920	   stream.

922	   The "BLOCKING STRATEGY" bit determines how to calculate the sample
923	   number of the first sample in the frame.  If the bit is "0" (fixed-
924	   blocksize), the frame header encodes the frame number as above, and
925	   the frame's starting sample number will be the frame number times the
926	   blocksize.  If it is "1" (variable-blocksize), the frame header
927	   encodes the frame's starting sample number itself.  (In the case of a
928	   fixed-blocksize stream, only the last block may be shorter than the
929	   stream blocksize; its starting sample number will be calculated as
930	   the frame number times the previous frame's blocksize, or zero if it
931	   is the first frame).
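
The rule above reduces to a short calculation. In this sketch the function and parameter names are hypothetical, and "stream_blocksize" is assumed to be the fixed block size of the stream (for example, as recorded in STREAMINFO).

```python
def first_sample_number(blocking_strategy: int, coded_number: int,
                        stream_blocksize: int) -> int:
    """Starting sample number of a frame from its CODED NUMBER."""
    if blocking_strategy == 0:
        # Fixed blocksize: the header carries the frame number.
        return coded_number * stream_blocksize
    # Variable blocksize: the header carries the sample number itself.
    return coded_number
```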

933	10.19.3.  INTERCHANNEL SAMPLE BLOCK SIZE

935	   +--------------+----------------------------------------------------+
936	   |        Value | Description                                        |
937	   +--------------+----------------------------------------------------+
938	   |       0b0000 | reserved                                           |
939	   |       0b0001 | 192 samples                                        |
940	   |     0b0010 - | 576 * (2^(n-2)) samples, i.e. 576/1152/2304/4608   |
941	   |       0b0101 |                                                    |
942	   |       0b0110 | get 8 bit (blocksize-1) from end of header         |
943	   |       0b0111 | get 16 bit (blocksize-1) from end of header        |
944	   |     0b1000 - | 256 * (2^(n-8)) samples, i.e.                      |
945	   |       0b1111 | 256/512/1024/2048/4096/8192/16384/32768            |
946	   +--------------+----------------------------------------------------+
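
The block size table can be expressed as a small decoding function. This is only a sketch: the function name is hypothetical, and "trailing" stands for the 8-bit or 16-bit (blocksize-1) value read from the end of the header when the code calls for it.

```python
def decode_block_size(code: int, trailing=None) -> int:
    """Map a 4-bit block size code to a block size in samples."""
    if not 0 <= code <= 0b1111:
        raise ValueError("block size code is 4 bits")
    if code == 0b0000:
        raise ValueError("block size code 0b0000 is reserved")
    if code == 0b0001:
        return 192
    if code <= 0b0101:
        return 576 << (code - 2)        # 576 * 2^(n-2)
    if code <= 0b0111:
        if trailing is None:
            raise ValueError("block size must be read from end of header")
        return trailing + 1             # stored as blocksize - 1
    return 256 << (code - 8)            # 256 * 2^(n-8)
```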

948	10.19.4.  SAMPLE RATE

950	   +--------+----------------------------------------------------------+
951	   |  Value | Description                                              |
952	   +--------+----------------------------------------------------------+
953	   | 0b0000 | get from STREAMINFO metadata block                       |
954	   | 0b0001 | 88.2 kHz                                                 |
955	   | 0b0010 | 176.4 kHz                                                |
956	   | 0b0011 | 192 kHz                                                  |
957	   | 0b0100 | 8 kHz                                                    |
958	   | 0b0101 | 16 kHz                                                   |
959	   | 0b0110 | 22.05 kHz                                                |
960	   | 0b0111 | 24 kHz                                                   |
961	   | 0b1000 | 32 kHz                                                   |
962	   | 0b1001 | 44.1 kHz                                                 |
963	   | 0b1010 | 48 kHz                                                   |
964	   | 0b1011 | 96 kHz                                                   |
965	   | 0b1100 | get 8 bit sample rate (in kHz) from end of header        |
966	   | 0b1101 | get 16 bit sample rate (in Hz) from end of header        |
967	   | 0b1110 | get 16 bit sample rate (in tens of Hz) from end of       |
968	   |        | header                                                   |
969	   | 0b1111 | invalid, to prevent sync-fooling string of 1s            |
970	   +--------+----------------------------------------------------------+
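
A corresponding sketch for the sample rate code is shown below. The names are hypothetical; "streaminfo_rate" is the rate taken from the STREAMINFO block and "trailing" is the 8-bit or 16-bit value read from the end of the header. Note that the largest rate expressible this way is 65535 tens of Hz, which matches the 655350 Hz ceiling stated for STREAMINFO.

```python
FIXED_SAMPLE_RATES = {
    0b0001: 88200, 0b0010: 176400, 0b0011: 192000, 0b0100: 8000,
    0b0101: 16000, 0b0110: 22050, 0b0111: 24000, 0b1000: 32000,
    0b1001: 44100, 0b1010: 48000, 0b1011: 96000,
}


def decode_sample_rate(code: int, streaminfo_rate: int, trailing=None) -> int:
    """Map a 4-bit sample rate code to a sample rate in Hz."""
    if code == 0b0000:
        return streaminfo_rate
    if code in FIXED_SAMPLE_RATES:
        return FIXED_SAMPLE_RATES[code]
    if code in (0b1100, 0b1101, 0b1110):
        if trailing is None:
            raise ValueError("sample rate must be read from end of header")
        scale = {0b1100: 1000, 0b1101: 1, 0b1110: 10}[code]
        return trailing * scale
    raise ValueError("invalid sample rate code")
```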

972	10.19.5.  CHANNEL ASSIGNMENT

974	   For values 0000-0111, the value represents the (number of independent
975	   channels)-1.  Where defined, the channel order follows SMPTE/ITU-R
976	   recommendations.

978	   +-----------+-------------------------------------------------------+
979	   |     Value | Description                                           |
980	   +-----------+-------------------------------------------------------+
981	   |    0b0000 | 1 channel: mono                                       |
982	   |    0b0001 | 2 channels: left, right                               |
983	   |    0b0010 | 3 channels: left, right, center                       |
984	   |    0b0011 | 4 channels: front left, front right, back left, back  |
985	   |           | right                                                 |
986	   |    0b0100 | 5 channels: front left, front right, front center,    |
987	   |           | back/surround left, back/surround right               |
988	   |    0b0101 | 6 channels: front left, front right, front center,    |
989	   |           | LFE, back/surround left, back/surround right          |
990	   |    0b0110 | 7 channels: front left, front right, front center,    |
991	   |           | LFE, back center, side left, side right               |
992	   |    0b0111 | 8 channels: front left, front right, front center,    |
993	   |           | LFE, back left, back right, side left, side right     |
994	   |    0b1000 | left/side stereo: channel 0 is the left channel,      |
995	   |           | channel 1 is the side (difference) channel            |
996	   |    0b1001 | right/side stereo: channel 0 is the side (difference) |
997	   |           | channel, channel 1 is the right channel               |
998	   |    0b1010 | mid/side stereo: channel 0 is the mid (average)       |
999	   |           | channel, channel 1 is the side (difference) channel   |
1000	   |  0b1011 - | reserved                                              |
1001	   |    0b1111 |                                                       |
1002	   +-----------+-------------------------------------------------------+
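
The three stereo modes can be undone with integer arithmetic. The sketch below is illustrative only: the table does not spell out the sign convention or the rounding, so it assumes side = left - right and mid = (left + right) >> 1, with the dropped low bit of the sum recovered from the parity of the side sample. The function name is hypothetical.

```python
def undo_stereo_decorrelation(assignment: int, ch0: int, ch1: int):
    """Return (left, right) for one pair of decoded subframe samples.

    Assumes side = left - right and mid = (left + right) >> 1.
    """
    if assignment == 0b1000:            # left/side
        return ch0, ch0 - ch1
    if assignment == 0b1001:            # right/side (ch0 is side)
        return ch0 + ch1, ch1
    if assignment == 0b1010:            # mid/side
        mid, side = ch0, ch1
        total = (mid << 1) | (side & 1)  # restore left + right
        return (total + side) >> 1, (total - side) >> 1
    if assignment == 0b0001:            # independent left, right
        return ch0, ch1
    raise ValueError("not a two-channel assignment")
```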

1004	10.19.6.  SAMPLE SIZE

1006	              +-------+------------------------------------+
1007	              | Value | Description                        |
1008	              +-------+------------------------------------+
1009	              | 0b000 | get from STREAMINFO metadata block |
1010	              | 0b001 | 8 bits per sample                  |
1011	              | 0b010 | 12 bits per sample                 |
1012	              | 0b011 | reserved                           |
1013	              | 0b100 | 16 bits per sample                 |
1014	              | 0b101 | 20 bits per sample                 |
1015	              | 0b110 | 24 bits per sample                 |
1016	              | 0b111 | reserved                           |
1017	              +-------+------------------------------------+

1019	10.19.7.  FRAME HEADER RESERVED2

1021	                    +-------+-------------------------+
1022	                    | Value | Description             |
1023	                    +-------+-------------------------+
1024	                    |     0 | mandatory value         |
1025	                    |     1 | reserved for future use |
1026	                    +-------+-------------------------+

1028	10.19.8.  CODED NUMBER

1030	   The "UTF-8" coding used for the sample/frame number is the same
1031	   variable length code used to store compressed UCS-2, extended to
1032	   handle larger input.

1034	  if(variable blocksize)
1035	    `u(8...56)`: "UTF-8" coded sample number (decoded number is 36 bits)
1036	  else
1037	    `u(8...48)`: "UTF-8" coded frame number (decoded number is 31 bits)

1039	10.19.9.  BLOCK SIZE INT

1041	            if(`INTERCHANNEL SAMPLE BLOCK SIZE` == 0b0110)
1042	              8 bit (blocksize-1)
1043	            else if(`INTERCHANNEL SAMPLE BLOCK SIZE` == 0b0111)
1044	              16 bit (blocksize-1)

1046	10.19.10.  SAMPLE RATE INT

1048	                    if(`SAMPLE RATE` == 0b1100)
1049	                      8 bit sample rate (in kHz)
1050	                    else if(`SAMPLE RATE` == 0b1101)
1051	                      16 bit sample rate (in Hz)
1052	                    else if(`SAMPLE RATE` == 0b1110)
1053	                      16 bit sample rate (in tens of Hz)
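
   The two optional trailing fields (BLOCK SIZE INT and SAMPLE RATE INT)
   can be read as sketched below. The function names are hypothetical,
   and the sketch assumes the fields are byte-aligned and stored
   big-endian, with "data" holding the frame header bytes and "pos" the
   byte offset of the field.

```python
def read_blocksize_int(code, data, pos):
    """Return (blocksize, next_pos); blocksize is None if no field follows."""
    if code == 0b0110:                  # 8-bit (blocksize - 1)
        return data[pos] + 1, pos + 1
    if code == 0b0111:                  # 16-bit (blocksize - 1)
        return int.from_bytes(data[pos:pos + 2], "big") + 1, pos + 2
    return None, pos


def read_sample_rate_int(code, data, pos):
    """Return (rate_in_hz, next_pos); rate is None if no field follows."""
    if code == 0b1100:                  # 8-bit value in kHz
        return data[pos] * 1000, pos + 1
    if code == 0b1101:                  # 16-bit value in Hz
        return int.from_bytes(data[pos:pos + 2], "big"), pos + 2
    if code == 0b1110:                  # 16-bit value in tens of Hz
        return int.from_bytes(data[pos:pos + 2], "big") * 10, pos + 2
    return None, pos
```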

1055	10.19.11.  FRAME CRC

1057	   CRC-8 (polynomial = x^8 + x^2 + x^1 + x^0, initialized with 0) of
1058	   everything before the CRC, including the sync code.
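
   A bitwise CRC-8 with this polynomial (0x07) and a zero initial value
   can be sketched as follows. The sketch assumes most-significant-bit
   first processing with no final XOR; the function name is
   illustrative.

```python
def crc8(data):
    """CRC-8, polynomial x^8 + x^2 + x^1 + x^0 (0x07), initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc
```

   With these parameters, running the function over the header bytes
   followed by the stored CRC byte yields zero for an intact header.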

1060	10.20.  FRAME_FOOTER
1061	   +---------+---------------------------------------------------------+
1062	   | Data    | Description                                             |
1063	   +---------+---------------------------------------------------------+
1064	   | "u(16)" | CRC-16 (polynomial = x^16 + x^15 + x^2 + x^0,           |
1065	   |         | initialized with 0) of everything before the CRC, back  |
1066	   |         | to and including the frame header sync code             |
1067	   +---------+---------------------------------------------------------+

1069	10.21.  SUBFRAME

1071	   +-------------------------------------------+-----------------------+
1072	   | Data                                      | Description           |
1073	   +-------------------------------------------+-----------------------+
1074	   | "SUBFRAME_HEADER"                         |                       |
1075	   | "SUBFRAME_CONSTANT" || "SUBFRAME_FIXED"   | The SUBFRAME_HEADER   |
1076	   | || "SUBFRAME_LPC" || "SUBFRAME_VERBATIM"  | specifies which one.  |
1077	   +-------------------------------------------+-----------------------+

1079	10.22.  SUBFRAME_HEADER

1081	   +----------+--------------------------------------------------------+
1082	   | Data     | Description                                            |
1083	   +----------+--------------------------------------------------------+
1084	   | "u(1)"   | Zero bit padding, to prevent sync-fooling string of 1s |
1085	   | "u(6)"   | "SUBFRAME TYPE" (see Section 10.22.1)                  |
1086	   | "u(1+k)" | "WASTED BITS PER SAMPLE FLAG" (see Section 10.22.2)    |
1087	   +----------+--------------------------------------------------------+

1089	10.22.1.  SUBFRAME TYPE

1091	   +----------+--------------------------------------------------------+
1092	   |    Value | Description                                            |
1093	   +----------+--------------------------------------------------------+
1094	   | 0b000000 | "SUBFRAME_CONSTANT"                                    |
1095	   | 0b000001 | "SUBFRAME_VERBATIM"                                    |
1096	   | 0b00001x | reserved                                               |
1097	   | 0b0001xx | reserved                                               |
1098	   | 0b001xxx | if(xxx <= 4) "SUBFRAME_FIXED", xxx=order ; else        |
1099	   |          | reserved                                               |
1100	   | 0b01xxxx | reserved                                               |
1101	   | 0b1xxxxx | "SUBFRAME_LPC", xxxxx=order-1                          |
1102	   +----------+--------------------------------------------------------+
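
   The table above maps directly onto a small classifier. In this
   sketch the function name and the returned tuple layout are
   assumptions; the input is the 6-bit SUBFRAME TYPE value.

```python
def parse_subframe_type(bits6):
    """Map a 6-bit SUBFRAME TYPE value to (kind, order)."""
    if bits6 == 0b000000:
        return ("SUBFRAME_CONSTANT", None)
    if bits6 == 0b000001:
        return ("SUBFRAME_VERBATIM", None)
    if bits6 & 0b100000:                        # 0b1xxxxx, xxxxx = order - 1
        return ("SUBFRAME_LPC", (bits6 & 0b011111) + 1)
    if (bits6 & 0b111000) == 0b001000:          # 0b001xxx, xxx = order
        order = bits6 & 0b000111
        if order <= 4:
            return ("SUBFRAME_FIXED", order)
    return ("reserved", None)
```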

1104	10.22.2.  WASTED BITS PER SAMPLE FLAG
1105	   +-------+-----------------------------------------------------------+
1106	   | Value | Description                                               |
1107	   +-------+-----------------------------------------------------------+
1108	   |     0 | no wasted bits-per-sample in source subblock, k=0         |
1109	   |     1 | k wasted bits-per-sample in source subblock, k-1 follows, |
1110	   |       | unary coded; e.g. k=3 => 001 follows, k=7 => 0000001      |
1111	   |       | follows.                                                  |
1112	   +-------+-----------------------------------------------------------+
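
   Reading this field amounts to one flag bit followed, when the flag
   is set, by a unary code for k-1. The sketch below assumes a
   hypothetical bit source: any Python iterator that yields the
   bitstream as 0/1 integers, positioned at the flag bit.

```python
def read_wasted_bits(bits):
    """Return k, the number of wasted bits per sample (0 if the flag is 0)."""
    if next(bits) == 0:
        return 0
    k = 1
    # k-1 is unary coded: count zero bits up to the terminating 1 bit.
    while next(bits) == 0:
        k += 1
    return k
```

   For the examples in the table, a flag of 1 followed by 001 gives
   k=3, and a flag of 1 followed by 0000001 gives k=7.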

1114	10.23.  SUBFRAME_CONSTANT

1116	   +--------+----------------------------------------------------------+
1117	   | Data   | Description                                              |
1118	   +--------+----------------------------------------------------------+
1119	   | "u(n)" | Unencoded constant value of the subblock, n = frame's    |
1120	   |        | bits-per-sample.                                         |
1121	   +--------+----------------------------------------------------------+

1123	10.24.  SUBFRAME_FIXED

1125	   +------------+------------------------------------------------------+
1126	   | Data       | Description                                          |
1127	   +------------+------------------------------------------------------+
1128	   | "u(n)"     | Unencoded warm-up samples (n = frame's bits-per-     |
1129	   |            | sample * predictor order).                           |
1130	   | "RESIDUAL" | Encoded residual                                     |
1131	   +------------+------------------------------------------------------+
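
   Once the warm-up samples and the residual are decoded, the subblock
   is rebuilt by adding each residual to a prediction from the
   preceding samples. This section does not list the predictors, so the
   coefficients below are an assumption: they are the polynomial
   predictors of order 0 to 4 commonly documented for FLAC. The names
   are hypothetical.

```python
# Assumed fixed predictor coefficients; entry j multiplies the sample
# j+1 positions back from the one being predicted.
FIXED_COEFFS = {
    0: [],
    1: [1],
    2: [2, -1],
    3: [3, -3, 1],
    4: [4, -6, 4, -1],
}


def restore_fixed(order, warmup, residual):
    """Rebuild samples from warm-up samples and decoded residuals."""
    if len(warmup) != order:
        raise ValueError("need exactly one warm-up sample per order")
    coeffs = FIXED_COEFFS[order]
    samples = list(warmup)
    for r in residual:
        prediction = sum(c * samples[-1 - j] for j, c in enumerate(coeffs))
        samples.append(prediction + r)
    return samples
```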

1133	10.25.  SUBFRAME_LPC

1135	   +------------+------------------------------------------------------+
1136	   | Data       | Description                                          |
1137	   +------------+------------------------------------------------------+
1138	   | "u(n)"     | Unencoded warm-up samples (n = frame's bits-per-     |
1139	   |            | sample * lpc order).                                 |
1140	   | "u(4)"     | (Quantized linear predictor coefficients' precision  |
1141	   |            | in bits)-1 (0b1111 = invalid).                       |
1142	   | "u(5)"     | Quantized linear predictor coefficient shift needed  |
1143	   |            | in bits (NOTE: this number is signed                 |
1144	   |            | two's-complement).                                   |
1145	   | "u(n)"     | Unencoded predictor coefficients (n = qlp coeff      |
1146	   |            | precision * lpc order) (NOTE: the coefficients are   |
1147	   |            | signed two's-complement).                            |
1148	   | "RESIDUAL" | Encoded residual                                     |
1149	   +------------+------------------------------------------------------+
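
   The sketch below shows one way the fields above could be combined
   after parsing. It is illustrative only: the helper names are
   hypothetical, it assumes the first stored coefficient applies to the
   most recent sample, and it assumes a non-negative shift applied as
   an arithmetic right shift to the prediction sum.

```python
def sign_extend(value, bits):
    """Interpret an unsigned field of the given width as two's-complement."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def restore_lpc(warmup, coeffs, shift, residual):
    """Rebuild samples from warm-up samples, coefficients, and residuals."""
    if len(warmup) != len(coeffs):
        raise ValueError("need exactly one warm-up sample per coefficient")
    if shift < 0:
        raise ValueError("this sketch only handles non-negative shifts")
    samples = list(warmup)
    for r in residual:
        acc = sum(c * samples[-1 - j] for j, c in enumerate(coeffs))
        samples.append(r + (acc >> shift))
    return samples
```

   For example, a 5-bit shift field of 0b11111 read as an unsigned
   integer sign-extends to -1, which this sketch would reject.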

1151	10.26.  SUBFRAME_VERBATIM

1153	   +-----------+-------------------------------------------------------+
1154	   | Data      | Description                                           |
1155	   +-----------+-------------------------------------------------------+
1156	   | "u(n*i)"  | Unencoded subblock; n = frame's bits-per-sample, i =  |
1157	   |           | frame's blocksize.                                    |
1158	   +-----------+-------------------------------------------------------+

1160	10.27.  RESIDUAL

1162	   +-------------------------------------------+-----------------------+
1163	   | Data                                      | Description           |
1164	   +-------------------------------------------+-----------------------+
1165	   | "u(2)"                                    | "RESIDUAL_CODING_METH |
1166	   |                                           | OD"                   |
1167	   | "RESIDUAL_CODING_METHOD_PARTITIONED_EXP_G |                       |
1168	   | OLOMB" || "RESIDUAL_CODING_METHOD_PARTITI |                       |
1169	   | ONED_EXP_GOLOMB2"                         |                       |
1170	   +-------------------------------------------+-----------------------+

1172	10.27.1.  RESIDUAL_CODING_METHOD

1174	   +--------+----------------------------------------------------------+
1175	   |  Value | Description                                              |
1176	   +--------+----------------------------------------------------------+
1177	   |   0b00 | partitioned Exp-Golomb coding with 4-bit Exp-Golomb      |
1178	   |        | parameter; RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB |
1179	   |        | follows                                                  |
1180	   |   0b01 | partitioned Exp-Golomb coding with 5-bit Exp-Golomb      |
1181	   |        | parameter;                                               |
1182	   |        | RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB2 follows   |
1183	   | 0b10 - | reserved                                                 |
1184	   |   0b11 |                                                          |
1185	   +--------+----------------------------------------------------------+

1187	10.27.2.  RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB

1189	      +-------------------------+-----------------------------------+
1190	      | Data                    | Description                       |
1191	      +-------------------------+-----------------------------------+
1192	      | "u(4)"                  | Partition order.                  |
1193	      | "EXP_GOLOMB_PARTITION"+ | There will be 2^order partitions. |
1194	      +-------------------------+-----------------------------------+

1196	10.27.2.1.  EXP_GOLOMB_PARTITION

1198	   +------------+------------------------------------------------------+
1199	   | Data       | Description                                          |
1200	   +------------+------------------------------------------------------+
1201	   | "u(4(+5))" | "EXP-GOLOMB PARTITION ENCODING PARAMETER" (see       |
1202	   |            | Section 10.27.2.2)                                   |
1203	   | "u(?)"     | "ENCODED RESIDUAL" (see Section 10.27.4)             |
1204	   +------------+------------------------------------------------------+

1206	10.27.2.2.  EXP-GOLOMB PARTITION ENCODING PARAMETER

1208	   +----------+--------------------------------------------------------+
1209	   |    Value | Description                                            |
1210	   +----------+--------------------------------------------------------+
1211	   | 0b0000 - | Exp-Golomb parameter.                                  |
1212	   |   0b1110 |                                                        |
1213	   |   0b1111 | Escape code, meaning the partition is in unencoded     |
1214	   |          | binary form using n bits per sample; n follows as a    |
1215	   |          | 5-bit number.                                          |
1216	   +----------+--------------------------------------------------------+

1218	10.27.3.  RESIDUAL_CODING_METHOD_PARTITIONED_EXP_GOLOMB2

1220	     +--------------------------+-----------------------------------+
1221	     | Data                     | Description                       |
1222	     +--------------------------+-----------------------------------+
1223	     | "u(4)"                   | Partition order.                  |
1224	     | "EXP_GOLOMB2_PARTITION"+ | There will be 2^order partitions. |
1225	     +--------------------------+-----------------------------------+

1227	10.27.3.1.  EXP_GOLOMB2_PARTITION

1229	   +------------+------------------------------------------------------+
1230	   | Data       | Description                                          |
1231	   +------------+------------------------------------------------------+
1232	   | "u(5(+5))" | "EXP-GOLOMB2 PARTITION ENCODING PARAMETER" (see      |
1233	   |            | Section 10.27.3.2)                                   |
1234	   | "u(?)"     | "ENCODED RESIDUAL" (see Section 10.27.4)             |
1235	   +------------+------------------------------------------------------+

1237	10.27.3.2.  EXP-GOLOMB2 PARTITION ENCODING PARAMETER
1238	   +----------+--------------------------------------------------------+
1239	   |    Value | Description                                            |
1240	   +----------+--------------------------------------------------------+
1241	   |  0b00000 | Exp-Golomb parameter.                                  |
1242	   |        - |                                                        |
1243	   |  0b11110 |                                                        |
1244	   |  0b11111 | Escape code, meaning the partition is in unencoded     |
1245	   |          | binary form using n bits per sample; n follows as a    |
1246	   |          | 5-bit number.                                          |
1247	   +----------+--------------------------------------------------------+
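
   Both partition layouts start the same way: a 4-bit or 5-bit
   parameter whose all-ones value is an escape code followed by a 5-bit
   sample width. The sketch below covers only that prefix. BitReader
   and read_partition_parameter are hypothetical helpers, and bits are
   assumed to be read most significant bit first.

```python
class BitReader:
    """Minimal MSB-first bit reader over a bytes object."""

    def __init__(self, data):
        self.data = data
        self.pos = 0                    # current position in bits

    def read(self, n):
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def read_partition_parameter(reader, method):
    """Return ("parameter", p) or ("escape", n) for one partition."""
    if method == 0b00:
        width = 4
    elif method == 0b01:
        width = 5
    else:
        raise ValueError("reserved residual coding method")
    param = reader.read(width)
    if param == (1 << width) - 1:       # all ones: escape code
        return ("escape", reader.read(5))
    return ("parameter", param)
```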

1249	10.27.4.  ENCODED RESIDUAL

1251	   The number of samples (n) in the partition is determined as follows:

1253	   o  if the partition order is zero, n = frame's blocksize - predictor
1254	      order

1256	   o  else if this is not the first partition of the subframe, n =
1257	      (frame's blocksize / (2^partition order))

1259	   o  else n = (frame's blocksize / (2^partition order)) - predictor
1260	      order
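
   The three rules above reduce to one computation: every partition
   holds blocksize / 2^(partition order) samples, and the first
   partition is shortened by the predictor order. The function below is
   a hypothetical sketch that assumes the blocksize is evenly divisible
   by the number of partitions.

```python
def partition_sample_counts(blocksize, predictor_order, partition_order):
    """Return the number of residual samples in each partition, in order."""
    partitions = 1 << partition_order
    if blocksize % partitions:
        raise ValueError("blocksize is not divisible by 2^partition order")
    counts = [blocksize // partitions] * partitions
    counts[0] -= predictor_order        # warm-up samples carry no residual
    return counts
```

   The counts always sum to the blocksize minus the predictor order,
   which is the total number of residuals in the subframe.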

1262	   Copyright (c) 2000-2009 Josh Coalson, 2011-2014 Xiph.Org Foundation

1264	11.  References

1266	11.1.  URIs

1268	   [1] ogg_mapping.html

1270	   [2] documentation_format_overview.html

1272	   [3] http://svr-www.eng.cam.ac.uk/~ajr/

1274	   [4] http://svr-www.eng.cam.ac.uk/reports/abstracts/
1275	       robinson_tr156.html

1277	   [5] https://web.archive.org/web/20040215005354/http://csi.usc.edu/
1278	       faculty/golomb.html

1280	   [6] http://en.wikipedia.org/wiki/Claude_Shannon

1282	   [7] https://en.wikipedia.org/wiki/Linear_predictive_coding

1284	   [8] http://www.hpl.hp.com/techreports/1999/HPL-1999-144.pdf

1286	   [9] http://svr-www.eng.cam.ac.uk/reports/abstracts/
1287	       robinson_tr156.html

1289	   [10] http://www.hpl.hp.com/techreports/1999/HPL-1999-144.pdf

1291	   [11] http://svr-www.eng.cam.ac.uk/reports/abstracts/
1292	        robinson_tr156.html

1294	   [12] https://en.wikipedia.org/wiki/Exponential-Golomb_coding

1296	   [13] http://www.hpl.hp.com/techreports/98/HPL-98-193.html

1298	   [14] http://web.archive.org/web/20140827133312/http://www.cs.tut.fi/~
1299	        albert/Dev/pucrunch/packing.html

1301	   [15] https://xiph.org/flac/id.html

1303	   [16] http://xiph.org/vorbis/doc/v-comment.html

1305	   [17] http://www.id3.org/id3v2.4.0-frames

1307	   [18] id.html

1309	   [19] http://www.xiph.org/vorbis/doc/v-comment.html

1311	   [20] http://isrc.ifpi.org/

1313	   [21] http://www.disctronics.co.uk/technology/cdaudio/cdaud_isrc.htm

1315	   [22] http://www.chipchapin.com/CDMedia/cdda9.php3

1317	   [23] http://lists.xiph.org/pipermail/flac-
1318	        dev/2008-December/002607.html

1320	Authors' Addresses

1322	   Josh Coalson

1324	   Xiph.Org Foundation