idnits 2.17.1 draft-ietf-avt-rfc3119bis-05.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 16. -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on line 545. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 516. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 523. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 529. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == The page length should not exceed 58 lines per page, but there was 1 longer page, the longest (page 1) being 903 lines Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There are 44 instances of too long lines in the document, the longest one being 1 character in excess of 72. -- The draft header indicates that this document obsoletes RFC3119, but the abstract doesn't seem to directly say this. It does mention RFC3119 though, so this could be OK. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. 
If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (August 28, 2007) is 6079 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Possible downref: Non-RFC (?) normative reference: ref. '3' -- Possible downref: Non-RFC (?) normative reference: ref. '4' ** Obsolete normative reference: RFC 4566 (ref. '7') (Obsoleted by RFC 8866) Summary: 3 errors (**), 0 flaws (~~), 2 warnings (==), 10 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Network Working Group Ross Finlayson 2 INTERNET-DRAFT Live Networks, Inc. 3 Obsoletes: 3119 August 28, 2007 4 Category: Standards Track 5 Expires: February 28, 2008 7 A More Loss-Tolerant RTP Payload Format for MP3 Audio 8 10 Status of this Memo 12 By submitting this Internet-Draft, each author represents that 13 any applicable patent or other IPR claims of which he or she is 14 aware have been or will be disclosed, and any of which he or she 15 becomes aware will be disclosed, in accordance with Section 6 of 16 BCP 79. 18 Internet-Drafts are working documents of the Internet Engineering 19 Task Force (IETF), its areas, and its working groups. Note that other 20 groups may also distribute working documents as Internet-Drafts. 22 Internet-Drafts are draft documents valid for a maximum of six months 23 and may be updated, replaced, or obsoleted by other documents at any 24 time. 
It is inappropriate to use Internet-Drafts as reference 25 material or to cite them other than as "work in progress." 27 The list of current Internet-Drafts can be accessed at 28 http://www.ietf.org/ietf/1id-abstracts.txt 30 The list of Internet-Draft Shadow Directories can be accessed at 31 http://www.ietf.org/shadow.html 33 Abstract 35 This document describes an RTP (Real-time Transport Protocol) payload format for 36 transporting MPEG (Moving Picture Experts Group) 1 or 2, layer III 37 audio (commonly known as "MP3"). This format is an alternative to 38 that described in RFC 2250, and performs better if there is packet 39 loss. (This document updates (and obsoletes) RFC 3119, correcting 40 typographical errors in the "SDP usage" section and pseudo-code 41 appendices.) 43 Table of Contents 45 1. Terminology 46 2. Introduction 47 3. The Structure of MP3 Frames 48 4. A New Payload Format 49 4.1 ADU frames 50 4.2 ADU descriptors 51 4.3 Packing rules 52 4.4 RTP header fields 53 4.5 Handling received data 54 5. Handling Multiple MPEG Audio Layers 55 6. Frame Packetizing and Depacketizing 56 7. ADU Frame Interleaving 57 8. IANA Considerations 58 9. SDP usage 59 10. Security Considerations 60 11. Acknowledgements 61 12. Normative References 62 13. Author's Address 63 14. IPR Notice 64 15. Copyright Notice 65 Appendix A. Translating Between "MP3 Frames" and "ADU Frames" 66 A.1 Converting a sequence of "MP3 Frames" to a sequence of 67 "ADU Frames": 68 A.2 Converting a sequence of "ADU Frames" to a sequence of 69 "MP3 Frames": 70 Appendix B: Interleaving and Deinterleaving 71 B.1 Interleaving a sequence of "ADU Frames": 72 B.2 Deinterleaving a sequence of (interleaved) "ADU Frames": 74 1. Terminology 75 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 76 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 77 document are to be interpreted as described in RFC 2119 [1]. 79 2. 
Introduction 81 While the RTP payload format defined in RFC 2250 [2] is generally 82 applicable to all forms of MPEG audio or video, it is sub-optimal for 83 MPEG-1 or 2, layer III audio (commonly known as "MP3"). The reason 84 for this is that an MP3 frame is not a true "Application Data Unit" - 85 it contains a back-pointer to data in earlier frames, and so cannot 86 be decoded independently of these earlier frames. Because RFC 2250 87 defines that packet boundaries coincide with frame boundaries, it 88 handles packet loss inefficiently when carrying MP3 data. The loss 89 of an MP3 frame will render some data in previous (or future) frames 90 useless, even if they are received without loss. 92 In this document we define an alternative RTP payload format for MP3 93 audio. This format uses a data-preserving rearrangement of the 94 original MPEG frames, so that packet boundaries now coincide with 95 true MP3 "Application Data Units", which can also (optionally) be 96 rearranged in an interleaving pattern. This new format is therefore 97 more data-efficient than RFC 2250 in the face of packet loss. 99 3. The Structure of MP3 Frames 101 In this section we give a brief overview of the structure of a MP3 102 frame. (For more detailed description, see the MPEG-1 audio [3] and 103 MPEG-2 audio [4] specifications.) 105 Each MPEG audio frame begins with a 4-byte header. Information 106 defined by this header includes: 108 - Whether the audio is MPEG-1 or MPEG-2. 109 - Whether the audio is layer I, II, or III. 110 (The remainder of this document assumes layer III, i.e., "MP3" 111 frames) 112 - Whether the audio is mono or stereo. 113 - Whether or not there is a 2-byte CRC field following the header. 114 - (indirectly) The size of the frame. 116 The following structures appear after the header: 118 - (optionally) A 2-byte CRC field 119 - A "side info" structure. 
This has the following length: 120 - 32 bytes for MPEG-1 stereo 121 - 17 bytes for MPEG-1 mono, or for MPEG-2 stereo 122 - 9 bytes for MPEG-2 mono 123 - Encoded audio data, plus optional ancillary data (filling out the 124 rest of the frame) 126 For the purpose of this document, the "side info" structure is the 127 most important, because it defines the location and size of the 128 "Application Data Unit" (ADU) that an MP3 decoder will process. In 129 particular, the "side info" structure defines: 131 - "main_data_begin": This is a back-pointer (in bytes) to the start 132 of the ADU. The back-pointer is counted from the beginning of the 133 frame, and counts only encoded audio data and any ancillary data 134 (i.e., ignoring any header, CRC, or "side info" fields). 136 An MP3 decoder processes each ADU independently. The ADUs will 137 generally vary in length, but their average length will, of course, 138 be that of the MP3 frames (minus the length of the header, 139 CRC, and "side info" fields). (In MPEG literature, this ADU is 140 sometimes referred to as a "bit reservoir".) 142 4. A New Payload Format 144 As noted in [5], a payload format should be designed so that packet 145 boundaries coincide with "codec frame boundaries" - i.e., with ADUs. 146 In the RFC 2250 payload format for MPEG audio [2], each RTP packet 147 payload contains MP3 frames. In this new payload format for MP3 148 audio, however, each RTP packet payload contains "ADU frames", each 149 preceded by an "ADU descriptor". 
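The "side info" lengths listed in section 3 can be illustrated with a minimal sketch. It is not part of the payload format. The header bit positions used here (version ID bit, protection bit, channel mode field) are assumptions taken from the MPEG audio header layout in [3] and [4], not from this document; the function names are hypothetical, and MPEG-2.5 streams are not considered.

```python
def side_info_size(header: int) -> int:
    """Return the 'side info' length in bytes for a 32-bit MPEG audio header.

    Assumed bit layout (from the MPEG audio specifications, not this draft):
      bit 19    : ID (1 = MPEG-1, 0 = MPEG-2)
      bits 7..6 : channel mode (3 = mono, anything else = two channels)
    """
    is_mpeg1 = bool(header & 0x00080000)
    is_mono = ((header >> 6) & 0x3) == 0x3
    if is_mpeg1:
        return 17 if is_mono else 32
    return 9 if is_mono else 17


def has_crc(header: int) -> bool:
    """True if a 2-byte CRC follows the header (assumed: protection bit 16 == 0)."""
    return (header & 0x00010000) == 0
```

For instance, under these assumptions a header of 0xFFFB9064 describes MPEG-1 two-channel audio without a CRC, so its "side info" structure occupies 32 bytes.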
151 4.1 ADU frames 153 An "ADU frame" is defined as: 155 - The 4-byte MPEG header 156 (the same as the original MP3 frame, except that the first 11 157 bits are (optionally) replaced by an "Interleaving Sequence 158 Number", as described in section 7 below) 159 - The optional 2-byte CRC field 160 (the same as the original MP3 frame) 161 - The "side info" structure 162 (the same as the original MP3 frame) 163 - The complete sequence of encoded audio data (and any ancillary 164 data) for the ADU (i.e., running from the start of this MP3 165 frame's "main_data_begin" back-pointer, up to the start of the 166 next MP3 frame's back-pointer) 168 4.2 ADU descriptors 170 Within each RTP packet payload, each "ADU frame" is preceded by a 1 171 or 2-byte "ADU descriptor", which gives the size of the ADU, and 172 indicates whether or not this packet's data is a continuation of the 173 previous packet's data. (This occurs only when a single "ADU 174 descriptor"+"ADU frame" is too large to fit within a RTP packet.) 176 An ADU descriptor consists of the following fields 178 - "C": Continuation flag (1 bit): 1 if the data following the ADU 179 descriptor is a continuation of an ADU frame that was too 180 large to fit within a single RTP packet; 0 otherwise. 181 - "T": Descriptor Type flag (1 bit): 182 0 if this is a 1-byte ADU descriptor; 183 1 if this is a 2-byte ADU descriptor. 184 - "ADU size" (6 or 14 bits): 185 The size (in bytes) of the ADU frame that will follow this 186 ADU descriptor (i.e., NOT including the size of the 187 descriptor itself). A 2-byte ADU descriptor (with a 14-bit 188 "ADU size" field) is used for ADU frame sizes of 64 bytes or 189 more. For smaller ADU frame sizes, senders MAY alternatively 190 use a 1-byte ADU descriptor (with a 6-bit "ADU size" field). 191 Receivers MUST be able to accept an ADU descriptor of either 192 size. 
194 Thus, a 1-byte ADU descriptor is formatted as follows: 196 0 1 2 3 4 5 6 7 197 +-+-+-+-+-+-+-+-+ 198 |C|0| ADU size | 199 +-+-+-+-+-+-+-+-+ 201 and a 2-byte ADU descriptor is formatted as follows: 203 0 1 204 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 205 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 206 |C|1| ADU size (14 bits) | 207 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 209 4.3 Packing rules 211 Each RTP packet payload begins with a "ADU descriptor", followed by 212 "ADU frame" data. Normally, this "ADU descriptor"+"ADU frame" will 213 fit completely within the RTP packet. In this case, more than one 214 successive "ADU descriptor"+"ADU frame" MAY be packed into a single 215 RTP packet, provided that they all fit completely. 217 If, however, a single "ADU descriptor"+"ADU frame" is too large to 218 fit within an RTP packet, then the "ADU frame" is split across two or 219 more successive RTP packets. Each such packet begins with an ADU 220 descriptor. The first packet's descriptor has a "C" (continuation) 221 flag of 0; the following packets' descriptors each have a "C" flag of 222 1. Each descriptor, in this case, has the same "ADU size" value: the 223 size of the entire "ADU frame" (not just the portion that will fit 224 within a single RTP packet). Each such packet (even the last one) 225 contains only one "ADU descriptor". 227 4.4 RTP header fields 229 Payload Type: The (static) payload type 14 that was defined for 230 MPEG audio [6] MUST NOT be used. Instead, a different, dynamic 231 payload type MUST be used - i.e., one in the range [96..127]. 233 M bit: This payload format defines no use for this bit. Senders 234 SHOULD set this bit to zero in each outgoing packet. 236 Timestamp: This is a 32-bit 90 kHz timestamp, representing the 237 presentation time of the first ADU packed within the packet. 
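The descriptor layout of section 4.2 and the packing rules of section 4.3 can be sketched as follows. This is an illustrative sketch only: the function names and the "max_payload" parameter are hypothetical, and the sender here always picks the 1-byte descriptor when the ADU frame is smaller than 64 bytes (which section 4.2 permits but does not require).

```python
def encode_descriptor(adu_size: int, continuation: bool = False) -> bytes:
    """Build a 1- or 2-byte ADU descriptor: C flag, T flag, ADU size."""
    if not 0 <= adu_size < (1 << 14):
        raise ValueError("ADU size must fit in 14 bits")
    c = 0x80 if continuation else 0x00
    if adu_size < 64:
        return bytes([c | adu_size])                          # T = 0, 6-bit size
    return bytes([c | 0x40 | (adu_size >> 8), adu_size & 0xFF])  # T = 1, 14-bit size


def decode_descriptor(data: bytes):
    """Return (continuation, adu_size, descriptor_length) from payload bytes."""
    first = data[0]
    continuation = bool(first & 0x80)
    if first & 0x40:                                          # 2-byte descriptor
        return continuation, ((first & 0x3F) << 8) | data[1], 2
    return continuation, first & 0x3F, 1


def packetize(adu_frame: bytes, max_payload: int):
    """Split one ADU frame into RTP payloads, each starting with a descriptor.

    Every descriptor carries the size of the ENTIRE ADU frame; only the
    first payload has C = 0.
    """
    if max_payload <= 2:
        raise ValueError("max_payload must leave room for data")
    size = len(adu_frame)
    payloads = []
    offset = 0
    while True:
        desc = encode_descriptor(size, continuation=offset > 0)
        chunk = adu_frame[offset:offset + max_payload - len(desc)]
        payloads.append(desc + chunk)
        offset += len(chunk)
        if offset >= size:
            return payloads
```

A 418-byte ADU frame with a (hypothetical) 200-byte payload limit is thus carried in three payloads, whose descriptors all report a size of 418 and whose "C" flags are 0, 1, and 1.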
239 4.5 Handling received data 241 Note that no information is lost by converting a sequence of MP3 242 frames to a corresponding sequence of "ADU frames", so a receiving 243 RTP implementation can either feed the ADU frames directly to an 244 appropriately modified MP3 decoder, or convert them back into a 245 sequence of MP3 frames, as described in appendix A.2 below. 247 5. Handling Multiple MPEG Audio Layers 249 The RTP payload format described here is intended only for MPEG-1 or 250 2, layer III audio ("MP3"). In contrast, layer I and layer II frames 251 are self-contained, without a back-pointer to earlier frames. 252 However, it is possible (although unusual) for a sequence of audio 253 frames to consist of a mixture of layer III frames and layer I or II 254 frames. When such a sequence is transmitted, only layer III frames 255 are converted to ADUs; layer I or II frames are sent 'as is' (except 256 for the prepending of an "ADU descriptor"). Similarly, the receiver 257 of a sequence of frames - using this payload format - leaves layer I 258 and II frames untouched (after removing the prepended "ADU 259 descriptor"), but converts layer III frames from "ADU frames" to 260 regular MP3 frames. (Recall that each frame's layer is identified 261 from its 4-byte MPEG header.) 263 If you are transmitting a stream consisting *only* of layer I or 264 layer II frames (i.e., without any MP3 data), then there is no 265 benefit to using this payload format, *unless* you are using the 266 interleaving mechanism described in section 7 below. 268 6. Frame Packetizing and Depacketizing 270 The transmission of a sequence of MP3 frames takes the following 271 steps: 273 MP3 frames 274 -1-> ADU frames 275 -2-> interleaved ADU frames 276 -3-> RTP packets 278 Step 1, the conversion of a sequence of MP3 frames to a corresponding 279 sequence of ADU frames, takes place as described in sections 3 and 280 4.1 above. (Note also the pseudo-code in appendix A.1.) 
282 Step 2 is the reordering of the sequence of ADU frames in an 283 (optional) interleaving pattern, prior to packetization, as described 284 in section 7 below. (Note also the pseudo-code in appendix B.1.) 285 Interleaving helps reduce the effect of packet loss, by distributing 286 consecutive ADU frames over non-consecutive packets. (Note that 287 because of the back-pointer in MP3 frames, interleaving can be 288 applied - in general - only to ADU frames. Thus, interleaving was 289 not possible for RFC 2250.) 291 Step 3 is the packetizing of a sequence of (interleaved) ADU frames 292 into RTP packets - as described in section 4.3 above. Each packet's 293 RTP timestamp is the presentation time of the first ADU that is 294 packed within it. Note that, if interleaving was done in step 2, the 295 RTP timestamps on outgoing packets will not necessarily be 296 monotonically nondecreasing. 298 Similarly, a sequence of received RTP packets is handled as follows: 300 RTP packets 301 -4-> RTP packets ordered by RTP sequence number 302 -5-> interleaved ADU frames 303 -6-> ADU frames 304 -7-> MP3 frames 306 Step 4 is the usual sorting of incoming RTP packets using the RTP 307 sequence number. 309 Step 5 is the depacketizing of ADU frames from RTP packets - i.e., 310 the reverse of step 3. As part of this process, a receiver uses the 311 "C" (continuation) flag in the ADU descriptor to notice when an ADU 312 frame is split over more than one packet (and to discard the ADU 313 frame entirely if one of these packets is lost). 315 Step 6 is the rearranging of the sequence of ADU frames back to its 316 original order (except for ADU frames missing due to packet loss), as 317 described in section 7 below. (Note also the pseudo-code in appendix 318 B.2.) 320 Step 7 is the conversion of the sequence of ADU frames into a 321 corresponding sequence of MP3 frames - i.e., the reverse of step 1. 322 (Note also the pseudo-code in appendix A.2.) 
With an appropriately 323 modified MP3 decoder, an implementation may omit this step; instead, 324 it could feed ADU frames directly to the (modified) MP3 decoder. 326 7. ADU Frame Interleaving 328 In MPEG audio frames (MPEG-1 or 2; all layers) the high-order 11 bits 329 of the 4-byte MPEG header ('syncword') are always all-one (i.e., 330 0xFFE). When reordering a sequence of ADU frames for transmission, 331 we reuse these 11 bits as an "Interleaving Sequence Number" (ISN). 332 (Upon reception, they are replaced with 0xFFE once again.) 334 The structure of the ISN is (a,b), where: 336 - a == bits 0-7: 8-bit Interleave Index (within Cycle) 337 - b == bits 8-10: 3-bit Interleave Cycle Count 339 I.e., the 4-byte MPEG header is reused as follows: 341 0 1 2 3 342 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 343 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 344 |Interleave Idx |CycCt| The rest of the original MPEG header | 345 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 347 Example: Consider the following interleave cycle (of size 8): 348 1,3,5,7,0,2,4,6 349 (This particular pattern has the property that any loss of up to four 350 consecutive ADUs in the interleaved stream will lead to a 351 deinterleaved stream with no gaps greater than one.) 352 This produces the following sequence of ISNs: 354 (1,0) (3,0) (5,0) (7,0) (0,0) (2,0) (4,0) (6,0) (1,1) (3,1) 355 (5,1) etc. 357 So, in this example, a sequence of ADU frames 359 f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 (etc.) 361 would get reordered, in step 2, into: 363 (1,0)f1 (3,0)f3 (5,0)f5 (7,0)f7 (0,0)f0 (2,0)f2 (4,0)f4 (6,0)f6 364 (1,1)f9 (3,1)f11 (5,1)f13 (etc.) 366 and the reverse reordering (along with replacement of the 0xFFE) 367 would occur upon reception. 
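The interleaving example above can be reproduced with a short sketch. It follows the same approach as the pseudo-code in appendix B.1, but is only an illustration: the function names are hypothetical, frames are represented by plain labels, and a trailing partial cycle is simply not emitted.

```python
SYNC_MASK = 0xFFE00000  # the high-order 11 bits of the 4-byte MPEG header


def set_isn(header: int, ii: int, icc: int) -> int:
    """Replace the high-order 11 header bits with the ISN (ii, icc)."""
    return ((ii & 0xFF) << 24) | ((icc & 0x7) << 21) | (header & ~SYNC_MASK & 0xFFFFFFFF)


def restore_sync(header: int) -> int:
    """Put the all-ones syncword bits back after deinterleaving."""
    return header | SYNC_MASK


def interleave(frames, cycle):
    """Reorder frames per 'cycle'; return a list of ((ii, icc), frame) pairs."""
    n = len(cycle)
    inverse = [0] * n
    for position, index in enumerate(cycle):
        inverse[index] = position          # cycle[inverse[i]] == i
    out = []
    icc = 0
    for start in range(0, len(frames) - n + 1, n):
        buffer = [None] * n
        for ii in range(n):
            buffer[inverse[ii]] = ((ii, icc), frames[start + ii])
        out.extend(buffer)                 # pass the whole cycle on at once
        icc = (icc + 1) % 8                # 3-bit Interleave Cycle Count
    return out
```

With frames f0 through f15 and the cycle 1,3,5,7,0,2,4,6, the output begins (1,0)f1 (3,0)f3 (5,0)f5 (7,0)f7 and the second cycle begins (1,1)f9, matching the sequence shown above.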
369 The reason for breaking the ISN into "Interleave Cycle Count" and 370 "Interleave Index" (rather than just treating it as a single 11-bit 371 counter) is to give receivers a way of knowing when an ADU frame 372 should be 'released' to the ADU->MP3 conversion process (step 7 373 above), rather than waiting for more interleaved ADU frames to 374 arrive. E.g., in the example above, when the receiver sees a frame 375 with ISN (<something>,1), it knows that it can release all 376 previously-seen frames with ISN (<something>,0), even if some other 377 (<something>,0) frames remain missing due to packet loss. An 8-bit 378 Interleave Index allows interleave cycles of size up to 256. 380 The choice of an interleaving order can be made independently of RTP 381 packetization. Thus, a simple implementation could choose an 382 interleaving order first, reorder the ADU frames accordingly (step 383 2), then simply pack them sequentially into RTP packets (step 3). 384 However, the size of ADU frames - and thus the number of ADU frames 385 that will fit in each RTP packet - will typically vary, so a 386 more optimal implementation would combine steps 2 and 3, by choosing 387 an interleaving order that better reflected the number of ADU frames 388 packed within each RTP packet. 390 Each receiving implementation of this payload format MUST recognize 391 the ISN and be able to perform deinterleaving of incoming ADU frames 392 (step 6). However, a sending implementation of this payload format 393 MAY choose not to perform interleaving - i.e., by omitting step 2. 394 In this case, the high-order 11 bits in each 4-byte MPEG header would 395 remain at 0xFFE. Receiving implementations would thus see a sequence 396 of identical ISNs (all 0xFFE). They would handle this in the same 397 way as if the Interleave Cycle Count changed with each ADU frame, by 398 simply releasing the sequence of incoming ADU frames sequentially to 399 the ADU->MP3 conversion process (step 7), without reordering. 
(Note 400 also the pseudo-code in appendix B.2.) 402 8. IANA Considerations 404 [Note to RFC Editor: Please replace "XXXX" with this document's 405 RFC number, when it is assigned.] 407 Media type name: audio 409 Media subtype: mpa-robust 411 Required parameters: none 413 Optional parameters: none 415 Encoding considerations: 416 This type is defined only for transfer via RTP, as specified in 417 "RFC XXXX". 419 Security considerations: 420 See the "Security Considerations" section of 421 "RFC XXXX". 423 Interoperability considerations: 424 This encoding is incompatible with both the "audio/mpa" 425 and "audio/mpeg" media types. 427 Published specification: 428 The ISO/IEC MPEG-1 [3] and MPEG-2 [4] audio specifications, 429 and "RFC XXXX". 431 Applications which use this media type: 432 Audio streaming tools (transmitting and receiving) 434 Additional information: none 436 Person & email address to contact for further information: 437 Ross Finlayson 438 finlayson (at) live555.com 440 Intended usage: COMMON 442 Author/Change controller: 443 Author: Ross Finlayson 444 Change controller: IETF AVT Working Group 446 9. SDP usage 448 When conveying information by SDP [7], the encoding name SHALL be 449 "mpa-robust" (the same as the media subtype). An example of the 450 media representation in SDP is: 452 m=audio 49000 RTP/AVP 121 453 a=rtpmap:121 mpa-robust/90000 455 10. Security Considerations 457 If a session using this payload format is being encrypted, and 458 interleaving is being used, then the sender SHOULD ensure that any 459 change of encryption key coincides with a start of a new interleave 460 cycle. Apart from this, the security considerations for this payload 461 format are identical to those noted for RFC 2250 [2]. 463 11. Acknowledgements 465 The suggestion of adding an interleaving option (using the first bits 466 of the MPEG 'syncword' - which would otherwise be all-ones - as an 467 interleaving index) is due to Dave Singer and Stefan Gewinner. 
In 468 addition, Dave Singer provided valuable feedback that helped clarify 469 and improve the description of this payload format. Feedback from 470 Chris Sloan led to the addition of an "ADU descriptor" preceding each 471 ADU frame in the RTP packet. 473 12. Normative References 475 [1] Bradner, S., "Key words for use in RFCs to Indicate Requirement 476 Levels", BCP 14, RFC 2119, March 1997. 478 [2] Hoffman, D., Fernando, G., Goyal, V. and M. Civanlar, "RTP 479 Payload Format for MPEG1/MPEG2 Video", RFC 2250, January 1998. 481 [3] ISO/IEC International Standard 11172-3; "Coding of moving 482 pictures and associated audio for digital storage media up to 483 about 1,5 Mbits/s - Part 3: Audio", 1993. 485 [4] ISO/IEC International Standard 13818-3; "Generic coding of moving 486 pictures and associated audio information - Part 3: Audio", 1998. 488 [5] Handley, M., "Guidelines for Writers of RTP Payload Format 489 Specifications", BCP 36, RFC 2736, December 1999. 491 [6] Schulzrinne, H., S. Casner, "RTP Profile for Audio and Video 492 Conferences with Minimal Control", RFC 3551, July 2003. 494 [7] Handley, M., Jacobson, V. and C. Perkins, "SDP: Session 495 Description Protocol", RFC 4566, July 2006. 497 13. Author's Address 499 Ross Finlayson, 500 Live Networks, Inc. 501 650 Castro St., suite 120-196 502 Mountain View, CA 94041 504 EMail: finlayson (at) live555.com 505 WWW: http://www.live555.com/ 507 14. IPR Notice 509 The IETF takes no position regarding the validity or scope of any 510 Intellectual Property Rights or other rights that might be claimed 511 to pertain to the implementation or use of the technology 512 described in this document or the extent to which any license 513 under such rights might or might not be available; nor does it 514 represent that it has made any independent effort to identify any 515 such rights. Information on the procedures with respect to rights 516 in RFC documents can be found in BCP 78 and BCP 79. 
518 Copies of IPR disclosures made to the IETF Secretariat and any 519 assurances of licenses to be made available, or the result of an 520 attempt made to obtain a general license or permission for the use 521 of such proprietary rights by implementers or users of this 522 specification can be obtained from the IETF on-line IPR repository 523 at http://www.ietf.org/ipr. 525 The IETF invites any interested party to bring to its attention 526 any copyrights, patents or patent applications, or other 527 proprietary rights that may cover technology that may be required 528 to implement this standard. Please address the information to the 529 IETF at ietf-ipr@ietf.org. 531 15. Copyright Notice 533 Copyright (C) The IETF Trust (2007). 535 This document is subject to the rights, licenses and restrictions 536 contained in BCP 78, and except as set forth therein, the authors 537 retain all their rights. 539 This document and the information contained herein are provided on an 540 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 541 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND 542 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS 543 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF 544 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 545 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 547 Appendix A. Translating Between "MP3 Frames" and "ADU Frames" 549 The following 'pseudo code' describes how a sender using this payload 550 format can translate a sequence of regular "MP3 Frames" to "ADU 551 Frames", and how a receiver can perform the reverse translation: from 552 "ADU Frames" to "MP3 Frames". 554 We first define the following abstract data structures: 556 - "Segment": A record that represents either a "MP3 Frame" or an 557 "ADU Frame". 
It consists of the following fields: 558 - "header": the 4-byte MPEG header 559 - "headerSize": a constant (== 4) 560 - "sideInfo": the 'side info' structure, *including* the optional 561 2-byte CRC field, if present 562 - "sideInfoSize": the size (in bytes) of the above structure 563 - "frameData": the remaining data in this frame 564 - "frameDataSize": the size (in bytes) of the above data 565 - "backpointer": the value (expressed in bytes) of the 566 backpointer for this frame 567 - "aduDataSize": the size (in bytes) of the ADU associated with 568 this frame. (If the frame is already an "ADU Frame", then 569 aduDataSize == frameDataSize) 570 - "mp3FrameSize": the total size (in bytes) that this frame would 571 have if it were a regular "MP3 Frame". (If it is already a 572 "MP3 Frame", then mp3FrameSize == headerSize + sideInfoSize + 573 frameDataSize) Note that this size can be derived completely 574 from "header". 576 - "SegmentQueue": A FIFO queue of "Segment"s, with operations 577 - void enqueue(Segment) 578 - Segment dequeue() 579 - Boolean isEmpty() 580 - Segment head() 581 - Segment tail() 582 - Segment previous(Segment): returns the segment prior to a 583 given one 584 - Segment next(Segment): returns the segment after a given one 585 - unsigned totalDataSize(): returns the sum of the 586 "frameDataSize" fields of each entry in the queue 588 A.1 Converting a sequence of "MP3 Frames" to a sequence of "ADU Frames": 590 SegmentQueue pendingMP3Frames; // initially empty 591 while (1) { 592 // Enqueue new MP3 Frames, until we have enough data to generate 593 // the ADU for a frame: 594 do { 595 int totalDataSizeBefore 596 = pendingMP3Frames.totalDataSize(); 598 Segment newFrame = 'the next MP3 Frame'; 599 pendingMP3Frames.enqueue(newFrame); 601 int totalDataSizeAfter 602 = pendingMP3Frames.totalDataSize(); 603 } while (totalDataSizeBefore < newFrame.backpointer || 604 totalDataSizeAfter < newFrame.aduDataSize); 606 // We now have enough data to generate the 
ADU for the most 607 // recently enqueued frame (i.e., the tail of the queue). 608 // (The earlier frames in the queue - if any - must be 609 // discarded, as we don't have enough data to generate 610 // their ADUs.) 611 Segment tailFrame = pendingMP3Frames.tail(); 613 // Output the header and side info: 614 output(tailFrame.header); 615 output(tailFrame.sideInfo); 617 // Go back to the frame that contains the start of our ADU data: 618 int offset = 0; 619 Segment curFrame = tailFrame; 620 int prevBytes = tailFrame.backpointer; 621 while (prevBytes > 0) { 622 curFrame = pendingMP3Frames.previous(curFrame); 623 int dataHere = curFrame.frameDataSize; 624 if (dataHere < prevBytes) { 625 prevBytes -= dataHere; 626 } else { 627 offset = dataHere - prevBytes; 628 break; 629 } 630 } 632 // Dequeue any frames that we no longer need: 633 while (pendingMP3Frames.head() != curFrame) { 634 pendingMP3Frames.dequeue(); 635 } 637 // Output, from the remaining frames, the ADU data that we want: 638 int bytesToUse = tailFrame.aduDataSize; 639 while (bytesToUse > 0) { 640 int dataHere = curFrame.frameDataSize - offset; 641 int bytesUsedHere 642 = dataHere < bytesToUse ? dataHere : bytesToUse; 644 output("bytesUsedHere" bytes from curFrame.frameData, 645 starting from "offset"); 647 bytesToUse -= bytesUsedHere; 648 offset = 0; 649 curFrame = pendingMP3Frames.next(curFrame); 650 } 651 } 653 A.2 Converting a sequence of "ADU Frames" to a sequence of "MP3 Frames": 655 SegmentQueue pendingADUFrames; // initially empty 656 while (1) { 657 while (needToGetAnADU()) { 658 Segment newADU = 'the next ADU Frame'; 659 pendingADUFrames.enqueue(newADU); 661 insertDummyADUsIfNecessary(); 662 } 664 generateFrameFromHeadADU(); 665 } 667 Boolean needToGetAnADU() { 668 // Checks whether we need to enqueue one or more new ADUs before 669 // we have enough data to generate a frame for the head ADU. 
670 Boolean needToEnqueue = True; 672 if (!pendingADUFrames.isEmpty()) { 673 Segment curADU = pendingADUFrames.head(); 674 int endOfHeadFrame = curADU.mp3FrameSize 675 - curADU.headerSize - curADU.sideInfoSize; 676 int frameOffset = 0; 678 while (1) { 679 int endOfData = frameOffset 680 - curADU.backpointer + 681 curADU.aduDataSize; 682 if (endOfData >= endOfHeadFrame) { 683 // We have enough data to generate a 684 // frame. 685 needToEnqueue = False; 686 break; 687 } 689 frameOffset += curADU.mp3FrameSize 690 - curADU.headerSize 691 - curADU.sideInfoSize; 692 if (curADU == pendingADUFrames.tail()) break; 693 curADU = pendingADUFrames.next(curADU); 694 } 695 } 697 return needToEnqueue; 698 } 700 void generateFrameFromHeadADU() { 701 Segment curADU = pendingADUFrames.head(); 703 // Output the header and side info: 704 output(curADU.header); 705 output(curADU.sideInfo); 707 // Begin by zeroing out the rest of the frame, in case the ADU 708 // data doesn't fill it in completely: 709 int endOfHeadFrame = curADU.mp3FrameSize 710 - curADU.headerSize - curADU.sideInfoSize; 711 output("endOfHeadFrame" zero bytes); 713 // Fill in the frame with appropriate ADU data from this and 714 // subsequent ADUs: 715 int frameOffset = 0; 716 int toOffset = 0; 718 while (toOffset < endOfHeadFrame) { 719 int startOfData = frameOffset - curADU.backpointer; 720 if (startOfData > endOfHeadFrame) { 721 break; // no more ADUs are needed 722 } 723 int endOfData = startOfData + curADU.aduDataSize; 724 if (endOfData > endOfHeadFrame) { 725 endOfData = endOfHeadFrame; 726 } 728 int fromOffset; 729 if (startOfData <= toOffset) { 730 fromOffset = toOffset - startOfData; 731 startOfData = toOffset; 732 if (endOfData < startOfData) { 733 endOfData = startOfData; 734 } 735 } else { 736 fromOffset = 0; 738 // leave some zero bytes beforehand: 739 toOffset = startOfData; 740 } 742 int bytesUsedHere = endOfData - startOfData; 743 output(starting at offset "toOffset", "bytesUsedHere" 744 bytes from 
"&curADU.frameData[fromOffset]"); 745 toOffset += bytesUsedHere; 747 frameOffset += curADU.mp3FrameSize 748 - curADU.headerSize - curADU.sideInfoSize; 749 curADU = pendingADUFrames.next(curADU); 750 } 752 pendingADUFrames.dequeue(); 753 } 755 void insertDummyADUsIfNecessary() { 756 // The tail segment (ADU) is assumed to have been recently 757 // enqueued. If its backpointer would overlap the data 758 // of the previous ADU, then we need to insert one or more 759 // empty, 'dummy' ADUs ahead of it. (This situation should 760 // occur only if an intermediate ADU was missing - e.g., due 761 // to packet loss.) 762 while (1) { 763 Segment tailADU = pendingADUFrames.tail(); 764 int prevADUend; // relative to the start of the tail ADU 766 if (pendingADUFrames.head() != tailADU) { 767 // there is a previous ADU 768 Segment prevADU 769 = pendingADUFrames.previous(tailADU); 770 prevADUend 771 = prevADU.mp3FrameSize + 772 prevADU.backpointer 773 - prevADU.headerSize 774 - prevADU.sideInfoSize; 775 if (prevADU.aduDataSize > prevADUend) { 776 // this shouldn't happen if the previous 777 // ADU was well-formed 778 prevADUend = 0; 779 } else { 780 prevADUend -= prevADU.aduDataSize; 781 } 782 } else { 783 prevADUend = 0; 784 } 786 if (tailADU.backpointer > prevADUend) { 787 // Insert a 'dummy' ADU in front of the tail. 788 // This ADU can have the same "header" (and thus 789 // "mp3FrameSize") as the tail ADU, but should 790 // have a "backpointer" of "prevADUend", and 791 // an "aduDataSize" of zero. The simplest 792 // way to do this is to copy the "sideInfo" from 793 // the tail ADU, replace the value of 794 // "main_data_begin" with "prevADUend", and set 795 // all of the "part2_3_length" fields to zero. 
        } else {
            break; // no more dummy ADUs need to be inserted
        }
    }
}

Appendix B: Interleaving and Deinterleaving

The following 'pseudo code' describes how a sender can reorder a
sequence of "ADU Frames" according to an interleaving pattern (step
2), and how a receiver can perform the reverse reordering (step 6).

B.1 Interleaving a sequence of "ADU Frames":

We first define the following abstract data structures:

- "interleaveCycleSize": an integer in the range [1..256]
- "interleaveCycle": an array, of size "interleaveCycleSize",
  containing some permutation of the integers from the set [0 ..
  interleaveCycleSize-1]
  e.g., if "interleaveCycleSize" == 8, "interleaveCycle" might
  contain: 1,3,5,7,0,2,4,6
- "inverseInterleaveCycle": an array containing the inverse of the
  permutation in "interleaveCycle" - i.e., such that
  interleaveCycle[inverseInterleaveCycle[i]] == i
- "ii": the current Interleave Index (initially 0)
- "icc": the current Interleave Cycle Count (initially 0)
- "aduFrameBuffer": an array, of size "interleaveCycleSize", of ADU
  Frames that are awaiting packetization

while (1) {
    int positionOfNextFrame = inverseInterleaveCycle[ii];
    aduFrameBuffer[positionOfNextFrame] = the next ADU frame;
    replace the high-order 11 bits of this frame's MPEG header
        with (ii,icc);
        // Note: Be sure to leave the remaining 21 bits as is
    if (++ii == interleaveCycleSize) {
        // We've finished this cycle, so pass all
        // pending frames to the packetizing step
        for (int i = 0; i < interleaveCycleSize; ++i) {
            pass aduFrameBuffer[i] to the packetizing step;
        }

        ii = 0;
        icc = (icc+1)%8;
    }
}

B.2 Deinterleaving a sequence of (interleaved) "ADU Frames":

We first define the following abstract data structures:

- "ii": the Interleave Index from the current incoming ADU frame
- "icc": the Interleave Cycle Count from the current
  incoming ADU frame
- "iiLastSeen": the most recently seen Interleave Index (initially,
  some integer *not* in the range [0..255])
- "iccLastSeen": the most recently seen Interleave Cycle Count
  (initially, some integer *not* in the range [0..7])
- "aduFrameBuffer": an array, of size 256, of (pointers to) ADU
  Frames that have just been depacketized (initially, all entries
  are NULL)

while (1) {
    aduFrame = the next ADU frame from the depacketizing step;
    (ii,icc) = "the high-order 11 bits of aduFrame's MPEG header";
    "the high-order 11 bits of aduFrame's MPEG header" = 0xFFE;
        // Note: Be sure to leave the remaining 21 bits as is

    if (icc != iccLastSeen || ii == iiLastSeen) {
        // We've started a new interleave cycle
        // (or interleaving was not used). Release all
        // pending ADU frames to the ADU->MP3 conversion step:
        for (int i = 0; i < 256; ++i) {
            if (aduFrameBuffer[i] != NULL) {
                release aduFrameBuffer[i];
                aduFrameBuffer[i] = NULL;
            }
        }
    }

    iiLastSeen = ii;
    iccLastSeen = icc;
    aduFrameBuffer[ii] = aduFrame;
}

Expires: February 28, 2008
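
The reordering described in Appendix B can be tried out with a small
Python sketch. It is illustrative only and is not part of the pseudo
code in B.1 and B.2. It rests on several assumptions: frames are
arbitrary Python objects, the (ii,icc) pair travels alongside each
frame as a tuple instead of being written into the high-order 11 bits
of the MPEG header, the function names "interleave" and "deinterleave"
are hypothetical, and both functions flush their buffers at the end of
a finite input (the pseudo code loops forever and has no such step).

```python
def interleave(frames, interleave_cycle):
    """Sender side, after B.1: return (ii, icc, frame) in send order."""
    n = len(interleave_cycle)
    # Build the inverse permutation, so that
    # interleave_cycle[inverse[i]] == i for every i.
    inverse = [0] * n
    for position, index in enumerate(interleave_cycle):
        inverse[index] = position

    sent = []
    buffer = [None] * n
    ii = 0
    icc = 0
    for frame in frames:
        buffer[inverse[ii]] = (ii, icc, frame)
        ii += 1
        if ii == n:
            # Cycle complete: pass all pending frames on, in buffer order.
            sent.extend(buffer)
            buffer = [None] * n
            ii = 0
            icc = (icc + 1) % 8
    # Assumption: flush a trailing partial cycle at end of input.
    sent.extend(t for t in buffer if t is not None)
    return sent


def deinterleave(tagged_frames):
    """Receiver side, after B.2: return frames in their original order."""
    buffer = [None] * 256
    ii_last_seen = -1   # some integer not in [0..255]
    icc_last_seen = -1  # some integer not in [0..7]
    released = []

    def release_all():
        for i in range(256):
            if buffer[i] is not None:
                released.append(buffer[i])
                buffer[i] = None

    for ii, icc, frame in tagged_frames:
        if icc != icc_last_seen or ii == ii_last_seen:
            # A new interleave cycle has started (or no interleaving).
            release_all()
        ii_last_seen = ii
        icc_last_seen = icc
        buffer[ii] = frame
    # Assumption: flush whatever is still pending at end of input.
    release_all()
    return released
```

With the example pattern 1,3,5,7,0,2,4,6 and 16 frames numbered 0 to
15, the first eight frames are sent in the order 1,3,5,7,0,2,4,6. If
the first two transmitted frames are then lost, the receiver is missing
original frames 1 and 3, which are not adjacent in playout order; this
is the effect that interleaving is meant to achieve.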