idnits 2.17.1 draft-ietf-avt-profile-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-04-25) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity. ** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. 
** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. == No 'Intended status' indicated for this document; assuming Proposed Standard == The page length should not exceed 58 lines per page, but there was 1 longer page, the longest (page 2) being 100 lines == It seems as if not all pages are separated by form feeds - found 0 form feeds but 10 pages Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a Security Considerations section. ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack an Authors' Addresses Section. ** There are 108 instances of too long lines in the document, the longest one being 4 characters in excess of 72. == There are 5 instances of lines with non-RFC2606-compliant FQDNs in the document. Miscellaneous warnings: ---------------------------------------------------------------------------- == Line 102 has weird spacing: '...cations shoul...' == Line 106 has weird spacing: '...erating param...' == Line 146 has weird spacing: '...ampling rate ...' == Line 147 has weird spacing: '... kHz kb/s ...' == Line 189 has weird spacing: '... The libra...' == (2 more instances...) -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (October 20, 1993) is 11145 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard
----------------------------------------------------------------------------

(See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs)

-- Missing reference section? 'IMA' on line 157 looks like a reference

Summary: 10 errors (**), 0 flaws (~~), 10 warnings (==), 3 comments (--).

Run idnits with the --verbose option for more detailed information about the items above.

--------------------------------------------------------------------------------

2 Internet Engineering Task Force    Audio-Video Transport Working Group
3 INTERNET-DRAFT                     H. Schulzrinne
4 draft-ietf-avt-profile-03.txt      AT&T Bell Laboratories
5 October 20, 1993
6 Expires: 12/31/93

8 Sample Profile and Encodings for the Use of RTP for Audio and Video 9 Conferences with Minimal Control

11 Status of this Memo

13 This document is an Internet Draft. Internet Drafts are working documents 14 of the Internet Engineering Task Force (IETF), its Areas, and its Working 15 Groups. Note that other groups may also distribute working documents as 16 Internet Drafts.

18 Internet Drafts are draft documents valid for a maximum of six months. 19 Internet Drafts may be updated, replaced, or obsoleted by other documents 20 at any time. It is not appropriate to use Internet Drafts as reference 21 material or to cite them other than as a ``working draft'' or ``work in 22 progress.''

24 Please check the I-D abstract listing contained in each Internet Draft 25 directory to learn the current status of this or any other Internet Draft.

27 Distribution of this document is unlimited.

29 Abstract

31 This note describes a profile for the use of the real-time 32 transport protocol (RTP) and the associated control protocol, RTCP, 33 within audio and video multiparticipant conferences with minimal 34 control. It provides interpretations of generic fields within the 35 RTP specification suitable for audio and video conferences.
In 36 particular, this document defines a set of default mappings from 37 format index to encodings. 38 The document also describes how audio and video data may be 39 carried within RTP. It defines a set of standard encodings and 40 their names when used within RTP. However, the definitions are 41 independent of the particular transport mechanism used. The 42 descriptions provide pointers to reference implementations and 43 the detailed standards. This document is meant as an aid 44 for implementors of audio, video and other real-time multimedia 45 applications.

47 Contents

49 1 Introduction
51 2 Demultiplexing
53 3 Audio
55    3.1 Encoding-independent recommendations
57    3.2 Recommended Audio Encodings
59    3.3 The RTCP FMT Option for Audio
61    3.4 Port Assignment
63 4 Video
65    4.1 The RTCP FMT Option for Video
67    4.2 Port Assignment
69 5 Miscellaneous
71 6 Address of Author

73 1 Introduction

75 This profile defines aspects of RTP left unspecified in the RTP protocol 76 definition (RFC TBD). This profile is intended for use within audio and 77 video conferences with minimal session control. In particular, no support 78 for the negotiation of parameters or membership control is provided. Other 79 profiles may make different choices for the items specified here. The 80 profile specifies the use of RTP over unicast and multicast UDP as well 81 as ST-II. For unicast UDP and ST-II, references to multicast addresses 82 are to be ignored. The use of this profile is indicated by the use of 83 a media-specific well-known port number. The profile may also be used 84 with other port numbers. For example, the use of a particular session 85 announcement tool could imply use of this profile.
89 2 Demultiplexing

91 For applications which choose to share a single network destination address 92 and port for both audio and video, the default channel identifier for audio 93 is 0 and for video is 1. In that case, the port number for audio is used. 94 This combination should only be used when it is known that all receiving 95 applications can properly demultiplex audio and video.

97 3 Audio

99 3.1 Encoding-independent recommendations

101 The following recommendations are default operating parameters. 102 Applications should be prepared to handle other values. The ranges 103 given are meant to give guidance to application writers, allowing a set 104 of applications conforming to these guidelines to interoperate without 105 additional negotiation. These guidelines are not intended to restrict 106 operating parameters for applications that can negotiate a set of 107 interoperable parameters, e.g., through a conference control protocol.

109 For packetized audio, the default packetization interval should have a 110 duration of 20 ms, unless otherwise noted in Table 1. The packetization 111 interval determines the minimum end-to-end delay; longer packets introduce 112 less header overhead but higher delay and make packet loss more noticeable. 113 For non-interactive applications such as lectures or links with severe 114 bandwidth constraints, a higher packetization delay may be appropriate. For 115 frame-based encodings (marked as F in Table 1 below) such as LPC, CELP 116 and GSM, the sender may choose to combine several frame intervals into a 117 single message. The receiver can tell the number of frames contained in a 118 message since the frame duration is defined as part of the encoding.

120 If multiple channels are used, the left-channel information always precedes 121 the right-channel information.
For more than two channels, the convention 122 of the AIFF-C audio interchange format should be followed. (The 123 AIFF-C specification is available by anonymous ftp at ftp.sgi.com in the 124 file sgi/aiff-c.9.26.91.ps.) For two-channel stereo, the sequence is left, 125 right; for three channels, left, right, center; for quadrophonic systems, 126 front left, front right, rear left, rear right; for four-channel systems, 127 left, center, right, and surround sound; for six-channel systems, left, left 128 center, center, right, right center and surround sound.

130 The sampling frequency should be drawn from the set: 8, 11.025, 16, 22.05, 131 44.1 and 48 kHz.

133 3.2 Recommended Audio Encodings

135 Table 1 shows the names, types (sample vs. frame oriented), per-channel 136 bit rates and default sampling frequencies of recommended encodings. The 137 list is partially drawn from the document "Recommended practices for 138 enhancing digital audio compatibility in multimedia systems", published by 139 the Interactive Multimedia Association, Version 3.00, Oct. 1992 (referenced 140 as [IMA]). The names are for identification only; they correspond to the 141 names used within the Real-Time Transport Protocol (RTP). Other applications 142 may choose different namings. Note that the L16 encoding may be used with 143 different sampling rates. The CCITT changed its name in 1993 to ITU-T; to 144 limit confusion, both the old and new names are used.

146 name   nom. sampling   rate    type   frame   description
147        kHz             kb/s    S/F    ms
148 _________________________________________________________________________
149 L16    48              768     S              16-bit linear, 2's complement
150 L16    44.1            705.6   S
151 L16    22.05           352.8   S
152 L16    11.025          176.4   S
153 G722   16              64      S              CCITT/ITU-T subband ADPCM
154 PCMU   8               64      S              CCITT/ITU-T mu-law PCM
155 PCMA   8               64      S              CCITT/ITU-T A-law PCM
156 G721   8               32      S              CCITT/ITU-T ADPCM
157 IDVI   8               32      S              Intel/DVI ADPCM [IMA]
158 G723   8               24      S              CCITT/ITU-T ADPCM
159 GSM    8               13      F      20      RPE/LTP GSM 06.10
160 1016   8               4.8     F      30      CELP
161 _________________________________________________________________________

163 Table 1: Audio encodings

165 For multi-octet encodings, octets are transmitted in network byte order 166 (i.e., most significant octet first).

168 A detailed description of the encodings is given below. The names shown 169 (L16, PCMU, etc.) are limited to four characters and are suitable 170 for identification in protocols such as RTP (RFC TBD).

172 L16: denotes uncompressed audio data, using 16-bit signed representation 173 with 65535 equally divided steps between minimum and maximum signal 174 level, ranging from -32768 to 32767. The value is represented in two's 175 complement notation.

177 PCMU: specified in CCITT/ITU-T recommendation G.711. Audio data is encoded 178 as eight bits per sample, after companding. Code to convert between 179 linear and mu-law companded data is available in the IMA document.

181 PCMA: specified in CCITT/ITU-T recommendation G.711. Audio data is encoded 182 as eight bits per sample, after companding. Code to convert between 183 linear and A-law companded data is available in the IMA document.

185 G721 through G729: specified in the corresponding CCITT/ITU-T recommendations. 186 Reference implementations for G.721 and G.723 are available 187 as part of the CCITT/ITU-T Software Tool Library (STL) from the 188 ITU General Secretariat, Sales Service, Place des Nations, CH-1211 189 Geneve 20, Switzerland.
The library is covered by a license 190 and is available for anonymous ftp on gaia.cs.umass.edu, file 191 pub/ccitt/ccitt_tools.tar.Z.

193 GSM: (Groupe Special Mobile) denotes the European GSM 06.10 provisional 194 standard for full-rate speech transcoding, prI-ETS 300 036, which 195 is based on RPE/LTP (regular pulse excitation/long term prediction) 196 coding at a rate of 13 kb/s. A reference implementation was written by 197 Carsten Bormann and Jutta Degener (TU Berlin, Germany) and is available 198 for anonymous ftp from tub.cs.tu-berlin.de, directory tub/tubmik.

200 1016: uses code-excited linear prediction (CELP) and is specified in 201 Federal Standard FED-STD 1016, published by the Office of Technology 202 and Standards, Washington, DC 20305-2010.

204 The U. S. DoD's Federal-Standard-1016 based 4800 bps code excited 205 linear prediction voice coder version 3.2 (CELP 3.2) Fortran and 206 C simulation source codes are available for worldwide distribution 207 at no charge (on DOS diskettes, but configured to compile on Sun 208 SPARC stations) from: Bob Fenichel, National Communications System, 209 Washington, D.C. 20305, phone +1-703-692-2124, fax +1-703-746-4960.

211 Example input and processed speech files, a technical information 212 bulletin, and the official standard "Federal Standard 1016, 213 Telecommunications: Analog to Digital Conversion of Radio Voice by 4,800 214 bit/second Code Excited Linear Prediction (CELP)" are included at no 215 charge. According to Vincent Cate (Carnegie Mellon), the distribution 216 is also available for anonymous ftp at furmint.nectar.cs.cmu.edu 217 (128.2.209.111) in directory celp.audio.compression.

219 The following articles describe the Federal-Standard-1016 4.8-kbps 220 CELP coder:

222 Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch, "The 223 Proposed Federal Standard 1016 4800 bps Voice Coder: CELP," Speech 224 Technology Magazine, April/May 1990, p. 58-64.

226 Campbell, Joseph P. Jr., Thomas E.
Tremain and Vanoy C. Welch, "The 227 Federal Standard 1016 4800 bps CELP Voice Coder," Digital Signal 228 Processing, Academic Press, 1991, Vol. 1, No. 3, p. 145-155.

230 Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch, "The 231 DoD 4.8 kbps Standard (Proposed Federal Standard 1016)," in Advances 232 in Speech Coding, ed. Atal, Cuperman and Gersho, Kluwer Academic 233 Publishers, 1991, Chapter 12, p. 121-133.

239 Copies of the FS-1016 document are available for $2.50 each from:

241 GSA Rm 6654
242 7th & D St SW
243 Washington, D.C. 20407
244 1-202-708-9205

246 DVI: is specified in the "Recommended Practices for Enhancing Digital Audio 247 Compatibility in Multimedia Systems", published by the Interactive 248 Multimedia Association (IMA), Annapolis, MD. The document also contains 249 reference implementations for mu-law to 16-bit, ADPCM and sample rate 250 conversions.

252 For sample-based encodings, a receiver should accept packets representing 253 between 0 and 200 ms of audio data.(1) Receivers should be prepared to 254 accept multi-channel audio, but may choose to only play a single channel.

256 All block-oriented audio codecs should be able to encode and decode several 257 consecutive blocks within a single packet. Since the frame size for 258 the block-oriented codecs is given, there is no need to use a separate 259 designation for the same encoding, but with a different number of blocks per 260 packet.

262 3.3 The RTCP FMT Option for Audio

264 Unless specified with the FMT option, the mapping between the format field 265 in an RTP packet and audio encodings, sampling rates and channel counts is 266 specified by Table 2.

268 Format values of 31 and below cannot be redefined by FMT options.
In other 269 words, only values of 32 and above are valid in the format field within an 270 FMT option. The receiver is expected to discard RTP packets containing 271 media data with unknown format field values. Sites are expected to keep 272 the mapping between format and encoding constant, so that lost packets 273 containing FMT options do not lead the receiver to misinterpret media data. 274 Additional standard encodings may be registered with the Internet Assigned 278 Numbers Authority (IANA). The format name is intended to describe the format 279 in an unambiguous way; it is interpreted as a sequence of four ASCII 280 characters, with uppercase and lowercase characters treated as distinct. 281 Format names beginning with the letter 'X' are reserved for experimental use 282 and not subject to registration. These experimental encodings may be mapped 283 to format values 32 and above using the FMT option. Additional standard 284 mappings to format values of 31 and below may also be registered with IANA. 285 Registered assignments are published periodically in the Assigned Numbers 286 RFC.

275 ------------------------------
276 1. This restriction allows reasonable buffer sizing for the receiver.

288 Within the FMT option, the format name is followed by a field containing a 289 channel count and a sample rate field, measured in samples per second.(2) A 290 channel count of zero is considered invalid. A packetization interval of 20 291 ms or a multiple thereof is suggested as it leads to integral sample counts 292 for most common sampling rates; at 11.025 kHz, 20 ms corresponds to 220.5 samples, so an even multiple of 20 ms is needed there.
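
The packetization guidance above can be checked with a short Python sketch. It computes the exact per-channel sample count in one packet for the sampling rates listed in Section 3.1. The names `SAMPLING_RATES_HZ` and `samples_per_packet` are illustrative assumptions and are not defined by this profile.

```python
from fractions import Fraction

# Sampling rates (Hz) listed in Section 3.1 of the profile.
SAMPLING_RATES_HZ = [8000, 11025, 16000, 22050, 44100, 48000]


def samples_per_packet(rate_hz, interval_ms=20):
    """Return the exact per-channel sample count for one packet.

    A Fraction is used so that non-integral counts are visible
    instead of being hidden by rounding.
    """
    return Fraction(rate_hz * interval_ms, 1000)


if __name__ == "__main__":
    for rate in SAMPLING_RATES_HZ:
        count = samples_per_packet(rate)
        note = "" if count.denominator == 1 else "  (not integral)"
        print(f"{rate:>6} Hz: {float(count):g} samples per 20 ms{note}")
```

With these assumptions, every listed rate except 11.025 kHz yields a whole number of samples for a 20 ms packet; calling `samples_per_packet(11025, 40)` shows that a 40 ms interval gives an integral count at that rate as well.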
294 0 1 2 3
295 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
296 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
297 |F| FMT | length |0|0| format | reserved |
298 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
299 | name of format |
300 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
301 | channels | sampling rate (Hz) |
302 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
303 ... encoding specific parameters ...
304 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

306 Figure 1: FMT option for audio encodings

308 3.4 Port Assignment

310 ST-II SAP and UDP port 5005 is the default destination for multicast 311 real-time audio data carried by RTP for this profile.

313 A fixed port number is useful as it is less likely than a randomly chosen 314 port number to be already in use by another application at one or more of 315 the intended destination hosts. Also, fixed port numbers allow traffic 316 statistics to be collected and may simplify firewall implementations. A 317 single fixed port number requires that hosts allow several processes to use 318 a single UDP port with different multicast addresses. (The particular port 319 number was chosen to lie in the range above 5000 to accommodate port number 339 allocation practice within the Unix operating system, where port numbers 340 below 1024 can only be used by privileged processes and port numbers between 341 1024 and 5000 are automatically assigned by the operating system.)

320 ------------------------------
321 2. Fractional samples per second was considered excessive as the typical 322 crystal accuracy of 100 ppm translates into about one Hz or more of 323 sampling rate inaccuracy.

325 index   encoding   sampling rate   channels
326         name       (kHz)
327 __________________________________________
328 0       PCMU       8               1
329 1       1016       8               1
330 2       G721       8               1
331 3       GSM        8               1
332 4       G723       8               1
333 5       IDVI       8               1
334 10      L16        44.1            2
335 __________________________________________

337 Table 2: Standard audio encodings

343 Unicast connections may use this or a set of mutually agreed-upon port 344 numbers.

346 4 Video

348 The following video encodings are currently defined, with their abbreviated 349 names used for identification:

351 CPV: This encoding, "Compressed Packet Video" is implemented by Concept, 352 Bolter, and ViewPoint Systems video codecs.

354 JPEG: The encoding is specified in ISO Standards DIS 10918-1 and DIS 355 10918-2. The data is formatted according to the JFIF (JPEG File 356 Interchange Format) defined by C-Cube Microsystems.

358 H261: The encoding is specified in CCITT/ITU-T standard H.261. The 359 packetization and RTP-specific properties are described in RFC TBD.

361 nv: The encoding is implemented in the program 'nv' developed at Xerox PARC 362 by Ron Frederick.

364 CUSM: The encoding is implemented in the program CU-SeeMe developed at 365 Cornell University by Dick Cogger, Scott Brim, Tim Dorcey and John 366 Lynn.

368 PicW: The encoding is implemented in the program PictureWindow developed at 369 Bolt, Beranek and Newman (BBN).

371 4.1 The RTCP FMT Option for Video

373 Unless specified with the RTCP FMT option, the mapping between the format 374 field in an RTP packet and the video encoding is specified by Table 3. The 375 second paragraph of Section 3.3 applies for video as well.

377 Within the video FMT option, a one-octet numeric version identifier further 378 describes the encoding.
Unless otherwise defined, the version identifier 379 has the value zero.

381 0 1 2 3
382 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
383 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
384 |F| FMT | length |0|0| format | reserved |
385 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
386 | name of format |
387 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
388 | version | encoding-specific parameters |
389 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
390 ... encoding-specific parameters ...
391 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

393 Figure 2: FMT option for video encodings

395 number   name
396 ______________
397 26       JPEG
398 27       CUSM
399 28       nv
400 29       PicW
401 30       Bolt
402 31       H261

404 Table 3: Format values for standard video encodings

406 4.2 Port Assignment

408 ST-II SAP and UDP port 5006 is the default destination for multicast 409 real-time video data carried by RTP for this profile. The remainder of 410 Section 3.4 applies.

412 5 Miscellaneous

414 RTCP messages should be sent periodically, with a period varying randomly 415 around a set mean to avoid synchronized bursts of RTCP packets. (For 416 example, the time between messages could vary uniformly between one half and 417 1.5 times the mean.) The average period between transmissions determines 418 the additional network load due to RTCP packets and also determines how 419 long it will take a new arrival to discover the identities of the other 420 conference participants. The average period should be chosen such that no 421 more than a small fraction (say, 1%) of the media bandwidth is consumed by 422 RTCP messages from all sources, with a minimum period of a few seconds.
423 By scaling the message frequency with the (slowly increasing) number of 424 observed participants, a new conference participant will quickly inform all 425 other participants of its arrival and then slow its announcement rate.

427 6 Address of Author

429 Henning Schulzrinne
430 AT&T Bell Laboratories
431 MH 2A244
432 600 Mountain Avenue
433 Murray Hill, NJ 07974-0636
434 telephone: +1 908 582 2262
435 facsimile: +1 908 582 5809
436 electronic mail: hgs@research.att.com
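
The RTCP timing guidance in Section 5 can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration only: the function names, the 800-bit average RTCP packet size, the 1% bandwidth share used as the default, and the 5-second floor standing in for "a few seconds". The profile itself fixes none of these values.

```python
import random


def mean_rtcp_period(participants, media_bandwidth_bps,
                     rtcp_packet_bits=800, rtcp_fraction=0.01,
                     min_period_s=5.0):
    """Mean reporting period (seconds) for each participant.

    All default values are assumptions, not values from the profile.
    """
    rtcp_budget_bps = rtcp_fraction * media_bandwidth_bps
    # Every participant sends one packet per period, so the aggregate
    # RTCP rate is participants * rtcp_packet_bits / period.
    period = participants * rtcp_packet_bits / rtcp_budget_bps
    # Never report more often than the assumed minimum period.
    return max(period, min_period_s)


def next_rtcp_delay(mean_period_s, rng=random):
    """Draw a delay uniformly between 0.5 and 1.5 times the mean."""
    return rng.uniform(0.5 * mean_period_s, 1.5 * mean_period_s)


if __name__ == "__main__":
    for n in (1, 4, 100):
        mean = mean_rtcp_period(n, media_bandwidth_bps=64000)
        print(f"{n:>3} participants: mean {mean:.1f} s, "
              f"next report in {next_rtcp_delay(mean):.1f} s")
```

Because the mean period grows with the number of observed participants, a newcomer that has seen few participants reports at the minimum period and slows down as its participant count rises, which matches the behavior described in Section 5.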