idnits 2.17.1 

draft-ietf-avt-profile-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.

     Expected boilerplate is as follows today (2024-04-25) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:
        This Internet-Draft is submitted in full conformance with the provisions
        of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:
        Copyright (c) 2024 IETF Trust and the persons identified as the document
        authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:
        This document is subject to BCP 78 and the IETF Trust's Legal Provisions
        Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with
        respect to this document.  Code Components extracted from this
        document must include Simplified BSD License text as described in
        Section 4.e of the Trust Legal Provisions and are provided
        without warranty as described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories. 

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 2) being 100 lines

  == It seems as if not all pages are separated by form feeds - found 0 form
     feeds but 10 pages


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack an Authors' Addresses Section.

  ** There are 108 instances of too long lines in the document, the longest
     one being 4 characters in excess of 72.

  == There are 5 instances of lines with non-RFC2606-compliant FQDNs in the
     document.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Line 102 has weird spacing: '...cations  shoul...'

  == Line 106 has weird spacing: '...erating  param...'

  == Line 146 has weird spacing: '...ampling  rate ...'

  == Line 147 has weird spacing: '...    kHz  kb/s ...'

  == Line 189 has weird spacing: '...    The  libra...'

  == (2 more instances...)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (October 20, 1993) is 11145 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Missing reference section? 'IMA' on line 157 looks like a reference


     Summary: 10 errors (**), 0 flaws (~~), 10 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Internet Engineering Task Force          Audio-Video Transport Working Group
3	INTERNET-DRAFT                                                H. Schulzrinne
4	draft-ietf-avt-profile-03.txt                         AT&T Bell Laboratories
5	                                                            October 20, 1993
6	                                                          Expires:  12/31/93

8	    Sample Profile and Encodings for the Use of RTP for Audio and Video
9	                      Conferences with Minimal Control

11	Status of this Memo

13	This document is an Internet Draft.  Internet Drafts are working documents
14	of the Internet Engineering Task Force (IETF), its Areas, and its Working
15	Groups.   Note that other groups may also distribute working documents as
16	Internet Drafts.

18	Internet Drafts are draft documents valid for a maximum of six months.
19	Internet Drafts may be updated, replaced, or obsoleted by other documents
20	at any time.  It is not appropriate to use Internet Drafts as reference
21	material or to cite them other than as a "working draft" or "work in
22	progress."

24	Please check the I-D abstract listing contained in each Internet Draft
25	directory to learn the current status of this or any other Internet Draft.

27	Distribution of this document is unlimited.

29	                                  Abstract

31	     This note describes a profile for the use of the real-time
32	    transport protocol (RTP) and the associated control protocol, RTCP,
33	    within audio and video multiparticipant conferences with minimal
34	    control.  It provides interpretations of generic fields within the
35	    RTP specification suitable for audio and video conferences.   In
36	    particular, this document defines a set of default mappings from
37	    format index to encodings.
38	     The document also describes how audio and video data may be
39	    carried within RTP. It defines a set of standard encodings and
40	    their names when used within RTP. However, the definitions are
41	    independent of the particular transport mechanism used.    The
42	    descriptions provide pointers to reference implementations and
43	    the detailed standards.    This document is meant as an aid
44	    for implementors of audio, video and other real-time multimedia
45	    applications.

47	Contents

49	1 Introduction

51	2 Demultiplexing

53	3 Audio

55	  3.1 Encoding-independent recommendations

57	  3.2 Recommended Audio Encodings

59	  3.3 The RTCP FMT Option for Audio

61	  3.4 Port Assignment

63	4 Video

65	  4.1 The RTCP FMT Option for Video

67	  4.2 Port Assignment

69	5 Miscellaneous

71	6 Address of Author

73	1 Introduction

75	This profile defines aspects of RTP left unspecified in the RTP protocol
76	definition (RFC TBD).  This profile is intended for use within audio and
77	video conferences with minimal session control.  In particular, no support
78	for the negotiation of parameters or membership control is provided.  Other
79	profiles may make different choices for the items specified here.   The
80	profile specifies the use of RTP over unicast and multicast UDP as well
81	as ST-II. For unicast UDP and ST-II, references to multicast addresses
82	are to be ignored.   The use of this profile is indicated by the use of
83	a media-specific well-known port number.   The profile may also be used
84	with other port numbers.   For example, the use of a particular session
85	announcement tool could imply use of this profile.


89	2 Demultiplexing

91	For applications which choose to share a single network destination address
92	and port for both audio and video, the default channel identifier for audio
93	is 0 and for video is 1.  In that case, the port number for audio is used.
94	This combination should only be used when it is known that all receiving
95	applications can properly demultiplex audio and video.

97	3 Audio

99	3.1 Encoding-independent recommendations

101	The following recommendations are default operating parameters.
102	Applications should be prepared to handle other values.  The ranges
103	given are meant to give guidance to application writers, allowing a set
104	of applications conforming to these guidelines to interoperate without
105	additional negotiation.  These guidelines are not intended to restrict
106	operating parameters for applications that can negotiate a set of
107	interoperable parameters, e.g., through a conference control protocol.

109	For packetized audio, the default packetization interval should have a
110	duration of 20 ms, unless otherwise noted in Table 1.  The packetization
111	interval determines the minimum end-to-end delay; longer packets introduce
112	less header overhead but higher delay and make packet loss more noticeable.
113	For non-interactive applications such as lectures or links with severe
114	bandwidth constraints, a longer packetization interval may be appropriate.  For
115	frame-based encodings (marked as F in Table 1 below) such as LPC, CELP
116	and GSM, the sender may choose to combine several frame intervals into a
117	single message.  The receiver can tell the number of frames contained in a
118	message since the frame duration is defined as part of the encoding.
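
The trade-off between packetization interval and header overhead can be
illustrated with a short calculation.  The following Python sketch is
not part of the profile; the function names are hypothetical, and the
figure of 36 octets of combined IP, UDP and RTP headers is an assumption
made only for the example.

```python
def samples_per_packet(sample_rate_hz, interval_ms=20.0):
    """Samples per channel carried in one packet of the given duration."""
    return sample_rate_hz * interval_ms / 1000.0


def payload_octets(sample_rate_hz, bits_per_sample, channels=1, interval_ms=20.0):
    """Payload size in octets for a sample-based encoding."""
    samples = samples_per_packet(sample_rate_hz, interval_ms)
    return samples * channels * bits_per_sample / 8.0


def header_overhead(payload, header_octets):
    """Fraction of each packet that is spent on headers."""
    return header_octets / (header_octets + payload)


# ASSUMPTION: 36 octets of IP + UDP + RTP headers per packet; the real
# figure depends on the RTP header and options actually in use.
ASSUMED_HEADER_OCTETS = 36

for interval_ms in (20.0, 40.0, 80.0):
    payload = payload_octets(8000, 8, interval_ms=interval_ms)  # PCMU, mono
    share = header_overhead(payload, ASSUMED_HEADER_OCTETS)
    print(f"{interval_ms:4.0f} ms: {payload:5.0f} octets payload, "
          f"{100 * share:4.1f}% header overhead")
```

Longer intervals lower the header share, at the price of added delay
and more audio lost per dropped packet.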

120	If multiple channels are used, the left-channel information always precedes
121	the right-channel information.  For more than two channels, the convention
122	of the AIFF-C audio interchange format should be followed.  (The
123	AIFF-C specification is available by anonymous ftp at ftp.sgi.com in the
124	file sgi/aiff-c.9.26.91.ps.)  For two-channel stereo, the sequence is left,
125	right; for three channels, left, right, center; for quadraphonic systems,
126	front left, front right, rear left, rear right; for four-channel systems,
127	left, center, right, and surround sound; for six-channel systems, left, left
128	center, center, right, right center and surround sound.
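
The channel orderings listed above can be written down as a small lookup
table.  The sketch below is illustrative only: the layout keys and the
interleave() helper are hypothetical names, and sample-by-sample
interleaving is an assumption, since the text above fixes only the order
of the channels.

```python
# Channel order per layout, as listed in the paragraph above.
CHANNEL_ORDER = {
    "stereo": ("left", "right"),
    "three": ("left", "right", "center"),
    "quad": ("front left", "front right", "rear left", "rear right"),
    "four": ("left", "center", "right", "surround"),
    "six": ("left", "left center", "center", "right", "right center",
            "surround"),
}


def interleave(layout, channels):
    """Interleave per-channel sample lists in the order given for `layout`.

    ASSUMPTION: channels are interleaved sample by sample.
    """
    order = CHANNEL_ORDER[layout]
    columns = [channels[name] for name in order]
    if len({len(column) for column in columns}) > 1:
        raise ValueError("all channels must carry the same number of samples")
    return [sample for frame in zip(*columns) for sample in frame]


print(interleave("stereo", {"right": [10, 11], "left": [1, 2]}))
```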

130	The sampling frequency should be drawn from the set:  8, 11.025, 16, 22.05,
131	44.1 and 48 kHz.

133	3.2 Recommended Audio Encodings

135	Table 1 shows the names, types (sample- vs. frame-oriented), per-channel
136	bit rates and default sampling frequencies of recommended encodings.  The
137	list is partially drawn from the document "Recommended practices for
138	enhancing digital audio compatibility in multimedia systems", published by
139	the Interactive Multimedia Association, Version 3.00, Oct. 1992 (referenced
140	as [IMA]).  The names are for identification only; they correspond to the
141	names used within the Real-Time Transport Protocol (RTP).  Other applications
142	may choose different namings.  Note that the L16 encoding may be used with
143	different sampling rates.  The CCITT changed its name in 1993 to ITU-T; to
144	limit confusion, both the old and the new name are used.

146	  name   nom. sampling    rate type frame    description
147	                   kHz    kb/s  S/F    ms
148	 _________________________________________________________________________
149	  L16               48     768   S           16-bit linear, 2's complement
150	  L16             44.1   705.6   S
151	  L16            22.05   352.8   S
152	  L16           11.025   176.4   S
153	  G722              16      64   S           CCITT/ITU-T subband ADPCM
154	  PCMU               8      64   S           CCITT/ITU-T mu-law PCM
155	  PCMA               8      64   S           CCITT/ITU-T A-law PCM
156	  G721               8      32   S           CCITT/ITU-T ADPCM
157	  IDVI               8      32   S           Intel/DVI ADPCM [IMA]
158	  G723               8      24   S           CCITT/ITU-T ADPCM
159	  GSM                8      13   F     20    RPE/LTP GSM 06.10
160	  1016               8     4.8   F     30    CELP
161	 _________________________________________________________________________

163	                         Table 1:  Audio encodings
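
As a consistency check on Table 1, the per-channel rate of a
sample-based encoding divided by its sampling rate gives the bits per
sample, and the rate of a frame-based encoding multiplied by its frame
duration gives the bits per frame.  The Python sketch below is
illustrative only; the row data is copied from Table 1, and the helper
names are hypothetical.

```python
from fractions import Fraction

# (name, sampling rate in kHz, per-channel rate in kb/s), from Table 1.
SAMPLE_BASED = [
    ("L16", "48", "768"),
    ("L16", "44.1", "705.6"),
    ("L16", "22.05", "352.8"),
    ("L16", "11.025", "176.4"),
    ("G722", "16", "64"),
    ("PCMU", "8", "64"),
    ("PCMA", "8", "64"),
    ("G721", "8", "32"),
    ("IDVI", "8", "32"),
    ("G723", "8", "24"),
]

# (name, per-channel rate in kb/s, frame duration in ms), from Table 1.
FRAME_BASED = [("GSM", "13", 20), ("1016", "4.8", 30)]


def bits_per_sample(khz, kbps):
    """kb/s divided by kHz is bits per sample; Fractions keep this exact."""
    return Fraction(kbps) / Fraction(khz)


def bits_per_frame(kbps, frame_ms):
    """kb/s multiplied by ms is bits per frame."""
    return Fraction(kbps) * frame_ms


for name, khz, kbps in SAMPLE_BASED:
    print(name, khz, "kHz:", bits_per_sample(khz, kbps), "bits per sample")
for name, kbps, frame_ms in FRAME_BASED:
    print(name, frame_ms, "ms:", bits_per_frame(kbps, frame_ms), "bits per frame")
```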

165	For multi-octet encodings, octets are transmitted in network byte order
166	(i.e., most significant octet first).
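
For a 16-bit encoding such as L16, network byte order means that the
high-order octet of each sample is sent first.  A minimal Python sketch
follows, using the standard struct module; the function names pack_l16
and unpack_l16 are hypothetical.

```python
import struct


def pack_l16(samples):
    """Pack signed 16-bit samples, most significant octet first."""
    for sample in samples:
        if not -32768 <= sample <= 32767:
            raise ValueError("sample out of 16-bit range: %d" % sample)
    return struct.pack(">%dh" % len(samples), *samples)


def unpack_l16(payload):
    """Inverse of pack_l16; the payload must hold whole samples."""
    if len(payload) % 2:
        raise ValueError("payload is not a whole number of 16-bit samples")
    return list(struct.unpack(">%dh" % (len(payload) // 2), payload))


payload = pack_l16([1, -2, 32767, -32768])
print(payload.hex())
print(unpack_l16(payload))
```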

168	A detailed description of the encodings is given below.  The names shown
169	(L16, PCMU, etc.) are limited to four characters and are suitable for
170	identification in protocols such as RTP (RFC TBD).

172	L16: denotes uncompressed audio data, using 16-bit signed representation
173	    with 65535 equally divided steps between minimum and maximum signal
174	    level, ranging from -32768 to 32767.  The value is represented in two's
175	    complement notation.

177	PCMU: specified in CCITT/ITU-T recommendation G.711.  Audio data is encoded
178	    as eight bits per sample, after companding.  Code to convert between
179	    linear and mu-law companded data is available in the IMA document.

181	PCMA: specified in CCITT/ITU-T recommendation G.711.  Audio data is encoded
182	    as eight bits per sample, after companding.  Code to convert between
183	    linear and A-law companded data is available in the IMA document.

185	G721 through G729: specified in the corresponding CCITT/ITU-T
186	    recommendations.  Reference implementations for G.721 and G.723 are
187	    available as part of the CCITT/ITU-T Software Tool Library (STL) from
188	    the ITU General Secretariat, Sales Service, Place des Nations, CH-1211
189	    Geneve 20, Switzerland.  The library is covered by a license
190	    and is available for anonymous ftp on gaia.cs.umass.edu, file
191	    pub/ccitt/ccitt_tools.tar.Z.

193	GSM: (Groupe Speciale Mobile) denotes the European GSM 06.10 provisional
194	    standard for full-rate speech transcoding, prI-ETS 300 036, which
195	    is based on RPE/LTP (residual pulse excitation/long term prediction)
196	    coding at a rate of 13 kb/s.  A reference implementation was written by
197	    Carsten Bormann and Jutta Degener (TU Berlin, Germany) and is available
198	    for anonymous ftp from tub.cs.tu-berlin.de, directory tub/tubmik.

200	1016: uses code-excited linear prediction (CELP) and is specified in
201	    Federal Standard FED-STD 1016, published by the Office of Technology
202	    and Standards, Washington, DC 20305-2010.

204	    The U. S. DoD's Federal-Standard-1016 based 4800 bps code excited
205	    linear prediction voice coder version 3.2 (CELP 3.2) Fortran and
206	    C simulation source codes are available for worldwide distribution
207	    at no charge (on DOS diskettes, but configured to compile on Sun
208	    SPARC stations) from:  Bob Fenichel, National Communications System,
209	    Washington, D.C. 20305, phone +1-703-692-2124, fax +1-703-746-4960.

211	    Example input and processed speech files, a technical information
212	    bulletin, and the official standard "Federal Standard 1016,
213	    Telecommunications: Analog to Digital Conversion of Radio Voice by 4,800
214	    bit/second Code Excited Linear Prediction (CELP)" are included at no
215	    charge.  According to Vincent Cate (Carnegie Mellon), the distribution
216	    is also available for anonymous ftp at furmint.nectar.cs.cmu.edu
217	    (128.2.209.111) in directory celp.audio.compression.

219	    The following articles describe the Federal-Standard-1016 4.8-kbps
220	    CELP coder:

222	    Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch, "The
223	    Proposed Federal Standard 1016 4800 bps Voice Coder:  CELP," Speech
224	    Technology Magazine, April/May 1990, p.  58-64.

226	    Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch, "The
227	    Federal Standard 1016 4800 bps CELP Voice Coder," Digital Signal
228	    Processing, Academic Press, 1991, Vol.  1, No.  3, p.  145-155.

230	    Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch, "The
231	    DoD 4.8 kbps Standard (Proposed Federal Standard 1016)," in Advances
232	    in Speech Coding, ed.   Atal, Cuperman and Gersho, Kluwer Academic
233	    Publishers, 1991, Chapter 12, p.  121-133.

239	    Copies of the FS-1016 document are available for $2.50 each from:

241	    GSA Rm 6654
242	    7th & D St SW
243	    Washington, D.C. 20407
244	    1-202-708-9205

246	IDVI: is specified in the "Recommended Practices for Enhancing Digital Audio
247	    Compatibility in Multimedia Systems", published by the Interactive
248	    Multimedia Association (IMA), Annapolis, MD. The document also contains
249	    reference implementations for mu-law to 16-bit, ADPCM and sample rate
250	    conversions.

252	For sample-based encodings, a receiver should accept packets representing
253	between 0 and 200 ms of audio data.(1)   Receivers should be prepared to
254	accept multi-channel audio, but may choose to only play a single channel.

256	All block-oriented audio codecs should be able to encode and decode several
257	consecutive blocks within a single packet.  Since the frame size for
258	the block-oriented codecs is given, there is no need to use a separate
259	designation for the same encoding with a different number of blocks per
260	packet.
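
A receiver can apply the two rules above with very little code.  The
following Python sketch is illustrative only.  The rates and frame
durations come from Table 1; the assumption that each frame is padded
up to a whole number of octets, and all function names, are
hypothetical and not defined by this profile.

```python
import math

# (per-channel rate in kb/s, frame duration in ms), from Table 1.
FRAME_BASED = {"GSM": (13.0, 20), "1016": (4.8, 30)}


def frame_octets(name):
    """Octets per frame.

    ASSUMPTION: each frame is padded up to a whole number of octets.
    """
    kbps, frame_ms = FRAME_BASED[name]
    return math.ceil(round(kbps * frame_ms) / 8)


def frames_in_packet(name, payload_len):
    """Return (frame count, duration in ms) for a frame-based payload."""
    size = frame_octets(name)
    count, leftover = divmod(payload_len, size)
    if leftover:
        raise ValueError("payload is not a whole number of frames")
    return count, count * FRAME_BASED[name][1]


def acceptable_duration(num_samples, sample_rate_hz, max_ms=200):
    """Check the 0 to 200 ms window suggested for sample-based encodings."""
    return 0 <= 1000 * num_samples / sample_rate_hz <= max_ms


print(frames_in_packet("GSM", 3 * frame_octets("GSM")))
print(acceptable_duration(1600, 8000), acceptable_duration(1601, 8000))
```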

262	3.3 The RTCP FMT Option for Audio

264	Unless specified with the FMT option, the mapping between the format field
265	in an RTP packet and audio encodings, sampling rates and channel counts is
266	specified by Table 2.

268	Format values of 31 and below cannot be redefined by FMT options.  In other
269	words, only values of 32 and above are valid in the format field within an
270	FMT option.   The receiver is expected to discard RTP packets containing
271	media data with unknown format field values.  Sites are expected to keep
272	the mapping between format and encoding constant, so that lost packets
273	containing FMT options do not lead the receiver to misinterpret media data.
274	Additional standard encodings may be registered with the Internet Assigned
278	Numbers Authority (IANA).  The format name is intended to describe the format
279	in an unambiguous way; it is interpreted as a sequence of four ASCII
280	characters, with uppercase and lowercase characters treated as distinct.
281	Format names beginning with the letter 'X' are reserved for experimental use
282	and not subject to registration.  These experimental encodings may be mapped
283	to format values 32 and above using the FMT option.  Additional standard
284	mappings to format values of 31 and below may also be registered with IANA.
285	Registered assignments are published periodically in the Assigned Numbers
286	RFC.

275	------------------------------
276	 1. This restriction allows reasonable buffer sizing for the receiver.

288	Within the FMT option, the format name is followed by a field containing a
289	channel count and a sample rate field, measured in samples per second.(2)  A
290	channel count of zero is considered invalid.  A packetization interval of 20
291	ms or a multiple thereof is suggested as it leads to integral sample counts
292	for the common sampling rates except 11.025 kHz (220.5 samples in 20 ms).

294	 0                   1                   2                   3
295	 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
296	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
297	|F|     FMT     |    length     |0|0|   format  |    reserved   |
298	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
299	|                        name of format                         |
300	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
301	|    channels   | sampling rate (Hz)                            |
302	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
303	...  encoding specific parameters                               ...
304	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

306	                 Figure 1:  FMT option for audio encodings
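
The fixed part of the layout in Figure 1 can be packed and parsed as
three 32-bit words.  The Python sketch below is illustrative only and
rests on several assumptions that this profile does not settle: the
option type code (FMT_OPTION_TYPE is a placeholder value), the reading
of the F bit as a "final option" flag, a length field counted in 32-bit
words, and space padding for names shorter than four characters.  All
function names are hypothetical.

```python
import struct

# HYPOTHETICAL placeholder: the option type code for FMT is assigned by
# the RTP specification, not by this profile.
FMT_OPTION_TYPE = 32


def pack_audio_fmt(fmt, name, channels, rate_hz, final=False,
                   option_type=FMT_OPTION_TYPE):
    """Build the three fixed words of Figure 1 (no extra parameters)."""
    if not 32 <= fmt <= 63:
        raise ValueError("only format values 32..63 may be defined by FMT")
    if not 1 <= len(name) <= 4:
        raise ValueError("format names have at most four characters")
    if not 1 <= channels <= 255:
        raise ValueError("channel count must be 1..255; zero is invalid")
    if not 0 <= rate_hz < (1 << 24):
        raise ValueError("sampling rate must fit in 24 bits")
    if not 0 <= option_type <= 127:
        raise ValueError("option type must fit in 7 bits")
    length = 3  # ASSUMPTION: counted in 32-bit words, header included
    word1 = ((int(final) << 31) | (option_type << 24)
             | (length << 16) | (fmt << 8))
    padded = name.ljust(4).encode("ascii")  # ASSUMPTION: space padding
    word3 = (channels << 24) | rate_hz
    return struct.pack(">I4sI", word1, padded, word3)


def parse_audio_fmt(data):
    """Decode the three fixed words produced by pack_audio_fmt."""
    word1, raw_name, word3 = struct.unpack(">I4sI", data[:12])
    return {
        "final": bool(word1 >> 31),
        "option_type": (word1 >> 24) & 0x7F,
        "length": (word1 >> 16) & 0xFF,
        "format": (word1 >> 8) & 0x3F,
        "name": raw_name.decode("ascii"),
        "channels": word3 >> 24,
        "rate_hz": word3 & 0xFFFFFF,
    }


option = pack_audio_fmt(32, "L16", 2, 44100)
print(option.hex())
print(parse_audio_fmt(option))
```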

308	3.4 Port Assignment

310	ST-II SAP and UDP port 5005 is the default destination for multicast
311	real-time audio data carried by RTP for this profile.

313	A fixed port number is useful as it is less likely than a randomly chosen
314	port number to be already in use by another application at one or more of
315	the intended destination hosts.   Also, fixed port numbers allow traffic
316	statistics to be collected and may simplify firewall implementations.   A
317	single fixed port number requires that hosts allow several processes to use
318	a single UDP port with different multicast addresses.  (The particular port
319	number was chosen to lie in the range above 5000 to accommodate port number
339	allocation practice within the Unix operating system, where port numbers
340	below 1024 can only be used by privileged processes and port numbers between
341	1024 and 5000 are automatically assigned by the operating system.)

320	------------------------------
321	 2. Fractional samples per second were considered excessive as the typical
322	crystal accuracy of 100 ppm translates into about one Hz or more of
323	sampling rate inaccuracy.

325	                  index encoding sampling rate channels
326	                         name             (kHz)
327	                 __________________________________________
328	                      0 PCMU                 8        1
329	                      1 1016                 8        1
330	                      2 G721                 8        1
331	                      3 GSM                  8        1
332	                      4 G723                 8        1
333	                      5 IDVI                 8        1
334	                     10 L16               44.1        2
335	                 __________________________________________

337	                     Table 2:  Standard audio encodings
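
The format mapping rules of Section 3.3 and the defaults in Table 2 can
be combined into one lookup.  The Python sketch below is illustrative
only; the class and method names are hypothetical.  It treats format
values of 31 and below as fixed, lets FMT options define values of 32
and above, and returns None for unknown values so that the caller can
discard the packet.

```python
# Default audio mappings copied from Table 2:
# format value -> (encoding name, sampling rate in Hz, channels).
STANDARD_AUDIO = {
    0: ("PCMU", 8000, 1),
    1: ("1016", 8000, 1),
    2: ("G721", 8000, 1),
    3: ("GSM", 8000, 1),
    4: ("G723", 8000, 1),
    5: ("IDVI", 8000, 1),
    10: ("L16", 44100, 2),
}


class AudioFormatMap:
    """Resolve the RTP format field to an encoding description."""

    def __init__(self):
        self.dynamic = {}  # mappings learned from FMT options

    def apply_fmt(self, fmt, name, rate_hz, channels):
        """Record a mapping announced by an FMT option."""
        if fmt < 32:
            raise ValueError("values of 31 and below cannot be redefined")
        if channels == 0:
            raise ValueError("a channel count of zero is invalid")
        self.dynamic[fmt] = (name, rate_hz, channels)

    def lookup(self, fmt):
        """Return the mapping, or None if the packet should be discarded."""
        if fmt < 32:
            return STANDARD_AUDIO.get(fmt)
        return self.dynamic.get(fmt)


formats = AudioFormatMap()
formats.apply_fmt(32, "L16", 48000, 2)
for value in (0, 10, 32, 33):
    print(value, formats.lookup(value))
```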

343	Unicast connections may use this port or a set of mutually agreed-upon port
344	numbers.

346	4 Video

348	The following video encodings are currently defined, with their abbreviated
349	names used for identification:

351	CPV: This encoding, "Compressed Packet Video", is implemented by Concept,
352	    Bolter, and ViewPoint Systems video codecs.

354	JPEG: The encoding is specified in ISO Standards DIS 10918-1 and DIS
355	    10918-2.   The data is formatted according to the JFIF (JPEG File
356	    Interchange Format) defined by C-Cube Microsystems.

358	H261: The encoding is specified in CCITT/ITU-T standard H.261.    The
359	    packetization and RTP-specific properties are described in RFC TBD.

361	nv: The encoding is implemented in the program 'nv' developed at Xerox PARC
362	    by Ron Frederick.

364	CUSM: The encoding is implemented in the program CU-SeeMe developed at
365	    Cornell University by Dick Cogger, Scott Brim, Tim Dorcey and John
366	    Lynn.

368	PicW: The encoding is implemented in the program PictureWindow developed at
369	    Bolt, Beranek and Newman (BBN).

371	4.1 The RTCP FMT Option for Video

373	Unless specified with the RTCP FMT option, the mapping between the format
374	field in an RTP packet and the video encoding is specified by Table 3.  The
375	second paragraph of Section 3.3 applies for video as well.

377	Within the video FMT option, a one-octet numeric version identifier further
378	describes the encoding.  Unless otherwise defined, the version identifier
379	has the value zero.

381	 0                   1                   2                   3
382	 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
383	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
384	|F|     FMT     |    length     |0|0|   format  |    reserved   |
385	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
386	|                        name of format                         |
387	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
388	|    version    | encoding-specific parameters                  |
389	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
390	... encoding-specific parameters                                ...
391	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

393	                 Figure 2:  FMT option for video encodings

395	                                number name
396	                               ______________
397	                                26     JPEG
398	                                27     CUSM
399	                                28     nv
400	                                29     PicW
401	                                30     Bolt
402	                                31     H261

404	            Table 3:  Format values for standard video encodings

406	4.2 Port Assignment

408	ST-II SAP and UDP port 5006 is the default destination for multicast
409	real-time video data carried by RTP for this profile.  The remainder of
410	Section 3.4 applies to video as well.

412	5 Miscellaneous

414	RTCP messages should be sent periodically, with a period varying randomly
415	around a set mean to avoid synchronized bursts of RTCP packets.   (For
416	example, the time between messages could vary uniformly between one half and
417	1.5 times the mean.)  The average period between transmissions determines
418	the additional network load due to RTCP packets and also determines how
419	long it will take a new arrival to discover the identities of the other
420	conference participants.  The average period should be chosen such that no
421	more than a small fraction (say, 1%) of the media bandwidth is consumed by
422	RTCP messages from all sources, with a minimum period of a few seconds.
423	By scaling the message frequency with the (slowly increasing) number of
424	observed participants, a new conference participant will quickly inform all
425	other participants of its arrival and then slow its announcement rate.
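
One possible reading of these guidelines is sketched below in Python.
It is illustrative only: the function names are hypothetical, the 1%
fraction is the example figure given above, and the five-second floor
is an assumed value for "a few seconds".

```python
import random


def mean_rtcp_interval(rtcp_packet_octets, participants, media_bandwidth_bps,
                       fraction=0.01, min_interval_s=5.0):
    """Mean reporting period that keeps all RTCP traffic within `fraction`.

    ASSUMPTION: min_interval_s = 5.0 stands in for "a few seconds".
    """
    rtcp_budget_bps = fraction * media_bandwidth_bps
    bits_per_round = 8 * rtcp_packet_octets * participants
    return max(min_interval_s, bits_per_round / rtcp_budget_bps)


def next_rtcp_delay(mean_s, rng=random):
    """Draw a delay uniformly between 0.5 and 1.5 times the mean."""
    return rng.uniform(0.5 * mean_s, 1.5 * mean_s)


# The mean grows with the number of observed participants, so a newcomer
# that has seen few participants reports quickly and then slows down.
for participants in (2, 10, 100):
    mean_s = mean_rtcp_interval(100, participants, 64000)
    print(participants, "participants: mean", mean_s, "s, next report in",
          round(next_rtcp_delay(mean_s), 2), "s")
```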

427	6 Address of Author

429	Henning Schulzrinne
430	AT&T Bell Laboratories
431	MH 2A244
432	600 Mountain Avenue
433	Murray Hill, NJ 07974-0636
434	telephone:  +1 908 582 2262
435	facsimile:  +1 908 582 5809
436	electronic mail:  hgs@research.att.com