idnits 2.17.1 

draft-ietf-avt-rtp-ipmr-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** The document seems to lack a License Notice according IETF Trust
     Provisions of 28 Dec 2009, Section 6.b.i or Provisions of 12 Sep 2009
     Section 6.b -- however, there's a paragraph with a matching beginning.
     Boilerplate error?

     (You're using the IETF Trust Provisions' Section 6.b License Notice from
     12 Feb 2009 rather than one of the newer Notices.  See
     https://trustee.ietf.org/license-info/.)


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document seems to contain a disclaimer for pre-RFC5378 work, and may
     have content which was first submitted before 10 November 2008.  The
     disclaimer is necessary when there are original authors that you have
     been unable to contact, or if some do not wish to grant the BCP78 rights
     to the IETF Trust.  If you are able to get all authors (current and
     original) to grant those rights, you can and should remove the
     disclaimer; otherwise, the disclaimer is needed and you can ignore this
     comment. (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (May 20, 2009) is 5453 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866)

  ** Obsolete normative reference: RFC 5246 (Obsoleted by RFC 8446)


     Summary: 3 errors (**), 0 flaws (~~), 1 warning (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Audio/Video Transport Working Group                            S. Ikonin
2	Internet Draft                                                SPIRIT DSP
3	Intended status: Informational                              May 20, 2009

5	RTP Payload Format for IP-MR Speech Codec draft-ietf-avt-rtp-ipmr-04.txt

7	Status of this Memo

9	This Internet-Draft is submitted to IETF in full conformance with the
10	provisions of BCP 78 and BCP 79.

12	Copyright (c) 2009 IETF Trust and the persons identified as the document
13	authors. All rights reserved.

15	This document is subject to BCP 78 and the IETF Trust's Legal Provisions
16	Relating to IETF Documents in effect on the date of publication of this
17	document (http://trustee.ietf.org/license-info). Please review these
18	documents carefully, as they describe your rights and restrictions with
19	respect to this document.

21	Internet-Drafts are working documents of the Internet Engineering Task
22	Force (IETF), its areas, and its working groups. Note that other groups
23	may also distribute working documents as Internet-Drafts.

25	Internet-Drafts are draft documents valid for a maximum of six months
26	and may be updated, replaced, or obsoleted by other documents at any
27	time. It is inappropriate to use Internet-Drafts as reference material
28	or to cite them other than as "work in progress."

30	The list of current Internet-Drafts can be accessed at
31	http://www.ietf.org/1id-abstracts.html

33	The list of Internet-Draft Shadow Directories can be accessed at
34	http://www.ietf.org/shadow.html

36	This Internet-Draft will expire on November 20, 2009.

38	Abstract

40	This document specifies the payload format for packetization of SPIRIT
41	IP-MR encoded speech signals into the Real-time Transport Protocol
42	(RTP). The payload format supports transmission of multiple frames per
43	payload and the use of redundancy for robustness against packet loss.

45	Table of Contents

47	 1. Introduction
48	 2. IP-MR Codec Description
49	 3. Payload Format
50	    3.1. RTP Header Usage
51	    3.2. Payload Format Structure
52	    3.3. Payload Header
53	    3.4. Speech Table of Contents
54	    3.5. Speech Data
55	    3.6. Redundancy Header
56	    3.7. Redundancy Table of Contents
57	    3.8. Redundancy Data
58	 4. Payload Examples
59	    4.1. Payload Carrying a Single Frame
60	    4.2. Payload Carrying Multiple Frames with Redundancy
61	 5. Media Type Registration
62	    5.1. Registration of media subtype audio/ip-mr_v2.5
63	    5.2. Mapping Media Type Parameters into SDP
64	 6. Security Considerations
65	 7. Congestion Control
66	 8. IANA Considerations
67	 9. Normative References
68	 10. Author(s) Information
69	 11. Disclaimer
70	 12. Legal Terms

72	1. Introduction

74	This document specifies the payload format for packetization of SPIRIT
75	IP-MR encoded speech signals into the Real-time Transport Protocol
76	(RTP). The payload format supports transmission of multiple frames per
77	payload and the use of redundancy for robustness against packet loss.

79	The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
80	"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
81	document are to be interpreted as described in RFC 2119 [RFC 2119].

83	2. IP-MR Codec Description

85	The IP-MR codec is a scalable, adaptive multi-rate wideband speech codec
86	designed by SPIRIT for use in IP-based networks. The codec is suitable
87	for real-time communications such as telephony and videoconferencing.

89	The codec operates on 20 ms frames at a 16 kHz sampling rate and has an
90	algorithmic delay of 25 ms.

92	IP-MR supports six wideband speech coding modes with respective bit
93	rates ranging from about 7.7 to about 34.2 kbps. The coding mode can be
94	changed at any 20 ms frame boundary, making it possible to dynamically
95	adjust the speech encoding rate during a session to adapt to varying
96	transmission conditions.

98	The coded frame consists of multiple coding layers: a base (or core)
99	layer and several enhancement layers, which are coded independently.
100	Only the core layer is required to decode intelligible speech; the
101	upper layers provide quality enhancement. The enhancement layers
102	may be omitted, and the remaining base layer can still be decoded
103	without artifacts. This makes the bit stream scalable and allows the
104	bit rate to be reduced during transmission without re-encoding.

106	This memo specifies an optional form of redundancy coding within RTP
107	for protection against packet loss. It is based on the commonly known
108	scheme in which previously transmitted frames are aggregated together
109	with new ones. Each frame is retransmitted once in the following
110	RTP payload packet. Let f(n-2)...f(n+4) denote a sequence of speech
111	frames, and p(n-1)...p(n+4) a sequence of payload packets:

113	   --+--------+--------+--------+--------+--------+--------+--------+--
114	     | f(n-2) | f(n-1) |  f(n)  | f(n+1) | f(n+2) | f(n+3) | f(n+4) |
115	   --+--------+--------+--------+--------+--------+--------+--------+--

117	      <---- p(n-1) ---->
118	               <----- p(n) ----->
119	                        <---- p(n+1) ---->
120	                                 <---- p(n+2) ---->
121	                                          <---- p(n+3) ---->
122	                                                   <---- p(n+4) ---->

124	Because of the scalable nature of the IP-MR codec, there is no need to
125	duplicate the whole previous frame; only the core layer needs to be
126	retransmitted. This reduces redundancy overhead while retaining the
127	ability to recover speech. Moreover, the speech bits encoded in the core
128	layer are divided into six classes (A to F) of perceptual sensitivity to
129	errors. Selecting which classes are sent as redundancy makes it possible
130	to adjust the trade-off between overhead and robustness against packet
131	loss.
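
As an illustrative sketch only (it is not part of the payload format),
the effect of this scheme on a receiver can be modelled as below. It
assumes one frame per packet and that p(n) carries f(n) in full plus the
core layer of earlier frames. The function name, the status strings and
the "depth" parameter (how many following packets repeat a frame) are
hypothetical names chosen for this example.

```python
def frame_status(frame, lost_packets, depth=1):
    """Classify frame f(frame) given the set of lost packet indices.

    Assumption: packet p(n) carries f(n) in full and the core layer of
    f(n-1) ... f(n-depth) as redundancy (depth=1 matches the diagram).
    """
    if frame not in lost_packets:
        return "full"            # the primary copy arrived
    for k in range(1, depth + 1):
        if frame + k not in lost_packets:
            return "core-only"   # rebuilt from redundancy in p(frame+k)
    return "lost"                # every copy of the frame was lost


# Packet p(3) is lost: f(3) can still be rebuilt from p(4).
print([frame_status(n, {3}) for n in range(2, 5)])
```

With depth=1 a single lost packet still yields core-layer speech, while
two consecutive losses leave the earlier frame unrecoverable.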

132	The mechanism described does not require signaling at session
133	setup. The sender is responsible for selecting an appropriate amount of
134	redundancy based on feedback about the channel conditions.

136	The main codec characteristics can be summarized as follows:

138	    o Wideband, 16 kHz, speech codec

140	    o Adaptive multi rate with six modes from about 7.7 to about
141	      34.2 kbps

143	    o Bit rate scalable

145	    o Variable bit rate changing in accordance with actual speech
146	      content

148	    o Discontinuous Transmission (DTX), silence suppression and
149	      comfort noise generation

151	    o In-band redundancy scheme for protection against packet loss

153	3. Payload Format

155	The main purpose of the payload design for IP-MR is to maximize the
156	potential of the codec with as little overhead as possible. The payload
157	format allows changing parameters of the codec (such as bit rate,
158	level of scalability, DTX and redundancy mode) without re-negotiation
159	at any packet boundary. This makes it possible to dynamically adjust
160	streaming parameters in accordance with changing network conditions.
161	The payload format also supports aggregation of multiple consecutive
162	frames (up to 4) in a payload, which allows controlling the trade-off
163	between delay and header overhead.

165	3.1. RTP Header Usage

167	The RTP timestamp corresponds to the sampling instant of the first
168	sample encoded for the first frame-block in the packet. The timestamp
169	clock frequency SHALL be 16 kHz. The duration of one frame is 20 ms,
170	corresponding to 320 samples at 16 kHz. Thus the timestamp is increased
171	by 320 for each consecutive frame. The timestamp is also used to recover
172	the correct decoding order of the frame-blocks.
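
The timestamp arithmetic above can be sketched as follows. This is a
minimal illustration that assumes packets are sent back to back with no
DTX gap; the constant and function names are hypothetical. The 32-bit
wrap-around follows the RTP timestamp field width in [RFC 3550].

```python
CLOCK_RATE_HZ = 16000   # timestamp clock frequency from this section
FRAME_MS = 20           # duration of one IP-MR frame

# 20 ms at 16 kHz corresponds to 320 timestamp units per frame.
SAMPLES_PER_FRAME = CLOCK_RATE_HZ * FRAME_MS // 1000


def next_timestamp(timestamp, frames_in_packet):
    """Timestamp of the packet that follows one carrying N frames."""
    return (timestamp + SAMPLES_PER_FRAME * frames_in_packet) % 2**32


print(SAMPLES_PER_FRAME, next_timestamp(0, 3))
```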

174	The RTP header marker bit (M) SHALL be set to 1 whenever the first
175	frame-block carried in the packet is the first frame-block in a
176	talkspurt (see definition of the talkspurt in Section 4.1 [RFC 3551]).
177	For all other packets, the marker bit SHALL be set to zero (M=0).

179	The assignment of an RTP payload type for the format defined in this
180	memo is outside the scope of this document. The RTP profiles in use
181	currently mandate binding the payload type dynamically for this payload
182	format. This is basically necessary because the payload type expresses
183	the configuration of the payload itself, i.e. basic or interleaved mode,
184	and the number of channels carried.

186	The remaining RTP header fields are used as specified in [RFC 3550].

188	3.2. Payload Format Structure

190	The IP-MR payload format consists of a payload header with general
191	information about the packet, a speech table of contents (TOC), and
192	speech data. An optional redundancy section follows the speech data.
193	The redundancy section consists of a redundancy header, a redundancy
194	TOC and redundancy data.

196	The following diagram shows the standard payload format layout:

198	  +---------+--------+--------+- - - - - - +- - - - - - +- - - - - - +
199	  | payload | speech | speech | redundancy | redundancy | redundancy |
200	  | header  | TOC    | data   | header     | TOC        | data       |
201	  +---------+--------+--------+- - - - - - +- - - - - - +- - - - - - +

203	3.3. Payload Header

205	The payload header has the following format:

207	                           0                   1
208	                           0 1 2 3 4 5 6 7 8 9 0 1
209	                          +-+-+-+-+-+-+-+-+-+-+-+-+
210	                          |T| CR  | BR  |D|A|GR |R|
211	                          +-+-+-+-+-+-+-+-+-+-+-+-+

213	    o T (1 bit): reserved for compatibility with future extensions.
214	      SHOULD be set to 0.

216	    o CR (3 bits): coding rate of frame(s) in this packet, as per the
217	       following table:
218	                          +-------+--------------+
219	                          |  CR   | avg. bitrate |
220	                          +-------+--------------+
221	                          |   0   |   7.7 kbps   |
222	                          |   1   |   9.8 kbps   |
223	                          |   2   |  14.3 kbps   |
224	                          |   3   |  20.8 kbps   |
225	                          |   4   |  27.9 kbps   |
226	                          |   5   |  34.2 kbps   |
227	                          |   6   |  (reserved)  |
228	                          |   7   |   NO_DATA    |
229	                          +-------+--------------+

231	The CR value 7 (NO_DATA) indicates that there is no speech data (and
232	accordingly no speech TOC) in the payload. This MAY be used to transmit
233	redundancy data only. The value 6 is reserved. If this value is
234	received, the packet SHOULD be discarded.

236	    o BR (3 bits): base rate for the core layer of frame(s) in this
237	      packet, using the table for CR. Values in the range 0-5 indicate
238	      core layer bitrates, as for CR; a packet carrying any other BR
239	      value SHOULD be discarded. The base rate is the lowest rate for
240	      scalability, so the speech payload cannot be scaled below the BR
241	      value. If a received packet has BR > CR, decoding assumes BR = CR.

243	    o D (1 bit): indicates if the DTX mode is allowed or not.

245	    o A (1 bit): byte-aligned payload. If A=1 then all speech frames
246	      MUST be byte-aligned. This mode speeds up speech data access.
247	      The A=0 value specifies bandwidth-efficient mode with no byte
248	      alignment (including the end of the header).

250	    o GR (2 bits): number of frames in packet (grouping size). Actual
251	      grouping size is GR + 1, thus maximum grouping supported is 4.

253	    o R (1 bit): redundancy presence bit. If R=1 then the packet
254	      contains redundancy information for recovery of lost packets.
255	      In this case the redundancy section follows the speech data.
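
The field layout above can be read with a few shifts and masks. The
following is a minimal sketch, not a reference implementation: it assumes
the header occupies the 12 most significant bits of the first two payload
octets (bit 0 in the diagram being the most significant bit), and the
type and function names are hypothetical.

```python
from typing import NamedTuple


class PayloadHeader(NamedTuple):
    t: int       # reserved bit
    cr: int      # coding rate index (7 = NO_DATA, 6 = reserved)
    br: int      # base rate index, clamped so that br <= cr
    d: int       # DTX allowed
    a: int       # byte-aligned mode
    frames: int  # grouping size, GR + 1
    r: int       # redundancy section present


def parse_payload_header(payload: bytes) -> PayloadHeader:
    if len(payload) < 2:
        raise ValueError("payload too short for the 12-bit header")
    bits = (payload[0] << 4) | (payload[1] >> 4)   # first 12 bits
    cr = (bits >> 8) & 0b111
    br = (bits >> 5) & 0b111
    return PayloadHeader(
        t=(bits >> 11) & 1,
        cr=cr,
        br=min(br, cr),          # BR > CR is treated as BR = CR
        d=(bits >> 4) & 1,
        a=(bits >> 3) & 1,
        frames=((bits >> 1) & 0b11) + 1,
        r=bits & 1,
    )


# First two octets of a payload with D=1, A=1, GR=2, R=1.
print(parse_payload_header(bytes([0x01, 0xDA])))
```

A receiver would normally check CR for the reserved value 6 right after
this step and discard the packet in that case.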

257	3.4. Speech Table of Contents

259	The speech TOC contains one entry for each frame in the packet (GR + 1
260	entries in total). Each entry contains a single field:

262	                                   0
263	                                  +-+
264	                                  |E|
265	                                  +-+

267	    o E (1 bit): frame existence indicator. If set to 0, this indicates
268	      the corresponding frame is absent and the receiver should set the
269	      special LOST_FRAME flag for the decoder. This can be caused by a
270	      lost frame or by empty frames generated by the encoder
271	      during silence intervals in DTX mode.

273	Note that if the CR field of the payload header is 7 (NO_DATA) then the
274	speech TOC is empty.
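
Reading the speech TOC can be sketched as below. The sketch assumes the
TOC starts immediately after the 12-bit payload header and that bits are
numbered from the most significant bit of each octet; the helper names
are hypothetical.

```python
HEADER_BITS = 12  # size of the payload header that precedes the TOC


def read_bit(data: bytes, bit_index: int) -> int:
    """Return one bit, counting from the MSB of the first octet."""
    return (data[bit_index // 8] >> (7 - bit_index % 8)) & 1


def parse_speech_toc(payload: bytes, cr: int, frames: int) -> list:
    """Return the E bit of each frame; empty when CR is 7 (NO_DATA)."""
    if cr == 7:
        return []
    return [read_bit(payload, HEADER_BITS + i) for i in range(frames)]


# Three frames, the middle one absent (E=0).
print(parse_speech_toc(bytes([0x01, 0xDA]), cr=0, frames=3))
```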

276	3.5. Speech Data

278	Speech data of a payload contains one or more speech frames or comfort
279	noise frames, as specified in the speech TOC of the payload.

281	Each speech frame represents 20 ms of speech encoded with the rate
282	indicated in the CR field and the base rate indicated in the BR field
283	of the payload header.
284	The size of a coded speech frame is variable due to the nature of the
285	codec. The encoder decides the size of each frame and returns it after
286	encoding. In order to save bandwidth, the size is not carried in the
287	payload. The decoder can calculate the frame size from the frame
288	content and return it to the application, so the size of each frame
289	can be obtained on the receiving side. In addition, there is a special
290	service function that returns the frame size without full decoding,
291	which may be used for this purpose.

293	3.6. Redundancy Header

295	If a packet contains redundancy (the R field of the payload header is 1),
296	the speech data is followed by the redundancy header:

298	                             0 1 2 3 4 5
299	                            +-+-+-+-+-+-+
300	                            | CL1 | CL2 |
301	                            +-+-+-+-+-+-+

303	The redundancy header consists of two fields. Each field contains a
304	class specifier for the amount of redundancy taken from the preceding
305	packet (CL1) and the pre-preceding packet (CL2), i.e. the packets 1 and
306	2 packets before the current one, respectively. The values are listed
307	in the table below:

309	                     +-------+-------------------+
310	                     |  CL   | amount redundancy |
311	                     +-------+-------------------+
312	                     |   0   |       NONE        |
313	                     |   1   |      CLASS A      |
314	                     |   2   |      CLASS B      |
315	                     |   3   |      CLASS C      |
316	                     |   4   |      CLASS D      |
317	                     |   5   |      CLASS E      |
318	                     |   6   |      CLASS F      |
319	                     |   7   |     (reserved)    |
320	                     +-------+-------------------+

322	Each specifier takes 3 bits, thus the total redundancy header size is 6
323	bits.
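
Splitting the 6-bit redundancy header into its two specifiers can be
sketched as follows. The sketch assumes the six bits have already been
extracted from the payload into an integer with CL1 in the upper three
bits; the names are hypothetical, and how a receiver reacts to the
reserved value 7 is left open here because this memo does not define it.

```python
REDUNDANCY_CLASS = {
    0: "NONE", 1: "CLASS A", 2: "CLASS B", 3: "CLASS C",
    4: "CLASS D", 5: "CLASS E", 6: "CLASS F", 7: "(reserved)",
}


def parse_redundancy_header(six_bits: int) -> tuple:
    """Return (CL1, CL2) from the 6-bit redundancy header value."""
    if not 0 <= six_bits < 64:
        raise ValueError("redundancy header is exactly 6 bits")
    cl1 = six_bits >> 3       # preceding packet
    cl2 = six_bits & 0b111    # pre-preceding packet
    return cl1, cl2


cl1, cl2 = parse_redundancy_header(0b010001)
print(REDUNDANCY_CLASS[cl1], REDUNDANCY_CLASS[cl2])
```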

325	3.7. Redundancy Table of Contents

327	                    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+
328	                    | Pkt1 Entries| Pkt2 Entries|
329	                    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+

331	The redundancy TOC contains entries for redundancy frames from the
332	preceding and pre-preceding packets. Each entry takes 1 bit, like a
333	speech TOC entry (Section 3.4):

335	                                   0
336	                                  +-+
337	                                  |E|
338	                                  +-+

340	    o E (1 bit): frame existence indicator. If set to 0, this indicates
341	      the corresponding frame is absent.

343	    o For each preceding and pre-preceding packet the number of entries
344	      is equal to the grouping size of the current packet. Thus the
345	      maximum number of entries is 4*2 = 8.

347	    o If the class specifier in the redundancy header is CL=0 (NONE)
348	      then there are no entries for the corresponding packet redundancy.
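
The rules above determine the size of the redundancy TOC from values that
are already known to the receiver. A minimal sketch, with a hypothetical
function name:

```python
def redundancy_toc_entries(frames_in_packet: int, cl1: int, cl2: int) -> int:
    """Number of 1-bit entries in the redundancy TOC.

    Each of the two earlier packets contributes one entry per frame of
    the current packet, unless its class specifier is 0 (NONE).
    """
    if not 1 <= frames_in_packet <= 4:
        raise ValueError("grouping size must be between 1 and 4")
    packets_with_redundancy = sum(1 for cl in (cl1, cl2) if cl != 0)
    return frames_in_packet * packets_with_redundancy


print(redundancy_toc_entries(4, 1, 1))
```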

350	3.8. Redundancy Data

352	Redundancy data of a payload contains redundancy information for one or
353	more speech frames or comfort noise frames that may be lost during
354	transmission, as specified in the redundancy TOC of the payload. The
355	redundancy is the most important part of the preceding frames, each
356	representing 20 ms of speech. This data MAY be used for partial
357	reconstruction of lost frames. The amount of available redundancy is
358	specified by the CL fields in the redundancy header (Section 3.6).
359	These values SHOULD be passed to the decoder. The length of a redundancy
360	frame is variable and can be calculated after decoding.

362	4. Payload Examples

364	A few examples to highlight the payload format follow.

366	4.1. Payload Carrying a Single Frame

368	The following diagram shows a standard IP-MR payload carrying a single
369	speech frame without redundancy:

371	   0                   1                   2                   3
372	   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
373	  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
374	  |0|CR=1 |BR=0 |0|0|0 0|0|1|sp(0)                                |
375	  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
376	  |                                                               |
377	  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
378	  |                                                               |
379	  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
380	  |                                                               |
381	  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
382	  |                                                               |
383	  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
384	  |                                                               |
385	  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
386	  |                      sp(193)|P|
387	  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

389	In the payload the speech frame is not damaged at the IP origin (E=1),
390	the coding rate is 9.8 kbps (CR=1), the base rate is 7.7 kbps (BR=0), and
391	the DTX mode is off. There is no byte alignment (A=0) and no redundancy
392	(R=0). The encoded speech bits - sp(0) to sp(193) - are placed
393	immediately after the TOC. Finally, one zero bit is added at the end as
394	padding to make the payload byte aligned.
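
The layout of this example can be reproduced with a small bit-packing
sketch. It is illustrative only: the speech bits are passed in as a
string of '0' and '1' characters standing in for real encoder output, and
the function name is hypothetical. D, A, GR and R are fixed to zero as in
the diagram.

```python
def pack_single_frame(cr: int, br: int, speech_bits: str) -> bytes:
    """Build a bandwidth-efficient payload with one frame, no redundancy."""
    header = (
        "0"                    # T, reserved
        + format(cr, "03b")    # CR
        + format(br, "03b")    # BR
        + "0"                  # D: DTX off
        + "0"                  # A: no byte alignment
        + "00"                 # GR: one frame
        + "0"                  # R: no redundancy
    )
    toc = "1"                  # E=1, the frame is present
    bits = header + toc + speech_bits
    bits += "0" * (-len(bits) % 8)   # zero padding to an octet boundary
    return int(bits, 2).to_bytes(len(bits) // 8, "big")


payload = pack_single_frame(cr=1, br=0, speech_bits="0" * 194)
print(len(payload), payload[:2].hex())
```

With 194 speech bits the header, TOC and data take 207 bits, so exactly
one padding bit is needed and the payload is 26 octets long.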

396	4.2. Payload Carrying Multiple Frames with Redundancy

398	The following diagram shows a payload that contains three frames, one of
399	them with no speech data. The coding rate is 7.7 kbps (CR=0), the base
400	rate is 7.7 kbps (BR=0), and the DTX mode is on. The speech frames are
401	byte aligned (A=1), so 1 zero bit is added at the end of the header.
402	Besides the speech frames, the payload contains six redundancy frames
403	(three for each delayed packet).

405	The first speech frame consists of bits sp1(0) to sp1(92). After that 3
406	bits are added for byte alignment. The second frame does not contain any
407	speech information, which is indicated in the payload by its TOC entry.
408	The third frame consists of bits sp3(0) to sp3(171).

410	The redundancy header follows the speech data. The one-packet-delayed
411	redundancy contains class A+B bits (CL1=2), and the two-packet-delayed
412	redundancy contains class A bits (CL2=1). The one-packet-delayed
413	redundancy contains three frames with 20, 39 and 35 bits respectively.

415	The first frame of the two-packet-delayed redundancy is absent, as
416	indicated by its TOC entry, and the two other frames have sizes of 15
417	and 19 bits.

419	Note that all speech frames are padded with zero bits for byte
420	alignment.

422	   0                   1                   2                   3
423	   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
424	  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
425	  |0|CR=0 |BR=0 |1|1|1 0|1|1 0 1|P|sp1(0)                         |
426	  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
427	  |                                                               |
428	  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
429	  |                                                               |
430	  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
431	  |                  sp1(92)|P|P|P|sp3(0)                         |
432	  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
433	  |                                                               |
434	  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
435	  |                                                               |
436	  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
437	  |                                                               |
438	  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
439	  |                                                               |
440	  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
441	  |                                               sp3(171)|P|P|P|P|
442	  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
443	  |CL1=2|CL2=1|1 1 1|0 1 1|red1_1(0)                    red1_1(19)|
444	  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
445	  |red1_2(0)                                                      |
446	  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
447	  |   red1_2(38)|red1_3(0)                                        |
448	  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
449	  |         red1_3(34)|red2_2(0)          red2_2(14)|red2_3(0)    |
450	  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
451	  |             red2_3(18)|P|P|P|P|
452	  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
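
The bit accounting of this example can be checked with a short sketch.
It assumes, as the diagram shows, that in byte-aligned mode padding is
inserted after the header plus speech TOC, after each present speech
frame, and once at the end of the payload, while the redundancy section
is packed without internal padding. The function name is hypothetical.

```python
def pad(bits: int) -> int:
    """Zero bits needed to reach the next octet boundary."""
    return -bits % 8


def aligned_payload_bits(speech_frames, toc_entries,
                         redundancy_frames, redundancy_toc_entries,
                         has_redundancy=True) -> int:
    total = 12 + toc_entries            # payload header + speech TOC
    total += pad(total)
    for size in speech_frames:          # present frames only
        total += size
        total += pad(total)
    if has_redundancy:
        total += 6 + redundancy_toc_entries + sum(redundancy_frames)
    total += pad(total)                 # final padding
    return total


bits = aligned_payload_bits(
    speech_frames=[93, 172],            # sp1 and sp3; frame 2 is absent
    toc_entries=3,
    redundancy_frames=[20, 39, 35, 15, 19],
    redundancy_toc_entries=6,
)
print(bits, bits // 8)
```

The result matches the diagram: 288 bits up to the end of sp3 and its
padding, followed by a 144-bit redundancy section.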

454	5. Media Type Registration

456	This section describes the media types and names associated with this
457	payload format.

459	5.1. Registration of media subtype audio/ip-mr_v2.5

461	Type name: audio

463	Subtype name: ip-mr_v2.5

465	Required parameters: none

467	Optional parameters:

469	* ptime: Gives the length of time in milliseconds represented by the
470	media in a packet. Allowed values are: 20, 40, 60 and 80.

472	Encoding considerations: This media type is framed binary data (see RFC
473	4288, Section 4.8).

475	Security considerations: See RFC 3550 [RFC 3550]

477	Interoperability considerations: none

479	Published specification: RFC XXXX

481	Applications that use this media type: Real-time audio applications like
482	voice over IP and teleconference, and multi-media streaming.

484	Additional information: none

486	Person & email address to contact for further information:
487	Elena Berlizova
488	berlizova@spiritdsp.com

490	Intended usage: COMMON

492	Restrictions on usage: This media type depends on RTP framing, and hence
493	is only defined for transfer via RTP [RFC 3550].

495	Author:
496	Sergey Ikonin <ikonin@spiritdsp.com>

498	Change controller: IETF Audio/Video Transport working group delegated
499	from the IESG.

501	5.2. Mapping Media Type Parameters into SDP

503	The information carried in the media type specification has a specific
504	mapping to fields in the Session Description Protocol (SDP) [RFC 4566],
505	which is commonly used to describe RTP sessions. When SDP is used to
506	specify sessions employing the IP-MR codec, the mapping is as follows:

508	    o The media type ("audio") goes in SDP "m=" as the media name.

510	    o The media subtype (payload format name) goes in SDP "a=rtpmap"
511	    as the encoding name. The RTP clock rate in "a=rtpmap" MUST be 16000.

513	    o The parameter "ptime" goes in the SDP "a=ptime" attribute.

515	Any remaining parameters go in the SDP "a=fmtp" attribute by copying
516	them directly from the media type parameter string as a semicolon-
517	separated list of parameter=value pairs.
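
The mapping can be illustrated by generating the corresponding SDP
lines. This is a sketch under stated assumptions: the port number and
payload type in the usage line are arbitrary example values, the RTP/AVP
profile is assumed, and the function name is hypothetical. The dynamic
payload type range 96-127 comes from [RFC 3551].

```python
ALLOWED_PTIME = (20, 40, 60, 80)   # values permitted for "ptime"


def sdp_media_lines(port: int, payload_type: int, ptime=None) -> list:
    """Return the SDP media-level lines for one IP-MR stream."""
    if not 96 <= payload_type <= 127:
        raise ValueError("a dynamic payload type (96-127) is expected")
    lines = [
        f"m=audio {port} RTP/AVP {payload_type}",
        f"a=rtpmap:{payload_type} ip-mr_v2.5/16000",
    ]
    if ptime is not None:
        if ptime not in ALLOWED_PTIME:
            raise ValueError("ptime must be 20, 40, 60 or 80")
        lines.append(f"a=ptime:{ptime}")
    return lines


print("\n".join(sdp_media_lines(49120, 97, ptime=40)))
```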

519	Note that the payload format (encoding) names are commonly shown in
520	upper case. Media subtypes are commonly shown in lower case. These
521	names are case-insensitive in both places.

523	6. Security Considerations

525	RTP packets using the payload format defined in this specification
526	are subject to the security considerations discussed in the RTP
527	specification [RFC 3550] and in any applicable RTP profile. The main
528	security considerations for the RTP packet carrying the RTP payload
529	format defined within this memo are confidentiality, integrity, and
530	source authenticity. Confidentiality is achieved by encryption of the
531	RTP payload. Integrity of the RTP packets is achieved through a suitable
532	cryptographic integrity protection mechanism. Such a cryptographic
533	system may also allow the authentication of the source of the payload.

535	A suitable security mechanism for this RTP payload format should
536	provide confidentiality, integrity protection, and at least source
537	authentication capable of determining if an RTP packet is from a
538	member of the RTP session.

540	Note that the appropriate mechanism to provide security to RTP and
541	payloads following this memo may vary. It is dependent on the
542	application, the transport, and the signaling protocol employed.
543	Therefore, a single mechanism is not sufficient, although if suitable,
544	usage of the Secure Real-time Transport Protocol (SRTP) [RFC 3711] is
545	recommended.  Other mechanisms that may be used are IPsec [RFC 4301]
546	and Transport Layer Security (TLS) [RFC 5246] (RTP over TCP); other
547	alternatives may exist.

549	This payload format does not exhibit any significant non-uniformity in
550	the receiver side computational complexity for packet processing, and
551	thus is unlikely to pose a denial-of-service threat due to the receipt
552	of pathological data.

554	7. Congestion Control

556	The general congestion control considerations for transporting RTP data
557	apply; see RTP [RFC 3550] and any applicable RTP profile like AVP
558	[RFC 3551]. However, the multi-rate capability of IP-MR speech coding
559	provides a mechanism that may help to control congestion, since the
560	bandwidth demand can be adjusted by selecting a different encoding mode.

562	The bit rate scalability of the IP-MR codec allows reducing voice
563	traffic by omitting enhancement layers without re-encoding. This
564	provides additional means for congestion control. An intermediate
565	network node MAY modify the IP-MR RTP payload by dropping some of the
566	layers during transmission to meet the available bandwidth. If the
567	payload is forwarded with modified content, at least the base layer
568	MUST be preserved in the payload delivered to the receiving side; this
569	guarantees meaningful speech decoding without invoking the packet
570	loss concealment procedure.

572	The number of frames encapsulated in each RTP payload highly
573	influences the overall bandwidth of the RTP stream due to header
574	overhead constraints. Packetizing more frames in each RTP payload
575	can reduce the number of packets sent and hence the overhead from
576	IP/UDP/RTP headers, at the expense of increased delay.
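
The trade-off can be quantified with a rough calculation. The sketch
below assumes 40 octets of headers per packet (IPv4 without options,
UDP, and a 12-octet RTP header without CSRC entries or extensions);
other transports give different figures. The names are hypothetical.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12   # assumed IPv4 + UDP + RTP
FRAME_MS = 20                           # duration of one IP-MR frame


def header_overhead_kbps(frames_per_packet: int) -> float:
    """IP/UDP/RTP header bit rate for a given grouping size."""
    packets_per_second = 1000 / (FRAME_MS * frames_per_packet)
    return IP_UDP_RTP_HEADER_BYTES * 8 * packets_per_second / 1000


for frames in (1, 2, 4):
    extra_delay_ms = (frames - 1) * FRAME_MS
    print(frames, header_overhead_kbps(frames), extra_delay_ms)
```

Under these assumptions, grouping four frames cuts the header overhead
from 16 kbps to 4 kbps at the cost of 60 ms of extra packetization
delay.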

578	If the in-band redundancy scheme is used to protect against packet loss,
579	the amount of introduced redundancy will need to be regulated so that
580	the use of redundancy itself does not cause a congestion problem. In
581	other words, a sender SHALL NOT increase the total bitrate when adding
582	redundancy in response to packet loss, and needs instead to adjust it
583	down in accordance with the congestion control algorithm being run. Thus,
584	when adding redundancy, the media bitrate will need to be reduced to
585	provide room for the redundancy.

587	8. IANA Considerations

589	One media type has been defined and needs registration in the media
590	types registry.

592	9. Normative References

594	  [RFC 2119] Bradner, S., "Key words for use in RFCs to Indicate
595	             Requirement Levels", BCP 14, RFC 2119, March 1997.

597	  [RFC 3550] Schulzrinne, H., Casner, S., Frederick, R., and
598	             V. Jacobson, "RTP: A Transport Protocol for Real-Time
599	             Applications", STD 64, RFC 3550, July 2003.

601	  [RFC 3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio
602	             and Video Conferences with Minimal Control", STD 65,
603	             RFC 3551, July 2003.

605	  [RFC 4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
606	             Description Protocol", RFC 4566, July 2006.

608	  [RFC 3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and
609	             K. Norrman, "The Secure Real-time Transport Protocol
610	             (SRTP)", RFC 3711, March 2004.

612	  [RFC 5246] Dierks, T. and E. Rescorla, "The Transport Layer
613	             Security (TLS) Protocol Version 1.2", RFC 5246,
614	             August 2008.

616	  [RFC 4301] Kent, S. and K. Seo, "Security Architecture for the
617	             Internet Protocol", RFC 4301, December 2005.

619	10. Author(s) Information

621	Sergey Ikonin

623	email: ikonin@spiritdsp.com

625	Russia 109004
626	Building 27, A. Solgenizyn street
627	Tel: +7 495 661-2178
628	Fax: +7 495 912-6786

630	11. Disclaimer

632	This document may contain material from IETF Documents or IETF
633	Contributions published or made publicly available before November 10,
634	2008. The person(s) controlling the copyright in some of this material
635	may not have granted the IETF Trust the right to allow modifications of
636	such material outside the IETF Standards Process. Without obtaining an
637	adequate license from the person(s) controlling the copyright in such
638	materials, this document may not be modified outside the IETF Standards
639	Process, and derivative works of it may not be created outside the IETF
640	Standards Process, except to format it for publication as an RFC or to
641	translate it into languages other than English.

643	12. Legal Terms

645	All IETF Documents and the information contained therein are provided on
646	an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
647	OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
648	THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
649	IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
650	INFORMATION THEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
651	WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

653	The IETF Trust takes no position regarding the validity or scope of any
654	Intellectual Property Rights or other rights that might be claimed to
655	pertain to the implementation or use of the technology described in any
656	IETF Document or the extent to which any license under such rights might
657	or might not be available; nor does it represent that it has made any
658	independent effort to identify any such rights.

660	Copies of Intellectual Property disclosures made to the IETF Secretariat
661	and any assurances of licenses to be made available, or the result of an
662	attempt made to obtain a general license or permission for the use of
663	such proprietary rights by implementers or users of this specification
664	can be obtained from the IETF on-line IPR repository at
665	http://www.ietf.org/ipr.

667	The IETF invites any interested party to bring to its attention any
668	copyrights, patents or patent applications, or other proprietary rights
669	that may cover technology that may be required to implement any standard
670	or specification contained in an IETF Document. Please address the
671	information to the IETF at ietf-ipr@ietf.org.

673	The definitive version of an IETF Document is that published by, or
674	under the auspices of, the IETF. Versions of IETF Documents that are
675	published by third parties, including those that are translated into
676	other languages, should not be considered to be definitive versions of
677	IETF Documents. The definitive version of these Legal Provisions is that
678	published by, or under the auspices of, the IETF. Versions of these
679	Legal Provisions that are published by third parties, including those
680	that are translated into other languages, should not be considered to be
681	definitive versions of these Legal Provisions.

683	For the avoidance of doubt, each Contributor to the IETF Standards
684	Process licenses each Contribution that he or she makes as part of the
685	IETF Standards Process to the IETF Trust pursuant to the provisions of
686	RFC 5378. No language to the contrary, or terms, conditions or rights
687	that differ from or are inconsistent with the rights and licenses
688	granted under RFC 5378, shall have any effect and shall be null and
689	void, whether published or posted by such Contributor, or included with
690	or in such Contribution.