Network Working Group                                            X. Duan
Internet-Draft                                                   S. Wang
Intended status: Standards Track             China Mobile Communications
Expires: October 17, 2009                                    Corporation
                                                           M. Westerlund
                                                              K. Hellwig
                                                            I. Johansson
                                                             Ericsson AB
                                                            Apr 15, 2009


                     RTP Payload format for GSM-HR
                      draft-ietf-avt-rtp-gsm-hr-00

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on October 17, 2009.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   This document specifies the RTP payload format for packetization of
   the GSM Half-Rate speech codec.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Table of Contents

   1.  Introduction
   2.  Conventions
   3.  GSM Half Rate
   4.  Payload format Capabilities
     4.1.  Use of Forward Error Correction (FEC)
   5.  Payload format
     5.1.  RTP Header Usage
     5.2.  Payload Structure
       5.2.1.  Encoding of Speech Frames
       5.2.2.  Encoding of Silence Description Frames
     5.3.  Implementation Considerations
       5.3.1.  Transmission of SID frames
       5.3.2.  Receiving Redundant Frames
       5.3.3.  Decoding Validation
   6.  Examples
     6.1.  3 frames
     6.2.  3 Frames with lost frame in the middle
   7.  Payload Format Parameters
     7.1.  Media Type Definition
     7.2.  Mapping to SDP
       7.2.1.  Offer/Answer Considerations
       7.2.2.  Declarative SDP Considerations
   8.  IANA Considerations
   9.  Congestion Control
   10. Security Considerations
     10.1.  Confidentiality
     10.2.  Authentication and Integrity
   11. Acknowledgements
   12. References
     12.1.  Informative References
     12.2.  Normative References
   Authors' Addresses

1.  Introduction

   This document specifies the payload format for packetization of GSM
   Half Rate (GSM-HR) codec [TS46.002] encoded speech signals into the
   Real-time Transport Protocol (RTP) [RFC3550]. The payload format
   supports transmission of multiple frames per payload and packet loss
   robustness methods using redundancy.

   This document starts with conventions, a brief description of the
   codec, and the payload format's capabilities. The payload format is
   specified in Section 5. Examples can be found in Section 6. The
   media type, its mapping to SDP, and its usage in SDP offer/answer are
   then specified. The document ends with considerations around
   congestion control and security.

   This document registers a media type (audio/gsm-hr-08) for the
   Real-time Transport Protocol (RTP) payload format for the GSM-HR
   codec. Note: This format is not compatible with the one that was
   drafted in 1999 to 2000 in the Internet-Drafts
   draft-ietf-avt-profile-new-05 to draft-ietf-avt-profile-new-09. A
   later version of the profile draft was published as RFC 3551 without
   any specification of the GSM-HR payload format. To avoid a possible
   conflict with this older format, the payload format specified in this
   document uses a media type name that is different from the older one
   (audio/gsm-hr).

2.  Conventions

   This document uses the normal IETF bit-order representation.
   Bit fields in figures are read left to right and then down. The
   leftmost bit in each field is the most significant. Numbering
   starts at 0 and ascends, so bit 0 is the most significant.

3.  GSM Half Rate

   The Global System for Mobile Communication (GSM) network provides
   mobile communication services for nearly 3 billion users (as of
   2008). The GSM Half Rate Codec (GSM-HR) is one of the speech codecs
   used in GSM networks. GSM-HR denotes the Half-Rate speech codec as
   specified in [TS46.002].

   Note: for historical reasons these 46-series specifications are
   internally referenced as 06-series. A simple mapping applies; for
   example, 46.020 is referenced as 06.20, and so on.

   The GSM-HR codec has a frame length of 20 ms, with narrowband speech
   sampled at 8 kHz, i.e., 160 samples per frame. Each speech frame is
   compressed into 112 bits of speech parameters, which is equivalent to
   a bit rate of 5.6 kbit/s. Speech pauses are detected by standardized
   Voice Activity Detection (VAD). During speech pauses the
   transmission of speech frames is inhibited. Silence Descriptor (SID)
   frames are transmitted at the end of a talk spurt and about every
   480 ms during speech pauses to allow for a decent Comfort Noise (CN)
   quality at the receiver side.

   The SID frame generation in the GSM radio network is determined by
   the GSM mobile station and the GSM radio subsystem. In uplink, SID
   frames arrive from the mobile station about every 480 ms during
   speech pauses. In downlink to the mobile station, when they are
   generated by the encoder of the GSM radio subsystem, SID frames are
   sent every 20 ms to the GSM base station, which then picks only one
   every 480 ms for downlink radio transmission. For other
   applications, like transport over IP, it is more appropriate to send
   the SID frames less often than every 20 ms, but 480 ms may be too
   sparse.
   We recommend as a compromise that a GSM-HR encoder outside of the GSM
   radio network (i.e., not in the GSM mobile station and not in the GSM
   radio subsystem, but for example in the media gateway of the core
   network) should generate and send SID frames every 160 ms.

4.  Payload format Capabilities

   This RTP payload format carries one or more GSM-HR encoded frames,
   either full voice or silence descriptor (SID), representing a mono
   speech signal. To maintain synchronization or to indicate frames
   that were not sent or were lost, it can signal No_Data frames.

4.1.  Use of Forward Error Correction (FEC)

   Generic forward error correction within RTP is defined, for example,
   in RFC 5109 [RFC5109]. Audio redundancy coding is defined in RFC
   2198 [RFC2198]. Either scheme can be used to add redundant
   information to the RTP packet stream and make it more resilient to
   packet losses, at the expense of a higher bit rate. See either RFC
   for a discussion of the implications of the higher bit rate for
   network congestion.

   In addition to these media-unaware mechanisms, this memo specifies an
   optional, GSM-HR-specific form of audio redundancy coding, which may
   be beneficial in terms of packetization overhead. Conceptually,
   previously transmitted transport frames are aggregated together with
   new ones. A sliding window can be used to group the frames to be
   sent in each payload. Figure 1 below shows an example.

   --+--------+--------+--------+--------+--------+--------+--------+--
     | f(n-2) | f(n-1) |  f(n)  | f(n+1) | f(n+2) | f(n+3) | f(n+4) |
   --+--------+--------+--------+--------+--------+--------+--------+--

      <---- p(n-1) ---->
               <----- p(n) ----->
                        <---- p(n+1) ---->
                                 <---- p(n+2) ---->
                                          <---- p(n+3) ---->
                                                   <---- p(n+4) ---->

              Figure 1: An example of redundant transmission

   Here, each frame is retransmitted once in the following RTP payload
   packet.
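
   As a non-normative illustration, the sliding window of Figure 1 can
   be sketched in Python as follows. The function name
   packetize_with_redundancy and its redundancy argument are
   assumptions made for this sketch and are not defined by this memo;
   frames are represented by opaque placeholder strings.

```python
from collections import deque


def packetize_with_redundancy(frames, redundancy=1):
    """Group frames into payloads using a sliding window.

    Each yielded payload holds the newest frame preceded by up to
    `redundancy` previously sent frames (hypothetical parameter; the
    memo bounds this through "max-red" when it is signalled).
    """
    window = deque(maxlen=redundancy + 1)
    for frame in frames:
        window.append(frame)   # the oldest frame drops out automatically
        yield list(window)     # frames are kept in time order


# With one redundant copy, every frame appears in two consecutive payloads,
# except the first payload, which has no earlier frame to repeat.
payloads = list(packetize_with_redundancy(["f0", "f1", "f2", "f3"]))
```

   A deque with a fixed maximum length keeps the sketch short: the
   window always holds the newest frame plus the most recent earlier
   ones, which matches the overlap shown in the figure.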
   f(n-2)...f(n+4) denote a sequence of audio frames, and
   p(n-1)...p(n+4) a sequence of payload packets.

   The mechanism described does not strictly require signaling at
   session setup. However, signaling has been defined to allow the
   sender to voluntarily bound the buffering and delay requirements.
   If nothing is signaled, the use of this mechanism is allowed and
   unbounded. For a certain timestamp, the receiver may receive
   multiple copies of a frame containing encoded audio data. The cost
   of this scheme is bandwidth and the receiver delay necessary to allow
   the redundant copy to arrive.

   This redundancy scheme provides a functionality similar to the one
   described in RFC 2198, but it works only if both original frames and
   redundant representations are GSM-HR frames. When the use of other
   media coding schemes is desirable, one has to resort to RFC 2198.

   The sender is responsible for selecting an appropriate amount of
   redundancy based on feedback about the channel conditions, e.g., in
   the RTP Control Protocol (RTCP) [RFC3550] receiver reports. The
   sender is also responsible for avoiding congestion, which may be
   exacerbated by redundancy (see Section 9 for more details).

5.  Payload format

   The format of the RTP header is specified in [RFC3550]. This payload
   format uses the fields of the header in a manner consistent with that
   specification.

   The duration of one speech frame is 20 ms. The sampling frequency is
   8 kHz, corresponding to 160 speech samples per frame. An RTP packet
   may contain multiple frames of encoded speech or SID parameters.
   Each packet covers a period of one or more contiguous 20 ms frame
   intervals. During silence periods no speech packets are sent;
   however, SID packets are transmitted periodically.

   To allow for error resiliency through redundant transmission, the
   periods covered by multiple packets MAY overlap in time.
   A receiver MUST be prepared to receive any speech frame multiple
   times. A given frame MUST NOT be encoded as a speech frame in one
   packet and as a SID frame or as a No_Data frame in another packet.
   Furthermore, a given frame MUST NOT be encoded with different voicing
   modes in different packets.

   The rules regarding maximum payload size given in Section 3.2 of
   [RFC5405] SHOULD be followed.

5.1.  RTP Header Usage

   The RTP timestamp corresponds to the sampling instant of the first
   sample encoded for the first frame in the packet. The timestamp
   clock frequency SHALL be 8000 Hz. The timestamp is also used to
   recover the correct decoding order of the frames.

   The RTP header marker bit (M) SHALL be set to 1 whenever the first
   frame carried in the packet is the first frame in a talkspurt (see
   definition of the talkspurt in section 4.1 of [RFC3551]). For all
   other packets the marker bit SHALL be set to zero (M=0).

   The assignment of an RTP payload type for the format defined in this
   memo is outside the scope of this document. The RTP profiles
   currently in use mandate binding the payload type dynamically for
   this payload format.

   The remaining RTP header fields are used as specified in RFC 3550
   [RFC3550].

5.2.  Payload Structure

   The complete payload consists of a payload table of contents (ToC)
   section, followed by speech data representing one or more speech
   frames, SID frames or No_Data frames. The following diagram shows
   the general payload format layout:

   +-------------+-------------------------
   | ToC section | speech data section ...
   +-------------+-------------------------

                 Figure 2: General payload format layout

   Each ToC entry is one octet and corresponds to one speech frame; the
   number of ToC entries is thus equal to the number of speech frames
   (including SID frames and No_Data frames). Each ToC entry represents
   a consecutive speech or SID or No_Data frame.
   The timestamp value for ToC entry (and corresponding speech frame
   data) N within the payload is (RTP timestamp field + (N-1)*160) mod
   2^32. The format of the ToC octet is as follows.

    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |F| FT  |R R R R|
   +-+-+-+-+-+-+-+-+

                      Figure 3: The ToC element

   F:  Follow flag; 1 denotes that more ToC entries follow, 0 denotes
      the last ToC entry.

   R:  Reserved bits; MUST be set to zero and MUST be ignored by the
      receiver.

   FT:  Frame type
      000 = Good Speech frame
      001 = Reserved
      010 = Good SID frame
      011 = Reserved
      100 = Reserved
      101 = Reserved
      110 = Reserved
      111 = No_Data frame

   The length of the payload data depends on the frame type:

   Good Speech frame:  The 112 speech data bits are put in 14 octets.

   Good SID frame:  The 33 SID data bits are put in 14 octets, as in
      the case of speech frames, with the unused 79 bits all set to
      "1".

   No_Data frame:  Length of payload data is zero octets.

   Frames marked in the GSM radio subsystem as "Bad Speech frame", "Bad
   SID frame" or "No_Data frame" are not sent in RTP packets in order to
   save bandwidth. They are marked as "No_Data frame" if they occur
   within an RTP packet that carries more than one speech frame, SID
   frame or No_Data frame.

5.2.1.  Encoding of Speech Frames

   The 112 bits of GSM-HR-coded speech (b1...b112) are defined in TS
   46.020, Annex B [TS46.020], in the order of occurrence. The first
   bit (b1) of the first parameter is placed in bit 0 (the MSB) of the
   first octet (octet 1) of the payload field; the second bit is placed
   in bit 1 of the first octet and so on. The last bit (b112) is placed
   in the LSB (bit 7) of octet 14.

5.2.2.  Encoding of Silence Description Frames

   The GSM-HR Codec applies a specific coding for silence periods in
   so-called SID frames.
   The coding of SID frames is based on the coding of speech frames: only
   the first 33 bits are used for SID parameters, and the remaining 79
   bits are all set to "1".

5.3.  Implementation Considerations

   An application implementing this payload format MUST understand all
   the payload parameters that are defined in this specification. Any
   mapping of the parameters to a signaling protocol MUST support all
   parameters. So an implementation of this payload format in an
   application using SDP is required to understand all the payload
   parameters in their SDP-mapped form. This requirement ensures that
   an implementation can always decide whether it is capable of
   communicating when the communicating entities support this version of
   the specification.

5.3.1.  Transmission of SID frames

   When using this RTP payload format the sender SHOULD generate and
   send SID frames every 160 ms, i.e., every 8th frame. Other SID
   transmission intervals may occur due to gateways to other systems
   that use other transmission intervals.

5.3.2.  Receiving Redundant Frames

   The reception of redundant audio frames, i.e., more than one audio
   frame from the same source for the same time slot, MUST be supported
   by the implementation.

5.3.3.  Decoding Validation

   If the receiver finds a mismatch between the size of a received
   payload and the size indicated by the ToC of the payload, the
   receiver SHOULD discard the packet. This is recommended because
   decoding a frame parsed from a payload based on erroneous ToC data
   could severely degrade the audio quality.

6.  Examples

   The following examples illustrate the payload format.

6.1.  3 frames

   A basic example of the aggregation of 3 consecutive speech frames
   into a single packet.

   The first 24 bits are ToC fields.
      Bit 0 is '1' as another ToC field follows.
      Bits 1..3 are 000 = Good speech frame
      Bits 4..7 are 0000 = Reserved
      Bit 8 is '1' as another ToC field follows.
      Bits 9..11 are 000 = Good speech frame
      Bits 12..15 are 0000 = Reserved
      Bit 16 is '0', no more ToC fields follow
      Bits 17..19 are 000 = Good speech frame
      Bits 20..23 are 0000 = Reserved

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|0 0 0|0 0 0 0|1|0 0 0|0 0 0 0|0|0 0 0|0 0 0 0|b1           b8|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
   |b9                         Frame 1                          b40|
   +                                                               +
   |b41                                                         b72|
   +                                                               +
   |b73                                                        b104|
   +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |b105       b112|b1                                          b24|
   +-+-+-+-+-+-+-+-+                                               +
   |b25                        Frame 2                          b56|
   +                                                               +
   |b57                                                         b88|
   +                                               +-+-+-+-+-+-+-+-+
   |b89                                        b112|b1           b8|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
   |b9                         Frame 3                          b40|
   +                                                               +
   |b41                                                         b72|
   +                                                               +
   |b73                                                        b104|
   +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |b105       b112|
   +-+-+-+-+-+-+-+-+

6.2.  3 Frames with lost frame in the middle

   An example of a payload carrying 3 frames where the middle one is
   No_Data, for example due to loss prior to transmission by the RTP
   source.

   The first 24 bits are ToC fields.
      Bit 0 is '1' as another ToC field follows.
      Bits 1..3 are 000 = Good speech frame
      Bits 4..7 are 0000 = Reserved
      Bit 8 is '1' as another ToC field follows.
      Bits 9..11 are 111 = No_Data frame
      Bits 12..15 are 0000 = Reserved
      Bit 16 is '0', no more ToC fields follow
      Bits 17..19 are 000 = Good speech frame
      Bits 20..23 are 0000 = Reserved

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|0 0 0|0 0 0 0|1|1 1 1|0 0 0 0|0|0 0 0|0 0 0 0|b1           b8|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
   |b9                         Frame 1                          b40|
   +                                                               +
   |b41                                                         b72|
   +                                                               +
   |b73                                                        b104|
   +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |b105       b112|b1                                          b24|
   +-+-+-+-+-+-+-+-+                                               +
   |b25                        Frame 3                          b56|
   +                                                               +
   |b57                                                         b88|
   +                                               +-+-+-+-+-+-+-+-+
   |b89                                        b112|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

7.  Payload Format Parameters

   This RTP payload format is identified using the media type
   "audio/gsm-hr-08", which is registered in accordance with [RFC4855]
   and using the template of [RFC4288]. Note: Media subtype names are
   case-insensitive.

7.1.  Media Type Definition

   The media type for the GSM-HR codec is allocated from the IETF tree
   since GSM-HR is a well-known speech codec. This media type
   registration covers real-time transfer via RTP. The media subtype
   name contains "-08" to avoid potential conflict with any earlier
   drafts of GSM-HR RTP payload types that aren't bit compatible.

   Note: any unspecified parameter MUST be ignored by the receiver to
   ensure that additional parameters can be added in the future.

   Type name:  audio

   Subtype name:  GSM-HR-08

   Required parameters:  none

   Optional parameters:

      max-red:  The maximum duration in milliseconds that elapses
         between the primary (first) transmission of a frame and any
         redundant transmission that the sender will use. This
         parameter allows a receiver to have a bounded delay when
         redundancy is used. Allowed values are between 0 (no
         redundancy will be used) and 65535.
         If the parameter is omitted, no limitation on the use of
         redundancy is present.

      ptime:  see [RFC4566].

      maxptime:  see [RFC4566].

   Encoding considerations:

      This media type is framed and binary, see section 4.8 in RFC 4288
      [RFC4288].

   Security considerations:

      See Section 10 of RFC XXXX.

   Interoperability considerations:

   Published specification:

      RFC XXXX, 3GPP TS 46.002

   Applications that use this media type:

      Real-time audio applications like voice over IP and
      teleconference.

   Additional information:  none

   Person & email address to contact for further information:

      Ingemar Johansson

   Intended usage:  COMMON

   Restrictions on usage:

      This media type depends on RTP framing, and hence is only defined
      for transfer via RTP [RFC3550]. Transport within other framing
      protocols is not defined at this time.

   Author:

      Xiaodong Duan

      Shuaiyu Wang

      Magnus Westerlund

      Ingemar Johansson

      Karl Hellwig

   Change controller:

      IETF Audio/Video Transport working group delegated from the IESG.

7.2.  Mapping to SDP

   The information carried in the media type specification has a
   specific mapping to fields in the Session Description Protocol (SDP)
   [RFC4566], which is commonly used to describe RTP sessions. When SDP
   is used to specify sessions employing the GSM-HR codec, the mapping
   is as follows:

   o  The media type ("audio") goes in SDP "m=" as the media name.

   o  The media subtype (payload format name) goes in SDP "a=rtpmap" as
      the encoding name. The RTP clock rate in "a=rtpmap" MUST be 8000,
      and the encoding parameters (number of channels) MUST either be
      explicitly set to 1 or omitted, implying a default value of 1.

   o  The parameters "ptime" and "maxptime" go in the SDP "a=ptime" and
      "a=maxptime" attributes, respectively.

   o  Any remaining parameters go in the SDP "a=fmtp" attribute by
      copying them directly from the media type parameter string as a
      semicolon-separated list of parameter=value pairs.

7.2.1.  Offer/Answer Considerations

   The following considerations apply when using SDP Offer-Answer
   procedures to negotiate the use of GSM-HR payload in RTP:

   o  The SDP offerer and answerer MUST generate GSM-HR packets as
      described by the offered parameters.

   o  In most cases, the parameters "maxptime" and "ptime" will not
      affect interoperability; however, the setting of the parameters
      can affect the performance of the application. The SDP
      offer-answer handling of the "ptime" parameter is described in
      [RFC3264]. The "maxptime" parameter MUST be handled in the same
      way.

   o  The parameter "max-red" is a stream property parameter. For
      sendonly or sendrecv unicast media streams, the parameter declares
      the limitation on redundancy that the stream sender will use. For
      recvonly streams, it indicates the desired value for the stream
      sent to the receiver. The answerer MAY change the value, but is
      RECOMMENDED to use the same limitation as the offer declares. In
      the case of multicast, the offerer MAY declare a limitation; this
      SHALL be answered using the same value. A media sender using this
      payload format is RECOMMENDED to always include the "max-red"
      parameter. This information is likely to simplify the media
      stream handling in the receiver. This is especially true if no
      redundancy will be used, in which case "max-red" is set to 0.

   o  Any unknown media type parameter in an offer SHALL be removed in
      the answer.

7.2.2.  Declarative SDP Considerations

   In declarative usage, like SDP in RTSP [RFC2326] or SAP [RFC2974],
   the parameters SHALL be interpreted as follows:

   o  The stream property parameter ("max-red") is declarative, and a
      participant MUST follow what is declared for the session. In this
      case it means that the receiver MUST be prepared to allocate
      buffer memory for the given redundancy. Any transmissions MUST
      NOT use more redundancy than what has been declared. More than
      one configuration may be provided if necessary by declaring
      multiple RTP payload types; however, the number of types should be
      kept small.

   o  Any "maxptime" and "ptime" values should be selected with care to
      ensure that the session's participants can achieve reasonable
      performance.

8.  IANA Considerations

   One media type (audio/gsm-hr-08) has been defined and needs
   registration in the media types registry; see Section 7.1.

9.  Congestion Control

   The general congestion control considerations for transporting RTP
   data apply; see RTP [RFC3550] and any applicable RTP profile like AVP
   [RFC3551].

   The number of frames encapsulated in each RTP payload strongly
   influences the overall bandwidth of the RTP stream because of the
   per-packet header overhead. Packetizing more frames in each RTP
   payload can reduce the number of packets sent and hence the header
   overhead, at the expense of increased delay and reduced error
   robustness. If forward error correction (FEC) is used, the amount of
   FEC-induced redundancy needs to be regulated such that the use of FEC
   itself does not cause a congestion problem.

10.  Security Considerations

   RTP packets using the payload format defined in this specification
   are subject to the general security considerations discussed in RTP
   [RFC3550] and any applicable profile such as AVP [RFC3551] or SAVP
   [RFC3711].
   As this format transports encoded audio, the main security issues
   include confidentiality, integrity protection, and data origin
   authentication of the audio itself. The payload format itself does
   not have any built-in security mechanisms. Any suitable external
   mechanisms, such as SRTP [RFC3711], MAY be used.

   This payload format and the GSM-HR decoder do not exhibit any
   significant non-uniformity in the receiver-side computational
   complexity for packet processing, and thus are unlikely to pose a
   denial-of-service threat due to the receipt of pathological data.
   Neither the payload format nor the codec data contains any type of
   active content such as scripts.

10.1.  Confidentiality

   In order to ensure confidentiality of the encoded audio, all audio
   data bits MUST be encrypted. There is less need to encrypt the
   payload header or the table of contents since they only carry
   information about the frame type. This information could also be
   useful to a third party, for example, for quality monitoring.

10.2.  Authentication and Integrity

   To authenticate the sender of the audio stream, an external mechanism
   MUST be used. It is RECOMMENDED that such a mechanism protects both
   the complete RTP header and the payload (audio and data bits). Data
   tampering by a man-in-the-middle attacker could replace audio content
   and also result in erroneous depacketization/decoding that could
   lower the audio quality.

11.  Acknowledgements

   The authors would like to thank Xiaodong Duan, Shuaiyu Wang, Rocky
   Wang and Ying Zhang for their initial work in this area. Many thanks
   also go to Tomas Frankkila and Karl Hellwig for useful input and
   comments.

12.  References

12.1.  Informative References

   [RFC2198]  Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
              Handley, M., Bolot, J., Vega-Garcia, A., and S.
              Fosse-Parisis, "RTP Payload for Redundant Audio Data",
              RFC 2198, September 1997.

   [RFC2326]  Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time
              Streaming Protocol (RTSP)", RFC 2326, April 1998.

   [RFC2974]  Handley, M., Perkins, C., and E. Whelan, "Session
              Announcement Protocol", RFC 2974, October 2000.

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, March 2004.

   [RFC4288]  Freed, N. and J. Klensin, "Media Type Specifications and
              Registration Procedures", BCP 13, RFC 4288, December 2005.

   [RFC4855]  Casner, S., "Media Type Registration of RTP Payload
              Formats", RFC 4855, February 2007.

   [RFC5109]  Li, A., "RTP Payload Format for Generic Forward Error
              Correction", RFC 5109, December 2007.

12.2.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
              with Session Description Protocol (SDP)", RFC 3264,
              June 2002.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC3551]  Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
              Video Conferences with Minimal Control", STD 65, RFC 3551,
              July 2003.

   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
              Description Protocol", RFC 4566, July 2006.

   [RFC5405]  Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines
              for Application Designers", BCP 145, RFC 5405,
              November 2008.

   [TS46.002]
              3GPP, "Specification : 3GPP TS 46.002 http://www.3gpp.org/ftp/Specs/archive/46_series/46.002/46002-700.zip",
              June 2007.

   [TS46.020]
              3GPP, "Specification : 3GPP TS 46.020 http://www.3gpp.org/ftp/Specs/archive/46_series/46.002/46020-700.zip",
              June 2007.

Authors' Addresses

   Xiaodong Duan
   China Mobile Communications Corporation
   53A, Xibianmennei Ave., Xuanwu District
   Beijing, 100053
   P.R. China

   Email: duanxiaodong@chinamobile.com

   Shuaiyu Wang
   China Mobile Communications Corporation
   53A, Xibianmennei Ave., Xuanwu District
   Beijing, 100053
   P.R. China

   Email: wangshuaiyu@chinamobile.com

   Magnus Westerlund
   Ericsson AB
   Farogatan 6
   Stockholm, SE-164 80
   Sweden

   Phone: +46 8 719 0000
   Email: magnus.westerlund@ericsson.com

   Karl Hellwig
   Ericsson AB
   Kackertstrasse 7-9
   52072 Aachen
   Germany

   Phone: +49 2407 575-2054
   Email: karl.hellwig@ericsson.com

   Ingemar Johansson
   Ericsson AB
   Laboratoriegrand 11
   SE-971 28 Lulea
   SWEDEN

   Phone: +46 73 0783289
   Email: ingemar.s.johansson@ericsson.com
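
Non-normative sketch: parsing a payload

   The payload structure of Section 5.2 and the size check of Section
   5.3.3 can be illustrated with a small Python parser. This is a
   sketch under stated assumptions, not a reference implementation:
   the names parse_payload and FRAME_OCTETS, the use of ValueError, and
   the decision to reject reserved frame types are choices made for
   this illustration only. The memo itself only says that a packet
   with a size mismatch SHOULD be discarded.

```python
FT_SPEECH, FT_SID, FT_NO_DATA = 0b000, 0b010, 0b111

# Octets of frame data per frame type (Section 5.2).
FRAME_OCTETS = {FT_SPEECH: 14, FT_SID: 14, FT_NO_DATA: 0}
SAMPLES_PER_FRAME = 160  # 20 ms at the 8000 Hz RTP clock


def parse_payload(payload: bytes, rtp_timestamp: int):
    """Return a list of (timestamp, frame_type, frame_octets) tuples.

    Raises ValueError when the payload size does not match the ToC
    (assumption of this sketch: the caller then discards the packet).
    """
    frame_types = []
    pos = 0
    while True:
        if pos >= len(payload):
            raise ValueError("payload ends inside the ToC section")
        toc = payload[pos]
        pos += 1
        frame_type = (toc >> 4) & 0x07   # bits 1..3; bit 0 is the MSB
        if frame_type not in FRAME_OCTETS:
            # Assumption: reserved frame types are treated as an error.
            raise ValueError("reserved frame type in ToC")
        frame_types.append(frame_type)   # reserved bits 4..7 are ignored
        if not toc & 0x80:               # F = 0 marks the last ToC entry
            break

    expected = pos + sum(FRAME_OCTETS[ft] for ft in frame_types)
    if expected != len(payload):
        raise ValueError("payload size does not match the ToC")

    frames = []
    for index, frame_type in enumerate(frame_types):
        size = FRAME_OCTETS[frame_type]
        # Entry N (counted from 1) has timestamp (ts + (N-1)*160) mod 2^32.
        timestamp = (rtp_timestamp + index * SAMPLES_PER_FRAME) % 2**32
        frames.append((timestamp, frame_type, payload[pos:pos + size]))
        pos += size
    return frames


# The ToC of the example in Section 6.2: speech, No_Data, speech.
example = bytes([0x80, 0xF0, 0x00]) + b"\x11" * 14 + b"\x33" * 14
parsed = parse_payload(example, rtp_timestamp=1000)
```

   Parsing the whole ToC section before touching the frame data lets
   the size check run once, before any frame is handed to a decoder.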