idnits 2.17.1 draft-westerlund-avt-rtp-gsm-hr-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 16. -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on line 776. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 787. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 794. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 800. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (Oct 24, 2008) is 5662 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Obsolete informational reference (is this intentional?): RFC 2326 (Obsoleted by RFC 7826) -- Obsolete informational reference (is this intentional?): RFC 4288 (Obsoleted by RFC 6838) ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866) Summary: 2 errors (**), 0 flaws (~~), 1 warning (==), 9 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group M. Westerlund 3 Internet-Draft K. Hellwig 4 Intended status: Standards Track I. Johansson 5 Expires: April 27, 2009 Ericsson AB 6 Oct 24, 2008 8 RTP Payload format for GSM-HR 9 draft-westerlund-avt-rtp-gsm-hr-00 11 Status of this Memo 13 By submitting this Internet-Draft, each author represents that any 14 applicable patent or other IPR claims of which he or she is aware 15 have been or will be disclosed, and any of which he or she becomes 16 aware will be disclosed, in accordance with Section 6 of BCP 79. 18 Internet-Drafts are working documents of the Internet Engineering 19 Task Force (IETF), its areas, and its working groups. Note that 20 other groups may also distribute working documents as Internet- 21 Drafts. 23 Internet-Drafts are draft documents valid for a maximum of six months 24 and may be updated, replaced, or obsoleted by other documents at any 25 time. It is inappropriate to use Internet-Drafts as reference 26 material or to cite them other than as "work in progress." 28 The list of current Internet-Drafts can be accessed at 29 http://www.ietf.org/ietf/1id-abstracts.txt. 31 The list of Internet-Draft Shadow Directories can be accessed at 32 http://www.ietf.org/shadow.html. 
34 This Internet-Draft will expire on April 27, 2009. 36 Abstract 38 This document specifies the RTP payload format for packetization of 39 the GSM Half-Rate speech codec. 41 Requirements Language 43 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 44 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 45 document are to be interpreted as described in RFC 2119 [RFC2119]. 47 Table of Contents 49 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 50 2. Conventions . . . . . . . . . . . . . . . . . . . . . . . . . 3 51 3. GSM Half Rate . . . . . . . . . . . . . . . . . . . . . . . . 3 52 4. Payload format Capabilities . . . . . . . . . . . . . . . . . 4 53 4.1. Use of Forward Error Correction (FEC) . . . . . . . . . . 4 54 5. Payload format . . . . . . . . . . . . . . . . . . . . . . . . 5 55 5.1. RTP Header Usage . . . . . . . . . . . . . . . . . . . . . 6 56 5.2. Payload Structure . . . . . . . . . . . . . . . . . . . . 6 57 5.2.1. Encoding of Speech Frames . . . . . . . . . . . . . . 7 58 5.2.2. Encoding of Silence Description Frames . . . . . . . . 8 59 5.3. Implementation Considerations . . . . . . . . . . . . . . 8 60 5.3.1. Transmission of SID frames . . . . . . . . . . . . . . 8 61 5.3.2. Receiving Redundant Frames . . . . . . . . . . . . . . 8 62 5.3.3. Decoding Validation . . . . . . . . . . . . . . . . . 8 63 6. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 64 6.1. 3 frames . . . . . . . . . . . . . . . . . . . . . . . . . 9 65 6.2. 3 Frames with lost frame in the middle . . . . . . . . . . 10 66 7. Payload Format Parameters . . . . . . . . . . . . . . . . . . 10 67 7.1. Media Type Definition . . . . . . . . . . . . . . . . . . 11 68 7.2. Mapping to SDP . . . . . . . . . . . . . . . . . . . . . . 12 69 7.2.1. Offer/Answer Considerations . . . . . . . . . . . . . 13 70 7.2.2. Declarative SDP Considerations . . . . . . . . . . . . 13 71 8. IANA Considerations . . . . . . . . . . . . . . . 
. . . . . . 14
72      9.  Congestion Control . . . . . . . . . . . . . . . . . . . . . . 14
73      10. Security Considerations  . . . . . . . . . . . . . . . . . . . 14
74        10.1. Confidentiality  . . . . . . . . . . . . . . . . . . . . . 15
75        10.2. Authentication and Integrity . . . . . . . . . . . . . . . 15
76      11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 15
77      12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 15
78        12.1. Informative References . . . . . . . . . . . . . . . . . . 15
79        12.2. Normative References . . . . . . . . . . . . . . . . . . . 16
80      Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 16
81      Intellectual Property and Copyright Statements . . . . . . . . . . 18

83   1.  Introduction

85      This document specifies the payload format for packetization of GSM
86      Half Rate (GSM-HR) codec [TS46.002] encoded speech signals into the
87      Real-time Transport Protocol (RTP) [RFC3550].  The payload format
88      supports transmission of multiple frames per payload and packet loss
89      robustness methods using redundancy.

91      This document starts with conventions, a brief description of the
92      codec, and the payload format's capabilities.  The payload format is
93      specified in Section 5.  Examples can be found in Section 6.  The
94      media type, its mapping to SDP, and its usage in SDP offer/answer are
95      then specified.  The document ends with considerations around
96      congestion control and security.

98      This document registers a media type (audio/gsm-hr-08) for the
99      Real-time Transport Protocol (RTP) payload format for the GSM-HR codec.
100     Note: This format is not compatible with the one that was drafted
101     back in 1999 to 2000 in the Internet drafts:
102     draft-ietf-avt-profile-new-05 to draft-ietf-avt-profile-new-09.  A
103     later version of the profile draft was published as RFC 3551 without
104     any specification of the GSM-HR payload format.
To avoid a possible
105     conflict with this older format, the payload format specified in
106     this document uses a media type name that differs from
107     audio/gsm-hr.

109  2.  Conventions

111     This document uses the normal IETF bit-order representation.  Bit
112     fields in figures are read left to right and then down.  The
113     leftmost bit in each field is the most significant.  The numbering
114     starts at 0 and ascends, so bit 0 is the most significant.

116  3.  GSM Half Rate

118     The Global System for Mobile Communication (GSM) network provides
119     mobile communication services for nearly 3 billion users (as of
120     2008).  The GSM Half Rate Codec (GSM-HR) is one of the speech codecs
121     that are used in GSM networks.  GSM-HR denotes the Half-Rate speech
122     codec as specified in [TS46.002].

124     Note: for historical reasons these 46-series specifications are
125     internally referenced as 06-series.  A simple mapping applies; for
126     example, 46.020 is referenced as 06.20, and so on.

128     The GSM-HR codec has a frame length of 20 ms, with narrowband speech
129     sampled at 8 kHz, i.e. 160 samples per frame.  Each speech frame is
130     compressed into 112 bits of speech parameters, which is equivalent to
131     a bit rate of 5.6 kbit/s.  Speech pauses are detected by a
132     standardized Voice Activity Detection (VAD).  During speech pauses
133     the transmission of speech frames is inhibited.  Silence Descriptor
134     (SID) frames are transmitted at the end of a talk spurt and about
135     every 480 ms during speech pauses to allow for a decent Comfort Noise
136     (CN) quality at the receiver side.

138     The SID frame generation in the GSM radio network is determined by
139     the GSM mobile station and the GSM radio subsystem.  SID frames come
140     during speech pauses in uplink from the mobile station about every
141     480 ms.
In downlink to the mobile station, when they are generated by
142     the encoder of the GSM radio subsystem, SID frames are sent every
143     20 ms to the GSM base station, which then picks only one every 480 ms
144     for downlink radio transmission.  For other applications, like
145     transport over IP, it is more appropriate to send the SID frames less
146     often than every 20 ms, but 480 ms may be too sparse.  We recommend as
147     a compromise that a GSM-HR encoder outside of the GSM radio network
148     (i.e. not in the GSM mobile station and not in the GSM radio
149     subsystem, but for example in the media gateway of the core network)
150     should generate and send SID frames every 160 ms.

152  4.  Payload format Capabilities

154     This RTP payload format carries one or more GSM-HR encoded frames,
155     either full voice or silence descriptor (SID), representing a mono
156     speech signal.  To maintain synchronization or to express frames that
157     were not sent or were lost, it can indicate No_Data frames.

159  4.1.  Use of Forward Error Correction (FEC)

161     Generic forward error correction within RTP is defined, for example,
162     in RFC 5109 [RFC5109].  Audio redundancy coding is defined in RFC
163     2198 [RFC2198].  Either scheme can be used to add redundant
164     information to the RTP packet stream and make it more resilient to
165     packet losses, at the expense of a higher bit rate.  Please see
166     either RFC for a discussion of the implications of the higher bit
167     rate for network congestion.

169     In addition to these media-unaware mechanisms, this memo specifies an
170     optional GSM-HR-specific form of audio redundancy coding,
171     which may be beneficial in terms of packetization overhead.
172     Conceptually, previously transmitted transport frames are aggregated
173     together with new ones.  A sliding window can be used to group the
174     frames to be sent in each payload.  Figure 1 below shows an example.
176     --+--------+--------+--------+--------+--------+--------+--------+--
177       | f(n-2) | f(n-1) |  f(n)  | f(n+1) | f(n+2) | f(n+3) | f(n+4) |
178     --+--------+--------+--------+--------+--------+--------+--------+--

180        <---- p(n-1) ---->
181                 <----- p(n) ----->
182                          <---- p(n+1) ---->
183                                   <---- p(n+2) ---->
184                                            <---- p(n+3) ---->
185                                                     <---- p(n+4) ---->

187                Figure 1: An example of redundant transmission

189     Here, each frame is retransmitted once in the following RTP payload
190     packet.  f(n-2)...f(n+4) denote a sequence of audio frames, and
191     p(n-1)...p(n+4) a sequence of payload packets.

193     The mechanism described does not strictly require signaling at
194     session setup.  However, signaling has been defined to allow the
195     sender to voluntarily bound the buffering and delay requirements.
196     If nothing is signaled, the use of this mechanism is allowed and
197     unbounded.  For a certain timestamp, the receiver may receive
198     multiple copies of a frame containing encoded audio data.  The cost
199     of this scheme is bandwidth and the receiver delay necessary to allow
200     the redundant copy to arrive.

202     This redundancy scheme provides a functionality similar to the one
203     described in RFC 2198, but it works only if both original frames and
204     redundant representations are GSM-HR frames.  When the use of other
205     media coding schemes is desirable, one has to resort to RFC 2198.

207     The sender is responsible for selecting an appropriate amount of
208     redundancy based on feedback about the channel conditions, e.g., in
209     the RTP Control Protocol (RTCP) [RFC3550] receiver reports.  The
210     sender is also responsible for avoiding congestion, which may be
211     exacerbated by redundancy (see Section 9 for more details).

213  5.  Payload format

215     The format of the RTP header is specified in [RFC3550].  This payload
216     format uses the fields of the header in a manner consistent with that
217     specification.

219     The duration of one speech frame is 20 ms.
The sampling frequency is
220     8 kHz, corresponding to 160 speech samples per frame.  An RTP packet
221     may contain multiple frames of encoded speech or SID parameters.
222     Each packet covers a period of one or more contiguous 20 ms frame
223     intervals.  During silence periods no speech packets are sent;
224     however, SID packets are transmitted periodically (see Section 5.3.1).

226     To allow for error resiliency through redundant transmission, the
227     periods covered by multiple packets MAY overlap in time.  A receiver
228     MUST be prepared to receive any speech frame multiple times.  A given
229     frame MUST NOT be encoded as a speech frame in one packet and as a SID
230     frame or as a No_Data frame in another packet.  Furthermore, a given
231     frame MUST NOT be encoded with different voicing modes in different
232     packets.

234     The rules regarding maximum payload size given in Section 3.2 of
235     [I-D.ietf-tsvwg-udp-guidelines] SHOULD be followed.

237  5.1.  RTP Header Usage

239     The RTP timestamp corresponds to the sampling instant of the first
240     sample encoded for the first frame in the packet.  The timestamp
241     clock frequency SHALL be 8000 Hz.  The timestamp is also used to
242     recover the correct decoding order of the frames.

244     The RTP header marker bit (M) SHALL be set to 1 whenever the first
245     frame carried in the packet is the first frame in a talkspurt (see
246     the definition of talkspurt in Section 4.1 of [RFC3551]).  For all
247     other packets the marker bit SHALL be set to zero (M=0).

249     The assignment of an RTP payload type for the format defined in this
250     memo is outside the scope of this document.  The RTP profiles in use
251     currently mandate binding the payload type dynamically for this
252     payload format.

254     The remaining RTP header fields are used as specified in RFC 3550
255     [RFC3550].

257  5.2.
Payload Structure

259     The complete payload consists of a payload table of contents (ToC)
260     section, followed by speech data representing one or more speech
261     frames, SID frames or No_Data frames.  The following diagram shows
262     the general payload format layout:
263     +-------------+-------------------------
264     | ToC section | speech data section ...
265     +-------------+-------------------------

267                Figure 2: General payload format layout

269     Each ToC entry is one octet and corresponds to one speech frame; the
270     number of ToC entries thus equals the number of speech frames
271     (including SID frames and No_Data frames).  Each ToC entry represents
272     a consecutive speech, SID or No_Data frame.  The timestamp value for
273     ToC entry (and corresponding speech frame data) N within the payload,
274     counting from N = 1, is (RTP timestamp field + (N-1)*160) mod 2^32.
275     The format of the ToC octet is as follows.

277      0 1 2 3 4 5 6 7
278     +-+-+-+-+-+-+-+-+
279     |F| FT  |R R R R|
280     +-+-+-+-+-+-+-+-+

282                       Figure 3: The ToC element

284     F: Follow flag; 1 denotes that more ToC entries follow, 0 denotes the
285        last ToC entry.

287     R: Reserved bits; MUST be set to zero and MUST be ignored by the
288        receiver.

290     FT: Frame type
291        000 = Good Speech frame
292        001 = Reserved
293        010 = Good SID frame
294        011 = Reserved
295        100 = Reserved
296        101 = Reserved
297        110 = Reserved
298        111 = No_Data frame

300     The length of the payload data depends on the frame type:

302     Good Speech frame:  The 112 speech data bits are put in 14 octets.

304     Good SID frame:  The 33 SID data bits are put in 14 octets, as in the
305        case of Speech frames, with the unused 79 bits all set to "1".

307     No_Data frame:  Length of payload data is zero octets.

309     Frames marked in the GSM radio subsystem as "Bad Speech frame", "Bad
310     SID frame" or "No_Data frame" are not sent in RTP packets in order to
311     save bandwidth.
They are marked as "No_Data frame" if they occur
312     within an RTP packet that carries more than one speech frame, SID
313     frame or No_Data frame.

315  5.2.1.  Encoding of Speech Frames

317     The 112 bits of GSM-HR-coded speech (b1...b112) are defined in TS
318     46.020, Annex B [TS46.020], in the order of occurrence.  The first
319     bit (b1) of the first parameter is placed in bit 0 (the MSB) of the
320     first octet (octet 1) of the payload field; the second bit is placed
321     in bit 1 of the first octet and so on.  The last bit (b112) is placed
322     in the LSB (bit 7) of octet 14.

324  5.2.2.  Encoding of Silence Description Frames

326     The GSM-HR Codec applies a specific coding for silence periods in
327     so-called SID frames.  The coding of SID frames is based on the coding
328     of speech frames by using only the first 33 bits for SID parameters
329     and by setting the remaining 79 bits all to "1".

331  5.3.  Implementation Considerations

333     An application implementing this payload format MUST understand all
334     the payload parameters that are defined in this specification.  Any
335     mapping of the parameters to a signaling protocol MUST support all
336     parameters.  So an implementation of this payload format in an
337     application using SDP is required to understand all the payload
338     parameters in their SDP-mapped form.  This requirement ensures that
339     an implementation can always decide whether it is capable of
340     communicating when the communicating entities support this version of
341     the specification.

343  5.3.1.  Transmission of SID frames

345     When using this RTP payload format the sender SHOULD generate and
346     send SID frames every 160 ms, i.e. every 8th frame.  Other SID
347     transmission intervals may occur due to gateways to other systems
348     that use other transmission intervals.

350  5.3.2.  Receiving Redundant Frames

352     The reception of redundant audio frames, i.e.
more than one audio
353     frame from the same source for the same time slot, MUST be supported
354     by the implementation.

356  5.3.3.  Decoding Validation

358     If the receiver finds a mismatch between the size of a received
359     payload and the size indicated by the ToC of the payload, the
360     receiver SHOULD discard the packet.  This is recommended because
361     decoding a frame parsed from a payload based on erroneous ToC data
362     could severely degrade the audio quality.

364  6.  Examples

366     The following examples illustrate the payload format.

368  6.1.  3 frames

370     A basic example of the aggregation of 3 consecutive speech frames
371     into a single payload.

373     The first 24 bits are ToC fields.
374     Bit 0 is '1' as another ToC field follows.
375     Bits 1..3 are 000 = Good speech frame
376     Bits 4..7 are 0000 = Reserved
377     Bit 8 is '1' as another ToC field follows.
378     Bits 9..11 are 000 = Good speech frame
379     Bits 12..15 are 0000 = Reserved
380     Bit 16 is '0', no more ToC fields follow
381     Bits 17..19 are 000 = Good speech frame
382     Bits 20..23 are 0000 = Reserved

384      0                   1                   2                   3
385      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
386     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
387     |1|0 0 0|0 0 0 0|1|0 0 0|0 0 0 0|0|0 0 0|0 0 0 0|b1           b8|
388     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
389     |b9                       Frame 1                            b40|
390     +                                                               +
391     |b41                                                         b72|
392     +                                                               +
393     |b73                                                        b104|
394     +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
395     |b105       b112|b1                                          b24|
396     +-+-+-+-+-+-+-+-+                                               +
397     |b25                      Frame 2                            b56|
398     +                                                               +
399     |b57                                                         b88|
400     +                                               +-+-+-+-+-+-+-+-+
401     |b89                                        b112|b1           b8|
402     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
403     |b9                       Frame 3                            b40|
404     +                                                               +
405     |b41                                                         b72|
406     +                                                               +
407     |b73                                                        b104|
408     +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
409     |b105       b112|
410     +-+-+-+-+-+-+-+-+

412  6.2.  3 Frames with lost frame in the middle

414     An example of a payload carrying 3 frames where the middle one is
415     No_Data, for example due to loss prior to transmission by the RTP
416     source.
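
As a non-normative illustration, the ToC rules of Section 5.2 and the size check of Section 5.3.3 could be applied to a payload such as this one roughly as follows. This is a minimal Python sketch, not part of the payload format. The names parse_gsm_hr_payload and Frame are hypothetical, and raising an exception for reserved frame types or for a size mismatch is an assumption: this memo only says that a mismatching packet SHOULD be discarded and does not state how reserved frame types are handled.

```python
from dataclasses import dataclass
from typing import List

# Frame type (FT) values from the ToC octet, Section 5.2.
FT_SPEECH, FT_SID, FT_NO_DATA = 0b000, 0b010, 0b111

# Octets of frame data that follow the ToC section for each frame type.
FRAME_OCTETS = {FT_SPEECH: 14, FT_SID: 14, FT_NO_DATA: 0}
SAMPLES_PER_FRAME = 160  # 20 ms at the 8000 Hz RTP clock


@dataclass
class Frame:  # hypothetical container, not defined by this memo
    frame_type: int
    timestamp: int
    data: bytes


def parse_gsm_hr_payload(payload: bytes, rtp_timestamp: int) -> List[Frame]:
    """Split one RTP payload into frames; ValueError means 'discard'."""
    frame_types = []
    pos = 0
    while True:
        if pos >= len(payload):
            raise ValueError("payload ends inside the ToC section")
        toc = payload[pos]
        pos += 1
        ft = (toc >> 4) & 0x07          # bits 1..3; reserved bits 4..7 ignored
        if ft not in FRAME_OCTETS:      # assumption: reject reserved types
            raise ValueError(f"reserved frame type {ft:03b}")
        frame_types.append(ft)
        if not toc & 0x80:              # F = 0 marks the last ToC entry
            break

    expected = pos + sum(FRAME_OCTETS[ft] for ft in frame_types)
    if expected != len(payload):        # size check of Section 5.3.3
        raise ValueError("payload size does not match the ToC")

    frames = []
    for n, ft in enumerate(frame_types):  # n = N - 1 in the timestamp rule
        size = FRAME_OCTETS[ft]
        ts = (rtp_timestamp + n * SAMPLES_PER_FRAME) % 2**32
        frames.append(Frame(ft, ts, payload[pos:pos + size]))
        pos += size
    return frames
```

For the payload in this example, whose ToC octets are 0x80, 0xF0 and 0x00 followed by 28 octets of speech data, the sketch returns three frames with timestamps T, T+160 and T+320 (modulo 2^32), the middle one carrying no data.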
418     The first 24 bits are ToC fields.
419     Bit 0 is '1' as another ToC field follows.
420     Bits 1..3 are 000 = Good speech frame
421     Bits 4..7 are 0000 = Reserved
422     Bit 8 is '1' as another ToC field follows.
423     Bits 9..11 are 111 = No_Data frame
424     Bits 12..15 are 0000 = Reserved
425     Bit 16 is '0', no more ToC fields follow
426     Bits 17..19 are 000 = Good speech frame
427     Bits 20..23 are 0000 = Reserved

429      0                   1                   2                   3
430      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
431     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
432     |1|0 0 0|0 0 0 0|1|1 1 1|0 0 0 0|0|0 0 0|0 0 0 0|b1           b8|
433     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
434     |b9                       Frame 1                            b40|
435     +                                                               +
436     |b41                                                         b72|
437     +                                                               +
438     |b73                                                        b104|
439     +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
440     |b105       b112|b1                                          b24|
441     +-+-+-+-+-+-+-+-+                                               +
442     |b25                      Frame 3                            b56|
443     +                                                               +
444     |b57                                                         b88|
445     +                                               +-+-+-+-+-+-+-+-+
446     |b89                                        b112|
447     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

449  7.  Payload Format Parameters

451     This RTP payload format is identified using the media type
452     "audio/gsm-hr-08", which is registered in accordance with [RFC4855]
453     and using the template of [RFC4288].  Note: Media subtype names are
454     case-insensitive.

456  7.1.  Media Type Definition

458     The media type for the GSM-HR codec is allocated from the IETF tree
459     since GSM-HR is a well-known speech codec.  This media type
460     registration covers real-time transfer via RTP.  The media subtype
461     name contains "-08" to avoid potential conflict with any earlier
462     drafts of GSM-HR RTP payload formats that are not bit-compatible.

464     Note: any unspecified parameter that is received MUST be ignored by
465     the receiver, so that additional parameters can be added in the
466     future.
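
As a non-normative illustration of the rule above, a receiver might read the one format-specific parameter of this media type, "max-red" (defined in the registration that follows), from a semicolon-separated parameter string while skipping everything it does not recognise. This is a minimal Python sketch under stated assumptions: the function name parse_max_red is hypothetical, parameter names are assumed to be compared case-insensitively, and rejecting a malformed or out-of-range "max-red" value with an exception is a design choice of the sketch, not a requirement of this memo.

```python
from typing import Optional


def parse_max_red(params: str) -> Optional[int]:
    """Return max-red in milliseconds, or None if it is not present.

    `params` is the parameter list only, e.g. "max-red=40; x-new=1".
    None means that no limit on redundancy has been declared.
    """
    max_red = None
    for item in params.split(";"):
        name, sep, value = item.strip().partition("=")
        if name.strip().lower() != "max-red":
            continue  # unspecified parameters are ignored, not rejected
        value = value.strip()
        if not sep or not (value.isascii() and value.isdigit()):
            raise ValueError("max-red needs a non-negative integer value")
        number = int(value)
        if number > 65535:  # allowed range is 0..65535
            raise ValueError("max-red out of range")
        max_red = number
    return max_red
```

With this sketch, "max-red=40; x-future=1" yields 40 and an empty string yields None, so a parameter added by a later revision does not break an older receiver.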
468 Type name: audio 470 Subtype name: GSM-HR-08 472 Required parameters: none 474 Optional parameters: 476 max-red: The maximum duration in milliseconds that elapses between 477 the primary (first) transmission of a frame and any redundant 478 transmission that the sender will use. This parameter allows a 479 receiver to have a bounded delay when redundancy is used. Allowed 480 values are between 0 (no redundancy will be used) and 65535. If 481 the parameter is omitted, no limitation on the use of redundancy 482 is present. 484 ptime: see [RFC4566]. 486 maxptime: see [RFC4566]. 488 Encoding considerations: 490 This media type is framed and binary, see section 4.8 in RFC4288 491 [RFC4288]. 493 Security considerations: 495 See Section 10 of RFCXXXX. 497 Interoperability considerations: 499 Published specification: 501 RFC XXXX, 3GPP TS 46.002 503 Applications that use this media type: 505 Real-time audio applications like voice over IP and 506 teleconference. 508 Additional information: none 510 Person & email address to contact for further information: 512 Ingemar Johansson 514 Intended usage: COMMON 516 Restrictions on usage: 518 This media type depends on RTP framing, and hence is only defined 519 for transfer via RTP [RFC3550]. Transport within other framing 520 protocols is not defined at this time. 522 Author: 524 Magnus Westerlund 526 Ingemar Johansson 528 Karl Hellwig 530 Change controller: 532 IETF Audio/Video Transport working group delegated from the IESG. 534 7.2. Mapping to SDP 536 The information carried in the media type specification has a 537 specific mapping to fields in the Session Description Protocol (SDP) 538 [RFC4566], which is commonly used to describe RTP sessions. When SDP 539 is used to specify sessions employing the GSM-HR codec, the mapping 540 is as follows: 542 o The media type ("audio") goes in SDP "m=" as the media name. 544 o The media subtype (payload format name) goes in SDP "a=rtpmap" as 545 the encoding name. 
The RTP clock rate in "a=rtpmap" MUST be 8000, 546 and the encoding parameters (number of channels) MUST either be 547 explicitly set to 1 or omitted, implying a default value of 1. 549 o The parameters "ptime" and "maxptime" go in the SDP "a=ptime" and 550 "a=maxptime" attributes, respectively. 552 o Any remaining parameters go in the SDP "a=fmtp" attribute by 553 copying them directly from the media type parameter string as a 554 semicolon-separated list of parameter=value pairs. 556 7.2.1. Offer/Answer Considerations 558 The following considerations apply when using SDP Offer-Answer 559 procedures to negotiate the use of GSM-HR payload in RTP: 561 o The SDP offerer and answerer MUST generate GSM-HR packets as 562 described by the offered parameters. 564 o In most cases, the parameters "maxptime" and "ptime" will not 565 affect interoperability; however, the setting of the parameters 566 can affect the performance of the application. The SDP offer- 567 answer handling of the "ptime" parameter is described in 568 [RFC3264]. The "maxptime" parameter MUST be handled in the same 569 way. 571 o The parameter "max-red" is a stream property parameter. For 572 sendonly or sendrecv unicast media streams, the parameter declares 573 the limitation on redundancy that the stream sender will use. For 574 recvonly streams, it indicates the desired value for the stream 575 sent to the receiver. The answerer MAY change the value, but is 576 RECOMMENDED to use the same limitation as the offer declares. In 577 the case of multicast, the offerer MAY declare a limitation; this 578 SHALL be answered using the same value. A media sender using this 579 payload format is RECOMMENDED to always include the "max-red" 580 parameter. This information is likely to simplify the media 581 stream handling in the receiver. This is especially true if no 582 redundancy will be used, in which case "max-red" is set to 0. 
584     o  Any unknown media type parameter in an offer SHALL be removed in
585        the answer.

587  7.2.2.  Declarative SDP Considerations

589     In declarative usage, like SDP in RTSP [RFC2326] or SAP [RFC2974],
590     the parameters SHALL be interpreted as follows:

592     o  The stream property parameter ("max-red") is declarative, and a
593        participant MUST follow what is declared for the session.  In this
594        case it means that the receiver MUST be prepared to allocate
595        buffer memory for the given redundancy.  Any transmissions MUST
596        NOT use more redundancy than what has been declared.  More than
597        one configuration may be provided if necessary by declaring
598        multiple RTP payload types; however, the number of types should be
599        kept small.

601     o  Any "maxptime" and "ptime" values should be selected with care to
602        ensure that the session's participants can achieve reasonable
603        performance.

605  8.  IANA Considerations

607     One media type (audio/gsm-hr-08) has been defined and needs
608     registration in the media types registry; see Section 7.1.

610  9.  Congestion Control

612     The general congestion control considerations for transporting RTP
613     data apply; see RTP [RFC3550] and any applicable RTP profile like AVP
614     [RFC3551].

616     The number of frames encapsulated in each RTP payload strongly
617     influences the overall bandwidth of the RTP stream because of the
618     per-packet header overhead.  Packetizing more frames in each RTP
619     payload can reduce the number of packets sent and hence the header
620     overhead, at the expense of increased delay and reduced error
621     robustness.  If forward error correction (FEC) is used, the amount of
622     FEC-induced redundancy needs to be regulated such that the use of FEC
623     itself does not cause a congestion problem.

625  10.
Security Considerations 627 RTP packets using the payload format defined in this specification 628 are subject to the general security considerations discussed in RTP 629 [RFC3550] and any applicable profile such as AVP [RFC3551] or SAVP 630 [RFC3711]. As this format transports encoded audio, the main 631 security issues include confidentiality, integrity protection, and 632 data origin authentication of the audio itself. The payload format 633 itself does not have any built-in security mechanisms. Any suitable 634 external mechanisms, such as SRTP [RFC3711], MAY be used. 636 This payload format and the GSM-HR decoder do not exhibit any 637 significant non-uniformity in the receiver-side computational 638 complexity for packet processing, and thus are unlikely to pose a 639 denial-of-service threat due to the receipt of pathological data. 640 The payload format or the codec data does not contain any type of 641 active content such as scripts. 643 10.1. Confidentiality 645 In order to ensure confidentiality of the encoded audio, all audio 646 data bits MUST be encrypted. There is less need to encrypt the 647 payload header or the table of contents since they only carry 648 information about the frame type. This information could also be 649 useful to a third party, for example, for quality monitoring. 651 10.2. Authentication and Integrity 653 To authenticate the sender of the audio-stream, an external mechanism 654 MUST be used. It is RECOMMENDED that such a mechanism protects both 655 the complete RTP header and the payload (audio and data bits). Data 656 tampering by a man-in-the-middle attacker could replace audio content 657 and also result in erroneous depacketization/decoding that could 658 lower the audio quality. 660 11. Acknowledgements 662 The author would like to thank Xiaodong Duan, Shuaiyu Wang, Rocky 663 Wang and Ying Zhang for their initial work in this area. Many thanks 664 also go to Tomas Frankkila, Karl Hellwig for useful input and 665 comments. 
667 12. References 669 12.1. Informative References 671 [RFC2198] Perkins, C., Kouvelas, I., Hodson, O., Hardman, V., 672 Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse- 673 Parisis, "RTP Payload for Redundant Audio Data", RFC 2198, 674 September 1997. 676 [RFC2326] Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time 677 Streaming Protocol (RTSP)", RFC 2326, April 1998. 679 [RFC2974] Handley, M., Perkins, C., and E. Whelan, "Session 680 Announcement Protocol", RFC 2974, October 2000. 682 [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. 683 Norrman, "The Secure Real-time Transport Protocol (SRTP)", 684 RFC 3711, March 2004. 686 [RFC4288] Freed, N. and J. Klensin, "Media Type Specifications and 687 Registration Procedures", BCP 13, RFC 4288, December 2005. 689 [RFC4855] Casner, S., "Media Type Registration of RTP Payload 690 Formats", RFC 4855, February 2007. 692 [RFC5109] Li, A., "RTP Payload Format for Generic Forward Error 693 Correction", RFC 5109, December 2007. 695 12.2. Normative References 697 [I-D.ietf-tsvwg-udp-guidelines] 698 Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines 699 for Application Designers", 700 draft-ietf-tsvwg-udp-guidelines-11 (work in progress), 701 October 2008. 703 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 704 Requirement Levels", BCP 14, RFC 2119, March 1997. 706 [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model 707 with Session Description Protocol (SDP)", RFC 3264, 708 June 2002. 710 [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. 711 Jacobson, "RTP: A Transport Protocol for Real-Time 712 Applications", STD 64, RFC 3550, July 2003. 714 [RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and 715 Video Conferences with Minimal Control", STD 65, RFC 3551, 716 July 2003. 718 [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session 719 Description Protocol", RFC 4566, July 2006. 
721 [TS46.002] 722 3GPP, "Specification : 3GPP TS 46.002 http://www.3gpp.org/ 723 ftp/Specs/archive/46_series/46.002/46002-700.zip", 724 June 2007. 726 [TS46.020] 727 3GPP, "Specification : 3GPP TS 46.020 http://www.3gpp.org/ 728 ftp/Specs/archive/46_series/46.002/46020-700.zip", 729 June 2007. 731 Authors' Addresses 733 Magnus Westerlund 734 Ericsson AB 735 Farogatan 6 736 Stockholm, SE-164 80 737 Sweden 739 Phone: +46 8 719 0000 740 Fax: 741 Email: magnus.westerlund@ericsson.com 742 URI: 744 Karl Hellwig 745 Ericsson AB 746 Kackertstrasse 7-9 747 52072 Aachen 748 Germany 750 Phone: +49 2407 575-2054 751 Email: karl.hellwig@ericsson.com 753 Ingemar Johansson 754 Ericsson AB 755 Laboratoriegrand 11 756 SE-971 28 Lulea 757 SWEDEN 759 Phone: +46 73 0783289 760 Email: ingemar.s.johansson@ericsson.com 762 Full Copyright Statement 764 Copyright (C) The IETF Trust (2008). 766 This document is subject to the rights, licenses and restrictions 767 contained in BCP 78, and except as set forth therein, the authors 768 retain all their rights. 770 This document and the information contained herein are provided on an 771 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 772 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND 773 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS 774 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF 775 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 776 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
778 Intellectual Property 780 The IETF takes no position regarding the validity or scope of any 781 Intellectual Property Rights or other rights that might be claimed to 782 pertain to the implementation or use of the technology described in 783 this document or the extent to which any license under such rights 784 might or might not be available; nor does it represent that it has 785 made any independent effort to identify any such rights. Information 786 on the procedures with respect to rights in RFC documents can be 787 found in BCP 78 and BCP 79. 789 Copies of IPR disclosures made to the IETF Secretariat and any 790 assurances of licenses to be made available, or the result of an 791 attempt made to obtain a general license or permission for the use of 792 such proprietary rights by implementers or users of this 793 specification can be obtained from the IETF on-line IPR repository at 794 http://www.ietf.org/ipr. 796 The IETF invites any interested party to bring to its attention any 797 copyrights, patents or patent applications, or other proprietary 798 rights that may cover technology that may be required to implement 799 this standard. Please address the information to the IETF at 800 ietf-ipr@ietf.org.