idnits 2.17.1

draft-lennox-avt-rtp-audio-level-exthdr-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** The document seems to lack a License Notice according IETF Trust
     Provisions of 28 Dec 2009, Section 6.b.ii or Provisions of 12 Sep 2009
     Section 6.b -- however, there's a paragraph with a matching beginning.
     Boilerplate error?

     (You're using the IETF Trust Provisions' Section 6.b License Notice from
     12 Feb 2009 rather than one of the newer Notices.  See
     https://trustee.ietf.org/license-info/.)

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to contain a disclaimer for pre-RFC5378 work, but was
     first submitted on or after 10 November 2008.  The disclaimer is usually
     necessary only for documents that revise or obsolete older RFCs, and that
     take significant amounts of text from those RFCs.  If you can contact all
     authors of the source material and they are willing to grant the BCP78
     rights to the IETF Trust, you can and should remove the disclaimer.
     Otherwise, the disclaimer is needed and you can ignore this comment.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (June 17, 2009) is 5427 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 5285 (Obsoleted by RFC 8285)

  == Outdated reference: A later version (-04) exists of
     draft-ivov-avt-slic-00

  == Outdated reference: A later version (-05) exists of
     draft-perkins-avt-srtp-vbr-audio-00

     Summary: 2 errors (**), 0 flaws (~~), 4 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

AVT                                                            J. Lennox
Internet-Draft                                                     Vidyo
Intended status: Standards Track                           June 17, 2009
Expires: December 19, 2009


 A Real-Time Transport Protocol (RTP) Extension Header for Audio Level
                               Indication
               draft-lennox-avt-rtp-audio-level-exthdr-00

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.  This document may contain material
   from IETF Documents or IETF Contributions published or made publicly
   available before November 10, 2008.  The person(s) controlling the
   copyright in some of this material may not have granted the IETF
   Trust the right to allow modifications of such material outside the
   IETF Standards Process.  Without obtaining an adequate license from
   the person(s) controlling the copyright in such materials, this
   document may not be modified outside the IETF Standards Process, and
   derivative works of it may not be created outside the IETF Standards
   Process, except to format it for publication as an RFC or to
   translate it into languages other than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.
   Note that other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on December 19, 2009.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   This document defines a mechanism by which packets of Real-Time
   Transport Protocol (RTP) audio streams can indicate, in an RTP
   extension header, the audio level of the audio sample carried in the
   RTP packet.  In large conferences, this can reduce the load on an
   audio mixer or other middlebox which wants to forward only a few of
   the loudest audio streams, without requiring it to decode and measure
   every stream that is received.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Audio Levels
   4.  Signaling (Setup) Information
   5.  Security Considerations
   6.  IANA Considerations
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Appendix A.  Open issues
   Author's Address

1.  Introduction

   In a centralized Real-Time Transport Protocol (RTP) [RFC3550] audio
   conference, an audio mixer or forwarder receives audio streams from
   many or all of the conference participants.  It then selectively
   forwards some of them to other participants in the conference.  In
   large conferences, it is possible that such a server might be
   receiving a large number of streams, of which only a few should be
   forwarded to the other conference participants.

   In such a scenario, in order to pick the audio streams to forward, a
   centralized server needs to decode, measure audio levels, and
   possibly perform voice activity detection on audio data from a large
   number of streams.  The need for such processing limits the size or
   number of conferences such a server can support.

   As an alternative, this document defines an RTP header extension
   [RFC5285] through which senders of audio packets can indicate the
   audio level of the packets' payload, reducing the processing load for
   a server.

   The header extension in this draft is different from, but
   complementary to, the one defined in [I-D.ivov-avt-slic], which
   defines a mechanism by which audio mixers can indicate the relative
   levels of the contributing sources that made up the mixed audio.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119] and
   indicate requirement levels for compliant implementations.

3.  Audio Levels

   The audio level extension header carries both the level of the audio
   carried in the RTP payload of the packet it is associated with and an
   indication as to whether voice activity has been detected in the
   packet.

   The form of the audio level extension block is as follows:

       0                   1                   2
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |  ID   | len=1 |0|   level     |V|  reserved   |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                                 Figure 1

   The length field takes the value 1 to indicate that 2 bytes follow.

   The audio level is defined in the same manner as the audio noise
   level in the RTP Comfort Noise [RFC3389] specification.  In that
   specification, the overall magnitude of the noise level is encoded
   into the first byte of the payload, with spectral information about
   the noise in subsequent bytes.  This specification's audio level
   parameter is defined so as to be identical to the comfort noise
   payload's noise-level byte.

   The magnitude of the audio level is packed into the least significant
   bits of the first payload byte of the extension header, with the most
   significant bit unused and set to 0 as shown in Figure 1.  The least
   significant bit of the audio level magnitude is packed into the least
   significant bit of the byte.

   The audio level is expressed in -dBov, with values from 0 to 127
   representing 0 to -127 dBov. dBov is the level, in decibels, relative
   to the overload point of the system, i.e. the maximum-amplitude
   signal that can be handled by the system without clipping.  (Note:
   Representation relative to the overload point of a system is
   particularly useful for digital implementations, since one does not
   need to know the relative calibration of the analog circuitry.)
   For example, in the case of u-law (audio/pcmu) audio [ITU.G711.1988],
   the 0 dBov reference would be a square wave with values +/- 8031.
   (This translates to 6.18 dBm0, relative to u-law's dBm0 definition in
   Table 6 of G.711.)

   In addition, a flag byte carries bits providing additional
   information about the audio payload carried in the media packet.  At
   this time only a single bit is defined.  The V bit indicates whether
   the encoder believes the audio packet contains voice activity (1) or
   does not (0).  The voice activity detection algorithm is unspecified
   and left implementation-specific.

   The other bits of the flag byte are reserved.  They SHOULD be set to
   zero by senders and ignored by receivers.

   When this extension header is used with RTP data sent using the RTP
   Payload for Redundant Audio Data [RFC2198], the header's data
   describes the contents of the primary encoding.

4.  Signaling (Setup) Information

   The URI for declaring this header extension in an extmap attribute is
   "urn:ietf:params:rtp-hdrext:audio-level".  There is no additional
   setup information needed for this extension (no extensionattributes).

5.  Security Considerations

   A malicious endpoint could set the values in this extension header
   falsely, claiming that audio or voice is or is not present.  It is
   not clear what could be gained by falsely claiming that audio is not
   present, but an endpoint falsely claiming that audio is present could
   perform a denial-of-service attack on an audio conference, sending
   silence in order to suppress other conference members' audio.  Thus,
   a device relying on audio level data from untrusted endpoints SHOULD
   periodically audit the level information transmitted, taking
   appropriate corrective action if endpoints appear to be sending
   incorrect data.
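
   As an illustration of the audit recommended above, the following
   Python sketch parses the three-byte element of Figure 1 and compares
   the claimed level with one measured from the decoded audio.  It is a
   minimal sketch under stated assumptions, not a normative procedure:
   the function names, the use of 16-bit linear PCM with an overload
   point of 32768, the RMS-based measurement, and the 6 dB tolerance are
   all hypothetical choices that do not come from this document.

```python
import math

# Assumption: decoded audio is 16-bit linear PCM, so the overload
# point (0 dBov) is taken to be an amplitude of 32768.
OVERLOAD_16BIT = 32768.0


def parse_audio_level_element(element: bytes):
    """Parse the 3-byte element of Figure 1.

    Returns (ext_id, level, voice), where level is in -dBov (0..127).
    """
    if len(element) != 3:
        raise ValueError("expected 1 header byte plus 2 data bytes")
    ext_id = element[0] >> 4          # upper 4 bits: extension ID
    length = element[0] & 0x0F        # lower 4 bits: len (bytes minus one)
    if length != 1:
        raise ValueError("len field must be 1 (two data bytes follow)")
    level = element[1] & 0x7F         # top bit of the level byte is unused
    voice = bool(element[2] & 0x80)   # V is the first bit of the flag byte
    return ext_id, level, voice


def measure_level(samples):
    """Measure the RMS level of PCM samples, as a -dBov value in 0..127."""
    if not samples:
        return 127
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return 127                    # digital silence: quietest value
    dbov = 20 * math.log10(rms / OVERLOAD_16BIT)
    return min(127, max(0, round(-dbov)))


def level_is_plausible(claimed, samples, tolerance_db=6):
    """Audit check: is the claimed level within tolerance of the measurement?"""
    return abs(claimed - measure_level(samples)) <= tolerance_db
```

   A server following this approach would not decode every packet, since
   that would defeat the purpose of the extension.  It could instead
   decode an occasional packet from each untrusted sender and call
   level_is_plausible() on it.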

   In the Secure Real-Time Transport Protocol (SRTP) [RFC3711], RTP
   extension headers are authenticated but not encrypted.  When this
   extension header is used, audio levels are therefore visible on a
   packet-by-packet basis to an attacker passively observing the audio
   stream.  As discussed in [I-D.perkins-avt-srtp-vbr-audio], such an
   attacker can infer a great deal of information about the
   conversation, often with phoneme-level resolution.  In scenarios
   where this is a concern, additional mechanisms SHOULD be used to
   protect the confidentiality of the extension header.

6.  IANA Considerations

   This document registers a new extension URI in the RTP Compact Header
   Extensions subregistry of the Real-Time Transport Protocol (RTP)
   Parameters registry, according to the following data:

   Extension URI:  urn:ietf:params:rtp-hdrext:audio-level
   Description:  Audio Level
   Contact:  jonathan@vidyo.com
   Reference:  RFC XXXX

7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2198]  Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
              Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse-
              Parisis, "RTP Payload for Redundant Audio Data", RFC 2198,
              September 1997.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC5285]  Singer, D. and H. Desineni, "A General Mechanism for RTP
              Header Extensions", RFC 5285, July 2008.

7.2.  Informative References

   [I-D.ivov-avt-slic]
              Ivov, E. and E. Marocco, "Delivering Conference
              Participant Sound Level Indicators in RTP Streams",
              draft-ivov-avt-slic-00 (work in progress), June 2009.

   [I-D.perkins-avt-srtp-vbr-audio]
              Perkins, C., "Guidelines for the use of Variable Bit Rate
              Audio with Secure RTP",
              draft-perkins-avt-srtp-vbr-audio-00 (work in progress),
              March 2009.

   [ITU.G711.1988]
              International Telecommunications Union, "Pulse code
              modulation (PCM) of voice frequencies",
              ITU-T Recommendation G.711, November 1988.

   [RFC3389]  Zopf, R., "Real-time Transport Protocol (RTP) Payload for
              Comfort Noise (CN)", RFC 3389, September 2002.

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, March 2004.

Appendix A.  Open issues

   o  Should this draft be merged with [I-D.ivov-avt-slic]?

   o  Would it be useful to add a fractional part to the audio level,
      e.g., to describe the audio level in an 8+8 fixed-point format?
      Due to the format of RTP extension headers, a third byte for the
      extension header is essentially "free" if no other RTP extension
      headers are in use.

   o  Are any other bits useful in the flag byte?

   o  Is there any compelling use case for providing the audio level
      without voice detection information, and if so, should the two
      pieces of information be separated?

Author's Address

   Jonathan Lennox
   Vidyo, Inc.
   433 Hackensack Avenue
   Sixth Floor
   Hackensack, NJ  07601
   US

   Email: jonathan@vidyo.com