Network Working Group                                          JM. Valin
Internet-Draft                                                   Mozilla
Intended status: Standards Track                                 C. Bran
Expires: February 03, 2014                                   Plantronics
                                                         August 02, 2013

             WebRTC Audio Codec and Processing Requirements
                       draft-ietf-rtcweb-audio-02

Abstract

   This document outlines the audio codec and processing requirements
   for WebRTC client applications and endpoint devices.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 03, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Codec Requirements
   4.  Audio Level
   5.  Acoustic Echo Cancellation (AEC)
   6.  Legacy VoIP Interoperability
   7.  IANA Considerations
   8.  Security Considerations
   9.  Acknowledgements
   10. Normative References
   Authors' Addresses

1.  Introduction

   An integral part of the success and adoption of Web Real-Time
   Communications (WebRTC) will be the voice and video interoperability
   between WebRTC applications.  This specification outlines the audio
   processing and codec requirements for WebRTC client implementations.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3.  Codec Requirements

   To ensure a baseline level of interoperability between WebRTC
   clients, a minimum set of required codecs is specified below.  While
   this section specifies the codecs that are mandated for all WebRTC
   client implementations, it leaves the question of supporting
   additional codecs to the discretion of the implementer.

   WebRTC clients are REQUIRED to implement the following audio codecs:

   o  Opus [RFC6716], with the payload format specified in [Opus-RTP]
      and any ptime value up to 120 ms

   o  G.711 PCMA and PCMU with one channel, a rate of 8000 Hz, and a
      ptime of 20 ms; see Section 4.5.14 of [RFC3551]

   o  Telephone Event [RFC4733]

   In all cases where the client is able to process audio at a sampling
   rate higher than 8 kHz, it is RECOMMENDED that Opus be offered before
   PCMA/PCMU.  For Opus, all modes MUST be supported on the decoder
   side.  The choice of encoder-side modes is left to the implementer.
   Clients MAY use the offer/answer mechanism to signal a preference for
   a particular mode or ptime.
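
   As an illustration of the ordering recommended above, the following
   Python sketch builds an SDP audio media section that lists Opus ahead
   of PCMU/PCMA.  It is not part of the requirements.  The payload type
   numbers 111 and 101, the port value, the "UDP/TLS/RTP/SAVPF" profile
   string, and the function name are assumptions chosen for the
   example; 0 and 8 are the static payload types that RFC 3551 assigns
   to PCMU and PCMA.

```python
# Illustrative only: the dynamic payload types (111, 101), the port and the
# transport profile are assumptions, not values taken from this document.
CODECS = [
    (111, "opus/48000/2"),          # Opus, offered first
    (0, "PCMU/8000"),               # G.711 mu-law, static payload type 0
    (8, "PCMA/8000"),               # G.711 A-law, static payload type 8
    (101, "telephone-event/8000"),  # DTMF events (RFC 4733)
]


def build_audio_media_section(port=9, ptime=20):
    """Return an SDP audio media section with Opus listed before G.711."""
    payload_types = " ".join(str(pt) for pt, _ in CODECS)
    lines = [f"m=audio {port} UDP/TLS/RTP/SAVPF {payload_types}"]
    lines += [f"a=rtpmap:{pt} {encoding}" for pt, encoding in CODECS]
    # ptime expresses a preference; the peer is not obliged to honor it.
    lines.append(f"a=ptime:{ptime}")
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    print(build_audio_media_section(), end="")
```

   The order of the payload types on the "m=" line is what expresses
   the preference, so a client that can only handle 8 kHz audio could
   reorder the list rather than remove entries.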

4.  Audio Level

   It is desirable to standardize the "on the wire" audio level for
   speech transmission so that users do not have to adjust the playback
   level manually and to facilitate mixing in conferencing applications.
   It is also desirable to be consistent with ITU-T recommendations
   G.169 and G.115, which recommend an active audio level of -19 dBm0.
   However, unlike G.169 and G.115, the audio for WebRTC is not
   constrained to have a passband specified by G.712 and can in fact be
   sampled at any sampling rate from 8 kHz to 48 kHz or higher.  For
   this reason, the level SHOULD be normalized by only considering
   frequencies above 300 Hz, regardless of the sampling rate used.  The
   level SHOULD also be adapted to avoid clipping, either by lowering
   the gain to a level below -19 dBm0, or through the use of a
   compressor.

   AUTHORS' NOTE: The idea of using the same level as what the ITU-T
   recommends is that it should improve interoperability while at the
   same time maintaining sufficient dynamic range and reducing the risk
   of clipping.  The main drawbacks are that the resulting level is
   about 12 dB lower than typical "commercial music" levels and that it
   leaves room for ill-behaved clients to be much louder than a normal
   client.  While using music-type levels is not really an option (it
   would require using the same compressor-limiters that studios use),
   it would be possible to have a level slightly higher (e.g., 3 dB)
   than what is recommended above without causing interoperability
   problems.

   Assuming 16-bit PCM with a range of +/-32767, -19 dBm0 corresponds to
   a root mean square (RMS) level of 2600.  Only active speech should be
   considered in the RMS calculation.  If the client has control over
   the entire audio capture path, as is typically the case for a regular
   phone, then it is RECOMMENDED that the gain be adjusted in such a way
   that active speech has a level of 2600 (-19 dBm0) for an average
   speaker.  If the client does not have control over the entire audio
   capture path, as is typically the case for a software client, then
   the client SHOULD use automatic gain control (AGC) to dynamically
   adjust the level to 2600 (-19 dBm0) +/- 6 dB.  For music or desktop
   sharing applications, the level SHOULD NOT be automatically adjusted
   and the client SHOULD allow the user to set the gain manually.
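
   The figures above can be checked with a short calculation.  The
   Python sketch below is illustrative only and rests on an assumption
   that this document does not state: that 0 dBm0 is taken to be the
   RMS level of a full-scale sine wave (32767 / sqrt(2)), which is one
   reading that reproduces the value 2600.  The function names are
   hypothetical, and the gain helper is a one-shot calculation rather
   than a complete AGC.

```python
import math

FULL_SCALE = 32767  # peak magnitude of 16-bit PCM, as in the text


def dbm0_to_rms(level_dbm0):
    """Convert dBm0 to a 16-bit RMS value.

    Assumption (not stated in this document): 0 dBm0 equals the RMS of a
    full-scale sine wave, FULL_SCALE / sqrt(2).
    """
    return FULL_SCALE / math.sqrt(2) * 10 ** (level_dbm0 / 20)


def rms(samples):
    """Root mean square of a non-empty sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def within_agc_window(level_rms, target_dbm0=-19.0, tolerance_db=6.0):
    """True if level_rms lies within target +/- tolerance (in dB)."""
    if level_rms <= 0:
        return False
    error_db = 20 * math.log10(level_rms / dbm0_to_rms(target_dbm0))
    return abs(error_db) <= tolerance_db


def speech_gain(active_speech, target_dbm0=-19.0):
    """Linear gain that moves active speech to the target level.

    The gain is capped so that the largest sample stays within full
    scale, which corresponds to the 'lower the gain' way of avoiding
    clipping.
    """
    if not active_speech:
        return 1.0
    level = rms(active_speech)
    if level == 0:
        return 1.0
    peak = max(abs(s) for s in active_speech)
    return min(dbm0_to_rms(target_dbm0) / level, FULL_SCALE / peak)
```

   Under that assumption, dbm0_to_rms(-19) evaluates to approximately
   2600, and the +/- 6 dB window spans roughly half to twice that
   value.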

   The RECOMMENDED filter for normalizing the signal energy is a
   second-order Butterworth high-pass filter with a 300 Hz cutoff
   frequency.
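
   A minimal sketch of such a filter follows, in Python.  It treats the
   filter as a high-pass (consistent with "only considering frequencies
   above 300 Hz") and derives the coefficients with the bilinear
   transform, using the widely published biquad formulas with
   Q = 1/sqrt(2), which yields a Butterworth response.  The function
   names are hypothetical and the design method is an assumption; this
   document does not prescribe one.

```python
import math


def butterworth_highpass_coeffs(sample_rate, cutoff_hz=300.0):
    """Second-order Butterworth high-pass via the bilinear transform.

    Returns (b, a) with b = [b0, b1, b2] and a = [a1, a2], normalized so
    that the leading denominator coefficient is 1.
    """
    q = 1 / math.sqrt(2)  # Q of a second-order Butterworth section
    w0 = 2 * math.pi * cutoff_hz / sample_rate
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b = [(1 + cos_w0) / 2 / a0, -(1 + cos_w0) / a0, (1 + cos_w0) / 2 / a0]
    a = [-2 * cos_w0 / a0, (1 - alpha) / a0]
    return b, a


def highpass(samples, sample_rate, cutoff_hz=300.0):
    """Filter samples with the high-pass above (direct form I)."""
    b, a = butterworth_highpass_coeffs(sample_rate, cutoff_hz)
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


def normalized_rms(samples, sample_rate):
    """RMS of the signal after discarding energy below about 300 Hz."""
    filtered = highpass(samples, sample_rate)
    return math.sqrt(sum(v * v for v in filtered) / len(filtered))
```

   The level returned by normalized_rms() is the quantity that would be
   compared against the target level; selecting active speech is left
   to the caller.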

   It is common for the audio output on some devices to be "calibrated"
   for playing back pre-recorded "commercial" music, which is typically
   around 12 dB louder than the level recommended in this section.
   Because of this, clients MAY increase the gain before playback.

5.  Acoustic Echo Cancellation (AEC)

   It is plausible that the dominant near- to mid-term WebRTC usage
   model will be people using the interactive audio and video
   capabilities to communicate with each other via web browsers running
   on a notebook computer that has a built-in microphone and speakers.
   The notebook-as-communication-device paradigm presents challenging
   echo cancellation problems, the specific remedy for which is not
   mandated here.  However, while no specific algorithm or standard is
   required of WebRTC-compatible clients, echo cancellation will improve
   the user experience and should be implemented by the endpoint device.

   WebRTC clients SHOULD include an AEC.  If that is not possible, the
   clients SHOULD ensure that the speaker-to-microphone gain is below
   unity at all frequencies to avoid instability when none of the
   clients has echo cancellation.  For clients that do not control the
   audio capture and playback devices directly, it is RECOMMENDED to
   support echo cancellation between devices running at slightly
   different sampling rates, such as when a webcam is used as the
   microphone.

   The client SHOULD allow either the entire AEC or the non-linear
   processing (NLP) to be turned off for applications, such as music,
   that do not behave well with the spectral attenuation methods
   typically used in NLPs.  It SHOULD have the ability to detect the
   presence of a headset and disable echo cancellation.

   For some applications where the remote client may not have an echo
   canceller, the local client MAY include a far-end echo canceller, but
   if that is the case, it SHOULD be disabled by default.
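
   The defaults and switches described in this section could be
   captured along the following lines.  This Python sketch is
   illustrative only: the EchoControl type, its field names, and the
   choose_echo_control() helper are hypothetical and do not correspond
   to any existing API.

```python
from dataclasses import dataclass


@dataclass
class EchoControl:
    """Hypothetical settings record; the names are illustrative only."""
    aec_enabled: bool = True                 # clients SHOULD include an AEC
    nlp_enabled: bool = True                 # non-linear processing stage
    far_end_canceller_enabled: bool = False  # SHOULD be off by default


def choose_echo_control(headset_detected, music_mode,
                        disable_whole_aec_for_music=False):
    """Pick echo-control settings from the conditions named in the text."""
    settings = EchoControl()
    if headset_detected:
        # With a headset, echo cancellation is disabled altogether.
        settings.aec_enabled = False
    if music_mode:
        # Music does not behave well with NLP; the application may turn
        # off only the NLP or the entire AEC.
        settings.nlp_enabled = False
        if disable_whole_aec_for_music:
            settings.aec_enabled = False
    if not settings.aec_enabled:
        # The NLP is treated here as a stage of the AEC, so it cannot
        # run alone.
        settings.nlp_enabled = False
    return settings
```

   Keeping the NLP switch separate from the AEC switch lets a music
   application retain linear echo cancellation while avoiding spectral
   attenuation.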

6.  Legacy VoIP Interoperability

   The codec requirements above will ensure, at a minimum, voice
   interoperability capabilities between WebRTC client applications and
   legacy phone systems.

7.  IANA Considerations

   This document makes no request of IANA.

   Note to RFC Editor: this section may be removed on publication as an
   RFC.

8.  Security Considerations

   The codec requirements have no additional security considerations
   other than those captured in
   [I-D.ekr-security-considerations-for-rtc-web].

9.  Acknowledgements

   This draft incorporates ideas and text from various other drafts.  In
   particular, we would like to acknowledge, and say thanks for, work we
   incorporated from Harald Alvestrand and Cullen Jennings.

10.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3551]  Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
              Video Conferences with Minimal Control", STD 65, RFC 3551,
              July 2003.

   [RFC4733]  Schulzrinne, H. and T. Taylor, "RTP Payload for DTMF
              Digits, Telephony Tones, and Telephony Signals", RFC 4733,
              December 2006.

   [RFC6716]  Valin, JM., Vos, K., and T. Terriberry, "Definition of the
              Opus Audio Codec", RFC 6716, September 2012.

   [Opus-RTP]
              Spittka, J., Vos, K., and JM. Valin, "RTP Payload Format
              for Opus Codec", August 2013.

   [I-D.ekr-security-considerations-for-rtc-web]
              Rescorla, E.K., "Security Considerations for RTC-Web", May
              2011.

Authors' Addresses

   Jean-Marc Valin
   Mozilla
   650 Castro Street
   Mountain View, CA  94041
   USA

   Email: jmvalin@jmvalin.ca

   Cary Bran
   Plantronics
   345 Encinal Street
   Santa Cruz, CA  95060
   USA

   Phone: +1 206 661-2398
   Email: cary.bran@plantronics.com