Network Working Group                                          X. Marjou
Internet-Draft                                                 S. Proust
Intended status: Informational                     France Telecom Orange
Expires: August 29, 2013                                     K. Bogineni
                                                        Verizon Wireless
                                                               R. Jesske
                                                               B. Feiten
                                                     Deutsche Telekom AG
                                                                 L. Miao
                                                                  Huawei
                                                               E. Enrico
                                                          Telecom Italia
                                                               E. Berger
                                                                   Cisco
                                                       February 25, 2013

WebRTC audio codecs for interoperability with legacy networks
draft-marjou-rtcweb-audio-codecs-for-interop-01

Abstract

This document presents use cases showing why WebRTC needs AMR-WB, AMR and G.722 as additional voice codecs to ensure satisfactory interoperability with existing systems. It also presents a way forward that takes into consideration the concerns expressed against the addition of codecs besides Opus and G.711. In particular, it is recognized that unjustified additional costs on browsers must be avoided. Therefore, the proposed solution relies entirely on the codecs already supported on the devices implementing the browsers, for which license and implementation costs have already been paid. This way forward is expected to significantly limit the costs and technical impacts on browsers while greatly improving interoperability with legacy systems and overall quality. It is intended as a good compromise beneficial to all parties and to the whole industry: the user experience will be optimized as a whole at limited additional cost, avoiding both high costs for networks to support transcoding and high costs for browsers to support additional codecs.

Status of this Memo

This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79.

Marjou, et al. Expires August 29, 2013 [Page 1]
Internet-Draft WebRTC audio codecs for interop February 2013

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on August 29, 2013.

Copyright Notice

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1. Introduction
2. Terminology
3. Definitions
4. Use cases
   4.1. AMR-WB
      4.1.1. Use case
      4.1.2. Problem
      4.1.3. Concerns from the browser manufacturers
   4.2. AMR
      4.2.1. Use case
      4.2.2. Problem
      4.2.3. Concerns from the browser manufacturers
   4.3. G.722
      4.3.1. Use case
      4.3.2. Problem
      4.3.3. Concerns from the browser manufacturers
5. The proposed way-forward
6. Security Considerations
7. IANA Considerations
8. Acknowledgements
9. References
   9.1. Normative references
   9.2. Informative references
Authors' Addresses

1. Introduction

As indicated in [I-D.ietf-rtcweb-overview], it has been anticipated that WebRTC will not remain an isolated island and that some WebRTC endpoints will need to communicate, with the help of a gateway, with devices used in other existing networks. When Opus and G.711 were selected as the mandatory-to-implement codecs, it was also agreed that possible additional codecs would be considered in order to take into account the concerns expressed about interoperability. Discussion is therefore now taking place on which additional voice/audio codecs need to be supported. The main question is whether Opus and G.711 are sufficient to properly address the interoperability issues with legacy systems or whether additional codecs need to be supported.

This document presents some use cases showing that Opus and G.711 are not sufficient to properly cover all interoperability requirements. Section 4 presents important interoperability use cases and describes the issues that would be encountered if only Opus and G.711 were supported. The document therefore advocates the addition of other codecs while addressing the concerns raised against such support.

In Section 5, a way forward is proposed that is intended as a real compromise taking into consideration the concerns expressed against additional codecs. In particular, it is recognized that unjustified additional costs on browsers must be avoided. The proposed solution therefore aims to strongly limit the cost and technical impact on browsers of supporting additional codecs (including license costs) while improving interoperability with legacy systems.

Regarding audio codecs, it is a common misconception that other existing voice networks only support G.711. In fact, existing networks include circuit-switched networks as well as voice-over-IP networks such as H.323 and SIP-based networks, so their audio codecs are not limited to G.711. For instance, from the use cases described in [I-D.ietf-rtcweb-use-cases-and-requirements], it can be foreseen that interoperability with mobile telephony systems will often be required. In such mobile systems, the user equipment (UE) must support the Adaptive Multi-Rate (AMR) speech codec and, if wideband speech communication is offered, the AMR wideband (AMR-WB) codec.

An increasing number of customers are now experiencing high-quality voice with HD Voice services over mobile networks, fixed networks or the internet. For those customers, any fallback to G.711 narrowband quality for interoperability purposes could be perceived as a strong and unacceptable quality degradation. Support of G.711 as the only codec for legacy interoperability purposes is currently not sufficient.

2. Terminology

In this document, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as described in RFC 2119 [RFC2119].

3. Definitions

Legacy networks: In this draft, legacy networks encompass the conversational networks that are already deployed, such as the PSTN, the PLMN, the IMS, and H.323 networks.

4. Use cases

4.1. AMR-WB

4.1.1. Use case

The market for personal voice communication is driven by mobile terminals, and WebRTC technology is expected to be increasingly used on smartphones. Furthermore, "HD voice" is gaining momentum, and more and more personal communication devices will support wideband communications. Customers are now getting used to the high quality offered by HD Voice mobile devices, CAT-iq fixed HD devices and, eventually, HD Voice via WebRTC and Opus over the internet.

Hence, many calls are expected in which a user of a WebRTC endpoint on a device integrating an AMR-WB module wants to communicate with another user who can only be reached on a mobile device that supports only AMR-WB (and AMR); both endpoints support HD voice quality.

4.1.2. Problem

For this use case, the best solution is for AMR-WB to be supported by both sides of the connection. Indeed, as mentioned in the introduction, AMR-WB is specified by 3GPP as the mandatory codec to be supported by wideband mobile terminals for a wide range of communication services, as described in [AMR-WB]. This includes the massively deployed circuit-switched mobile telephony services and the new multimedia telephony services over IP/IMS and 4G/VoLTE, as specified by GSMA in the voice IMS profile for VoLTE, IR.92. Hence, AMR-WB usage is growing strongly, with deployments in more than 60 networks in 45 countries and on more than 130 types of terminals (cf. [Information-Papers]).

In that use case, if Opus and G.711 remain the only codecs supported by the WebRTC endpoints, a gateway must transcode these codecs into AMR-WB, and vice versa, in order to implement the use case.
As a consequence, a large number of calls is likely to be affected by transcoding operations, degrading the user experience for many customers. This will have a very significant business impact for all service providers on both sides, not only because of the transcoding costs but mainly because of the degraded user experience.

The drawbacks of transcoding are recalled below:

o  Cost issues: transcoding places significant additional costs on network gateways, for example codec implementation and license costs, deployment costs, and testing/validation costs. However, these costs can be seen as simply transferred from the terminal side to the network side. The real issue is rather the degradation of the quality of service perceived by the end user, which will be harmful to all service providers concerned.

o  Intrinsic quality degradation: subjective test results show that intrinsic voice quality is significantly degraded by transcoding. The degradation is around 0.2 to 0.3 MOS for most transcoding use cases with AMR-WB at 12.65 kbit/s. It should be stressed that if transcoding is performed between AMR-WB and G.711, wideband voice quality will be lost. Such a bandwidth reduction clearly degrades the quality of service perceived by the user, leading to shorter and less frequent calls (see ref_gsma). Customers will no longer accept such a switch to G.711. If transcoding is performed between AMR-WB and Opus, wideband communication could be maintained. However, because wideband (WB) codecs are more complex than narrowband (NB) codecs, such WB transcoding is also more costly, and it degrades quality: MOS scores for transcoding between AMR-WB at 12.65 kbit/s and Opus at 16 kbit/s, in both directions, are significantly lower than those of AMR-WB at 12.65 kbit/s or Opus at 16 kbit/s alone.
   Furthermore, in degraded conditions, the accumulation of defects, such as audio artifacts due to packet losses, and the audio effects resulting from cascading different packet loss recovery algorithms may result in a quality below the limit acceptable to customers.

o  Degraded interactivity due to increased latency: transcoding means full de-packetization to decode the media stream (including de-jitter buffering and packet loss recovery mechanisms), followed by re-encoding, re-packetization and re-sending. The delays produced by all these operations are additive and may increase the end-to-end delay beyond acceptable limits, for example to more than 1 s of end-to-end latency.

In addition to these drawbacks related to transcoding, the following issue must be considered:

o  Efficiency over mobile radio access: AMR-WB has been designed and extensively tested for optimized and robust usage over mobile radio access, which results in enhanced capacity and efficiency. The mobile radio bearer is optimized for such a codec, with channel coding protecting its most sensitive bits. Furthermore, AMR-WB is more efficient than Opus at the bit rates typically used for mobile communication, namely 12.65 kbit/s with fallback modes down to 6.6 kbit/s. Finally, hardware-optimized implementations may reduce battery consumption.

As a consequence, re-using AMR-WB would be beneficial for the specific usage of WebRTC technology over mobile networks. With the strong growth of the smartphone market, the ability to use such a mobile codec could strongly reinforce and extend the market penetration of WebRTC technology.

4.1.3. Concerns from the browser manufacturers

The browser manufacturers are concerned about the additional costs that implementing AMR-WB would place on browsers, which include integration and test costs and codec license costs.
The proposed way forward in Section 5 carefully takes this concern into account.

4.2. AMR

4.2.1. Use case

A user of a WebRTC endpoint on a device integrating an AMR module wants to communicate with another user who can only be reached on a mobile device that supports only AMR. Although more and more terminal devices now offer "HD voice" and support AMR-WB, a large number of legacy terminals supporting only AMR (terminals with no wideband / HD Voice capabilities) are still in use.

4.2.2. Problem

For this use case, the best solution is for AMR to be supported by both sides of the connection. Indeed, AMR is specified by 3GPP as the mandatory codec to be supported by any mobile terminal for a wide range of communication services. This includes the massively deployed circuit-switched mobile telephony services and the new multimedia telephony services over IP/IMS and 4G/VoLTE, as specified by GSMA in the voice IMS profile for VoLTE, IR.92. Hundreds of millions of terminals consequently support AMR today and support neither Opus nor G.711.

In that use case, if Opus and G.711 remain the only codecs supported by the WebRTC endpoints, the same problems as described in Section 4.1.2 will be experienced, because of the impacts of transcoding (costs, quality degradation and increased latency) and the lower efficiency over mobile radio access.

As a consequence, re-using AMR would be beneficial for the specific usage of WebRTC technology over mobile networks. With the strong growth of the smartphone market, the ability to use such a mobile codec could strongly reinforce and extend the market penetration of WebRTC technology.

4.2.3. Concerns from the browser manufacturers

Same as in Section 4.1.3.

4.3. G.722

4.3.1. Use case

As mentioned in Section 4.1.1, HD Voice is gaining momentum and more and more personal communication devices support wideband communications.
Customers are getting used to high-quality voice, and WebRTC aims to provide high voice quality over the internet.

In this use case, a user of a WebRTC endpoint on a device integrating a G.722 module wants to communicate with another user who can only be reached on a device that supports only G.722 as a wideband codec. G.722 is specified by ETSI as the mandatory wideband codec for New Generation DECT (e.g., CAT-iq compliant devices).

4.3.2. Problem

For this use case, the best solution is for G.722 to be supported by both sides of the connection. Indeed, G.722 was chosen by ETSI DECT to greatly increase voice quality by extending the bandwidth from narrowband to wideband. Besides providing high wideband quality, it has low complexity and very low delay. It is widely used in HD fixed services, in both hard and soft endpoints.

In this use case, if Opus and G.711 remain the only codecs supported by the WebRTC endpoints, a gateway must transcode them from and into G.722 in order to implement the use case. As in Section 4.1.2, it should be stressed that if transcoding is performed between G.722 and G.711, wideband quality will be lost, with a fallback to narrowband. Customers, who are experiencing more and more wideband voice calls, will perceive this as a strong and unacceptable quality degradation. It is also important to recall that wideband audio can help persons with hearing impairments to use voice communication over distance; draft regulations dealing with this require wideband audio wherever there is voice communication, pointing to G.722 as the common codec, at least to ensure wideband audio interoperability between providers. On the other hand, transcoding to and from Opus will greatly increase complexity, whereas, as mentioned above, the low complexity of G.722 was a key factor in many applications mandating G.722.

4.3.3. Concerns from the browser manufacturers

Unlike AMR and AMR-WB, G.722, like G.711, is royalty-free. Some concerns about the availability of G.722 PLC were raised. Indeed, G.722 and G.711 were initially designed without Packet Loss Concealment (PLC); nevertheless, ITU-T did standardize such functionality as appendices to these recommendations, to extend the capabilities of current systems to support new applications and to follow market demand (for instance, once these standards had become widely used in VoIP applications). It has been argued that, unlike the main recommendations, these PLC appendices (G.711 Appendix I, G.722 Appendices III and IV) are subject to non-royalty-free (non-RF) IPR declarations. It should be recalled that these appendices are only examples and that implementers are free to use any PLC solution. For instance, in the G.722 case, PLC implementations are publicly available in the ITU-T Software Tool Library.

5. The proposed way-forward

It is proposed that browser manufacturers re-use the AMR, AMR-WB and G.722 codecs whenever they are already supported on the device on which the browser is implemented. AMR and AMR-WB are already supported by millions of devices, and the license costs and technical costs (implementation, tests, etc.) for these codecs, which are optimized for mobile usage, have already been paid.

Android now provides the APIs needed to access all the voice and audio features implemented in hardware that are required to develop a voice service. This includes, in particular, AMR and AMR-WB encoding and decoding, adaptive gain control, echo cancellation and noise reduction.
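
As an illustration of the principle of re-using codecs that are already present on the device, the following minimal Python sketch builds the list of codecs a browser might offer: the mandatory codecs plus whichever of AMR-WB, AMR and G.722 the device reports as natively available. The function names, the device_codecs input and the payload-type assignment policy are assumptions made for this example only; they do not correspond to any browser or Android API. The encoding names, clock rates and static payload types follow the usual RTP registrations (RFC 3551 for PCMU, PCMA and G722).

```python
# Illustrative sketch only: all names here are hypothetical, not a real API.

# Codecs every WebRTC endpoint offers (Opus and G.711).
MANDATORY = ["opus/48000/2", "PCMU/8000", "PCMA/8000"]

# Additional codecs, offered only if the device already ships them.
OPTIONAL = ["AMR-WB/16000", "AMR/8000", "G722/8000"]

# Static RTP payload types from RFC 3551; other codecs get dynamic ones.
# G722 uses an RTP clock rate of 8000 for historical reasons.
STATIC_PT = {"PCMU/8000": 0, "PCMA/8000": 8, "G722/8000": 9}


def codecs_to_offer(device_codecs):
    """Return the mandatory codecs plus optional ones the device reports."""
    extras = [codec for codec in OPTIONAL if codec in device_codecs]
    return MANDATORY + extras


def rtpmap_lines(codecs):
    """Render SDP a=rtpmap lines, assigning dynamic payload types from 96."""
    lines = []
    next_dynamic = 96
    for codec in codecs:
        payload_type = STATIC_PT.get(codec)
        if payload_type is None:
            payload_type = next_dynamic
            next_dynamic += 1
        lines.append(f"a=rtpmap:{payload_type} {codec}")
    return lines
```

Under these assumptions, a device that reports AMR-WB and AMR yields five a=rtpmap lines, while a device that reports none of the optional codecs offers only the mandatory set, so the browser itself never introduces a codec implementation of its own.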

It is therefore technically feasible for browsers to offer AMR and AMR-WB as additional WebRTC codecs without re-implementing these codecs, and so at very limited additional cost:

o  For implementation, no codec software has to be developed, tested or integrated directly in the browser (neither for encoding/decoding nor for related audio functions).

o  For licensing, fees have already been paid for the hardware implementation. Since no additional codec is implemented on the device, and only one active voice call uses the codec at a time, it cannot be argued that additional license fees have to be paid in that case. However, in order to fully guarantee this, the proposed text makes explicit that support of AMR and AMR-WB is conditional on no additional license fee being required.

This proposed way forward is expected to be a good compromise beneficial to all parties. It will optimize the user experience at limited additional cost, avoiding both high costs for networks to support transcoding and high additional costs for browsers to support additional codecs.

It is consequently proposed that the following wording be added to Section 3 (codec requirements) of [I-D.ietf-rtcweb-audio]:

   3. Audio Codecs

   3.1. Required Codecs

   To ensure a baseline level of interoperability between WebRTC clients, a minimum set of required codecs is specified below.

   WebRTC clients are REQUIRED to implement the following audio codecs.

   *  Opus [RFC6716], with any ptime value up to 120 ms

   *  G.711 PCMA and PCMU with one channel, a rate of 8000 Hz and a ptime of 20 ms - see section 4.5.14 of [RFC3551]

   *  Telephone Event - [RFC4733]

   3.2. Additional Codecs

   To ensure an enhanced level of interoperability between WebRTC clients, AMR-WB, AMR and G.722 codecs SHOULD be implemented by WebRTC end-points to avoid transcoding costs and quality degradations towards legacy fixed and mobile devices and to allow interworking with enhanced voice quality (rather than falling back to G.711 narrowband voice).

   WebRTC browsers on devices for which the implementation of AMR is mandatory for voice services MUST allow AMR to be negotiated and used at WebRTC level provided it is ensured that no additional license fees are required.

   WebRTC browsers on wideband devices for which the implementation of AMR-WB is mandatory for voice services MUST allow AMR-WB to be negotiated and used at WebRTC level provided it is ensured that no additional license fees are required.

   WebRTC browsers on devices for which the implementation of G.722 is mandatory for voice services MUST allow G.722 to be negotiated and used at WebRTC level.

Note: the wording of section "3.2. Additional Codecs" is a first proposal and an example that tries to capture the general principle explained in this document: improving interoperability while limiting the cost impact on browsers. It is subject to further modification to reach the best possible compromise.

6. Security Considerations

7. IANA Considerations

None.

8. Acknowledgements

Thanks to Milan Patel for his review.

9. References

9.1. Normative references

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2616]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

9.2. Informative references

[AMR-WB]  GSMA, "AMR-WB", 2011.

[I-D.ietf-rtcweb-audio]  Valin, J. and C. Bran, "WebRTC Audio Codec and Processing Requirements", draft-ietf-rtcweb-audio-01 (work in progress), November 2012.

[I-D.ietf-rtcweb-overview]  Alvestrand, H., "Overview: Real Time Protocols for Brower-based Applications", draft-ietf-rtcweb-overview-06 (work in progress), February 2013.

[I-D.ietf-rtcweb-use-cases-and-requirements]  Holmberg, C., Hakansson, S., and G. Eriksson, "Web Real-Time Communication Use-cases and Requirements", draft-ietf-rtcweb-use-cases-and-requirements-10 (work in progress), December 2012.

[Information-Papers]  GSMA, "Information Papers", 2013.

Authors' Addresses

Xavier Marjou
France Telecom Orange
2, avenue Pierre Marzin
Lannion 22307
France

Email: xavier.marjou@orange.com

Stephane Proust
France Telecom Orange
2, avenue Pierre Marzin
Lannion 22307
France

Email: stephane.proust@orange.com

Kalyani Bogineni
Verizon Wireless

Email: Kalyani.Bogineni@VerizonWireless.com

Roland Jesske
Deutsche Telekom AG

Email: R.Jesske@telekom.de

Bernhard Feiten
Deutsche Telekom AG

Email: R.Jesske@telekom.de

Lei Miao
Huawei

Email: lei.miao@huawei.com

Marocco
Telecom Italia

Email: enrico.marocco@telecomitalia.it

Espen Berger
Cisco

Email: espeberg@cisco.com