RTCWEB Working Group                                           B. Burman
Internet-Draft                                                  Ericsson
Intended status: Standards Track                              M. Isomaki
Expires: April 30, 2015                                            Nokia
                                                                B. Aboba
                                                   Microsoft Corporation
                                                        G. Martin-Cocher
                                                          BlackBerry Ltd
                                                              G. Mandyam
                                              Qualcomm Innovation Center
                                                               X. Marjou
                                                                  Orange
                                                             C. Jennings
                                                            J. Rosenberg
                                                                   Cisco
                                                               D. Singer
                                                                   Apple
                                                        October 27, 2014

         H.264 as Mandatory to Implement Video Codec for WebRTC
                  draft-burman-rtcweb-h264-proposal-05

Abstract

   This document proposes that H.264 should be a Mandatory To Implement
   video codec for WebRTC, and motivates why.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 30, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  H.264 Overview
   4.  Implementations
     4.1.  Software
     4.2.  Hardware
     4.3.  Standards
   5.  Deployment
   6.  Licensing
     6.1.  Royalty Free for Innovation, Low-volume Shipments
     6.2.  Higher H.264/AVC Profile Tools Bundled
     6.3.  Licensing Stability
   7.  Performance
   8.  Profile/level
   9.  Negotiation
   10. Summary
   11. IANA Considerations
   12. Security Considerations
   13. Acknowledgements
   14. References
     14.1.  Normative References
     14.2.  Informative References
   Authors' Addresses

1.  Introduction

   The selection of a Mandatory To Implement (MTI) video codec for
   WebRTC has been discussed for quite some time in the RTCWEB WG. This
   document proposes that the H.264 video codec should be mandatory to
   implement for WebRTC implementations and motivates this proposal.

   The core of the proposal is that:

      H.264 Constrained Baseline Profile Level 1.2 MUST be supported as
      Mandatory To Implement video codec.

   To enable higher quality for devices capable of it:

      H.264 Constrained High Profile Level 1.3, logically extended to
      support 720p resolution at 30 Hz framerate, is RECOMMENDED.

   This draft discusses the advantages of H.264 as the authors see
   them: a wealth of implementations and hardware support, well-known
   licensing conditions, good performance, and well-defined handling of
   varying device capabilities.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14, RFC 2119
   [RFC2119].

3.  H.264 Overview

   The video coding standard Advanced Video Coding (ITU-T H.264 |
   ISO/IEC 14496-10 [H264]) has been around for more than ten years.
   Developed jointly by MPEG and ITU-T in the Joint Video Team, it was
   published in its first version in 2003 and amended with support for
   higher-fidelity video in 2004. Other significant updates include
   support for scalability (2007) and multiview (2009). The codec goes
   under the names H.264, AVC and MPEG-4 Part 10. In this memo the term
   "H.264" will be used.

   H.264 was very successful from the start and has become widely
   adopted for (video) content as well as (video) communication services
   worldwide.

   H.264 is mandatory in mobile wireless standards for multimedia
   telephony and packet switched streaming. It is also the leading de
   facto standard for web video content delivered in HTML5 or other
   technologies, and is supported in nearly all major web browsers,
   mobile device platforms, and desktop operating systems.

4.  Implementations

4.1.  Software

   There are many software implementations of the H.264 standard,
   including royalty-free open source code from Cisco [OpenH264] and
   Polycom [Woon], both of which support H.264/SVC (Annex G). Wikipedia
   provides an illustration of the long list of other available
   implementations [Implementations].

   The Cisco OpenH264 implementation is notable for also providing
   binaries for common platforms that can be downloaded directly to an
   end-user's device or application, so distribution royalties are paid
   by Cisco rather than the application developer. The latest Mozilla
   Firefox browser release does just that - downloads an OpenH264 binary
   - making it possible to use H.264 in WebRTC sessions [FF33].

   Microsoft has also produced an H.264 prototype for use in browsers
   [CURtcWeb]. Not only are standalone implementations available,
   including open source ones, but recent Windows and Mac OS X versions
   also support H.264 encoding and decoding.

4.2.  Hardware

   Arguably, hardware or DSP acceleration for video encoding/decoding
   is most beneficial for devices that have relatively low capacity in
   terms of CPU and power (smaller batteries), and the most common
   devices in this category are phones and tablets. There is a long
   list of vendors offering hardware or DSP implementations of H.264.
   In particular, all vendors of platforms for high-end mobile phones,
   smartphones, and tablets support H.264/AVC High Profile encoding and
   decoding at 1080p30 or better, although those platforms are in
   general not currently used in low- to mid-range devices. These
   vendors are Qualcomm, TI, Nvidia, Renesas, Mediatek, Huawei
   Hisilicon, Intel, Broadcom, and Samsung. Their platforms all support
   the H.264/AVC codec with dedicated hardware or DSP. The majority of
   the implementations also support low-delay real-time applications.

   The WebM wiki [WEBM] shows only 8 (out of ~68) SoCs which support VP8
   encode and decode. This represents only a fraction of deployed SoCs.
   Almost all deployed SoCs, as well as future designs, support H.264
   encode and decode, including desktop (Intel x86) chipsets.

   Hardware encoder and decoder implementations typically offer a
   performance advantage of an order of magnitude or more (e.g., 1080p
   becomes achievable instead of 360p) as well as power savings (e.g.,
   tens of milliwatts versus many hundreds of milliwatts or even watts
   are consumed just by the encoder and decoder). While VP8 proponents
   have argued that codec power is not a major concern relative to
   displays, this neglects the advances in display technology that put
   the central processor back among the top power consumers.

   The availability of hardware codecs for real-time communication to
   developers through public APIs is increasing. As of iOS 8, Apple
   provides an API for access to the hardware H.264 encoder and decoder
   on the iOS platforms. The APIs can be found in the Video Toolbox
   [AppleVideoToolbox]. BlackBerry recently released an API
   [BlackBerryAPI] to the hardware H.264 codec via OpenMAX-AL [OpenMAX].
   Android has provided the MediaCodec API [MediaCodec] to the hardware
   H.264 codec since version 4.1 (API 16), with enhancements and a
   Compatibility Test Suite added in 4.3 (API 18).

4.3.  Standards

   There are also other standards and specifications that support H.264.
   One notable area is wireless display standards, where H.264 support
   is pervasive among all the following leading standards:

   o  AirPlay (Apple) [AirPlay].

   o  WiDi (Intel) [WiDi].

   o  Miracast (Wi-Fi Alliance) [Miracast].

   o  Google Cast (Google) [GoogleCast].

   o  DLNA (Sony) [DLNA].

   GSMA [GSMA] has defined the following services for use in 3GPP
   [ThreeGPP] IP Multimedia Subsystem (IMS), which use H.264 Constrained
   Baseline Profile as MTI video codec:

      IR.94 IMS Profile for Conversational Video Service [IR94]

      IR.39 IMS Profile for High Definition Video Conference (HDVC)
      Service [IR39]

5.  Deployment

   Today, the Internet runs on H.264 for real-time video communications.
   Though not yet on the web, video communication is in widespread use
   on the Internet. It is supported in consumer applications both on
   the desktop and in mobile apps, provided by many players like Skype
   and Tango. It is in widespread use for business communications, in
   many applications like Webex, Citrix Go-To-Meeting, Tandberg and
   Polycom telepresence systems, and many more.

   All of these are widely deployed and widely used, and are based on
   H.264.

   Today, every single GSM/WCDMA mobile device, mobile operator network
   and mobile operating system supports the H.264 Constrained Baseline
   Profile video codec [GSMA-Codec-WP].

   If we want WebRTC to be successful, we must make sure it is something
   that can be adopted by the application providers who deploy real-time
   communications on the Internet. WebRTC needs to be for the
   developers - the people who are building applications. And a
   critical target customer base is those who are already doing voice
   and video communications - the ones with the network effect and user
   bases which need to be tapped to make this technology successful. If
   WebRTC does not embrace H.264, it risks ignoring the needs of one of
   its most important sets of potential adopters - the ones most eager
   to use it - the ones already in the market for real-time
   communications.

   It may be argued that clients can be upgraded to support any new
   codec. Opus is mandatory despite no deployment. However, G.711 is
   also mandatory to ensure broad adoption. Likewise, H.264 should be
   mandatory to ensure broad video adoption, since it is as widely
   adopted in video as G.711 is in voice. Also, video is more
   processing intensive than voice, and is therefore often implemented
   in hardware that is not easily upgradeable. Other video systems use
   desktop software, which can also be difficult to broadly upgrade.
   Still others provide SDKs and toolkits to third parties which cannot
   easily be upgraded. Others have mobile apps which users cannot be
   forced to upgrade.

   It may be argued that clients must be upgraded anyway to support ICE,
   DTLS-SRTP and other WebRTC requirements. Some will, some won't. For
   the latter, application providers will need to build server side
   gateways. While that adds cost and complexity, the need to transcode
   video would greatly escalate costs, perhaps making them prohibitive.
   The CPU cost of transcoding, and the corresponding impact on quality
   due to recoding and increased delays, are substantially larger than
   for transport-level gateway functions alone, perhaps enough to make
   it impractical at scale. This view is supported by the discussion on
   transcoding in a GSMA whitepaper [GSMA-Codec-WP], where it is
   concluded that "...to preserve end-user experience, transcoding must
   be avoided altogether".

   It may be argued that deployed video systems and applications are
   insignificant compared to the larger number of web browsers that will
   support WebRTC. This misses a key point. Real-time communications
   exists amongst a set of users that can talk to each other, typically
   because they are customers of the same service. Skype users can talk
   to each other. Tango users can talk to each other. There is, to
   date, relatively little federation for video between these providers,
   a problem which WebRTC is unlikely to remedy, as its causes have
   little to do with media stacks, and everything to do with business.
   Enabling real-time communications in the browser does not immediately
   create a connected user base that is the size of the web. WebRTC is
   just a media stack; the namespace is provided by the application
   provider, as is the size of the communications network to which that
   user can connect.
   Existing communications providers greatly value their user bases,
   and those user bases define the reachable communications network.
   Viewed through that lens, the most important thing for allowing a
   WebRTC user to reach a massive network is making WebRTC usable by
   those who have existing networks of users. Of those, many are asking
   for H.264.

   It may be argued that WebRTC should build for the future, and not be
   constrained by the past. This is reminiscent of the arguments made
   by those who advocated against IETF doing work on NAT or making NAT
   friendly protocols. The hope was the same - that IETF could, through
   standards, dictate the future as we wished it - that by designing
   protocols which didn't work through NAT, we would force the industry
   to move away from NAT and embrace IPv6. That strategy failed. The
   Internet is a living, breathing thing, constantly evolving. Those
   technologies which are successful are actually those which work for
   the Internet as it is today, not the Internet as we wish it could be.
   Those then allow the Internet to take a baby step forward, and from
   there, another step forward. Successful technologies require
   consideration of the transition, which matters more than the target.
   Just as NAT was, and still is, a reality on the Internet today, so
   too is H.264 a reality of the Internet today. Just as we could not
   upgrade the routers and switches to eliminate NAT, so too are we
   unable to upgrade many of the Internet endpoints today to instantly
   move away from H.264. We should learn from the past and define a
   WebRTC which can work with the applications in existence today;
   otherwise we significantly hinder the success and growth of WebRTC.

6.  Licensing

6.1.  Royalty Free for Innovation, Low-volume Shipments

   MPEG LA released its AVC Patent Portfolio License in 2004, and in
   2010 announced that H.264-encoded Internet video that is free to end
   users will never be charged royalties [MPEGLA]. Real-time generated
   content, the content most applicable to WebRTC, has been free since
   the establishment of the MPEG LA license [MPEGLA-License]. License
   fees for the distribution of products that decode and encode H.264
   video remain, however. Those fees [MPEGLA-Terms] are, and will very
   likely continue to be for the lifetime of the MPEG LA pool, $0.20 per
   codec or less.

   To paraphrase, the MPEG LA license allows up to 100,000 units per
   year, per legal entity/company (type "a" sublicensees in MPEG LA's
   definition), to be shipped for zero ($0) royalty cost. This should
   be adequate for many WebRTC innovators or start-ups to try out new
   implementations on a large set of users before incurring any patent
   royalty costs, which is a benefit of selecting an H.264/AVC profile
   as the mandatory codec.

6.2.  Higher H.264/AVC Profile Tools Bundled

   It should be noted that when one licenses the MPEG LA H.264/AVC pool,
   patents for higher profile tools - such as CABAC, 8x8 - are bundled
   in with those required for the Constrained Baseline Profile. Thus,
   these could optionally be used by WebRTC implementers to achieve even
   greater performance or efficiencies than using H.264 Constrained
   Baseline Profile alone.

   It can also be noted that, since one MPEG LA license covers both an
   encoder and a decoder, adding an encoder to an implementation that
   already supports H.264 decoding incurs no additional cost.

6.3.  Licensing Stability

   H.264 is a mature codec with a mature and well-known licensing model.

   It is a well-established fact that not all H.264 rights holders are
   MPEG LA pool members.
   H.264 is, however, an ITU/ISO/IEC international standard, developed
   under their respective patent policies, and all contributors must
   license their patents under Reasonable And Non-Discriminatory (RAND)
   terms. In the field of video coding, most major research groups
   interested in patents do contribute to the ITU/ISO/IEC standards
   process and are therefore bound by those terms.

   VP8 is a much younger codec than H.264 and it is fair to say that the
   licensing situation is less clear than for H.264. Google has
   provided its patent rights on VP8, including patents owned by 11
   patent holders [MpegLaVp8], under an open-source-friendly license
   with very restrictive reciprocity conditions.

   At the time of writing, VP8 (as Video Coding for Browsers in MPEG)
   is in Draft International Standard ballot until January 2015, which
   is the next-to-last step in becoming an MPEG standard. As such, it
   will have to follow the ISO/IEC/ITU common patent policy
   [IsoIecItuPolicy] before becoming an International Standard. IPR
   statements received so far in MPEG or in the ISO/IEC database
   [IEC-Declarations] include royalty-free (option 1), "Fair, Reasonable
   And Non-Discriminatory" (FRAND, option 2), and "Unwilling to grant
   license" (option 3) declarations. Potential IPR owners that do not
   participate in this MPEG work are under no obligation to offer any
   license at all. This indicates that the licensing situation for VP8
   has still not settled but tends toward a non-RF situation.

7.  Performance

   Comparing video quality is difficult. Practically no modern video
   encoding method includes any bit-exact encoding where a given (video)
   input produces a specified encoded output bitstream. Instead, the
   encoded bitstream syntax and semantics are specified such that a
   decoder can correctly interpret it and produce a known output. This
   is true both for H.264 and VP8.
   Significant freedom is left to the encoder implementation to choose
   how to represent the encoded video, for example given a specific
   targeted bitrate. Thus it cannot in general be expected that any
   encoded video bitstream is the best possible or most efficient
   representation, given the defined bitstream syntax elements available
   to that codec. The quality actually achieved by a given bitstream at
   a given bitrate, and how close it comes to the best possible with the
   available syntax, depends instead on the individual encoder
   implementation.

   Moreover, the perceived video quality is not only subjective but also
   depends on the source material, on the point of operation and a
   number of other considerations. In addition, performance can be
   measured vs. bitrate, but also vs. e.g. complexity - and here another
   can of worms is opened, because complexity depends on the hardware
   used (some platforms have video codec acceleration), the SW platform
   (and how efficiently it can use the hardware) and so on. On top of
   this, different implementations can have different performance, and
   can be operated in different ways (e.g. tradeoffs between complexity
   and quality can be made). Regardless of how a performance evaluation
   is carried out, it can always be said that it is not "fair". This
   section nevertheless attempts to shed some light on this subject, and
   specifically on the performance (measured against bitrate) of H.264
   compared to VP8.

   A number of studies [H264perf1][H264perf2][H264perf3] have been made
   to compare the compression efficiency of H.264 and VP8. These
   studies show that H.264 in general performs better than VP8, but the
   studies do not specifically target video conferencing.
   While these studies constitute independent material that gives some
   indication, they do not use exactly the proposed profiles and levels,
   which calls for a set of more targeted tests.

   Google made a comparison test between VP8 and H.264 [GooglePSNR],
   providing a set of test scripts [GoogleScripts]. That test includes
   the use of rate control for both codecs. We believe this is a
   problem for the comparison, since rate control is part of the
   encoder, which as said above is typically not specified in video
   codec standards but left up to individual implementations. The
   quantization parameter (qp) level affects the rate/distortion
   tradeoff in video coding. Comparing at fixed qp levels is what has
   typically been done when benchmarking new codecs, for example when
   benchmarking HEVC [H265] against H.264 in the JCT-VC [JCT-VC]
   standardization. We are going to select a codec (essentially a bit
   stream format), not a rate control mechanism; once the codec is
   selected, implementers can choose whatever rate control mechanism
   best suits their specific application. Therefore, we propose to
   compare the codecs with rate control off, using fixed quantization
   parameter (qp) levels.

   Ericsson made a comparison using Google's published test scripts as
   baseline and changed the parameter settings in order to make it
   possible to measure using fixed qp. The focus of that test was to
   evaluate the best compression efficiency that could be achieved with
   both codecs, since it was believed to be harder to make a fair
   comparison under complexity constraints. We used the same eleven
   sequences as in the previous Google test, but limited them to the
   first 10 seconds since they varied in length from 10 seconds to
   minutes; this also reduced computation time. The video formats used
   are 640x360 @ 30 fps, 640x480 @ 30 fps, 1280x720 @ 30 fps and
   1280x720 @ 50 fps.

   We used two H.264 encoder implementations:

   o  X264, which is an open-source encoder that can operate at speeds
      ranging from real-time to slow

   o  JM, which is the (Joint Model) reference implementation that was
      used to develop H.264, and is very slow but attempts to be very
      efficient in terms of bits per quality

   This is a summary of the results (complete scripts and results
   available here [H264VP8Tests]):

   +----------------------------------+--------------------------------+
   | Test                             | Resulting bitrate at           |
   |                                  | equivalent quality             |
   +----------------------------------+--------------------------------+
   | X264 Constrained Baseline vs VP8 | H.264 wins by 1%               |
   | JM Constrained Baseline vs VP8   | H.264 wins by 4%               |
   | X264 Constrained High vs VP8     | H.264 wins by 25%              |
   | JM Constrained High vs VP8       | H.264 wins by 24%              |
   +----------------------------------+--------------------------------+

                  Table 1: Performance Comparison Results

   It is interesting to note that the measurements are more stable in
   this test; the variance of the percentages for the different
   sequences is now around 70, down from around 700 in Google's test.
   We believe this is due to the removal of the rate controller, which
   acts as noise on the measurements.

   It can also be noted that the Google method of calculating the rate
   differences does not give exactly the same numbers as the JCT-VC way
   of calculating Bjontegaard Delta bitrate (BD-rate) [PSNRdiff]. The
   main difference is that the JM score for Constrained High in the
   table above (Table 1) is around 29% better than VP8 if the JCT-VC way
   of calculating BD-rate is used.

   A rough complexity estimate can be obtained from the total running
   times for the tests:

   o  X264: 1 hour 3 minutes

   o  VP8: 2 hours 0 minutes

   o  JM: An order of magnitude slower

   Again, video quality is difficult to compare.
   The authors nevertheless believe that the data provided in this
   section shows that H.264 Constrained Baseline is at least on par with
   VP8, while H.264 Constrained High seems to have a clear quality
   advantage. As a final note, the new H.265/HEVC standard [H265]
   clearly outperforms all three, but the authors think it is premature
   to mandate HEVC for WebRTC.

8.  Profile/level

   H.264/AVC [H264] has a large number of encoding tools, grouped into
   functionally reasonable toolsets by codec profiles, and a wide range
   of possible implementation capability and complexity, specified by
   codec levels. It is typically not reasonable for H.264 encoders and
   decoders to implement maximum complexity capability for all of the
   available tools. Thus, any given H.264 decoder implementation is
   typically not able to receive all possible H.264 streams. Which
   streams can be received is described by the profile and level the
   decoder conforms to. Any video stream produced by an H.264 encoder
   must keep within the limits defined by the intended receiving
   decoder's profile and level to ensure that the video stream can be
   correctly decoded.

   Profiles can be "ranked" in terms of the number of tools included,
   such that some profiles with few tools are "lower" than profiles with
   more tools. However, profiles are typically not strict supersets or
   subsets of each other in terms of which tools are used, so a strict
   ranking cannot be defined. It is also in some cases possible to
   express compliance with the common subset of tools between two
   different profiles. This is fairly well described in [RFC6184].

   When choosing a Mandatory To Implement codec, it is desirable to use
   a profile and level that is as widely supported as possible.
   Therefore, H.264 Constrained Baseline Profile Level 1.2 MUST be
   supported as Mandatory To Implement video codec.
   This can be supported with significant margin in hardware devices
   (Section 4) and is also unlikely to cause performance problems for
   software-only implementations. All Level definitions (Annex A of
   [H264]) include a maximum framesize in macroblocks (16*16 pixels) as
   well as a maximum processing requirement in macroblocks per second.
   That number of macroblocks per second can be almost freely
   distributed between framesize and framerate. The maximum framesize
   for Level 1.2 corresponds to 352*288 pixels (CIF). Examples of
   allowed framesize and framerate combinations for Level 1.2 are CIF
   (352*288 pixels) at 15 Hz, QVGA (320*240 pixels) at 20 Hz, and QCIF
   (176*144 pixels) at 60 Hz.

   While the above profile and level can likely be implemented in any
   device, it is likely not sufficient for applications that require
   higher quality. Therefore, it is RECOMMENDED that devices and
   implementations that can meet the additional requirements also
   implement at least H.264 Constrained High Profile Level 1.3,
   logically extended to support 720p resolution at 30 Hz framerate. In
   formal specification text this would have to be expressed as a
   restriction on a higher level.

   Note that the lowest non-extended Level that supports 720p30 is Level
   3.1, but fully supporting Level 3.1 also requires a fairly high
   bitrate, large buffers, and other encoding parameters included in
   that Level definition that are likely not reasonable for the targeted
   communication scenario. This method of extending a lower level in
   SDP (Section 9) with a smaller set of applicable parameters is fully
   in line with [RFC6184], and is already used by some video
   conferencing vendors.
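
   The framesize and framerate examples above follow from simple
   macroblock arithmetic. The Python sketch below illustrates it. The
   Level limits (maximum macroblocks per frame and per second) are
   quoted from memory of Table A-1 in Annex A of [H264] and are an
   assumption to be checked against the specification; the names
   LEVEL_LIMITS, macroblocks and max_framerate are hypothetical. Other
   Annex A constraints (bitrate, buffer sizes, maximum frame
   dimensions) are ignored.

```python
import math

# Assumed level limits, recalled from Table A-1 of H.264 Annex A.
# level -> (max macroblocks per frame, max macroblocks per second)
LEVEL_LIMITS = {
    "1.2": (396, 6000),
    "1.3": (396, 11880),
    "3.1": (3600, 108000),
}


def macroblocks(width, height):
    """Number of 16x16 macroblocks needed to cover a width x height frame."""
    return math.ceil(width / 16) * math.ceil(height / 16)


def max_framerate(level, width, height):
    """Highest frame rate (Hz) the level's macroblock budget allows at
    this frame size, or None if the frame exceeds the level's framesize."""
    max_fs, max_mbps = LEVEL_LIMITS[level]
    mbs = macroblocks(width, height)
    if mbs > max_fs:
        return None
    return max_mbps / mbs


if __name__ == "__main__":
    for name, (w, h) in {"CIF": (352, 288),
                         "QVGA": (320, 240),
                         "QCIF": (176, 144)}.items():
        print(name, macroblocks(w, h), max_framerate("1.2", w, h))
    # 720p at 30 Hz: macroblocks per frame and per second
    print("720p30", macroblocks(1280, 720), macroblocks(1280, 720) * 30)
```

   Under these assumptions, CIF, QVGA and QCIF occupy 396, 300 and 99
   macroblocks and come out at roughly 15, 20 and 60 Hz for Level 1.2.
   A 1280x720 frame at 30 Hz needs 3600 macroblocks per frame and
   108000 macroblocks per second, which is beyond Levels 1.2 and 1.3
   and corresponds to the max-fs and max-mbps values given in
   Section 9.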

   For the main WebRTC use case, real-time communication, there is no
   need to support interlaced image formats, and bi-predictive (B)
   pictures see only limited use.  Together with the added
   implementation and computational complexity that comes with
   interlace and B-picture handling, this suggests that Constrained
   High Profile should be preferred over High Profile as the optional
   codec.  Note also that while Constrained High Profile is currently
   less supported in devices than High Profile, any High Profile
   decoder will be capable of decoding a Constrained High Profile
   bitstream, since Constrained High Profile is a subset of High
   Profile.  To make a High Profile encoder support Constrained High
   Profile encoding, it has to turn off interlace encoding and the use
   of bi-prediction.

   The table below summarizes the H.264 video encoding features used by
   Constrained Baseline Profile (CBP) and Constrained High Profile
   (CHP).  For more information on the listed features, see
   [WikipediaAVC].

   +------------------------------------+-------+-------+
   | Feature                            | CBP   | CHP   |
   +------------------------------------+-------+-------+
   | Bit depth per sample               | 8     | 8     |
   | Chroma formats                     | 4:2:0 | 4:2:0 |
   | Flexible Macroblock Ordering (FMO) | No    | No    |
   | Arbitrary Slice Ordering (ASO)     | No    | No    |
   | Redundant Slices                   | No    | No    |
   | Data Partitioning                  | No    | No    |
   | SI and SP slices                   | No    | No    |
   | Interlaced coding                  | No    | No    |
   | B slices                           | No    | No    |
   | CABAC entropy coding               | No    | Yes   |
   | Monochrome 4:0:0                   | No    | Yes   |
   | 8x8 vs. 4x4 transform adaptivity   | No    | Yes   |
   | Quantization scaling matrices      | No    | Yes   |
   | Separate color QP control          | No    | Yes   |
   | Separate color plane coding        | No    | No    |
   | Predictive lossless coding         | No    | No    |
   | Weighted prediction                | No    | Yes   |
   +------------------------------------+-------+-------+

9.  Negotiation

   Given that there exists a fairly large set of defined profiles and
   levels (Section 8) in the H.264 specification, the probability is
   rather low that randomly chosen H.264 encoder and decoder
   implementations have exactly matching capabilities.  In any
   communication scenario, there is therefore a need for a decoder to
   be able to convey the maximum profile and level it supports, which
   the encoder must not exceed.

   In addition, depending on the desired use case and the conditions
   that apply at a certain communication instance, there may also be a
   need to describe the currently desired profile and level at the
   start of the communication session, which may be lower than the
   maximum supported by the implementation.  In this scenario it may
   also be of interest to communicate from the encoder to the decoder
   both the profile and level that will actually be used and the
   maximum supported profile and level.  The reason to communicate not
   only the starting point but also the maximum is that communication
   conditions may change during the session, maybe multiple times,
   possibly making another profile and level a more appropriate choice.

   Communication of maximum supported profile and level is the only
   mandatory SDP [RFC4566] parameter in the H.264 payload format
   [RFC6184], which also includes a large set of optional parameters,
   describing available use (decoder) and intended use (encoder) of
   those parameters for a specific offered [RFC3264] stream.
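
   As an illustration of how profile and level are carried, the sketch
   below splits a profile-level-id value into the three bytes described
   in [RFC6184] (profile_idc, the constraint flag byte profile-iop, and
   level_idc) and parses a semicolon-separated parameter string.  The
   helper names are hypothetical, the special handling of Level 1b is
   not modeled, and the value 42e00c is our assumption for Constrained
   Baseline Profile Level 1.2; 640c0d is the value listed later in this
   section.

```python
def parse_profile_level_id(value):
    """Split a six-hex-digit profile-level-id into its three bytes.

    Returns (profile_idc, flags, level_idc), where flags holds
    constraint_set0_flag .. constraint_set5_flag in that order.
    """
    if len(value) != 6:
        raise ValueError("profile-level-id must be six hex digits")
    profile_idc, profile_iop, level_idc = bytes.fromhex(value)
    # constraint_set0_flag is the most significant bit of profile-iop.
    flags = [(profile_iop >> (7 - i)) & 1 for i in range(6)]
    return profile_idc, flags, level_idc


def level_number(level_idc):
    """Map level_idc to a Level number (Level 1b is not handled)."""
    return level_idc / 10


def parse_fmtp_params(params):
    """Parse 'name=value;name=value' into a dict of strings."""
    result = {}
    for item in params.split(";"):
        item = item.strip()
        if not item:
            continue
        name, _, val = item.partition("=")
        result[name.strip()] = val.strip()
    return result
```

   For 640c0d this yields profile_idc 100 (High) with
   constraint_set4_flag and constraint_set5_flag set, and level_idc 13,
   that is, Level 1.3.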

   If the capability for 720p30 mentioned above (Section 8) is
   supported as an extension to Constrained High Profile Level 1.3 (or
   higher), the logical level extension SHOULD be signaled in SDP using
   the following parameters as defined in Section 8.1 of [RFC6184]:

   o  profile-level-id=640c0d (or corresponding to a higher Level of
      Constrained High Profile)

   o  max-fs=3600 (or greater)

   o  max-mbps=108000 (or greater)

   o  max-br=768 (or greater, whatever the device implementation can
      support)

10.  Summary

   H.264 is widely adopted and used for a large set of video services.
   This in turn is because H.264 offers great performance and
   reasonable licensing terms with manageable risks.  As a consequence
   of its adoption for many services, a multitude of implementations in
   software and hardware are available.  Another result of the
   widespread adoption is that all associated technologies, such as
   payload formats, negotiation mechanisms and so on, are well defined
   and standardized.  In addition, using H.264 enables interoperability
   with many other services without video transcoding.

   We therefore propose to the WG that H.264 shall be mandatory to
   implement for all WebRTC endpoints that support video, according to
   the details described in Section 8 and Section 9.

11.  IANA Considerations

   This document makes no request of IANA.

   Note to RFC Editor: this section may be removed on publication as an
   RFC.

12.  Security Considerations

   No specific considerations apply to the information in this
   document.

13.  Acknowledgements

   The authors thank all who provided valuable descriptions, comments
   and insights about the H.264 codec on the IETF mailing lists.

14.  References

14.1.  Normative References

   [H264]     ITU-T Recommendation H.264, "Advanced video coding for
              generic audiovisual services", April 2013.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
              with Session Description Protocol (SDP)", RFC 3264, June
              2002.

   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
              Description Protocol", RFC 4566, July 2006.

   [RFC6184]  Wang, Y., Even, R., Kristensen, T., and R. Jesup, "RTP
              Payload Format for H.264 Video", RFC 6184, May 2011.

14.2.  Informative References

   [AirPlay]  Apple Inc., "AirPlay Overview: About AirPlay", September
              2012.

   [AppleVideoToolbox]
              Apple Inc., "AV Foundation Programming Guide", March 2014.

   [BlackBerryAPI]
              BlackBerry Limited, "Supported codecs - BlackBerry
              Native", September 2014.

   [CURtcWeb]
              Microsoft Open Technologies, Inc., "CU-RTC-Web-Video",
              July 2013.

   [DLNA]     DLNA(R), "Technical Overview", 2013.

   [FF33]     "Cisco's OpenH264 Now Part of Firefox", October 2014.

   [GSMA]     "GSM Association", 2014.

   [GSMA-Codec-WP]
              GSM Association, "WebRTC Codecs DRAFT v1.3", September
              2014.

   [GoogleCast]
              Google, "Supported Media Types - Google Cast", October
              2013.

   [GooglePSNR]
              The WebM Project, "VP8 Results", April 2013.

   [GoogleScripts]
              The WebM Project, "VP8 vs H.264 Test Scripts", April 2013.

   [H264VP8Tests]
              Ericsson, "More H.264 vs VP8 tests", June 2013.

   [H264perf1]
              Vatolin, D., "MPEG-4 AVC/H.264 Video Codecs Comparison
              2010 - Appendixes", May 2010.

   [H264perf2]
              Shah, K., "Implementation, performance analysis and
              comparison of VP8 and H.264.", University of Texas at
              Arlington Department of Electrical Engineering, 2011.

   [H264perf3]
              De Simone, F., Goldmann, L., Lee, J., and T. Ebrahimi,
              "Performance analysis of VP8 image and video compression
              based on subjective evaluations", Ecole Polytechnique
              Fédérale de Lausanne (EPFL), August 2011.

   [H265]     ITU-T Recommendation H.265, "High Efficiency Video
              Coding", April 2013.

   [IEC-Declarations]
              International Electrotechnical Commission, "List of IEC
              patent declarations received by IEC", October 2014.

   [IR39]     GSM Association, "IMS Profile for High Definition Video
              Conference (HDVC)", May 2013.

   [IR94]     GSM Association, "IMS Profile for Conversational Video
              Service", May 2013.

   [Implementations]
              Wikipedia, "H.264/MPEG-4 AVC products and
              implementations", September 2014.

   [IsoIecItuPolicy]
              ISO, "ISO/IEC/ITU common patent policy", April 2007.

   [JCT-VC]   ITU-T, "JCT-VC - Joint Collaborative Team on Video
              Coding".

   [MPEGLA]   MPEG LA, "MPEG LA's AVC License Will Not Charge Royalties
              for Internet Video that is Free to End Users through Life
              of License", MPEGLA News Release, August 2010.

   [MPEGLA-License]
              MPEG LA, "AVC Patent Portfolio License Briefing", May
              2009.

   [MPEGLA-Terms]
              MPEG LA, "SUMMARY OF AVC/H.264 LICENSE TERMS".

   [MediaCodec]
              Android, "MediaCodec | Android Developers", October 2014.

   [Miracast]
              Wi-Fi Alliance(R), "What formats does Miracast support?",
              2013.

   [MpegLaVp8]
              O'Reilly, T., "Google and MPEG LA Announce Agreement
              Covering VP8 Video Format", March 2013.

   [OpenH264]
              "OpenH264", 2014.

   [OpenMAX]  Khronos, "OpenMAX - The Standard for Media Library
              Portability", 2014.

   [PSNRdiff]
              Bjontegaard, G., "Calculation of Average PSNR Differences
              between RD-Curves", ITU-T SG16 Q.6 Document VCEG-M33,
              April 2001.

   [ThreeGPP]
              "3rd Generation Partnership Project".

   [WEBM]     The WebM Project, "SoCs Supporting VP8/VP9", October 2014.

   [WiDi]     Intel Corporation, "Intel(R) Wireless Display and Intel(R)
              Pro Wireless Display", October 2013.

   [WikipediaAVC]
              Wikipedia, "H.264/MPEG-4 AVC", October 2013.

   [Woon]     Polycom, "Polycom Delivers Open Standards-Based Scalable
              Video Coding (SVC) Technology, Royalty-Free to Industry",
              October 2012.

Authors' Addresses

   Bo Burman
   Ericsson
   Farogatan 6
   Stockholm 16480
   Sweden

   Email: bo.burman@ericsson.com


   Markus Isomaki
   Nokia
   Keilalahdentie 2-4
   Espoo FI-02150
   Finland

   Email: markus.isomaki@nokia.com


   Bernard Aboba
   Microsoft Corporation
   One Microsoft Way
   Redmond, WA 98052
   US

   Email: bernard_aboba@hotmail.com


   Gaelle Martin-Cocher
   BlackBerry Ltd
   1875 Buckhorn Gate
   Mississauga, ON L4W 5P1
   Canada

   Email: gmartincocher@blackberry.com


   Giri Mandyam
   Qualcomm Innovation Center

   Email: mandyam@quicinc.com


   Xavier Marjou
   Orange
   2, avenue Pierre Marzin
   Lannion 22307
   France

   Email: xavier.marjou@orange.com


   Cullen Jennings
   Cisco
   170 West Tasman Drive
   San Jose, CA 95134
   United States

   Email: fluffy@cisco.com


   Jonathan Rosenberg
   Cisco
   170 West Tasman Drive
   San Jose, CA 95134
   USA

   Email: jdrosen@cisco.com


   David Singer
   Apple

   Email: singer@apple.com