idnits 2.17.1

draft-burman-rtcweb-h264-proposal-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (April 25, 2014) is 3648 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'H264'

  ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866)

     Summary: 1 error (**), 0 flaws (~~), 1 warning (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

RTCWEB Working Group                                           B. Burman
Internet-Draft                                                  Ericsson
Intended status: Standards Track                              M. Isomaki
Expires: October 27, 2014                                          Nokia
                                                                B. Aboba
                                                   Microsoft Corporation
                                                        G. Martin-Cocher
                                                          BlackBerry Ltd
                                                              G. Mandyam
                                              Qualcomm Innovation Center
                                                               X. Marjou
                                                                  Orange
                                                             C. Jennings
                                                            J. Rosenberg
                                                                   Cisco
                                                               D. Singer
                                                                   Apple
                                                          April 25, 2014

         H.264 as Mandatory to Implement Video Codec for WebRTC
                  draft-burman-rtcweb-h264-proposal-04

Abstract

   This document proposes that, and motivates why, H.264 should be a
   Mandatory To Implement video codec for WebRTC.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 27, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  H.264 Overview
   4.  Implementations
   5.  Deployment
   6.  Licensing
     6.1.  Royalty Free for Innovation, Low-volume Shipments
     6.2.  Higher H.264/AVC Profile Tools Bundled
     6.3.  Licensing Stability
   7.  Performance
   8.  Profile/level
   9.  Negotiation
   10. Summary
   11. IANA Considerations
   12. Security Considerations
   13. Acknowledgements
   14. References
     14.1.  Normative References
     14.2.  Informative References
   Authors' Addresses

1.  Introduction

   The selection of a Mandatory To Implement (MTI) video codec for
   WebRTC has been discussed for quite some time in the RTCWEB WG.  This
   document proposes that the H.264 video codec should be mandatory to
   implement for WebRTC implementations and gives the motivation for
   this proposal.

   The core of the proposal is that:

      H.264 Constrained Baseline Profile Level 1.2 MUST be supported as
      Mandatory To Implement video codec.

   To enable higher quality for devices capable of it:

      H.264 Constrained High Profile Level 1.3, logically extended to
      support 720p resolution at 30 Hz framerate, is RECOMMENDED.

   This draft discusses the advantages of H.264 as the authors of this
   draft see them: a richness of implementations and hardware support,
   well-known licensing conditions, good performance, and well-defined
   handling of varying device capabilities.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14, RFC 2119
   [RFC2119].

3.  H.264 Overview

   The video coding standard Advanced Video Coding (ITU-T H.264 |
   ISO/IEC 14496-10 [H264]) has been around for almost ten years by now.
   Developed jointly by MPEG and ITU-T in the Joint Video Team, it was
   published in its first version in 2003 and amended with support for
   higher-fidelity video in 2004.  Other significant updates include
   support for scalability (2007) and multiview (2009).  The codec goes
   under the names H.264, AVC and MPEG-4 Part 10.  In this memo the term
   "H.264" will be used.

   H.264 was very successful from the start and has become widely
   adopted for (video) content as well as (video) communication services
   worldwide.

   H.264 is mandatory in mobile wireless standards for multimedia
   telephony and packet switched streaming.  It is also the leading de
   facto standard for web video content delivered in HTML5 or other
   technologies, and is supported in all major web browsers, mobile
   device platforms, and desktop operating systems.

4.  Implementations

   Arguably, hardware or DSP acceleration for video encoding/decoding
   is most beneficial for devices that have relatively low capacity in
   terms of CPU and power (smaller batteries), and the most common
   devices in this category are phones and tablets.  There is a long
   list of vendors offering hardware or DSP implementations of H.264.
   In particular, all vendors of platforms for mobile high-range
   phones, smartphones, and tablets support H.264/AVC High Profile
   encoding and decoding of at least 1080p30, but those platforms are
   currently in general not used for low- to mid-range devices.
   These vendors are Qualcomm, TI, Nvidia, Renesas, Mediatek, Huawei
   Hisilicon, Intel, Broadcom, and Samsung.  Those platforms all support
   the H.264/AVC codec with dedicated hardware or DSP.  The majority of
   the implementations also support low-delay real-time applications.

   There are also other standards and specifications that support H.264.
   One notable area is wireless display standards, where H.264 support
   is pervasive among all the following leading standards:

   o  AirPlay (Apple) [AirPlay].

   o  WiDi (Intel) [WiDi].

   o  Miracast (Wi-Fi Alliance) [Miracast].

   o  Google Cast (Google) [GoogleCast].

   o  DLNA (Sony) [DLNA].

   Regarding software implementations, there is a long list of available
   implementations.  Wikipedia provides an illustration of this with
   their list [Implementations], and more implementations appear, e.g. a
   royalty-free open source implementation from Polycom including
   H.264/SVC support [Woon].  Microsoft has produced an H.264 prototype
   for use in browsers [CURtcWeb].  Not only are there standalone
   implementations available, including open source, but in addition
   recent Windows and Mac OS X versions support H.264 encoding and
   decoding.

   The WebM wiki [WEBM] shows only 3 (out of ~37) ARM SoCs which support
   VP8 encode and decode.  All (~37) support H.264.  This only
   represents a fraction of deployed SoCs.  Almost all deployed SoCs, as
   well as future designs, support H.264 encode and decode, including
   desktop (Intel x86) chipsets.

   Hardware encoder and decoder implementations typically offer a
   performance advantage of an order of magnitude or more (e.g., 1080p
   versus 360p becomes achievable) as well as power savings (e.g., tens
   of milliwatts versus many hundreds of milliwatts or even watts
   consumed just by the encoder and decoder).
   While VP8 proponents have argued that codec power is not a major
   concern relative to displays, this neglects the advances in display
   technology that put the central processor back near the top power
   consumers.

5.  Deployment

   Today, the Internet runs on H.264 for real-time video communications.
   Though not yet on the web, video communications is in widespread
   usage on the Internet.  It is supported in consumer applications both
   on the desktop and in mobile apps, provided by many players like
   Skype and Tango.  It is in widespread usage for business
   communications, in many applications like Webex, Citrix
   Go-To-Meeting, Tandberg and Polycom telepresence systems, and many
   more.  All of these are in widespread deployment and widespread
   usage, and are based on H.264.

   If we want WebRTC to be successful, we must make sure it is something
   that can be adopted by the application providers who deploy real-time
   communications on the Internet.  WebRTC needs to be for the
   developers - the people who are building applications.  And a
   critical target customer base consists of those who are already doing
   voice and video communications - the ones with the network effect and
   user bases which need to be tapped to make this technology
   successful.  If WebRTC does not embrace H.264, it risks ignoring the
   needs of one of its most important sets of potential adopters - the
   ones most eager to use it - the ones already in the market for
   real-time communications.

   It may be argued that clients can be upgraded to support any new
   codec.  Opus is mandatory despite no deployment.  However, G.711 is
   also mandatory to ensure broad adoption.  Likewise, H.264 should be
   mandatory to ensure broad video adoption, since it is as widely
   adopted in video as G.711 in voice.
   Also, video is more processing intensive than voice, and therefore
   often implemented in hardware that is not easily upgradeable.  Other
   video systems use desktop software which can also be difficult to
   broadly upgrade.  Still others provide SDKs and toolkits to third
   parties which cannot easily be upgraded.  Others have mobile apps
   which users cannot be forced to upgrade.

   It may be argued that clients must be upgraded anyway to support ICE,
   DTLS-SRTP and other WebRTC requirements.  Some will, some won't.  For
   the latter, application providers will need to build server side
   gateways.  While that adds cost and complexity, the need to transcode
   video would greatly escalate costs, perhaps making them prohibitive.
   The CPU cost for transcoding, and the corresponding impact on quality
   due to recoding and increased delays, are substantially larger than
   for transport-level gateway functions alone, perhaps large enough to
   make transcoding impractical at scale.

   It may be argued that deployed video systems and applications are
   insignificant compared to the larger number of web browsers that will
   support WebRTC.  This misses a key point.  Real-time communications
   exists amongst a set of users that can talk to each other, typically
   because they are customers of the same service.  Skype users can talk
   to each other.  Tango users can talk to each other.  There is, to
   date, relatively little federation for video between these providers,
   a problem which WebRTC is unlikely to remedy, as its causes have
   little to do with media stacks, and everything to do with business.
   Enabling real-time communications in the browser does not immediately
   create a connected user base that is the size of the web.  WebRTC is
   just a media stack; the namespace is provided by the application
   provider, as is the size of the communications network to which that
   user can connect.
   Existing communications providers greatly value their user bases, and
   those user bases define the reachable communications network.  When
   viewed through that lens, the most important thing for allowing a
   WebRTC user to reach a massive network is enabling WebRTC to be
   usable by those who have existing networks of users.  Of those, many
   are asking for H.264.

   It may be argued that WebRTC should build for the future, and not be
   constrained by the past.  This is reminiscent of the arguments made
   by those who advocated against IETF doing work on NAT or making NAT
   friendly protocols.  The hope was the same - that IETF could, through
   standards, dictate the future as we wished it - that by designing
   protocols which didn't work through NAT, we would force the industry
   to move away from NAT and embrace IPv6.  That strategy failed.  The
   Internet is a living, breathing thing, constantly evolving.  Those
   technologies which are successful are actually those which work for
   the Internet as it is today, not the Internet as we wish it could be.
   Those then allow the Internet to take a baby step forward, and from
   there, another step forward.  Successful technologies require
   consideration for transition, as it is more important than the
   target.  Just like NAT was, and still is, a reality on the Internet
   today, so too is H.264 a reality of the Internet today.  Just like we
   could not upgrade the routers and switches to eliminate NAT, so too
   are we unable to upgrade many of the Internet endpoints today to
   instantly move away from H.264.  We should learn from the past and
   define a WebRTC which can work with the applications in existence
   today, otherwise we significantly hinder the success and growth of
   WebRTC.

6.  Licensing

6.1.  Royalty Free for Innovation, Low-volume Shipments

   MPEG-LA released their AVC Patent Portfolio License already in 2004,
   and in 2010 they announced that H.264 encoded Internet video that is
   free to end users will never be charged royalties [MPEGLA].
   Real-time generated content, the content most applicable to WebRTC,
   was free already from the establishment of the MPEG-LA license
   [MPEGLA-License].  License fees for products that decode and encode
   H.264 video remain though.  Those fees [MPEGLA-Terms] are, and will
   very likely continue to be for the lifetime of the MPEG-LA pool,
   $0.20 per codec or less.

   To paraphrase, the MPEG LA license does allow up to 100K units per
   year, per legal entity/company (type "a" sublicensees in MPEG LA's
   definition), to be shipped for zero ($0) royalty cost.  This should
   be adequate for many WebRTC innovators or start-ups to try out new
   implementations on a large set of users before incurring any patent
   royalty costs, a benefit to selecting an H.264/AVC profile as the
   mandatory codec.

6.2.  Higher H.264/AVC Profile Tools Bundled

   It should be noted that when one licenses the MPEG LA H.264/AVC pool,
   patents for higher profile tools - such as CABAC, 8x8 - are bundled
   in with those required for the Constrained Baseline Profile.  Thus,
   these could optionally be used by WebRTC implementers to achieve even
   greater performance or efficiencies than using H.264 Constrained
   Baseline Profile alone.

   It can also be noted that for MPEG-LA, since one license covers both
   an encoder and decoder, there is no additional cost of using an
   encoder to an implementation that supports decoding of H.264.

6.3.  Licensing Stability

   H.264 is a mature codec with a mature and well-known licensing model.

   It is a well-established fact that not all H.264 rights holders are
   MPEG-LA pool members.
   H.264 is however an ITU/ISO/IEC international standard, developed
   under their respective patent policies, and all contributors must
   license their patents under Reasonable And Non-Discriminatory (RAND)
   terms.  In the field of video coding, most major research groups
   interested in patents do contribute to the ITU/ISO/IEC standards
   process and are therefore bound by those terms.

   VP8 is a much younger codec than H.264 and it is fair to say that the
   licensing situation is less clear than for H.264.  Google has
   provided their patent rights on VP8, including patents owned by 11
   patent holders [MpegLaVp8], under an open source friendly license
   with very restrictive reciprocity conditions.

   Recently, VP8 was adopted as Working Draft for Video Coding for
   Browsers in MPEG, which is the first step in becoming an MPEG
   standard.  As such, it will have to follow the ISO/IEC/ITU common
   patent policy [IsoIecItuPolicy], but IPR statements cannot be
   expected there for still some time.  There is no guarantee that IPR
   statements in MPEG will be royalty free (option 1); they may just as
   well be "Fair, Reasonable And Non-Discriminatory" (FRAND, option 2),
   and potential IPR owners that do not participate in this MPEG work
   are under no obligation to offer any license at all.  This indicates
   that the licensing situation for VP8 has still not settled.

7.  Performance

   Comparing video quality is difficult.  Practically no modern video
   encoding method includes any bit-exact encoding where a given (video)
   input produces a specified encoded output bitstream.  Instead, the
   encoded bitstream syntax and semantics are specified such that a
   decoder can correctly interpret it and produce a known output.  This
   is true both for H.264 and VP8.  Significant freedom is left to the
   encoder implementation to choose how to represent the encoded video,
   for example given a specific targeted bitrate.
   Thus it cannot in general be expected that any encoded video
   bitstream represents the best possible or most efficient
   representation, given the defined bitstream syntax elements available
   to that codec.  The quality actually achieved for a certain bitstream
   at any given bitrate, and how close it is to the optimum possible
   with the available syntax, depends on the performance of the
   individual encoder implementation.

   Also, the resulting experienced video quality is not only subjective,
   but also depends on the source material, on the point of operation
   and a number of other considerations.  In addition, performance can
   be measured vs. bitrate, but also vs. e.g. complexity - and here
   another can of worms can be opened because complexity depends on
   hardware used (some platforms have video codec accelerations), SW
   platform (and how efficiently it can use the hardware) and so on.  On
   top of this, different implementations can have different
   performance, and can be operated in different ways (e.g. tradeoffs
   between complexity and quality can be made).  Regardless of how a
   performance evaluation is carried out it can always be said that it
   is not "fair".  This section nevertheless attempts to shed some light
   on this subject, and specifically the performance (measured against
   bitrate) of H.264 compared to VP8.

   A number of studies [H264perf1][H264perf2][H264perf3] have been made
   to compare the compression efficiency performance between H.264 and
   VP8.  These studies show that H.264 is in general performing better
   than VP8 but the studies are not specifically targeting video
   conferencing.  While constituting independent test material that
   provides some indications, those tests do not use exactly the
   proposed profiles and levels, which calls for performing a set of
   more targeted tests.

   Google made a comparison test between VP8 and H.264 [GooglePSNR],
   providing a set of test scripts [GoogleScripts].  That test includes
   the use of rate control for both codecs.  We believe this to be a
   comparison problem since rate control is part of the encoder, which
   as said above is typically not specified in video codec standards but
   left up to individual implementations.  The quantization parameter
   (qp) level affects the rate/distortion tradeoff in video coding.
   Comparing using fixed qp-levels is what has typically been used when
   benchmarking new codecs, for example when benchmarking HEVC [H265]
   against H.264 in the JCT-VC [JCT-VC] standardization.  We are going
   to select a codec (essentially a bit stream format), not a rate
   control mechanism; once the codec is selected you can choose whatever
   rate control mechanism best suits your specific application.
   Therefore, we propose to compare the codecs with rate control off,
   using fixed quantization parameter (qp) levels.

   Ericsson made a comparison using Google's published test scripts as
   baseline and changed the parameter settings in order to make it
   possible to measure using fixed qp.  The focus of that test was to
   evaluate the best compression efficiency that could be achieved with
   both codecs since it was believed to be harder to make a fair
   comparison trying to use complexity constraints.  We used the same
   eleven sequences as in the previous Google test, but limited them to
   the first 10 seconds since they varied from 10 seconds to minutes;
   this also eased computation time.  The video resolutions used are
   640x360 @ 30 fps, 640x480 @ 30 fps, 1280x720 @ 30 fps and 1280x720 @
   50 fps.

   We used two H.264 encoder implementations:

   o  X264, which is an open-source codec that can operate in everything
      from real-time to slow

   o  JM, which is the (Joint Model) reference implementation that was
      used to develop H.264, and is very slow but attempts to be very
      efficient in terms of bits per quality

   This is a summary of the results (complete scripts and results
   available here [H264VP8Tests]):

   +----------------------------------+--------------------------------+
   | Test                             | Resulting bitrate at           |
   |                                  | equivalent quality             |
   +----------------------------------+--------------------------------+
   | X264 Constrained Baseline vs VP8 | H.264 wins by 1%               |
   | JM Constrained Baseline vs VP8   | H.264 wins by 4%               |
   | X264 Constrained High vs VP8     | H.264 wins by 25%              |
   | JM Constrained High vs VP8       | H.264 wins by 24%              |
   +----------------------------------+--------------------------------+

                  Table 1: Performance Comparison Results

   It is interesting to note that the measurements are more stable in
   this test; the variance of the percentages for the different
   sequences is now around 70, down from around 700 in Google's test.
   We believe this is due to the removal of the rate controller, which
   acts as noise on the measurements.

   It can also be noted that the Google method of calculating the rate
   differences does not give exactly the same numbers as the JCT-VC way
   of calculating Bjontegaard Delta bitrate (BD-rate) [PSNRdiff].  The
   main difference is that the JM score for Constrained High in the
   table above (Table 1) is around 29% better than VP8 if the JCT-VC way
   of calculating BD-rate is used.

   A rough complexity estimate can be obtained from the total running
   times for the tests:

   o  X264: 1 hour 3 minutes

   o  VP8: 2 hours 0 minutes

   o  JM: An order of magnitude slower

   Again, video quality is difficult to compare.
   The authors however believe that the data provided in this section
   shows that H.264 Constrained Baseline is at least on par with VP8,
   while H.264 Constrained High seems to have a clear quality advantage.
   As a final note, the new H.265/HEVC standard [H265] clearly
   outperforms all three, but the authors think it is premature to
   mandate HEVC for WebRTC.

8.  Profile/level

   H.264/AVC [H264] has a large number of encoding tools, grouped in
   functionally reasonable toolsets by codec profiles, and a wide range
   of possible implementation capability and complexity, specified by
   codec levels.  It is typically not reasonable for H.264 encoders and
   decoders to implement maximum complexity capability for all of the
   available tools.  Thus, any H.264 decoder implementation is typically
   not able to receive all possible H.264 streams.  Which streams can be
   received is described by what profile and level the decoder conforms
   to.  Any video stream produced by an H.264 encoder must keep within
   the limits defined by the intended receiving decoder's profile and
   level to ensure that the video stream can be correctly decoded.

   Profiles can be "ranked" in terms of the number of tools included,
   such that some profiles with few tools are "lower" than profiles with
   more tools.  However, profiles are typically not strictly supersets
   or subsets of each other in terms of which tools are used, so a
   strict ranking cannot be defined.  It is also in some cases possible
   to express compliance to the common subset of tools between two
   different profiles.  This is fairly well described in [RFC6184].

   When choosing a Mandatory To Implement codec, it is desirable to use
   a profile and level that is as widely supported as possible.
   Therefore, H.264 Constrained Baseline Profile Level 1.2 MUST be
   supported as Mandatory To Implement video codec.
   This is possible to support with significant margin in hardware
   devices (Section 4) and is also unlikely to cause performance
   problems for software-only implementations.  All Level definitions
   (Annex A of [H264]) include a maximum framesize in macroblocks (16*16
   pixels) as well as a maximum processing requirement in macroblocks
   per second.  That number of macroblocks per second can be almost
   freely distributed between framesize and framerate.  The maximum
   framesize for Level 1.2 corresponds to 352*288 pixels (CIF).
   Examples of allowed framesize and framerate combinations for Level
   1.2 are CIF (352*288 pixels) at 15 Hz, QVGA (320*240 pixels) at 20
   Hz, and QCIF (176*144 pixels) at 60 Hz.

   While the above profile and level will likely be possible to
   implement in any device, it is likely not sufficient for
   applications that require higher quality.  Therefore, it is
   RECOMMENDED that devices and implementations that can meet the
   additional requirements also implement at least H.264 Constrained
   High Profile Level 1.3, logically extended to support 720p resolution
   at 30 Hz framerate.  In formal specification text, this would have to
   be expressed as a restriction on a higher level.

   Note that the lowest non-extended Level that supports 720p30 is Level
   3.1, but fully supporting Level 3.1 also requires fairly high
   bitrate, large buffers, and other encoding parameters included in
   that Level definition that are likely not reasonable for the targeted
   communication scenario.  This method of extending a lower level in
   SDP (Section 9) with a smaller set of applicable parameters is fully
   in line with [RFC6184], and is already used by some video
   conferencing vendors.
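
   The macroblock arithmetic behind these level limits can be checked
   with a short sketch.  It is illustrative only and not part of the
   proposal: the function and table names are hypothetical, the limits
   are the frame size and macroblock rate values from Table A-1 of
   [H264] for the three levels discussed here, and other level
   constraints (bitrate, buffer sizes, per-dimension limits) are
   deliberately ignored.

```python
# Illustrative sketch; names are hypothetical, not taken from any library.
# Limits per level: (max frame size in macroblocks, max macroblocks/second),
# as listed in Table A-1 of the H.264 specification.
LEVEL_LIMITS = {
    "1.2": (396, 6000),
    "1.3": (396, 11880),
    "3.1": (3600, 108000),
}


def frame_size_mbs(width, height):
    """Frame size in 16x16 macroblocks, rounding partial macroblocks up."""
    return ((width + 15) // 16) * ((height + 15) // 16)


def fits_level(level, width, height, fps):
    """Check only the frame size and macroblock rate limits of a level."""
    max_fs, max_mbps = LEVEL_LIMITS[level]
    fs = frame_size_mbs(width, height)
    return fs <= max_fs and fs * fps <= max_mbps


if __name__ == "__main__":
    for name, w, h, fps in [("CIF", 352, 288, 15),
                            ("QVGA", 320, 240, 20),
                            ("QCIF", 176, 144, 60),
                            ("720p", 1280, 720, 30)]:
        fs = frame_size_mbs(w, h)
        print(name, fs, fs * fps, fits_level("1.2", w, h, fps))
```

   Under these assumptions the three example formats stay within Level
   1.2, while 720p at 30 Hz needs 3600 macroblocks per frame and 108000
   macroblocks per second, which matches the Level 3.1 limits and is far
   beyond what an unextended Level 1.3 allows.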

   When considering the main WebRTC use case, real-time communication,
   the lack of need to support interlaced image format in that context,
   the limited use of bi-predictive (B) pictures, and the added
   implementation and computation complexity that comes with interlace
   and B-picture handling suggest that Constrained High Profile should
   be preferred over High Profile as the optional codec.  Note also that
   while Constrained High Profile is currently less supported in devices
   than High Profile, any High Profile decoder will be capable of
   decoding a Constrained High Profile bitstream since it is a subset of
   High Profile.  To make a High Profile encoder support Constrained
   High Profile encoding, it will have to turn off interlace encoding
   and turn off the use of bi-prediction.

   The table below summarizes the H.264 video encoding features used by
   Constrained Baseline Profile (CBP) and Constrained High Profile
   (CHP).  For more information on the listed features, see
   [WikipediaAVC].

           +------------------------------------+-------+-------+
           | Feature                            | CBP   | CHP   |
           +------------------------------------+-------+-------+
           | Bit depth per sample               | 8     | 8     |
           | Chroma formats                     | 4:2:0 | 4:2:0 |
           | Flexible Macroblock Ordering (FMO) | No    | No    |
           | Arbitrary Slice Ordering (ASO)     | No    | No    |
           | Redundant Slices                   | No    | No    |
           | Data Partitioning                  | No    | No    |
           | SI and SP slices                   | No    | No    |
           | Interlaced coding                  | No    | No    |
           | B slices                           | No    | No    |
           | CABAC entropy coding               | No    | Yes   |
           | Monochrome 4:0:0                   | No    | Yes   |
           | 8x8 vs. 4x4 transform adaptivity   | No    | Yes   |
           | Quantization scaling matrices      | No    | Yes   |
           | Separate color QP control          | No    | Yes   |
           | Separate color plane coding        | No    | No    |
           | Predictive lossless coding         | No    | No    |
           | Weighted prediction                | No    | Yes   |
           +------------------------------------+-------+-------+

9.  Negotiation

   Given that there exists a fairly large set of defined profiles and
   levels (Section 8) in the H.264 specification, the probability is
   rather low that randomly chosen H.264 encoder and decoder
   implementations have exactly matching capabilities.  In any
   communication scenario, there is therefore a need for a decoder to be
   able to convey its maximum supported profile and level that the
   encoder must not exceed.

   In addition, and depending on the wanted use case and the conditions
   that apply at a certain communication instance, there may also be a
   need to describe the currently wanted profile and level at the start
   of the communication session, which may be lower than the maximum
   supported by the implementation.  In this scenario it may also be of
   interest to communicate from the encoder to the decoder both the
   profile and level that will actually be used and the maximum
   supported profile and level.  The reason to communicate not only the
   starting point but also the maximum is that communication conditions
   may change during the session, maybe multiple times, possibly making
   another profile and level a more appropriate choice.

   Communication of maximum supported profile and level is the only
   mandatory SDP [RFC4566] parameter in the H.264 payload format
   [RFC6184], which also includes a large set of optional parameters,
   describing available use (decoder) and intended use (encoder) of
   those parameters for a specific offered [RFC3264] stream.
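
   As an illustration of how that mandatory parameter encodes profile
   and level, the sketch below splits a profile-level-id value into the
   three bytes defined in [RFC6184]: profile_idc, profile-iop (the
   constraint set flags) and level_idc.  The function name and the
   returned dictionary are hypothetical, and special cases such as
   Level 1b are not handled; it is a minimal sketch, not a complete
   parser.

```python
# Illustrative sketch; parse_profile_level_id is a hypothetical helper.
def parse_profile_level_id(value):
    """Split a 6-hex-digit profile-level-id into its three bytes."""
    if len(value) != 6:
        raise ValueError("profile-level-id must be 6 hex digits")
    profile_idc, profile_iop, level_idc = bytes.fromhex(value)
    # constraint_set0_flag is the most significant bit of profile-iop;
    # the two least significant bits are reserved and skipped here.
    flags = [(profile_iop >> (7 - i)) & 1 for i in range(6)]
    return {
        "profile_idc": profile_idc,
        "constraint_set_flags": flags,
        "level_idc": level_idc,
    }


if __name__ == "__main__":
    # 640c0d: profile_idc 100 (High) with constraint_set4 and
    # constraint_set5 set, level_idc 13 (Level 1.3).
    print(parse_profile_level_id("640c0d"))
```

   For the value 640c0d this yields profile_idc 100 with constraint set
   flags 4 and 5 set, and level_idc 13, which corresponds to Constrained
   High Profile Level 1.3.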
   If the 720p30 capability mentioned above (Section 8) is supported as
   an extension to Constrained High Profile Level 1.3 (or higher), the
   logical level extension SHOULD be signaled in SDP using the
   following parameters as defined in Section 8.1 of [RFC6184]:

   o  profile-level-id=640c0d (or corresponding to a higher Level of
      Constrained High Profile)

   o  max-fs=3600 (or greater)

   o  max-mbps=108000 (or greater)

   o  max-br=768 (or greater, whatever the device implementation can
      support)

10.  Summary

   H.264 is widely adopted and used for a large set of video services.
   This, in turn, is because H.264 offers great performance and
   reasonable licensing terms (with manageable risks).  As a
   consequence of its adoption for many services, a multitude of
   implementations in software and hardware are available.  Another
   result of the widespread adoption is that all associated
   technologies, such as payload formats and negotiation mechanisms,
   are well defined and standardized.  In addition, using H.264 enables
   interoperability with many other services without video transcoding.

   We therefore propose to the WG that H.264 shall be mandatory to
   implement for all WebRTC endpoints that support video, according to
   the details described in Section 8 and Section 9.

11.  IANA Considerations

   This document makes no request of IANA.

   Note to RFC Editor: this section may be removed on publication as an
   RFC.

12.  Security Considerations

   No specific considerations apply to the information in this document.

13.  Acknowledgements

   The authors thank all who provided valuable descriptions, comments
   and insights about the H.264 codec on the IETF mailing lists.

14.  References

14.1.  Normative References

   [H264]     ITU-T Recommendation H.264, "Advanced video coding for
              generic audiovisual services", April 2013.
   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
              with Session Description Protocol (SDP)", RFC 3264, June
              2002.

   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
              Description Protocol", RFC 4566, July 2006.

   [RFC6184]  Wang, Y., Even, R., Kristensen, T., and R. Jesup, "RTP
              Payload Format for H.264 Video", RFC 6184, May 2011.

14.2.  Informative References

   [AirPlay]  Apple Inc, "AirPlay Overview: About AirPlay", September
              2012.

   [CURtcWeb]
              Microsoft Open Technologies, Inc., "CU-RTC-Web-Video",
              July 2013.

   [DLNA]     DLNA(R), "Technical Overview", 2013.

   [GoogleCast]
              Google, "Supported Media Types - Google Cast", October
              2013.

   [GooglePSNR]
              The WebM Project, "VP8 Results", April 2013.

   [GoogleScripts]
              The WebM Project, "VP8 vs H.264 Test Scripts", April 2013.

   [H264VP8Tests]
              Ericsson, "More H.264 vs VP8 tests", June 2013.

   [H264perf1]
              Vatolin, D., "MPEG-4 AVC/H.264 Video Codecs Comparison
              2010 - Appendixes", May 2010.

   [H264perf2]
              Shah, K., "Implementation, performance analysis and
              comparison of VP8 and H.264.", University of Texas at
              Arlington Department of Electrical Engineering, 2011.

   [H264perf3]
              De Simone, F., Goldmann, L., Lee, J., and T. Ebrahimi,
              "Performance analysis of VP8 image and video compression
              based on subjective evaluations", Ecole Polytechnique
              Fédérale de Lausanne (EPFL), Aug 2011.

   [H265]     ITU-T Recommendation H.265, "High Efficiency Video
              Coding", April 2013.

   [Implementations]
              Wikipedia, "H.264/MPEG-4 AVC products and
              implementations", April 2013.

   [IsoIecItuPolicy]
              ISO, "ISO/IEC/ITU common patent policy", April 2007.
   [JCT-VC]   ITU-T, "JCT-VC - Joint Collaborative Team on Video
              Coding".

   [MPEGLA-License]
              MPEG LA, "AVC Patent Portfolio License Briefing", May
              2009.

   [MPEGLA-Terms]
              MPEG LA, "SUMMARY OF AVC/H.264 LICENSE TERMS".

   [MPEGLA]   MPEG LA, "MPEG LA's AVC License Will Not Charge Royalties
              for Internet Video that is Free to End Users through Life
              of License", MPEGLA News Release, August 2010.

   [Miracast]
              Wi-Fi Alliance(R), "What formats does Miracast support?",
              2013.

   [MpegLaVp8]
              O'Reilly, T., "Google and MPEG LA Announce Agreement
              Covering VP8 Video Format", March 2013.

   [PSNRdiff]
              Bjontegaard, G., "Calculation of Average PSNR Differences
              between RD-Curves", ITU-T SG16 Q.6 Document VCEG-M33,
              April 2001.

   [WEBM]     The WebM Project, "ARM SoCs".

   [WiDi]     Intel Corporation, "Intel(R) Wireless Display and Intel(R)
              Pro Wireless Display", October 2013.

   [WikipediaAVC]
              Wikipedia, "H.264/MPEG-4 AVC", October 2013.

   [Woon]     Polycom, "Polycom Delivers Open Standards-Based Scalable
              Video Coding (SVC) Technology, Royalty-Free to Industry",
              October 2012.
Authors' Addresses

   Bo Burman
   Ericsson
   Farogatan 6
   Stockholm  16480
   Sweden

   Email: bo.burman@ericsson.com


   Markus Isomaki
   Nokia
   Keilalahdentie 2-4
   Espoo  FI-02150
   Finland

   Email: markus.isomaki@nokia.com


   Bernard Aboba
   Microsoft Corporation
   One Microsoft Way
   Redmond, WA  98052
   US

   Email: bernard_aboba@hotmail.com


   Gaelle Martin-Cocher
   BlackBerry Ltd
   1875 Buckhorn Gate
   Mississauga, ON  L4W 5P1
   Canada

   Email: gmartincocher@blackberry.com


   Giri Mandyam
   Qualcomm Innovation Center

   Email: mandyam@quicinc.com


   Xavier Marjou
   Orange
   2, avenue Pierre Marzin
   Lannion  22307
   France

   Email: xavier.marjou@orange.com


   Cullen Jennings
   Cisco
   170 West Tasman Drive
   San Jose, CA  95134
   United States

   Email: fluffy@cisco.com


   Jonathan Rosenberg
   Cisco
   170 West Tasman Drive
   San Jose, CA  95134
   USA

   Email: jdrosen@cisco.com


   David Singer
   Apple

   Email: singer@apple.com