idnits 2.17.1

draft-ietf-rtcweb-video-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (February 13, 2015) is 3360 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'RFC4175' is defined on line 345, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC4421' is defined on line 348, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC5104' is defined on line 352, but no explicit
     reference was found in the text

  -- Possible downref: Non-RFC (?) normative reference: ref. 'H264'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'HSUP1'

  == Outdated reference: A later version (-17) exists of
     draft-ietf-payload-vp8-11

  == Outdated reference: A later version (-19) exists of
     draft-ietf-rtcweb-overview-12

  -- Possible downref: Non-RFC (?) normative reference: ref. 'IEC23001-8'

  ** Downref: Normative reference to an Informational RFC: RFC 6386

  -- Possible downref: Non-RFC (?) normative reference: ref.
     'SRGB'

  == Outdated reference: A later version (-26) exists of
     draft-ietf-rtcweb-rtp-usage-06

  == Outdated reference: A later version (-20) exists of
     draft-ietf-rtcweb-security-arch-09

  == Outdated reference: A later version (-12) exists of
     draft-ietf-rtcweb-security-06

     Summary: 1 error (**), 0 flaws (~~), 9 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    Network Working Group                                         A.B. Roach
3    Internet-Draft                                                   Mozilla
4    Intended status: Standards Track                       February 13, 2015
5    Expires: August 17, 2015

7                 WebRTC Video Processing and Codec Requirements
8                           draft-ietf-rtcweb-video-04

10   Abstract

12      This specification provides the requirements and considerations for
13      WebRTC applications to send and receive video across a network. It
14      specifies the video processing that is required, as well as video
15      codecs and their parameters.

17   Status of This Memo

19      This Internet-Draft is submitted in full conformance with the
20      provisions of BCP 78 and BCP 79.

22      Internet-Drafts are working documents of the Internet Engineering
23      Task Force (IETF). Note that other groups may also distribute
24      working documents as Internet-Drafts. The list of current Internet-
25      Drafts is at http://datatracker.ietf.org/drafts/current/.

27      Internet-Drafts are draft documents valid for a maximum of six months
28      and may be updated, replaced, or obsoleted by other documents at any
29      time. It is inappropriate to use Internet-Drafts as reference
30      material or to cite them other than as "work in progress."

32      This Internet-Draft will expire on August 17, 2015.

34   Copyright Notice

36      Copyright (c) 2015 IETF Trust and the persons identified as the
37      document authors. All rights reserved.

39      This document is subject to BCP 78 and the IETF Trust's Legal
40      Provisions Relating to IETF Documents
41      (http://trustee.ietf.org/license-info) in effect on the date of
42      publication of this document. Please review these documents
43      carefully, as they describe your rights and restrictions with respect
44      to this document. Code Components extracted from this document must
45      include Simplified BSD License text as described in Section 4.e of
46      the Trust Legal Provisions and are provided without warranty as
47      described in the Simplified BSD License.

49   Table of Contents

51      1. Introduction
52      2. Terminology
53      3. Pre and Post Processing
54        3.1. Camera Source Video
55        3.2. Screen Source Video
56      4. Stream Orientation
57      5. Mandatory to Implement Video Codec
58      6. Codec-Specific Considerations
59        6.1. VP8
60        6.2. H.264
61      7. Security Considerations
62      8. IANA Considerations
63      9. Acknowledgements
64      10. References
65        10.1. Normative References
66        10.2. Informative References
67      Author's Address

69   1. Introduction

71      One of the major functions of WebRTC endpoints is the ability to send
72      and receive interactive video. The video might come from a camera, a
73      screen recording, a stored file, or some other source.
        This
74      specification defines how the video is used and discusses special
75      considerations for processing the video. It also covers the video-
76      related algorithms WebRTC devices need to support.

78      Note that this document only discusses those issues dealing with
79      video codec handling. Issues that are related to transport of media
80      streams across the network are specified in
81      [I-D.ietf-rtcweb-rtp-usage].

83   2. Terminology

85      The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
86      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
87      document are to be interpreted as described in [RFC2119].

89   3. Pre and Post Processing

91      This section provides guidance on pre- or post-processing of video
92      streams.

94      Unless specified otherwise by the SDP or codec, the color space
95      SHOULD be sRGB [SRGB]. For clarity, this is the color space indicated
96      by codepoint 1 from "ColourPrimaries" as defined in [IEC23001-8].

98      Unless specified otherwise by the SDP or codec, the video scan
99      pattern for video codecs is Y'CbCr 4:2:0.

101  3.1. Camera Source Video

103     This document imposes no normative requirements on camera capture;
104     however, implementors are encouraged to take advantage of the
105     following features, if feasible for their platform:

107     o  Automatic focus, if applicable for the camera in use

109     o  Automatic white balance

111     o  Automatic light level control

113     o  Dynamic frame rate for video capture based on actual encoding in
114        use (e.g., if encoding at 15 fps due to bandwidth constraints, low
115        light conditions, or application settings, the camera will ideally
116        capture at 15 fps rather than a higher rate).

118  3.2. Screen Source Video

120     If the video source is some portion of a computer screen (e.g.,
121     desktop or application sharing), then the considerations in this
122     section also apply.

124     Because screen-sourced video can change resolution (due to, e.g.,
125     window resizing and similar operations), WebRTC video recipients MUST
126     be prepared to handle mid-stream resolution changes in a way that
127     preserves their utility. Precise handling (e.g., resizing the
128     element a video is rendered in versus scaling down the received
129     stream; decisions around letter/pillarboxing) is left to the
130     discretion of the application.

132     Note that the default video scan format (Y'CbCr 4:2:0) is known to be
133     less than optimal for the representation of screen content produced
134     by most systems in use at the time of this document's publication,
135     which generally use RGB with at least 24 bits per sample. In the
136     future, it may be advisable to use video codecs optimized for screen
137     content for the representation of this type of content.

139     Additionally, attention is drawn to the requirements in
140     [I-D.ietf-rtcweb-security-arch] section 5.2 and the considerations in
141     [I-D.ietf-rtcweb-security] section 4.1.1.

143  4. Stream Orientation

144     In some circumstances - and notably those involving mobile devices -
145     the orientation of the camera may not match the orientation used by
146     the encoder. Of more importance, the orientation may change over the
147     course of a call, requiring the receiver to change the orientation in
148     which it renders the stream.

150     While the sender may elect to simply change the pre-encoding
151     orientation of frames, this may not be practical or efficient (in
152     particular, in cases where the interface to the camera returns pre-
153     compressed video frames). Note that the potential for this behavior
154     adds another set of circumstances under which the resolution of a
155     screen might change in the middle of a video stream, in addition to
156     those mentioned under "Screen Source Video," above.

158     To accommodate these circumstances, RTCWEB implementations that can
159     generate media in orientations other than the default MUST support
160     generating the R0 and R1 bits of the Coordination of Video
161     Orientation (CVO) mechanism described in section 7.4.5 of [TS26.114],
162     and MUST send them for all orientations when the peer indicates
163     support for the mechanism. They MAY support sending the other bits
164     in the CVO extension, including the higher-resolution rotation bits.
165     All implementations SHOULD support interpretation of the R0 and R1
166     bits, and MAY support the other CVO bits.

168     Further, some codecs support in-band signaling of orientation (for
169     example, the SEI "Display Orientation" messages in H.264 and H.265).
170     If CVO has been negotiated, then the sender MUST NOT make use of such
171     codec-specific mechanisms. However, when support for CVO is not
172     signaled in the SDP, then such implementations MAY make use of the
173     codec-specific mechanisms instead.

175  5. Mandatory to Implement Video Codec

177     For the definitions of "WebRTC Browser," "WebRTC Non-Browser," and
178     "WebRTC-Compatible Endpoint" as they are used in this section, please
179     refer to [I-D.ietf-rtcweb-overview].

181     WebRTC Browsers MUST implement the VP8 video codec as described in
182     [RFC6386] and H.264 Constrained Baseline as described in [H264].

184     WebRTC Non-Browsers that support transmitting and/or receiving video
185     MUST implement the VP8 video codec as described in [RFC6386] and
186     H.264 Constrained Baseline as described in [H264].

188        To promote the use of non-royalty bearing video codecs,
189        participants in the RTCWEB working group, and any successor
190        working groups in the IETF, intend to monitor the evolving
191        licensing landscape as it pertains to the two mandatory-to-
192        implement codecs.
           If compelling evidence arises that one of the
193        codecs is available for use on a royalty-free basis, the working
194        group plans to revisit the question of which codecs are required
195        for Non-Browsers, with the intention being that the royalty-free
196        codec will remain mandatory to implement, and the other will
197        become optional.

199        These provisions apply to WebRTC Non-Browsers only. There is no
200        plan to revisit the codecs required for WebRTC Browsers.

202     "WebRTC-compatible endpoints" are free to implement any video codecs
203     they see fit. This follows logically from the definition of "WebRTC-
204     compatible endpoint." It is, of course, advisable to implement at
205     least one of the video codecs that are mandated for WebRTC Browsers,
206     and implementors are encouraged to do so.

208  6. Codec-Specific Considerations

210     SDP allows for codec-independent indication of preferred video
211     resolutions using the mechanism described in [RFC6236]. If a
212     recipient of video indicates a receiving resolution, the sender
213     SHOULD accommodate this resolution, as the receiver may not be
214     capable of handling higher resolutions.

216     Additionally, codecs may include codec-specific means of signaling
217     maximum receiver abilities with regard to resolution, frame rate,
218     and bitrate.

220     Unless otherwise signaled in SDP, recipients of video streams MUST be
221     able to decode video at a rate of at least 20 fps at a resolution of
222     at least 320x240. These values are selected based on the
223     recommendations in [HSUP1].

225     Encoders are encouraged to support encoding media with at least the
226     same resolution and frame rates cited above.

228  6.1. VP8

230     For the VP8 codec, defined in [RFC6386], endpoints MUST support the
231     payload formats defined in [I-D.ietf-payload-vp8].
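
As an illustration of how a sender might honor the receiver limits that the VP8 payload format can carry, the following Python sketch checks a candidate encoding against max-fr and max-fs values. It is a minimal sketch under stated assumptions, not part of this specification: it assumes that max-fs counts 16x16 macroblocks and that max-fr counts frames per second, as in the VP8 payload format; the function and parameter names are hypothetical; and any further limits the payload format places on frame width and height are not modeled.

```python
MACROBLOCK_SIZE = 16  # assumption: VP8 macroblocks cover 16x16 luma samples


def frame_size_in_macroblocks(width: int, height: int) -> int:
    """Number of macroblocks needed to cover a width x height frame."""
    mb_wide = (width + MACROBLOCK_SIZE - 1) // MACROBLOCK_SIZE   # round up
    mb_high = (height + MACROBLOCK_SIZE - 1) // MACROBLOCK_SIZE  # round up
    return mb_wide * mb_high


def fits_receiver_limits(width, height, fps, max_fs=None, max_fr=None) -> bool:
    """Return True if a stream stays within the advertised limits.

    A limit of None means the receiver did not signal that attribute
    (hypothetical convention for this sketch).
    """
    if max_fs is not None and frame_size_in_macroblocks(width, height) > max_fs:
        return False
    if max_fr is not None and fps > max_fr:
        return False
    return True
```

A sender that finds fits_receiver_limits() false for its preferred format would pick a smaller resolution or lower frame rate before encoding.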

233     In addition to the [RFC6236] mechanism, VP8 encoders MUST limit the
234     streams they send to conform to the values indicated by receivers in
235     the corresponding max-fr and max-fs SDP attributes.

237  6.2. H.264

238     For the [H264] codec, endpoints MUST support the payload formats
239     defined in [RFC6184]. In addition, they MUST support Constrained
240     Baseline Profile Level 1.2, and they SHOULD support H.264 Constrained
241     High Profile Level 1.3.

243     Implementations of the H.264 codec have utilized a wide variety of
244     optional parameters. To improve interoperability, the following
245     parameter settings are specified:

247     packetization-mode: Packetization-mode 1 MUST be supported. Other
248        modes MAY be negotiated and used.

250     profile-level-id: Implementations MUST include this parameter within
251        SDP and MUST interpret it when receiving it.

253     max-mbps, max-smbps, max-fs, max-cpb, max-dpb, and max-br: These
255        parameters allow an implementation to specify that it can
256        support certain features of H.264 at higher rates and values than
257        those signaled by its level (set with profile-level-id).
258        Implementations MAY include these parameters in their SDP, but
259        SHOULD interpret them when receiving them, allowing them to send
260        the highest quality of video possible.

262     sprop-parameter-sets: H.264 allows sequence and picture information
263        to be sent both in-band, and out-of-band. WebRTC implementations
264        MUST signal this information in-band. This means that WebRTC
265        implementations MUST NOT include this parameter in the SDP they
266        generate.

268     H.264 codecs MAY send and MUST support proper interpretation of SEI
269     "filler payload" and "full frame freeze" messages. "Full frame
270     freeze" messages are used in video switching MCUs, to ensure a stable
271     decoded displayed picture while switching among various input
272     streams.
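
The H.264 parameter rules listed earlier in this section can be sketched as a small checker. This is a minimal Python illustration, not part of this specification: the function names are hypothetical, the input is assumed to be the semicolon-separated parameter portion of an fmtp line, and profile-level-id is assumed to follow the [RFC6184] layout of three hex-encoded bytes (profile_idc, profile-iop, level_idc). Constrained Baseline is recognized only in one common form (profile_idc 66 with constraint_set1_flag set), which is a simplification.

```python
def parse_fmtp(params: str) -> dict:
    """Split an 'a=1;b=2' style fmtp parameter string into a dict."""
    fields = {}
    for item in params.split(";"):
        key, _, value = item.strip().partition("=")
        if key:
            fields[key.strip().lower()] = value.strip()
    return fields


def parse_profile_level_id(value: str) -> tuple:
    """Decode profile-level-id into (profile_idc, profile_iop, level_idc)."""
    raw = bytes.fromhex(value)  # raises ValueError on non-hex input
    if len(value) != 6 or len(raw) != 3:
        raise ValueError("profile-level-id must be exactly 6 hex digits")
    return raw[0], raw[1], raw[2]


def is_constrained_baseline(profile_idc: int, profile_iop: int) -> bool:
    """Simplified test: profile_idc 66 with constraint_set1_flag (0x40) set."""
    return profile_idc == 66 and bool(profile_iop & 0x40)


def check_generated_fmtp(params: str) -> list:
    """List violations of the rules above for SDP an implementation generates."""
    fields = parse_fmtp(params)
    problems = []
    if "profile-level-id" not in fields:
        problems.append("profile-level-id MUST be included")
    if "sprop-parameter-sets" in fields:
        problems.append("sprop-parameter-sets MUST NOT be included")
    return problems
```

For example, check_generated_fmtp("profile-level-id=42e01f;packetization-mode=1") reports no problems, and parse_profile_level_id("42e00c") yields a level_idc of 12, which corresponds to Level 1.2.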

274     When the use of the video orientation (CVO) RTP header extension is
275     not signaled as part of the SDP, H.264 implementations MAY send and
276     SHOULD support proper interpretation of Display Orientation SEI
277     messages.

279     Implementations MAY send and act upon "User data registered by Rec.
280     ITU-T T.35" and "User data unregistered" messages. Even if they do
281     not act on them, implementations MUST be prepared to receive such
282     messages without any ill effects.

284     Unless otherwise signaled, implementations that use H.264 MUST encode
285     and decode pixels with an implied 1:1 (square) aspect ratio.

287  7. Security Considerations

289     This specification does not introduce any new mechanisms or security
290     concerns beyond those in the other documents it references. In WebRTC,
291     video is protected using DTLS/SRTP. A complete discussion of the
292     security can be found in [I-D.ietf-rtcweb-security] and
293     [I-D.ietf-rtcweb-security-arch]. Implementors should consider
294     whether the use of variable bit rate video codecs is appropriate for
295     their application based on [RFC6562].

297     Implementors making use of H.264 are also advised to take careful
298     note of the "Security Considerations" section of [RFC6184], paying
299     special regard to the normative requirement pertaining to SEI
300     messages.

302  8. IANA Considerations

304     This document requires no actions from IANA.

306  9. Acknowledgements

308     The author would like to thank Gaelle Martin-Cocher, Stephan Wenger,
309     and Bernard Aboba for their detailed feedback and assistance with
310     this document. Thanks to Cullen Jennings for providing text and
311     review. This draft includes text from draft-cbran-rtcweb-codec.

313  10. References

315  10.1. Normative References

317     [H264]     ITU-T Recommendation H.264, "Advanced video coding for
318                generic audiovisual services (V9)", February 2014.

321     [HSUP1]    ITU-T Recommendation H.Sup1, "Application profile - Sign
322                language and lip-reading real-time conversation using low
323                bit rate video communication", May 1999.

326     [I-D.ietf-payload-vp8]
327                Westin, P., Lundin, H., Glover, M., Uberti, J., and F.
328                Galligan, "RTP Payload Format for VP8 Video",
329                draft-ietf-payload-vp8-11 (work in progress), February 2014.

331     [I-D.ietf-rtcweb-overview]
332                Alvestrand, H., "Overview: Real Time Protocols for
333                Browser-based Applications", draft-ietf-rtcweb-overview-12
334                (work in progress), October 2014.

336     [IEC23001-8]
337                ISO/IEC 23001-8:2013/DCOR1, "Coding independent media
338                description code points", 2013.

342     [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
343                Requirement Levels", BCP 14, RFC 2119, March 1997.

345     [RFC4175]  Gharai, L. and C. Perkins, "RTP Payload Format for
346                Uncompressed Video", RFC 4175, September 2005.

348     [RFC4421]  Perkins, C., "RTP Payload Format for Uncompressed Video:
349                Additional Colour Sampling Modes", RFC 4421, February
350                2006.

352     [RFC5104]  Wenger, S., Chandra, U., Westerlund, M., and B. Burman,
353                "Codec Control Messages in the RTP Audio-Visual Profile
354                with Feedback (AVPF)", RFC 5104, February 2008.

356     [RFC6184]  Wang, Y.-K., Even, R., Kristensen, T., and R. Jesup, "RTP
357                Payload Format for H.264 Video", RFC 6184, May 2011.

359     [RFC6236]  Johansson, I. and K. Jung, "Negotiation of Generic Image
360                Attributes in the Session Description Protocol (SDP)", RFC
361                6236, May 2011.

363     [RFC6386]  Bankoski, J., Koleszar, J., Quillio, L., Salonen, J.,
364                Wilkins, P., and Y. Xu, "VP8 Data Format and Decoding
365                Guide", RFC 6386, November 2011.

367     [RFC6562]  Perkins, C. and JM. Valin, "Guidelines for the Use of
368                Variable Bit Rate Audio with Secure RTP", RFC 6562, March
369                2012.

371     [SRGB]     IEC 61966-2-1, "Multimedia systems and equipment - Colour
372                measurement and management - Part 2-1: Colour management -
373                Default RGB colour space - sRGB.", October 1999.

376     [TS26.114]
377                3GPP TS 26.114 V12.8.0, "3rd Generation Partnership
378                Project; Technical Specification Group Services and System
379                Aspects; IP Multimedia Subsystem (IMS); Multimedia
380                Telephony; Media handling and interaction (Release 12)",
381                December 2014.

383  10.2. Informative References

385     [I-D.ietf-rtcweb-rtp-usage]
386                Perkins, C., Westerlund, M., and J. Ott, "Web Real-Time
387                Communication (WebRTC): Media Transport and Use of RTP",
388                draft-ietf-rtcweb-rtp-usage-06 (work in progress),
389                February 2013.

391     [I-D.ietf-rtcweb-security-arch]
392                Rescorla, E., "WebRTC Security Architecture",
393                draft-ietf-rtcweb-security-arch-09 (work in progress),
                   February 2014.

395     [I-D.ietf-rtcweb-security]
396                Rescorla, E., "Security Considerations for WebRTC",
397                draft-ietf-rtcweb-security-06 (work in progress),
                   January 2014.

399  Author's Address

401     Adam Roach
402     Mozilla
404     Dallas
405     US

407     Phone: +1 650 903 0800 x863
408     Email: adam@nostrum.com