idnits 2.17.1

draft-lennox-avtext-lrr-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (March 9, 2015) is 3307 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-08) exists of
     draft-ietf-avtext-rtp-grouping-taxonomy-06

  ** Downref: Normative reference to an Informational draft:
     draft-ietf-avtext-rtp-grouping-taxonomy (ref.
     'I-D.ietf-avtext-rtp-grouping-taxonomy')

  == Outdated reference: A later version (-15) exists of
     draft-ietf-payload-rtp-h265-07

  == Outdated reference: A later version (-17) exists of
     draft-ietf-payload-vp8-14

  == Outdated reference: A later version (-01) exists of
     draft-uberti-payload-vp9-00

     Summary: 2 errors (**), 0 flaws (~~), 5 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

Payload Working Group                                          J. Lennox
Internet-Draft                                                   D. Hong
Intended status: Standards Track                                   Vidyo
Expires: September 10, 2015                                    J. Uberti
                                                               S. Holmer
                                                              M. Flodman
                                                                  Google
                                                           March 9, 2015


         The Layer Refresh Request (LRR) RTCP Feedback Message
                       draft-lennox-avtext-lrr-00

Abstract

   This memo describes the RTP Payload-Specific Feedback Message "Layer
   Refresh Request" (LRR), which can be used to request a state refresh
   of one or more substreams of a layered media stream.  It also defines
   its use with several scalable media formats.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 10, 2015.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions, Definitions and Acronyms
     2.1.  Terminology
   3.  Layer Refresh Request
     3.1.  Message Format
   4.  Usage with specific codecs
     4.1.  H264 SVC
     4.2.  VP8
     4.3.  H265
     4.4.  VP9
   5.  Usage with different scalability transmission mechanisms
   6.  Security Considerations
   7.  IANA Considerations
   8.  References
   Authors' Addresses

1.  Introduction

   This memo describes an RTP Payload-Specific Feedback Message
   [RFC4585] "Layer Refresh Request" (LRR).  It is designed to allow a
   receiver of a layered media stream to request that one or more of its
   substreams be refreshed, such that they can then be decoded by an
   endpoint which previously was not receiving those layers, without
   requiring that the entire stream be refreshed (as it would be if the
   receiver sent a Full Intra Request (FIR) [RFC5104]).

   The message is designed to be applicable both to temporally and
   spatially scaled streams, and to both single-stream and multi-stream
   scalability modes.

2.
Conventions, Definitions and Acronyms

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.1.  Terminology

   A "Layer Refresh Point" is a point in a scalable stream after which a
   decoder, which previously had been able to decode only some (possibly
   none) of the available layers of a stream, is able to decode a
   greater number of the layers.

   For spatial (or quality) layers, layer refresh typically requires
   that a spatial layer be encoded in a way that references only lower-
   layer subpictures of the current picture, not any earlier pictures of
   that spatial layer.  Additionally, the encoder must promise that no
   earlier pictures of that spatial layer will be used as reference in
   the future.

   An illustration of spatial layer refresh is shown below.

   ... <-- S1 <-- S1     S1 <-- S1 <-- ...
           |      |      |      |
           \/     \/     \/     \/
   ... <-- S0 <-- S0 <-- S0 <-- S0 <-- ...

           1      2      3      4

   In this illustration, frame 3 is a layer refresh point for spatial
   layer S1; a decoder which had previously only been decoding spatial
   layer S0 would be able to decode layer S1 starting at frame 3.

                                Figure 1

   For temporal layers, layer refresh requires that the layer be
   "temporally nested", i.e., that it use as reference only earlier
   frames of a lower temporal layer, not any earlier frames of this
   temporal layer, and that the encoder also promise that no future
   frames of this temporal layer will reference frames of this temporal
   layer from before the refresh point.  In many cases, the temporal
   structure of the stream will mean that all frames are temporally
   nested, in which case decoders will have no need to send LRR messages
   for the stream.

   An illustration of temporal layer refresh is shown below.

   ... <-------- T1 <------- T1          T1 <------ ...
               /           /           /
             |_          |_          |_
   ...
       <-- T0 <------- T0 <------- T0 <------- T0 <--- ...

           1     2     3     4     5     6     7

   In this illustration, frame 6 is a layer refresh point for temporal
   layer T1; a decoder which had previously only been decoding temporal
   layer T0 would be able to decode layer T1 starting at frame 6.

                                Figure 2

   An illustration of an inherently temporally nested stream is shown
   below.

                 T1          T1          T1
               /           /           /
             |_          |_          |_
   ... <-- T0 <------- T0 <------- T0 <------- T0 <--- ...

           1     2     3     4     5     6     7

   In this illustration, the stream is temporally nested in its ordinary
   structure; a decoder receiving layer T0 can begin decoding layer T1
   at any point.

                                Figure 3

3.  Layer Refresh Request

   A layer refresh frame can be requested by sending a Layer Refresh
   Request (LRR), which is an RTCP payload-specific feedback message
   [RFC4585] asking the encoder to encode a frame which makes it
   possible to upgrade to a higher layer.  The LRR contains one or two
   tuples, indicating the layer the decoder wants to upgrade to, and
   (optionally) the currently highest layer the decoder can decode.

   The specific format of the tuples, and the mechanism by which a
   receiver recognizes a refresh frame, is codec-dependent.  Usage for
   several codecs is discussed in Section 4.

   LRR follows the model of the Full Intra Request (FIR) [RFC5104]
   (Section 3.5.1) for its retransmission, reliability, and use in
   multipoint conferences.  TODO: expand these here.

   The LRR message is identified by RTCP packet type value PT=PSFB and
   FMT=TBD.  The FCI field MUST contain one or more LRR entries.  Each
   entry applies to a different media sender, identified by its SSRC.

3.1.  Message Format

   The Feedback Control Information (FCI) for the Layer Refresh Request
   consists of one or more FCI entries, the content of which is depicted
   in Figure 4.  The length of the LRR feedback message MUST be set to
   2+3*N, where N is the number of FCI entries.
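
   The 2+3*N value can be cross-checked against the common feedback
   packet format of [RFC4585], whose length field counts 32-bit words
   minus one: one header word, two SSRC words (packet sender and media
   source), and three words per FCI entry.  The following Python sketch
   only illustrates that arithmetic; the function names are hypothetical
   and not part of this memo.

```python
def lrr_length_field(num_fci_entries: int) -> int:
    """Return the RTCP length field for an LRR with N FCI entries.

    Illustrative helper (not defined by the memo). The length field
    counts 32-bit words minus one, as in the RFC 4585 feedback format.
    """
    if num_fci_entries < 1:
        raise ValueError("an LRR carries one or more FCI entries")
    header_words = 1                  # V/P/FMT, PT, length
    ssrc_words = 2                    # packet sender SSRC + media source SSRC
    fci_words = 3 * num_fci_entries   # each FCI entry is three 32-bit words
    return header_words + ssrc_words + fci_words - 1  # equals 2 + 3*N


def lrr_packet_bytes(num_fci_entries: int) -> int:
    """Return the total size in bytes of the LRR feedback message."""
    return 4 * (lrr_length_field(num_fci_entries) + 1)
```

   For a single FCI entry this gives a length field of 5 and a 24-byte
   message.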

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              SSRC                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Seq nr.       |C| Payload Type| Reserved                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Target Layer Index            | Current Layer Index (opt)     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                                Figure 4

   SSRC (32 bits)  The SSRC value of the media sender that is requested
      to send a layer refresh point.

   Seq nr. (8 bits)  Command sequence number.  The sequence number space
      is unique for each pairing of the SSRC of command source and the
      SSRC of the command target.  The sequence number SHALL be
      increased by 1 modulo 256 for each new command.  A repetition
      SHALL NOT increase the sequence number.  The initial value is
      arbitrary.

   C (1 bit)  A flag bit indicating whether the "Current Layer Index"
      field is present in the FCI.  If this bit is false, the sender of
      the LRR message is requesting refresh of all layers up to and
      including the target layer.

   Payload Type (7 bits)  The RTP payload type for which the LRR is
      being requested.  This gives the context in which the target layer
      index is to be interpreted.

   Reserved (16 bits)  All bits SHALL be set to 0 by the sender and
      SHALL be ignored on reception.

   Target Layer Index (16 bits)  The target layer for which the receiver
      wishes a refresh point.  Its format is dependent on the payload
      type field.

   Current Layer Index (16 bits)  If C is 1, the current layer being
      decoded by the receiver.  This message is not requesting refresh
      of layers at or below this layer.  If C is 0, this field SHALL be
      set to 0 by the sender and SHALL be ignored on reception.

4.  Usage with specific codecs

4.1.
H264 SVC

   H.264 SVC [RFC6190] defines temporal, dependency (spatial), and
   quality scalability modes.

            +---------------+---------------+
            |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |R| DID |  QID  | TID |   RES   |
            +---------------+---------------+

                                Figure 5

   Figure 5 shows the format of the layer index field for H.264 SVC
   streams.  This is designed to follow the same layout as the third and
   fourth bytes of the H.264 SVC NAL unit extension, which carry the
   stream's layer information.  The "R" and "RES" fields MUST be set to
   0 on transmission and ignored on reception.  See [RFC6190]
   Section 1.1.3 for details on the DID, QID, and TID fields.

   TODO: identifying layer refresh frames in an H.264 bitstream.

4.2.  VP8

   The VP8 RTP payload format [I-D.ietf-payload-vp8] defines temporal
   scalability modes.  It does not support spatial scalability.

            +---------------+---------------+
            |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |TID|            RES            |
            +---------------+---------------+

                                Figure 6

   Figure 6 shows the format of the layer index field for VP8 streams.
   The "RES" field MUST be set to 0 on transmission and ignored on
   reception.  See [I-D.ietf-payload-vp8] Section 4.2 for details on the
   TID field.

   TODO: identifying layer refresh frames in a VP8 bitstream.

4.3.  H265

   The initial version of the H.265 payload format
   [I-D.ietf-payload-rtp-h265] defines temporal scalability, with
   protocol elements reserved for spatial or other scalability modes
   (which are expected to be defined in a future version of the
   specification).

            +---------------+---------------+
            |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |     RES     |  LayerId  | TID |
            +-------------+-----------------+

                                Figure 7

   Figure 7 shows the format of the layer index field for H.265 streams.
   This is designed to follow the same layout as the first and second
   bytes of the H.265 NAL unit header, which carry the stream's layer
   information.  The "RES" field MUST be set to 0 on transmission and
   ignored on reception.  See [I-D.ietf-payload-rtp-h265] Section 1.1.3
   for details on the LayerId and TID fields.

   TODO: identifying layer refresh frames in an H.265 bitstream.

4.4.  VP9

   The RTP payload format for VP9 [I-D.uberti-payload-vp9] defines how
   it can be used for spatial and temporal scalability.

            +---------------+---------------+
            |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
            +-------------+-----------------+
            |  T  |R|  S  |       RES       |
            +-------------+-----------------+

                                Figure 8

   Figure 8 shows the format of the layer index field for VP9 streams.
   This is designed to follow the same layout as the "L" byte of the VP9
   payload header, which carries the stream's layer information.  The
   "R" and "RES" fields MUST be set to 0 on transmission and ignored on
   reception.  See [I-D.uberti-payload-vp9] for details on the T and S
   fields.

   Identification of a layer refresh frame can be derived from the
   reference IDs of each frame by backtracking the dependency chain
   until reaching a point where only decodable frames are being
   referenced.  Therefore, it is recommended for both the flexible and
   the non-flexible mode that, when upgrade frames are being encoded in
   response to an LRR, those packets should contain layer indices and
   the reference fields so that the decoder or an MCU can make this
   derivation.

   Example:

   LRR {1,0}, {2,1} is sent by an MCU when it is currently relaying
   {1,0} to a receiver that wants to upgrade to {2,1}.  In response,
   the encoder should encode the next frames in layers {1,1} and {2,1}
   by only referring to frames in {1,0} or {0,0}.
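
   As a rough illustration, the request in the example above could be
   packed into a single FCI entry (Figure 4) along the lines of the
   Python sketch below.  The sketch rests on two assumptions that this
   memo does not state explicitly: that the {a,b} notation means {T,S},
   and that the T, R, and S fields of Figure 8 are 3, 1, and 3 bits
   wide.  The function names, the SSRC, and payload type 98 are
   hypothetical.

```python
import struct


def vp9_layer_index(t: int, s: int) -> int:
    """Pack T and S into a 16-bit layer index laid out as in Figure 8.

    ASSUMPTION: T = 3 bits, R = 1 bit, S = 3 bits, RES = 9 bits, with
    T in the most significant bits. R and RES are sent as 0.
    """
    if not (0 <= t <= 7 and 0 <= s <= 7):
        raise ValueError("T and S must each fit in 3 bits")
    return (t << 13) | (s << 9)


def pack_lrr_fci(ssrc, seq_nr, payload_type, target, current=None):
    """Build one 12-byte LRR FCI entry in network byte order.

    Passing current=None clears the C bit and zeroes the Current
    Layer Index field.
    """
    if not 0 <= payload_type <= 127:
        raise ValueError("payload type is a 7-bit value")
    c = 0 if current is None else 1
    return struct.pack(
        "!IBBHHH",
        ssrc,                       # SSRC of the media sender
        seq_nr % 256,               # Seq nr., modulo 256
        (c << 7) | payload_type,    # C flag followed by Payload Type
        0,                          # Reserved, set to 0
        target,                     # Target Layer Index
        current if c else 0,        # Current Layer Index, 0 when C is 0
    )


# Example request: currently relaying {1,0}, upgrading to {2,1},
# read here as {T,S} (an assumption, see above).
entry = pack_lrr_fci(
    ssrc=0x12345678,
    seq_nr=0,
    payload_type=98,
    target=vp9_layer_index(2, 1),
    current=vp9_layer_index(1, 0),
)
```

   A request for all layers up to the target would omit the current
   layer argument, which clears the C bit.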

   In the non-flexible mode, periodic upgrade frames can be defined by
   the layer structure of the SS, thus periodic upgrade frames can be
   automatically identified by the picture ID.

5.  Usage with different scalability transmission mechanisms

   Several different mechanisms are defined for how scalable streams can
   be transmitted in RTP.  The RTP Taxonomy
   [I-D.ietf-avtext-rtp-grouping-taxonomy] Section 3.7 defines three
   mechanisms: Single RTP Stream on a Single Media Transport (SRST),
   Multiple RTP Streams on a Single Media Transport (MRST), and Multiple
   RTP Streams on Multiple Media Transports (MRMT).

   The LRR message is applicable to all these mechanisms.  For MRST and
   MRMT mechanisms, the "media source" field of the LRR FCI is set to
   the SSRC of the RTP stream containing the layer indicated by the
   Current Layer Index (if "C" is 1), or the stream containing the base
   encoded stream (if "C" is 0).  For MRMT, it is sent on the RTP
   session on which this stream is sent.  On receipt, the sender MUST
   refresh all the layers requested in the stream, simultaneously in
   decode order.

   Note: arguably, for the MRST and MRMT mechanisms, FIR feedback
   messages could instead be used to refresh specific individual layers.
   However, the usage of FIR for MRST/MRMT is not explicitly specified
   anywhere, and if FIR is interpreted as refreshing layers, there is no
   way to request an actual full, synchronized refresh of all the layers
   of an MRST/MRMT layered source.  Thus, the authors feel that
   interpreting FIR as refreshing the entire source, and using LRR for
   the individual layers, would be more useful.

6.  Security Considerations

   All the security considerations of FIR feedback packets [RFC5104]
   apply to LRR feedback packets as well.
   Additionally, media senders
   receiving LRR feedback packets MUST validate that the payload types
   and layer indices they are receiving are valid for the stream they
   are currently sending, and discard the requests if not.

7.  IANA Considerations

   The IANA is requested to register the following values:
   - TODO: PSFB value for LRR

8.  References

   [I-D.ietf-avtext-rtp-grouping-taxonomy]
              Lennox, J., Gross, K., Nandakumar, S., and G. Salgueiro,
              "A Taxonomy of Grouping Semantics and Mechanisms for
              Real-Time Transport Protocol (RTP) Sources",
              draft-ietf-avtext-rtp-grouping-taxonomy-06 (work in
              progress), March 2015.

   [I-D.ietf-payload-rtp-h265]
              Wang, Y., Sanchez, Y., Schierl, T., Wenger, S., and M.
              Hannuksela, "RTP Payload Format for High Efficiency Video
              Coding", draft-ietf-payload-rtp-h265-07 (work in
              progress), December 2014.

   [I-D.ietf-payload-vp8]
              Westin, P., Lundin, H., Glover, M., Uberti, J., and F.
              Galligan, "RTP Payload Format for VP8 Video",
              draft-ietf-payload-vp8-14 (work in progress), March 2015.

   [I-D.uberti-payload-vp9]
              Uberti, J., Holmer, S., Flodman, M., and J. Lennox, "RTP
              Payload Format for VP9 Video", draft-uberti-payload-vp9-00
              (work in progress), October 2014.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4585]  Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
              "Extended RTP Profile for Real-time Transport Control
              Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July
              2006.

   [RFC5104]  Wenger, S., Chandra, U., Westerlund, M., and B. Burman,
              "Codec Control Messages in the RTP Audio-Visual Profile
              with Feedback (AVPF)", RFC 5104, February 2008.

   [RFC6190]  Wenger, S., Wang, Y., Schierl, T., and A. Eleftheriadis,
              "RTP Payload Format for Scalable Video Coding", RFC 6190,
              May 2011.

Authors' Addresses

   Jonathan Lennox
   Vidyo, Inc.
   433 Hackensack Avenue
   Seventh Floor
   Hackensack, NJ  07601
   US

   Email: jonathan@vidyo.com


   Danny Hong
   Vidyo, Inc.
   433 Hackensack Avenue
   Seventh Floor
   Hackensack, NJ  07601
   US

   Email: danny@vidyo.com


   Justin Uberti
   Google, Inc.
   747 6th Street South
   Kirkland, WA  98033
   USA

   Email: justin@uberti.name


   Stefan Holmer
   Google, Inc.
   Kungsbron 2
   Stockholm  111 22
   Sweden


   Magnus Flodman
   Google, Inc.
   Kungsbron 2
   Stockholm  111 22
   Sweden