Audio/Video Transport Payloads WG                            Y.-K. Wang
Internet Draft                                            Qualcomm Inc.
Intended status: Standards track                             T. Schierl
Expires: March 2012                                      Fraunhofer HHI
                                                              R. Skupin
                                                         Fraunhofer HHI
                                                      September 7, 2011

                    RTP Payload Format for MVC Video
                   draft-ietf-payload-rtp-mvc-01.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on March 7, 2012.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the BSD License.

Abstract

   This memo describes an RTP payload format for the multiview
   extension of the ITU-T Recommendation H.264 video codec that is
   technically identical to ISO/IEC International Standard 14496-10.
   The RTP payload format allows for packetization of one or more
   Network Abstraction Layer (NAL) units, produced by the video
   encoder, in each RTP payload.  The payload format can be applied in
   RTP based 3D video transmissions such as 3D video streaming,
   free-viewpoint video, and 3DTV.

Table of Contents

   1. Introduction
      1.1. The MVC Codec
         1.1.1. Overview
         1.1.2. Parameter Set Concept
         1.1.3. Network Abstraction Layer Unit Header
      1.2. Overview of the Payload Format
         1.2.1. Design Principles
         1.2.2. Transmission Modes and Packetization Modes
   2. Conventions
   3. Definitions and Abbreviations
      3.1. Definitions
         3.1.1. Definitions per MVC specification
         3.1.2. Definitions Specific to this memo
      3.2. Abbreviations
   4. MVC RTP Payload Format
      4.1. RTP Header Usage
      4.2. Common Structure of the RTP Payload Format
      4.3. NAL Unit Header Usage
      4.4. Packetization Modes
         4.4.1. Packetization Modes for single-session transmission
         4.4.2. Packetization Modes for multi-session transmission
      4.5. Aggregation Packets
      4.6. Fragmentation Units (FUs)
      4.7. Payload Content Scalability Information (PACSI) NAL Unit
           for MVC
      4.8. Non-Interleaved Multi-Time Aggregation Packets (NI-MTAPs)
      4.9. Cross-Session DON (CS-DON) for multi-session transmission
   5. Packetization Rules
   6. De-Packetization Process (Informative)
   7. Payload Format Parameters
      7.1. Media Type Registration
      7.2. SDP Parameters
         7.2.1. Mapping of Payload Type Parameters to SDP
         7.2.2. Usage with the SDP Offer/Answer Model
         7.2.3. Usage with multi-session transmission
         7.2.4. Usage in Declarative Session Descriptions
      7.3. Examples
      7.4. Parameter Set Considerations
   8. Security Considerations
   9. Congestion Control
   10. IANA Considerations
   11. Acknowledgments
   12. References
      12.1. Normative References
      12.2. Informative References
   Author's Addresses
   13. Open issues
   14. Changes Log

1. Introduction

   This memo specifies an RTP [RFC3550] payload format for a
   forthcoming new mode of the H.264/AVC video coding standard, known
   as Multiview Video Coding (MVC).
   Formally, MVC will take the form of Amendment 4 to ISO/IEC 14496
   Part 10 [MPEG4-10], and Annex H of ITU-T Rec. H.264 [H.264].  The
   latest draft specification of MVC is available in [MVC].

   MVC covers a wide range of 3D video applications, including 3D video
   streaming, free-viewpoint video as well as 3DTV.

   This memo follows a backward compatible enhancement philosophy, by
   keeping as close an alignment to the H.264/AVC payload format
   [RFC6184] as possible.  It documents the enhancements relevant from
   an RTP transport viewpoint, and defines signaling support for MVC,
   including a new media subtype name.

   Due to the similarity between MVC and Scalable Video Coding (SVC),
   as defined in Annex G of H.264 [H.264], in system and transport
   aspects, this memo reuses the design principles as well as many
   features of the SVC RTP payload format [RFC6190].

   [Ed.Note(TS):Need text on session multiplexing and on the relation
   of this draft to [RFC6190] here.]

1.1. The MVC Codec

1.1.1. Overview

   MVC provides multi-view video bitstreams.  An MVC bitstream contains
   a base view conforming to at least one of the profiles of H.264/AVC
   as defined in Annex A of [H.264], and one or more non-base views.
   To enable high compression efficiency, coding of a non-base view can
   utilize other views for inter-view prediction, thus its decoding
   relies on the presence of the views it depends on.  Each coded view
   itself may be temporally scalable.  Besides temporal scalability,
   MVC also supports view scalability, wherein a subset of the encoded
   views can be extracted, decoded and displayed, whenever it is
   desired by the application.

   The concept of video coding layer (VCL) and network abstraction
   layer (NAL) is inherited from H.264/AVC.
   The VCL contains the signal processing functionality of the codec:
   mechanisms such as transform, quantization, motion-compensated
   prediction, loop filtering and inter-view prediction.  The NAL
   encapsulates each slice generated by the VCL into one or more NAL
   units.  Please consult [RFC6184] for a more in-depth discussion of
   the NAL unit concept.  MVC specifies the decoding order of NAL
   units.

   In MVC, one access unit contains all NAL units pertaining to one
   output time instance for all the views.  Within one access unit, the
   coded representation of each view, also called a view component,
   consists of one or more slices.

   The concept of temporal scalability is not newly introduced by SVC
   or MVC, as profiles defined in Annex A of [H.264] already support
   it.  In [H.264], sub-sequences have been introduced in order to
   allow optional use of temporal layers.  SVC extended this approach
   by advertising the temporal scalability information within the NAL
   unit header or prefix NAL units, both of which MVC inherits.

1.1.2. Parameter Set Concept

   The parameter set concept was first specified in [H.264].  Please
   refer to section 1.2 of [RFC6184] for more details.  SVC introduced
   some new parameter set mechanisms.  MVC has inherited the parameter
   set concept from [H.264].

   In particular, a different type of sequence parameter set (SPS),
   referred to as subset SPS and using a different NAL unit type than
   "the old SPS" specified in [H.264], is used for non-base views,
   while the base view still uses "the old SPS".  Slices from different
   views can use either 1) the same sequence or picture parameter set,
   or 2) different sequence or picture parameter sets.

   The inter-view dependency and the decoding order of all the encoded
   views are indicated in a new syntax structure, the SPS MVC
   extension, included in each subset SPS.

1.1.3. Network Abstraction Layer Unit Header

   An MVC NAL unit of type 20 or 14 consists of a header of four octets
   and the payload byte string.  MVC NAL units of type 20 are coded
   slices of non-base views.  A special type of MVC NAL unit is the
   prefix NAL unit (type 14), which includes descriptive information
   about the associated H.264/AVC VCL NAL unit (type 1 or 5) that
   immediately follows the prefix NAL unit.

   MVC extends the one-byte H.264/AVC NAL unit header by three
   additional octets.  The header indicates the type of the NAL unit,
   the (potential) presence of bit errors or syntax violations in the
   NAL unit payload, information regarding the relative importance of
   the NAL unit for the decoding process, the view identification
   information, the temporal layer identification information, and
   other fields as discussed below.

   The syntax and semantics of the NAL unit header are formally
   specified in [MVC], but the essential properties of the NAL unit
   header are summarized below.

   The first byte of the NAL unit header has the following format (the
   bit fields are the same as defined for the one-byte H.264/AVC NAL
   unit header, while the semantics of some fields have changed
   slightly, in a backward compatible way):

      +---------------+
      |0|1|2|3|4|5|6|7|
      +-+-+-+-+-+-+-+-+
      |F|NRI|  Type   |
      +---------------+

   F: 1 bit

   forbidden_zero_bit.  H.264/AVC declares a value of 1 as a syntax
   violation.

   NRI: 2 bits

   nal_ref_idc.  A value of 00 indicates that the content of the NAL
   unit is not used to reconstruct reference pictures for future
   prediction.  Such NAL units can be discarded without risking the
   integrity of the reference pictures in the same view.  A value
   higher than 00 indicates that the decoding of the NAL unit is
   required to maintain the integrity of reference pictures in the same
   view, or that the NAL unit contains parameter sets.

   Type: 5 bits

   nal_unit_type.  This component specifies the NAL unit type.

   In H.264/AVC, NAL unit types 14 and 20 are reserved for future
   extensions.  MVC uses these two NAL unit types.  NAL unit type 14 is
   used for the prefix NAL unit, and NAL unit type 20 is used for a
   coded slice of a non-base view.  NAL unit types 14 and 20 indicate
   the presence of three additional octets in the NAL unit header, as
   shown below.

      +---------------+---------------+---------------+
      |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |S|I|   PRID    |        VID        | TID |A|V|O|
      +---------------+---------------+---------------+

   S: 1 bit

   svc_extension_flag.  MUST be equal to 0 in MVC context.  In the
   context of Scalable Video Coding (SVC), the flag must be equal to 1.

   I: 1 bit

   non_idr_flag.  This component specifies whether the access unit the
   NAL unit belongs to is an IDR access unit (when equal to 0) or not
   (when equal to 1), as specified in [MVC].

   PRID: 6 bits

   priority_id.  This component specifies a priority identifier for the
   NAL unit.  A lower value of PRID indicates a higher priority.

   VID: 10 bits

   view_id.  This component specifies the view identifier of the view
   the NAL unit belongs to.

   TID: 3 bits

   temporal_id.  This component specifies the temporal layer (or frame
   rate) hierarchy.  Informally put, a temporal layer consisting of
   view components with a lower temporal_id corresponds to a lower
   frame rate.  A given temporal layer typically depends on the lower
   temporal layers (i.e. the temporal layers with lower temporal_id
   values) but never depends on any higher temporal layer (i.e. a
   temporal layer with a higher temporal_id value).

   A: 1 bit

   anchor_pic_flag.  This component specifies whether the access unit
   the NAL unit belongs to is an anchor access unit (when equal to 1)
   or not (when equal to 0), as specified in [MVC].

   V: 1 bit

   inter_view_flag.  This component specifies whether the view
   component is used for inter-view prediction (when equal to 1) or not
   (when equal to 0).

   O: 1 bit

   reserved_one_bit.  Reserved bit for future extension.  O shall be
   equal to 1.  Receivers SHOULD ignore the value of reserved_one_bit.

   This memo reuses the same additional NAL unit types introduced in
   [RFC6190], which are presented in section 4.2.  In addition, this
   memo introduces one more NAL unit type, 30, as specified in section
   4.7.  These NAL unit types are marked as unspecified in [MVC] and
   intentionally reserved for use in systems specifications like this
   memo.  Moreover, this specification extends the semantics of F, NRI,
   PRID, TID, A, and I as described in section 4.3.

1.2. Overview of the Payload Format

   This payload specification can only be used to carry the "naked" NAL
   unit stream over RTP, and not the byte stream format according to
   Annex B of [MVC].  The applications of this specification will
   likely be in the IP based multimedia communications fields,
   including 3D video streaming over IP, free-viewpoint video over IP,
   and 3DTV over IP.

   This specification allows a given RTP packet stream to encapsulate
   NAL units belonging to

   o the base view only, as specified in detail in [RFC6184], or

   o one or more non-base views, or

   o the base view and one or more non-base views

   [Ed.Note(YkW): To be extended to allow separate carriage of
   different temporal layers in different RTP packet streams as in
   [RFC6190].]

1.2.1. Design Principles

   The following design principles have been observed:

   o Backward compatibility with [RFC6184] wherever possible.

   o As the MVC base view is H.264/AVC compatible, the base view or any
   H.264/AVC compatible subset of it, when transmitted in its own RTP
   packet stream, MUST be encapsulated using [RFC6184].
   Requiring this has the desirable side effect that the transmitted
   data can be received by [RFC6184] receivers and decoded by H.264/AVC
   decoders.

   o Media-Aware Network Elements (MANEs) as defined in [RFC6184] are
   signaling aware and rely on signaling information.  MANEs have
   state.

   o MANEs can aggregate multiple RTP streams, possibly from multiple
   RTP sessions.

   o MANEs can perform media-aware stream thinning.  By using the
   payload header information identifying Layers within an RTP session,
   MANEs are able to remove packets from the incoming RTP packet
   stream.  This implies rewriting the RTP headers of the outgoing
   packet stream and rewriting of RTCP Receiver Reports.

1.2.2. Transmission Modes and Packetization Modes

   Please see section 1.2.2 of [RFC6190].

2. Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14, RFC 2119
   [RFC2119].

   This specification uses the notion of setting and clearing a bit
   when bit fields are handled.  Setting a bit is the same as assigning
   that bit the value of 1 (On).  Clearing a bit is the same as
   assigning that bit the value of 0 (Off).

3. Definitions and Abbreviations

3.1. Definitions

3.1.1. Definitions per MVC specification

   This document uses the definitions of [MVC].  The following terms,
   defined in [MVC], are summarized for convenience:

   access unit: A set of NAL units always containing exactly one
   primary coded picture with one or more view components.  In addition
   to the primary coded picture, an access unit may also contain one or
   more redundant coded pictures, one auxiliary coded picture, or other
   NAL units not containing slices or slice data partitions of a coded
   picture.  The decoding of an access unit always results in one
   decoded picture.
   All slices or slice data partitions in an access unit have the same
   value of picture order count.

   prefix NAL unit: A NAL unit with nal_unit_type equal to 14 that
   immediately precedes a NAL unit with nal_unit_type equal to 1, 5,
   or 12.  The NAL unit that succeeds the prefix NAL unit is also
   referred to as the associated NAL unit.  The prefix NAL unit
   contains data associated with the associated NAL unit, which are
   considered to be part of the associated NAL unit.

   view component: An access unit subset containing only NAL units that
   share the same view identifier.

   base view: A bitstream subset that contains all the NAL units with
   the nal_unit_type syntax element equal to 1, 5 or 14 of the
   bitstream, does not contain any NAL unit with the nal_unit_type
   syntax element equal to 15 or 20, and conforms to one or more of the
   profiles specified in Annex A of [H.264].

   anchor access unit: An access unit of which all included views can
   be decoded independently from other access units.

   target output view: A view that is targeted for output.

3.1.2. Definitions Specific to this memo

   MVC NAL unit: A NAL unit of NAL unit type 14 or 20 as specified in
   Annex H of [MVC].  An MVC NAL unit has a four-byte NAL unit header.

   operation point: An operation point of an MVC bitstream represents
   a certain level of temporal and view scalability.  An operation
   point contains only those NAL units required for a valid bitstream
   to represent a certain subset of views at a certain temporal level.
   An operation point is described by the view_id values of the subset
   of views, and the highest temporal_id.

   multi-session transmission: The transmission mode in which the MVC
   bitstream is transmitted over multiple RTP sessions, with each
   stream having the same SSRC.  These multiple RTP streams can be
   associated using the RTCP CNAME, or explicit signaling of the SSRC
   used.
   Dependency between RTP sessions MUST be signaled according to
   [RFC5583] and this memo.

   single-session transmission: The transmission mode in which the MVC
   bitstream is transmitted over a single RTP session, with a single
   SSRC and separate timestamp and sequence number spaces.

   cross-session decoding order number (CS-DON): A derived variable
   indicating NAL unit decoding order number over all NAL units within
   all the session-multiplexed RTP sessions that carry the same MVC
   bitstream.

   [Ed.Note(TS):Need more definitions here.]

3.2. Abbreviations

   In addition to the abbreviations defined in [RFC6184], the following
   ones are defined.

   MVC:     Multiview Video Coding
   CS-DON:  Cross-Session Decoding Order Number
   MST:     Multi-Session Transmission
   PACSI:   Payload Content Scalability Information
   SST:     Single-Session Transmission

4. MVC RTP Payload Format

4.1. RTP Header Usage

   Please see section 5.1 of [RFC6184].

4.2. Common Structure of the RTP Payload Format

   Please see section 5.2 of [RFC6184].

4.3. NAL Unit Header Usage

   The structure and semantics of the NAL unit header were introduced
   in section 1.1.3.  This section specifies the semantics of F, NRI,
   PRID, TID, A and I according to this specification.

   Note that, in the context of this section, "protecting a NAL unit"
   means any RTP or network transport mechanism that could improve the
   probability of successful delivery of the packet conveying the NAL
   unit, including applying a QoS-enabled network, forward error
   correction (FEC), retransmissions, and advanced scheduling behavior,
   whenever possible.

   The semantics of F specified in section 5.3 of [RFC6184] also
   apply herein.
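
   Informative note: The fields whose semantics this section refines
   can be read from the four-octet MVC NAL unit header laid out in
   section 1.1.3.  The following Python sketch is illustrative only and
   is not part of this specification.  The names MvcNalHeader and
   parse_mvc_nal_header are hypothetical, and the sketch assumes that
   the caller passes a NAL unit of type 14 or 20 without any start
   code prefix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MvcNalHeader:
    """Hypothetical container for the four-octet MVC NAL unit header."""
    f: int                   # forbidden_zero_bit (1 bit)
    nri: int                 # nal_ref_idc (2 bits)
    nal_unit_type: int       # Type (5 bits)
    svc_extension_flag: int  # S (1 bit)
    non_idr_flag: int        # I (1 bit)
    priority_id: int         # PRID (6 bits)
    view_id: int             # VID (10 bits)
    temporal_id: int         # TID (3 bits)
    anchor_pic_flag: int     # A (1 bit)
    inter_view_flag: int     # V (1 bit)
    reserved_one_bit: int    # O (1 bit)


def parse_mvc_nal_header(data: bytes) -> MvcNalHeader:
    """Split the first four octets of an MVC NAL unit into its fields."""
    if len(data) < 4:
        raise ValueError("an MVC NAL unit header is four octets long")
    first = data[0]
    nal_unit_type = first & 0x1F
    if nal_unit_type not in (14, 20):
        raise ValueError("only NAL unit types 14 and 20 carry the extension")
    # The three extension octets form one 24-bit big-endian value.
    ext = int.from_bytes(data[1:4], "big")
    return MvcNalHeader(
        f=first >> 7,
        nri=(first >> 5) & 0x03,
        nal_unit_type=nal_unit_type,
        svc_extension_flag=(ext >> 23) & 0x01,
        non_idr_flag=(ext >> 22) & 0x01,
        priority_id=(ext >> 16) & 0x3F,
        view_id=(ext >> 6) & 0x3FF,
        temporal_id=(ext >> 3) & 0x07,
        anchor_pic_flag=(ext >> 2) & 0x01,
        inter_view_flag=(ext >> 1) & 0x01,
        reserved_one_bit=ext & 0x01,
    )
```

   A MANE-like element could call parse_mvc_nal_header on the start of
   each NAL unit and base its protection or forwarding decisions on the
   returned nri, priority_id, temporal_id and anchor_pic_flag values.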

   For NRI, for a bitstream conforming to one of the profiles defined
   in Annex A of [H.264] and transported using [RFC6184], the semantics
   specified in section 5.3 of [RFC6184] are applicable, i.e., NRI also
   indicates the relative importance of NAL units.  In MVC context, in
   addition to the semantics specified in Annex H of [MVC], NRI also
   indicates the relative importance of NAL units within a view.  MANEs
   MAY use this information to protect more important NAL units better
   than less important NAL units.
   [Ed.Note(YkW): "MVC context" to be clearly specified.]

   For PRID, the semantics specified in Annex H of [MVC] apply.  Note
   that MANEs implementing unequal error protection MAY use this
   information to protect NAL units with smaller PRID values better
   than those with larger PRID values, for example by including only
   the more important NAL units in a forward error correction (FEC)
   protection mechanism.  The importance for the decoding process
   decreases as the PRID value increases.

   For TID, in addition to the semantics specified in Annex H of [MVC],
   according to this memo, values of TID indicate the relative
   importance.  A lower value of TID indicates a higher importance for
   NAL units within a view.  MANEs MAY use this information to protect
   more important NAL units better than less important NAL units.

   For A, in addition to the semantics specified in Annex H of [MVC],
   according to this memo, MANEs MAY use this information to protect
   NAL units with A equal to 1 better than NAL units with A equal to 0.
   MANEs MAY also utilize information of NAL units with A equal to 1 to
   decide when to forward more packets for an RTP packet stream.
   For example, when it is sensed that view switching has happened such
   that the operation point has changed, MANEs MAY start to forward NAL
   units for a new target view only after forwarding a NAL unit with A
   equal to 1 for the new target view.

   For I, in addition to the semantics specified in Annex H of [MVC],
   according to this memo, MANEs MAY use this information to protect
   NAL units with I equal to 1 better than NAL units with I equal to 0.
   MANEs MAY also utilize information of NAL units with I equal to 1 to
   decide when to forward more packets for an RTP packet stream.  For
   example, when it is sensed that view switching has happened such
   that the operation point has changed, MANEs MAY start to forward NAL
   units for a new target view only after forwarding a NAL unit with I
   equal to 1 for the new target view.

4.4. Packetization Modes

   [Ed.Note(TS): Need to add text from [RFC6190] to this section with
   respect to MVC.]

4.4.1. Packetization Modes for single-session transmission

   This section will address the issues of sections 4.5.1 and 5.1 of
   [RFC6190].

4.4.2. Packetization Modes for multi-session transmission

   This section will address the issues of sections 4.5.2 and 5.2 of
   [RFC6190].

4.5. Aggregation Packets

   This section will address the issues of section 4.7 of [RFC6190].

4.6. Fragmentation Units (FUs)

   This section will address the issues of section 4.8 of [RFC6190].

4.7. Payload Content Scalability Information (PACSI) NAL Unit for MVC

   A new NAL unit type is specified in this memo, and is referred to as
   the payload content scalability information (PACSI) NAL unit.  The
   PACSI NAL unit, if present, MUST be the first NAL unit in an
   aggregation packet, and it MUST NOT be present in other types of
   packets.
   The PACSI NAL unit indicates view and temporal scalability
   information and other characteristics that are common for all the
   remaining NAL units in the payload of the aggregation packet.
   Furthermore, a PACSI NAL unit MAY include a DONC field and contain
   zero or more SEI NAL units.  The PACSI NAL unit makes it easier for
   MANEs to decide whether to forward, process, or discard the
   aggregation packet containing the PACSI NAL unit.  Senders MAY
   create PACSI NAL units and receivers MAY ignore them, or use them as
   hints to enable efficient aggregation packet processing.  Note that
   the NAL unit type for the PACSI NAL unit is selected among those
   values that are unspecified in [MVC] and [RFC6184].

   When the first aggregation unit of an aggregation packet contains a
   PACSI NAL unit, there MUST be at least one additional aggregation
   unit present in the same packet.  The RTP header and payload header
   fields of the aggregation packet are set according to the remaining
   NAL units in the aggregation packet.

   When a PACSI NAL unit is included in a multi-time aggregation packet
   (MTAP), the decoding order number (DON) for the PACSI NAL unit MUST
   be set to indicate that the PACSI NAL unit has an identical DON to
   the first NAL unit in decoding order among the remaining NAL units
   in the aggregation packet.

   The structure of a PACSI NAL unit is as follows.  The first four
   octets are exactly the same as the four-byte MVC NAL unit header as
   discussed in section 1.1.3.  They are followed by two always-present
   octets, two optional octets, and zero or more SEI NAL units, each
   SEI NAL unit preceded by a 16-bit unsigned size field (in network
   byte order) that indicates the size of the following NAL unit in
   bytes (excluding these two octets, but including the NAL unit type
   octet of the SEI NAL unit).
   Figure 1 illustrates the PACSI NAL unit structure and an example of
   a PACSI NAL unit containing two SEI NAL units.

   The bits P, C, S, and E are specified only if the bit X is equal to
   1.  The T bit MUST NOT be equal to 1 if the aggregation packet
   containing the PACSI NAL unit is not an STAP-A packet.  The T bit
   MAY be equal to 1 if the aggregation packet containing the PACSI NAL
   unit is an STAP-A packet.  The field DONC MUST NOT be present if the
   T bit is equal to 0, and MUST be present if the T bit is equal to 1.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|NRI|  Type   |S|   PRID    | TID |A|        VID        |I|V|R|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |X|T|RR |P|C|S|E|      RRR      |        DONC (optional)        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        NAL unit size 1        |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+         SEI NAL unit 1        |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        NAL unit size 2        |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+         SEI NAL unit 2        |
   |                                                               |
   |                                               +-+-+-+-+-+-+-+-+
   |                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 1. PACSI NAL unit structure

   The values of the fields in the PACSI NAL unit MUST be set as
   follows.  The term "target NAL units" is used in the semantics of
   some fields.  The target NAL units are those NAL units contained in
   the aggregation packet, but not included in the PACSI NAL unit, that
   are within the access unit to which the first NAL unit following the
   PACSI NAL unit in the aggregation packet belongs.

   o The F bit MUST be set to 1 if the F bit in at least one of the
   remaining NAL units in the aggregation packet is equal to 1.
   Otherwise, the F bit MUST be set to 0.

   o The NRI field MUST be set to the highest value of the NRI field
   among all the remaining NAL units in the aggregation packet.

   o The Type field MUST be set to 30.

   o The S bit MUST be set to 1.

   o The PRID field MUST be set to the lowest value of the PRID values
   of all the remaining NAL units in the aggregation packet.

   o The TID field MUST be set to the lowest value of the TID values of
   all the remaining NAL units with the lowest value of VID in the
   aggregation packet.

   o The A bit MUST be set to 1 if the A bit of at least one of the
   remaining NAL units in the aggregation packet is equal to 1.
   Otherwise, the A bit MUST be set to 0.

   o The VID field MUST be set to the lowest value of the VID values of
   all the remaining NAL units in the aggregation packet.

   o The I bit MUST be set to 1 if the I bit of at least one of the
   remaining NAL units in the aggregation packet is equal to 1.
   Otherwise, the I bit MUST be set to 0.

   o The V bit MUST be set to 1 if the V bit of at least one of the
   remaining NAL units in the aggregation packet is equal to 1.
   Otherwise, the V bit MUST be set to 0.

   o The R bit MUST be set to 0.  Receivers SHOULD ignore the value of
   R.

   o If the X bit is equal to 1, the bits P, C, S, and E are specified
   as below.  Otherwise, the bits P, C, S, and E are unspecified, and
   receivers MUST ignore these bits.  The X bit SHOULD be identical for
   all the PACSI NAL units involved in all the RTP sessions conveying
   an MVC bitstream.

   o The RR field MUST be set to '00' (in binary form).  Receivers
   SHOULD ignore the value of RR.

   o If the T bit is equal to 1, the OPTIONAL field DONC MUST be
   present and specified as below.  Otherwise, the field DONC MUST NOT
   be present.

   o The P bit MUST be set to 1 if all the remaining NAL units in the
   aggregation packet have redundant_pic_cnt greater than 0, i.e.
   the slices are redundant slices.
     Otherwise, the P bit MUST be set to 0.

      Informative note: The P bit indicates whether the packet can be
      discarded because it contains only redundant slice NAL units.
      Without this bit, the corresponding information would have to be
      derived from the syntax element redundant_pic_cnt, which is
      buried in the variable-length coded slice header.

   o The C bit MUST be set to 1 if the target NAL units belong to an
     access unit for which the view components are intra coded.
     Otherwise, the C bit MUST be set to 0. The C bit SHOULD be
     identical for all the PACSI NAL units for which the target NAL
     units belong to the same access unit.

      Informative note: The C bit indicates whether the packet contains
      intra slices, which may be the only packets to be forwarded for
      fast-forward playback, e.g. when the network condition is
      extremely bad.

   o The S bit MUST be set to 1 if the first VCL NAL unit, in
     transmission order, of the view component containing the first NAL
     unit following the PACSI NAL unit in the aggregation packet is
     present in the aggregation packet. Otherwise, the S bit MUST be
     set to 0.

   o The E bit MUST be set to 1 if the last VCL NAL unit, in
     transmission order, of the view component containing the first NAL
     unit following the PACSI NAL unit in the aggregation packet is
     present in the aggregation packet. Otherwise, the E bit MUST be
     set to 0.

      Informative note: The S and E bits indicate whether the first or
      last slice, in transmission order, of a view component is in the
      packet. This enables a MANE to detect slice loss and take proper
      action, such as requesting a retransmission as soon as possible,
      and allows efficient playout buffer handling similar to that
      enabled by the M bit in the RTP header. The M bit in the RTP
      header still indicates the end of an access unit, not the end of
      a view component.

   o The RRR field MUST be set to '00000000' (in binary form).
     Receivers SHOULD ignore the value of RRR.

   o When present, the field DONC indicates the CL-DON value for the
     first NAL unit in the STAP-A in transmission order.

   SEI NAL units included in the PACSI NAL unit, if any, MUST contain a
   subset of the SEI messages associated with the access unit of the
   first NAL unit following the PACSI NAL unit within the aggregation
   packet.

      Informative note: Senders may repeat in the PACSI NAL unit those
      SEI NAL units whose presence in more than one packet is essential
      for packet loss robustness. Receivers may use the repeated SEI
      messages in place of missing SEI messages.

   An SEI message SHOULD NOT be included both in a PACSI NAL unit and
   in one of the remaining NAL units contained in the same aggregation
   packet.

4.8. Non-Interleaved Multi-Time Aggregation Packets (NI-MTAPs)

   This section will address the issues of section 4.7.1 of [RFC6190].

4.9. Cross-Session DON (CS-DON) for multi-session transmission

   This section will address the issues of section 4.11 of [RFC6190].

5. Packetization Rules

   [Ed.Note(TS): We need to adjust this section with respect to
   [RFC6190].]

   Section 6 of [RFC6184] applies. The following rules apply in
   addition.

   All receivers MUST support the single NAL unit packetization mode to
   provide backward compatibility to endpoints supporting only the
   single NAL unit mode of RFC 3984. However, the single NAL unit
   packetization mode SHOULD NOT be used unless necessary, because
   encapsulating small NAL units, e.g. NAL units containing parameter
   sets, SEI messages or prefix NAL units, in their own packets is
   typically less efficient owing to the relatively large per-packet
   overhead.

   All receivers MUST support the non-interleaved packetization mode.

      Informative note: The non-interleaved mode allows an application
      to encapsulate a single NAL unit in a single RTP packet.
      Historically, the single NAL unit mode has been included into
      [RFC6184] only for compatibility with ITU-T Rec. H.241 Annex A
      [H.241]. There is no point in carrying this historic ballast
      towards a new application space such as the one provided with
      MVC. More technically speaking, the implementation complexity
      increase for providing the additional mechanisms of the
      non-interleaved mode (namely STAP-A and FU-A) is minor, and the
      benefits are great enough that STAP-A implementation is required.

   A NAL unit of small size SHOULD be encapsulated in an aggregation
   packet together with one or more other NAL units. For example,
   non-VCL NAL units such as access unit delimiter, parameter set, or
   SEI NAL units are typically small.

   A prefix NAL unit SHOULD be aggregated into the same packet as the
   associated NAL unit following the prefix NAL unit in decoding order.

   When the first aggregation unit of an aggregation packet contains a
   PACSI NAL unit, there MUST be at least one additional aggregation
   unit present in the same packet.

   When an MVC bitstream is transported in more than one RTP session,
   the following applies.

   o Interleaved mode SHOULD be used for all the RTP sessions.

   o An RTP session that does not use interleaved mode SHOULD be
     constrained as follows.

     - Non-interleaved mode MUST be used.

     - STAP-A MUST be used, and any other type of packets MUST NOT be
       used.

     - Each STAP-A MUST contain a PACSI NAL unit and the DONC field
       MUST be present in the PACSI NAL unit.

      Informative note: The motivation for these constraints is to
      allow the use of non-interleaved mode for the session conveying
      the H.264/AVC compatible view, such that RFC 3984 receivers
      without interleaved mode implementation can subscribe to the base
      view session.

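      Informative note: A receiver could check the last constraint
      above (each STAP-A carries a PACSI NAL unit whose DONC field is
      present) roughly as in the following non-normative Python sketch.
      It assumes the octet layout suggested by Figure 1: one NAL unit
      header octet, three NAL unit header extension octets (treated as
      opaque here), one octet carrying X, T, RR, P, C, S and E, one RRR
      octet, then the optional 16-bit DONC followed by SEI NAL units
      that are each preceded by a 16-bit size. It also assumes that the
      PACSI NAL unit, when present, is the first aggregation unit. The
      names Pacsi, parse_pacsi and satisfies_multi_session_rule are
      hypothetical and are not defined by this document.

```python
import struct
from dataclasses import dataclass, field
from typing import List, Optional

PACSI_NAL_TYPE = 30  # the Type field of a PACSI NAL unit MUST be 30


@dataclass
class Pacsi:
    f: int
    nri: int
    x: int
    t: int
    # P, C, S and E are only meaningful when x == 1.
    p: int
    c: int
    s: int
    e: int
    donc: Optional[int] = None
    sei_nal_units: List[bytes] = field(default_factory=list)


def parse_pacsi(buf: bytes) -> Pacsi:
    """Parse one PACSI NAL unit using the layout assumed from Figure 1."""
    if len(buf) < 6:
        raise ValueError("PACSI NAL unit is shorter than 6 octets")
    if buf[0] & 0x1F != PACSI_NAL_TYPE:
        raise ValueError("not a PACSI NAL unit (Type != 30)")
    flags = buf[4]  # X|T|RR|P|C|S|E, most significant bit first
    pacsi = Pacsi(
        f=buf[0] >> 7,
        nri=(buf[0] >> 5) & 0x3,
        x=flags >> 7,
        t=(flags >> 6) & 1,
        p=(flags >> 3) & 1,
        c=(flags >> 2) & 1,
        s=(flags >> 1) & 1,
        e=flags & 1,
    )
    pos = 6  # buf[1:4] is the header extension, buf[5] is RRR
    if pacsi.t:
        # DONC MUST be present exactly when the T bit is 1.
        if len(buf) < pos + 2:
            raise ValueError("T bit is set but DONC is missing")
        (pacsi.donc,) = struct.unpack_from("!H", buf, pos)
        pos += 2
    # Remaining octets: zero or more (16-bit size, SEI NAL unit) pairs.
    while pos < len(buf):
        if pos + 2 > len(buf):
            raise ValueError("truncated NAL unit size field")
        (size,) = struct.unpack_from("!H", buf, pos)
        pos += 2
        if pos + size > len(buf):
            raise ValueError("truncated SEI NAL unit")
        pacsi.sei_nal_units.append(buf[pos:pos + size])
        pos += size
    return pacsi


def satisfies_multi_session_rule(first_aggregated_nal: bytes) -> bool:
    """True if the unit is a well-formed PACSI NAL unit carrying DONC."""
    try:
        return parse_pacsi(first_aggregated_nal).donc is not None
    except ValueError:
        return False
```

      A depacketizer for a non-interleaved session could call
      satisfies_multi_session_rule() on the first aggregation unit of
      each received STAP-A and treat a False result as a violation of
      the constraints listed above.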
   Non-VCL NAL units SHOULD be conveyed in the same session as the
   associated VCL NAL units. To meet this, SEI messages that are
   contained in a scalable nesting SEI message and are applicable to
   more than one session SHOULD be separated and placed into multiple
   scalable nesting SEI messages. The DON values MUST indicate the
   cross-layer decoding order number values as if all these SEI
   messages were in separate scalable nesting SEI messages and
   contained at the beginning of the corresponding access units as
   specified in [MVC].

6. De-Packetization Process (Informative)

   For a single RTP session, the de-packetization process specified in
   section 7 of [RFC6184] applies.

   For receiving more than one of multiple RTP sessions conveying an
   MVC bitstream, an example of a suitable implementation of the
   de-packetization process is to be specified, similar to the one
   included in [RFC6190].

7. Payload Format Parameters

   This section specifies the parameters that MAY be used to select
   optional features of the payload format and certain features of the
   bitstream. The parameters are specified here as part of the media
   type registration for the MVC codec. A mapping of the parameters
   into the Session Description Protocol (SDP) [RFC4566] is also
   provided for applications that use SDP. Equivalent parameters could
   be defined elsewhere for use with control protocols that do not use
   SDP.

7.1. Media Type Registration

   The media subtype for the MVC codec is allocated from the IETF tree.

   The receiver MUST ignore any unspecified parameter.

      Informative note: Requiring that unspecified parameters be
      ignored allows for backward compatibility of future extensions.
      For example, if a future specification that is backward
      compatible with this specification specifies some new parameters,
      then a receiver according to this specification is capable of
      receiving data per the new payload format while ignoring the
      parameters newly specified in the new payload specification.
      This sentence is also present in RFC 3984.

   Media Type name: video

   Media subtype name: H264-MVC

      The media subtype "H264" MUST be used for RTP streams using RFC
      3984, i.e. not using any of the new features introduced by this
      specification compared to RFC 3984. For RTP streams using any of
      the new features introduced by this specification compared to RFC
      3984, the media subtype "H264-MVC" SHOULD be used, and the media
      subtype "H264" MAY be used. Use of the media subtype "H264" for
      RTP streams using the new features allows RFC 3984 receivers to
      negotiate and receive H.264/AVC or MVC streams packetized
      according to this specification, while ignoring media parameters
      and NAL unit types they do not recognize.

   Required parameters: none

   OPTIONAL parameters: to be specified.

   Encoding considerations:

      This type is only defined for transfer via RTP (RFC 3550).

   Security considerations:

      See section 10 of RFC XXXX.

   Public specification:

      Please refer to RFC XXXX and its section 14.

   Additional information: none

   File extensions: none

   Macintosh file type code: none

   Object identifier or OID: none

   Person & email address to contact for further information:

   Intended usage: COMMON

   Author: NN

   Change controller:

      IETF Audio/Video Transport working group delegated from the
      IESG.

7.2. SDP Parameters

7.2.1. Mapping of Payload Type Parameters to SDP

   The media type video/H264-MVC string is mapped to fields in the
   Session Description Protocol (SDP) as follows:

   The media name in the "m=" line of SDP MUST be video.

   The encoding name in the "a=rtpmap" line of SDP MUST be H264-MVC
   (the media subtype).

   The clock rate in the "a=rtpmap" line MUST be 90000.

   The OPTIONAL parameters, when present, MUST be included in the
   "a=fmtp" line of SDP. These parameters are expressed as a media
   type string, in the form of a semicolon-separated list of
   parameter=value pairs.

7.2.2. Usage with the SDP Offer/Answer Model

   TBD.

7.2.3. Usage with multi-session transmission

   If multi-session transmission is used, the rules on signaling media
   decoding dependency in SDP as defined in [RFC5583] apply.

7.2.4. Usage in Declarative Session Descriptions

   TBD.

7.3. Examples

   TBD.

7.4. Parameter Set Considerations

   Please see section 10 of [RFC6184].

8. Security Considerations

   Please see section 11 of [RFC6184].

9. Congestion Control

   TBD.

10. IANA Considerations

   Request for media type registration to be added.

11. Acknowledgments

   The work of Thomas Schierl has been supported by the European
   Commission under contract number FP7-ICT-248036, project COAST.

   This document was prepared using 2-Word-v2.0.template.dot.

12. References

12.1. Normative References

   [H.264]    ITU-T Recommendation H.264, "Advanced video coding for
              generic audiovisual services", March 2010.

   [RFC6190]  Wenger, S., Wang, Y.-K., Schierl, T., and A.
              Eleftheriadis, "RTP Payload Format for Scalable Video
              Coding", RFC 6190, May 2011.

   [RFC5583]  Schierl, T. and S. Wenger, "Signaling Media Decoding
              Dependency in the Session Description Protocol (SDP)",
              RFC 5583, July 2009.

   [MPEG4-10] ISO/IEC International Standard 14496-10:2005.

   [MVC]      Annex H of ITU-T Recommendation H.264, "Advanced video
              coding for generic audiovisual services", March 2010.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3548]  Josefsson, S., "The Base16, Base32, and Base64 Data
              Encodings", RFC 3548, July 2003.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC6184]  Wang, Y.-K., Even, R., Kristensen, T., and R. Jesup,
              "RTP Payload Format for H.264 Video", RFC 6184, May 2011.

   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
              Description Protocol", RFC 4566, July 2006.

12.2. Informative References

   [DVB-H]    DVB - Digital Video Broadcasting (DVB); DVB-H
              Implementation Guidelines, ETSI TR 102 377, 2005.

   [H.241]    ITU-T Rec. H.241, "Extended video procedures and control
              signals for H.300-series terminals", May 2006.

   [IGMP]     Cain, B., Deering, S., Kouvelas, I., Fenner, B., and A.
              Thyagarajan, "Internet Group Management Protocol,
              Version 3", RFC 3376, October 2002.

   [McCanne]  McCanne, S., Jacobson, V., and M. Vetterli, "Receiver-
              driven layered multicast", in Proc. of ACM SIGCOMM'96,
              pages 117--130, Stanford, CA, August 1996.

   [MBMS]     3GPP - Technical Specification Group Services and System
              Aspects; Multimedia Broadcast/Multicast Service (MBMS);
              Protocols and codecs (Release 6), December 2005.

   [MPEG2]    ISO/IEC International Standard 13818-2:1993.

   [RFC3450]  Luby, M., Gemmell, J., Vicisano, L., Rizzo, L., and J.
              Crowcroft, "Asynchronous Layered Coding (ALC) Protocol
              Instantiation", RFC 3450, December 2002.

Authors' Addresses

   Ye-Kui Wang
   Qualcomm Incorporated
   10160 Pacific Mesa Blvd
   San Diego, CA 92121
   USA
   Phone: +1-858-651-8345
   EMail: yekuiw@qualcomm.com

   Thomas Schierl
   Fraunhofer HHI
   Einsteinufer 37
   D-10587 Berlin
   Germany
   Phone: +49-30-31002-227
   EMail: ts@thomas-schierl.de

   Robert Skupin
   Fraunhofer HHI
   Einsteinufer 37
   D-10587 Berlin
   Germany
   Phone: +49-30-314-21700
   EMail: robert.skupin@hhi.fraunhofer.de

13. Open issues

   - The use of CL-DON for session reordering also allows for
     interleaved transmission with non-interleaved packetization mode.
     There should be a clear separation between both tools. This issue
     should be handled the same way as for the SVC payload draft.

   - Since SVC session multiplexing (multi-session transmission (MST))
     is resolved, it would be preferable to simply reference the MST
     sections in [RFC6190]. Since the text in sections 6 and 7 of
     [RFC6190] is currently very SVC specific, the authors would have
     to try to rewrite these sections in a more generic way. If this is
     not possible, we need to copy text from [RFC6190] and adapt it to
     MVC.

   - The structure of this document should be aligned with the recently
     finished RFC 6190.

   - This document is not intended to be a delta document with respect
     to RFC 6190.

   - The PACSI definition in this document differs from the definition
     in RFC 6190.

14. Changes Log

   Initial version 00

   10 November 2007: YkW
      Initial version

   12 November 2007: TS
      - Added definition of "Session multiplexing"
      - Added the reference of
        [I-D.draft-ietf-mmusic-decoding-dependency], and its reference
        in section 9.2.3
   12 November 2007: YkW
      - Added the reference of [I-D.draft-ietf-avt-svc] and its
        reference in section 1.
      - Added in sections 3.1 and 3.2 paragraphs regarding inter-view
        prediction

   From draft-wang-avt-rtp-mvc-00 to draft-wang-avt-rtp-mvc-01

   18 February 2008: YkW
      - Alignment to the latest MVC draft in JVT-Z209 and version 07
        of [I-D.draft-ietf-avt-svc].

   25 February 2008: TS

   - Minor modifications and updates throughout the document

   - Added open issue on clear separation between "decoding order
     recovery" and "interleaving"

   From draft-wang-avt-rtp-mvc-01 to draft-wang-avt-rtp-mvc-02

   09 July 2008: TS

   - Minor modifications and updates throughout the document

   - Added open issue

   - NAL unit header alignment with MVC spec

   - Section 6. References corresponding sections in [RFC3984] and
     [I-D.draft-ietf-avt-svc].

   - TBD: Section 7, we may align [I-D.draft-ietf-avt-svc] in a way
     that SVC is not mentioned in these paragraphs, so that we can
     reference them from this document.

   21 August 2008:

   - Minor modifications, editing and adding notes throughout the
     document.

   - Updated references

   From draft-wang-avt-rtp-mvc-02 to draft-wang-avt-rtp-mvc-03

   04 February 2009: YkW

   - Updated author's address.

   04 February 2009: YkW

   - Updated the boiler template.

   From draft-wang-avt-rtp-mvc-03 to draft-wang-avt-rtp-mvc-04

   22 October 2009: YkW

   - Updated author's address and the boiler template (added the last
     sentence in Copyright Notice).

   From draft-wang-avt-rtp-mvc-04 to draft-wang-avt-rtp-mvc-05

   22 April 2010: YkW

   - To keep the draft alive, no change other than version number etc.

   From draft-wang-avt-rtp-mvc-05 to draft-ietf-avt-rtp-mvc-00

   28 April 2010: YkW

   - No change other than version number etc.

   From draft-ietf-avt-rtp-mvc-00 to draft-ietf-avt-rtp-mvc-01

   8/9 October 2010:

   - YkW: Updated the NAL unit header syntax and semantics in section
     3.3 per the latest MVC specification.

   - TS: Minor edits

   From draft-ietf-avt-rtp-mvc-01 to draft-ietf-payload-rtp-mvc-00

   14 March 2011: YkW

   - Minor changes such as updates of some references and of the
     working group name from AVT to AVT Payload, etc.

   From draft-ietf-payload-rtp-mvc-00 to draft-ietf-payload-rtp-mvc-01

   1 September 2011: RS

   - Added some definitions
   - Started structural alignment with RFC 6190

   - Reference updates: (RFC3984 -> RFC6184),
     (I-D.draft-ietf-avt-rtp-svc -> RFC6190)