idnits 2.17.1

draft-wang-avt-rtp-mvc-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 16.

  -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
     line 961.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 938.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 945.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 951.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust Copyright Line does not match the
     current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords -- however, there's a paragraph with
     a matching beginning. Boilerplate error?

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (November 12, 2007) is 6010 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'I-D.draft-ietf-avt-svc' is mentioned on line 116,
     but not defined

  == Unused Reference: 'I-D.draft-ietf-avt-rtp-svc' is defined on line 853,
     but no explicit reference was found in the text

  == Unused Reference: 'RFC3548' is defined on line 871, but no explicit
     reference was found in the text

  == Unused Reference: 'DVB-H' is defined on line 891, but no explicit
     reference was found in the text

  == Unused Reference: 'IGMP' is defined on line 897, but no explicit
     reference was found in the text

  == Unused Reference: 'McCanne' is defined on line 901, but no explicit
     reference was found in the text

  == Unused Reference: 'MBMS' is defined on line 905, but no explicit
     reference was found in the text

  == Unused Reference: 'MPEG2' is defined on line 909, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC3450' is defined on line 911, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-27) exists of
     draft-ietf-avt-rtp-svc-03

  == Outdated reference: A later version (-08) exists of
     draft-ietf-mmusic-decoding-dependency-00

  -- Possible downref: Non-RFC (?) normative reference: ref. 'MPEG4-10'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'MVC'

  ** Obsolete normative reference: RFC 3548 (Obsoleted by RFC 4648)

  ** Obsolete normative reference: RFC 3984 (Obsoleted by RFC 6184)

  ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'SVC'

  -- Obsolete informational reference (is this intentional?): RFC 3450
     (Obsoleted by RFC 5775)


     Summary: 4 errors (**), 0 flaws (~~), 13 warnings (==), 11 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

  2  Network Working Group                                         Y.-K. Wang
  3  INTERNET-DRAFT                                                     Nokia
  4  Intended status: Standards Track                              T. Schierl
  5  Expires: May 11, 2008                                     Fraunhofer HHI
  6                                                         November 12, 2007

  8                      RTP Payload Format for MVC Video
  9                       draft-wang-avt-rtp-mvc-00.txt

 11  Status of this Memo

 13     By submitting this Internet-Draft, each author represents that any
 14     applicable patent or other IPR claims of which he or she is aware
 15     have been or will be disclosed, and any of which he or she becomes
 16     aware will be disclosed, in accordance with Section 6 of BCP 79.

 18     Internet-Drafts are working documents of the Internet Engineering
 19     Task Force (IETF), its areas, and its working groups.  Note that
 20     other groups may also distribute working documents as
 21     Internet-Drafts.

 23     Internet-Drafts are draft documents valid for a maximum of six months
 24     and may be updated, replaced, or obsoleted by other documents at any
 25     time.  It is inappropriate to use Internet-Drafts as reference
 26     material or to cite them other than as "work in progress."

 28     The list of current Internet-Drafts can be accessed at
 29     http://www.ietf.org/ietf/1id-abstracts.txt.

 31     The list of Internet-Draft Shadow Directories can be accessed at
 32     http://www.ietf.org/shadow.html.

 34     This Internet-Draft will expire on May 11, 2008.

 36  Copyright Notice

 38     Copyright (C) The IETF Trust (2007).

 40  Abstract
 41     This memo describes an RTP payload format for the multiview extension
 42     of the ITU-T Recommendation H.264 video codec which is technically
 43     identical to ISO/IEC International Standard 14496-10.  The RTP
 44     payload format allows for packetization of one or more Network
 45     Abstraction Layer (NAL) units, produced by the video encoder, in each
 46     RTP payload.  The payload format has wide applicability, such as 3D
 47     video streaming, free-viewpoint video, and 3DTV.

 49  Table of Contents

 51     1. Introduction
 52     2. Conventions
 53     3. The MVC Codec
 54        3.1. Overview
 55        3.2. Parameter Set Concept
 56        3.3. Network Abstraction Layer Unit Header
 57     4. Scope
 58     5. Definitions and Abbreviations
 59        5.1. Definitions
 60           5.1.1. Definitions per MVC specification
 61           5.1.2. Definitions local to this memo
 62        5.2. Abbreviations
 63     6. MVC RTP Payload Format
 64        6.1. Design Principles
 65        6.2. RTP Header Usage
 66        6.3. Common Structure of the RTP Payload Format
 67        6.4. NAL Unit Header Usage
 68        6.5. Packetization Modes
 69        6.6. Decoding Order Number (DON)
 70        6.7. Aggregation Packets
 71        6.8. Fragmentation Units (FUs)
 72        6.9. Payload Content Scalability Information (PACSI) NAL Unit
 73     7. Packetization Rules
 74     8. De-Packetization Process (Informative)
 75     9. Payload Format Parameters
 76        9.1. Media Type Registration
 77        9.2. SDP Parameters
 78           9.2.1. Mapping of Payload Type Parameters to SDP
 79           9.2.2. Usage with the SDP Offer/Answer Model
 80           9.2.3. Usage with Session Multiplexing
 81           9.2.4. Usage in Declarative Session Descriptions
 82        9.3. Examples
 83        9.4. Parameter Set Considerations
 84     10. Security Considerations
 85     11. Congestion Control
 86     12. IANA Considerations
 87     13. References
 88        13.1. Normative References
 89        13.2. Informative References
 90     14. Author's Addresses
 91     15. Intellectual Property Statement
 92     16. Disclaimer of Validity
 93     17. Copyright Statement
 94     18. Acknowledgment

 96  1. Introduction

 98     This memo specifies an RTP [RFC3550] payload format for a forthcoming
 99     new mode of the H.264/AVC video coding standard, known as Multiview
100     Video Coding (MVC).  Formally, MVC takes the form of Amendment 4 to
101     ISO/IEC 14496 Part 10 [MPEG4-10], and Annex H of ITU-T Rec. H.264
102     [H.264].  The latest draft specification of MVC is available in [MVC].

104     MVC covers a wide range of 3D video applications, including 3D video
105     streaming, free-viewpoint video as well as 3DTV.

107     This memo tries to follow a backward-compatible enhancement
108     philosophy similar to what the video coding standardization
109     committees implement, by keeping as close an alignment to the
110     H.264/AVC payload format [RFC3984] as possible.  It documents the
111     enhancements relevant from an RTP transport viewpoint, and defines
112     signaling support for MVC, including a new media subtype name.

114     Due to the similarity between MVC and SVC in system and transport
115     aspects, this memo reuses the design principles as well as many
116     features of the SVC RTP payload draft [I-D.draft-ietf-avt-svc].  The
117     feasibility of specifying this memo as a delta relative to the SVC
118     RTP payload draft could be studied in future versions.

120  2. Conventions

122     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
123     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
124     document are to be interpreted as described in BCP 14, RFC 2119
125     [RFC2119].

127     This specification uses the notion of setting and clearing a bit when
128     bit fields are handled.  Setting a bit is the same as assigning that
129     bit the value of 1 (On).  Clearing a bit is the same as assigning
130     that bit the value of 0 (Off).

132  3. The MVC Codec

134  3.1. Overview
135     MVC provides multi-view video bitstreams.  An MVC bitstream contains
136     a base view conforming to at least one of the profiles of H.264/AVC
137     as defined in Annex A of [H.264], and one or more non-base views.  To
138     enable high compression efficiency, coding of a non-base view can
139     utilize other views for inter-view prediction, and thus its decoding
140     relies on the presence of the views it depends on.  Each coded view
141     itself may be temporally scalable, similar to a scalable video coding
142     (SVC) [SVC] bitstream.  Besides temporal scalability, MVC also
143     supports view scalability, wherein a subset of the encoded views can
144     be extracted, decoded and displayed, whenever it is desired by the
145     application.

147     The concept of video coding layer (VCL) and network abstraction layer
148     (NAL) is inherited from H.264/AVC.  The VCL contains the signal
149     processing functionality of the codec: mechanisms such as transform,
150     quantization, motion-compensated prediction, loop filtering and
151     inter-view prediction.  The Network Abstraction Layer (NAL)
152     encapsulates each slice generated by the VCL into one or more Network
153     Abstraction Layer Units (NAL units).  Please consult RFC 3984 for a
154     more in-depth discussion of the NAL unit concept.  MVC specifies the
155     decoding order of NAL units.

157     In MVC, one access unit contains all NAL units pertaining to one
158     output time instance for all the views.  Within one access unit, each
159     view representation consists of one or more slices.

161     The concept of temporal scalability is not newly introduced by SVC or
162     MVC, as profiles defined in Annex A of [H.264] already support it.
163     In [H.264], sub-sequences have been introduced in order to allow
164     optional use of temporal layers.  SVC extended this approach by
165     advertising the temporal scalability information within the NAL unit
166     header or prefix NAL units, both of which were inherited by MVC.

168  3.2. Parameter Set Concept

170     The parameter set concept was first specified in [H.264].  Please
171     refer to section 1.2 of [RFC3984] for more details.  SVC introduced
172     some new parameter set mechanisms, as specified in [SVC].  It is
173     expected that MVC will inherit the parameter set concept from both
174     [H.264] and [SVC].

176     In particular, a different type of sequence parameter set (SPS) using
177     a different NAL unit type than "the old SPS" specified in [H.264]
178     would be used for non-base views, while the base view would still use
179     "the old SPS".  Slices from different views would be able to use
180     either 1) the same sequence or picture parameter set, or 2) different
181     sequence or picture parameter sets.

183     The inter-view dependency as well as the decoding order of all the
184     encoded views are indicated in a new syntax structure, the SPS MVC
185     extension, included in the SPS.

187  3.3. Network Abstraction Layer Unit Header

189     An MVC NAL unit (of type 20 or 14) consists of a header of four
190     octets and the payload byte string.  MVC NAL units of type 20 are
191     coded slices of non-base views.  A special type of MVC NAL unit is
192     the prefix NAL unit (type 14) that includes descriptive information
193     of the associated H.264/AVC VCL NAL unit (type 1 or 5) that
194     immediately follows the prefix NAL unit.
195     MVC extends the one-byte H.264/AVC NAL unit header by three
196     additional octets.  The header indicates the type of the NAL unit,
197     the (potential) presence of bit errors or syntax violations in the
198     NAL unit payload, information regarding the relative importance of
199     the NAL unit for the decoding process, the view identification
200     information, the temporal layer identification information, and other
201     fields as discussed below.
202     The syntax and semantics of the NAL unit header are formally
203     specified in [MVC], but the essential properties of the NAL unit
204     header are summarized below.

206     The first byte of the NAL unit header has the following format (the
207     bit fields are the same as defined for the one-byte H.264/AVC NAL
208     unit header, while the semantics of some fields have changed
209     slightly, in a backward compatible way):

211        +---------------+
212        |0|1|2|3|4|5|6|7|
213        +-+-+-+-+-+-+-+-+
214        |F|NRI|  Type   |
215        +---------------+

217     F: 1 bit
218     forbidden_zero_bit.  H.264/AVC declares a value of 1 as a syntax
219     violation.

221     NRI: 2 bits
222     nal_ref_idc.  A value of 00 indicates that the content of the NAL
223     unit is not used to reconstruct reference pictures for future
224     prediction.  Such NAL units can be discarded without risking the
225     integrity of the reference pictures in the same view.  A value
226     greater than 00 indicates that the decoding of the NAL unit is
227     required to maintain the integrity of reference pictures in the same
228     view, or that the NAL unit contains parameter sets.

230     Type: 5 bits
231     nal_unit_type.  This component specifies the NAL unit type.
232     In H.264/AVC, NAL unit types 14 and 20 are reserved for future
233     extensions.  MVC uses these two NAL unit types.  NAL unit type 14 is
234     used for prefix NAL units, and NAL unit type 20 is used for coded
235     slices of non-base views.  NAL unit types 14 and 20 indicate the
236     presence of three additional octets in the NAL unit header, as shown
         below.

238        +---------------+---------------+---------------+
239        |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
240        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
241        |S|   PRID    | TID |A|        VID        |I|V|R|
242        +---------------+---------------+---------------+

244     S: 1 bit
245     svc_mvc_flag.  This flag specifies whether the NAL unit is an SVC NAL
246     unit (when equal to 0) or an MVC NAL unit (when equal to 1).

248     PRID: 6 bits
249     priority_id.  This component specifies a priority identifier for the
250     NAL unit.  A lower value of PRID indicates a higher priority.

252     TID: 3 bits
253     temporal_id.  This component specifies the temporal layer (or frame
254     rate) hierarchy.  Informally put, a layer consisting of view
255     representations with a smaller temporal_id corresponds to a lower
256     frame rate.  A given temporal layer typically depends on the lower
257     temporal layers (i.e. the temporal layers with smaller temporal_id)
258     but never depends on any higher temporal layer.

260     A: 1 bit
261     anchor_pic_flag.  This component specifies whether the view
262     representation is an anchor picture (when equal to 1) or not (when
263     equal to 0), as specified in [MVC].

265     VID: 10 bits
266     view_id.  This component specifies the view identifier of the view.

268     I: 1 bit
269     idr_flag.  This component specifies whether the view representation
270     is a view instantaneous decoding refresh (V-IDR) picture for the view
271     (when equal to 1) or not (when equal to 0), as specified in [MVC].

273     V: 1 bit
274     inter_view_flag.  This component specifies whether the view
275     representation is used for inter-view prediction (when equal to 1) or
276     not (when equal to 0).

278     R: 1 bit
279     reserved_zero_one_bit.  Reserved bit for future extension.  R MUST be
280     equal to 0.  Receivers SHOULD discard NAL units with R equal to 1.

282     This memo reuses the same additional NAL unit types introduced in RFC
283     3984, which are presented in section 6.3.  In addition, this memo
284     introduces one more NAL unit type, 30, as specified in section 6.9.
285     These NAL unit types are marked as unspecified in [MVC] and
286     intentionally reserved for use in systems specifications like this
287     memo.  Moreover, this specification extends the semantics of F, NRI,
288     PRID, TID, A, and I as described in section 6.4.

290  4. Scope

292     This payload specification can only be used to carry the "naked" NAL
293     unit stream over RTP, and not the byte stream format according to
294     Annex B of [MVC].  Likely, the applications of this specification
295     will be in the IP based multimedia communications fields including 3D
296     video streaming over IP, free-viewpoint video over IP, and 3DTV over
297     IP.

299     This specification allows encapsulating, in a given RTP session, NAL
300     units belonging to

302     o the base view only, detailed specification in [RFC3984], or
303     o one or more non-base views, or
304     o the base view and one or more non-base views

306  5. Definitions and Abbreviations

308  5.1. Definitions

310  5.1.1. Definitions per MVC specification

312     This document uses the definitions of [MVC].  The following terms,
313     defined in [MVC], are summarized for convenience:

315     access unit: A set of NAL units pertaining to a certain temporal
316     location.  An access unit includes the coded slices of all the views
317     at that temporal location and possibly other associated data, e.g.
318     supplemental enhancement information (SEI) messages and parameter
319     sets.

321     prefix NAL unit: A NAL unit with nal_unit_type equal to 14 that
322     immediately precedes a NAL unit with nal_unit_type equal to 1, 5,
323     or 12.  The NAL unit that succeeds the prefix NAL unit is also
324     referred to as the associated NAL unit.  The prefix NAL unit contains
325     data associated with the associated NAL unit, which are considered to
326     be part of the associated NAL unit.

328  5.1.2. Definitions local to this memo

330     MVC NAL unit: A NAL unit of NAL unit type 14 or 20 as specified in
331     Annex H of [MVC].  An MVC NAL unit has a four-byte NAL unit header.

333     operation point: An operation point of an MVC bitstream represents a
334     certain level of temporal and view scalability.  An operation point
335     contains only those NAL units required for a valid bitstream to
336     represent a certain subset of views at a certain temporal level.  An
337     operation point is described by the view_id values of the subset of
338     views, and the highest temporal_id.

340     temporal scalable layer switching point: A view representation for
341     which itself and all subsequent view representations with the same
342     value of temporal_id and view_id in decoding order do not refer to
343     any preceding view representation with the same value of temporal_id
344     and view_id in decoding order for inter prediction.  Such a view
345     representation can be used to switch from the next lower temporal
346     layer to the current temporal layer when operating at the same
347     view_id.
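
As an illustration of the operation point definition above, the following minimal Python sketch selects the NAL units of one operation point from a list of already-parsed NAL units. The names NalUnit and extract_operation_point are hypothetical and not part of this memo or of [MVC]. The sketch assumes that the caller has already resolved inter-view dependencies, so that view_ids contains every view the target views depend on, and that view_id and temporal_id are known for each NAL unit (for base view NAL units they would come from the associated prefix NAL unit).

```python
from typing import Iterable, List, NamedTuple, Set


class NalUnit(NamedTuple):
    """Hypothetical container: identifiers already parsed from the header."""
    view_id: int
    temporal_id: int
    payload: bytes


def extract_operation_point(nal_units: Iterable[NalUnit],
                            view_ids: Set[int],
                            max_temporal_id: int) -> List[NalUnit]:
    """Keep the NAL units whose view is in view_ids and whose temporal_id
    does not exceed max_temporal_id, preserving the input order."""
    return [n for n in nal_units
            if n.view_id in view_ids and n.temporal_id <= max_temporal_id]


stream = [
    NalUnit(0, 0, b"base-t0"),
    NalUnit(1, 0, b"v1-t0"),
    NalUnit(2, 0, b"v2-t0"),
    NalUnit(0, 1, b"base-t1"),
    NalUnit(1, 1, b"v1-t1"),
]
subset = extract_operation_point(stream, {0, 1}, 0)
print([n.payload for n in subset])
```

The filter only mirrors the two descriptors named in the definition (the set of view_id values and the highest temporal_id). Whether the result is a valid bitstream still depends on the dependency information signaled in the SPS MVC extension.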

349     Session multiplexing: The views of the MVC bitstream are distributed
350     onto different RTP sessions, whereby each RTP session carries a
351     single RTP packet stream.  Each RTP session requires separate
352     signaling and has a separate Timestamp, Sequence Number, and SSRC
353     space.  Dependency between sessions MUST be signaled according to
354     [I-D.draft-ietf-mmusic-decoding-dependency] and this memo.

356  5.2. Abbreviations

358     In addition to the abbreviations defined in [RFC3984], the following
359     ones are defined.

361     MVC:       Multiview Video Coding
362     CL-DON:    Cross-Layer Decoding Order Number
363     PACSI:     Payload Content Scalability Information

365  6. MVC RTP Payload Format

367  6.1. Design Principles

369     The following design principles have been observed:

371     o Backward compatibility with [RFC3984] wherever possible.

373     o As the MVC base view is H.264/AVC compatible, the base view or any
374     subset, when transmitted in its own session, MUST be encapsulated
375     using [RFC3984] and dependency between sessions MUST be signaled
376     according to [I-D.draft-ietf-mmusic-decoding-dependency].  Requiring
377     this has the desirable side effect that it can be used by [RFC3984]
378     legacy devices.

380     o MANEs are signaling-aware and rely on signaling information.  MANEs
381     have state.

383     o MANEs can aggregate multiple RTP streams, possibly from multiple
384     RTP sessions.

386     o MANEs can perform media-aware stream thinning.  By using the
387     payload header information identifying layers within an RTP session,
388     MANEs are able to remove packets from the incoming RTP packet stream.
389     This implies rewriting the RTP headers of the outgoing packet stream
390     and rewriting of RTCP Receiver Reports.

392  6.2. RTP Header Usage

394     Please see section 5.1 of [RFC3984].

396  6.3. Common Structure of the RTP Payload Format

398     Please see section 5.2 of [RFC3984].

400  6.4. NAL Unit Header Usage

402     The structure and semantics of the NAL unit header were introduced in
403     section 3.3.  This section specifies the semantics of F, NRI, PRID,
404     TID, A and I according to this specification.

406     Note that, in the context of this section, "protecting a NAL unit"
407     means any RTP or network transport mechanism that could improve the
408     probability of successful delivery of the packet conveying the NAL
409     unit, including applying a QoS-enabled network, FEC, retransmissions,
410     and advanced scheduling behavior, whenever possible.

412     The semantics of F specified in section 5.3 of [RFC3984] also apply
413     herein.

415     For NRI, for a bitstream conforming to one of the profiles defined in
416     Annex A of [H.264] and transported using [RFC3984], the semantics
417     specified in section 5.3 of [RFC3984] are applicable, i.e., NRI also
418     indicates the relative importance of NAL units.  In the MVC context,
419     in addition to the semantics specified in Annex H of [MVC],
420     NRI also indicates the relative importance of NAL units
421     within a view.  MANEs MAY use this information to protect more
422     important NAL units better than less important NAL units.

424     For PRID, the semantics specified in Annex H of [MVC] apply.  Note
425     that MANEs implementing unequal error protection MAY use this
426     information to protect NAL units with smaller PRID values better than
427     those with larger PRID values, for example by including only the more
428     important NAL units in a forward error correction (FEC) protection
429     mechanism.  The importance for the decoding process decreases as the
430     PRID value increases.

432     For TID, in addition to the semantics specified in Annex H of [MVC],
433     according to this memo, values of TID indicate the relative
434     importance.  A lower value of TID indicates a higher importance for a
435     certain view.  MANEs MAY use this information to protect more
436     important NAL units better than less important NAL units.
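
The header fields discussed in this section can be extracted with a few shifts and masks. The following Python sketch is a non-normative illustration based only on the field widths given in section 3.3 (F 1 bit, NRI 2, Type 5, then S 1, PRID 6, TID 3, A 1, VID 10, I 1, V 1, R 1). The names MvcNalHeader, parse_mvc_nal_header and importance_key are hypothetical. The ordering in importance_key (PRID first, then TID) is an assumption chosen for illustration; this memo does not mandate any particular ranking.

```python
from typing import NamedTuple, Tuple


class MvcNalHeader(NamedTuple):
    f: int
    nri: int
    nal_type: int
    s: int
    prid: int
    tid: int
    a: int
    vid: int
    i: int
    v: int
    r: int


def parse_mvc_nal_header(data: bytes) -> MvcNalHeader:
    """Parse the four-octet header of a NAL unit of type 14, 20 or 30."""
    if len(data) < 4:
        raise ValueError("an MVC NAL unit header is four octets long")
    first = data[0]
    nal_type = first & 0x1F
    if nal_type not in (14, 20, 30):
        raise ValueError("NAL unit type %d has a one-octet header" % nal_type)
    ext = int.from_bytes(data[1:4], "big")  # the three additional octets
    return MvcNalHeader(
        f=first >> 7,
        nri=(first >> 5) & 0x03,
        nal_type=nal_type,
        s=(ext >> 23) & 0x01,
        prid=(ext >> 17) & 0x3F,
        tid=(ext >> 14) & 0x07,
        a=(ext >> 13) & 0x01,
        vid=(ext >> 3) & 0x3FF,
        i=(ext >> 2) & 0x01,
        v=(ext >> 1) & 0x01,
        r=ext & 0x01,
    )


def importance_key(h: MvcNalHeader) -> Tuple[int, int]:
    """Illustrative sort key: smaller PRID, then smaller TID, sorts first."""
    return (h.prid, h.tid)


# NRI=3, Type=20, S=1, PRID=5, TID=2, A=1, VID=7, I=0, V=1, R=0
header = parse_mvc_nal_header(bytes([0x74, 0x8A, 0xA0, 0x3A]))
print(header)
```

A MANE implementing unequal error protection could, for instance, sort the NAL units of an access unit with importance_key and protect only the leading ones.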

438     For A, in addition to the semantics specified in Annex H of [MVC],
439     according to this memo, MANEs MAY use this information to protect NAL
440     units with A equal to 1 better than NAL units with A equal to 0.
441     MANEs MAY also utilize information of NAL units with A equal to 1 to
442     decide when to forward more packets for an RTP packet stream.  For
443     example, when it is sensed that view switching has happened such that
444     the operation point has changed, MANEs MAY start to forward NAL units
445     for a new target view only after forwarding a NAL unit with A equal
446     to 1 for the new target view.

448     For I, in addition to the semantics specified in Annex H of [MVC],
449     according to this memo, MANEs MAY use this information to protect NAL
450     units with I equal to 1 better than NAL units with I equal to 0.
451     MANEs MAY also utilize information of NAL units with I equal to 1 to
452     decide when to forward more packets for an RTP packet stream.  For
453     example, when it is sensed that view switching has happened such that
454     the operation point has changed, MANEs MAY start to forward NAL units
455     for a new target view only after forwarding a NAL unit with I equal
456     to 1 for the new target view.

458  6.5. Packetization Modes

460     Please see section 5.4 of [RFC3984].

462  6.6. Decoding Order Number (DON)

464     Please see section 5.5 of [RFC3984].  The following applies in
465     addition.

467     If different views of an MVC bitstream are transported in more than
468     one RTP session and interleaved mode is used, the DON values of all
469     the NAL units in the RTP sessions using interleaved mode MUST
470     indicate CL-DON values.

472     When different views of an MVC bitstream are transported in more than
473     one RTP session and at least one STAP-A packet is present in any of
474     the RTP sessions and interleaved mode is used in at least one of the
475     RTP sessions, the following applies:

477     o A PACSI NAL unit MUST be present in each STAP-A packet.

479     o A CL-DON field MUST be present in the PACSI NAL unit included in an
480     STAP-A.

482     o The DON values for the NAL units in each STAP-A packet MUST be
483     derived as follows and indicate CL-DON values.  The CL-DON field in
484     the PACSI NAL unit specifies the value of DON for the first NAL unit
485     in the STAP-A in transmission order.  For each successive NAL unit in
486     appearance order in the STAP-A, the value of DON is equal to (the
487     value of DON of the previous NAL unit in the STAP-A + 1) % 65536,
488     wherein '%' stands for modulo operation.

490  6.7. Aggregation Packets
491     Please see section 5.7 of [RFC3984].

493  6.8. Fragmentation Units (FUs)

495     Please see section 5.8 of [RFC3984].

497  6.9. Payload Content Scalability Information (PACSI) NAL Unit

499     A new NAL unit type is specified in this memo, and referred to as
500     payload content scalability information (PACSI) NAL unit.  The PACSI
501     NAL unit, if present, MUST be the first NAL unit in an aggregation
502     packet, and it MUST NOT be present in other types of packets.  The
503     PACSI NAL unit indicates view and temporal scalability information
504     and other characteristics that are common for all the remaining NAL
505     units in the payload of the aggregation packet.  Furthermore, a PACSI
506     NAL unit MAY include a CL-DON field and contain zero or more SEI NAL
507     units.  The PACSI NAL unit makes it easier for MANEs to decide
508     whether to forward, process, or discard the aggregation packet
509     containing it.  Senders MAY create PACSI NAL units and receivers MAY
510     ignore them, or use them as hints to enable efficient aggregation
511     packet processing.  Note that the NAL unit type for the PACSI NAL
512     unit is selected among those values that are unspecified in [MVC] and
513     [RFC3984].

515     When the first aggregation unit of an aggregation packet contains a
516     PACSI NAL unit, there MUST be at least one additional aggregation
517     unit present in the same packet.  The RTP header and payload header
518     fields of the aggregation packet are set according to the remaining
519     NAL units in the aggregation packet.

521     When a PACSI NAL unit is included in a multi-time aggregation packet
522     (MTAP), the decoding order number (DON) for the PACSI NAL unit MUST
523     be set to indicate that the PACSI NAL unit has an identical DON to
524     the first NAL unit in decoding order among the remaining NAL units in
525     the aggregation packet.

527     The structure of a PACSI NAL unit is as follows.  The first four
528     octets are exactly the same as the four-byte MVC NAL unit header as
529     discussed in section 3.3.  They are followed by two always-present
530     octets, two optional octets, and zero or more SEI NAL units, each SEI
531     NAL unit preceded by a 16-bit unsigned size field (in network byte
532     order) that indicates the size of the following NAL unit in bytes
533     (excluding these two octets, but including the NAL unit type octet of
534     the SEI NAL unit).  Figure 1 illustrates the PACSI NAL unit structure
535     and an example of a PACSI NAL unit containing two SEI NAL units.

537     The bits T, P, C, S, and E are specified only if the bit X is equal
538     to 1.  The field CL-DON MUST NOT be present if the aggregation packet
539     containing the PACSI NAL unit is not an STAP-A packet.  The field
540     CL-DON MAY be present if the aggregation packet containing the PACSI
541     NAL unit is an STAP-A packet.
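
The way DON values follow from the CL-DON field in an STAP-A (section 6.6) can be sketched in a few lines of Python. This is a non-normative illustration; the function name stap_a_don_values is hypothetical. The sketch takes the number of NAL units that receive a DON as an input and does not decide whether the PACSI NAL unit itself is counted as the first NAL unit in the STAP-A.

```python
from typing import List


def stap_a_don_values(cl_don: int, nal_count: int) -> List[int]:
    """Return the DON of each NAL unit in an STAP-A, in appearance order.

    The first NAL unit takes the CL-DON value; each successive NAL unit
    takes (DON of the previous NAL unit + 1) % 65536.
    """
    if not 0 <= cl_don <= 0xFFFF:
        raise ValueError("CL-DON is a 16-bit unsigned value")
    if nal_count < 0:
        raise ValueError("nal_count must not be negative")
    dons = []
    don = cl_don
    for _ in range(nal_count):
        dons.append(don)
        don = (don + 1) % 65536
    return dons


# Wrap-around case: the counter passes 65535 and continues at 0.
print(stap_a_don_values(65534, 4))
```

The wrap-around example shows why receivers compare DON values modulo 65536 rather than as plain integers.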

543         0                   1                   2                   3
544         0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
545        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
546        |F|NRI|  Type   |S|   PRID    | TID |A|        VID        |I|V|R|
547        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
548        |X|RR |T|P|C|S|E|      RRR      |      CL-DON (optional)        |
549        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
550        |        NAL unit size 1        |                               |
551        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+        SEI NAL unit 1         |
552        |                                                               |
553        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
554        |        NAL unit size 2        |                               |
555        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+        SEI NAL unit 2         |
556        |                                                               |
557        |                                               +-+-+-+-+-+-+-+-+
558        |                                               |
559        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

561                    Figure 1. PACSI NAL unit structure

563     The values of the fields in PACSI NAL unit MUST be set as follows.

565     o The F bit MUST be set to 1 if the F bit in at least one of the
566     remaining NAL units in the payload is equal to 1.  Otherwise, the F
567     bit MUST be set to 0.

569     o The NRI field MUST be set to the highest value of NRI field among
570     all the remaining NAL units in the payload.

572     o The Type field MUST be set to 30.

574     o The S bit MUST be set to 1.

576     o The PRID field MUST be set to the lowest value of the PRID values
577     of all the remaining NAL units in the payload.

579     o The TID field MUST be set to the lowest value of the TID values of
580     all the remaining NAL units in the
581     payload.

583     o The A bit MUST be set to 1 if the A bit of at least one of the
584     remaining NAL units in the payload is equal to 1.  Otherwise, the A
585     bit MUST be set to 0.

587     o The V bit MUST be set to 1 if the V bit of at least one of the
588     remaining NAL units in the payload is equal to 1.  Otherwise, the V
589     bit MUST be set to 0.

591     o The I bit MUST be set to 1 if the I bit of at least one of the
592     remaining NAL units in the payload is equal to 1.  Otherwise, the I
593     bit MUST be set to 0.

595     o The R bit MUST be set to 0.

o If the X bit is equal to 1, the bits T, P, C, S, and E are
  specified as below. Otherwise, the bits T, P, C, S, and E are
  unspecified, and receivers MUST ignore these bits. The X bit SHOULD
  be identical for all the PACSI NAL units involved in all the RTP
  sessions conveying an SVC bitstream.

o The RR field MUST be set to '00'.

o The T bit MUST be set to 1 if all the target NAL units (as defined
  above) belong to temporal scalable layer switching points.
  Otherwise, the T bit MUST be set to 0. The T bit SHOULD be identical
  for all the PACSI NAL units for which the target NAL units belong to
  the same access unit.

o The P bit MUST be set to 1 if all the target NAL units (as defined
  above) have redundant_pic_cnt greater than 0, i.e., the slices are
  redundant slices. Otherwise, the P bit MUST be set to 0. The P bit
  SHOULD be identical for all the PACSI NAL units for which the target
  NAL units belong to the same access unit.

o The C bit MUST be set to 1 if the target NAL units (as defined
  above) belong to an access unit for which the view representations
  are intra view representations. Otherwise, the C bit MUST be set to
  0. The C bit SHOULD be identical for all the PACSI NAL units for
  which the target NAL units belong to the same access unit.

o The S bit MUST be set to 1 if the first VCL NAL unit, in decoding
  order, of the view representation containing the first NAL unit
  following the PACSI NAL unit in the aggregation packet is present in
  the payload. Otherwise, the S bit MUST be set to 0.

o The E bit MUST be set to 1 if the last VCL NAL unit, in decoding
  order, of the view representation containing the first NAL unit
  following the PACSI NAL unit in the aggregation packet is present in
  the payload. Otherwise, the E bit MUST be set to 0.

o The RRR field MUST be set to '00000000'.

o When present, the field CL-DON indicates the cross-layer decoding
  order number for the first NAL unit in the STAP-A in transmission
  order.

SEI NAL units included in the PACSI NAL unit, if any, MUST contain a
subset of the SEI messages associated with the access unit of the
first NAL unit following the PACSI NAL unit within the aggregation
packet.

Informative note: Senders may repeat in the PACSI NAL unit those SEI
NAL units whose presence in more than one packet is essential for
packet loss robustness. Receivers may use the repeated SEI messages
in place of missing SEI messages.

An SEI message SHOULD NOT be included both in a PACSI NAL unit and in
one of the remaining NAL units contained in the same aggregation
packet.

7. Packetization Rules

Section 6 of [RFC3984] applies. The following rules apply in
addition.

All receivers MUST support the single NAL unit packetization mode to
provide backward compatibility with endpoints supporting only the
single NAL unit mode of RFC 3984. However, the single NAL unit
packetization mode SHOULD NOT be used when it can be avoided, because
encapsulating NAL units of small sizes, e.g., small NAL units
containing parameter sets or SEI messages, in their own packets is
typically less efficient due to the relatively large overhead.

All receivers MUST support the non-interleaved packetization mode.

Informative note: The non-interleaved mode allows an application to
encapsulate a single NAL unit in a single RTP packet. Historically,
the single NAL unit mode was included in [RFC3984] only for
compatibility with ITU-T Rec. H.241 Annex A [H.241]. There is no
point in carrying this historic ballast towards a new application
space such as the one provided with MVC.
More technically speaking, the implementation complexity increase for
providing the additional mechanisms of the non-interleaved mode
(namely STAP-A) is so minor, and the benefits are so great, that
STAP-A implementation is required.

A NAL unit of small size SHOULD be encapsulated in an aggregation
packet together with one or more other NAL units. For example,
non-VCL NAL units such as access unit delimiter, parameter set, or
SEI NAL units are typically small.

A prefix NAL unit SHOULD be aggregated into the same packet as the
associated NAL unit following the prefix NAL unit in decoding order.

When the first aggregation unit of an aggregation packet contains a
PACSI NAL unit, there MUST be at least one additional aggregation
unit present in the same packet.

When an MVC bitstream is transported in more than one RTP session,
the following applies.

o Interleaved mode SHOULD be used for all the RTP sessions.

o An RTP session that does not use interleaved mode SHOULD be
  constrained as follows.

  - Non-interleaved mode MUST be used.
  - STAP-A MUST be used, and any other type of packets MUST NOT be
    used.
  - Each STAP-A MUST contain a PACSI NAL unit and the CL-DON field
    MUST be present in the PACSI NAL unit.

Informative note: The motivation for these constraints is to allow
the use of non-interleaved mode for the session conveying the
H.264/AVC compatible view, such that RFC 3984 receivers without
interleaved mode implementation can subscribe to the base view
session.

Non-VCL NAL units SHOULD be conveyed in the same session as the
associated VCL NAL units. To meet this, SEI messages that are
contained in a scalable nesting SEI message and are applicable to
more than one session SHOULD be separated and placed in multiple
scalable nesting SEI messages.
The CL-DON values MUST indicate the cross-layer decoding order number
values as if all these SEI messages were in separate scalable nesting
SEI messages and contained in the beginning of the corresponding
access units as specified in [MVC].

8. De-Packetization Process (Informative)

For a single RTP session, the de-packetization process specified in
section 7 of [RFC3984] applies.

For a receiver of more than one of the multiple RTP sessions
conveying a scalable bitstream, an example of a suitable
implementation of the de-packetization process is to be specified.

9. Payload Format Parameters

This section specifies the parameters that MAY be used to select
optional features of the payload format and certain features of the
bitstream. The parameters are specified here as part of the media
type registration for the MVC codec. A mapping of the parameters
into the Session Description Protocol (SDP) [RFC4566] is also
provided for applications that use SDP. Equivalent parameters could
be defined elsewhere for use with control protocols that do not use
SDP.

9.1. Media Type Registration

The media subtype for the MVC codec is allocated from the IETF tree.
The receiver MUST ignore any unspecified parameter.

Informative note: Requiring receivers to ignore unspecified
parameters allows for backward compatibility of future extensions.
For example, if a future specification that is backward compatible
with this specification specifies some new parameters, then a
receiver according to this specification is capable of receiving data
per the new payload but ignoring those parameters newly specified in
the new payload specification. This sentence is also present in
RFC 3984.

Media Type name: video

Media subtype name: H264-MVC or H264

The media subtype "H264" MUST be used for RTP streams using RFC 3984,
i.e.
not using any of the new features introduced by this specification
compared to RFC 3984. For RTP streams using any of the new features
introduced by this specification compared to RFC 3984, the media
subtype "H264-MVC" SHOULD be used, and the media subtype "H264" MAY
be used. Use of the media subtype "H264" for RTP streams using the
new features allows RFC 3984 receivers to negotiate and receive
H.264/AVC or MVC streams packetized according to this specification,
but to ignore media parameters and NAL unit types they do not
recognize.

Required parameters: none

OPTIONAL parameters: to be specified.

Encoding considerations:
   This type is only defined for transfer via RTP (RFC 3550).

Security considerations:
   See section 10 of RFC XXXX.

Public specification:
   Please refer to RFC XXXX and its section 14.

Additional information: none

File extensions: none

Macintosh file type code: none

Object identifier or OID: none

Person & email address to contact for further information:

Intended usage: COMMON

Author: NN

Change controller:
   IETF Audio/Video Transport working group delegated from the IESG.

9.2. SDP Parameters

9.2.1. Mapping of Payload Type Parameters to SDP

The media type video/H264-MVC string is mapped to fields in the
Session Description Protocol (SDP) as follows:

The media name in the "m=" line of SDP MUST be video.

The encoding name in the "a=rtpmap" line of SDP MUST be H264-MVC (the
media subtype).

The clock rate in the "a=rtpmap" line MUST be 90000.

The OPTIONAL parameters, when present, MUST be included in the
"a=fmtp" line of SDP. These parameters are expressed as a media type
string, in the form of a semicolon-separated list of parameter=value
pairs.

9.2.2. Usage with the SDP Offer/Answer Model

TBD.

9.2.3.
Usage with Session Multiplexing

If session multiplexing is used, the rules on signaling media
decoding dependency in SDP as defined in
[I-D.draft-ietf-mmusic-decoding-dependency] apply.

9.2.4. Usage in Declarative Session Descriptions

TBD.

9.3. Examples

TBD.

9.4. Parameter Set Considerations

Please see section 10 of [RFC3984].

10. Security Considerations

Please see section 11 of [RFC3984].

11. Congestion Control

TBD.

12. IANA Considerations

Request for media type registration to be added.

13. References

13.1. Normative References

[H.264] ITU-T Recommendation H.264, "Advanced video coding for
   generic audiovisual services", Version 4, July 2005.

[I-D.draft-ietf-avt-rtp-svc] Wenger, S., Schierl, T., and Wang,
   Y.-K., "RTP payload format for SVC video",
   draft-ietf-avt-rtp-svc-03 (work in progress), November 2007.

[I-D.draft-ietf-mmusic-decoding-dependency] Schierl, T., and Wenger,
   S., "Signaling media decoding dependency in Session Description
   Protocol (SDP)", draft-ietf-mmusic-decoding-dependency-00 (work in
   progress), November 2007.

[MPEG4-10] ISO/IEC International Standard 14496-10:2005.

[MVC] Joint Video Team, "Joint Draft 4 of MVC", available from
   http://ftp3.itu.ch/av-arch/jvt-site/2007_06_Geneva/JVT-X209.zip,
   Geneva, Switzerland, June 2007.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
   Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3548] Josefsson, S., "The Base16, Base32, and Base64 Data
   Encodings", RFC 3548, July 2003.

[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and Jacobson,
   V., "RTP: A Transport Protocol for Real-Time Applications", STD 64,
   RFC 3550, July 2003.

[RFC3984] Wenger, S., Hannuksela, M., Stockhammer, T., Westerlund,
   M., and Singer, D., "RTP Payload Format for H.264 Video", RFC 3984,
   February 2005.

[RFC4566] Handley, M., Jacobson, V., and Perkins, C., "SDP: Session
   Description Protocol", RFC 4566, July 2006.

[SVC] Joint Video Team, "Joint Draft 11 of SVC Amendment",
   available from
   http://ftp3.itu.ch/av-arch/jvt-site/2007_06_Geneva/JVT-X201.zip,
   Geneva, Switzerland, June 2007.

13.2. Informative References

[DVB-H] DVB - Digital Video Broadcasting (DVB); DVB-H
   Implementation Guidelines, ETSI TR 102 377, 2005.

[H.241] ITU-T Rec. H.241, "Extended video procedures and control
   signals for H.300-series terminals", May 2006.

[IGMP] Cain, B., Deering, S., Kouvelas, I., Fenner, B., and
   Thyagarajan, A., "Internet Group Management Protocol, Version 3",
   RFC 3376, October 2002.

[McCanne] McCanne, S., Jacobson, V., and Vetterli, M.,
   "Receiver-driven layered multicast", in Proc. of ACM SIGCOMM'96,
   pages 117-130, Stanford, CA, August 1996.

[MBMS] 3GPP - Technical Specification Group Services and System
   Aspects; Multimedia Broadcast/Multicast Service (MBMS); Protocols
   and codecs (Release 6), December 2005.

[MPEG2] ISO/IEC International Standard 13818-2:1993.

[RFC3450] Luby, M., Gemmell, J., Vicisano, L., Rizzo, L., and
   Crowcroft, J., "Asynchronous layered coding (ALC) protocol
   instantiation", RFC 3450, December 2002.

14. Authors' Addresses

Ye-Kui Wang               Phone: +358-50-486-7004
Nokia Research Center     Email: ye-kui.wang@nokia.com
P.O. Box 100
FIN-33721 Tampere
Finland

Thomas Schierl            Phone: +49-30-31002-227
Fraunhofer HHI            Email: schierl@hhi.fhg.de
Einsteinufer 37
D-10587 Berlin
Germany

15.
Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.

16. Disclaimer of Validity

This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

17. Copyright Statement

Copyright (C) The IETF Trust (2007).

This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.

18. Acknowledgment

Funding for the RFC Editor function is currently provided by the
Internet Society.