AVTCORE                                                S. Garcia Murillo
Internet-Draft                                                     CoSMo
Intended status: Standards Track                               Y. Fablet
Expires: 12 January 2022                                      Apple Inc.
                                                           A. Gouaillard
                                                                   CoSMo
                                                               J. Uberti
                                                               Clubhouse
                                                            11 July 2021


                     Multi Codec RTP payload format
          draft-murillo-avtcore-multi-codec-payload-format-01

Abstract

   RTP Media Chains usually rely on piping encoder output directly to
   packetizers. Media packetization formats often support a specific
   codec format and optimize RTP packet generation accordingly.
   With the development of Selective Forwarding Unit (SFU) solutions,
   RTP Media Chains used in WebRTC solutions increasingly rely on
   application-specific transforms that sit between encoder and
   packetizer on one end and between depacketizer and decoder on the
   other end. These transforms typically encrypt media content so
   that it is not readable by the SFU, for instance using [SFrame] or
   [WebRTCInsertableStreams]. In that context, RTP packetizers can no
   longer expect to use packetization formats that mandate media
   content to be in a specific codec format. This document addresses
   that problem by describing an RTP packetization format that can be
   used for many kinds of media content, and how to negotiate use of
   this format. This document also describes a solution that allows
   SFUs to continue performing packet routing on top of this RTP
   packetization format.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 12 January 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document. Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Goals
   3.  RTP Packetization
   4.  Payload Multiplexing
   5.  SDP Negotiation
   6.  SFU Packet Selection
   7.  Sender Processing Rules
   8.  Redundancy Techniques Considerations
     8.1.  Retransmission Techniques
     8.2.  Forward Error Correction (FEC) Techniques
     8.3.  Redundant Audio Data Techniques
   9.  Alternatives
     9.1.  Generic Packetization With In-Payload APT
     9.2.  A Payload Type for Generic Packetization AND Media Format
     9.3.  A RTP Header To Choose Packetization
   10. Security Considerations
   11. IANA Considerations
     11.1.  Registration of audio/generic
   12. Registration of video/generic
   13. References
     13.1.  Normative References
     13.2.  Informative References
   Authors' Addresses

1.  Introduction

   As per Figure 1 of [RFC7656], a Media Packetizer transforms a single
   Encoded Stream into one or several RTP packets. The Encoded Stream
   comes straight from the Media Encoder and is expected to follow the
   format produced by the Media Encoder. A number of Media Packetizer
   formats have been designed to process a specific format produced by
   a Media Encoder. For instance, [RFC6184] is dedicated to the
   processing of content produced by H.264 Media Encoders, and
   generates packets following the NALU organization.

   WebRTC applications are increasingly deploying end-to-end encryption
   solutions on top of RTP Media Chains. End-to-end encryption is
   implemented by inserting application-specific Media Transformers
   between Media Encoder and Media Packetizer on the sending side, and
   between Media Depacketizer and Media Decoder on the receiving side,
   as described in Figure 1 and Figure 2. To support end-to-end
   encryption, Media Transformers can use the [SFrame] format. In
   browsers, Media Transformers are implemented using
   [WebRTCInsertableStreams], for instance by injecting JavaScript code
   provided by web pages.

      Physical Stimulus
              |
              V
   +----------------------+
   |     Media Capture    |
   +----------------------+
              |
          Raw Stream
              V
   +----------------------+
   |     Media Source     |<-- Synchronization Timing
   +----------------------+
              |
        Source Stream
              V
   +----------------------+
   |    Media Encoder     |
   +----------------------+
              |
        Encoded Stream
              V
   +----------------------+
   |   Media Transformer  |<-- NEW: application-specific transform
   +----------------------+         (e.g. SFrame Encryption)
              |
      Transformed Stream     +-------------+
              V              |             V
   +----------------------+  |  +----------------------+
   |   Media Packetizer   |  |  | RTP-Based Redundancy |
   +----------------------+  |  +----------------------+
              |              |             |
              +--------------+   Redundancy RTP Stream
      Source RTP Stream                    |
              V                            V
   +----------------------+     +----------------------+
   |  RTP-Based Security  |     |  RTP-Based Security  |
   +----------------------+     +----------------------+
              |                            |
      Secured RTP Stream     Secured Redundancy RTP Stream
              V                            V
   +----------------------+     +----------------------+
   |   Media Transport    |     |   Media Transport    |
   +----------------------+     +----------------------+

        Figure 1: Sender side concepts in the Media Chain with
                   application-level Media Transform

   These RTP packets are sent over the wire to a receiver media chain
   matching the sender side, reaching the Media Depacketizer that will
   reconstruct the Encoded Stream before passing it to the Media
   Decoder.

   +----------------------+     +----------------------+
   |   Media Transport    |     |   Media Transport    |
   +----------------------+     +----------------------+
     Received |                   Received | Secured
     Secured RTP Stream          Redundancy RTP Stream
              V                            V
   +----------------------+     +----------------------+
   | RTP-Based Validation |     | RTP-Based Validation |
   +----------------------+     +----------------------+
              |                            |
     Received RTP Stream    Received Redundancy RTP Stream
              |                            |
              |     +----------------------+
              V     V
   +----------------------+
   |   RTP-Based Repair   |
   +----------------------+
              |
     Repaired RTP Stream
              V
   +----------------------+
   |  Media Depacketizer  |
   +----------------------+
              |
   Received Transformed Stream
              V
   +----------------------+
   |   Media Transformer  |<-- NEW: application-specific transform
   +----------------------+         (e.g. SFrame Decryption)
              |
    Received Encoded Stream
              V
   +----------------------+
   |    Media Decoder     |
   +----------------------+
              |
    Received Source Stream
              V
   +----------------------+
   |      Media Sink      |--> Synchronization Information
   +----------------------+
              |
     Received Raw Stream
              V
   +----------------------+
   |     Media Render     |
   +----------------------+
              |
              V
      Physical Stimulus

       Figure 2: Receiver side concepts in the Media Chain with
                   application-level Media Transform

   This packetization does not change how one or several encoded or
   dependent streams are mapped to RTP streams or how the
   synchronization source(s) (SSRC) are assigned.

   Given the use of post-encoder application-specific transforms, the
   whole Media Chain needs to be made aware of them. This includes the
   sender post-transform Media Chain, Media Transport intermediaries
   (SFUs typically) and the receiver pre-transform Media Chain.

   As these transforms can alter Encoded Streams in any possible way,
   the use of codec-specific Media Packetizers like [RFC6184] on a
   Transformed Stream may be suboptimal on the sender side. It may
   also be problematic on the receiving side if codec-specific
   processing is done prior to the Media Transformer. Media Transport
   intermediaries often look at the Media Content itself to fuel their
   packet selection algorithms.

2.  Goals

   The objective of this document is to support inserting any
   application-specific transform between encoders and packetizers in
   the Media Chain. For that purpose, this document will:

   1.  Provide a packetization format that supports the multiple kinds
       of media content used by WebRTC applications (audio compressed
       by Opus, video compressed by H264 or VP8, encrypted content...)
       and that allows reuse of the existing RTP mechanisms in place in
       WebRTC applications such as RTX, RED or FEC.

   2.  Provide a way to negotiate use of this packetization format
       between sender and receiver, with minimum impact on existing
       negotiation approaches.

   3.  Provide side-channel information so that network intermediaries
       (SFUs in particular) can apply their existing packet routing
       strategies without inspecting the media content.

3.  RTP Packetization

   This packetizer, by design, is not expected to understand the format
   of the media to transmit. The unit used by the packetizer to do
   processing is called a frame in the remainder of the document.

   It is the responsibility of the application using the packetizer to
   group media content in meaningful frames. In the common case of a
   video codec, the packetizer frame is the frame in byte format (H.264
   Annex B for example) generated by the encoder.

   If the application wants to transform encoded content, the
   application needs to split the encoded content into frames prior to
   the transform. Each frame is then transformed independently, for
   instance encrypted using [SFrame]. The content of each transformed
   frame is then processed by the packetizer.

   In the case of a video codec supporting spatial scalability, each
   spatial layer MUST be split in its own frame by the application
   before passing it to the packetizer.

   When the packetizer receives a frame from the application, it MUST
   fragment the frame content into multiple RTP packets as needed to
   ensure packets do not exceed the network maximum transmission unit.
   The content of the frame is treated as a binary blob by the
   packetizer, so the boundaries of each fragment are chosen
   arbitrarily by the packetizer. The packetizer or any relaying
   server MUST NOT modify the frame content, and concatenating the RTP
   payloads of the RTP packets of each frame MUST produce the exact
   binary content of the input frame.
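
   The fragmentation rule above can be illustrated with a minimal
   Python sketch. It is not part of the payload format definition: the
   function names and the 1200-byte payload budget are assumptions
   chosen for the example, and RTP headers, sequence numbers and marker
   bits are deliberately left out.

```python
from typing import List


def fragment_frame(frame: bytes, max_payload_size: int) -> List[bytes]:
    """Split an opaque frame into RTP payload chunks of bounded size."""
    if max_payload_size <= 0:
        raise ValueError("max_payload_size must be positive")
    # Boundaries are arbitrary: the packetizer never parses the frame.
    return [frame[i:i + max_payload_size]
            for i in range(0, len(frame), max_payload_size)]


def reassemble_frame(payloads: List[bytes]) -> bytes:
    """Concatenate payloads that are already in sequence-number order."""
    return b"".join(payloads)


# Hypothetical 2560-byte frame and 1200-byte payload budget.
frame = bytes(range(256)) * 10
payloads = fragment_frame(frame, 1200)
print([len(p) for p in payloads])          # [1200, 1200, 160]
print(reassemble_frame(payloads) == frame)  # True
```

   The round trip shows the property the text requires: concatenating
   the payloads in order yields the input frame byte for byte, whatever
   fragment boundaries were chosen.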

   The marker bit of each RTP packet in a frame MUST be set according to
   the audio and video profiles specified in [RFC3551].

   The spatial layer frames are sent in ascending order, with the same
   RTP timestamp, and only the last RTP packet of the last spatial layer
   frame will have the marker bit set to 1.

4.  Payload Multiplexing

   In order to reduce the number of payload types in the SDP exchange,
   a single payload type code for this multi-codec packetization can be
   used for all negotiated media formats that the multi-codec
   packetization supports. This requires identifying the original
   payload type code of the frame's negotiated media format, called the
   associated payload type (APT) hereafter. The APT value is the
   payload type code of the associated format passed to the multi-codec
   Media Packetizer before any transformation is applied.

   The APT value is sent in a dedicated header extension. The payload
   of this header extension can be encoded using either the one-byte or
   two-byte header defined in [RFC5285]. Figures 3 and 4 show an
   example of each encoding.

       0                   1
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |  ID   | len=0 |S|     APT     |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

      Figure 3: Frame associated payload type encoding using the One-
                            Byte header format

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      ID       |     len=1     |S|     APT     |    0 (pad)    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

      Figure 4: Frame associated payload type encoding using the Two-
                            Byte header format

   The APT field carries the associated payload type value. The S bit
   indicates if the media stream can be forwarded safely starting from
   this RTP packet.
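
   As an illustration of the one-byte layout in Figure 3, the following
   Python sketch packs and parses a single extension element. It
   assumes the one-byte element format of the general RTP header
   extension mechanism (a 4-bit ID followed by a 4-bit length field
   holding the data length minus one). The function names and the
   extension ID 4 are hypothetical and are not defined by this
   document.

```python
from typing import Tuple


def pack_apt_element(ext_id: int, apt: int, s_bit: bool) -> bytes:
    """Build a one-byte-header extension element carrying S and APT."""
    if not 1 <= ext_id <= 14:
        raise ValueError("one-byte header IDs are limited to 1..14")
    if not 0 <= apt <= 127:
        raise ValueError("APT is a 7-bit payload type")
    # The 4-bit len field holds (data length - 1), so 0 means one byte.
    header = (ext_id << 4) | 0
    # S occupies the most significant bit, APT the remaining 7 bits.
    data = (0x80 if s_bit else 0x00) | apt
    return bytes([header, data])


def parse_apt_element(element: bytes) -> Tuple[int, int, bool]:
    """Return (ext_id, apt, s_bit) from a one-byte-header element."""
    if len(element) != 2 or (element[0] & 0x0F) != 0:
        raise ValueError("expected a one-byte header with len=0")
    ext_id = element[0] >> 4
    return ext_id, element[1] & 0x7F, bool(element[1] & 0x80)


element = pack_apt_element(ext_id=4, apt=96, s_bit=True)
print(element.hex())               # 40e0
print(parse_apt_element(element))  # (4, 96, True)
```

   An intermediary that only needs the forwarding decision can read the
   S bit from the second byte without touching the RTP payload.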
   Typically, the S bit will be set to 1 on the first RTP packet of an
   intra video frame and in all RTP audio packets.

   Receivers MUST be ready to receive RTP packets with different
   associated payload types in the same way they would receive different
   payload type codes on the RTP packets.

   The URI for declaring this header extension in an extmap attribute is
   "urn:ietf:params:rtp-hdrext:associated-payload-type".

5.  SDP Negotiation

   To use the multi-codec packetization, the SDP Offer/Answer exchange
   MUST negotiate:

   -  The payload type of the negotiated codec format

   -  The multi-codec payload type

   -  The associated payload type header extension

   Only the negotiated payload types are allowed to be used as
   associated payload types. Figure 5 illustrates an SDP that
   negotiates exchange of video using either VP8 or VP9 codecs with the
   possibility to use the multi-codec packetization. In this example,
   RTX is also negotiated and will be applied normally on each
   associated payload type.

   m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101
   c=IN IP4 0.0.0.0
   a=rtcp:9 IN IP4 0.0.0.0
   a=setup:actpass
   a=mid:1
   a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
   a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
   a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
   a=extmap:4 urn:ietf:params:rtp-hdrext:associated-payload-type
   a=sendrecv
   a=rtpmap:96 vp9/90000
   a=rtpmap:97 vp8/90000
   a=rtpmap:98 generic/90000
   a=rtpmap:99 rtx/90000
   a=fmtp:99 apt=96
   a=rtpmap:100 rtx/90000
   a=fmtp:100 apt=97
   a=rtpmap:101 rtx/90000
   a=fmtp:101 apt=98

      Figure 5: SDP example negotiating the multi-codec payload type
                  and related header extension for video

6.  SFU Packet Selection

   SFUs need to have a basic understanding of each frame they receive so
   they can decide whether to forward it and to which endpoint.
   They might need similar information to support media content
   recording. This information is either generic to a group of frames
   (called a stream hereafter) or specific to each frame.

   The information is transmitted as an RTP header extension, as the
   RTP packet payload should be treated as opaque by the SFU. This is
   especially necessary if the payload is end-to-end encrypted. The
   amount of information should be limited to what is strictly necessary
   for the SFU's task, since the SFU is not always as trusted as
   individual peers.

   For audio, configuration information such as Opus TOC might be
   useful. For video, configuration information might include:

   -  Stream configuration information: resolution, quality, frame
      rate...

   -  Codec specific configuration information: codec profile like
      profile_idc...

   -  Frame specific information: whether the stream is decodable when
      starting from this frame, whether the frame is skippable...

   For video content, this information is sent using a Dependency
   Descriptor header extension. In that case, the first RTP packet of
   the frame will have its start_of_frame equal to 1 and the last packet
   will have its end_of_frame equal to 1.

7.  Sender Processing Rules

   The sender identifies the use of the multi-codec payload format by
   using the urn:ietf:params:rtp-hdrext:associated-payload-type
   extension. When doing so, the sender follows these additional rules:

   -  For audio content, the associated payload type MUST reference an
      audio codec in the supported audio codec list. The supported
      audio codec list contains the audio codecs enumerated in
      [RFC7874]. This list may be extended in future versions of this
      specification.

   -  For video content, H.264 and VP8 are supported as described in
      [RFC7742], as well as VP9 and AV1. If scalable video coding is
      used, the sender MUST generate a Dependency Descriptor header
      extension.
      This requires the associated payload type to reference a video
      codec that can be described using the Dependency Descriptor
      header extension. This also requires the sender to split the
      video encoder output in frames that can each be described using
      the Dependency Descriptor header extension.

   These rules apply to both the originator of the content and SFUs
   that might route the content to end receivers.

8.  Redundancy Techniques Considerations

   The solution described in this document is expected to integrate well
   with the existing RTP ecosystem. This section describes how the
   multi-codec packetizer can be used jointly with existing techniques
   that mitigate the effects of unreliable transports.

8.1.  Retransmission Techniques

   [RFC4588] defines a retransmission payload format (RTX) that can be
   used in case of packet loss. As defined in [RFC4588], RTX is able to
   handle any payload format, including the format described in this
   document. Given that RTX preserves both RTP packet payload and
   headers, the receiver will be able to identify the payload type of
   the recovered packet and whether multi-codec packetization is used.
   RTX will also allow recovering RTP header extensions that convey
   information on the media content itself.

8.2.  Forward Error Correction (FEC) Techniques

   FEC is another technique used in RTP Media Chains to protect media
   content against packet loss. [RFC5109] defines a payload format
   used to transmit FEC data that protects specific packets.

   FEC may protect some parts of the media content more than others.
   For instance, intra video frame encoded data or important network
   abstraction layer units (NALUs) like SPS/PPS may be more protected.
   With a post-encoder transform and the use of a multi-codec
   packetization, the granularity of the recovery mechanism is no longer
   at the NALU level but at the level of the frame generated by the
   post-encoder transform. In case an SVC codec is used, each spatial
   layer will be processed as an independent frame. In that case, base
   layers can be protected more heavily than higher resolution layers.

8.3.  Redundant Audio Data Techniques

   As defined in [RFC7656], RTP-based redundancy is a transformation
   that generates redundant or repair packets sent out as a Redundancy
   RTP Stream to mitigate Network Transport impairments, like packet
   loss and delay.

   [RFC2198] defines a payload format for sending the same audio data
   encoded multiple times at different quality levels. This allows a
   lower quality encoding of the audio data to be used if the higher
   quality encoding is lost during transmission.

   If a Media Transformation is in use, both the primary and redundant
   encoding must be transformed independently and the redundant packet
   created normally. As the RTP headers present in the redundant packet
   are only applicable to the primary encoding, if the payload type for
   a redundant encoding block is mapped to the multi-codec packetizer,
   the value of the associated payload type for the primary encoding is
   applied to the redundant encoding block as well.

9.  Alternatives

   Various alternatives can be used to implement and negotiate multi-
   codec packetization. This section describes a few additional
   alternatives. This section is to be removed before finalization of
   the document.

9.1.  Generic Packetization With In-Payload APT

   Instead of using an RTP header extension to convey the APT value,
   the value is prepended to the RTP payload itself.
   As the value cannot change within a frame, it is prepended only to
   the first packet generated for the frame. This removes the need to
   negotiate a dedicated header extension, but may require the SFU to
   update the payload when sending or recording content.

9.2.  A Payload Type for Generic Packetization AND Media Format

   The payload type is negotiated in the SDP so as to identify both the
   negotiated codec format and the multi-codec packetization use. There
   is no network cost but this increases the number of payload types
   used in the SDP.

   m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101 102 103
   c=IN IP4 0.0.0.0
   a=rtcp:9 IN IP4 0.0.0.0
   a=setup:actpass
   a=mid:1
   a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
   a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
   a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
   a=sendrecv
   a=rtpmap:96 vp9/90000
   a=rtpmap:97 generic/90000
   a=fmtp:97 apt=96
   a=rtpmap:98 vp8/90000
   a=rtpmap:99 generic/90000
   a=fmtp:99 apt=98
   a=rtpmap:100 rtx/90000
   a=fmtp:100 apt=96
   a=rtpmap:101 rtx/90000
   a=fmtp:101 apt=97
   a=rtpmap:102 rtx/90000
   a=fmtp:102 apt=98
   a=rtpmap:103 rtx/90000
   a=fmtp:103 apt=99

     Figure 6: SDP example negotiating a payload type for format and
                        multi-codec packetization

   A variation of this approach is to consider defining several multi-
   codec payload types, each of them having an identified codec format.

   m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99
   c=IN IP4 0.0.0.0
   a=rtcp:9 IN IP4 0.0.0.0
   a=setup:actpass
   a=mid:1
   a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
   a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
   a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
   a=sendrecv
   a=rtpmap:96 generic/90000
   a=fmtp:96 codec=vp9
   a=rtpmap:97 generic/90000
   a=fmtp:97 codec=vp8
   a=rtpmap:98 rtx/90000
   a=fmtp:98 apt=96
   a=rtpmap:99 rtx/90000
   a=fmtp:99 apt=97

    Figure 7: Alternative SDP example negotiating a payload type for
                  format and multi-codec packetization

9.3.  A RTP Header To Choose Packetization

   An RTP header extension can be used to flag content as opaque so
   that the receiver knows whether or not to use the multi-codec
   packetization. As for the APT header extension, this RTP header
   extension may not need to be sent with every packet; it could for
   instance be sent with the first packet of every intra video frame.
   The main advantage of this approach is the reduced impact on SDP
   negotiation.

   m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99
   c=IN IP4 0.0.0.0
   a=rtcp:9 IN IP4 0.0.0.0
   a=setup:actpass
   a=mid:1
   a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
   a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
   a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
   a=extmap:4 urn:ietf:params:rtp-hdrext:multi-codec-packetization-use
   a=sendrecv
   a=rtpmap:96 vp9/90000
   a=rtpmap:97 vp8/90000
   a=rtpmap:98 rtx/90000
   a=fmtp:98 apt=96
   a=rtpmap:99 rtx/90000
   a=fmtp:99 apt=97

     Figure 8: SDP example negotiating multi-codec packetization as
                          RTP header extension

10.  Security Considerations

   RTP packets using the payload format defined in this specification
   are subject to the general security considerations discussed in
   [RFC3550].
   The solution presented in this document is not expected to create
   new security threats. The use and implementation of RTP Media
   Chains containing Media Transformers needs to be done carefully. It
   is important to refer to the security considerations discussed in
   [SFrame] and [WebRTCInsertableStreams]. In particular, Media
   Transformers on the receiver side need to be prepared to receive
   arbitrary content, like decoders already do. Similarly, since Media
   Transformers can be implemented as JavaScript in browsers, RTP
   Packetizers should be prepared to receive arbitrary content.

11.  IANA Considerations

   Two new media subtypes have been registered with IANA, as described
   in this section.

11.1.  Registration of audio/generic

   Type name: audio

   Subtype name: generic

   Required parameters: none

   Optional parameters: none

   Encoding considerations: This format is framed (see Section 4.8 in
   the template document) and contains binary data.

   Security considerations: TBD.

   Interoperability considerations: TBD

   Published specification: TBD.

   Applications that use this media type: TBD.

   Additional information: none

   Intended usage: COMMON

   Restrictions on usage: TBD

   Author:

   Change controller:

12.  Registration of video/generic

   Type name: video

   Subtype name: generic

   Required parameters: none

   Optional parameters: none

   Encoding considerations: This format is framed (see Section 4.8 in
   the template document) and contains binary data.

   Security considerations: TBD.

   Interoperability considerations: TBD

   Published specification: TBD.

   Applications that use this media type: TBD.

   Additional information: none

   Intended usage: COMMON

   Restrictions on usage: TBD

   Author:

   Change controller:

13.  References

13.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
              July 2003.

   [RFC3551]  Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
              Video Conferences with Minimal Control", STD 65, RFC 3551,
              DOI 10.17487/RFC3551, July 2003.

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, DOI 10.17487/RFC3711, March 2004.

   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
              Description Protocol", RFC 4566, DOI 10.17487/RFC4566,
              July 2006.

   [RFC5285]  Singer, D. and H. Desineni, "A General Mechanism for RTP
              Header Extensions", RFC 5285, DOI 10.17487/RFC5285, July
              2008.

   [RFC7656]  Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and
              B. Burman, Ed., "A Taxonomy of Semantics and Mechanisms
              for Real-Time Transport Protocol (RTP) Sources", RFC 7656,
              DOI 10.17487/RFC7656, November 2015.

   [RFC8285]  Singer, D., Desineni, H., and R. Even, Ed., "A General
              Mechanism for RTP Header Extensions", RFC 8285,
              DOI 10.17487/RFC8285, October 2017.

13.2.  Informative References

   [RFC2198]  Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
              Handley, M., Bolot, J.C., Vega-Garcia, A., and S. Fosse-
              Parisis, "RTP Payload for Redundant Audio Data", RFC 2198,
              DOI 10.17487/RFC2198, September 1997.

   [RFC4588]  Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R.
              Hakenberg, "RTP Retransmission Payload Format", RFC 4588,
              DOI 10.17487/RFC4588, July 2006.

   [RFC5109]  Li, A., Ed., "RTP Payload Format for Generic Forward Error
              Correction", RFC 5109, DOI 10.17487/RFC5109, December
              2007.

   [RFC6184]  Wang, Y.-K., Even, R., Kristensen, T., and R. Jesup, "RTP
              Payload Format for H.264 Video", RFC 6184,
              DOI 10.17487/RFC6184, May 2011.

   [RFC6464]  Lennox, J., Ed., Ivov, E., and E. Marocco, "A Real-time
              Transport Protocol (RTP) Header Extension for Client-to-
              Mixer Audio Level Indication", RFC 6464,
              DOI 10.17487/RFC6464, December 2011.

   [RFC6465]  Ivov, E., Ed., Marocco, E., Ed., and J. Lennox, "A Real-
              time Transport Protocol (RTP) Header Extension for Mixer-
              to-Client Audio Level Indication", RFC 6465,
              DOI 10.17487/RFC6465, December 2011.

   [RFC6904]  Lennox, J., "Encryption of Header Extensions in the Secure
              Real-time Transport Protocol (SRTP)", RFC 6904,
              DOI 10.17487/RFC6904, April 2013.

   [RFC7742]  Roach, A.B., "WebRTC Video Processing and Codec
              Requirements", RFC 7742, DOI 10.17487/RFC7742, March 2016.

   [RFC7874]  Valin, JM. and C. Bran, "WebRTC Audio Codec and Processing
              Requirements", RFC 7874, DOI 10.17487/RFC7874, May 2016.

   [SFrame]   "Secure Frame (SFrame)", n.d.

   [WebRTCInsertableStreams]
              "WebRTC Insertable Media using Streams", n.d.

Authors' Addresses

   Sergio Garcia Murillo
   CoSMo

   Email: sergio.garcia.murillo@cosmosoftware.io


   Youenn Fablet
   Apple Inc.

   Email: youenn@apple.com


   Alex Gouaillard
   CoSMo

   Email: alex.gouaillard@cosmosoftware.io


   Justin Uberti
   Clubhouse

   Email: justin@uberti.name