idnits 2.17.1 draft-gouaillard-avtcore-codec-agn-rtp-payload-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The abstract seems to contain references ([SFrame], [WebRTCInsertableStreams]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == The document seems to lack the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. (The document does seem to have the reference to RFC 2119 which the ID-Checklist requires). -- The document date (March 08, 2021) is 1107 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'RFC2119' is defined on line 671, but no explicit reference was found in the text == Unused Reference: 'RFC3711' is defined on line 686, but no explicit reference was found in the text == Unused Reference: 'RFC4566' is defined on line 691, but no explicit reference was found in the text == Unused Reference: 'RFC8285' is defined on line 705, but no explicit reference was found in the text == Unused Reference: 'RFC6464' is defined on line 732, but no explicit reference was found in the text == Unused Reference: 'RFC6465' is defined on line 738, but no explicit reference was found in the text == Unused Reference: 'RFC6904' is defined on line 744, but no explicit reference was found in the text ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866) ** Obsolete normative reference: RFC 5285 (Obsoleted by RFC 8285) ** Downref: Normative reference to an Informational RFC: RFC 7656 Summary: 4 errors (**), 0 flaws (~~), 9 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 AVTCORE S. Garcia Murillo 2 Internet-Draft CoSMo Software 3 Intended status: Standards Track Y. Fablet 4 Expires: September 9, 2021 Apple Inc. 5 A. Gouaillard 6 CoSMo Software 7 March 08, 2021 9 Codec agnostic RTP payload format for video 10 draft-gouaillard-avtcore-codec-agn-rtp-payload-01 12 Abstract 14 RTP Media Chains usually rely on piping encoder output directly to 15 packetizers. Media packetization formats often support a specific 16 codec format and optimize RTP packets generation accordingly. 
18 With the development of Selective Forwarding Unit (SFU) solutions that 19 do not process media content server side, the need for media content 20 processing at the origin and at the destination has arisen. 22 RTP Media Chains used e.g. in WebRTC solutions are increasingly 23 relying on application-specific transforms that sit in-between 24 encoder and packetizer on one end and in-between depacketizer and 25 decoder on the other end. This use case has become so important 26 that the W3C is standardizing the capacity to access encoded content 27 with the [WebRTCInsertableStreams] API proposal. An extremely 28 popular use case is application level end-to-end encryption of media 29 content, using for instance [SFrame]. 31 Whatever the modification applied to the media content, RTP 32 packetizers can no longer expect to use packetization formats that 33 mandate media content to be in a specific codec format. 35 In the extreme cases like encryption, where the RTP Payload is made 36 completely opaque to the SFUs, some extra mechanism must also be 37 added for them to be able to route the packets without depending on 38 RTP payload or payload headers. 40 The traditional process of creating a new RTP Payload specification 41 per content would not be practical as we would need to make a new one 42 for each codec-transform pair. 44 This document describes a solution, which provides the following 45 features in the case the encoded content has been modified before 46 reaching the packetizer: - a payload-agnostic RTP packetization 47 format that can be used on any media content, - a negotiation 48 mechanism for the above format and the inner payload. Both of the 49 above mechanisms are backward compatible with most of the (S)RTP/RTCP 50 mechanisms used for bandwidth estimation and congestion control in 51 RTP/SRTP/WebRTC, including but not limited to SSRC, RED, FEC, RTX, 52 NACK, SR/RR, REMB, transport-wide-CC, TMMBR, .... 
This is illustrated by 53 existing implementations in Chrome, Safari, and Medooze. 55 This document also describes a solution to allow SFUs to continue 56 performing packet routing on top of this generic RTP packetization 57 format. 59 This document complements the SFrame (media encryption) and 60 Dependency Descriptor (AV1 payload annex) documents to provide an 61 End-to-End-Encryption solution that would sit on top of SRTP/WebRTC, 62 use SFUs on the media back-end, and leverage W3C APIs in the browser. 63 A high-level description of such a system will be provided as an 64 informational I-D in the SFrame WG and then cited here. 66 Status of This Memo 68 This Internet-Draft is submitted in full conformance with the 69 provisions of BCP 78 and BCP 79. 71 Internet-Drafts are working documents of the Internet Engineering 72 Task Force (IETF). Note that other groups may also distribute 73 working documents as Internet-Drafts. The list of current Internet- 74 Drafts is at https://datatracker.ietf.org/drafts/current/. 76 Internet-Drafts are draft documents valid for a maximum of six months 77 and may be updated, replaced, or obsoleted by other documents at any 78 time. It is inappropriate to use Internet-Drafts as reference 79 material or to cite them other than as "work in progress." 81 This Internet-Draft will expire on September 9, 2021. 83 Copyright Notice 85 Copyright (c) 2021 IETF Trust and the persons identified as the 86 document authors. All rights reserved. 88 This document is subject to BCP 78 and the IETF Trust's Legal 89 Provisions Relating to IETF Documents 90 (https://trustee.ietf.org/license-info) in effect on the date of 91 publication of this document. Please review these documents 92 carefully, as they describe your rights and restrictions with respect 93 to this document. 
Code Components extracted from this document must 94 include Simplified BSD License text as described in Section 4.e of 95 the Trust Legal Provisions and are provided without warranty as 96 described in the Simplified BSD License. 98 Table of Contents 100 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 101 2. Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 102 3. RTP Packetization . . . . . . . . . . . . . . . . . . . . . . 7 103 4. Payload Multiplexing . . . . . . . . . . . . . . . . . . . . 7 104 5. SDP Negotiation . . . . . . . . . . . . . . . . . . . . . . . 8 105 6. SFU Packet Selection . . . . . . . . . . . . . . . . . . . . 9 106 7. Redundancy Techniques Considerations . . . . . . . . . . . . 10 107 7.1. Retransmission Techniques . . . . . . . . . . . . . . . . 10 108 7.2. Forward Error Correction (FEC) Techniques . . . . . . . . 10 109 7.3. Redundant Audio Data Techniques . . . . . . . . . . . . . 10 110 8. Alternatives . . . . . . . . . . . . . . . . . . . . . . . . 11 111 8.1. Generic Packetization With In-Payload APT . . . . . . . . 11 112 8.2. A Payload Type for Generic Packetization AND Media Format 11 113 8.3. A RTP Header To Choose Packetization . . . . . . . . . . 13 114 9. Security Considerations . . . . . . . . . . . . . . . . . . . 13 115 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14 116 10.1. Registration of audio/generic . . . . . . . . . . . . . 14 117 11. Registration of video/generic . . . . . . . . . . . . . . . . 14 118 12. References . . . . . . . . . . . . . . . . . . . . . . . . . 15 119 12.1. Normative References . . . . . . . . . . . . . . . . . . 15 120 12.2. Informative References . . . . . . . . . . . . . . . . . 16 121 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 17 123 1. Introduction 125 As per Figure 1 of [RFC7656], a Media Packetizer transforms a single 126 Encoded Stream into one or several RTP packets. 
The Encoded Stream 127 is coming straight from the Media Encoder and is expected to follow 128 the format produced by the Media Encoder. A number of Media 129 Packetizer formats have been designed to process a specific format 130 produced by Media Encoder. For instance [RFC6184] is dedicated to 131 the processing of content produced by H.264 Media Encoders, and 132 generates packets following NALUs organization. 134 WebRTC applications are increasingly deploying end-to-end encryption 135 solutions on top of RTP Media Chains. End-to-end encryption is 136 implemented by inserting application-specific Media Transformers 137 between Media Encoder and Media Packetizer on the sending side, and 138 between Media Depacketizer and Media Decoder on the receiving side, 139 as described in Figure 1 and Figure 2. To support end-to-end 140 encryption, Media Transformers can use the [SFrame] format. In 141 browsers, Media Transformers are implemented using 143 [WebRTCInsertableStreams], for instance by injecting JavaScript code 144 provided by web pages. 146 Physical Stimulus 147 | 148 V 149 +----------------------+ 150 | Media Capture | 151 +----------------------+ 152 | 153 Raw Stream 154 V 155 +----------------------+ 156 | Media Source |<-- Synchronization Timing 157 +----------------------+ 158 | 159 Source Stream 160 V 161 +----------------------+ 162 | Media Encoder | 163 +----------------------+ 164 | 165 Encoded Stream 166 V 167 +----------------------+ 168 | Media Transformer |<-- NEW: application-specific transform 169 +----------------------+ (e.g. 
SFrame Encryption) 170 | 171 Transformed Stream +------------+ 172 V | V 173 +----------------------+ | +----------------------+ 174 | Media Packetizer | | | RTP-Based Redundancy | 175 +----------------------+ | +----------------------+ 176 | | | 177 +-------------+ Redundancy RTP Stream 178 Source RTP Stream | 179 V V 180 +----------------------+ +----------------------+ 181 | RTP-Based Security | | RTP-Based Security | 182 +----------------------+ +----------------------+ 183 | | 184 Secured RTP Stream Secured Redundancy RTP Stream 185 V V 186 +----------------------+ +----------------------+ 187 | Media Transport | | Media Transport | 188 +----------------------+ +----------------------+ 189 Figure 1: Sender Side Concepts in the Media Chain 190 With Application-level Media Transform 192 These RTP packets are sent over the wire to a receiver media chain 193 matching the sender side, reaching the Media Depacketizer that will 194 reconstruct the Encoded Stream before passing it to the Media 195 Decoder. 197 +----------------------+ +----------------------+ 198 | Media Transport | | Media Transport | 199 +----------------------+ +----------------------+ 200 Received | Received | Secured 201 Secured RTP Stream Redundancy RTP Stream 202 V V 203 +----------------------+ +----------------------+ 204 | RTP-Based Validation | | RTP-Based Validation | 205 +----------------------+ +----------------------+ 206 | | 207 Received RTP Stream Received Redundancy RTP Stream 208 | | 209 | +--------------------+ 210 V V 211 +----------------------+ 212 | RTP-Based Repair | 213 +----------------------+ 214 | 215 Repaired RTP Stream 216 V 217 +----------------------+ 218 | Media Depacketizer | 219 +----------------------+ 220 | 221 Received Transformed Stream 222 V 223 +----------------------+ 224 | Media Transformer |<-- NEW: application-specific transform 225 +----------------------+ (e.g. 
SFrame Decryption) 226 | 227 Received Encoded Stream 228 V 229 +----------------------+ 230 | Media Decoder | 231 +----------------------+ 232 | 233 Received Source Stream 234 V 235 +----------------------+ 236 | Media Sink |--> Synchronization Information 237 +----------------------+ 238 | 239 Received Raw Stream 240 V 241 +----------------------+ 242 | Media Render | 243 +----------------------+ 244 | 245 V 246 Physical Stimulus 248 Figure 2: Receiver Side Concepts in the Media Chain 249 With Application-level Media Transform 251 This generic packetization does not change how 252 one or several encoded or dependent streams are mapped to the RTP 253 streams or how the synchronization source(s) (SSRC) are assigned. 255 Given the use of post-encoder application-specific transforms, the 256 whole Media Chain needs to be made aware of it. This includes the 257 sender post-transform Media Chain, Media Transport intermediaries 258 (SFUs typically) and receiver pre-transform Media Chain. 260 As these transforms can alter Encoded Streams in any possible way, 261 the use of codec-specific Media Packetizers like [RFC6184] on 262 Transformed Stream may be suboptimal on the sender side. It may also be 263 problematic on the receiving side in case codec-specific processing 264 is done prior to the Media Transformer. Media Transport intermediaries 265 are often looking at the Media Content itself to fuel their packet 266 selection algorithms. 268 2. Goals 270 The objective of this document is to support inserting any 271 application-specific transform between encoders and packetizers in 272 the Media Chain. For that purpose, this document will: 1. Provide a 273 generic packetization format that supports any media content 274 (compressed audio, compressed video, encrypted content...) and that 275 allows reuse of existing RTP mechanisms in place in WebRTC 276 applications such as RTX, RED or FEC. 2. 
Provide a way to negotiate 277 use of the generic packetization format between sender and receiver, 278 with minimum impact on existing negotiation approaches. 3. Provide 279 side-channel information so that network intermediaries (SFUs in 280 particular) can apply their existing packet routing strategies without 281 inspecting the media content. 283 3. RTP Packetization 285 A generic packetizer, by design, is not expected to understand the 286 format of the media to transmit. The unit used by the packetizer to 287 do processing is called a frame in the remainder of the document. 289 It is the responsibility of the application using the packetizer to 290 group media content in meaningful frames. In the common case of a 291 video codec, the packetizer frame is the frame in byte format (H.264 292 Annex B for example) generated by the encoder. 294 If the application wants to transform encoded content, the 295 application needs to split the encoded content into frames prior to the 296 transform. Each frame is then transformed independently, for 297 instance encrypted using [SFrame]. The content of each transformed 298 frame is then processed by the packetizer. 300 In the case of a video codec supporting spatial scalability, each 301 spatial layer MUST be split into its own frame by the application 302 before passing it to the packetizer. 304 When the packetizer receives a frame from the application, it MUST 305 fragment the frame content into multiple RTP packets to ensure packets 306 do not exceed the network maximum transmission unit. The content of 307 the frame will be treated as a binary blob by the packetizer, so the 308 boundaries of each fragment are decided arbitrarily 309 by the packetizer. The packetizer or any relaying server MUST NOT 310 modify the frame content, and concatenating the RTP payloads of the RTP 311 packets for each frame MUST produce the exact binary content of the 312 input frame. 
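The fragmentation rule above can be sketched as follows. This is a minimal illustration and not part of the specification: the GenericPacket class, the function names, and the 1200-byte default payload budget are assumptions chosen for the example. The marker handling assumes a single-layer video frame, and reassembly assumes the packets are already in order.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GenericPacket:
    """Hypothetical stand-in for an RTP packet; only the fields used here."""
    sequence_number: int
    timestamp: int
    marker: bool
    payload: bytes


def packetize_frame(frame: bytes, timestamp: int, first_seq: int,
                    max_payload_size: int = 1200) -> List[GenericPacket]:
    """Split an opaque frame into payloads of at most max_payload_size bytes.

    The frame is treated as a binary blob: fragment boundaries carry no
    meaning, and no byte of the frame is modified.
    """
    if max_payload_size <= 0:
        raise ValueError("max_payload_size must be positive")
    if not frame:
        raise ValueError("frame must not be empty")
    chunks = [frame[i:i + max_payload_size]
              for i in range(0, len(frame), max_payload_size)]
    packets = []
    for index, chunk in enumerate(chunks):
        packets.append(GenericPacket(
            # RTP sequence numbers are 16 bits wide and wrap around.
            sequence_number=(first_seq + index) & 0xFFFF,
            # Every packet of the same frame shares one RTP timestamp.
            timestamp=timestamp,
            # Assumption: single-layer video, marker on the last packet.
            marker=(index == len(chunks) - 1),
            payload=chunk,
        ))
    return packets


def depacketize_frame(packets: List[GenericPacket]) -> bytes:
    """Rebuild the frame by concatenating payloads (packets already ordered)."""
    return b"".join(packet.payload for packet in packets)
```

Because the packetizer never inspects the content, the same two functions would apply whether the frame holds raw encoder output or an SFrame-encrypted blob.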
314 The marker bit of each RTP packet in a frame MUST be set according to 315 the audio and video profiles specified in [RFC3551]. 317 The spatial layer frames are sent in ascending order, with the same 318 RTP timestamp, and only the last RTP packet of the last spatial layer 319 frame will have the marker bit set to 1. 321 4. Payload Multiplexing 323 In order to reduce the number of payload types in the SDP exchange, a 324 single payload type code for the generic packetization can be used 325 for all negotiated media formats. This requires identifying the 326 original payload type code of the frame's negotiated media format, 327 called the associated payload type (APT) hereunder. The APT value is 328 the payload type code of the associated format passed to the generic 329 Media Packetizer before any transformation is applied. 331 The APT value is sent in a dedicated header extension. The payload 332 of this header extension can be encoded using either the one-byte or 333 two-byte header defined in [RFC5285]. Figures 3 and 4 show an example 334 of each of these encodings. 336 0 1 337 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 338 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 339 | ID | len=0 |S| APT | 340 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 342 Figure 3: Frame Associated Payload Type Encoding Using the One-Byte 343 Header Format 345 0 1 2 3 346 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 347 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 348 | ID | len=1 |S| APT | 0 (pad) | 349 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 351 Figure 4: Frame Associated Payload Type Encoding Using the Two-Byte 352 Header Format 354 The APT field carries the associated payload type value. The S bit 355 indicates whether the media stream can be forwarded safely starting from 356 this RTP packet. Typically, it will be set to 1 on the first RTP 357 packet of an intra video frame and in all RTP audio packets. 
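The layouts in Figures 3 and 4 can be sketched as byte-level encoders. This is an illustrative sketch only: the function names are hypothetical, and the field widths are read off the figures and the one-byte and two-byte element formats of [RFC5285] (a 4-bit ID with a length-minus-one field, or an 8-bit ID with a plain length field). The extension ID used in the usage note is an arbitrary example of a negotiated extmap value. The trailing pad byte shown in Figure 4 belongs to the alignment of the whole extension block and is left to the caller.

```python
from typing import Tuple


def _apt_data_byte(apt: int, start: bool) -> int:
    """Pack the S bit (most significant bit) and the 7-bit APT into one byte."""
    if not 0 <= apt <= 127:
        raise ValueError("APT must fit in 7 bits")
    return (int(start) << 7) | apt


def encode_apt_one_byte(ext_id: int, apt: int, start: bool) -> bytes:
    """Encode the element using the one-byte header form (Figure 3)."""
    if not 1 <= ext_id <= 14:
        raise ValueError("one-byte header IDs range from 1 to 14")
    # The length field holds (data length - 1), so 0 means one data byte.
    return bytes([(ext_id << 4) | 0, _apt_data_byte(apt, start)])


def encode_apt_two_byte(ext_id: int, apt: int, start: bool) -> bytes:
    """Encode the element using the two-byte header form (Figure 4), unpadded."""
    if not 1 <= ext_id <= 255:
        raise ValueError("two-byte header IDs range from 1 to 255")
    # Here the length field holds the data length itself: 1 byte.
    return bytes([ext_id, 1, _apt_data_byte(apt, start)])


def decode_apt_data(data_byte: int) -> Tuple[bool, int]:
    """Return (S bit, APT) from the single data byte of the extension."""
    return bool(data_byte & 0x80), data_byte & 0x7F
```

For instance, encode_apt_one_byte(4, 96, True) yields the two bytes 0x40 0xE0: extension ID 4 with length field 0, followed by S=1 and APT=96. An SFU would only need decode_apt_data on the data byte to learn the inner payload type and whether forwarding can start at this packet.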
359 Receivers MUST be ready to receive RTP packets with different 360 associated payload types in the same way they would receive different 361 payload type codes on the RTP packets. 363 The URI for declaring this header extension in an extmap attribute is 364 "urn:ietf:params:rtp-hdrext:associated-payload-type". 366 5. SDP Negotiation 368 To use the RTP generic packetization, the SDP Offer/Answer exchange 369 MUST negotiate: - The payload type of the negotiated codec format - 370 The generic payload type - The associated payload type header 371 extension 373 Only the negotiated payload types are allowed to be used as 374 associated payload types. Figure 5 illustrates an SDP that negotiates 375 the exchange of video using either VP8 or VP9 codecs with the possibility 376 to use the generic packetization. In this example, RTX is also 377 negotiated and will be applied normally on each associated payload 378 type. 380 m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101 381 c=IN IP4 0.0.0.0 382 a=rtcp:9 IN IP4 0.0.0.0 383 a=setup:actpass 384 a=mid:1 385 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 386 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id 387 a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id 388 a=extmap:4 urn:ietf:params:rtp-hdrext:associated-payload-type 389 a=sendrecv 390 a=rtpmap:96 vp9/90000 391 a=rtpmap:97 vp8/90000 392 a=rtpmap:98 generic/90000 393 a=rtpmap:99 rtx/90000 394 a=fmtp:99 apt=96 395 a=rtpmap:100 rtx/90000 396 a=fmtp:100 apt=97 397 a=rtpmap:101 rtx/90000 398 a=fmtp:101 apt=98 400 Figure 5: SDP example negotiating the generic payload type and 401 related header extension for video 403 6. SFU Packet Selection 405 SFUs need to have a basic understanding of each frame they receive so 406 they can decide whether to forward it and to which endpoint. They 407 might need similar information to support media content recording. 
408 This information is either generic to a group of frames (called a 409 stream hereafter) or specific to each frame. 411 The information is transmitted as an RTP header extension as the RTP 412 packet payload should be treated as opaque by the SFU. This is 413 especially necessary if the payload is end-to-end encrypted. The 414 amount of information should be limited to what is strictly necessary 415 for the SFU's task, since the SFU is not always as trusted as individual 416 peers. 418 For audio, configuration information such as Opus TOC might be 419 useful. For video, configuration information might include: - Stream 420 configuration information: resolution, quality, frame rate... - Codec 421 specific configuration information: codec profile like profile_idc... 422 - Frame specific information: whether the stream is decodable when 423 starting from this frame, whether the frame is skippable... 425 For video content, this information can be sent using a Dependency 426 Descriptor header extension. In that case, the first RTP packet of 427 the frame will have its start_of_frame equal to 1 and the last packet 428 will have its end_of_frame equal to 1. 430 7. Redundancy Techniques Considerations 432 The solution described in this document is expected to integrate well 433 with the existing RTP ecosystem. This section describes how the 434 generic packetizer can be used jointly with existing techniques that 435 mitigate unreliable transports. 437 7.1. Retransmission Techniques 439 [RFC4588] defines a retransmission payload format (RTX) that can be 440 used in case of packet loss. As defined in [RFC4588], RTX is able to 441 handle any payload format, including the format described in this 442 document. Given that RTX preserves both RTP packet payload and headers, 443 the receiver will be able to identify the payload type of the 444 recovered packet and whether generic packetization is used. 
RTX will 445 also allow recovering RTP header extensions that convey information 446 on the media content itself. 448 7.2. Forward Error Correction (FEC) Techniques 450 FEC is another technique used in RTP Media Chains to protect media 451 content against packet loss. [RFC5109] defines such a payload format, 452 used to transmit FEC for the protection of specific packets. 454 FEC may protect some parts of the media content more than others. 455 For instance, intra video frame encoded data or important network 456 abstraction layer units (NALUs) like SPS/PPS may be more protected. 457 With a post-encoder transform and the use of a generic packetization, 458 the granularity of the recovery mechanism is no longer at the NALU 459 level but at the level of the frame generated by the post-encoder 460 transform. In case an SVC codec is used, each spatial layer will be 461 processed as an independent frame. In that case, base layers can be 462 protected more heavily than higher resolution layers. 464 7.3. Redundant Audio Data Techniques 466 As in [RFC7656], RTP-based redundancy is defined here as a 467 transformation that generates redundant or repair packets sent out as 468 a Redundancy RTP Stream to mitigate Network Transport impairments, 469 like packet loss and delay. 471 [RFC2198] defines a payload format for sending the same audio data 472 encoded multiple times at different quality levels. This allows the receiver to 473 use a lower quality encoding of the audio data should the higher 474 quality encoding be lost during the transmission. 476 If a Media Transformation is in use, both the primary and redundant 477 encoding must be transformed independently and the redundant packet 478 created normally. 
As the RTP headers present in the redundant packet 479 are only applicable to the primary encoding, if the payload type for 480 a redundant encoding block is mapped to the generic packetizer, the 481 value of the associated payload type for the primary encoding is 482 applied to the redundant encoding block as well. 484 8. Alternatives 486 Various alternatives can be used to implement and negotiate generic 487 packetization. This section describes a few additional alternatives. 488 This section is to be removed before finalization of the document. 490 8.1. Generic Packetization With In-Payload APT 492 Instead of using an RTP header extension to convey the APT value, the value 493 is prepended to the RTP payload itself. As the value cannot change 494 within a frame, it is prepended only to the first packet 495 generated for the frame. This removes the need to negotiate a 496 dedicated header extension, but may require the SFU to update the 497 payload when sending or recording content. 499 8.2. A Payload Type for Generic Packetization AND Media Format 501 The payload type is negotiated in the SDP so as to identify both the 502 negotiated codec format and the generic packetization use. There is 503 no network cost, but this increases the number of payload types used 504 in the SDP. 
506 m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101 102 103 507 c=IN IP4 0.0.0.0 508 a=rtcp:9 IN IP4 0.0.0.0 509 a=setup:actpass 510 a=mid:1 511 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 512 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id 513 a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id 514 a=sendrecv 515 a=rtpmap:96 vp9/90000 516 a=rtpmap:97 generic/90000 517 a=fmtp:97 apt=96 518 a=rtpmap:98 vp8/90000 519 a=rtpmap:99 generic/90000 520 a=fmtp:99 apt=98 521 a=rtpmap:100 rtx/90000 522 a=fmtp:100 apt=96 523 a=rtpmap:101 rtx/90000 524 a=fmtp:101 apt=97 525 a=rtpmap:102 rtx/90000 526 a=fmtp:102 apt=98 527 a=rtpmap:103 rtx/90000 528 a=fmtp:103 apt=99 530 Figure 6: SDP example negotiating a payload type for format and 531 generic packetization 533 A variation of this approach is to consider defining generic payload 534 types, each of them having an identified codec format. 536 m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 537 c=IN IP4 0.0.0.0 538 a=rtcp:9 IN IP4 0.0.0.0 539 a=setup:actpass 540 a=mid:1 541 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 542 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id 543 a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id 544 a=sendrecv 545 a=rtpmap:96 generic/90000 546 a=fmtp:96 codec=vp9 547 a=rtpmap:97 generic/90000 548 a=fmtp:97 codec=vp8 549 a=rtpmap:98 rtx/90000 550 a=fmtp:98 apt=96 551 a=rtpmap:99 rtx/90000 552 a=fmtp:99 apt=97 553 Figure 7: SDP example negotiating a payload type for format and 554 generic packetization 556 8.3. A RTP Header To Choose Packetization 558 An RTP header extension can be used to flag content as opaque so that 559 the receiver knows whether or not to use the generic packetization. 560 As for the APT header extension, the RTP header extension may not 561 need to be sent for every packet; it could for instance be sent for 562 the first packet of every intra video frame. The main advantage of 563 this approach is the reduced impact on SDP negotiation. 
565 m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 566 c=IN IP4 0.0.0.0 567 a=rtcp:9 IN IP4 0.0.0.0 568 a=setup:actpass 569 a=mid:1 570 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 571 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id 572 a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id 573 a=extmap:4 urn:ietf:params:rtp-hdrext:generic-packetization-use 574 a=sendrecv 575 a=rtpmap:96 vp9/90000 576 a=rtpmap:97 vp8/90000 577 a=rtpmap:98 rtx/90000 578 a=fmtp:98 apt=96 579 a=rtpmap:99 rtx/90000 580 a=fmtp:99 apt=97 582 Figure 8: SDP example negotiating generic packetization as RTP header 583 extension 585 9. Security Considerations 587 RTP packets using the payload format defined in this specification 588 are subject to the general security considerations discussed in 589 [RFC3550]. The proposed solutions (generic 590 packetization and header extension) presented in this document are not expected to 591 create new security threats. The use and implementation of RTP Media 592 Chains containing Media Transformers need to be done carefully. 593 It is important to refer to the security considerations discussed in 594 [SFrame] and [WebRTCInsertableStreams]. In particular, Media 595 Transformers on the receiver side need to be prepared to receive 596 arbitrary content, like decoders already do. Similarly, since Media 597 Transformers can be implemented as JavaScript in browsers, RTP 598 Packetizers should be prepared to receive arbitrary content. 600 10. IANA Considerations 602 Two new media subtypes have been registered with IANA, as described 603 in this section. 605 10.1. Registration of audio/generic 607 Type name: audio 609 Subtype name: generic 611 Required parameters: none 613 Optional parameters: none 615 Encoding considerations: This format is framed (see Section 4.8 in 616 the template document) and contains binary data. 618 Security considerations: TBD. 620 Interoperability considerations: TBD 622 Published specification: TBD. 
624 Applications that use this media type: TBD. 626 Additional information: none 628 Intended usage: COMMON 630 Restrictions on usage: TBD 632 Author: 634 Change controller: 636 11. Registration of video/generic 638 Type name: video 640 Subtype name: generic 642 Required parameters: none 644 Optional parameters: none 646 Encoding considerations: This format is framed (see Section 4.8 in 647 the template document) and contains binary data. 649 Security considerations: TBD. 651 Interoperability considerations: TBD 653 Published specification: TBD. 655 Applications that use this media type: TBD. 657 Additional information: none 659 Intended usage: COMMON 661 Restrictions on usage: TBD 663 Author: 665 Change controller: 667 12. References 669 12.1. Normative References 671 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 672 Requirement Levels", BCP 14, RFC 2119, 673 DOI 10.17487/RFC2119, March 1997. 676 [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. 677 Jacobson, "RTP: A Transport Protocol for Real-Time 678 Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550, 679 July 2003. 681 [RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and 682 Video Conferences with Minimal Control", STD 65, RFC 3551, 683 DOI 10.17487/RFC3551, July 2003. 686 [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. 687 Norrman, "The Secure Real-time Transport Protocol (SRTP)", 688 RFC 3711, DOI 10.17487/RFC3711, March 2004. 691 [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session 692 Description Protocol", RFC 4566, DOI 10.17487/RFC4566, 693 July 2006. 695 [RFC5285] Singer, D. and H. Desineni, "A General Mechanism for RTP 696 Header Extensions", RFC 5285, DOI 10.17487/RFC5285, July 697 2008. 699 [RFC7656] Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and 700 B. 
Burman, Ed., "A Taxonomy of Semantics and Mechanisms 701 for Real-Time Transport Protocol (RTP) Sources", RFC 7656, 702 DOI 10.17487/RFC7656, November 2015. 705 [RFC8285] Singer, D., Desineni, H., and R. Even, Ed., "A General 706 Mechanism for RTP Header Extensions", RFC 8285, 707 DOI 10.17487/RFC8285, October 2017. 710 12.2. Informative References 712 [RFC2198] Perkins, C., Kouvelas, I., Hodson, O., Hardman, V., 713 Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse- 714 Parisis, "RTP Payload for Redundant Audio Data", RFC 2198, 715 DOI 10.17487/RFC2198, September 1997. 718 [RFC4588] Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R. 719 Hakenberg, "RTP Retransmission Payload Format", RFC 4588, 720 DOI 10.17487/RFC4588, July 2006. 723 [RFC5109] Li, A., Ed., "RTP Payload Format for Generic Forward Error 724 Correction", RFC 5109, DOI 10.17487/RFC5109, December 725 2007. 727 [RFC6184] Wang, Y., Even, R., Kristensen, T., and R. Jesup, "RTP 728 Payload Format for H.264 Video", RFC 6184, 729 DOI 10.17487/RFC6184, May 2011. 732 [RFC6464] Lennox, J., Ed., Ivov, E., and E. Marocco, "A Real-time 733 Transport Protocol (RTP) Header Extension for Client-to- 734 Mixer Audio Level Indication", RFC 6464, 735 DOI 10.17487/RFC6464, December 2011. 738 [RFC6465] Ivov, E., Ed., Marocco, E., Ed., and J. Lennox, "A Real- 739 time Transport Protocol (RTP) Header Extension for Mixer- 740 to-Client Audio Level Indication", RFC 6465, 741 DOI 10.17487/RFC6465, December 2011. 744 [RFC6904] Lennox, J., "Encryption of Header Extensions in the Secure 745 Real-time Transport Protocol (SRTP)", RFC 6904, 746 DOI 10.17487/RFC6904, April 2013. 749 [SFrame] "Secure Frame (SFrame)", n.d. 752 [WebRTCInsertableStreams] 753 "WebRTC Insertable Media using Streams", n.d. 756 Authors' Addresses 758 Sergio Garcia Murillo 759 CoSMo Software 761 Email: sergio.garcia.murillo@cosmosoftware.io 763 Youenn Fablet 764 Apple Inc. 
766 Email: youenn@apple.com 768 Alexandre Gouaillard 769 CoSMo Software 771 Email: alex.gouaillard@cosmosoftware.io