idnits 2.17.1 draft-gouaillard-avtcore-codec-agn-rtp-payload-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The abstract seems to contain references ([SFrame], [WebRTCInsertableStreams]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 298: '... spatial layer MUST be split in its ...' RFC 2119 keyword, line 301: '... a frame from the application, it MUST...' RFC 2119 keyword, line 306: '...ketizer or any relying server MUST NOT...' RFC 2119 keyword, line 308: '...s for each frame MUST produce the exac...' RFC 2119 keyword, line 311: '...acket in a frame MUST be set according...' (2 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (February 19, 2021) is 1161 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) ** Obsolete normative reference: RFC 5285 (Obsoleted by RFC 8285) ** Downref: Normative reference to an Informational RFC: RFC 7656 Summary: 4 errors (**), 0 flaws (~~), 1 warning (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 AVTCORE S. Garcia Murillo 2 Internet-Draft A. Gouaillard 3 Intended status: Standards Track CoSMo Software 4 Expires: August 23, 2021 February 19, 2021 6 Codec agnostic RTP payload format for video 7 draft-gouaillard-avtcore-codec-agn-rtp-payload-00 9 Abstract 11 RTP Media Chains usually rely on piping encoder output directly to 12 packetizers. Media packetization formats often support a specific 13 codec format and optimize RTP packet generation accordingly. 15 With the development of Selective Forwarding Unit (SFU) solutions that 16 do not process media content server side, the need for media content 17 processing at the origin and at the destination has arisen. 19 RTP Media Chains used e.g. in WebRTC solutions are increasingly 20 relying on application-specific transforms that sit in-between 21 encoder and packetizer on one end and in-between depacketizer and 22 decoder on the other end. This use case has become so important 23 that the W3C is standardizing the capacity to access encoded content 24 with the [WebRTCInsertableStreams] API proposal. An extremely 25 popular use case is application level end-to-end encryption of media 26 content, using for instance [SFrame].
28 Whatever the modification applied to the media content, RTP 29 packetizers can no longer expect to use packetization formats that 30 mandate media content to be in a specific codec format. 32 In extreme cases like encryption, where the RTP Payload is made 33 completely opaque to the SFUs, some extra mechanism must also be 34 added for them to be able to route the packets without depending on 35 RTP payload or payload headers. 37 The traditional process of creating a new RTP Payload specification 38 per content would not be practical as we would need to make a new one 39 for each codec-transform pair. 41 This document describes a solution, which provides the following 42 features in the case the encoded content has been modified before 43 reaching the packetizer: - a payload agnostic RTP packetization 44 format that can be used on any media content, - a signalling 45 mechanism for the above format and the inner payload. Both of the 46 above mechanisms are backward compatible with most of the (S)RTP/RTCP 47 mechanisms used for bandwidth estimation and congestion control in 48 RTP/SRTP/WebRTC, including but not limited to SSRC, RED, FEC, RTX, 49 NACK, SR/RR, REMB, transport-wide-CC, TMMBR, .... This is illustrated 50 by existing implementations in Chrome, Safari, and Medooze. 52 This document also describes a solution to allow SFUs to continue 53 performing packet routing on top of this generic RTP packetization 54 format. 56 This document complements the SFrame (media encryption), and 57 Dependency Descriptor (AV1 payload annex) documents to provide an 58 End-to-End-Encryption solution that would sit on top of SRTP/WebRTC, 59 use SFUs on the media back-end, and leverage W3C APIs in the browser. 60 A high level description of such a system will be provided as an 61 informational I-D in the SFrame WG and then cited here. 63 Status of This Memo 65 This Internet-Draft is submitted in full conformance with the 66 provisions of BCP 78 and BCP 79.
68 Internet-Drafts are working documents of the Internet Engineering 69 Task Force (IETF). Note that other groups may also distribute 70 working documents as Internet-Drafts. The list of current Internet- 71 Drafts is at https://datatracker.ietf.org/drafts/current/. 73 Internet-Drafts are draft documents valid for a maximum of six months 74 and may be updated, replaced, or obsoleted by other documents at any 75 time. It is inappropriate to use Internet-Drafts as reference 76 material or to cite them other than as "work in progress." 78 This Internet-Draft will expire on August 23, 2021. 80 Copyright Notice 82 Copyright (c) 2021 IETF Trust and the persons identified as the 83 document authors. All rights reserved. 85 This document is subject to BCP 78 and the IETF Trust's Legal 86 Provisions Relating to IETF Documents 87 (https://trustee.ietf.org/license-info) in effect on the date of 88 publication of this document. Please review these documents 89 carefully, as they describe your rights and restrictions with respect 90 to this document. Code Components extracted from this document must 91 include Simplified BSD License text as described in Section 4.e of 92 the Trust Legal Provisions and are provided without warranty as 93 described in the Simplified BSD License. 95 Table of Contents
97 1. Introduction
98 2. Goals
99 3. RTP Packetization
100 4. Payload Multiplexing
101 5. SDP Negotiation
102 6. SFU Packet Selection
103 7. Redundancy Techniques Considerations
104   7.1. Retransmission Techniques
105   7.2. Forward Error Correction (FEC) Techniques
106   7.3. Redundant Audio Data Techniques
107 8. Alternatives
108   8.1. Generic Packetization With In-Payload APT
109   8.2. A Payload Type for Generic Packetization AND Media Format
110   8.3. A RTP Header To Choose Packetization
111 9. Security Considerations
112 10. IANA Considerations
113   10.1. Registration of audio/generic
114 11. Registration of video/generic
115 12. References
116   12.1. Normative References
117   12.2. Informative References
118 Authors' Addresses
120 1. Introduction 122 As per Figure 1 of [RFC7656], a Media Packetizer transforms a single 123 Encoded Stream into one or several RTP packets. The Encoded Stream 124 is coming straight from the Media Encoder and is expected to follow 125 the format produced by the Media Encoder. A number of Media 126 Packetizer formats have been designed to process a specific format 127 produced by Media Encoder. For instance [RFC6184] is dedicated to 128 the processing of content produced by H.264 Media Encoders, and 129 generates packets following NALUs organization. 131 WebRTC applications are increasingly deploying end-to-end encryption 132 solutions on top of RTP Media Chains. End-to-end encryption is 133 implemented by inserting application-specific Media Transformers 134 between Media Encoder and Media Packetizer on the sending side, and 135 between Media Depacketizer and Media Decoder on the receiving side, 136 as described in Figure 1 and Figure 2. To support end-to-end 137 encryption, Media Transformers can use the [SFrame] format.
In 138 browsers, Media Transformers are implemented using 139 [WebRTCInsertableStreams], for instance by injecting JavaScript code 140 provided by web pages. 142 Physical Stimulus 143 | 144 V 145 +----------------------+ 146 | Media Capture | 147 +----------------------+ 148 | 149 Raw Stream 150 V 151 +----------------------+ 152 | Media Source |<-- Synchronization Timing 153 +----------------------+ 154 | 155 Source Stream 156 V 157 +----------------------+ 158 | Media Encoder | 159 +----------------------+ 160 | 161 Encoded Stream 162 V 163 +----------------------+ 164 | Media Transformer |<-- NEW: application-specific transform 165 +----------------------+ (e.g. SFrame Encryption) 166 | 167 Transformed Stream +------------+ 168 V | V 169 +----------------------+ | +----------------------+ 170 | Media Packetizer | | | RTP-Based Redundancy | 171 +----------------------+ | +----------------------+ 172 | | | 173 +-------------+ Redundancy RTP Stream 174 Source RTP Stream | 175 V V 176 +----------------------+ +----------------------+ 177 | RTP-Based Security | | RTP-Based Security | 178 +----------------------+ +----------------------+ 179 | | 180 Secured RTP Stream Secured Redundancy RTP Stream 181 V V 182 +----------------------+ +----------------------+ 183 | Media Transport | | Media Transport | 184 +----------------------+ +----------------------+ 186 Figure 1: Sender Side Concepts in the Media Chain 187 With Application-level Media Transform 189 These RTP packets are sent over the wire to a receiver media chain 190 matching the sender side, reaching the Media Depacketizer that will 191 reconstruct the Encoded Stream before passing it to the Media 192 Decoder. 
194 +----------------------+ +----------------------+ 195 | Media Transport | | Media Transport | 196 +----------------------+ +----------------------+ 197 Received | Received | Secured 198 Secured RTP Stream Redundancy RTP Stream 199 V V 200 +----------------------+ +----------------------+ 201 | RTP-Based Validation | | RTP-Based Validation | 202 +----------------------+ +----------------------+ 203 | | 204 Received RTP Stream Received Redundancy RTP Stream 205 | | 206 | +--------------------+ 207 V V 208 +----------------------+ 209 | RTP-Based Repair | 210 +----------------------+ 211 | 212 Repaired RTP Stream 213 V 214 +----------------------+ 215 | Media Depacketizer | 216 +----------------------+ 217 | 218 Received Transformed Stream 219 V 220 +----------------------+ 221 | Media Transformer |<-- NEW: application-specific transform 222 +----------------------+ (e.g. SFrame Decryption) 223 | 224 Received Encoded Stream 225 V 226 +----------------------+ 227 | Media Decoder | 228 +----------------------+ 229 | 230 Received Source Stream 231 V 232 +----------------------+ 233 | Media Sink |--> Synchronization Information 234 +----------------------+ 235 | 236 Received Raw Stream 237 V 238 +----------------------+ 239 | Media Render | 240 +----------------------+ 241 | 242 V 243 Physical Stimulus 245 Figure 2: Receiver Side Concepts in the Media Chain 246 With Application-level Media Transform 248 This generic packetization does not change how 249 one or several encoded or dependent streams are mapped to the RTP 250 streams or how the synchronization source(s) (SSRC) are assigned. 252 Given the use of post-encoder application-specific transforms, the 253 whole Media Chain needs to be made aware of them. This includes the 254 sender post-transform Media Chain, Media Transport intermediaries 255 (SFUs typically) and receiver pre-transform Media Chain.
257 As these transforms can alter Encoded Streams in any possible way, 258 the use of codec-specific Media Packetizers like [RFC6184] on a 259 Transformed Stream may be suboptimal on the sender side. It may also be 260 problematic on the receiving side in case codec-specific processing 261 is done prior to the Media Transformer. Media Transport intermediaries 262 often look at the Media Content itself to drive their packet 263 selection algorithms, which a transform such as encryption prevents. 265 2. Goals 267 The objective of this document is to support inserting any 268 application-specific transform between encoders and packetizers in 269 the Media Chain. For that purpose, this document will: 1. Provide a 270 generic packetization format that supports any media content 271 (compressed audio, compressed video, encrypted content...) and that 272 allows reuse of existing RTP mechanisms in place in WebRTC 273 applications such as RTX, RED or FEC. 2. Provide a way to negotiate 274 use of the generic packetization format between sender and receiver, 275 with minimum impact on existing negotiation approaches. 3. Provide 276 side-channel information so that network intermediaries (SFUs in 277 particular) can apply their existing packet routing strategies without 278 inspecting the media content. 280 3. RTP Packetization 282 A generic packetizer, by design, is not expected to understand the 283 format of the media to transmit. The unit used by the packetizer to 284 do processing is called a frame in the remainder of the document. 286 It is the responsibility of the application using the packetizer to 287 group media content in meaningful frames. In the common case of a 288 video codec, the packetizer frame is the frame in byte format (H.264 289 Annex B for example) generated by the encoder. 291 If the application wants to transform encoded content, the 292 application needs to split the encoded content into frames prior to the 293 transform. Each frame is then transformed independently, for 294 instance encrypted using [SFrame].
The content of each transformed 295 frame is then processed by the packetizer. 297 In the case of a video codec supporting spatial scalability, each 298 spatial layer MUST be split in its own frame by the application 299 before passing it to the packetizer. 301 When the packetizer receives a frame from the application, it MUST 302 fragment the frame content in multiple RTP packets to ensure packets 303 do not exceed the network maximum transmission unit. The content of 304 the frame will be treated as a binary blob by the packetizer, so the 305 boundaries of each fragment are chosen arbitrarily 306 by the packetizer. The packetizer or any relaying server MUST NOT 307 modify the frame content, and concatenating the RTP payloads of the RTP 308 packets for each frame MUST produce the exact binary content of the 309 input frame content. 311 The marker bit of each RTP packet in a frame MUST be set according to 312 the audio and video profiles specified in [RFC3551]. 314 The spatial layer frames are sent in ascending order, with the same 315 RTP timestamp, and only the last RTP packet of the last spatial layer 316 frame will have the marker bit set to 1. 318 4. Payload Multiplexing 320 In order to reduce the number of payload types in the SDP exchange, a 321 single payload type code for the generic packetization can be used 322 for all negotiated media formats. That requires identifying the 323 original payload type code of the frame's negotiated media format, 324 called the associated payload type (APT) hereunder. The APT value is 325 the payload type code of the associated format passed to the generic 326 Media Packetizer before any transformation is applied. 328 The APT value is sent in a dedicated header extension. The payload 329 of this header extension can be encoded using either the one-byte or 330 two-byte header defined in [RFC5285]. Figures 3 and 4 show examples 331 of each of these formats.
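The fragmentation and marker-bit rules of Section 3 can be sketched as follows, for the video case. This is a minimal illustration under stated assumptions, not a normative implementation: the RtpPacket class, the packetize function, and the 1200-byte payload limit are hypothetical names and values chosen for the example, and RTP header details other than sequence number, timestamp, and marker bit are omitted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RtpPacket:
    """Hypothetical stand-in for an RTP packet; only the fields needed here."""
    sequence_number: int
    timestamp: int
    marker: bool
    payload: bytes


def packetize(layer_frames: List[bytes], timestamp: int,
              first_seq: int, max_payload: int = 1200) -> List[RtpPacket]:
    """Fragment opaque frames into RTP payloads without inspecting them.

    layer_frames holds one frame per spatial layer, in ascending order.
    max_payload is an assumed per-packet payload budget derived from the MTU.
    """
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    packets: List[RtpPacket] = []
    seq = first_seq
    for frame in layer_frames:
        # The frame is a binary blob: fragment boundaries are arbitrary.
        for offset in range(0, len(frame), max_payload):
            chunk = frame[offset:offset + max_payload]
            packets.append(RtpPacket(seq & 0xFFFF, timestamp, False, chunk))
            seq += 1
    if packets:
        # Only the last packet of the last spatial layer frame carries M=1.
        packets[-1].marker = True
    return packets


layers = [bytes(range(256)) * 10, b"x" * 100]
pkts = packetize(layers, timestamp=9000, first_seq=65534)
# Concatenating the payloads restores the exact input bytes.
restored = b"".join(p.payload for p in pkts)
```

All spatial layer frames share one timestamp, and a receiver (or a test) can check the round trip by comparing the concatenated payloads with the concatenated input frames.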
333 0 1 334 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 335 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 336 | ID | len=0 |S| APT | 337 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 339 Figure 3: Frame Associated Payload Type Encoding Using the One-Byte 340 Header Format 342 0 1 2 3 343 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 344 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 345 | ID | len=1 |S| APT | 0 (pad) | 346 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 348 Figure 4: Frame Associated Payload Type Encoding Using the Two-Byte 349 Header Format 351 The APT value is the associated payload type value. The S bit 352 indicates if the media stream can be forwarded safely starting from 353 this RTP packet. Typically, it will be set to 1 on the first RTP 354 packet of an intra video frame and in all RTP audio packets. 356 Receivers MUST be ready to receive RTP packets with different 357 associated payload types in the same way they would receive different 358 payload type codes on the RTP packets. 360 The URI for declaring this header extension in an extmap attribute is 361 "urn:ietf:params:rtp-hdrext:associated-payload-type". 363 5. SDP Negotiation 365 To use the RTP generic packetization, the SDP Offer/Answer exchange 366 MUST negotiate: - The payload type of the negotiated codec format - 367 The generic payload type - The associated payload type header 368 extension 370 Only the negotiated payload types are allowed to be used as 371 associated payload types. Figure 5 illustrates a SDP that negotiates 372 exchange of video using either VP8 or VP9 codecs with the possibility 373 to use the generic packetization. In this example, RTX is also 374 negotiated and will be applied normally on each associated payload 375 type. 
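The layouts of Figures 3 and 4 can be sketched as follows. This is an illustrative encoding under stated assumptions, not a reference implementation: the function names are hypothetical, the extension ID is whatever value the extmap attribute negotiated, and the ID ranges checked are those of the one-byte and two-byte header forms.

```python
from typing import Tuple


def _data_byte(apt: int, start: bool) -> int:
    """Pack the S bit (most significant bit) and the 7-bit APT value."""
    if not 0 <= apt <= 127:
        raise ValueError("APT must fit in 7 bits")
    return (int(start) << 7) | apt


def encode_apt_one_byte(ext_id: int, apt: int, start: bool) -> bytes:
    """Figure 3 layout: 4-bit ID, 4-bit len=0 (one data byte), then S|APT."""
    if not 1 <= ext_id <= 14:
        raise ValueError("one-byte header IDs range from 1 to 14")
    return bytes([(ext_id << 4) | 0, _data_byte(apt, start)])


def encode_apt_two_byte(ext_id: int, apt: int, start: bool) -> bytes:
    """Figure 4 layout: 8-bit ID, 8-bit len=1, S|APT, then one pad byte."""
    if not 1 <= ext_id <= 255:
        raise ValueError("two-byte header IDs range from 1 to 255")
    return bytes([ext_id, 1, _data_byte(apt, start), 0])


def decode_apt(data_byte: int) -> Tuple[bool, int]:
    """Return (S, APT) from the single data byte of the extension."""
    return bool(data_byte & 0x80), data_byte & 0x7F


# Extension ID 4 and payload type 96 are example values, as in Figure 5.
one = encode_apt_one_byte(4, 96, start=True)
two = encode_apt_two_byte(4, 97, start=False)
s_bit, apt_value = decode_apt(one[1])
```

A sender would typically set start=True on the first RTP packet of an intra video frame and on all audio packets, as described above.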
377 m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101
378 c=IN IP4 0.0.0.0
379 a=rtcp:9 IN IP4 0.0.0.0
380 a=setup:actpass
381 a=mid:1
382 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
383 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
384 a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
385 a=extmap:4 urn:ietf:params:rtp-hdrext:associated-payload-type
386 a=sendrecv
387 a=rtpmap:96 vp9/90000
388 a=rtpmap:97 vp8/90000
389 a=rtpmap:98 generic/90000
390 a=rtpmap:99 rtx/90000
391 a=fmtp:99 apt=96
392 a=rtpmap:100 rtx/90000
393 a=fmtp:100 apt=97
394 a=rtpmap:101 rtx/90000
395 a=fmtp:101 apt=98
397 Figure 5: SDP example negotiating the generic payload type and 398 related header extension for video 400 6. SFU Packet Selection 402 SFUs need to have a basic understanding of each frame they receive so 403 they can decide to forward it or not and to which endpoint. They 404 might need similar information to support media content recording. 405 This information is either generic to a group of frames (called a 406 stream hereafter) or specific to each frame. 408 The information is transmitted as an RTP header extension, since the RTP 409 packet payload should be treated as opaque by the SFU. This is 410 especially necessary if the payload is end-to-end encrypted. The 411 amount of information should be limited to what is strictly necessary 412 for the SFU's task, since an SFU is not always as trusted as individual 413 peers. 415 For audio, configuration information such as Opus TOC might be 416 useful. For video, configuration information might include: - Stream 417 configuration information: resolution, quality, frame rate... - Codec 418 specific configuration information: codec profile like profile_idc... 419 - Frame specific information: whether the stream is decodable when 420 starting from this frame, whether the frame is skippable... 422 For video content, this information can be sent using a Dependency 423 Descriptor header extension.
In that case, the first RTP packet of 424 the frame will have its start_of_frame equal to 1 and the last packet 425 will have its end_of_frame equal to 1. 427 7. Redundancy Techniques Considerations 429 The solution described in this document is expected to integrate well 430 with the existing RTP ecosystem. This section describes how the 431 generic packetizer can be used jointly with existing techniques that 432 mitigate the effects of unreliable transports. 434 7.1. Retransmission Techniques 436 [RFC4588] defines a retransmission payload format (RTX) that can be 437 used in case of packet loss. As defined in [RFC4588], RTX is able to 438 handle any payload format, including the format described in this 439 document. Given that RTX preserves both RTP packet payload and headers, 440 the receiver will be able to identify the payload type of the 441 recovered packet and whether generic packetization is used. RTX will 442 also allow recovering RTP header extensions that convey information 443 on the media content itself. 445 7.2. Forward Error Correction (FEC) Techniques 447 FEC is another technique used in RTP Media Chains to protect media 448 content against packet loss. [RFC5109] defines a payload format 449 used to transmit FEC data that protects specific packets. 451 FEC may protect some parts of the media content more than others. 452 For instance, intra video frame encoded data or important network 453 abstraction layer units (NALUs) like SPS/PPS may be more protected. 454 With a post-encoder transform and the use of a generic packetization, 455 the granularity of the recovery mechanism is no longer at the NALU 456 level but at the level of the frame generated by the post-encoder 457 transform. If an SVC codec is used, each spatial layer will be 458 processed as an independent frame. In that case, base layers can be 459 protected more heavily than higher resolution layers. 461 7.3.
Redundant Audio Data Techniques 463 As defined in [RFC7656], RTP-based redundancy is a 464 transformation that generates redundant or repair packets sent out as 465 a Redundancy RTP Stream to mitigate Network Transport impairments, 466 like packet loss and delay. 468 [RFC2198] defines a payload format for sending the same audio data 469 encoded multiple times at different quality levels. This allows the receiver to 470 use a lower quality encoding of the audio data should the higher 471 quality encoding be lost during transmission. 473 If a Media Transformation is in use, both the primary and redundant 474 encodings must be transformed independently and the redundant packet 475 created normally. As the RTP headers present in the redundant packet 476 are only applicable to the primary encoding, if the payload type for 477 a redundant encoding block is mapped to the generic packetizer, the 478 value of the associated payload type for the primary encoding is 479 applied to the redundant encoding block as well. 481 8. Alternatives 483 Various alternatives can be used to implement and negotiate generic 484 packetization. This section describes a few additional alternatives. 485 This section is to be removed before finalization of the document. 487 8.1. Generic Packetization With In-Payload APT 489 Instead of using an RTP header extension to convey the APT value, the 490 value is prepended to the RTP payload itself. As the value cannot change 491 for a whole frame, it is prepended only to the first packet 492 generated for the frame. This removes the need to negotiate a 493 dedicated header extension, but may require the SFU to update the 494 payload when sending or recording content. 496 8.2. A Payload Type for Generic Packetization AND Media Format 498 The payload type is negotiated in the SDP so as to identify both the 499 negotiated codec format and the use of generic packetization.
There is 500 no network cost but this increases the number of payload types used 501 in the SDP.
503 m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101 102 103
504 c=IN IP4 0.0.0.0
505 a=rtcp:9 IN IP4 0.0.0.0
506 a=setup:actpass
507 a=mid:1
508 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
509 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
510 a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
511 a=sendrecv
512 a=rtpmap:96 vp9/90000
513 a=rtpmap:97 generic/90000
514 a=fmtp:97 apt=96
515 a=rtpmap:98 vp8/90000
516 a=rtpmap:99 generic/90000
517 a=fmtp:99 apt=98
518 a=rtpmap:100 rtx/90000
519 a=fmtp:100 apt=96
520 a=rtpmap:101 rtx/90000
521 a=fmtp:101 apt=97
522 a=rtpmap:102 rtx/90000
523 a=fmtp:102 apt=98
524 a=rtpmap:103 rtx/90000
525 a=fmtp:103 apt=99
527 Figure 6: SDP example negotiating a payload type for format and 528 generic packetization 530 A variation of this approach is to consider defining generic payload 531 types, each of them having an identified codec format.
533 m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99
534 c=IN IP4 0.0.0.0
535 a=rtcp:9 IN IP4 0.0.0.0
536 a=setup:actpass
537 a=mid:1
538 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
539 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
540 a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
541 a=sendrecv
542 a=rtpmap:96 generic/90000
543 a=fmtp:96 codec=vp9
544 a=rtpmap:97 generic/90000
545 a=fmtp:97 codec=vp8
546 a=rtpmap:98 rtx/90000
547 a=fmtp:98 apt=96
548 a=rtpmap:99 rtx/90000
549 a=fmtp:99 apt=97
550 Figure 7: SDP example negotiating a payload type for format and 551 generic packetization 553 8.3. A RTP Header To Choose Packetization 555 An RTP header extension can be used to flag content as opaque so that 556 the receiver knows whether or not to use the generic packetization.
557 As for the APT header extension, the RTP header extension may not 558 need to be sent for every packet; it could, for instance, be sent for 559 the first packet of every intra video frame. The main advantage of 560 this approach is the reduced impact on SDP negotiation.
562 m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99
563 c=IN IP4 0.0.0.0
564 a=rtcp:9 IN IP4 0.0.0.0
565 a=setup:actpass
566 a=mid:1
567 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
568 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
569 a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
570 a=extmap:4 urn:ietf:params:rtp-hdrext:generic-packetization-use
571 a=sendrecv
572 a=rtpmap:96 vp9/90000
573 a=rtpmap:97 vp8/90000
574 a=rtpmap:98 rtx/90000
575 a=fmtp:98 apt=96
576 a=rtpmap:99 rtx/90000
577 a=fmtp:99 apt=97
579 Figure 8: SDP example negotiating generic packetization as RTP header 580 extension 582 9. Security Considerations 584 RTP packets using the payload format defined in this specification 585 are subject to the general security considerations discussed in 586 [RFC3550]. It is not expected that the proposed solutions (generic 587 packetization and header extension) presented in this document can 588 create new security threats. The use and implementation of RTP Media 589 Chains containing Media Transformers need to be done carefully. 590 It is important to refer to the security considerations discussed in 591 [SFrame] and [WebRTCInsertableStreams]. In particular, Media 592 Transformers on the receiver side need to be prepared to receive 593 arbitrary content, like decoders already do. Similarly, since Media 594 Transformers can be implemented as JavaScript in browsers, RTP 595 Packetizers should be prepared to receive arbitrary content. 597 10. IANA Considerations 599 Two new media subtypes have been registered with IANA, as described 600 in this section. 602 10.1.
Registration of audio/generic 604 Type name: audio 606 Subtype name: generic 608 Required parameters: none 610 Optional parameters: none 612 Encoding considerations: This format is framed (see Section 4.8 in 613 the template document) and contains binary data. 615 Security considerations: TBD. 617 Interoperability considerations: TBD 619 Published specification: TBD. 621 Applications that use this media type: TBD. 623 Additional information: none 625 Intended usage: COMMON 627 Restrictions on usage: TBD 629 Author: 631 Change controller: 633 11. Registration of video/generic 635 Type name: video 637 Subtype name: generic 639 Required parameters: none 641 Optional parameters: none 643 Encoding considerations: This format is framed (see Section 4.8 in 644 the template document) and contains binary data. 646 Security considerations: TBD. 648 Interoperability considerations: TBD 650 Published specification: TBD. 652 Applications that use this media type: TBD. 654 Additional information: none 656 Intended usage: COMMON 658 Restrictions on usage: TBD 660 Author: 662 Change controller: 664 12. References 666 12.1. Normative References 668 [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. 669 Jacobson, "RTP: A Transport Protocol for Real-Time 670 Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550, 671 July 2003, . 673 [RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and 674 Video Conferences with Minimal Control", STD 65, RFC 3551, 675 DOI 10.17487/RFC3551, July 2003, 676 . 678 [RFC5285] Singer, D. and H. Desineni, "A General Mechanism for RTP 679 Header Extensions", RFC 5285, DOI 10.17487/RFC5285, July 680 2008, . 682 [RFC7656] Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and 683 B. Burman, Ed., "A Taxonomy of Semantics and Mechanisms 684 for Real-Time Transport Protocol (RTP) Sources", RFC 7656, 685 DOI 10.17487/RFC7656, November 2015, 686 . 688 12.2. 
Informative References 690 [RFC2198] Perkins, C., Kouvelas, I., Hodson, O., Hardman, V., 691 Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse- 692 Parisis, "RTP Payload for Redundant Audio Data", RFC 2198, 693 DOI 10.17487/RFC2198, September 1997, 694 . 696 [RFC4588] Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R. 697 Hakenberg, "RTP Retransmission Payload Format", RFC 4588, 698 DOI 10.17487/RFC4588, July 2006, 699 . 701 [RFC5109] Li, A., Ed., "RTP Payload Format for Generic Forward Error 702 Correction", RFC 5109, DOI 10.17487/RFC5109, December 703 2007, . 705 [RFC6184] Wang, Y., Even, R., Kristensen, T., and R. Jesup, "RTP 706 Payload Format for H.264 Video", RFC 6184, 707 DOI 10.17487/RFC6184, May 2011, 708 . 710 [SFrame] "Secure Frame (SFrame)", n.d., 711 . 713 [WebRTCInsertableStreams] 714 "WebRTC Insertable Media using Streams", n.d., 715 . 717 Authors' Addresses 719 Sergio Garcia Murillo 720 CoSMo Software 722 Email: sergio.garcia.murillo@cosmosoftware.io 724 Alexandre Gouaillard 725 CoSMo Software 727 Email: alex.gouaillard@cosmosoftware.io