idnits 2.17.1 

draft-murillo-avtcore-multi-codec-payload-format-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 2 instances of too long lines in the document, the longest one
     being 2 characters in excess of 72.

  ** The abstract seems to contain references ([SFrame],
     [WebRTCInsertableStreams]), which it shouldn't.  Please replace those
     with straight textual mentions of the documents in question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords. 

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).
  -- The document date (11 July 2021) is 1019 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'RFC2119' is defined on line 663, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC3711' is defined on line 678, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC4566' is defined on line 683, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC8285' is defined on line 697, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC6464' is defined on line 724, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC6465' is defined on line 730, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC6904' is defined on line 736, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866)

  ** Obsolete normative reference: RFC 5285 (Obsoleted by RFC 8285)

  ** Downref: Normative reference to an Informational RFC: RFC 7656


     Summary: 5 errors (**), 0 flaws (~~), 9 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	AVTCORE                                                S. Garcia Murillo
3	Internet-Draft                                                     CoSMo
4	Intended status: Standards Track                               Y. Fablet
5	Expires: 12 January 2022                                      Apple Inc.
6	                                                           A. Gouaillard
7	                                                                   CoSMo
8	                                                               J. Uberti
9	                                                               Clubhouse
10	                                                            11 July 2021

12	                     Multi Codec RTP payload format
13	          draft-murillo-avtcore-multi-codec-payload-format-01

15	Abstract

17	   RTP Media Chains usually rely on piping encoder output directly to
18	   packetizers.  Media packetization formats often support a specific
19	   codec format and optimize RTP packet generation accordingly.  With
20	   the development of Selective Forwarding Unit (SFU) solutions, RTP
21	   Media Chains used in WebRTC solutions increasingly rely on
22	   application-specific transforms that sit between encoder and
23	   packetizer on one end and between depacketizer and decoder on the
24	   other end.  These transforms typically encrypt media content so
25	   that it is not readable by the SFU, for instance using [SFrame] or
26	   [WebRTCInsertableStreams].  In that context, RTP packetizers can no
27	   longer expect to use packetization formats that mandate media
28	   content to be in a specific codec format.  This document provides a
29	   solution to that problem by describing an RTP packetization format
30	   that can be used for many kinds of media content, and how to
31	   negotiate use of this format.  This document also describes a
32	   solution to allow SFUs to continue performing packet routing on top
33	   of this RTP packetization format.

35	Status of This Memo

37	   This Internet-Draft is submitted in full conformance with the
38	   provisions of BCP 78 and BCP 79.

40	   Internet-Drafts are working documents of the Internet Engineering
41	   Task Force (IETF).  Note that other groups may also distribute
42	   working documents as Internet-Drafts.  The list of current Internet-
43	   Drafts is at https://datatracker.ietf.org/drafts/current/.

45	   Internet-Drafts are draft documents valid for a maximum of six months
46	   and may be updated, replaced, or obsoleted by other documents at any
47	   time.  It is inappropriate to use Internet-Drafts as reference
48	   material or to cite them other than as "work in progress."
49	   This Internet-Draft will expire on 12 January 2022.

51	Copyright Notice

53	   Copyright (c) 2021 IETF Trust and the persons identified as the
54	   document authors.  All rights reserved.

56	   This document is subject to BCP 78 and the IETF Trust's Legal
57	   Provisions Relating to IETF Documents (https://trustee.ietf.org/
58	   license-info) in effect on the date of publication of this document.
59	   Please review these documents carefully, as they describe your rights
60	   and restrictions with respect to this document.  Code Components
61	   extracted from this document must include Simplified BSD License text
62	   as described in Section 4.e of the Trust Legal Provisions and are
63	   provided without warranty as described in the Simplified BSD License.

65	Table of Contents

67	   1.  Introduction
68	   2.  Goals
69	   3.  RTP Packetization
70	   4.  Payload Multiplexing
71	   5.  SDP Negotiation
72	   6.  SFU Packet Selection
73	   7.  Sender Processing Rules
74	   8.  Redundancy Techniques Considerations
75	     8.1.  Retransmission Techniques
76	     8.2.  Forward Error Correction (FEC) Techniques
77	     8.3.  Redundant Audio Data Techniques
78	   9.  Alternatives
79	     9.1.  Generic Packetization With In-Payload APT
80	     9.2.  A Payload Type for Generic Packetization AND Media
81	           Format
82	     9.3.  A RTP Header To Choose Packetization
83	   10. Security Considerations
84	   11. IANA Considerations
85	     11.1.  Registration of audio/generic
86	   12. Registration of video/generic
87	   13. References
88	     13.1.  Normative References
89	     13.2.  Informative References
90	   Authors' Addresses

92	1.  Introduction

94	   As per Figure 1 of [RFC7656], a Media Packetizer transforms a single
95	   Encoded Stream into one or several RTP packets.  The Encoded Stream
96	   comes straight from the Media Encoder and is expected to follow the
97	   format produced by the Media Encoder.  A number of Media Packetizer
98	   formats have been designed to process the specific format produced
99	   by one Media Encoder.  For instance, [RFC6184] is dedicated to the
100	   processing of content produced by H.264 Media Encoders, and
101	   generates packets that follow the NALU organization.

103	   WebRTC applications are increasingly deploying end-to-end encryption
104	   solutions on top of RTP Media Chains.  End-to-end encryption is
105	   implemented by inserting application-specific Media Transformers
106	   between Media Encoder and Media Packetizer on the sending side, and
107	   between Media Depacketizer and Media Decoder on the receiving side,
108	   as described in Figure 1 and Figure 2.  To support end-to-end
109	   encryption, Media Transformers can use the [SFrame] format.  In
110	   browsers, Media Transformers are implemented using
111	   [WebRTCInsertableStreams], for instance by injecting JavaScript code
112	   provided by web pages.

114	                Physical Stimulus
115	                      |
116	                      V
117	           +----------------------+
118	           |     Media Capture    |
119	           +----------------------+
120	                      |
121	                 Raw Stream
122	                      V
123	           +----------------------+
124	           |     Media Source     |<-- Synchronization Timing
125	           +----------------------+
126	                      |
127	                Source Stream
128	                      V
129	           +----------------------+
130	           |    Media Encoder     |
131	           +----------------------+
132	                      |
133	                Encoded Stream
134	                      V
135	           +----------------------+
136	           |   Media Transformer  |<-- NEW: application-specific transform
137	           +----------------------+         (e.g. SFrame Encryption)
138	                      |
139	              Transformed Stream    +------------+
140	                      V             |            V
141	           +----------------------+ | +----------------------+
142	           |   Media Packetizer   | | | RTP-Based Redundancy |
143	           +----------------------+ | +----------------------+
144	                      |             |            |
145	                      +-------------+  Redundancy RTP Stream
146	               Source RTP Stream                 |
147	                      V                          V
148	           +----------------------+   +----------------------+
149	           |  RTP-Based Security  |   |  RTP-Based Security  |
150	           +----------------------+   +----------------------+
151	                      |                          |
152	              Secured RTP Stream   Secured Redundancy RTP Stream
153	                      V                          V
154	           +----------------------+   +----------------------+
155	           |   Media Transport    |   |   Media Transport    |
156	           +----------------------+   +----------------------+

158	        Figure 1: Sender side concepts in the Media Chain with
159	                  application-level Media Transform

161	   These RTP packets are sent over the wire to a receiver media chain
162	   matching the sender side, reaching the Media Depacketizer that will
163	   reconstruct the Encoded Stream before passing it to the Media
164	   Decoder.

166	          +----------------------+   +----------------------+
167	          |   Media Transport    |   |   Media Transport    |
168	          +----------------------+   +----------------------+
169	            Received |                 Received | Secured
170	            Secured RTP Stream       Redundancy RTP Stream
171	                     V                          V
172	          +----------------------+   +----------------------+
173	          | RTP-Based Validation |   | RTP-Based Validation |
174	          +----------------------+   +----------------------+
175	                     |                          |
176	            Received RTP Stream   Received Redundancy RTP Stream
177	                     |                          |
178	                     |     +--------------------+
179	                     V     V
180	          +----------------------+
181	          |   RTP-Based Repair   |
182	          +----------------------+
183	                     |
184	            Repaired RTP Stream
185	                     V
186	          +----------------------+
187	          |  Media Depacketizer  |
188	          +----------------------+
189	                     |
190	         Received Transformed Stream
191	                     V
192	          +----------------------+
193	          |   Media Transformer  |<-- NEW: application-specific transform
194	          +----------------------+         (e.g. SFrame Decryption)
195	                     |
196	           Received Encoded Stream
197	                     V
198	          +----------------------+
199	          |    Media Decoder     |
200	          +----------------------+
201	                     |
202	           Received Source Stream
203	                     V
204	          +----------------------+
205	          |      Media Sink      |--> Synchronization Information
206	          +----------------------+
207	                     |
208	            Received Raw Stream
209	                     V
210	          +----------------------+
211	          |     Media Render     |
212	          +----------------------+
213	                     |
214	                     V
215	             Physical Stimulus

217	       Figure 2: Receiver side concepts in the Media Chain with
218	                  application-level Media Transform

220	   This packetization does not change how one or several encoded or
221	   dependent streams are mapped to RTP streams, or how the
222	   synchronization source(s) (SSRC) are assigned.

224	   Given the use of post-encoder application-specific transforms, the
225	   whole Media Chain needs to be made aware of it.  This includes the
226	   sender post-transform Media Chain, Media Transport intermediaries
227	   (SFUs typically) and receiver pre-transform Media Chain.

229	   As these transforms can alter Encoded Streams in any possible way,
230	   the use of codec-specific Media Packetizers like [RFC6184] on a
231	   Transformed Stream may be suboptimal on the sender side.  It may
232	   also be problematic on the receiving side if codec-specific
233	   processing is done prior to the Media Transformer.  Media Transport
234	   intermediaries often look at the Media Content itself to fuel their
235	   packet selection algorithms.

237	2.  Goals

239	   The objective of this document is to support inserting any
240	   application-specific transform between encoders and packetizers in
241	   the Media Chain.  For that purpose, this document will:
242	   1.  Provide a packetization format that supports the multiple kinds
243	       of media content used by WebRTC applications (Opus audio, H.264
244	       or VP8 video, encrypted content...) and allows reuse of existing
245	       WebRTC RTP mechanisms such as RTX, RED or FEC.
246	   2.  Provide a way for sender and receiver to negotiate this format
247	       with minimum impact on existing negotiation approaches.
248	   3.  Provide side-channel information so that network intermediaries
249	       (SFUs in particular) can keep their existing packet routing
250	       strategies without inspecting the media content.

252	3.  RTP Packetization

254	   This packetizer, by design, is not expected to understand the format
255	   of the media to transmit.  The unit used by the packetizer to do
256	   processing is called a frame in the remainder of the document.

258	   It is the responsibility of the application using the packetizer to
259	   group media content in meaningful frames.  In the common case of a
260	   video codec, the packetizer frame is the frame in byte format (H.264
261	   Annex B for example) generated by the encoder.

263	   If the application wants to transform encoded content, the
264	   application needs to split the encoded content into frames prior to
265	   the transform.  Each frame is then transformed independently, for
266	   instance encrypted using [SFrame].  The content of each transformed
267	   frame is then processed by the packetizer.

269	   In the case of a video codec supporting spatial scalability, each
270	   spatial layer MUST be split into its own frame by the application
271	   before passing it to the packetizer.

273	   When the packetizer receives a frame from the application, it MUST
274	   fragment the frame content into multiple RTP packets as needed to
275	   ensure packets do not exceed the network maximum transmission unit.
276	   The content of the frame is treated as a binary blob by the
277	   packetizer, so the boundaries of each fragment are chosen
278	   arbitrarily by the packetizer.  The packetizer or any relaying
279	   server MUST NOT modify the frame content, and concatenating the RTP
280	   payloads of the RTP packets for each frame MUST produce the exact
281	   binary content of the input frame.
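
   The fragmentation rule above can be sketched as follows.  This is a
   minimal illustration, not part of the format definition: the names
   fragment_frame, reassemble_frame and max_payload_size are
   assumptions made for this example, and a real packetizer would
   derive the payload budget from the path MTU minus RTP header and
   header extension overhead.

```python
from typing import List


def fragment_frame(frame: bytes, max_payload_size: int) -> List[bytes]:
    """Split an opaque frame into RTP payloads of bounded size.

    The frame is treated as a binary blob, so fragment boundaries are
    arbitrary: here, fixed-size chunks with a shorter final chunk.
    """
    if max_payload_size <= 0:
        raise ValueError("max_payload_size must be positive")
    return [
        frame[offset:offset + max_payload_size]
        for offset in range(0, len(frame), max_payload_size)
    ]


def reassemble_frame(payloads: List[bytes]) -> bytes:
    """Concatenate the payloads of one frame, in sequence order."""
    return b"".join(payloads)


# Usage: the round trip must give back the exact input bytes.
frame = bytes(range(256)) * 10          # 2560 bytes of opaque content
payloads = fragment_frame(frame, 1200)  # 1200 is a hypothetical budget
assert reassemble_frame(payloads) == frame
```

   Because the payloads carry no per-fragment header in this sketch,
   the receiver relies on RTP sequence numbers, the RTP timestamp and
   the marker bit to delimit and order the fragments of a frame.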

283	   The marker bit of each RTP packet in a frame MUST be set according to
284	   the audio and video profiles specified in [RFC3551].

286	   The spatial layer frames are sent in ascending order, with the same
287	   RTP timestamp, and only the last RTP packet of the last spatial layer
288	   frame will have the marker bit set to 1.
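
   The ordering and marker bit rule for spatial layers can be sketched
   as follows.  The PacketPlan type and the plan_spatial_layers
   function are hypothetical names introduced for this example only;
   the sketch covers the video case described above and leaves RTP
   header construction out.

```python
from typing import List, NamedTuple


class PacketPlan(NamedTuple):
    timestamp: int      # RTP timestamp shared by all layers
    spatial_layer: int  # index of the spatial layer frame
    payload: bytes      # fragment of the layer frame
    marker: bool        # RTP marker bit


def plan_spatial_layers(layer_frames: List[bytes], timestamp: int,
                        max_payload_size: int) -> List[PacketPlan]:
    """Plan the packets for the spatial layer frames of one picture.

    layer_frames is expected in ascending layer order.  Every packet
    gets the same timestamp, and only the last packet of the last
    layer frame gets the marker bit.
    """
    plans: List[PacketPlan] = []
    for layer, frame in enumerate(layer_frames):
        for offset in range(0, len(frame), max_payload_size):
            chunk = frame[offset:offset + max_payload_size]
            plans.append(PacketPlan(timestamp, layer, chunk, False))
    if plans:
        plans[-1] = plans[-1]._replace(marker=True)
    return plans


# Usage: two layer frames, fragmented with a tiny payload budget.
plans = plan_spatial_layers([b"a" * 5, b"b" * 3], 9000, 2)
assert [p.marker for p in plans] == [False, False, False, False, True]
```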

290	4.  Payload Multiplexing

292	   In order to reduce the number of payload types in the SDP exchange, a
293	   single payload type code for this multi-codec packetization can be
294	   used for all negotiated media formats that the multi-codec
295	   packetization supports.  This requires identifying the original
296	   payload type code of the frame's negotiated media format, called the
297	   associated payload type (APT) hereunder.  The APT value is the
298	   payload type code of the associated format passed to the multi-codec
299	   Media Packetizer before any transformation is applied.

301	   The APT value is sent in a dedicated header extension.  The payload
302	   of this header extension can be encoded using either the one-byte or
303	   two-byte header defined in [RFC5285].  Figures 3 and 4 show an
304	   example of each of these encodings.

306	                       0                   1
307	                       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
308	                      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
309	                      |  ID   | len=0 |S|     APT     |
310	                      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

312	      Figure 3: Frame associated payload type encoding using the One-
313	                             Byte header format

315	        0                   1                   2                   3
316	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
317	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
318	       |      ID       |     len=1     |S|     APT     |    0 (pad)    |
319	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

321	     Figure 4: Frame associated payload type encoding using the Two-
322	                            Byte header format

324	   The APT value is the associated payload type value.  The S bit
325	   indicates if the media stream can be forwarded safely starting from
326	   this RTP packet.  Typically, it will be set to 1 on the first RTP
327	   packet of an intra video frame and in all RTP audio packets.
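
   The two encodings in Figures 3 and 4 can be sketched as follows.
   The function names are assumptions made for this example.  The
   sketch only builds the extension element shown in each figure (with
   the trailing pad byte of Figure 4); it does not build the
   surrounding RTP header extension block.

```python
from typing import Tuple


def _apt_data_byte(apt: int, s_bit: bool) -> int:
    """Pack the S bit (most significant bit) and the 7-bit APT value."""
    if not 0 <= apt <= 127:
        raise ValueError("APT must fit in 7 bits")
    return (0x80 if s_bit else 0x00) | apt


def encode_apt_one_byte(ext_id: int, apt: int, s_bit: bool) -> bytes:
    """Figure 3: 4-bit ID, 4-bit len=0 (one data byte), then S and APT."""
    if not 1 <= ext_id <= 14:
        raise ValueError("one-byte header IDs are in the range 1-14")
    return bytes([(ext_id << 4) | 0, _apt_data_byte(apt, s_bit)])


def encode_apt_two_byte(ext_id: int, apt: int, s_bit: bool) -> bytes:
    """Figure 4: 8-bit ID, 8-bit len=1, S and APT, then one pad byte."""
    if not 1 <= ext_id <= 255:
        raise ValueError("two-byte header IDs are in the range 1-255")
    return bytes([ext_id, 1, _apt_data_byte(apt, s_bit), 0])


def decode_apt(data_byte: int) -> Tuple[bool, int]:
    """Return (S bit, APT) from the data byte of the extension."""
    return bool(data_byte & 0x80), data_byte & 0x7F


# Usage: extension ID 4 and APT 96, as negotiated in an SDP exchange.
element = encode_apt_one_byte(4, 96, True)
assert decode_apt(element[1]) == (True, 96)
```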

329	   Receivers MUST be ready to receive RTP packets with different
330	   associated payload types in the same way they would receive different
331	   payload type codes on the RTP packets.

333	   The URI for declaring this header extension in an extmap attribute is
334	   "urn:ietf:params:rtp-hdrext:associated-payload-type".

336	5.  SDP Negotiation

338	   To use the multi-codec packetization, the SDP Offer/Answer exchange
339	   MUST negotiate the payload type of the negotiated codec format, the
340	   multi-codec payload type, and the associated payload type header
341	   extension.

343	   Only the negotiated payload types are allowed to be used as
344	   associated payload types.  Figure 5 illustrates an SDP that
345	   negotiates exchange of video using either VP8 or VP9 codecs with the
346	   possibility to use the multi-codec packetization.  In this example,
347	   RTX is also negotiated and will be applied normally on each
348	   associated payload type.

350	   m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101
351	   c=IN IP4 0.0.0.0
352	   a=rtcp:9 IN IP4 0.0.0.0
353	   a=setup:actpass
354	   a=mid:1
355	   a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
356	   a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
357	   a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
358	   a=extmap:4 urn:ietf:params:rtp-hdrext:associated-payload-type
359	   a=sendrecv
360	   a=rtpmap:96 vp9/90000
361	   a=rtpmap:97 vp8/90000
362	   a=rtpmap:98 generic/90000
363	   a=rtpmap:99 rtx/90000
364	   a=fmtp:99 apt=96
365	   a=rtpmap:100 rtx/90000
366	   a=fmtp:100 apt=97
367	   a=rtpmap:101 rtx/90000
368	   a=fmtp:101 apt=98

370	       Figure 5: SDP example negotiating the multi-codec payload type
371	                   and related header extension for video
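
   A receiver-side check of the rule that only negotiated payload types
   may be used as associated payload types could look like the sketch
   below.  The helper names are hypothetical, the parsing is
   deliberately simplistic, and rejecting "generic" and "rtx" entries
   as APT targets is an assumption of this example rather than a rule
   stated in this document.

```python
import re
from typing import Dict, Optional

APT_URI = "urn:ietf:params:rtp-hdrext:associated-payload-type"

# Media section of Figure 5, reduced to the lines used here.
FIGURE_5_SDP = """\
m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101
a=extmap:4 urn:ietf:params:rtp-hdrext:associated-payload-type
a=rtpmap:96 vp9/90000
a=rtpmap:97 vp8/90000
a=rtpmap:98 generic/90000
a=rtpmap:99 rtx/90000
a=fmtp:99 apt=96
"""


def parse_rtpmaps(sdp: str) -> Dict[int, str]:
    """Map each payload type to its lower-cased encoding name."""
    codecs: Dict[int, str] = {}
    for line in sdp.splitlines():
        match = re.match(r"a=rtpmap:(\d+) ([^/\s]+)/", line.strip())
        if match:
            codecs[int(match.group(1))] = match.group(2).lower()
    return codecs


def find_extmap_id(sdp: str, uri: str) -> Optional[int]:
    """Return the extmap ID negotiated for uri, if any."""
    for line in sdp.splitlines():
        match = re.match(r"a=extmap:(\d+)(?:/\S+)? (\S+)", line.strip())
        if match and match.group(2) == uri:
            return int(match.group(1))
    return None


def is_valid_apt(sdp: str, payload_type: int, apt: int) -> bool:
    """Check a received (payload type, APT) pair against the SDP."""
    codecs = parse_rtpmaps(sdp)
    if find_extmap_id(sdp, APT_URI) is None:
        return False                      # extension not negotiated
    if codecs.get(payload_type) != "generic":
        return False                      # not the multi-codec format
    # Assumption: the APT must name a negotiated media codec.
    return codecs.get(apt) not in (None, "generic", "rtx")


# Usage: VP9 (96) carried in the multi-codec payload type (98).
assert is_valid_apt(FIGURE_5_SDP, 98, 96)
```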

373	6.  SFU Packet Selection

375	   SFUs need to have a basic understanding of each frame they receive so
376	   they can decide to forward it or not and to which endpoint.  They
377	   might need similar information to support media content recording.
378	   This information is either generic to a group of frames (called a
379	   stream hereafter) or specific to each frame.

381	   The information is transmitted as an RTP header extension, as the
382	   RTP packet payload should be treated as opaque by the SFU.  This is
383	   especially necessary if the payload is end-to-end encrypted.  The
384	   amount of information should be limited to what is strictly necessary
385	   for the SFU's task, since the SFU is not always as trusted as
386	   individual peers.

388	   For audio, configuration information such as the Opus TOC might be
389	   useful.  For video, configuration information might include:
390	   -  Stream configuration: resolution, quality, frame rate...
391	   -  Codec-specific configuration: codec profile like profile_idc...
392	   -  Frame-specific information: whether the stream is decodable when
393	      starting from this frame, whether the frame is skippable...

395	   For video content, this information is sent using a Dependency
396	   Descriptor header extension.  In that case, the first RTP packet of
397	   the frame will have its start_of_frame equal to 1 and the last packet
398	   will have its end_of_frame equal to 1.

400	7.  Sender Processing Rules

402	   The sender identifies the use of the multi-codec payload format by
403	   using the urn:ietf:params:rtp-hdrext:associated-payload-type
404	   extension.  When doing so, the sender follows these additional rules:
405	   -  For audio content, the associated payload type MUST reference an
406	      audio codec in the supported audio codec list.  This list contains
407	      the audio codecs enumerated in [RFC7874] and may be extended in
408	      future versions of this specification.
409	   -  For video content, H.264 and VP8 are supported as described in
410	      [RFC7742], as well as VP9 and AV1.  If scalable video coding is
411	      used, the sender MUST generate a Dependency Descriptor header
412	      extension.  This requires the associated payload type to reference
413	      a video codec that can be described using the Dependency
414	      Descriptor header extension.  It also requires the sender to split
415	      the video encoder output into frames that can each be described
416	      using the Dependency Descriptor header extension.

418	   These rules apply both to the originator of the content and to
419	   SFUs that might route the content to end receivers.

421	8.  Redundancy Techniques Considerations

423	   The solution described in this document is expected to integrate well
424	   with the existing RTP ecosystem.  This section describes how the
425	   multi-codec packetizer can be used jointly with existing techniques
426	   that mitigate the effects of unreliable transports.

428	8.1.  Retransmission Techniques

430	   [RFC4588] defines a retransmission payload format (RTX) that can be
431	   used in case of packet loss.  As defined in [RFC4588], RTX is able to
432	   handle any payload format, including the format described in this
433	   document.  Given RTX preserves both RTP packet payload and headers,
434	   the receiver will be able to identify the payload type of the
435	   recovered packet and whether multi-codec packetization is used.  RTX
436	   will also allow recovering RTP header extensions that convey
437	   information on the media content itself.

439	8.2.  Forward Error Correction (FEC) Techniques

441	   FEC is another technique used in RTP Media Chains to protect media
442	   content against packet loss.  [RFC5109] defines a payload format
443	   used to transmit FEC data protecting specific packets.

445	   FEC may protect some parts of the media content more than others.
446	   For instance, intra video frame encoded data or important network
447	   abstraction layer units (NALUs) like SPS/PPS may be more protected.
448	   With a post-encoder transform and the use of multi-codec
449	   packetization, the granularity of the recovery mechanism is no longer
450	   at the NALU level but at the level of the frame generated by the
451	   post-encoder transform.  If an SVC codec is used, each spatial layer
452	   will be processed as an independent frame.  In that case, base
453	   layers can be protected more heavily than higher resolution layers.

455	8.3.  Redundant Audio Data Techniques

457	   As defined in [RFC7656], RTP-based redundancy is a transformation
458	   that generates redundant or repair packets sent out as a Redundancy
459	   RTP Stream to mitigate Network Transport impairments, like packet
460	   loss and delay.

462	   [RFC2198] defines a payload format for sending the same audio data
463	   encoded multiple times at different quality levels.  This allows the
464	   receiver to use a lower quality encoding of the audio data should
465	   the higher quality encoding be lost during transmission.

467	   If a Media Transformation is in use, both the primary and redundant
468	   encoding must be transformed independently and the redundant packet
469	   created normally.  As the RTP headers present in the redundant packet
470	   are only applicable to the primary encoding, if the payload type for
471	   a redundant encoding block is mapped to the multi-codec packetizer,
472	   the value of the associated payload type for the primary encoding is
473	   applied to the redundant encoding block as well.
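
   As an illustration of the paragraph above, the sketch below builds
   an [RFC2198] payload holding one redundant block followed by the
   primary block, both already transformed independently by the
   application.  The function name and argument names are assumptions
   made for this example.  Both blocks use the multi-codec payload
   type, so the single associated payload type header extension sent
   for the primary encoding applies to the redundant block as well.

```python
import struct


def build_red_payload(primary_pt: int, primary: bytes,
                      redundant_pt: int, redundant: bytes,
                      timestamp_offset: int) -> bytes:
    """Build a RED payload: one redundant block, then the primary.

    Redundant block header (4 bytes): F=1, 7-bit block payload type,
    14-bit timestamp offset, 10-bit block length.
    Primary block header (1 byte): F=0, 7-bit block payload type.
    """
    if not (0 <= primary_pt <= 127 and 0 <= redundant_pt <= 127):
        raise ValueError("payload types must fit in 7 bits")
    if not 0 <= timestamp_offset <= 0x3FFF:
        raise ValueError("timestamp offset must fit in 14 bits")
    if len(redundant) > 0x3FF:
        raise ValueError("redundant block length must fit in 10 bits")
    redundant_header = ((1 << 31) | (redundant_pt << 24)
                        | (timestamp_offset << 10) | len(redundant))
    return (struct.pack("!I", redundant_header)
            + bytes([primary_pt])
            + redundant + primary)


# Usage: 98 stands for a hypothetical negotiated multi-codec payload
# type; the byte strings stand for independently transformed frames.
payload = build_red_payload(98, b"PRIMARY", 98, b"RED", 960)
assert payload[5:] == b"REDPRIMARY"
```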

475	9.  Alternatives

477	   Various alternatives can be used to implement and negotiate multi-
478	   codec packetization.  This section describes a few additional
479	   alternatives.  This section is to be removed before finalization of
480	   the document.

482	9.1.  Generic Packetization With In-Payload APT

484	   Instead of using an RTP header extension to convey the APT value,
485	   the value is prepended to the RTP payload itself.  As the value
486	   cannot change within a frame, it is prepended only to the first
487	   packet generated for the frame.  This removes the need to negotiate
488	   a dedicated header extension, but may require the SFU to update the
489	   payload when sending or recording content.

491	9.2.  A Payload Type for Generic Packetization AND Media Format

493	   The payload type is negotiated in the SDP so as to identify both the
494	   negotiated codec format and the multi-codec packetization use.  There
495	   is no network cost but this increases the number of payload types
496	   used in the SDP.

498	   m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101 102 103
499	   c=IN IP4 0.0.0.0
500	   a=rtcp:9 IN IP4 0.0.0.0
501	   a=setup:actpass
502	   a=mid:1
503	   a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
504	   a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
505	   a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
506	   a=sendrecv
507	   a=rtpmap:96 vp9/90000
508	   a=rtpmap:97 generic/90000
509	   a=fmtp:97 apt=96
510	   a=rtpmap:98 vp8/90000
511	   a=rtpmap:99 generic/90000
512	   a=fmtp:99 apt=98
513	   a=rtpmap:100 rtx/90000
514	   a=fmtp:100 apt=96
515	   a=rtpmap:101 rtx/90000
516	   a=fmtp:101 apt=97
517	   a=rtpmap:102 rtx/90000
518	   a=fmtp:102 apt=98
519	   a=rtpmap:103 rtx/90000
520	   a=fmtp:103 apt=99

522	      Figure 6: SDP example negotiating a payload type for format and
523	                         multi-codec packetization

525	   A variation of this approach is to consider defining several multi-
526	   codec payload types, each of them having an identified codec format.

528	   m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99
529	   c=IN IP4 0.0.0.0
530	   a=rtcp:9 IN IP4 0.0.0.0
531	   a=setup:actpass
532	   a=mid:1
533	   a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
534	   a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
535	   a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
536	   a=sendrecv
537	   a=rtpmap:96 generic/90000
538	   a=fmtp:96 codec=vp9
539	   a=rtpmap:97 generic/90000
540	   a=fmtp:97 codec=vp8
541	   a=rtpmap:98 rtx/90000
542	   a=fmtp:98 apt=96
543	   a=rtpmap:99 rtx/90000
544	   a=fmtp:99 apt=97

546	      Figure 7: Alternative SDP example negotiating a payload type for
547	                    format and multi-codec packetization

549	9.3.  A RTP Header To Choose Packetization

551	   An RTP header extension can be used to flag content as opaque so
552	   that the receiver knows whether or not to use the multi-codec
553	   packetization.  As for the APT header extension, this RTP header
554	   extension may not need to be sent in every packet; it could for
555	   instance be sent in the first packet of every intra video frame.
556	   The main advantage of this approach is the reduced impact on SDP
557	   negotiation.

559	   m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99
560	   c=IN IP4 0.0.0.0
561	   a=rtcp:9 IN IP4 0.0.0.0
562	   a=setup:actpass
563	   a=mid:1
564	   a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
565	   a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
566	   a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
567	   a=extmap:4 urn:ietf:params:rtp-hdrext:multi-codec-packetization-use
568	   a=sendrecv
569	   a=rtpmap:96 vp9/90000
570	   a=rtpmap:97 vp8/90000
571	   a=rtpmap:98 rtx/90000
572	   a=fmtp:98 apt=96
573	   a=rtpmap:99 rtx/90000
574	   a=fmtp:99 apt=97

575	       Figure 8: SDP example negotiating multi-codec packetization as
576	                            RTP header extension

578	10.  Security Considerations

580	   RTP packets using the payload format defined in this specification
581	   are subject to the general security considerations discussed in
582	   [RFC3550].  The solution presented in this document is not expected
583	   to create new security threats.  Nevertheless, RTP Media Chains
584	   containing Media Transformers need to be used and implemented
585	   carefully, and the security considerations discussed in [SFrame]
586	   and [WebRTCInsertableStreams] should be consulted.  In particular,
587	   Media Transformers on the receiver side need to be prepared to
588	   receive arbitrary content, as decoders already are.  Similarly,
589	   since Media Transformers can be implemented in JavaScript in
590	   browsers, RTP Packetizers should be prepared to receive arbitrary
591	   content from them.

593	11.  IANA Considerations

595	   This document requests that IANA register two new media subtypes,
596	   as described in this section.

598	11.1.  Registration of audio/generic

600	   Type name: audio

602	   Subtype name: generic

604	   Required parameters: none

606	   Optional parameters: none

608	   Encoding considerations: This format is framed (see Section 4.8 in
609	   the template document) and contains binary data.

611	   Security considerations: TBD.

613	   Interoperability considerations: TBD

615	   Published specification: TBD.

617	   Applications that use this media type: TBD.

619	   Additional information: none

621	   Intended usage: COMMON

622	   Restrictions on usage: TBD

624	   Author:

626	   Change controller:

628	11.2.  Registration of video/generic

630	   Type name: video

632	   Subtype name: generic

634	   Required parameters: none

636	   Optional parameters: none

638	   Encoding considerations: This format is framed (see Section 4.8 in
639	   the template document) and contains binary data.

641	   Security considerations: TBD.

643	   Interoperability considerations: TBD

645	   Published specification: TBD.

647	   Applications that use this media type: TBD.

649	   Additional information: none

651	   Intended usage: COMMON

653	   Restrictions on usage: TBD

655	   Author:

657	   Change controller:

659	13.  References

661	13.1.  Normative References

663	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
664	              Requirement Levels", BCP 14, RFC 2119,
665	              DOI 10.17487/RFC2119, March 1997,
666	              <https://www.rfc-editor.org/info/rfc2119>.

668	   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
669	              Jacobson, "RTP: A Transport Protocol for Real-Time
670	              Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
671	              July 2003, <https://www.rfc-editor.org/info/rfc3550>.

673	   [RFC3551]  Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
674	              Video Conferences with Minimal Control", STD 65, RFC 3551,
675	              DOI 10.17487/RFC3551, July 2003,
676	              <https://www.rfc-editor.org/info/rfc3551>.

678	   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
679	              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
680	              RFC 3711, DOI 10.17487/RFC3711, March 2004,
681	              <https://www.rfc-editor.org/info/rfc3711>.

683	   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
684	              Description Protocol", RFC 4566, DOI 10.17487/RFC4566,
685	              July 2006, <https://www.rfc-editor.org/info/rfc4566>.

687	   [RFC5285]  Singer, D. and H. Desineni, "A General Mechanism for RTP
688	              Header Extensions", RFC 5285, DOI 10.17487/RFC5285, July
689	              2008, <https://www.rfc-editor.org/info/rfc5285>.

691	   [RFC7656]  Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and
692	              B. Burman, Ed., "A Taxonomy of Semantics and Mechanisms
693	              for Real-Time Transport Protocol (RTP) Sources", RFC 7656,
694	              DOI 10.17487/RFC7656, November 2015,
695	              <https://www.rfc-editor.org/info/rfc7656>.

697	   [RFC8285]  Singer, D., Desineni, H., and R. Even, Ed., "A General
698	              Mechanism for RTP Header Extensions", RFC 8285,
699	              DOI 10.17487/RFC8285, October 2017,
700	              <https://www.rfc-editor.org/info/rfc8285>.

702	13.2.  Informative References

704	   [RFC2198]  Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
705	              Handley, M., Bolot, J.C., Vega-Garcia, A., and S. Fosse-
706	              Parisis, "RTP Payload for Redundant Audio Data", RFC 2198,
707	              DOI 10.17487/RFC2198, September 1997,
708	              <https://www.rfc-editor.org/info/rfc2198>.

710	   [RFC4588]  Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R.
711	              Hakenberg, "RTP Retransmission Payload Format", RFC 4588,
712	              DOI 10.17487/RFC4588, July 2006,
713	              <https://www.rfc-editor.org/info/rfc4588>.

715	   [RFC5109]  Li, A., Ed., "RTP Payload Format for Generic Forward Error
716	              Correction", RFC 5109, DOI 10.17487/RFC5109, December
717	              2007, <https://www.rfc-editor.org/info/rfc5109>.

719	   [RFC6184]  Wang, Y.-K., Even, R., Kristensen, T., and R. Jesup, "RTP
720	              Payload Format for H.264 Video", RFC 6184,
721	              DOI 10.17487/RFC6184, May 2011,
722	              <https://www.rfc-editor.org/info/rfc6184>.

724	   [RFC6464]  Lennox, J., Ed., Ivov, E., and E. Marocco, "A Real-time
725	              Transport Protocol (RTP) Header Extension for Client-to-
726	              Mixer Audio Level Indication", RFC 6464,
727	              DOI 10.17487/RFC6464, December 2011,
728	              <https://www.rfc-editor.org/info/rfc6464>.

730	   [RFC6465]  Ivov, E., Ed., Marocco, E., Ed., and J. Lennox, "A Real-
731	              time Transport Protocol (RTP) Header Extension for Mixer-
732	              to-Client Audio Level Indication", RFC 6465,
733	              DOI 10.17487/RFC6465, December 2011,
734	              <https://www.rfc-editor.org/info/rfc6465>.

736	   [RFC6904]  Lennox, J., "Encryption of Header Extensions in the Secure
737	              Real-time Transport Protocol (SRTP)", RFC 6904,
738	              DOI 10.17487/RFC6904, April 2013,
739	              <https://www.rfc-editor.org/info/rfc6904>.

741	   [RFC7742]  Roach, A.B., "WebRTC Video Processing and Codec
742	              Requirements", RFC 7742, DOI 10.17487/RFC7742, March 2016,
743	              <https://www.rfc-editor.org/info/rfc7742>.

745	   [RFC7874]  Valin, JM. and C. Bran, "WebRTC Audio Codec and Processing
746	              Requirements", RFC 7874, DOI 10.17487/RFC7874, May 2016,
747	              <https://www.rfc-editor.org/info/rfc7874>.

749	   [SFrame]   "Secure Frame (SFrame)", n.d.,
750	              <https://tools.ietf.org/html/draft-omara-sframe>.

752	   [WebRTCInsertableStreams]
753	              "WebRTC Insertable Media using Streams", n.d.,
754	              <https://w3c.github.io/webrtc-insertable-streams>.

756	Authors' Addresses

758	   Sergio Garcia Murillo
759	   CoSMo

761	   Email: sergio.garcia.murillo@cosmosoftware.io

762	   Youenn Fablet
763	   Apple Inc.

765	   Email: youenn@apple.com

767	   Alex Gouaillard
768	   CoSMo

770	   Email: alex.gouaillard@cosmosoftware.io

772	   Justin Uberti
773	   Clubhouse

775	   Email: justin@uberti.name