idnits 2.17.1 

draft-gouaillard-avtcore-codec-agn-rtp-payload-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([SFrame],
     [WebRTCInsertableStreams]), which it shouldn't.  Please replace those
     with straight textual mentions of the documents in question.

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords. 

     RFC 2119 keyword, line 298: '...   spatial layer MUST be split in its ...'
     RFC 2119 keyword, line 301: '... a frame from the application, it MUST...'
     RFC 2119 keyword, line 306: '...ketizer or any relying server MUST NOT...'
     RFC 2119 keyword, line 308: '...s for each frame MUST produce the exac...'
     RFC 2119 keyword, line 311: '...acket in a frame MUST be set according...'
     (2 more instances...)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (February 19, 2021) is 1161 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 5285 (Obsoleted by RFC 8285)

  ** Downref: Normative reference to an Informational RFC: RFC 7656


     Summary: 4 errors (**), 0 flaws (~~), 1 warning (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	AVTCORE                                                S. Garcia Murillo
2	Internet-Draft                                             A. Gouaillard
3	Intended status: Standards Track                          CoSMo Software
4	Expires: August 23, 2021                               February 19, 2021

6	              Codec agnostic RTP payload format for video
7	           draft-gouaillard-avtcore-codec-agn-rtp-payload-00

9	Abstract

11	   RTP Media Chains usually rely on piping encoder output directly to
12	   packetizers.  Media packetization formats often support a specific
13	   codec format and optimize RTP packet generation accordingly.

15	   With the development of Selective Forwarding Unit (SFU) solutions,
16	   which do not process media content server side, the need for media
17	   content processing at the origin and at the destination has arisen.

19	   RTP Media Chains used e.g. in WebRTC solutions are increasingly
20	   relying on application-specific transforms that sit in-between
21	   encoder and packetizer on one end and in-between depacketizer and
22	   decoder on the other end.  This use case has become so important
23	   that the W3C is standardizing the capacity to access encoded content
24	   with the [WebRTCInsertableStreams] API proposal.  An extremely
25	   popular use case is application level end-to-end encryption of media
26	   content, using for instance [SFrame].

28	   Whatever the modification applied to the media content, RTP
29	   packetizers can no longer expect to use packetization formats that
30	   mandate media content to be in a specific codec format.

32	   In extreme cases like encryption, where the RTP Payload is made
33	   completely opaque to the SFUs, some extra mechanism must also be
34	   added for them to be able to route the packets without depending on
35	   RTP payload or payload headers.

37	   The traditional process of creating a new RTP Payload specification
38	   per content would not be practical as we would need to make a new one
39	   for each codec-transform pair.

41	   This document describes a solution, which provides the following
42	   features in the case the encoded content has been modified before
43	   reaching the packetizer: - a payload-agnostic RTP packetization
44	   format that can be used on any media content, - a signalling
45	   mechanism for the above format and the inner payload.  Both of the
46	   above mechanisms are backward compatible with most (S)RTP/RTCP
47	   mechanisms used for bandwidth estimation and congestion control in
48	   RTP/SRTP/WebRTC, including but not limited to SSRC, RED, FEC, RTX,
49	   NACK, SR/RR, REMB, transport-wide-CC, TIMBR, etc., as illustrated
50	   by existing implementations in Chrome, Safari, and Medooze.

52	   This document also describes a solution to allow SFUs to continue
53	   performing packet routing on top of this generic RTP packetization
54	   format.

56	   This document complements the SFrame (media encryption), and
57	   Dependency Descriptor (AV1 payload annex) documents to provide an
58	   End-to-End-Encryption solution that would sit on top of SRTP/WebRTC,
59	   use SFUs on the media back-end, and leverage W3C APIs in the browser.
60	   A high-level description of such a system will be provided as an
61	   informational I-D in the SFrame WG and then cited here.

63	Status of This Memo

65	   This Internet-Draft is submitted in full conformance with the
66	   provisions of BCP 78 and BCP 79.

68	   Internet-Drafts are working documents of the Internet Engineering
69	   Task Force (IETF).  Note that other groups may also distribute
70	   working documents as Internet-Drafts.  The list of current Internet-
71	   Drafts is at https://datatracker.ietf.org/drafts/current/.

73	   Internet-Drafts are draft documents valid for a maximum of six months
74	   and may be updated, replaced, or obsoleted by other documents at any
75	   time.  It is inappropriate to use Internet-Drafts as reference
76	   material or to cite them other than as "work in progress."

78	   This Internet-Draft will expire on August 23, 2021.

80	Copyright Notice

82	   Copyright (c) 2021 IETF Trust and the persons identified as the
83	   document authors.  All rights reserved.

85	   This document is subject to BCP 78 and the IETF Trust's Legal
86	   Provisions Relating to IETF Documents
87	   (https://trustee.ietf.org/license-info) in effect on the date of
88	   publication of this document.  Please review these documents
89	   carefully, as they describe your rights and restrictions with respect
90	   to this document.  Code Components extracted from this document must
91	   include Simplified BSD License text as described in Section 4.e of
92	   the Trust Legal Provisions and are provided without warranty as
93	   described in the Simplified BSD License.

95	Table of Contents

97	   1.  Introduction
98	   2.  Goals
99	   3.  RTP Packetization
100	   4.  Payload Multiplexing
101	   5.  SDP Negotiation
102	   6.  SFU Packet Selection
103	   7.  Redundancy Techniques Considerations
104	     7.1.  Retransmission Techniques
105	     7.2.  Forward Error Correction (FEC) Techniques
106	     7.3.  Redundant Audio Data Techniques
107	   8.  Alternatives
108	     8.1.  Generic Packetization With In-Payload APT
109	     8.2.  A Payload Type for Generic Packetization AND Media Format
110	     8.3.  A RTP Header To Choose Packetization
111	   9.  Security Considerations
112	   10. IANA Considerations
113	     10.1.  Registration of audio/generic
114	   11. Registration of video/generic
115	   12. References
116	     12.1.  Normative References
117	     12.2.  Informative References
118	   Authors' Addresses

120	1.  Introduction

122	   As per Figure 1 of [RFC7656], a Media Packetizer transforms a single
123	   Encoded Stream into one or several RTP packets.  The Encoded Stream
124	   is coming straight from the Media Encoder and is expected to follow
125	   the format produced by the Media Encoder.  A number of Media
126	   Packetizer formats have been designed to process a specific format
127	   produced by a Media Encoder.  For instance [RFC6184] is dedicated to
128	   the processing of content produced by H.264 Media Encoders, and
129	   generates packets following the NALU organization.

131	   WebRTC applications are increasingly deploying end-to-end encryption
132	   solutions on top of RTP Media Chains.  End-to-end encryption is
133	   implemented by inserting application-specific Media Transformers
134	   between Media Encoder and Media Packetizer on the sending side, and
135	   between Media Depacketizer and Media Decoder on the receiving side,
136	   as described in Figure 1 and Figure 2.  To support end-to-end
137	   encryption, Media Transformers can use the [SFrame] format.  In
138	   browsers, Media Transformers are implemented using
139	   [WebRTCInsertableStreams], for instance by injecting JavaScript code
140	   provided by web pages.

142	           Physical Stimulus
143	                    |
144	                    V
145	         +----------------------+
146	         |     Media Capture    |
147	         +----------------------+
148	                    |
149	               Raw Stream
150	                    V
151	         +----------------------+
152	         |     Media Source     |<-- Synchronization Timing
153	         +----------------------+
154	                    |
155	              Source Stream
156	                    V
157	         +----------------------+
158	         |    Media Encoder     |
159	         +----------------------+
160	                    |
161	              Encoded Stream
162	                    V
163	         +----------------------+
164	         |   Media Transformer  |<-- NEW: application-specific transform
165	         +----------------------+         (e.g. SFrame Encryption)
166	                    |
167	            Transformed Stream    +------------+
168	                    V             |            V
169	         +----------------------+ | +----------------------+
170	         |   Media Packetizer   | | | RTP-Based Redundancy |
171	         +----------------------+ | +----------------------+
172	                    |             |            |
173	                    +-------------+  Redundancy RTP Stream
174	             Source RTP Stream                 |
175	                    V                          V
176	         +----------------------+   +----------------------+
177	         |  RTP-Based Security  |   |  RTP-Based Security  |
178	         +----------------------+   +----------------------+
179	                    |                          |
180	            Secured RTP Stream   Secured Redundancy RTP Stream
181	                    V                          V
182	         +----------------------+   +----------------------+
183	         |   Media Transport    |   |   Media Transport    |
184	         +----------------------+   +----------------------+

186	         Figure 1: Sender Side Concepts in the Media Chain
187	            With Application-level Media Transform

189	   These RTP packets are sent over the wire to a receiver media chain
190	   matching the sender side, reaching the Media Depacketizer that will
191	   reconstruct the Encoded Stream before passing it to the Media
192	   Decoder.

194	         +----------------------+   +----------------------+
195	         |   Media Transport    |   |   Media Transport    |
196	         +----------------------+   +----------------------+
197	           Received |                 Received | Secured
198	           Secured RTP Stream       Redundancy RTP Stream
199	                    V                          V
200	         +----------------------+   +----------------------+
201	         | RTP-Based Validation |   | RTP-Based Validation |
202	         +----------------------+   +----------------------+
203	                    |                          |
204	           Received RTP Stream   Received Redundancy RTP Stream
205	                    |                          |
206	                    |     +--------------------+
207	                    V     V
208	         +----------------------+
209	         |   RTP-Based Repair   |
210	         +----------------------+
211	                    |
212	           Repaired RTP Stream
213	                    V
214	         +----------------------+
215	         |  Media Depacketizer  |
216	         +----------------------+
217	                    |
218	        Received Transformed Stream
219	                    V
220	         +----------------------+
221	         |   Media Transformer  |<-- NEW: application-specific transform
222	         +----------------------+         (e.g. SFrame Decryption)
223	                    |
224	          Received Encoded Stream
225	                    V
226	         +----------------------+
227	         |    Media Decoder     |
228	         +----------------------+
229	                    |
230	          Received Source Stream
231	                    V
232	         +----------------------+
233	         |      Media Sink      |--> Synchronization Information
234	         +----------------------+
235	                    |
236	           Received Raw Stream
237	                    V
238	         +----------------------+
239	         |     Media Render     |
240	         +----------------------+
241	                    |
242	                    V
243	            Physical Stimulus

245	            Figure 2: Receiver Side Concepts in the Media Chain
246	            With Application-level Media Transform

248	   This generic packetization does not change how one or several
249	   encoded or dependent streams are mapped to RTP streams, or how
250	   the synchronization source(s) (SSRC) are assigned.

252	   Given the use of post-encoder application-specific transforms, the
253	   whole Media Chain needs to be made aware of it.  This includes the
254	   sender post-transform Media Chain, Media Transport intermediaries
255	   (SFUs typically) and receiver pre-transform Media Chain.

257	   As these transforms can alter Encoded Streams in any possible way,
258	   the use of codec-specific Media Packetizers like [RFC6184] on a
259	   Transformed Stream may be suboptimal on the sender side.  It may also
260	   be problematic on the receiving side in case codec-specific
261	   processing is done prior to the Media Transformer.  Media Transport
262	   intermediaries often look at the Media Content itself to fuel their
263	   packet selection algorithms.

265	2.  Goals

267	   The objective of this document is to support inserting any
268	   application-specific transform between encoders and packetizers in
269	   the Media Chain.  For that purpose, this document will: 1.  Provide a
270	   generic packetization format that supports any media content
271	   (compressed audio, compressed video, encrypted content...) that
272	   allows reuse of existing RTP mechanisms in place in WebRTC
273	   applications such as RTX, RED or FEC.  2.  Provide a way to negotiate
274	   use of the generic packetization format between sender and receiver,
275	   with minimum impact on existing negotiation approaches.  3.  Provide
276	   side-channel information so that network intermediaries (SFUs in
277	   particular) can apply their existing packet routing strategies
278	   without inspecting the media content.

280	3.  RTP Packetization

282	   A generic packetizer, by design, is not expected to understand the
283	   format of the media to transmit.  The unit used by the packetizer to
284	   do processing is called a frame in the remainder of the document.

286	   It is the responsibility of the application using the packetizer to
287	   group media content in meaningful frames.  In the common case of a
288	   video codec, the packetizer frame is the frame in byte format
289	   (H.264 Annex B for example) generated by the encoder.

291	   If the application wants to transform encoded content, the
292	   application needs to split the encoded content into frames prior
293	   to the transform.  Each frame is then transformed independently,
294	   for instance encrypted using [SFrame].  The content of each
295	   transformed frame is then processed by the packetizer.

297	   In the case of a video codec supporting spatial scalability, each
298	   spatial layer MUST be split into its own frame by the application
299	   before passing it to the packetizer.

301	   When the packetizer receives a frame from the application, it MUST
302	   fragment the frame content into multiple RTP packets to ensure
303	   packets do not exceed the network maximum transmission unit.  The
304	   content of the frame is treated as a binary blob by the packetizer,
305	   so the boundaries of each fragment are chosen arbitrarily by the
306	   packetizer.  The packetizer or any relaying server MUST NOT modify
307	   the frame content, and concatenating the RTP payloads of the RTP
308	   packets for each frame MUST produce the exact binary content of the
309	   input frame.
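
The fragmentation rule above can be sketched as follows. This is a minimal illustration, not part of the specification: the function names and the max_payload_size parameter (the per-packet payload budget left after IP/UDP/RTP and security overhead) are assumptions introduced here.

```python
from typing import List


def fragment_frame(frame: bytes, max_payload_size: int) -> List[bytes]:
    """Split an opaque frame into RTP payloads of at most max_payload_size bytes.

    The frame is treated as a binary blob: boundaries are arbitrary and no
    byte is added, removed, or reordered.
    """
    if max_payload_size <= 0:
        raise ValueError("max_payload_size must be positive")
    return [frame[offset:offset + max_payload_size]
            for offset in range(0, len(frame), max_payload_size)]


def reassemble_frame(payloads: List[bytes]) -> bytes:
    """Concatenate payloads, in sequence order, back into the original frame."""
    return b"".join(payloads)
```

A receiver that concatenates the payloads of one frame in sequence-number order recovers the exact input, which is the property the MUST requirements above ask for.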

311	   The marker bit of each RTP packet in a frame MUST be set according to
312	   the audio and video profiles specified in [RFC3551].

314	   The spatial layer frames are sent in ascending order, with the same
315	   RTP timestamp, and only the last RTP packet of the last spatial layer
316	   frame will have the marker bit set to 1.
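
As an illustration of the ordering and marker-bit rule for spatial layers, the sketch below plans the packets of one picture. The PacketPlan type, the function name, and the max_payload_size parameter are hypothetical names used only for this example; RTP header construction is out of scope.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PacketPlan:
    spatial_layer: int
    timestamp: int
    payload: bytes
    marker: bool


def plan_spatial_layers(layer_frames: List[bytes], timestamp: int,
                        max_payload_size: int) -> List[PacketPlan]:
    """Plan packets for spatial layer frames given in ascending layer order.

    All packets share the same RTP timestamp; only the last packet of the
    last spatial layer frame carries marker=True.
    """
    packets: List[PacketPlan] = []
    for layer, frame in enumerate(layer_frames):
        for offset in range(0, len(frame), max_payload_size):
            chunk = frame[offset:offset + max_payload_size]
            packets.append(PacketPlan(layer, timestamp, chunk, False))
    if packets:
        packets[-1].marker = True
    return packets
```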

318	4.  Payload Multiplexing

320	   In order to reduce the number of payload types in the SDP exchange, a
321	   single payload type code for the generic packetization can be used
322	   for all negotiated media formats.  This requires identifying the
323	   original payload type code of the frame's negotiated media format,
324	   called the associated payload type (APT) hereunder.  The APT value is
325	   the payload type code of the associated format passed to the generic
326	   Media Packetizer before any transformation is applied.

328	   The APT value is sent in a dedicated header extension.  The payload
329	   of this header extension can be encoded using either the one-byte or
330	   two-byte header defined in [RFC5285].  Figures 3 and 4 show examples
331	   of each of these formats.

333	                       0                   1
334	                       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
335	                      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
336	                      |  ID   | len=0 |S|     APT     |
337	                      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

339	   Figure 3: Frame Associated Payload Type Encoding Using the One-Byte
340	   Header Format

342	        0                   1                   2                   3
343	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
344	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
345	       |      ID       |     len=1     |S|     APT     |    0 (pad)    |
346	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

348	   Figure 4: Frame Associated Payload Type Encoding Using the Two-Byte
349	   Header Format

351	   The APT value is the associated payload type value.  The S bit
352	   indicates if the media stream can be forwarded safely starting from
353	   this RTP packet.  Typically, it will be set to 1 on the first RTP
354	   packet of an intra video frame and in all RTP audio packets.
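
The two encodings in Figures 3 and 4 can be sketched as below. This is an illustrative reading of the figures, not a normative encoder: the function names are assumptions, and the trailing zero byte in the two-byte form reproduces the padding shown in Figure 4.

```python
from typing import Tuple


def pack_apt_byte(apt: int, start: bool) -> int:
    """Build the data byte: S bit in the most significant bit, APT in the low 7 bits."""
    if not 0 <= apt <= 127:
        raise ValueError("APT must fit in 7 bits")
    return (0x80 if start else 0x00) | apt


def unpack_apt_byte(value: int) -> Tuple[int, bool]:
    """Return (APT, S) from the data byte."""
    return value & 0x7F, bool(value & 0x80)


def encode_one_byte_form(ext_id: int, apt: int, start: bool) -> bytes:
    """One-byte header form: 4-bit ID, 4-bit len=0 (one data byte), then the data byte."""
    if not 1 <= ext_id <= 14:
        raise ValueError("one-byte header IDs range from 1 to 14")
    return bytes([(ext_id << 4) | 0, pack_apt_byte(apt, start)])


def encode_two_byte_form(ext_id: int, apt: int, start: bool) -> bytes:
    """Two-byte header form: 8-bit ID, 8-bit len=1, the data byte, one pad byte."""
    if not 1 <= ext_id <= 255:
        raise ValueError("two-byte header IDs range from 1 to 255")
    return bytes([ext_id, 1, pack_apt_byte(apt, start), 0])
```

For example, with extension ID 4, APT 96 and S set, the one-byte form is the two bytes 0x40 0xE0.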

356	   Receivers MUST be ready to receive RTP packets with different
357	   associated payload types in the same way they would receive different
358	   payload type codes on the RTP packets.

360	   The URI for declaring this header extension in an extmap attribute is
361	   "urn:ietf:params:rtp-hdrext:associated-payload-type".

363	5.  SDP Negotiation

365	   To use the RTP generic packetization, the SDP Offer/Answer exchange
366	   MUST negotiate: - The payload type of the negotiated codec format -
367	   The generic payload type - The associated payload type header
368	   extension

370	   Only the negotiated payload types are allowed to be used as
371	   associated payload types.  Figure 5 illustrates an SDP that
372	   negotiates exchange of video using either VP8 or VP9 codecs with the
373	   possibility to use the generic packetization.  In this example, RTX
374	   is also negotiated and will be applied normally on each associated
375	   payload type.

377	   m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101
378	   c=IN IP4 0.0.0.0
379	   a=rtcp:9 IN IP4 0.0.0.0
380	   a=setup:actpass
381	   a=mid:1
382	   a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
383	   a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
384	   a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
385	   a=extmap:4 urn:ietf:params:rtp-hdrext:associated-payload-type
386	   a=sendrecv
387	   a=rtpmap:96 vp9/90000
388	   a=rtpmap:97 vp8/90000
389	   a=rtpmap:98 generic/90000
390	   a=rtpmap:99 rtx/90000
391	   a=fmtp:99 apt=96
392	   a=rtpmap:100 rtx/90000
393	   a=fmtp:100 apt=97
394	   a=rtpmap:101 rtx/90000
395	   a=fmtp:101 apt=98

397	   Figure 5: SDP example negotiating the generic payload type and
398	   related header extension for video
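
A receiver-side check of the three negotiation requirements could look like the sketch below. It is a simplified illustration: the helper names are hypothetical, the parser only reads a=rtpmap and a=extmap lines of a single media section, and excluding "generic" and "rtx" from the set of allowed associated payload types is an assumption made for this example.

```python
import re
from typing import Dict, Set, Tuple

APT_EXTENSION_URI = "urn:ietf:params:rtp-hdrext:associated-payload-type"

# Abbreviated from Figure 5.
EXAMPLE_SDP = """\
m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101
a=extmap:4 urn:ietf:params:rtp-hdrext:associated-payload-type
a=rtpmap:96 vp9/90000
a=rtpmap:97 vp8/90000
a=rtpmap:98 generic/90000
a=rtpmap:99 rtx/90000
a=fmtp:99 apt=96
"""


def parse_media_section(sdp: str) -> Tuple[Dict[int, str], Dict[str, int]]:
    """Return (payload type -> encoding name, extension URI -> extension ID)."""
    rtpmap: Dict[int, str] = {}
    extmap: Dict[str, int] = {}
    for raw_line in sdp.splitlines():
        line = raw_line.strip()
        match = re.match(r"a=rtpmap:(\d+)\s+([^/\s]+)/", line)
        if match:
            rtpmap[int(match.group(1))] = match.group(2).lower()
            continue
        match = re.match(r"a=extmap:(\d+)(?:/\S+)?\s+(\S+)", line)
        if match:
            extmap[match.group(2)] = int(match.group(1))
    return rtpmap, extmap


def allowed_associated_payload_types(sdp: str) -> Set[int]:
    """Payload types usable as APT, or an empty set if negotiation is incomplete."""
    rtpmap, extmap = parse_media_section(sdp)
    if APT_EXTENSION_URI not in extmap or "generic" not in rtpmap.values():
        return set()
    return {pt for pt, name in rtpmap.items() if name not in ("generic", "rtx")}
```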

400	6.  SFU Packet Selection

402	   SFUs need to have a basic understanding of each frame they receive so
403	   they can decide to forward it or not and to which endpoint.  They
404	   might need similar information to support media content recording.
405	   This information is either generic to a group of frames (called a
406	   stream hereafter) or specific to each frame.

408	   The information is transmitted as an RTP header extension as the RTP
409	   packet payload should be treated as opaque by the SFU.  This is
410	   especially necessary if the payload is end-to-end encrypted.  The
411	   amount of information should be limited to what is strictly necessary
412	   for the SFU task since the SFU is not always as trusted as individual
413	   peers.
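
As one illustration of a header-only forwarding decision, an SFU could use the S bit of the associated payload type header extension (Section 4) to decide where a newly subscribed receiver can start. The sketch below is an assumption about how such logic might be written; HeaderInfo and forward_from_safe_point are hypothetical names, and the payload is never inspected.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class HeaderInfo:
    sequence_number: int
    apt: int     # associated payload type from the header extension
    start: bool  # S bit: stream can be forwarded safely from this packet


def forward_from_safe_point(packets: Iterable[HeaderInfo]) -> Iterator[HeaderInfo]:
    """Drop packets until one has the S bit set, then forward everything."""
    started = False
    for packet in packets:
        if not started and packet.start:
            started = True
        if started:
            yield packet
```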

415	   For audio, configuration information such as Opus TOC might be
416	   useful.  For video, configuration information might include: - Stream
417	   configuration information: resolution, quality, frame rate... - Codec
418	   specific configuration information: codec profile like profile_idc...
419	   - Frame specific information: whether the stream is decodable when
420	   starting from this frame, whether the frame is skippable...

422	   For video content, this information can be sent using a Dependency
423	   Descriptor header extension.  In that case, the first RTP packet of
424	   the frame will have its start_of_frame equal to 1 and the last packet
425	   will have its end_of_frame equal to 1.

427	7.  Redundancy Techniques Considerations

429	   The solution described in this document is expected to integrate well
430	   with the existing RTP ecosystem.  This section describes how the
431	   generic packetizer can be used jointly with existing techniques that
432	   mitigate the effects of unreliable transports.

434	7.1.  Retransmission Techniques

436	   [RFC4588] defines a retransmission payload format (RTX) that can be
437	   used in case of packet loss.  As defined in [RFC4588], RTX is able to
438	   handle any payload format, including the format described in this
439	   document.  Given that RTX preserves both RTP packet payload and
440	   headers, the receiver will be able to identify the payload type of
441	   the recovered packet and whether generic packetization is used.  RTX
442	   will also allow recovering RTP header extensions that convey
443	   information on the media content itself.
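
The reason RTX works with an opaque payload can be seen from its payload layout in [RFC4588]: a 16-bit original sequence number (OSN) followed by the unmodified original payload. The sketch below covers the payload only; the function names are assumptions, and RTP header and header extension handling are left out.

```python
import struct
from typing import Tuple


def build_rtx_payload(original_sequence_number: int, original_payload: bytes) -> bytes:
    """Prefix the original payload with its 16-bit sequence number (network order)."""
    return struct.pack("!H", original_sequence_number & 0xFFFF) + original_payload


def parse_rtx_payload(rtx_payload: bytes) -> Tuple[int, bytes]:
    """Return (original sequence number, original payload)."""
    if len(rtx_payload) < 2:
        raise ValueError("RTX payload is shorter than the OSN field")
    (osn,) = struct.unpack("!H", rtx_payload[:2])
    return osn, rtx_payload[2:]
```

Because the original payload is copied byte for byte, a generic (for example encrypted) fragment is recovered exactly and can be reassembled as if it had arrived on the original stream.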

445	7.2.  Forward Error Correction (FEC) Techniques

447	   FEC is another technique used in RTP Media Chains to protect media
448	   content against packet loss.  [RFC5109] defines a payload format
449	   used to transmit FEC data that protects specific packets.

451	   FEC may protect some parts of the media content more than others.
452	   For instance, intra video frame encoded data or important network
453	   abstraction layer units (NALUs) like SPS/PPS may be more protected.
454	   With a post-encoder transform and the use of a generic packetization,
455	   the granularity of the recovery mechanism is no longer at the NALU
456	   level but at the level of the frame generated by the post-encoder
457	   transform.  In case an SVC codec is used, each spatial layer will be
458	   processed as an independent frame.  In that case, base layers can be
459	   protected more heavily than higher resolution layers.
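
To illustrate frame-level protection, the sketch below computes a single XOR parity over the packet payloads of one frame and recovers one lost payload. It is a deliberately simplified stand-in and does not implement the [RFC5109] packet format: the function names are hypothetical, and carrying the XOR of the payload lengths alongside the parity is a choice made for this example.

```python
from typing import List, Tuple


def xor_parity(payloads: List[bytes]) -> Tuple[int, bytes]:
    """Return (XOR of payload lengths, XOR of zero-padded payloads)."""
    size = max((len(p) for p in payloads), default=0)
    parity = bytearray(size)
    length_xor = 0
    for payload in payloads:
        length_xor ^= len(payload)
        for index, value in enumerate(payload):
            parity[index] ^= value
    return length_xor, bytes(parity)


def recover_missing(received: List[bytes], length_xor: int, parity: bytes) -> bytes:
    """Recover the single missing payload from the others plus the parity."""
    data = bytearray(parity)
    missing_length = length_xor
    for payload in received:
        missing_length ^= len(payload)
        for index, value in enumerate(payload):
            data[index] ^= value
    return bytes(data[:missing_length])
```

A sender wanting heavier protection for a base layer could, for instance, compute parity over smaller groups of that layer's packets than for higher layers.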

461	7.3.  Redundant Audio Data Techniques

463	   As defined in [RFC7656], RTP-based redundancy is a
464	   transformation that generates redundant or repair packets sent out as
465	   a Redundancy RTP Stream to mitigate Network Transport impairments,
466	   like packet loss and delay.

468	   [RFC2198] defines a payload format for sending the same audio data
469	   encoded multiple times at different quality levels.  This allows a
470	   receiver to use a lower-quality encoding of the audio data should the
471	   higher-quality encoding be lost during transmission.

473	   If a Media Transformation is in use, both the primary and redundant
474	   encodings must be transformed independently and the redundant packet
475	   created normally.  As the RTP headers present in the redundant packet
476	   are only applicable to the primary encoding, if the payload type for
477	   a redundant encoding block is mapped to the generic packetizer, the
478	   value of the associated payload type for the primary encoding is
479	   applied to the redundant encoding block as well.
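
The sketch below builds an [RFC2198] payload with one redundant block followed by the primary block, each holding an already transformed frame. The function name and parameters are assumptions; the header layout (F bit, 7-bit block payload type, 14-bit timestamp offset, 10-bit block length, then a one-byte final header) follows [RFC2198].

```python
import struct


def build_red_payload(redundant: bytes, redundant_pt: int, timestamp_offset: int,
                      primary: bytes, primary_pt: int) -> bytes:
    """Pack one redundant block and one primary block into a RED payload."""
    if not 0 <= redundant_pt <= 127 or not 0 <= primary_pt <= 127:
        raise ValueError("payload types must fit in 7 bits")
    if not 0 <= timestamp_offset < (1 << 14):
        raise ValueError("timestamp offset must fit in 14 bits")
    if len(redundant) >= (1 << 10):
        raise ValueError("redundant block length must fit in 10 bits")
    # F=1: another header follows this one.
    redundant_header = ((1 << 31) | (redundant_pt << 24)
                        | (timestamp_offset << 10) | len(redundant))
    # F=0: last header, so only the payload type is present.
    primary_header = bytes([primary_pt])
    return struct.pack("!I", redundant_header) + primary_header + redundant + primary
```

When both block payload types map to the generic packetization, the single associated payload type carried in the header extension applies to both blocks, as described above.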

481	8.  Alternatives

483	   Various alternatives can be used to implement and negotiate generic
484	   packetization.  This section describes a few additional alternatives.
485	   This section is to be removed before finalization of the document.

487	8.1.  Generic Packetization With In-Payload APT

489	   Instead of using an RTP header extension to convey the APT value,
490	   the value is prepended to the RTP payload itself.  As the value
491	   cannot change for a whole frame, it is prepended only to the first
492	   packet generated for the frame.  This removes the need to negotiate
493	   a dedicated header extension, but may require the SFU to update the
494	   payload when sending or recording content.
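
This alternative can be sketched as follows. The text does not specify the on-the-wire form, so the single prepended byte holding the APT is an assumption made for this example, as are the function names.

```python
from typing import List, Tuple


def prepend_apt(fragments: List[bytes], apt: int) -> List[bytes]:
    """Prepend a one-byte APT (assumed encoding) to the first fragment only."""
    if not 0 <= apt <= 127:
        raise ValueError("APT must fit in 7 bits")
    if not fragments:
        return []
    return [bytes([apt]) + fragments[0]] + list(fragments[1:])


def strip_apt(fragments: List[bytes]) -> Tuple[int, bytes]:
    """Return (APT, reassembled frame) from the fragments of one frame."""
    if not fragments or not fragments[0]:
        raise ValueError("first fragment must carry the APT byte")
    return fragments[0][0], fragments[0][1:] + b"".join(fragments[1:])
```

The cost noted above is visible here: an SFU or recorder that needs the frame content without the APT has to rewrite the first payload of every frame.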

496	8.2.  A Payload Type for Generic Packetization AND Media Format

498	   The payload type is negotiated in the SDP so as to identify both the
499	   negotiated codec format and the generic packetization use.  There is
500	   no network cost but this increases the number of payload types used
501	   in the SDP.

503	   m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101 102 103
504	   c=IN IP4 0.0.0.0
505	   a=rtcp:9 IN IP4 0.0.0.0
506	   a=setup:actpass
507	   a=mid:1
508	   a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
509	   a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
510	   a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
511	   a=sendrecv
512	   a=rtpmap:96 vp9/90000
513	   a=rtpmap:97 generic/90000
514	   a=fmtp:97 apt=96
515	   a=rtpmap:98 vp8/90000
516	   a=rtpmap:99 generic/90000
517	   a=fmtp:99 apt=98
518	   a=rtpmap:100 rtx/90000
519	   a=fmtp:100 apt=96
520	   a=rtpmap:101 rtx/90000
521	   a=fmtp:101 apt=97
522	   a=rtpmap:102 rtx/90000
523	   a=fmtp:102 apt=98
524	   a=rtpmap:103 rtx/90000
525	   a=fmtp:103 apt=99

527	   Figure 6: SDP example negotiating a payload type for format and
528	   generic packetization

530	   A variation of this approach is to consider defining generic payload
531	   types, each of them having an identified codec format.

533	   m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99
534	   c=IN IP4 0.0.0.0
535	   a=rtcp:9 IN IP4 0.0.0.0
536	   a=setup:actpass
537	   a=mid:1
538	   a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
539	   a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
540	   a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
541	   a=sendrecv
542	   a=rtpmap:96 generic/90000
543	   a=fmtp:96 codec=vp9
544	   a=rtpmap:97 generic/90000
545	   a=fmtp:97 codec=vp8
546	   a=rtpmap:98 rtx/90000
547	   a=fmtp:98 apt=96
548	   a=rtpmap:99 rtx/90000
549	   a=fmtp:99 apt=97

550	   Figure 7: SDP example negotiating generic payload types that each
551	   identify a codec format

553	8.3.  A RTP Header To Choose Packetization

555	   An RTP header extension can be used to flag content as opaque so that
556	   the receiver knows whether or not to use the generic packetization.
557	   As for the APT header extension, the RTP header extension may not
558	   need to be sent for every packet; it could for instance be sent for
559	   the first packet of every intra video frame.  The main advantage of
560	   this approach is the reduced impact on SDP negotiation.

562	   m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99
563	   c=IN IP4 0.0.0.0
564	   a=rtcp:9 IN IP4 0.0.0.0
565	   a=setup:actpass
566	   a=mid:1
567	   a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
568	   a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
569	   a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
570	   a=extmap:4 urn:ietf:params:rtp-hdrext:generic-packetization-use
571	   a=sendrecv
572	   a=rtpmap:96 vp9/90000
573	   a=rtpmap:97 vp8/90000
574	   a=rtpmap:98 rtx/90000
575	   a=fmtp:98 apt=96
576	   a=rtpmap:99 rtx/90000
577	   a=fmtp:99 apt=97

579	   Figure 8: SDP example negotiating generic packetization as RTP header
580	   extension

582	9.  Security Considerations

584	   RTP packets using the payload format defined in this specification
585	   are subject to the general security considerations discussed in
586	   [RFC3550].  It is not expected that the proposed solutions (generic
587	   packetization and header extension) presented in this document can
588	   create new security threats.  The use and implementation of RTP Media
589	   Chains containing Media Transformers need to be done carefully.
590	   It is important to refer to the security considerations discussed in
591	   [SFrame] and [WebRTCInsertableStreams].  In particular Media
592	   Transformers on the receiver side need to be prepared to receive
593	   arbitrary content, like decoders already do.  Similarly, since Media
594	   Transformers can be implemented as JavaScript in browsers, RTP
595	   Packetizers should be prepared to receive arbitrary content.

597	10.  IANA Considerations

599	   Two new media subtypes have been registered with IANA, as described
600	   in this section.

602	10.1.  Registration of audio/generic

604	   Type name: audio

606	   Subtype name: generic

608	   Required parameters: none

610	   Optional parameters: none

612	   Encoding considerations: This format is framed (see Section 4.8 in
613	   the template document) and contains binary data.

615	   Security considerations: TBD.

617	   Interoperability considerations: TBD

619	   Published specification: TBD.

621	   Applications that use this media type: TBD.

623	   Additional information: none

625	   Intended usage: COMMON

627	   Restrictions on usage: TBD

629	   Author:

631	   Change controller:

633	11.  Registration of video/generic

635	   Type name: video

637	   Subtype name: generic

639	   Required parameters: none

641	   Optional parameters: none

643	   Encoding considerations: This format is framed (see Section 4.8 in
644	   the template document) and contains binary data.

646	   Security considerations: TBD.

648	   Interoperability considerations: TBD

650	   Published specification: TBD.

652	   Applications that use this media type: TBD.

654	   Additional information: none

656	   Intended usage: COMMON

658	   Restrictions on usage: TBD

660	   Author:

662	   Change controller:

664	12.  References

666	12.1.  Normative References

668	   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
669	              Jacobson, "RTP: A Transport Protocol for Real-Time
670	              Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
671	              July 2003, <https://www.rfc-editor.org/info/rfc3550>.

673	   [RFC3551]  Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
674	              Video Conferences with Minimal Control", STD 65, RFC 3551,
675	              DOI 10.17487/RFC3551, July 2003,
676	              <https://www.rfc-editor.org/info/rfc3551>.

678	   [RFC5285]  Singer, D. and H. Desineni, "A General Mechanism for RTP
679	              Header Extensions", RFC 5285, DOI 10.17487/RFC5285, July
680	              2008, <https://www.rfc-editor.org/info/rfc5285>.

682	   [RFC7656]  Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and
683	              B. Burman, Ed., "A Taxonomy of Semantics and Mechanisms
684	              for Real-Time Transport Protocol (RTP) Sources", RFC 7656,
685	              DOI 10.17487/RFC7656, November 2015,
686	              <https://www.rfc-editor.org/info/rfc7656>.

688	12.2.  Informative References

690	   [RFC2198]  Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
691	              Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse-
692	              Parisis, "RTP Payload for Redundant Audio Data", RFC 2198,
693	              DOI 10.17487/RFC2198, September 1997,
694	              <https://www.rfc-editor.org/info/rfc2198>.

696	   [RFC4588]  Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R.
697	              Hakenberg, "RTP Retransmission Payload Format", RFC 4588,
698	              DOI 10.17487/RFC4588, July 2006,
699	              <https://www.rfc-editor.org/info/rfc4588>.

701	   [RFC5109]  Li, A., Ed., "RTP Payload Format for Generic Forward Error
702	              Correction", RFC 5109, DOI 10.17487/RFC5109, December
703	              2007, <https://www.rfc-editor.org/info/rfc5109>.

705	   [RFC6184]  Wang, Y., Even, R., Kristensen, T., and R. Jesup, "RTP
706	              Payload Format for H.264 Video", RFC 6184,
707	              DOI 10.17487/RFC6184, May 2011,
708	              <https://www.rfc-editor.org/info/rfc6184>.

710	   [SFrame]   "Secure Frame (SFrame)", n.d.,
711	              <https://tools.ietf.org/html/draft-omara-sframe>.

713	   [WebRTCInsertableStreams]
714	              "WebRTC Insertable Media using Streams", n.d.,
715	              <https://w3c.github.io/webrtc-insertable-streams>.

717	Authors' Addresses

719	   Sergio Garcia Murillo
720	   CoSMo Software

722	   Email: sergio.garcia.murillo@cosmosoftware.io

724	   Alexandre Gouaillard
725	   CoSMo Software

727	   Email: alex.gouaillard@cosmosoftware.io