idnits 2.17.1 

draft-aboba-avtcore-sfu-rtp-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (6 July 2015) is 3215 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Unused Reference: 'I-D.grange-vp9-bitstream' is defined on line 539, but
     no explicit reference was found in the text

  == Unused Reference: 'I-D.ietf-avtcore-rtp-multi-stream' is defined on line
     544, but no explicit reference was found in the text

  == Unused Reference: 'I-D.ietf-payload-rtp-h265' is defined on line 566,
     but no explicit reference was found in the text

  == Outdated reference: A later version (-11) exists of
     draft-ietf-avtcore-rtp-multi-stream-07

  == Outdated reference: A later version (-08) exists of
     draft-ietf-avtext-rtp-grouping-taxonomy-07

  == Outdated reference: A later version (-10) exists of
     draft-ietf-avtext-rtp-stream-pause-08

  == Outdated reference: A later version (-15) exists of
     draft-ietf-payload-rtp-h265-13

  == Outdated reference: A later version (-17) exists of
     draft-ietf-payload-vp8-16

  == Outdated reference: A later version (-26) exists of
     draft-ietf-rtcweb-rtp-usage-25


     Summary: 0 errors (**), 0 flaws (~~), 10 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	AVTCORE Working Group                                           B. Aboba
3	INTERNET-DRAFT                                     Microsoft Corporation
4	Category: Informational
5	Expires: January 6, 2016                                     6 July 2015

7	                 Codec-Independent Selective Forwarding
8	                   draft-aboba-avtcore-sfu-rtp-00.txt

10	Abstract

12	   Selective Forwarding Units (SFUs) supporting Scalable Video Coding
13	   (SVC) typically parse the RTP payload in the forwarding plane, and
14	   often utilize codec-specific control messages within the control
15	   plane.  As a result, the control and/or forwarding planes of these
16	   implementations need to be modified (sometimes substantially) in
17	   order to support additional video codecs.  With SFUs now supporting
18	   VP8 in addition to H.264/SVC, and with additional video codecs
19	   expected to become popular, the inflexibility of SFU designs that
20	   depend on RTP payload parsing has become increasingly apparent.  In
21	   addition, these designs cannot function where the RTP payload is
22	   inaccessible, such as when it is encrypted with a key not available
23	   to the SFU.  Based on a summary of SFU implementation practice, this
24	   document develops requirements for codec-independent SFUs.

26	Status of This Memo

28	   This Internet-Draft is submitted in full conformance with the
29	   provisions of BCP 78 and BCP 79.

31	   Internet-Drafts are working documents of the Internet Engineering
32	   Task Force (IETF).  Note that other groups may also distribute
33	   working documents as Internet-Drafts.  The list of current Internet-
34	   Drafts is at http://datatracker.ietf.org/drafts/current/.

36	   Internet-Drafts are draft documents valid for a maximum of six months
37	   and may be updated, replaced, or obsoleted by other documents at any
38	   time.  It is inappropriate to use Internet-Drafts as reference
39	   material or to cite them other than as "work in progress."

41	   This Internet-Draft will expire on January 6, 2016.

43	Copyright Notice

45	   Copyright (c) 2015 IETF Trust and the persons identified as the
46	   document authors.  All rights reserved.

48	   This document is subject to BCP 78 and the IETF Trust's Legal
49	   Provisions Relating to IETF Documents
50	   (http://trustee.ietf.org/license-info) in effect on the date of
51	   publication of this document.  Please review these documents
52	   carefully, as they describe your rights and restrictions with respect
53	   to this document.  Code Components extracted from this document must
54	   include Simplified BSD License text as described in Section 4.e of
55	   the Trust Legal Provisions and are provided without warranty as
56	   described in the Simplified BSD License.

58	Table of Contents

60	   1.  Introduction
61	       1.1.  Performance
62	       1.2.  Terminology
63	   2.  Control Plane
64	       2.1.  RTCP
65	   3.  Forwarding Plane
66	       3.1.  RTP header
67	       3.2.  RTP payload
68	       3.3.  Problematic payload fields
69	   4.  Security Considerations
70	   5.  IANA Considerations
71	   6.  References
72	       6.1.  Informative References
73	   Acknowledgments
74	   Authors' Addresses

76	1.  Introduction

78	   Selective Forwarding Units (SFUs) supporting simulcast and Scalable
79	   Video Coding (SVC) are widely deployed for use in video conferencing.
80	   While initial deployments focused on H.264/SVC [H.264] [RFC6190],
81	   SFUs supporting the VP8 [RFC6386] [I-D.ietf-payload-vp8] codec have
82	   also been deployed and implementations supporting VP9 [I-D.grange-
83	   vp9-bitstream] [I-D.uberti-payload-vp9] and HEVC [HEVC] [I-D.ietf-
84	   payload-rtp-h265] are expected.

86	   Given the growing demand for SFUs supporting multiple codecs, the
87	   disadvantages of codec-specific forwarding and control planes have
88	   become apparent.  These include:

90	   1. Forwarding plane maintenance.  SFU implementations originally
91	   developed for a specific codec such as H.264/SVC typically require
92	   updates and testing to support each additional codec.  Where the
93	   original forwarding logic depended on RTP payload fields in the
94	   original supported codec that have no analog within additional video
95	   codecs (such as the PACSI NAL unit in H.264/SVC), forwarding path re-
96	   design or per-codec forwarding logic may be required, increasing
97	   complexity and potentially adversely impacting implementation
98	   performance.

100	   2. Control plane maintenance.  SFU implementations developed for a
101	   specific codec often utilize codec-specific control messages (such as
102	   the SEI layer drop message [RFC6190]) that may have no analog within
103	   other video codecs.  In addition, SFU implementations may depend on
104	   support for RTCP feedback messages [RFC4585] outside the mainstream
105	   [I-D.ietf-rtcweb-rtp-usage], thereby requiring control path re-design
106	   and/or support for additional feedback messages in order to support
107	   additional codecs.

109	   3. Interoperability issues.  Experience has shown that SFUs that
110	   depend on the RTP payload can be sensitive to differences in the
111	   encoder implementation.  As an example, implementations supporting
112	   H.264/SVC [RFC6190] vary in their treatment of the PACSI NAL unit.
113	   While some implementations require the PACSI NAL unit to be present
114	   in each packet and to be located first within the RTP payload, other
115	   implementations do not impose these restrictions, and some do not
116	   generate or parse the PACSI NAL unit at all.  Due to these
117	   differences, even implementations that are compatible at the
118	   bitstream level [H.264] and are generally conformant to the RTP
119	   payload specification [RFC6190] can fail to interoperate.

121	   4. Payload snooping.  While all SFUs require access to the IP header
122	   as well as RTCP, and many require the ability to modify RTP header
123	   fields and to insert, delete or modify RTP header extensions,
124	   codec-dependent SFU implementations also require access to the RTP
125	   payload.  Codec-dependent SFUs therefore have access to audio and
126	   video data in addition to metadata.  This is potentially problematic
127	   for multi-tenant SFUs where the same SFU may be shared by multiple
128	   customers, each of whom can require assurance that the contents of
129	   the RTP payload are available only to endpoints under their control.

131	   To address these problems, this document develops requirements for
132	   codec-independent SFU operation, both within the control and
133	   forwarding planes.

135	1.1.  Performance

137	   This document does not take a position on whether codec-independent
138	   SFUs offer enduring performance benefits.  By itself, substituting
139	   RTP header extension(s) for RTP payload fields within the SFU
140	   forwarding plane seems unlikely to have a long-term impact on SFU
141	   performance, although in the short-term, forwarding performance might
142	   be improved in some implementations.

144	   As described in Section 3.1, SFU implementations frequently modify
145	   one or more RTP [RFC3550] header fields covered by the SRTP [RFC3711]
146	   authentication tag, requiring the authentication tag to be recomputed
147	   and modified prior to forwarding.  In addition, where an SRTP cipher-
148	   suite is negotiated that depends upon one or more modified RTP header
149	   fields (such as the SSRC field), it is also necessary for the SFU to
150	   decrypt and then re-encrypt SRTP packets.  As a result, SFUs
151	   modifying RTP header fields cannot reduce cryptographic operations
152	   where SRTP per-packet integrity protection and confidentiality are
153	   desired.

155	   Nevertheless, measurements presented in [Hellwagner] indicate
156	   reductions in forwarding latency and improvements in performance from
157	   removing cryptographic operations from the forwarding plane.  Such
158	   savings might be possible if only per-packet integrity protection
159	   were utilized within SRTP.  Investigating the security implications
160	   of this (which would involve reliance on end-to-end confidentiality
161	   and integrity protection applied to the RTP payload) is outside the
162	   scope of this document.

164	1.2.  Terminology

166	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
167	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
168	   document are to be interpreted as described in [RFC2119].

170	   "media aware network element (MANE)" is defined as in [RFC6184]
171	   Section 4.1:

173	      A network element, such as a middle-box, selective forwarding
174	      unit, or application layer gateway that is capable of parsing
175	      certain aspects of the RTP payload headers or the RTP payload and
176	      reacting to their contents.

178	   "Selective Forwarding Unit" is defined as in [I-D.ietf-avtcore-rtp-
179	   topologies-update] Section 3.7:

181	      The selective forwarding middle-box has been introduced in
182	      recently developed videoconferencing systems in conjunction with,
183	      and to capitalize on, scalable video coding as well as
184	      simulcasting...  In both scalable coding and simulcast cases the
185	      video signal is represented by a set of two or more bitstreams,
186	      providing a corresponding number of distinct fidelity points.  The
187	      middle-box selects which parts of a scalable bitstream (or which
188	      bitstream, in the case of simulcasting) to forward to each of the
189	      receiving End Points.  The decision may be driven by a number of
190	      factors, such as available bit rate, desired layout, etc.
191	      Contrary to transcoding MCUs, these "Selective Forwarding Units"
192	      (SFUs) have extremely low delay, and provide features that are
193	      typically associated with high-end systems (personalized layout,
194	      error localization) without any signal processing at the middle-
195	      box.  They are also capable of scaling to a large number of
196	      concurrent users, and--due to their very low delay-- can also be
197	      cascaded.

199	   A "Codec Independent SFU" is an SFU that only utilizes information
200	   from the RTP header or header extensions, but does not parse or
201	   modify the RTP payload.  This implies that, unlike a MANE, a codec-
202	   independent SFU cannot do NAL-unit re-packetization, and needs to
203	   rely on IP fragmentation in the event of an MTU mismatch.  In
204	   addition, codec-independent SFUs cannot source or interpret codec-
205	   specific control messages, such as the SEI layer-drop message defined
206	   in H.264/SVC [RFC6190].

208	   Multiple RTP streams on a Single Transport (MRST): Multiple RTP
209	   streams on a single transport, carrying either a single Scalable
210	   Video Coding (SVC) bitstream or multiple simulcast streams.

212	   Multiple RTP streams on Multiple Transports (MRMT):  Multiple RTP
213	   streams carrying a single Scalable Video Coding (SVC) bitstream on
214	   Multiple Transports.

216	   RTP stream: See [I-D.ietf-avtext-rtp-grouping-taxonomy].  Within this
217	   document, one RTP stream can be utilized to transport one or more
218	   sub-layers.

220	   Single RTP stream on a Single Transport (SRST):  Single RTP stream
221	   carrying a single Scalable Video Coding (SVC) bitstream on a Single
222	   (Media) Transport.

224	   Transport: A tuple comprising the source and destination IP address
225	   and ports.

227	2.  Control Plane

229	2.1.  RTCP

231	   Within the Control Plane, SFU implementations require cleartext
232	   access to RTCP messages as well as to information provided within
233	   signaling.  This implies that an SFU supporting SRTP MUST have access
234	   to SRTCP master keys.

236	   RTCP messages commonly used by SFUs include the Sender Report (SR),
237	   Receiver Report (RR), SDES and BYE messages, as well as feedback
238	   messages.  Note that even though SFU implementations need to decrypt
239	   RTCP messages, designs that do not modify SSRCs can forward RTCP
240	   messages such as SDES without modification (or even parsing), which
241	   has the additional potential advantage of enabling SFUs to forward
242	   RTCP messages they do not understand.

244	2.1.1.  Feedback Messages

246	   RTCP feedback messages widely supported by existing SFU
247	   implementations include Generic NACK [RFC4585], PLI [RFC4585] and FIR
248	   [RFC5104].  Generic NACK, defined in [RFC4585] Section 6.2.1, and the
249	   retransmission payload format, defined in [RFC4588], are particularly
250	   effective where temporal scalability is supported, making it possible
251	   to repair losses within the base layer of an SVC bitstream in time to
252	   enable decoding of the succeeding base layer P frame.

254	   FIR, defined in [RFC5104] Section 4.3.1, is frequently employed
255	   within SFUs when switching between video streams.  When the SFU
256	   decides to forward an alternative simulcast stream to a participant,
257	   or when an SFU decides to switch from forwarding video stream(s) sent
258	   by a now-quiet participant to stream(s) originated by an active
259	   speaker, an FIR is sent by the SFU to the sender of the switched-in
260	   RTP stream.  The sender responds with an I-frame that is subsequently
261	   forwarded to the recipients, ensuring that they can decode subsequent
262	   frames within the switched-in RTP stream.
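
   As an illustration of the switch procedure, the FIR message sent by
   the SFU might be built as sketched below.  The layout follows
   [RFC5104] Section 4.3.1 (payload-specific feedback, PT=206, FMT=4);
   the function and parameter names are hypothetical, and the sketch
   omits the compound RTCP packet in which the FIR would normally be
   carried.

```python
import struct

RTCP_PT_PSFB = 206  # payload-specific feedback
FMT_FIR = 4         # Full Intra Request


def build_fir(sender_ssrc, target_ssrc, command_seq):
    """Build a single-entry FIR feedback message.

    sender_ssrc: SSRC the SFU uses as packet sender (assumed to exist).
    target_ssrc: SSRC of the switched-in RTP stream.
    command_seq: 8-bit command sequence number, incremented per new request.
    """
    first_octet = (2 << 6) | FMT_FIR  # V=2, P=0, FMT=4
    length_words = 2 + 2 * 1          # length field for one FCI entry
    header = struct.pack("!BBH", first_octet, RTCP_PT_PSFB, length_words)
    # The "SSRC of media source" field is unused by FIR and set to 0.
    ssrcs = struct.pack("!II", sender_ssrc, 0)
    # FCI entry: target SSRC, 8-bit sequence number, 24 reserved bits.
    fci = struct.pack("!II", target_ssrc, (command_seq & 0xFF) << 24)
    return header + ssrcs + fci
```

   A caller would typically invoke build_fir() once per switch, and
   reuse the same command sequence number when repeating a request that
   has not yet been answered with an I-frame.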

264	   In contrast, feedback messages such as SLI, defined in [RFC4585]
265	   Section 6.3.2, and RPSI, defined in [RFC4585] Section 6.3.3, are
266	   rarely supported in modern SFU implementations.  SLI is infrequently
267	   supported because Generic NACK provides similar functionality in a
268	   codec-independent manner, while more efficiently supporting non-
269	   contiguous packet loss.

271	   RPSI is rarely supported since it requires that the encoder revert to
272	   use of an alternative reference picture not only for the participant
273	   experiencing the loss but for all participants.  Thus RPSI implies
274	   additional bandwidth usage not only on the link experiencing the
275	   loss, but also on the links utilized by other conference
276	   participants, thereby potentially exacerbating congestion problems.
277	   When temporal scalability is implemented, it is typically possible to
278	   employ retransmission [RFC4588] and/or Forward Error Correction
279	   [RFC5109] to protect against loss within the base layer.  In the
280	   (unlikely) event that these measures are insufficient, PLI and/or FIR
281	   feedback messages can be employed, so that RPSI support is
282	   unnecessary.

284	3.  Forwarding Plane

286	3.1.  RTP header

288	   Most existing SFU implementations modify one or more RTP header
289	   fields, including the payload type (PT), SSRC, CSRC, and sequence
290	   number fields.  SFU implementations may also modify RTP header
291	   extensions such as the abs-send-time extension [ABS-SEND].

293	   In signaling mechanisms such as SDP Offer/Answer [RFC3264], each
294	   conference participant can specify the PT they wish to receive and/or
295	   send for a particular codec.  Since conference participants can
296	   choose different PT values for the same codec, the SFU needs to be
297	   able to modify the PT field before forwarding RTP packets to a
298	   recipient.
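
   The PT rewrite described above touches only the low 7 bits of the
   second octet of the RTP fixed header [RFC3550].  A minimal sketch,
   assuming a per-recipient mapping table (the name pt_map is
   hypothetical) derived from the negotiated signaling:

```python
def rewrite_payload_type(packet, pt_map):
    """Return a copy of an RTP packet with its payload type remapped.

    pt_map maps the sender's PT values to the values the recipient
    negotiated; PTs missing from the map are left unchanged.
    """
    if len(packet) < 12 or packet[0] >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    marker = packet[1] & 0x80          # preserve the M bit
    old_pt = packet[1] & 0x7F
    new_pt = pt_map.get(old_pt, old_pt)
    out = bytearray(packet)
    out[1] = marker | (new_pt & 0x7F)
    return bytes(out)
```

   When SRTP is in use, the rewritten octet is covered by the
   authentication tag, so the tag has to be recomputed after this step.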

300	   Some SFU implementations allocate their own SSRCs, which they place
301	   within RTP packets that they send to recipients, as well as within
302	   RTCP messages that they send.  As a result, the SSRC of RTP packets
303	   they receive is re-written prior to forwarding to conference
304	   participants.  As noted in [I-D.ietf-avtcore-rtp-topologies-update],
305	   SSRC re-writing has implications for the control plane as well as the
306	   forwarding plane.

308	   Some SFUs only forward video streams from participants that are
309	   considered to be active or recent speakers or (to enable access by
310	   hearing or speech impaired users) recent real-time text and/or chat
311	   contributors.  Techniques used for switching are described in
312	   [LASTN].  In order to determine the active or recent speakers, RTP
313	   header extensions such as the client-to-mixer audio level extension
314	   [RFC6464] may be used, and this header extension may therefore need
315	   to be available to the SFU in cleartext.
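
   The use of the audio level extension for speaker selection can be
   sketched as follows.  The data octet layout (a V flag followed by a
   7-bit level expressed in -dBov) is taken from [RFC6464]; the
   RecentSpeakers class, its threshold, and the selection policy are
   assumptions made for illustration and are not the algorithm of
   [LASTN].

```python
from collections import OrderedDict


def parse_audio_level(ext_data):
    """Decode the RFC 6464 data octet into (voice_flag, level).

    level is in -dBov: 0 is loudest, 127 is silence.
    """
    octet = ext_data[0]
    return bool(octet & 0x80), octet & 0x7F


class RecentSpeakers:
    """Track the n most recently active speakers (hypothetical policy)."""

    def __init__(self, n, threshold_dbov=40):
        self.n = n
        self.threshold = threshold_dbov  # assumed loudness cut-off
        self._order = OrderedDict()      # oldest speaker first

    def observe(self, ssrc, ext_data):
        voice, level = parse_audio_level(ext_data)
        # Smaller level values are louder, hence the <= comparison.
        if voice or level <= self.threshold:
            self._order.pop(ssrc, None)
            self._order[ssrc] = True
            while len(self._order) > self.n:
                self._order.popitem(last=False)
        return list(self._order)
```

   Since only the header extension is examined, this selection works
   without access to the audio payload.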

317	   SFUs that allocate their own SSRCs and re-write the SSRC field may
318	   utilize the CSRC field with video streams they forward in order to
319	   indicate a switch of an RTP stream from one contributor to another.
320	   SFUs that do not allocate their own SSRCs typically do not utilize
321	   the CSRC field in this way.

323	   Where an SFU drops one or more layers within an SVC bitstream
324	   utilizing SRST transport, it is necessary to rewrite sequence numbers
325	   as well as to customize RTCP sender reports to reflect the packets
326	   actually sent to individual participants.  Rewriting of sequence
327	   numbers is also required in SFUs that receive simulcast streams with
328	   distinct SSRCs, but forward a single stream with a fixed SSRC, a
329	   practice common in SFUs supporting VP8.
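
   A minimal sketch of the sequence number rewriting described above,
   assuming packets of the incoming stream arrive in order (handling of
   reordered or retransmitted packets is omitted).  The class name is
   hypothetical.

```python
class SeqRewriter:
    """Close the gaps left by dropped packets in a single RTP stream."""

    def __init__(self):
        self.dropped = 0  # packets dropped so far, modulo 2^16

    def process(self, seq, forward):
        """Return the outgoing sequence number, or None if dropped."""
        if not forward:
            self.dropped = (self.dropped + 1) & 0xFFFF
            return None
        # Subtract the running drop count, wrapping at 16 bits.
        return (seq - self.dropped) & 0xFFFF
```

   For example, if packets 65534, 0 and 2 are forwarded while 65535 and
   1 are dropped, the recipient sees 65534, 65535 and 0, with no gaps
   that would trigger spurious NACKs.  The same drop count would feed
   the customized sender reports mentioned above.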

331	   Where MRST transport is used for SVC and/or simulcast, it is not
332	   necessary to rewrite sequence numbers on dropping a layer or
333	   simulcast stream, since each one uses its own SSRC and therefore has
334	   its own sequence number space.  However, in order to avoid rewriting
335	   of sequence numbers on resuming a previously dropped stream, it is
336	   necessary to indicate the drop explicitly, such as via the PAUSED
337	   mechanism described in [I-D.ietf-avtext-rtp-stream-pause] or via the
338	   SEI layer drop message.  The latter requires the SFU to insert a
339	   sequence number within RTP packets originated by the SFU.

341	   One of the implications of SFU modifications to RTP header fields is
342	   that SFUs require access to SRTP master keys so as to be able to
343	   recompute the authentication tag, which covers the modified RTP
344	   header fields.
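
   The tag recomputation can be sketched as below for the HMAC-SHA1
   transform of [RFC3711], where the tag is computed over the
   authenticated portion of the packet followed by the 32-bit rollover
   counter (ROC).  The sketch assumes the session authentication keys
   have already been derived from the master keys, that no MKI is
   present, and that the payload does not need re-encryption; the
   function names are hypothetical.

```python
import hashlib
import hmac
import struct


def srtp_auth_tag(auth_key, authenticated_portion, roc, tag_len=10):
    """HMAC-SHA1 over (header + encrypted payload) || ROC, truncated."""
    message = authenticated_portion + struct.pack("!I", roc)
    return hmac.new(auth_key, message, hashlib.sha1).digest()[:tag_len]


def rewrite_ssrc_and_retag(packet, new_ssrc, in_key, out_key, roc,
                           tag_len=10):
    """Verify the incoming tag, rewrite the SSRC, and append a new tag.

    Assumption: the same ROC applies on both legs, which holds only
    if sequence numbers are not rewritten.
    """
    body, tag = packet[:-tag_len], packet[-tag_len:]
    expected = srtp_auth_tag(in_key, body, roc, tag_len)
    if not hmac.compare_digest(tag, expected):
        raise ValueError("SRTP authentication failed")
    # The SSRC occupies octets 8..11 of the RTP fixed header.
    body = body[:8] + struct.pack("!I", new_ssrc) + body[12:]
    return body + srtp_auth_tag(out_key, body, roc, tag_len)
```

   Note that with cipher-suites whose keystream depends on the SSRC,
   the payload would also have to be decrypted and re-encrypted, a step
   this sketch leaves out.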

346	3.2.  RTP payload

348	   In addition to utilizing information from the RTP header in the
349	   forwarding path, existing SFU implementations also utilize
350	   information present in the RTP payload field. Within the VP8 and
351	   H.264/SVC codecs, this includes the following fields:

353	   Frame type: IDR

355	      Knowledge of the frame type enables the SFU to better protect I-
356	      frames using mechanisms such as QoS, FEC or RTX, as well as to
357	      determine when a stream switch can take place.  Within H.264/SVC,
358	      the frame type can be determined from the idr_flag; within VP8,
359	      the P bit (Inverse key frame flag) can be used.

361	   Discardable

363	      A 'Discardable' bit indicates whether other frames depend on this
364	      one.  For example within an implementation of temporal
365	      scalability, only the highest temporal extension layer would
366	      provide a 'Discardable' indication - all other temporal layers
367	      would not.  Within H.264/SVC this information is provided by the D
368	      bit;  within VP8, it is provided by the N bit.

370	   Layer identifier

372	      Layer identifiers enable the SFU to determine the temporal,
373	      spatial or quality layer of a given packet.  This is used in
374	      forwarding/drop decisions.  Within H.264/SVC this information is
375	      present in the DID (3 bits), QID (4 bits) and TID (3 bits) fields.
376	      Within VP8, it is present in the TID field (2 bits).  Within
377	      H.265, it is present in the LayerID (6 bits) and TID (3 bits)
378	      fields.

380	   TL0PICIDX

382	      This field enables an SFU to determine whether a received frame is
383	      dependent on a base layer frame which the SFU has not previously
384	      received in its entirety.  This is important information since it
385	      may represent the first indication that all or part of a base
386	      layer frame needs to be recovered before the arrival of subsequent
387	      base layer frames that depend on it.  Within H.264/SVC the
388	      IDRPICID and TL0PICIDX fields are present if the Y bit is set.
389	      Within VP8, the TL0PICIDX field is present if the L bit is set.

391	   S/E

393	      The S bit indicates the first packet within a frame for a given
394	      layer.  For example, the S bit would always be set in the first
395	      packet of a base layer frame, as well as in the first packet of a
396	      spatial, quality or temporal enhancement to that base layer frame.
397	      Similarly, the E bit indicates the last packet within a frame for
398	      a given layer.

400	      Note that the E bit does not have an identical meaning to the M
401	      bit within the RTP header, which indicates the last packet in an
402	      access unit.  As an example, where spatial scalability is in use,
403	      the last packet of a base layer frame would have the E bit set,
404	      but would not have the M bit set - the M bit would only be set on
405	      the last packet of the spatial enhancement to that base layer
406	      frame.

408	      Both the S and E bits are defined within H.264/SVC as well as
409	      within H.265.  Only the S bit is defined within VP8;  the lack of
410	      an E bit within VP8 is not an issue since VP8 only supports
411	      temporal scalability.
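
   To make the payload dependence concrete, the following sketch shows
   how a codec-dependent SFU might extract the fields listed above from
   a VP8 packet.  It assumes the payload descriptor layout of the VP8
   RTP payload format as published in RFC 7741, which may differ in
   detail from the draft version cited here; the function names and the
   forwarding rule are hypothetical.

```python
def parse_vp8_descriptor(payload):
    """Extract forwarding-relevant fields from a VP8 RTP payload."""
    i = 0
    b0 = payload[i]
    i += 1
    info = {
        "discardable": bool(b0 & 0x20),   # N bit: non-reference frame
        "start": bool(b0 & 0x10),         # S bit: start of partition
        "partition_id": b0 & 0x07,
        "picture_id": None, "tl0picidx": None,
        "tid": None, "layer_sync": None, "keyidx": None,
        "key_frame": None,
    }
    if b0 & 0x80:                         # X bit: extension octet present
        ext = payload[i]
        i += 1
        if ext & 0x80:                    # I bit: PictureID present
            if payload[i] & 0x80:         # M bit: 15-bit PictureID
                info["picture_id"] = ((payload[i] & 0x7F) << 8) | payload[i + 1]
                i += 2
            else:
                info["picture_id"] = payload[i] & 0x7F
                i += 1
        if ext & 0x40:                    # L bit: TL0PICIDX present
            info["tl0picidx"] = payload[i]
            i += 1
        if ext & 0x30:                    # T and/or K bit: TID/Y/KEYIDX octet
            octet = payload[i]
            i += 1
            if ext & 0x20:
                info["tid"] = octet >> 6
                info["layer_sync"] = bool(octet & 0x20)
            if ext & 0x10:
                info["keyidx"] = octet & 0x1F
    # The P bit (inverse key frame flag) is the low bit of the first
    # octet of the VP8 payload header, carried in a frame's first packet.
    if info["start"] and info["partition_id"] == 0 and len(payload) > i:
        info["key_frame"] = (payload[i] & 0x01) == 0
    info["descriptor_len"] = i
    return info


def should_forward(info, max_tid):
    """Forward packets at or below the temporal layer chosen for a receiver."""
    return info["tid"] is None or info["tid"] <= max_tid
```

   Every branch above is specific to VP8, so equivalent logic would
   have to be written again for H.264/SVC and H.265, and none of it can
   run when the payload is encrypted with a key unavailable to the SFU.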

413	3.3.  Problematic payload fields

415	   In addition to payload fields recommended for support within codec-
416	   independent SFUs, it is also useful to cover payload fields that are
417	   not recommended, because there are better alternatives.  These
418	   include the following fields:

420	   PictureID

422	      While the PictureID field is utilized within the RPSI feedback
423	      message, as well as being supported within the VP8 and VP9 codecs,
424	      its use is not recommended within codec-independent SFUs.  As
425	      noted earlier, RPSI feedback messages are rarely supported within
426	      existing SVC and simulcast implementations, since better loss
427	      recovery mechanisms are available.  As a result, an SFU does not
428	      need to originate an RPSI message which would require the
429	      PictureID.  Where it is necessary for the SFU to determine which
430	      picture a given packet belongs to, RTP header fields such as the
431	      timestamp and sequence number can be used instead.

433	   Macroblocks

435	      While macroblock identifiers are universally supported within
436	      video codecs, their use by codec-independent SFUs is not
437	      recommended.  As noted earlier, SLI feedback messages (which
438	      require a starting macroblock) are rarely supported within
439	      existing SFUs, since better loss recovery mechanisms are
440	      available.  Therefore a codec-independent SFU should never need to
441	      source an SLI message.

443	   NRI

445	      While the NRI field defined in H.264/SVC is used by some SFU
446	      implementations, its use in codec-independent SFUs is not
447	      recommended since there is no equivalent within the VP8 and VP9
448	      codecs, and the IDR, discardable and layer identification
449	      information provide an adequate substitute.

451	   PRID

453	      The PRID field defined in H.264/SVC has no analog in VP8 or VP9.
454	      While some SFU implementations do use this field (sometimes for
455	      purposes other than those for which it was originally defined),
456	      its use in codec independent SFUs is not recommended, since the
457	      D/N bits as well as the layer identification fields provide much
458	      the same information.

460	   F
461	      The F bit defined in H.264/SVC has no analog in VP8 or VP9 and is
462	      always set to zero in H.265.  Therefore its use in codec-
463	      independent SFUs is not recommended.

465	   KEYIDX

467	      The KEYIDX field is defined in VP8 as well as VP9, providing
468	      information on the key frame that a given packet is dependent on.
469	      Existing SFU implementations do not utilize this field in their
470	      forwarding/drop decisions, preferring to instead utilize the
471	      TL0PICIDX field.
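
   As noted in the PictureID entry above, RTP header fields can stand
   in for a picture identifier.  A minimal sketch, assuming that all
   packets of one picture in a stream share an RTP timestamp [RFC3550];
   the function names are hypothetical:

```python
import struct
from collections import defaultdict


def rtp_fixed_header(packet):
    """Decode the 12-octet RTP fixed header."""
    _b0, b1, seq, timestamp, ssrc = struct.unpack_from("!BBHII", packet, 0)
    return {"marker": bool(b1 & 0x80), "pt": b1 & 0x7F,
            "seq": seq, "timestamp": timestamp, "ssrc": ssrc}


def group_by_picture(packets):
    """Group sequence numbers by (SSRC, timestamp), i.e. by picture."""
    pictures = defaultdict(list)
    for packet in packets:
        header = rtp_fixed_header(packet)
        pictures[(header["ssrc"], header["timestamp"])].append(header["seq"])
    return dict(pictures)
```

   The grouping needs no payload access, so it applies equally to any
   codec and to end-to-end encrypted payloads.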

473	4.  Security Considerations

475	   Codec-independent forwarding provides significant architectural
476	   benefits, even in situations where it is not possible to fully
477	   protect the privacy of conference participants.

479	   As noted earlier in this document, SFUs require cleartext access to
480	   control plane information such as information provided in signaling
481	   as well as RTCP reports and feedback messages.  SFUs also require
482	   access to information required for forwarding plane operation, such
483	   as information available within RTP extensions (such as frame
484	   start/end bits, layer information, etc.) and RTP header extensions
485	   (e.g. abs-send-time [ABS-SEND] and client-to-mixer audio level
486	   indication [RFC6464]).

488	   In addition, it should be understood that the information available
489	   to passive observers is significant, since this includes metadata
490	   such as the IP addresses of conference participants (unless protected
491	   via a relay or onion routing service), statistics such as
492	   packets/bytes sent, and packet sizes.

494	   Based on frame size, an SFU can determine periods of rapid motion,
495	   and based on the client-to-mixer audio level indication, the SFU can
496	   determine recent speakers.  Much of this same information may also be
497	   gleaned by a passive observer with access to packet size and RTP
498	   header data (such as the SSRC field).  Combined with metadata, this
499	   information would allow the SFU or a passive observer to determine
500	   the roles of various participants within the conference as well as
501	   their levels of engagement.

503	   Based on RTCP reports and feedback messages providing information on
504	   delay, jitter and losses, the SFU can determine the bandwidth
505	   available to participants as well as aspects of endpoint connectivity
506	   (e.g. whether their network access is wireless/wired,
507	   satellite/terrestrial, etc.).

509	   Even when the RTP payload is encrypted with a key unavailable to the
510	   SFU, the ability of the SFU to obtain and potentially forge signaling
511	   implies that end-to-end privacy is only guaranteed when media is
512	   authenticated end-to-end by an entity that can be considered
513	   trustworthy even when presented with an intercept order.

515	5.  IANA Considerations

517	   This document does not require actions by IANA.

519	6.  References

521	6.1.  Informative References

523	[ABS-SEND]   WebRTC site, "The Absolute Send Time extension",
524	             http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time,
525	             retrieved July 6, 2015.

527	[H.264]      ITU-T Recommendation H.264, "Advanced video coding for
528	             generic audiovisual services", March 2010.

530	[Hellwagner] Hellwagner, H., Kuschnig, R., Stutz, T. and Andreas Uhl,
531	             "Efficient In-Network Adaptation of Encrypted H.264/SVC
532	             Content", Journal of Image Communication, Volume 24, Issue
533	             9, October 2009, pp. 740-758, Elsevier Science, retrieved
534	             from http://wavelab.at/papers/Hellwagner09a.pdf

536	[HEVC]       ITU-T Recommendation H.265, "High efficiency video coding",
537	             April 2013.

539	[I-D.grange-vp9-bitstream]
540	             Grange, A. and H. Alvestrand, "A VP9 Bitstream Overview",
541	             draft-grange-vp9-bitstream-00 (work in progress), February
542	             2013.

544	[I-D.ietf-avtcore-rtp-multi-stream]
545	             Lennox, J., Westerlund, M., Wu, W., and C. Perkins,
546	             "Sending Multiple Media Streams in a Single RTP Session",
547	             draft-ietf-avtcore-rtp-multi-stream-07 (work in progress),
548	             March 2015.

550	[I-D.ietf-avtcore-rtp-topologies-update]
551	             Westerlund, M. and S. Wenger, "RTP Topologies", draft-ietf-
552	             avtcore-rtp-topologies-update-10 (work in progress), July
553	             2015.

555	[I-D.ietf-avtext-rtp-grouping-taxonomy]
556	             Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and
557	             B. Burman, "A Taxonomy of Grouping Semantics and Mechanisms
558	             for Real-Time Transport", draft-ietf-avtext-rtp-grouping-
559	             taxonomy-07 (work in progress), June 2015.

561	[I-D.ietf-avtext-rtp-stream-pause]
562	             Burman, B., Akram, A., Even, R. and M. Westerlund, "RTP
563	             Stream Pause and Resume", draft-ietf-avtext-rtp-stream-
564	             pause-08 (work in progress), July 2015.

566	[I-D.ietf-payload-rtp-h265]
567	             Wang, Y.-K., Sanchez, Y., Schierl, T., Wenger, S. and M. M.
568	             Hannuksela, "RTP Payload Format for High Efficiency Video
569	             Coding", draft-ietf-payload-rtp-h265-13 (work in progress),
570	             June 2015.

572	[I-D.ietf-payload-vp8]
573	             Westin, P., Lundin, H., Glover, M., Uberti, J. and F.
574	             Galligan, "RTP Payload Format for VP8 Video", draft-ietf-
575	             payload-vp8-16 (work in progress), June 2015.

577	[I-D.ietf-rtcweb-rtp-usage]
578	             Perkins, C., Westerlund, M. and J. Ott, "Web Real-Time
579	             Communication (WebRTC): Media Transport and Use of RTP",
580	             draft-ietf-rtcweb-rtp-usage-25 (work in progress), June
581	             2015.

583	[I-D.uberti-payload-vp9]
584	             Uberti, J., Holmer, S., Flodman, M. and J. Lennox, "RTP
585	             Payload Format for VP9 Video", draft-uberti-payload-vp9-01
586	             (work in progress), March 2015.

588	[LASTN]      Grozev, B., Marinov, L., Singh, V. and E. Ivov, "Last N:
589	             relevance-based selectivity for forwarding video in
590	             multimedia conferences", Proceedings of the 25th ACM
591	             Workshop on Network and Operating Systems Support for
592	             Digital Audio and Video, pp. 19-24, 2015, ISBN:
593	             978-1-4503-3352-8

595	[RFC2119]    Bradner, S., "Key Words for Use in RFCs to Indicate
596	             Requirement Levels", BCP 14, RFC 2119, March 1997.

598	[RFC3264]    Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
599	             with Session Description Protocol (SDP)", RFC 3264, June
600	             2002.

602	[RFC3550]    Schulzrinne, H., Casner, S., Frederick, R., and V.
603	             Jacobson, "RTP: A Transport Protocol for Real-Time
604	             Applications", STD 64, RFC 3550, July 2003.

606	[RFC3711]    Baugher, M., McGrew, D., Naslund, M., Carrara, E., and
607	             K. Norrman, "The Secure Real-time Transport Protocol
608	             (SRTP)", RFC 3711, March 2004.

610	[RFC4585]    Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
611	             "Extended RTP Profile for Real-time Transport Control
612	             Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July
613	             2006.

615	[RFC4588]    Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R.
616	             Hakenberg, "RTP Retransmission Payload Format", RFC 4588,
617	             July 2006.

619	[RFC5104]    Wenger, S., Chandra, U., Westerlund, M., and B. Burman,
620	             "Codec Control Messages in the RTP Audio-Visual Profile
621	             with Feedback (AVPF)", RFC 5104, February 2008.

623	[RFC5109]    Li, A., "RTP Payload Format for Generic Forward Error
624	             Correction", RFC 5109, December 2007.

626	[RFC6184]    Wang, Y.-K., Even, R., Kristensen, T., and R. Jesup, "RTP
627	             Payload Format for H.264 Video", RFC 6184, May 2011.

629	[RFC6190]    Wenger, S., Wang, Y., Schierl, T., and A. Eleftheriadis,
630	             "RTP Payload Format for Scalable Video Coding", RFC 6190,
631	             May 2011.

633	[RFC6386]    Bankoski, J., Koleszar, J., Quillio, L., Salonen, J.,
634	             Wilkins, P., and Y. Xu, "VP8 Data Format and Decoding
635	             Guide", RFC 6386, November 2011.

637	[RFC6464]    Lennox, J., Ivov, E. and E. Marocco, "A Real-time Transport
638	             Protocol (RTP) Header Extension for Client-to-Mixer Audio
639	             Level Indication", RFC 6464, December 2011.

641	Acknowledgments

643	   The author acknowledges discussions relating to the topic of this
644	   memo within the IETF PAYLOAD, AVTCORE and AVTEXT WGs, as well as
645	   discussions with the IMTC MANE Activity Group.  In particular, Emil
646	   Ivov, Alex Eleftheriadis and Stephan Wenger have provided helpful
647	   comments relating to this topic.

649	Authors' Addresses

651	   Bernard Aboba
652	   Microsoft Corporation
653	   One Microsoft Way
654	   Redmond, WA  98052
655	   US

657	   Email:  bernard_aboba@hotmail.com