idnits 2.17.1 

draft-ietf-mops-streaming-opcons-10.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == There are 2 instances of lines with non-ascii characters in the document.


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (21 April 2022) is 733 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-14) exists of
     draft-pantos-hls-rfc8216bis-10

  == Outdated reference: A later version (-18) exists of
     draft-ietf-quic-manageability-16

  == Outdated reference: A later version (-07) exists of
     draft-ietf-quic-qlog-h3-events-01

  == Outdated reference: A later version (-08) exists of
     draft-ietf-quic-qlog-main-schema-02

  == Outdated reference: A later version (-07) exists of
     draft-ietf-quic-qlog-quic-events-01

  -- Obsolete informational reference (is this intentional?): RFC  793
     (Obsoleted by RFC 9293)

  -- Obsolete informational reference (is this intentional?): RFC 2001
     (Obsoleted by RFC 2581)

  -- Obsolete informational reference (is this intentional?): RFC 4960
     (Obsoleted by RFC 9260)

  -- Obsolete informational reference (is this intentional?): RFC 8312
     (Obsoleted by RFC 9438)


     Summary: 0 errors (**), 0 flaws (~~), 7 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	MOPS                                                          J. Holland
3	Internet-Draft                                 Akamai Technologies, Inc.
4	Intended status: Informational                                  A. Begen
5	Expires: 23 October 2022                                 Networked Media
6	                                                              S. Dawkins
7	                                                     Tencent America LLC
8	                                                           21 April 2022

10	             Operational Considerations for Streaming Media
11	                  draft-ietf-mops-streaming-opcons-10

13	Abstract

15	   This document provides an overview of operational networking issues
16	   that pertain to quality of experience when streaming video and other
17	   high-bitrate media over the Internet.

19	Status of This Memo

21	   This Internet-Draft is submitted in full conformance with the
22	   provisions of BCP 78 and BCP 79.

24	   Internet-Drafts are working documents of the Internet Engineering
25	   Task Force (IETF).  Note that other groups may also distribute
26	   working documents as Internet-Drafts.  The list of current Internet-
27	   Drafts is at https://datatracker.ietf.org/drafts/current/.

29	   Internet-Drafts are draft documents valid for a maximum of six months
30	   and may be updated, replaced, or obsoleted by other documents at any
31	   time.  It is inappropriate to use Internet-Drafts as reference
32	   material or to cite them other than as "work in progress."

34	   This Internet-Draft will expire on 23 October 2022.

36	Copyright Notice

38	   Copyright (c) 2022 IETF Trust and the persons identified as the
39	   document authors.  All rights reserved.

41	   This document is subject to BCP 78 and the IETF Trust's Legal
42	   Provisions Relating to IETF Documents (https://trustee.ietf.org/
43	   license-info) in effect on the date of publication of this document.
44	   Please review these documents carefully, as they describe your rights
45	   and restrictions with respect to this document.  Code Components
46	   extracted from this document must include Revised BSD License text as
47	   described in Section 4.e of the Trust Legal Provisions and are
48	   provided without warranty as described in the Revised BSD License.

50	Table of Contents

52	   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
53	     1.1.  Notes for Contributors and Reviewers  . . . . . . . . . .   4
54	       1.1.1.  Venues for Contribution and Discussion  . . . . . . .   4
55	   2.  Our Focus on Streaming Video  . . . . . . . . . . . . . . . .   5
56	   3.  Bandwidth Provisioning  . . . . . . . . . . . . . . . . . . .   6
57	     3.1.  Scaling Requirements for Media Delivery . . . . . . . . .   6
58	       3.1.1.  Video Bitrates  . . . . . . . . . . . . . . . . . . .   6
59	       3.1.2.  Virtual Reality Bitrates  . . . . . . . . . . . . . .   6
60	     3.2.  Path Bandwidth Constraints  . . . . . . . . . . . . . . .   7
61	       3.2.1.  Recognizing Changes from an Expected Baseline . . . .   8
62	     3.3.  Path Requirements . . . . . . . . . . . . . . . . . . . .   9
63	     3.4.  Caching Systems . . . . . . . . . . . . . . . . . . . . .   9
64	     3.5.  Predictable Usage Profiles  . . . . . . . . . . . . . . .  11
65	     3.6.  Unpredictable Usage Profiles  . . . . . . . . . . . . . .  11
66	     3.7.  Extremely Unpredictable Usage Profiles  . . . . . . . . .  12
67	   4.  Latency Considerations  . . . . . . . . . . . . . . . . . . .  14
68	     4.1.  Ultra Low-Latency . . . . . . . . . . . . . . . . . . . .  14
69	     4.2.  Low-Latency Live  . . . . . . . . . . . . . . . . . . . .  15
70	     4.3.  Non-Low-Latency Live  . . . . . . . . . . . . . . . . . .  16
71	     4.4.  On-Demand . . . . . . . . . . . . . . . . . . . . . . . .  16
72	   5.  Adaptive Encoding, Adaptive Delivery, and Measurement
73	           Collection  . . . . . . . . . . . . . . . . . . . . . . .  17
74	     5.1.  Overview  . . . . . . . . . . . . . . . . . . . . . . . .  17
75	     5.2.  Adaptive Encoding . . . . . . . . . . . . . . . . . . . .  18
76	     5.3.  Adaptive Segmented Delivery . . . . . . . . . . . . . . .  18
77	     5.4.  Advertising . . . . . . . . . . . . . . . . . . . . . . .  18
78	     5.5.  Bitrate Detection Challenges  . . . . . . . . . . . . . .  20
79	       5.5.1.  Idle Time between Segments  . . . . . . . . . . . . .  21
80	       5.5.2.  Head-of-Line Blocking . . . . . . . . . . . . . . . .  21
81	       5.5.3.  Wide and Rapid Variation in Path Capacity . . . . . .  22
82	     5.6.  Measurement Collection  . . . . . . . . . . . . . . . . .  22
83	   6.  Evolution of Transport Protocols and Transport Protocol
84	           Behaviors . . . . . . . . . . . . . . . . . . . . . . . .  23
85	     6.1.  UDP and Its Behavior  . . . . . . . . . . . . . . . . . .  24
86	     6.2.  TCP and Its Behavior  . . . . . . . . . . . . . . . . . .  25
87	     6.3.  QUIC and Its Behavior . . . . . . . . . . . . . . . . . .  26
88	   7.  Streaming Encrypted Media . . . . . . . . . . . . . . . . . .  28
89	     7.1.  General Considerations for Media Encryption . . . . . . .  29
90	     7.2.  Considerations for "Hop-by-Hop" Media Encryption  . . . .  30
91	     7.3.  Considerations for "End-to-End" Media Encryption  . . . .  32
92	   8.  Further Reading and References  . . . . . . . . . . . . . . .  32
93	   9.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  33
94	   10. Security Considerations . . . . . . . . . . . . . . . . . . .  33
95	   11. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . .  33
96	   12. Informative References  . . . . . . . . . . . . . . . . . . .  33
97	   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  41

99	1.  Introduction

101	   This document examines networking and transport protocol issues as
102	   they relate to quality of experience (QOE) in Internet media
103	   delivery, especially focusing on capturing characteristics of
104	   streaming video delivery that have surprised network designers or
105	   transport experts who lack specific video expertise, since streaming
106	   media highlights key differences between common assumptions in
107	   existing networking practices and observations of video delivery
108	   issues encountered when streaming media over those existing networks.

110	   This document specifically focuses on streaming applications and
111	   defines streaming as follows:

113	   *  Streaming is transmission of a continuous media from a server to a
114	      client and its simultaneous consumption by the client.

116	   *  Here, "continuous media" refers to media and associated streams
117	      such as video, audio, metadata, etc.  In this definition, the
118	      critical term is "simultaneous", as it is not considered streaming
119	      if one downloads a video file and plays it after the download is
120	      completed, which would be called download-and-play.

122	   This has two implications.

124	   *  First, the server's transmission rate must (loosely or tightly)
125	      match to client's consumption rate in order to provide
126	      uninterrupted playback.  That is, the client must not run out of
127	      data (buffer underrun) or accept more data than it can buffer
128	      before playback (buffer overrun) as any excess media that cannot
129	      be buffered is simply discarded.

131	   *  Second, the client's consumption rate is limited not only by
132	      bandwidth availability,but also media availability.  The client
133	      cannot fetch media that is not available from a server yet.

135	   This document contains

137	   *  A short description of streaming video characteristics in
138	      Section 2, to set the stage for the rest of the document,

140	   *  General guidance on bandwidth provisioning (Section 3) and latency
141	      considerations (Section 4) for streaming video delivery,

143	   *  A description of adaptive encoding and adaptive delivery
144	      techniques in common use for streaming video, along with a
145	      description of the challenges media senders face in detecting the
146	      bitrate available between the media sender and media receiver, and
147	      collection of measurements by a third party for use in analytics
148	      (Section 5),

150	   *  A description of existing transport protocols used for video
151	      streaming, and the issues encountered when using those protocols,
152	      along with a description of the QUIC transport protocol [RFC9000]
153	      that we expect to be used for streaming media (Section 6),

155	   *  A description of implications when streaming encrypted media
156	      (Section 7), and

158	   *  A number of useful pointers for further reading on this rapidly
159	      changing subject (Section 8).

161	   Making specific recommendations on operational practices aimed at
162	   mitigating the issues described in this document is out of scope,
163	   though some existing mitigations are mentioned in passing.  The
164	   intent is to provide a point of reference for future solution
165	   proposals to use in describing how new technologies address or avoid
166	   existing observed problems.

168	1.1.  Notes for Contributors and Reviewers

170	   Note to RFC Editor: Please remove this section and its subsections
171	   before publication.

173	   This section is to provide references to make it easier to review the
174	   development and discussion on the draft so far.

176	1.1.1.  Venues for Contribution and Discussion

178	   This document is in the Github repository at:

180	   https://github.com/ietf-wg-mops/draft-ietf-mops-streaming-opcons
181	   (https://github.com/ietf-wg-mops/draft-ietf-mops-streaming-opcons)

183	   Readers are welcome to open issues and send pull requests for this
184	   document.

186	   Substantial discussion of this document should take place on the MOPS
187	   working group mailing list (mops@ietf.org).

189	   *  Join: https://www.ietf.org/mailman/listinfo/mops
190	      (https://www.ietf.org/mailman/listinfo/mops)

192	   *  Search: https://mailarchive.ietf.org/arch/browse/mops/
193	      (https://mailarchive.ietf.org/arch/browse/mops/)

195	2.  Our Focus on Streaming Video

197	   As the internet has grown, an increasingly large share of the traffic
198	   delivered to end users has become video.  The most recent available
199	   estimates found that 75% of the total traffic to end users was video
200	   in 2019.  At that time, the share of traffic that was video had been
201	   growing for years and was projected to continue growing (Appendix D
202	   of [CVNI]).

204	   A substantial part of this growth is due to increased use of
205	   streaming video, although the amount of video traffic in real-time
206	   communications (for example, online videoconferencing) has also grown
207	   significantly.  While both streaming video and videoconferencing have
208	   real-time delivery and latency requirements, these requirements vary
209	   from one application to another.  For additional discussion of
210	   latency requirements, see Section 4.

212	   In many contexts, video traffic can be handled transparently as
213	   generic application-level traffic.  However, as the volume of video
214	   traffic continues to grow, it is becoming increasingly important to
215	   consider the effects of network design decisions on application-level
216	   performance, with considerations for the impact on video delivery.

218	   Much of the focus of this document is on reliable media using HTTP.
219	   HTTP is widely used because

221	   *  support for HTTP is widely available in a wide range of operating
222	      systems,

224	   *  HTTP is also used in a wide variety of other applications,

226	   *  HTTP has been demonstrated to provide acceptable performance over
227	      the open Internet,

229	   *  HTTP includes state of the art standardized security mechanisms,
230	      and

232	   *  HTTP can make use of already-deployed caching infrastructure such
233	      as CDNs (Content Delivery Networks), local proxies, and browser
234	      caches.

236	   Various HTTP versions have been used for media delivery.  HTTP/1.0,
237	   HTTP/1.1 and HTTP/2 are carried over TCP, and TCP's transport
238	   behavior is described in Section 6.2.  HTTP/3 is carried over QUIC,
239	   and QUIC's transport behavior is described in Section 6.3.

241	   Unreliable media delivery using RTP and other UDP-based protocols is
242	   also discussed in Section 4.1, Section 6.1, and Section 7.2, but it
243	   is difficult to give general guidance for these applications.  For
244	   instance, when loss occurs, the most appropriate response may depend
245	   on the type of codec being used.

247	3.  Bandwidth Provisioning

249	3.1.  Scaling Requirements for Media Delivery

251	3.1.1.  Video Bitrates

253	   Video bitrate selection depends on many variables including the
254	   resolution (height and width), frame rate, color depth, codec,
255	   encoding parameters, scene complexity and amount of motion.
256	   Generally speaking, as the resolution, frame rate, color depth, scene
257	   complexity and amount of motion increase, the encoding bitrate
258	   increases.  As newer codecs with better compression tools are used,
259	   the encoding bitrate decreases.  Similarly, a multi-pass encoding
260	   generally produces better quality output compared to single-pass
261	   encoding at the same bitrate, or delivers the same quality at a lower
262	   bitrate.

264	   Here are a few common resolutions used for video content, with
265	   typical ranges of bitrates for the two most popular video codecs
266	   [Encodings].

268	         +============+================+============+============+
269	         | Name       | Width x Height | H.264      | H.265      |
270	         +============+================+============+============+
271	         | DVD        | 720 x 480      | 1.0 Mbps   | 0.5 Mbps   |
272	         +------------+----------------+------------+------------+
273	         | 720p (1K)  | 1280 x 720     | 3-4.5 Mbps | 2-4 Mbps   |
274	         +------------+----------------+------------+------------+
275	         | 1080p (2K) | 1920 x 1080    | 6-8 Mbps   | 4.5-7 Mbps |
276	         +------------+----------------+------------+------------+
277	         | 2160p (4k) | 3840 x 2160    | N/A        | 10-20 Mbps |
278	         +------------+----------------+------------+------------+

280	                                  Table 1

282	3.1.2.  Virtual Reality Bitrates

284	   The bitrates given in Section 3.1.1 describe video streams that
285	   provide the user with a single, fixed, point of view - so, the user
286	   has no "degrees of freedom", and the user sees all of the video image
287	   that is available.

289	   Even basic virtual reality (360-degree) videos that allow users to
290	   look around freely (referred to as "three degrees of freedom", or
291	   3DoF) require substantially larger bitrates when they are captured
292	   and encoded as such videos require multiple fields of view of the
293	   scene.  Yet, due to smart delivery methods such as viewport-based or
294	   tiled-based streaming, we do not need to send the whole scene to the
295	   user.  Instead, the user needs only the portion corresponding to its
296	   viewpoint at any given time ([Survey360o]).

298	   In more immersive applications, where limited user movement ("three
299	   degrees of freedom plus", or 3DoF+) or full user movement ("six
300	   degrees of freedom", or 6DoF) is allowed, the required bitrate grows
301	   even further.  In this case, immersive content is typically referred
302	   to as volumetric media.  One way to represent the volumetric media is
303	   to use point clouds, where streaming a single object may easily
304	   require a bitrate of 30 Mbps or higher.  Refer to [MPEGI] and [PCC]
305	   for more details.

307	3.2.  Path Bandwidth Constraints

309	   Even when the bandwidth requirements for video streams along a path
310	   are well understood, additional analysis is required to understand
311	   the contraints on bandwidth at various points in the network.  This
312	   analysis is necessary because media servers may react to bandwith
313	   constraints using two independent feedback loops:

315	   *  Media servers often respond to application-level feedback from the
316	      media player that indicates a bottleneck link somewhere along the
317	      path, by adjusting the amount of media that the media server will
318	      send to the media player in a given timeframe.  This is described
319	      in greater detail in Section 5.

321	   *  Media servers also typically implement transport protocols with
322	      capacity-seeking congestion controllers that probe for bandwidth,
323	      and adjust the sending rate based on transport mechanisms.  This
324	      is described in greater detail in Section 6.

326	   The result is that these two (potentially competing) "helpful"
327	   mechanisms each respond to the same bottleneck with no coordination
328	   between themselves, so that each is unaware of actions taken by the
329	   other, and this can result in QOE for users that is significantly
330	   lower than what could have been achieved.

332	   In one example, if a media server overestimates the available
333	   bandwidth to the media player,

335	   *  the transport protocol detects loss due to congestion, and reduces
336	      its sending window size per round trip,

338	   *  the media server adapts to application-level feedback from the
339	      media player, and reduces its own sending rate,

341	   *  the transport protocol sends media at the new, lower rate, and
342	      confirms that this new, lower rate is "safe", because no
343	      transport-level loss is occuring, but

345	   *  because the media server continues to send at the new, lower rate,
346	      the transport protocol's maximum sending rate is now limited by
347	      the amount of information the media server queues for
348	      transmission, so

350	   *  the transport protocol can't probe for available path bandwidth by
351	      sending at a higher rate.

353	   In order to avoid these types of situations, which can potentially
354	   affect all the users whose streaming media traverses a bottleneck
355	   link, there are several possible mitigations that streaming operators
356	   can use, but the first step toward mitigating a problem is knowing
357	   when that problem occurs.

359	3.2.1.  Recognizing Changes from an Expected Baseline

361	   There are many reasons why path characteristics might change
362	   suddenly, for example,

364	   *  "cross traffic" that traverses part of the path, especially if
365	      this traffic is "inelastic", and does not, itself, respond to
366	      indications of path congestion.

368	   *  routing changes, which can happen in normal operation, especially
369	      if the new path now includes path segments that are more heavily
370	      loaded, offer lower total bandwidth, or simply cover more
371	      distance.

373	   In order to recognize that a path carrying streaming media is "not
374	   behaving the way it normally does", having an expected baseline that
375	   describes "the way it normally does" is fundamental.  Analytics that
376	   aid in that recognition can be more or less sophisticated, and can be
377	   as simple as noticing that the apparent round trip times for media
378	   traffic carried over TCP transport on some paths are suddenly and
379	   significantly longer than usual.  Passive monitors can detect changes
380	   in the elapsed time between the acknowledgements for specific TCP
381	   segments from a TCP receiver, since TCP octet sequence numbers and
382	   acknowledgements for those sequence numbers are "carried in the
383	   clear", even if the TCP payload itself is encrypted.  See Section 6.2
384	   for more information.

386	   As transport protocols evolve to encrypt their transport header
387	   fields, one side effect of increasing encryption is that the kind of
388	   passive monitoring, or even "performance enhancement" ([RFC3135])
389	   that was possible with the older transport protocols (UDP, described
390	   in Section 6.1 and TCP, described in Section 6.2) is no longer
391	   possible with newer transport protocols such as QUIC (described in
392	   Section 6.3).  The IETF has specified a "latency spin bit" mechanism
393	   in Section 17.4 of [RFC9000] to allow passive latency monitoring from
394	   observation points on the network path throughout the duration of a
395	   connection, but currently chartered work in the IETF is focusing on
396	   end-point monitoring and reporting, rather than on passive
397	   monitoring.

399	   One example is the "qlog" mechanism [I-D.ietf-quic-qlog-main-schema],
400	   a protocol-agnostic mechanism used to provide better visibility for
401	   encrypted protocols such as QUIC ([I-D.ietf-quic-qlog-quic-events])
402	   and for HTTP/3 ([I-D.ietf-quic-qlog-h3-events]).

404	3.3.  Path Requirements

406	   The bitrate requirements in Section 3.1 are per end-user actively
407	   consuming a media feed, so in the worst case, the bitrate demands can
408	   be multiplied by the number of simultaneous users to find the
409	   bandwidth requirements for a router on the delivery path with that
410	   number of users downstream.  For example, at a node with 10,000
411	   downstream users simultaneously consuming video streams,
412	   approximately 80 Gbps might be necessary in order for all of them to
413	   get typical content at 1080p resolution.

415	   However, when there is some overlap in the feeds being consumed by
416	   end users, it is sometimes possible to reduce the bandwidth
417	   provisioning requirements for the network by performing some kind of
418	   replication within the network.  This can be achieved via object
419	   caching with delivery of replicated objects over individual
420	   connections, and/or by packet-level replication using multicast.

422	   To the extent that replication of popular content can be performed,
423	   bandwidth requirements at peering or ingest points can be reduced to
424	   as low as a per-feed requirement instead of a per-user requirement.

426	3.4.  Caching Systems

428	   When demand for content is relatively predictable, and especially
429	   when that content is relatively static, caching content close to
430	   requesters, and pre-loading caches to respond quickly to initial
431	   requests is often useful (for example, HTTP/1.1 caching is described
432	   in [I-D.ietf-httpbis-cache]).  This is subject to the usual
433	   considerations for caching - for example, how much data must be
434	   cached to make a significant difference to the requester, and how the
435	   benefits of caching and pre-loading caches balances against the costs
436	   of tracking "stale" content in caches and refreshing that content.

438	   It is worth noting that not all high-demand content is "live"
439	   content.  One relevant example is when popular streaming content can
440	   be staged close to a significant number of requesters, as can happen
441	   when a new episode of a popular show is released.  This content may
442	   be largely stable, so low-cost to maintain in multiple places
443	   throughout the Internet.  This can reduce demands for high end-to-end
444	   bandwidth without having to use mechanisms like multicast.

446	   Caching and pre-loading can also reduce exposure to peering point
447	   congestion, since less traffic crosses the peering point exchanges if
448	   the caches are placed in peer networks, especially when the content
449	   can be pre-loaded during off-peak hours, and especially if the
450	   transfer can make use of "Lower-Effort Per-Hop Behavior (LE PHB) for
451	   Differentiated Services" [RFC8622], "Low Extra Delay Background
452	   Transport (LEDBAT)" [RFC6817], or similar mechanisms.

454	   All of this depends, of course, on the ability of a content provider
455	   to predict usage and provision bandwidth, caching, and other
456	   mechanisms to meet the needs of users.  In some cases (Section 3.5),
457	   this is relatively routine, but in other cases, it is more difficult
458	   (Section 3.6, Section 3.7).

460	   And as with other parts of the ecosystem, new technology brings new
461	   challenges.  For example, with the emergence of ultra-low-latency
462	   streaming, responses have to start streaming to the end user while
463	   still being transmitted to the cache, and while the cache does not
464	   yet know the size of the object.  Some of the popular caching systems
465	   were designed around cache footprint and had deeply ingrained
466	   assumptions about knowing the size of objects that are being stored,
467	   so the change in design requirements in long-established systems
468	   caused some errors in production.  Incidents occurred where a
469	   transmission error in the connection from the upstream source to the
470	   cache could result in the cache holding a truncated segment and
471	   transmitting it to the end user's device.  In this case, players
472	   rendering the stream often had the video freeze until the player was
473	   reset.  In some cases the truncated object was even cached that way
474	   and served later to other players as well, causing continued stalls
475	   at the same spot in the video for all players playing the segment
476	   delivered from that cache node.

478	3.5.  Predictable Usage Profiles

480	   Historical data shows that users consume more videos and at a higher
481	   bit rate than they did in the past on their connected devices.
482	   Improvements in the codecs that help with reducing the encoding
483	   bitrates with better compression algorithms could not have offset the
484	   increase in the demand for the higher quality video (higher
485	   resolution, higher frame rate, better color gamut, better dynamic
486	   range, etc.).  In particular, mobile data usage has shown a large
487	   jump over the years due to increased consumption of entertainment as
488	   well as conversational video.

490	3.6.  Unpredictable Usage Profiles

492	   Although TCP/IP has been used with a number of widely used
493	   applications that have symmetric bandwidth requirements (similar
494	   bandwidth requirements in each direction between endpoints), many
495	   widely-used Internet applications operate in client-server roles,
496	   with asymmetric bandwidth requirements.  A common example might be an
497	   HTTP GET operation, where a client sends a relatively small HTTP GET
498	   request for a resource to an HTTP server, and often receives a
499	   significantly larger response carrying the requested resource.  When
500	   HTTP is commonly used to stream movie-length video, the ratio between
501	   response size and request size can become arbitrarily large.

503	   For this reason, operators may pay more attention to downstream
504	   bandwidth utilization when planning and managing capacity.  In
505	   addition, operators have been able to deploy access networks for end
506	   users using underlying technologies that are inherently asymmetric,
507	   favoring downstream bandwidth (e.g.  ADSL, cellular technologies,
508	   most IEEE 802.11 variants), assuming that users will need less
509	   upstream bandwidth than downstream bandwidth.  This strategy usually
510	   works, except when it fails because application bandwidth usage
511	   patterns have changed in ways that were not predicted.

513	   One example of this type of change was when peer-to-peer file sharing
514	   applications gained popularity in the early 2000s.  To take one well-
515	   documented case ([RFC5594]), the Bittorrent application created
516	   "swarms" of hosts, uploading and downloading files to each other,
517	   rather than communicating with a server.  Bittorrent favored peers
518	   who uploaded as much as they downloaded, so that new Bittorrent users
519	   had an incentive to significantly increase their upstream bandwidth
520	   utilization.

522	   The combination of the large volume of "torrents" and the peer-to-
523	   peer characteristic of swarm transfers meant that end user hosts were
524	   suddenly uploading higher volumes of traffic to more destinations
525	   than was the case before Bittorrent.  This caused at least one large
526	   Internet service provider (ISP) to attempt to "throttle" these
527	   transfers in order to to mitigate the load that these hosts placed on
528	   their network.  These efforts were met by increased use of encryption
529	   in Bittorrent, and complaints to regulators calling for regulatory
530	   action.

532	   The BitTorrent case study is just one example, but the example is
533	   included here to make it clear that unpredicted and unpredictable
534	   massive traffic spikes may not be the result of natural disasters,
535	   but they can still have significant impacts.

537	   Especially as end users increase use of video-based social networking
538	   applications, it will be helpful for access network providers to
539	   watch for increasing numbers of end users uploading significant
540	   amounts of content.

542	3.7.  Extremely Unpredictable Usage Profiles

544	   The causes of unpredictable usage described in Section 3.6 were more
545	   or less the result of human choices, but we were reminded during a
546	   post-IETF 107 meeting that humans are not always in control, and
547	   forces of nature can cause enormous fluctuations in traffic patterns.

549	   In his talk, Sanjay Mishra [Mishra] reported that after the CoViD-19
550	   pandemic broke out in early 2020,

552	   *  Comcast's streaming and web video consumption rose by 38%, with
553	      their reported peak traffic up 32% overall between March 1 to
554	      March 30,

556	   *  AT&T reported a 28% jump in core network traffic (single day in
557	      April, as compared to pre stay-at-home daily average traffic),
558	      with video accounting for nearly half of all mobile network
559	      traffic, while social networking and web browsing remained the
560	      highest percentage (almost a quarter each) of overall mobility
561	      traffic, and

563	   *  Verizon reported similar trends with video traffic up 36% over an
564	      average day (pre COVID-19)}.

566	   We note that other operators saw similar spikes during this time
567	   period.  Craig Labowitz [Labovitz] reported

569	   *  Weekday peak traffic increases over 45%-50% from pre-lockdown
570	      levels,

572	   *  A 30% increase in upstream traffic over their pre-pandemic levels,
573	      and

575	   *  A steady increase in the overall volume of DDoS traffic, with
576	      amounts exceeding the pre-pandemic levels by 40%. (He attributed
577	      this increase to the significant rise in gaming-related DDoS
578	      attacks ([LabovitzDDoS]), as gaming usage also increased.)

580	   Subsequently, the Internet Architecture Board (IAB) held a COVID-19
581	   Network Impacts Workshop [IABcovid] in November 2020.  Given a larger
582	   number of reports and more time to reflect, the following
583	   observations from the draft workshop report are worth considering.

585	   *  Participants describing different types of networks reported
586	      different kinds of impacts, but all types of networks saw impacts.

588	   *  Mobile networks saw traffic reductions and residential networks
589	      saw significant increases.

591	   *  Reported traffic increases from ISPs and Internet Exchange Points
592	      (IXP) over just a few weeks were as big as the traffic growth over
593	      the course of a typical year, representing a 15-20% surge in
594	      growth to land at a new normal that was much higher than
595	      anticipated.

597	   *  At DE-CIX Frankfurt, the world's largest Internet Exchange Point
598	      in terms of data throughput, the year 2020 has seen the largest
599	      increase in peak traffic within a single year since the IXP was
600	      founded in 1995.

602	   *  The usage pattern changed significantly as work-from-home and
603	      videoconferencing usage peaked during normal work hours, which
604	      would have typically been off-peak hours with adults at work and
605	      children at school.  One might expect that the peak would have had
606	      more impact on networks if it had happened during typical evening
607	      peak hours for video streaming applications.

609	   *  The increase in daytime bandwidth consumption reflected both
610	      significant increases in "essential" applications such as
611	      videoconferencing and virtual private networks (VPN), and
612	      entertainment applications as people watched videos or played
613	      games.

615	   *  At the IXP level, it was observed that physical link utilization
616	      increased.  This phenomenon could probably be explained by a
617	      higher level of uncacheable traffic such as videoconferencing and
618	      VPNs from residential users as they stopped commuting and switched
619	      to work-at-home.

621	4.  Latency Considerations

623	   Streaming media latency refers to the "glass-to-glass" time duration,
624	   which is the delay between the real-life occurrence of an event and
625	   the streamed media being appropriately displayed on an end user's
626	   device.  Note that this is different from the network latency
627	   (defined as the time for a packet to cross a network from one end to
628	   another end) because it includes video encoding/decoding and
629	   buffering time, and for most cases also ingest to an intermediate
630	   service such as a CDN or other video distribution service, rather
631	   than a direct connection to an end user.

633	   Streaming media can be usefully categorized according to the
634	   application's latency requirements into a few rough categories:

636	   *  ultra low-latency (less than 1 second)

638	   *  low-latency live (less than 10 seconds)

640	   *  non-low-latency live (10 seconds to a few minutes)

642	   *  on-demand (hours or more)

644	4.1.  Ultra Low-Latency

646	   Ultra low-latency delivery of media is defined here as having a
647	   glass-to-glass delay target under one second.

649	   Some media content providers aim to achieve this level of latency for
650	   live media events.  This introduces new challenges relative to less-
651	   restricted levels of latency requirements because this latency is the
652	   same scale as commonly observed end-to-end network latency variation
653	   (for example, due to effects such as bufferbloat ([CoDel]), Wi-Fi
654	   error correction, or packet reordering).  These effects can make it
655	   difficult to achieve this level of latency for the general case, and
656	   may require tradeoffs in relatively frequent user-visible media
657	   artifacts.  However, for controlled environments or targeted networks
658	   that provide mitigations against such effects, this level of latency
659	   is potentially achievable with the right provisioning.

661	   Applications requiring ultra low latency for media delivery are
662	   usually tightly constrained on the available choices for media
663	   transport technologies, and sometimes may need to operate in
664	   controlled environments to reliably achieve their latency and quality
665	   goals.

667	   Most applications operating over IP networks and requiring latency
668	   this low use the Real-time Transport Protocol (RTP) [RFC3550] or
669	   WebRTC [RFC8825], which uses RTP for the media transport as well as
670	   several other protocols necessary for safe operation in browsers.

672	   Worth noting is that many applications for ultra low-latency delivery
673	   do not need to scale to more than a few users at a time, which
674	   simplifies many delivery considerations relative to other use cases.

676	   Recommended reading for applications adopting an RTP-based approach
677	   also includes [RFC7656].  For increasing the robustness of the
678	   playback by implementing adaptive playout methods, refer to [RFC4733]
679	   and [RFC6843].

681	   Applications with further-specialized latency requirements are out of
682	   scope for this document.

684	4.2.  Low-Latency Live

686	   Low-latency live delivery of media is defined here as having a glass-
687	   to-glass delay target under 10 seconds.

689	   This level of latency is targeted to have a user experience similar
690	   to traditional broadcast TV delivery.  A frequently cited problem
691	   with failing to achieve this level of latency for live sporting
692	   events is the user experience failure from having crowds within
693	   earshot of one another who react audibly to an important play, or
694	   from users who learn of an event in the match via some other channel,
695	   for example social media, before it has happened on the screen
696	   showing the sporting event.

698	   Applications requiring low-latency live media delivery are generally
699	   feasible at scale with some restrictions.  This typically requires
700	   the use of a premium service dedicated to the delivery of live video,
701	   and some tradeoffs may be necessary relative to what is feasible in a
702	   higher latency service.  The tradeoffs may include higher costs, or
703	   delivering a lower quality video, or reduced flexibility for adaptive
704	   bitrates, or reduced flexibility for available resolutions so that
705	   fewer devices can receive an encoding tuned for their display.  Low-
706	   latency live delivery is also more susceptible to user-visible
707	   disruptions due to transient network conditions than higher latency
708	   services.

710	   Implementation of a low-latency live video service can be achieved
711	   with the use of low-latency extensions of HLS (called LL-HLS)
712	   [I-D.draft-pantos-hls-rfc8216bis] and DASH (called LL-DASH)
713	   [LL-DASH].  These extensions use the Common Media Application Format
714	   (CMAF) standard [MPEG-CMAF] that allows the media to be packaged into
715	   and transmitted in units smaller than segments, which are called
716	   chunks in CMAF language.  This way, the latency can be decoupled from
717	   the duration of the media segments.  Without a CMAF-like packaging,
718	   lower latencies can only be achieved by using very short segment
719	   durations.  However, shorter segments means more frequent intra-coded
720	   frames and that is detrimental to video encoding quality.  CMAF
721	   allows us to still use longer segments (improving encoding quality)
722	   without penalizing latency.

724	   While a LL-HLS client retrieves each chunk with a separate HTTP GET
725	   request, a LL-DASH client uses the chunked transfer encoding feature
726	   of the HTTP [CMAF-CTE] which allows the LL-DASH client to fetch all
727	   the chunks belonging to a segment with a single GET request.  An HTTP
728	   server can transmit the CMAF chunks to the LL-DASH client as they
729	   arrive from the encoder/packager.  A detailed comparison of LL-HLS
730	   and LL-DASH is given in [MMSP20].

732	4.3.  Non-Low-Latency Live

734	   Non-low-latency live delivery of media is defined here as a live
735	   stream that does not have a latency target shorter than 10 seconds.

737	   This level of latency is the historically common case for segmented
738	   video delivery using HLS [RFC8216] and DASH [MPEG-DASH].  This level
739	   of latency is often considered adequate for content like news or pre-
740	   recorded content.  This level of latency is also sometimes achieved
741	   as a fallback state when some part of the delivery system or the
742	   client-side players do not have the necessary support for the
743	   features necessary to support low-latency live streaming.

745	   This level of latency can typically be achieved at scale with
746	   commodity CDN services for HTTP(s) delivery, and in some cases the
747	   increased time window can allow for production of a wider range of
748	   encoding options relative to the requirements for a lower latency
749	   service without the need for increasing the hardware footprint, which
750	   can allow for wider device interoperability.

752	4.4.  On-Demand

754	   On-Demand media streaming refers to playback of pre-recorded media
755	   based on a user's action.  In some cases on-demand media is produced
756	   as a by-product of a live media production, using the same segments
757	   as the live event, but freezing the manifest after the live event has
758	   finished.  In other cases, on-demand media is constructed out of pre-
759	   recorded assets with no streaming necessarily involved during the
760	   production of the on-demand content.

762	   On-demand media generally is not subject to latency concerns, but
763	   other timing-related considerations can still be as important or even
764	   more important to the user experience than the same considerations
765	   with live events.  These considerations include the startup time, the
766	   stability of the media stream's playback quality, and avoidance of
767	   stalls and video artifacts during the playback under all but the most
768	   severe network conditions.

770	   In some applications, optimizations are available to on-demand video
771	   that are not always available to live events, such as pre-loading the
772	   first segment for a startup time that doesn't have to wait for a
773	   network download to begin.

775	5.  Adaptive Encoding, Adaptive Delivery, and Measurement Collection

777	5.1.  Overview

779	   A simple model of video playback can be described as a video stream
780	   consumer, a buffer, and a transport mechanism that fills the buffer.
781	   The consumption rate is fairly static and is represented by the
782	   content bitrate.  The size of the buffer is also commonly a fixed
783	   size.  The fill process needs to be at least fast enough to ensure
784	   that the buffer is never empty, however it also can have significant
785	   complexity when things like personalization or ad workflows are
786	   introduced.

788	   The challenges in filling the buffer in a timely way fall into two
789	   broad categories: 1. content selection and 2. content variation.
790	   Content selection comprises all of the steps needed to determine
791	   which content variation to offer the client.  Content variation is
792	   the number of content options that exist at any given selection
793	   point.  A common example, easily visualized, is Adaptive BitRate
794	   (ABR), described in more detail below.  The mechanism used to select
795	   the bitrate is part of the content selection, and the content
796	   variation are all of the different bitrate renditions.

798	   ABR is a sort of application-level response strategy in which the
799	   streaming client attempts to detect the available bandwidth of the
800	   network path by observing the successful application-layer download
801	   speed, then chooses a bitrate for each of the video, audio, subtitles
802	   and metadata (among the limited number of available options) that
803	   fits within that bandwidth, typically adjusting as changes in
804	   available bandwidth occur in the network or changes in capabilities
805	   occur during the playback (such as available memory, CPU, display
806	   size, etc.).

808	5.2.  Adaptive Encoding

810	   Media servers can provide media streams at various bitrates because
811	   the media has been encoded at various bitrates.  This is a so-called
812	   "ladder" of bitrates, that can be offered to media players as part of
813	   the manifest that describes the media being requested by the media
814	   player, so that the media player can select among the available
815	   bitrate choices.

817	   The media server may also choose to alter which bitrates are made
818	   available to players by adding or removing bitrate options from the
819	   ladder delivered to the player in subsequent manifests built and sent
820	   to the player.  This way, both the player, through its selection of
821	   bitrate to request from the manifest, and the server, through its
822	   construction of the bitrates offered in the manifest, are able to
823	   affect network utilization.

825	5.3.  Adaptive Segmented Delivery

827	   ABR playback is commonly implemented by streaming clients using HLS
828	   [RFC8216] or DASH [MPEG-DASH] to perform a reliable segmented
829	   delivery of media over HTTP.  Different implementations use different
830	   strategies [ABRSurvey], often relying on proprietary algorithms
831	   (called rate adaptation or bitrate selection algorithms) to perform
832	   available bandwidth estimation/prediction and the bitrate selection.

834	   Many systems will do an initial probe or a very simple throughput
835	   speed test at the start of a video playback.  This is done to get a
836	   rough sense of the highest video bitrate in the ABR ladder that the
837	   network between the server and player will likely be able to provide
838	   under initial network conditions.  After the initial testing, clients
839	   tend to rely upon passive network observations and will make use of
840	   player side statistics such as buffer fill rates to monitor and
841	   respond to changing network conditions.

843	   The choice of bitrate occurs within the context of optimizing for one
844	   or more metrics monitored by the client, such as highest achievable
845	   video quality or lowest chances for a rebuffering event (playback
846	   stall).

848	5.4.  Advertising

850	   A variety of business models exist for producers of streaming media.
851	   Some content providers derive the majority of the revenue associated
852	   with streaming media directly from consumer subscriptions or one-time
853	   purchases.  Others derive the majority of their streaming media
854	   associated revenue from advertising.  Many content providers derive
855	   income from a mix of these and other sources of funding.  The
856	   inclusion of advertising alongside or interspersed with streaming
857	   media content is therefore common in today's media landscape.

859	   Some commonly used forms of advertising can introduce potential user
860	   experience issues for a media stream.  This section provides a very
861	   brief overview of a complex and evolving space, but a complete
862	   coverage of the potential issues is out of scope for this document.

864	   The same techniques used to allow a media player to switch between
865	   renditions of different bitrates at segment or chunk boundaries can
866	   also be used to enable the dynamic insertion of advertisements
867	   (herafter referred to as "ads").

869	   Ads may be inserted either with Client Side Ad Insertion (CSAI) or
870	   Server Side Ad Insertion (SSAI).  In CSAI, the ABR manifest will
871	   generally include links to an external ad server for some segments of
872	   the media stream, while in SSAI the server will remain the same
873	   during advertisements, but will include media segments that contain
874	   the advertising.  In SSAI, the media segments may or may not be
875	   sourced from an external ad server like with CSAI.

877	   In general, the more targeted the ad request is, the more requests
878	   the ad service needs to be able to handle concurrently.  If
879	   connectivity is poor to the ad service, this can cause rebuffering
880	   even if the underlying video assets (both content and ads) are able
881	   to be accessed quickly.  The less targeted, the more likely the ad
882	   requests can be consolidated and can leverage the same caching
883	   techniques as the video content.

885	   In some cases, especially with SSAI, advertising space in a stream is
886	   reserved for a specific advertiser and can be integrated with the
887	   video so that the segments share the same encoding properties such as
888	   bitrate, dynamic range, and resolution.  However, in many cases ad
889	   servers integrate with a Supply Side Platform (SSP) that offers
890	   advertising space in real-time auctions via an Ad Exchange, with bids
891	   for the advertising space coming from Demand Side Platforms (DSPs)
892	   that collect money from advertisers for delivering the
893	   advertisements.  Most such Ad Exchanges use application-level
894	   protocol specifications published by the Interactive Advertising
895	   Bureau [IAB-ADS], an industry trade organization.

897	   This ecosystem balances several competing objectives, and integrating
898	   with it naively can produce surprising user experience results.  For
899	   example, ad server provisioning and/or the bitrate of the ad segments
900	   might be different from that of the main video, either of which can
901	   sometimes result in video stalls.  For another example, since the
902	   inserted ads are often produced independently they might have a
903	   different base volume level than the main video, which can make for a
904	   jarring user experience.

906	   Additionally, this market historically has had incidents of ad fraud
907	   (misreporting of ad delivery to end users for financial gain).  As a
908	   mitigation for concerns driven by those incidents, some SSPs have
909	   required the use of players with features like reporting of ad
910	   delivery, or providing information that can be used for user
911	   tracking.  Some of these and other measures have raised privacy
912	   concerns for end users.

914	   In general this is a rapidly developing space with many
915	   considerations, and media streaming operators engaged in advertising
916	   may need to research these and other concerns to find solutions that
917	   meet their user experience, user privacy, and financial goals.  For
918	   further reading on mitigations, [BAP] has published some standards
919	   and best practices based on user experience research.

921	5.5.  Bitrate Detection Challenges

923	   This kind of bandwidth-measurement system can experience trouble in
924	   several ways that are affected by networking and transport protocol
925	   issues.  Because adaptive application-level response strategies are
926	   often using rates as observed by the application layer, there are
927	   sometimes inscrutable transport-level protocol behaviors that can
928	   produce surprising measurement values when the application-level
929	   feedback loop is interacting with a transport-level feedback loop.

931	   A few specific examples of surprising phenomena that affect bitrate
932	   detection measurements are described in the following subsections.
933	   As these examples will demonstrate, it is common to encounter cases
934	   that can deliver application level measurements that are too low, too
935	   high, and (possibly) correct but varying more quickly than a lab-
936	   tested selection algorithm might expect.

938	   These effects and others that cause transport behavior to diverge
939	   from lab modeling can sometimes have a significant impact on bitrate
940	   selection and on user QOE, especially where players use naive
941	   measurement strategies and selection algorithms that don't account
942	   for the likelihood of bandwidth measurements that diverge from the
943	   true path capacity.

945	5.5.1.  Idle Time between Segments

947	   When the bitrate selection is chosen substantially below the
948	   available capacity of the network path, the response to a segment
949	   request will typically complete in much less absolute time than the
950	   duration of the requested segment, leaving significant idle time
951	   between segment downloads.  This can have a few surprising
952	   consequences:

954	   *  TCP slow-start when restarting after idle requires multiple RTTs
955	      to re-establish a throughput at the network's available capacity.
956	      When the active transmission time for segments is substantially
957	      shorter than the time between segments, leaving an idle gap
958	      between segments that triggers a restart of TCP slow-start, the
959	      estimate of the successful download speed coming from the
960	      application-visible receive rate on the socket can thus end up
961	      much lower than the actual available network capacity.  This in
962	      turn can prevent a shift to the most appropriate bitrate.
963	      [RFC7661] provides some mitigations for this effect at the TCP
964	      transport layer, for senders who anticipate a high incidence of
965	      this problem.

967	   *  Mobile flow-bandwidth spectrum and timing mapping can be impacted
968	      by idle time in some networks.  The carrier capacity assigned to a
969	      link can vary with activity.  Depending on the idle time
970	      characteristics, this can result in a lower available bitrate than
971	      would be achievable with a steadier transmission in the same
972	      network.

974	   Some receiver-side ABR algorithms such as [ELASTIC] are designed to
975	   try to avoid this effect.

977	   Another way to mitigate this effect is by the help of two
978	   simultaneous TCP connections, as explained in [MMSys11] for Microsoft
979	   Smooth Streaming.  In some cases, the system-level TCP slow-start
980	   restart can also be disabled, for example as described in
981	   [OReilly-HPBN].

983	5.5.2.  Head-of-Line Blocking

985	   In the event of a lost packet on a TCP connection with SACK support
986	   (a common case for segmented delivery in practice), loss of a packet
987	   can provide a confusing bandwidth signal to the receiving
988	   application.  Because of the sliding window in TCP, many packets may
989	   be accepted by the receiver without being available to the
990	   application until the missing packet arrives.  Upon arrival of the
991	   one missing packet after retransmit, the receiver will suddenly get
992	   access to a lot of data at the same time.

994	   To a receiver measuring bytes received per unit time at the
995	   application layer, and interpreting it as an estimate of the
996	   available network bandwidth, this appears as a high jitter in the
997	   goodput measurement, presenting as a stall, followed by a sudden leap
998	   that can far exceed the actual capacity of the transport path from
999	   the server when the hole in the received data is filled by a later
1000	   retransmission.

1002	   It is worth noting that more modern transport protocols such as QUIC
1003	   have mitigation of head-of-line blocking as a protocol design goal.
1004	   See Section 6.3 for more details.

1006	5.5.3.  Wide and Rapid Variation in Path Capacity

1008	   As many end devices have moved to wireless connectivity for the final
1009	   hop (Wi-Fi, 5G, or LTE), new problems in bandwidth detction have
1010	   emerged from radio interference and signal strength effects.

1012	   Each of these technologies can experience sudden changes in capacity
1013	   as the end user device moves from place to place and encounters new
1014	   sources of interference.  Microwave ovens, for example, can cause a
1015	   throughput degradation of more than a factor of 2 while active
1016	   [Micro]. 5G and LTE likewise can easily see rate variation by a
1017	   factor of 2 or more over a span of seconds as users move around.

1019	   These swings in actual transport capacity can result in user
1020	   experience issues that can be exacerbated by insufficiently
1021	   responsive ABR algorithms.

1023	5.6.  Measurement Collection

1025	   Media players use measurements to guide their segment-by-segment
1026	   adaptive streaming requests, but may also provide measurements to
1027	   streaming media providers.

1029	   In turn, providers may base analytics on these measurements, to guide
1030	   decisions such as whether adaptive encoding bitrates in use are the
1031	   best ones to provide to media players, or whether current media
1032	   content caching is providing the best experience for viewers.

1034	   To that effect, the Consumer Technology Association (CTA) who owns
1035	   the Web Application Video Ecosystem (WAVE) project has published two
1036	   important specifications.

1038	   *  CTA-2066: Streaming Quality of Experience Events, Properties and
1039	      Metrics

1041	   [CTA-2066] specifies a set of media player events, properties, QOE
1042	   metrics and associated terminology for representing streaming media
1043	   QOE across systems, media players and analytics vendors.  While all
1044	   these events, properties, metrics and associated terminology is used
1045	   across a number of proprietary analytics and measurement solutions,
1046	   they were used in slightly (or vastly) different ways that led to
1047	   interoperability issues.  CTA-2066 attempts to address this issue by
1048	   defining a common terminology as well as how each metric should be
1049	   computed for consistent reporting.

1051	   *  CTA-5004: Common Media Client Data (CMCD)

1053	   Many assume that the CDNs have a holistic view into the health and
1054	   performance of the streaming clients.  However, this is not the case.
1055	   The CDNs produce millions of log lines per second across hundreds of
1056	   thousands of clients and they have no concept of a "session" as a
1057	   client would have, so CDNs are decoupled from the metrics the clients
1058	   generate and report.  A CDN cannot tell which request belongs to
1059	   which playback session, the duration of any media object, the
1060	   bitrate, or whether any of the clients have stalled and are
1061	   rebuffering or are about to stall and will rebuffer.  The consequence
1062	   of this decoupling is that a CDN cannot prioritize delivery for when
1063	   the client needs it most, prefetch content, or trigger alerts when
1064	   the network itself may be underperforming.  One approach to couple
1065	   the CDN to the playback sessions is for the clients to communicate
1066	   standardized media-relevant information to the CDNs while they are
1067	   fetching data.  [CTA-5004] was developed exactly for this purpose.

1069	6.  Evolution of Transport Protocols and Transport Protocol Behaviors

1071	   Because networking resources are shared between users, a good place
1072	   to start our discussion is how contention between users, and
1073	   mechanisms to resolve that contention in ways that are "fair" between
1074	   users, impact streaming media users.  These topics are closely tied
1075	   to transport protocol behaviors.

1077	   As noted in Section 5, ABR response strategies such as HLS [RFC8216]
1078	   or DASH [MPEG-DASH] are attempting to respond to changing path
1079	   characteristics, and underlying transport protocols are also
1080	   attempting to respond to changing path characteristics.

1082	   For most of the history of the Internet, these transport protocols,
1083	   described in Section 6.1 and Section 6.2, have had relatively
1084	   consistent behaviors that have changed slowly, if at all, over time.
1085	   Newly standardized transport protocols like QUIC [RFC9000] can behave
1086	   differently from existing transport protocols, and these behaviors
1087	   may evolve over time more rapidly than currently-used transport
1088	   protocols.

1090	   For this reason, we have included a description of how the path
1091	   characteristics that streaming media providers may see are likely to
1092	   evolve over time.

1094	6.1.  UDP and Its Behavior

1096	   For most of the history of the Internet, we have trusted UDP-based
1097	   applications to limit their impact on other users.  One of the
1098	   strategies used was to use UDP for simple query-response application
1099	   protocols, such as DNS, which is often used to send a single-packet
1100	   request to look up the IP address for a DNS name, and return a
1101	   single-packet response containing the IP address.  Although it is
1102	   possible to saturate a path between a DNS client and DNS server with
1103	   DNS requests, in practice, that was rare enough that DNS included few
1104	   mechanisms to resolve contention between DNS users and other users
1105	   (whether they are also using DNS, or using other application
1106	   protocols that share the same pathways).

1108	   In recent times, the usage of UDP-based applications that were not
1109	   simple query-response protocols has grown substantially, and since
1110	   UDP does not provide any feedback mechanism to senders to help limit
1111	   impacts on other users, application-level protocols such as RTP
1112	   [RFC3550] have been responsible for the decisions that TCP-based
1113	   applications have delegated to TCP - what to send, how much to send,
1114	   and when to send it.  Because UDP itself has no transport-layer
1115	   feedback mechanisms, UDP-based applications that send and receive
1116	   substantial amounts of information are expected to provide their own
1117	   feedback mechanisms, and to respond to the feedback the application
1118	   receives.  This expectation is most recently codified in Best Current
1119	   Practice [RFC8085].

1121	   In contrast to adaptive segmented delivery over a reliable tansport
1122	   as described in Section 5.3, some applications deliver streaming
1123	   media using an unreliable transport, and rely on a variety of
1124	   approaches, including:

1126	   *  raw MPEG Transport Stream ("MPEG-TS")-formatted video [MPEG-TS]
1127	      over UDP, which makes no attempt to account for reordering or loss
1128	      in the transport,

1130	   *  RTP [RFC3550], which can notice loss and repair some limited
1131	      reordering,

1133	   *  SCTP [RFC4960], which can use partial reliability [RFC3758] to
1134	      recover from some loss, but can abandon recovery to limit head-of-
1135	      line blocking, and

1137	   *  SRT [SRT], which can use forward error correction and time-bound
1138	      retransmission to recover from loss within certain limits, but can
1139	      abandon recovery to limit head-of-line blocking.

1141	   Under congestion and loss, approaches like the above generally
1142	   experiences transient video artifacts more often and delay of
1143	   playback effects less often, as compared with reliable segment
1144	   transport.  Often one of the key goals of using a UDP-based transport
1145	   that allows some unreliability is to reduce latency and better
1146	   support applications like videoconferencing, or for other live-action
1147	   video with interactive components, such as some sporting events.

1149	   Congestion avoidance strategies for deployments using unreliable
1150	   transport protocols vary widely in practice, ranging from being
1151	   entirely unresponsive to congestion, to using feedback signaling to
1152	   change encoder settings (as in [RFC5762]), to using fewer enhancement
1153	   layers (as in [RFC6190]), to using proprietary methods to detect QOE
1154	   issues and turn off video in order to allow less bandwidth-intensive
1155	   media such as audio to be delivered.

1157	   RTP relies on RTCP Sender and Receiver Reports [RFC3550] as its own
1158	   feedback mechanism, and even includes Circuit Breakers for Unicast
1159	   RTP Sessions [RFC8083] for situations when normal RTP congestion
1160	   control has not been able to react sufficiently to RTP flows sending
1161	   at rates that result in sustained packet loss.

1163	   The notion of "Circuit Breakers" has also been applied to other UDP
1164	   applications in [RFC8084], such as tunneling packets over UDP that
1165	   are potentially not congestion-controlled (for example,
1166	   "Encapsulating MPLS in UDP", as described in [RFC7510]).  If
1167	   streaming media is carried in tunnels encapsulated in UDP, these
1168	   media streams may encounter "tripped circuit breakers", with
1169	   resulting user-visible impacts.

1171	6.2.  TCP and Its Behavior

1173	   For most of the history of the Internet, we have trusted TCP to limit
1174	   the impact of applications that sent a significant number of packets,
1175	   in either or both directions, on other users.  Although early
1176	   versions of TCP were not particularly good at limiting this impact
1177	   [RFC0793], the addition of Slow Start and Congestion Avoidance, as
1178	   described in [RFC2001], were critical in allowing TCP-based
1179	   applications to "use as much bandwidth as possible, but to avoid
1180	   using more bandwidth than was possible".  Although dozens of RFCs
1181	   have been written refining TCP decisions about what to send, how much
1182	   to send, and when to send it, since 1988 [Jacobson-Karels] the
1183	   signals available for TCP senders remained unchanged - end-to-end
1184	   acknowledgements for packets that were successfully sent and
1185	   received, and packet timeouts for packets that were not.

1187	   The success of the largely TCP-based Internet is evidence that the
1188	   mechanisms TCP used to achieve equilibrium quickly, at a point where
1189	   TCP senders do not interfere with other TCP senders for sustained
1190	   periods of time, have been largely successful.  The Internet
1191	   continued to work even when the specific mechanisms used to reach
1192	   equilibrium changed over time.  Because TCP provides a common tool to
1193	   avoid contention, as some TCP-based applications like FTP were
1194	   largely replaced by other TCP-based applications like HTTP, the
1195	   transport behavior remained consistent.

1197	   In recent times, the TCP goal of probing for available bandwidth, and
1198	   "backing off" when a network path is saturated, has been supplanted
1199	   by the goal of avoiding growing queues along network paths, which
1200	   prevent TCP senders from reacting quickly when a network path is
1201	   saturated.  Congestion control mechanisms such as COPA [COPA18] and
1202	   BBR [I-D.cardwell-iccrg-bbr-congestion-control] make these decisions
1203	   based on measured path delays, assuming that if the measured path
1204	   delay is increasing, the sender is injecting packets onto the network
1205	   path faster than the receiver can accept them, so the sender should
1206	   adjust its sending rate accordingly.

1208	   Although TCP behavior has changed over time, the common practice of
1209	   implementing TCP as part of an operating system kernel has acted to
1210	   limit how quickly TCP behavior can change.  Even with the widespread
1211	   use of automated operating system update installation on many end-
1212	   user systems, streaming media providers could have a reasonable
1213	   expectation that they could understand TCP transport protocol
1214	   behaviors, and that those behaviors would remain relatively stable in
1215	   the short term.

1217	6.3.  QUIC and Its Behavior

1219	   The QUIC protocol, developed from a proprietary protocol into an IETF
1220	   standards-track protocol [RFC9000], turns many of the statements made
1221	   in Section 6.1 and Section 6.2 on their heads.

1223	   Although QUIC provides an alternative to the TCP and UDP transport
1224	   protocols, QUIC is itself encapsulated in UDP.  As noted elsewhere in
1225	   Section 7.1, the QUIC protocol encrypts almost all of its transport
1226	   parameters, and all of its payload, so any intermediaries that
1227	   network operators may be using to troubleshoot HTTP streaming media
1228	   performance issues, perform analytics, or even intercept exchanges in
1229	   current applications will not work for QUIC-based applications
1230	   without making changes to their networks.  Section 7 describes the
1231	   implications of media encryption in more detail.

1233	   While QUIC is designed as a general-purpose transport protocol, and
1234	   can carry different application-layer protocols, the current
1235	   standardized mapping is for HTTP/3 [I-D.ietf-quic-http], which
1236	   describes how QUIC transport features are used for HTTP.  The
1237	   convention is for HTTP/3 to run over UDP port 443 [Port443] but this
1238	   is not a strict requirement.

1240	   When HTTP/3 is encapsulated in QUIC, which is then encapsulated in
1241	   UDP, streaming operators (and network operators) might see UDP
1242	   traffic patterns that are similar to HTTP(S) over TCP.  Since earlier
1243	   versions of HTTP(S) rely on TCP, UDP ports may be blocked for any
1244	   port numbers that are not commonly used, such as UDP 53 for DNS.
1245	   Even when UDP ports are not blocked and HTTP/3 can flow, streaming
1246	   operators (and network operators) may severely rate-limit this
1247	   traffic because they do not expect to see legitimate high-bandwidth
1248	   traffic such as streaming media over the UDP ports that HTTP/3 is
1249	   using.

1251	   As noted in Section 5.5.2, because TCP provides a reliable, in-order
1252	   delivery service for applications, any packet loss for a TCP
1253	   connection causes "head-of-line blocking", so that no TCP segments
1254	   arriving after a packet is lost will be delivered to the receiving
1255	   application until the lost packet is retransmitted, allowing in-order
1256	   delivery to the application to continue.  As described in [RFC9000],
1257	   QUIC connections can carry multiple streams, and when packet losses
1258	   do occur, only the streams carried in the lost packet are delayed.

1260	   A QUIC extension currently being specified ([I-D.ietf-quic-datagram])
1261	   adds the capability for "unreliable" delivery, similar to the service
1262	   provided by UDP, but these datagrams are still subject to the QUIC
1263	   connection's congestion controller, providing some transport-level
1264	   congestion avoidance measures, which UDP does not.

1266	   As noted in Section 6.2, there is an increasing interest in transport
1267	   protocol behaviors that respond to delay measurements, instead of
1268	   responding to packet loss.  These behaviors may deliver improved user
1269	   experience, but in some cases have not responded to sustained packet
1270	   loss, which exhausts available buffers along the end-to-end path that
1271	   may affect other users sharing that path.  The QUIC protocol provides
1272	   a set of congestion control hooks that can be used for algorithm
1273	   agility, and [RFC9002] defines a basic algorithm with transport
1274	   behavior that is roughly similar to TCP NewReno [RFC6582].  However,
1275	   QUIC senders can and do unilaterally choose to use different
1276	   algorithms such as loss-based CUBIC [RFC8312], delay-based COPA or
1277	   BBR, or even something completely different.

1279	   The Internet community does have experience with deploying new
1280	   congestion controllers without melting the Internet.  As noted in
1281	   [RFC8312], both the CUBIC congestion controller and its predecessor
1282	   BIC have significantly different behavior from Reno-style congestion
1283	   controllers such as TCP NewReno [RFC6582], but both CUBIC and BIC
1284	   were added to the Linux kernel in order to allow experimentation and
1285	   analysis, and both were then selected as the default TCP congestion
1286	   controllers in Linux, and both were deployed globally.

1288	   The point mentioned in Section 6.2 about TCP congestion controllers
1289	   being implemented in operating system kernels is different with QUIC.
1290	   Although QUIC can be implemented in operating system kernels, one of
1291	   the design goals when this work was chartered was "QUIC is expected
1292	   to support rapid, distributed development and testing of features",
1293	   and to meet this expectation, many implementers have chosen to
1294	   implement QUIC in user space, outside the operating system kernel,
1295	   and to even distribute QUIC libraries with their own applications.
1296	   It is worth noting that streaming operators using HTTP/3, carried
1297	   over QUIC, can expect more frequent deployment of new congestion
1298	   controller behavior than has been the case with HTTP/1 and HTTP/2,
1299	   carried over TCP.

1301	   It is worth considering that if TCP-based HTTP traffic and UDP-based
1302	   HTTP/3 traffic are allowed to enter operator networks on roughly
1303	   equal terms, questions of fairness and contention will be heavily
1304	   dependent on interactions between the congestion controllers in use
1305	   for TCP-based HTTP traffic and UDP-based HTTP/3 traffic.

1307	7.  Streaming Encrypted Media

1309	   "Encrypted Media" has at least three meanings:

1311	   *  Media encrypted at the application layer, typically using some
1312	      sort of Digital Rights Management (DRM) system, and typically
1313	      remaining encrypted "at rest", when senders and receivers store
1314	      it.

1316	   *  Media encrypted by the sender at the transport layer, and
1317	      remaining encrypted until it reaches the ultimate media consumer
1318	      (in this document, referred to as "end-to-end media encryption").

1320	   *  Media encrypted by the sender at the transport layer, and
1321	      remaining encrypted until it reaches some intermediary that is
1322	      _not_ the ultimate media consumer, but has credentials allowing
1323	      decryption of the media content.  This intermediary may examine
1324	      and even transform the media content in some way, before
1325	      forwarding re-encrypted media content (in this document referred
1326	      to as "hop-by-hop media encryption").

1328	   In this document, we will focus on media encrypted at the transport
1329	   layer, whether encrypted "hop-by-hop" or "end-to-end".  Because media
1330	   encrypted at the application layer will only be processed by
1331	   application-level entities, this encryption does not have transport-
1332	   layer implications.  Of course, both "hop-by-hop" and "end-to-end"
1333	   encrypted transport may carry media that is, in addition, encrypted
1334	   at the application layer.

1336	   Each of these encryption strategies is intended to achieve a
1337	   different goal.  For instance, application-level encryption may be
1338	   used for business purposes, such as avoiding piracy or enforcing
1339	   geographic restrictions on playback, while transport-layer encryption
1340	   may be used to prevent media steam manipulation or to protect
1341	   manifests.

1343	   This document does not take a position on whether those goals are
1344	   "valid" (whatever that might mean).

1346	   Both "end-to-end" and "hop-by-hop" media encryption have specific
1347	   implications for streaming operators.  These are described in
1348	   Section 7.2 and Section 7.3.

1350	7.1.  General Considerations for Media Encryption

1352	   The use of strong encryption does provide confidentiality for
1353	   encrypted streaming media, from the sender to either an intermediary
1354	   or the ultimate media consumer, and this does prevent Deep Packet
1355	   Inspection by any intermediary that does not possess credentials
1356	   allowing decryption.  However, even encrypted content streams may be
1357	   vulnerable to traffic analysis.  An intermediary that can identify an
1358	   encrypted media stream without decrypting it, may be able to
1359	   "fingerprint" the encrypted media stream of known content, and then
1360	   match the targeted media stream against the fingerprints of known
1361	   content.  This protection can be lessened if a media provider is
1362	   repeatedly encrypting the same content.  [CODASPY17] is an example of
1363	   what is possible when identifying HTTPS-protected videos over TCP
1364	   transport, based either on the length of entire resources being
1365	   transferred, or on characteristic packet patterns at the beginning of
1366	   a resource being transferred.

1368	   If traffic analysis is successful at identifying encrypted content
1369	   and associating it with specific users, this breaks privacy as
1370	   certainly as examining decrypted traffic.

1372	   Because HTTPS has historically layered HTTP on top of TLS, which is
1373	   in turn layered on top of TCP, intermediaries do have access to
1374	   unencrypted TCP-level transport information, such as retransmissions,
1375	   and some carriers exploited this information in attempts to improve
1376	   transport-layer performance [RFC3135].  The most recent standardized
1377	   version of HTTPS, HTTP/3 [I-D.ietf-quic-http], uses the QUIC protocol
1378	   [RFC9000] as its transport layer.  QUIC relies on the TLS 1.3 initial
1379	   handshake [RFC8446] only for key exchange [RFC9001], and encrypts
1380	   almost all transport parameters itself, with the exception of a few
1381	   invariant header fields.  In the QUIC short header, the only
1382	   transport-level parameter which is sent "in the clear" is the
1383	   Destination Connection ID [RFC8999], and even in the QUIC long
1384	   header, the only transport-level parameters sent "in the clear" are
1385	   the Version, Destination Connection ID, and Source Connection ID.
1386	   For these reasons, HTTP/3 is significantly more "opaque" than HTTPS
1387	   with HTTP/1 or HTTP/2.

1389	   [I-D.ietf-quic-manageability] discusses manageability of the QUIC
1390	   transport protocol that is used to encapsulate HTTP/3, focusing on
1391	   the implications of QUIC's design and wire image on network
1392	   operations involving QUIC traffic.  It discusses what network
1393	   operators can consider in some detail.

1395	   More broadly, RFC 9065 [RFC9065], "Considerations around Transport
1396	   Header Confidentiality, Network Operations, and the Evolution of
1397	   Internet Transport Protocols" describes the impact of increased
1398	   encryption of transport headers in general terms.

1400	7.2.  Considerations for "Hop-by-Hop" Media Encryption

1402	   Although the IETF has put considerable emphasis on end-to-end
1403	   streaming media encryption, there are still important use cases that
1404	   require the insertion of intermediaries.

1406	   There are a variety of ways to involve intermediaries, and some are
1407	   much more intrusive than others.

1409	   From a content provider's perspective, a number of considerations are
1410	   in play.  The first question is likely whether the content provider
1411	   intends that intermediaries are explicitly addressed from endpoints,
1412	   or whether the content provider is willing to allow intermediaries to
1413	   "intercept" streaming content transparently, with no awareness or
1414	   permission from either endpoint.

1416	   If a content provider does not actively work to avoid interception by
1417	   intermediaries, the effect will be indistinguishable from
1418	   "impersonation attacks", and endpoints cannot be assumed of any level
1419	   of privacy.

1421	   Assuming that a content provider does intend to allow intermediaries
1422	   to participate in content streaming, and does intend to provide some
1423	   level of privacy for endpoints, there are a number of possible tools,
1424	   either already available or still being specified.  These include

1426	   *  Server And Network assisted DASH [MPEG-DASH-SAND] - this
1427	      specification introduces explicit messaging between DASH clients
1428	      and network elements or between various network elements for the
1429	      purpose of improving the efficiency of streaming sessions by
1430	      providing information about real-time operational characteristics
1431	      of networks, servers, proxies, caches, CDNs, as well as DASH
1432	      client's performance and status.

1434	   *  "Double Encryption Procedures for the Secure Real-Time Transport
1435	      Protocol (SRTP)" [RFC8723] - this specification provides a
1436	      cryptographic transform for the Secure Real-time Transport
1437	      Protocol that provides both hop-by-hop and end-to-end security
1438	      guarantees.

1440	   *  Secure Media Frames [SFRAME] - [RFC8723] is closely tied to SRTP,
1441	      and this close association impeded widespread deployment, because
1442	      it could not be used for the most common media content delivery
1443	      mechanisms.  A more recent proposal, Secure Media Frames [SFRAME],
1444	      also provides both hop-by-hop and end-to-end security guarantees,
1445	      but can be used with other transport protocols beyond SRTP.

1447	   The choice of whether to involve intermediaries sometimes requires
1448	   careful consideration.  As an example, when ABR manifests were
1449	   commonly sent unencrypted some networks would modify manifests during
1450	   peak hours by removing high-bitrate renditions in order to prevent
1451	   players from choosing those renditions, thus reducing the overall
1452	   bandwidth consumed for delivering these media streams and thereby
1453	   improving the network load and the user experience for their
1454	   customers.  Now that ubiquitous encryption typically prevents this
1455	   kind of modification, in order to maintain the same level of network
1456	   health and user experience across networks whose users would have
1457	   benefitted from this intervention a media streaming operator
1458	   sometimes needs to choose between adding intermediaries who are
1459	   authorized to change the manifests or adding significant extra
1460	   complexity to their service.

1462	   Some resources that might inform other similar considerations are
1463	   further discussed in [RFC8824] (for WebRTC) and
1464	   [I-D.ietf-quic-manageability] (for HTTP/3 and QUIC).

1466	7.3.  Considerations for "End-to-End" Media Encryption

1468	   "End-to-end" media encryption offers the potential of providing
1469	   privacy for streaming media consumers, with the idea being that if an
1470	   unauthorized intermediary can't decrypt streaming media, the
1471	   intermediary can't use Deep Packet Inspection to examine HTTP request
1472	   and response headers and identify the media content being streamed.

1474	   "End-to-end" media encryption has become much more widespread in the
1475	   years since the IETF issued "Pervasive Monitoring Is an Attack"
1476	   [RFC7258] as a Best Current Practice, describing pervasive monitoring
1477	   as a much greater threat than previously appreciated.  After the
1478	   Snowden disclosures, many content providers made the decision to use
1479	   HTTPS protection - HTTP over TLS - for most or all content being
1480	   delivered as a routine practice, rather than in exceptional cases for
1481	   content that was considered "sensitive".

1483	   Unfortunately, as noted in [RFC7258], there is no way to prevent
1484	   pervasive monitoring by an "attacker", while allowing monitoring by a
1485	   more benign entity who "only" wants to use DPI to examine HTTP
1486	   requests and responses in order to provide a better user experience.
1487	   If a modern encrypted transport protocol is used for end-to-end media
1488	   encryption, intermediary streaming operators are unable to examine
1489	   transport and application protocol behavior.  As described in
1490	   Section 7.2, only an intermediary streaming operator who is
1491	   explicitly authorized to examine packet payloads, rather than
1492	   intercepting packets and examining them without authorization, can
1493	   continue these practices.

1495	   [RFC7258] said that "The IETF will strive to produce specifications
1496	   that mitigate pervasive monitoring attacks", so streaming operators
1497	   should expect the IETF's direction toward preventing unauthorized
1498	   monitoring of IETF protocols to continue for the forseeable future.

1500	8.  Further Reading and References

1502	   The Media Operations community maintains a list of references and
1503	   resources for further reading at this location:

1505	   *  https://github.com/ietf-wg-mops/draft-ietf-mops-streaming-
1506	      opcons/blob/main/living-doc-mops-streaming-opcons.md
1507	      (https://github.com/ietf-wg-mops/draft-ietf-mops-streaming-
1508	      opcons/blob/main/living-doc-mops-streaming-opcons.md)

1510	   Editor's note: The link above might or might not be changed during
1511	   IESG Evaluation.  See https://github.com/ietf-wg-mops/draft-ietf-
1512	   mops-streaming-opcons/issues/114 (https://github.com/ietf-wg-mops/
1513	   draft-ietf-mops-streaming-opcons/issues/114) for updates.

1515	9.  IANA Considerations

1517	   This document requires no actions from IANA.

1519	10.  Security Considerations

1521	   Security is an important matter for streaming media applications and
1522	   it was briefly touched on in Section 7.1.  This document itself
1523	   introduces no new security issues.

1525	11.  Acknowledgments

1527	   Thanks to Alexandre Gouaillard, Aaron Falk, Chris Lemmons, Dave Oran,
1528	   Eric Vyncke, Glenn Deen, Kyle Rose, Leslie Daigle, Lucas Pardue, Mark
1529	   Nottingham, Matt Stock, Mike English, Renan Krishna, Roni Even,
1530	   Sanjay Mishra, and Will Law for very helpful suggestions, reviews and
1531	   comments.

1533	12.  Informative References

1535	   [ABRSurvey]
1536	              Taani, B., Begen, A. C., Timmerer, C., Zimmermann, R., and
1537	              A. Bentaleb et al, "A Survey on Bitrate Adaptation Schemes
1538	              for Streaming Media Over HTTP", IEEE Communications
1539	              Surveys & Tutorials , 2019,
1540	              <https://ieeexplore.ieee.org/abstract/document/8424813>.

1542	   [BAP]      "The Coalition for Better Ads", n.d.,
1543	              <https://www.betterads.org/>.

1545	   [CMAF-CTE] Law, W., "Ultra-Low-Latency Streaming Using Chunked-
1546	              Encoded and Chunked Transferred CMAF", October 2018,
1547	              <https://www.akamai.com/us/en/multimedia/documents/white-
1548	              paper/low-latency-streaming-cmaf-whitepaper.pdf>.

1550	   [CODASPY17]
1551	              Reed, A. and M. Kranch, "Identifying HTTPS-Protected
1552	              Netflix Videos in Real-Time", ACM CODASPY , March 2017,
1553	              <https://dl.acm.org/doi/10.1145/3029806.3029821>.

1555	   [CoDel]    Nichols, K. and V. Jacobson, "Controlling Queue Delay",
1556	              Communications of the ACM, Volume 55, Issue 7, pp. 42-50 ,
1557	              July 2012.

1559	   [COPA18]   Arun, V. and H. Balakrishnan, "Copa: Practical Delay-Based
1560	              Congestion Control for the Internet", USENIX NSDI , April
1561	              2018, <https://web.mit.edu/copa/>.

1563	   [CTA-2066] Consumer Technology Association, "Streaming Quality of
1564	              Experience Events, Properties and Metrics", March 2020,
1565	              <https://shop.cta.tech/products/streaming-quality-of-
1566	              experience-events-properties-and-metrics>.

1568	   [CTA-5004] CTA, "Common Media Client Data (CMCD)", September 2020,
1569	              <https://shop.cta.tech/products/web-application-video-
1570	              ecosystem-common-media-client-data-cta-5004>.

1572	   [CVNI]     "Cisco Visual Networking Index: Forecast and Trends,
1573	              2017-2022 White Paper", 27 February 2019,
1574	              <https://www.cisco.com/c/en/us/solutions/collateral/
1575	              service-provider/visual-networking-index-vni/white-paper-
1576	              c11-741490.html>.

1578	   [ELASTIC]  De Cicco, L., Caldaralo, V., Palmisano, V., and S.
1579	              Mascolo, "ELASTIC: A client-side controller for dynamic
1580	              adaptive streaming over HTTP (DASH)", Packet Video
1581	              Workshop , December 2013,
1582	              <https://ieeexplore.ieee.org/document/6691442>.

1584	   [Encodings]
1585	              Apple, Inc, "HLS Authoring Specification for Apple
1586	              Devices", June 2020,
1587	              <https://developer.apple.com/documentation/
1588	              http_live_streaming/
1589	              hls_authoring_specification_for_apple_devices>.

1591	   [I-D.cardwell-iccrg-bbr-congestion-control]
1592	              Cardwell, N., Cheng, Y., Yeganeh, S. H., Swett, I., and V.
1593	              Jacobson, "BBR Congestion Control", Work in Progress,
1594	              Internet-Draft, draft-cardwell-iccrg-bbr-congestion-
1595	              control-02, 7 March 2022,
1596	              <https://datatracker.ietf.org/doc/html/draft-cardwell-
1597	              iccrg-bbr-congestion-control-02>.

1599	   [I-D.draft-pantos-hls-rfc8216bis]
1600	              Pantos, R., "HTTP Live Streaming 2nd Edition", Work in
1601	              Progress, Internet-Draft, draft-pantos-hls-rfc8216bis-10,
1602	              8 November 2021, <https://datatracker.ietf.org/doc/html/
1603	              draft-pantos-hls-rfc8216bis-10>.

1605	   [I-D.ietf-httpbis-cache]
1606	              Fielding, R. T., Nottingham, M., and J. Reschke, "HTTP
1607	              Caching", Work in Progress, Internet-Draft, draft-ietf-
1608	              httpbis-cache-19, 12 September 2021,
1609	              <https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-
1610	              cache-19>.

1612	   [I-D.ietf-quic-datagram]
1613	              Pauly, T., Kinnear, E., and D. Schinazi, "An Unreliable
1614	              Datagram Extension to QUIC", Work in Progress, Internet-
1615	              Draft, draft-ietf-quic-datagram-10, 4 February 2022,
1616	              <https://datatracker.ietf.org/doc/html/draft-ietf-quic-
1617	              datagram-10>.

1619	   [I-D.ietf-quic-http]
1620	              Bishop, M., "Hypertext Transfer Protocol Version 3
1621	              (HTTP/3)", Work in Progress, Internet-Draft, draft-ietf-
1622	              quic-http-34, 2 February 2021,
1623	              <https://datatracker.ietf.org/doc/html/draft-ietf-quic-
1624	              http-34>.

1626	   [I-D.ietf-quic-manageability]
1627	              Kuehlewind, M. and B. Trammell, "Manageability of the QUIC
1628	              Transport Protocol", Work in Progress, Internet-Draft,
1629	              draft-ietf-quic-manageability-16, 6 April 2022,
1630	              <https://datatracker.ietf.org/doc/html/draft-ietf-quic-
1631	              manageability-16>.

1633	   [I-D.ietf-quic-qlog-h3-events]
1634	              Marx, R., Niccolini, L., and M. Seemann, "HTTP/3 and QPACK
1635	              qlog event definitions", Work in Progress, Internet-Draft,
1636	              draft-ietf-quic-qlog-h3-events-01, 7 March 2022,
1637	              <https://datatracker.ietf.org/doc/html/draft-ietf-quic-
1638	              qlog-h3-events-01>.

1640	   [I-D.ietf-quic-qlog-main-schema]
1641	              Marx, R., Niccolini, L., and M. Seemann, "Main logging
1642	              schema for qlog", Work in Progress, Internet-Draft, draft-
1643	              ietf-quic-qlog-main-schema-02, 7 March 2022,
1644	              <https://datatracker.ietf.org/doc/html/draft-ietf-quic-
1645	              qlog-main-schema-02>.

1647	   [I-D.ietf-quic-qlog-quic-events]
1648	              Marx, R., Niccolini, L., and M. Seemann, "QUIC event
1649	              definitions for qlog", Work in Progress, Internet-Draft,
1650	              draft-ietf-quic-qlog-quic-events-01, 7 March 2022,
1651	              <https://datatracker.ietf.org/doc/html/draft-ietf-quic-
1652	              qlog-quic-events-01>.

1654	   [IAB-ADS]  "IAB", n.d., <https://www.iab.com/>.

1656	   [IABcovid] Arkko, J., Farrel, S., Kühlewind, M., and C. Perkins,
1657	              "Report from the IAB COVID-19 Network Impacts Workshop
1658	              2020", November 2020, <https://datatracker.ietf.org/doc/
1659	              draft-iab-covid19-workshop/>.

1661	   [Jacobson-Karels]
1662	              Jacobson, V. and M. Karels, "Congestion Avoidance and
1663	              Control", November 1988,
1664	              <https://ee.lbl.gov/papers/congavoid.pdf>.

1666	   [Labovitz] Labovitz, C., "Network traffic insights in the time of
1667	              COVID-19: April 9 update", April 2020,
1668	              <https://www.nokia.com/blog/network-traffic-insights-time-
1669	              covid-19-april-9-update/>.

1671	   [LabovitzDDoS]
1672	              Takahashi, D., "Why the game industry is still vulnerable
1673	              to DDoS attacks", May 2018,
1674	              <https://venturebeat.com/2018/05/13/why-the-game-industry-
1675	              is-still-vulnerable-to-distributed-denial-of-service-
1676	              attacks/>.

1678	   [LL-DASH]  DASH-IF, "Low-latency Modes for DASH", March 2020,
1679	              <https://dashif.org/docs/CR-Low-Latency-Live-r8.pdf>.

1681	   [Micro]    Taher, T. M., Misurac, M. J., LoCicero, J. L., and D. R.
1682	              Ucci, "Microwave Oven Signal Interference Mitigation For
1683	              Wi-Fi Communication Systems", 2008 5th IEEE Consumer
1684	              Communications and Networking Conference 5th IEEE, pp.
1685	              67-68 , 2008.

1687	   [Mishra]   Mishra, S. and J. Thibeault, "An update on Streaming Video
1688	              Alliance", April 2020,
1689	              <https://datatracker.ietf.org/meeting/interim-2020-mops-
1690	              01/materials/slides-interim-2020-mops-01-sessa-april-
1691	              15-2020-mops-interim-an-update-on-streaming-video-
1692	              alliance>.

1694	   [MMSP20]   Durak, K. and et al, "Evaluating the performance of
1695	              Apple's low-latency HLS", IEEE MMSP , September 2020,
1696	              <https://ieeexplore.ieee.org/document/9287117>.

1698	   [MMSys11]  Akhshabi, S., Begen, A. C., and C. Dovrolis, "An
1699	              experimental evaluation of rate-adaptation algorithms in
1700	              adaptive streaming over HTTP", ACM MMSys , February 2011,
1701	              <https://dl.acm.org/doi/10.1145/1943552.1943574>.

1703	   [MPEG-CMAF]
1704	              "ISO/IEC 23000-19:2020 Multimedia application format
1705	              (MPEG-A) - Part 19: Common media application format (CMAF)
1706	              for segmented media", March 2020,
1707	              <https://www.iso.org/standard/79106.html>.

1709	   [MPEG-DASH]
1710	              "ISO/IEC 23009-1:2019 Dynamic adaptive streaming over HTTP
1711	              (DASH) - Part 1: Media presentation description and
1712	              segment formats", December 2019,
1713	              <https://www.iso.org/standard/79329.html>.

1715	   [MPEG-DASH-SAND]
1716	              "ISO/IEC 23009-5:2017 Dynamic adaptive streaming over HTTP
1717	              (DASH) - Part 5: Server and network assisted DASH (SAND)",
1718	              February 2017, <https://www.iso.org/standard/69079.html>.

1720	   [MPEG-TS]  "H.222.0 : Information technology - Generic coding of
1721	              moving pictures and associated audio information:
1722	              Systems", 29 August 2018,
1723	              <https://www.itu.int/rec/T-REC-H.222.0>.

1725	   [MPEGI]    Boyce, J. M. and et al, "MPEG Immersive Video Coding
1726	              Standard", Proceedings of the IEEE , n.d.,
1727	              <https://ieeexplore.ieee.org/document/9374648>.

1729	   [OReilly-HPBN]
1730	              "High Performance Browser Networking (Chapter 2: Building
1731	              Blocks of TCP)", May 2021,
1732	              <https://hpbn.co/building-blocks-of-tcp/>.

1734	   [PCC]      Schwarz, S. and et al, "Emerging MPEG Standards for Point
1735	              Cloud Compression", IEEE Journal on Emerging and Selected
1736	              Topics in Circuits and Systems , March 2019,
1737	              <https://ieeexplore.ieee.org/document/8571288>.

1739	   [Port443]  "Service Name and Transport Protocol Port Number
1740	              Registry", April 2021, <https://www.iana.org/assignments/
1741	              service-names-port-numbers/service-names-port-
1742	              numbers.txt>.

1744	   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
1745	              RFC 793, DOI 10.17487/RFC0793, September 1981,
1746	              <https://www.rfc-editor.org/rfc/rfc793>.

1748	   [RFC2001]  Stevens, W., "TCP Slow Start, Congestion Avoidance, Fast
1749	              Retransmit, and Fast Recovery Algorithms", RFC 2001,
1750	              DOI 10.17487/RFC2001, January 1997,
1751	              <https://www.rfc-editor.org/rfc/rfc2001>.

1753	   [RFC3135]  Border, J., Kojo, M., Griner, J., Montenegro, G., and Z.
1754	              Shelby, "Performance Enhancing Proxies Intended to
1755	              Mitigate Link-Related Degradations", RFC 3135,
1756	              DOI 10.17487/RFC3135, June 2001,
1757	              <https://www.rfc-editor.org/rfc/rfc3135>.

1759	   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
1760	              Jacobson, "RTP: A Transport Protocol for Real-Time
1761	              Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
1762	              July 2003, <https://www.rfc-editor.org/rfc/rfc3550>.

1764	   [RFC3758]  Stewart, R., Ramalho, M., Xie, Q., Tuexen, M., and P.
1765	              Conrad, "Stream Control Transmission Protocol (SCTP)
1766	              Partial Reliability Extension", RFC 3758,
1767	              DOI 10.17487/RFC3758, May 2004,
1768	              <https://www.rfc-editor.org/rfc/rfc3758>.

1770	   [RFC4733]  Schulzrinne, H. and T. Taylor, "RTP Payload for DTMF
1771	              Digits, Telephony Tones, and Telephony Signals", RFC 4733,
1772	              DOI 10.17487/RFC4733, December 2006,
1773	              <https://www.rfc-editor.org/rfc/rfc4733>.

1775	   [RFC4960]  Stewart, R., Ed., "Stream Control Transmission Protocol",
1776	              RFC 4960, DOI 10.17487/RFC4960, September 2007,
1777	              <https://www.rfc-editor.org/rfc/rfc4960>.

1779	   [RFC5594]  Peterson, J. and A. Cooper, "Report from the IETF Workshop
1780	              on Peer-to-Peer (P2P) Infrastructure, May 28, 2008",
1781	              RFC 5594, DOI 10.17487/RFC5594, July 2009,
1782	              <https://www.rfc-editor.org/rfc/rfc5594>.

1784	   [RFC5762]  Perkins, C., "RTP and the Datagram Congestion Control
1785	              Protocol (DCCP)", RFC 5762, DOI 10.17487/RFC5762, April
1786	              2010, <https://www.rfc-editor.org/rfc/rfc5762>.

1788	   [RFC6190]  Wenger, S., Wang, Y.-K., Schierl, T., and A.
1789	              Eleftheriadis, "RTP Payload Format for Scalable Video
1790	              Coding", RFC 6190, DOI 10.17487/RFC6190, May 2011,
1791	              <https://www.rfc-editor.org/rfc/rfc6190>.

1793	   [RFC6582]  Henderson, T., Floyd, S., Gurtov, A., and Y. Nishida, "The
1794	              NewReno Modification to TCP's Fast Recovery Algorithm",
1795	              RFC 6582, DOI 10.17487/RFC6582, April 2012,
1796	              <https://www.rfc-editor.org/rfc/rfc6582>.

1798	   [RFC6817]  Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind,
1799	              "Low Extra Delay Background Transport (LEDBAT)", RFC 6817,
1800	              DOI 10.17487/RFC6817, December 2012,
1801	              <https://www.rfc-editor.org/rfc/rfc6817>.

1803	   [RFC6843]  Clark, A., Gross, K., and Q. Wu, "RTP Control Protocol
1804	              (RTCP) Extended Report (XR) Block for Delay Metric
1805	              Reporting", RFC 6843, DOI 10.17487/RFC6843, January 2013,
1806	              <https://www.rfc-editor.org/rfc/rfc6843>.

1808	   [RFC7258]  Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an
1809	              Attack", BCP 188, RFC 7258, DOI 10.17487/RFC7258, May
1810	              2014, <https://www.rfc-editor.org/rfc/rfc7258>.

1812	   [RFC7510]  Xu, X., Sheth, N., Yong, L., Callon, R., and D. Black,
1813	              "Encapsulating MPLS in UDP", RFC 7510,
1814	              DOI 10.17487/RFC7510, April 2015,
1815	              <https://www.rfc-editor.org/rfc/rfc7510>.

1817	   [RFC7656]  Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and
1818	              B. Burman, Ed., "A Taxonomy of Semantics and Mechanisms
1819	              for Real-Time Transport Protocol (RTP) Sources", RFC 7656,
1820	              DOI 10.17487/RFC7656, November 2015,
1821	              <https://www.rfc-editor.org/rfc/rfc7656>.

1823	   [RFC7661]  Fairhurst, G., Sathiaseelan, A., and R. Secchi, "Updating
1824	              TCP to Support Rate-Limited Traffic", RFC 7661,
1825	              DOI 10.17487/RFC7661, October 2015,
1826	              <https://www.rfc-editor.org/rfc/rfc7661>.

1828	   [RFC8083]  Perkins, C. and V. Singh, "Multimedia Congestion Control:
1829	              Circuit Breakers for Unicast RTP Sessions", RFC 8083,
1830	              DOI 10.17487/RFC8083, March 2017,
1831	              <https://www.rfc-editor.org/rfc/rfc8083>.

1833	   [RFC8084]  Fairhurst, G., "Network Transport Circuit Breakers",
1834	              BCP 208, RFC 8084, DOI 10.17487/RFC8084, March 2017,
1835	              <https://www.rfc-editor.org/rfc/rfc8084>.

1837	   [RFC8085]  Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
1838	              Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
1839	              March 2017, <https://www.rfc-editor.org/rfc/rfc8085>.

1841	   [RFC8216]  Pantos, R., Ed. and W. May, "HTTP Live Streaming",
1842	              RFC 8216, DOI 10.17487/RFC8216, August 2017,
1843	              <https://www.rfc-editor.org/rfc/rfc8216>.

1845	   [RFC8312]  Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and
1846	              R. Scheffenegger, "CUBIC for Fast Long-Distance Networks",
1847	              RFC 8312, DOI 10.17487/RFC8312, February 2018,
1848	              <https://www.rfc-editor.org/rfc/rfc8312>.

1850	   [RFC8446]  Rescorla, E., "The Transport Layer Security (TLS) Protocol
1851	              Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018,
1852	              <https://www.rfc-editor.org/rfc/rfc8446>.

1854	   [RFC8622]  Bless, R., "A Lower-Effort Per-Hop Behavior (LE PHB) for
1855	              Differentiated Services", RFC 8622, DOI 10.17487/RFC8622,
1856	              June 2019, <https://www.rfc-editor.org/rfc/rfc8622>.

1858	   [RFC8723]  Jennings, C., Jones, P., Barnes, R., and A.B. Roach,
1859	              "Double Encryption Procedures for the Secure Real-Time
1860	              Transport Protocol (SRTP)", RFC 8723,
1861	              DOI 10.17487/RFC8723, April 2020,
1862	              <https://www.rfc-editor.org/rfc/rfc8723>.

1864	   [RFC8824]  Minaburo, A., Toutain, L., and R. Andreasen, "Static
1865	              Context Header Compression (SCHC) for the Constrained
1866	              Application Protocol (CoAP)", RFC 8824,
1867	              DOI 10.17487/RFC8824, June 2021,
1868	              <https://www.rfc-editor.org/rfc/rfc8824>.

1870	   [RFC8825]  Alvestrand, H., "Overview: Real-Time Protocols for
1871	              Browser-Based Applications", RFC 8825,
1872	              DOI 10.17487/RFC8825, January 2021,
1873	              <https://www.rfc-editor.org/rfc/rfc8825>.

1875	   [RFC8999]  Thomson, M., "Version-Independent Properties of QUIC",
1876	              RFC 8999, DOI 10.17487/RFC8999, May 2021,
1877	              <https://www.rfc-editor.org/rfc/rfc8999>.

1879	   [RFC9000]  Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based
1880	              Multiplexed and Secure Transport", RFC 9000,
1881	              DOI 10.17487/RFC9000, May 2021,
1882	              <https://www.rfc-editor.org/rfc/rfc9000>.

1884	   [RFC9001]  Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure
1885	              QUIC", RFC 9001, DOI 10.17487/RFC9001, May 2021,
1886	              <https://www.rfc-editor.org/rfc/rfc9001>.

1888	   [RFC9002]  Iyengar, J., Ed. and I. Swett, Ed., "QUIC Loss Detection
1889	              and Congestion Control", RFC 9002, DOI 10.17487/RFC9002,
1890	              May 2021, <https://www.rfc-editor.org/rfc/rfc9002>.

1892	   [RFC9065]  Fairhurst, G. and C. Perkins, "Considerations around
1893	              Transport Header Confidentiality, Network Operations, and
1894	              the Evolution of Internet Transport Protocols", RFC 9065,
1895	              DOI 10.17487/RFC9065, July 2021,
1896	              <https://www.rfc-editor.org/rfc/rfc9065>.

1898	   [SFRAME]   "Secure Media Frames Working Group (Home Page)", n.d.,
1899	              <https://datatracker.ietf.org/doc/charter-ietf-sframe/>.

1901	   [SRT]      Sharabayko, M., "Secure Reliable Transport (SRT) Protocol
1902	              Overview", 15 April 2020,
1903	              <https://datatracker.ietf.org/meeting/interim-2020-mops-
1904	              01/materials/slides-interim-2020-mops-01-sessa-april-
1905	              15-2020-mops-interim-an-update-on-streaming-video-
1906	              alliance>.

1908	   [Survey360o]
1909	              Yaqoob, A., Bi, T., and G. Muntean, "A Survey on Adaptive
1910	              360° Video Streaming: Solutions, Challenges and
1911	              Opportunities", IEEE Communications Surveys & Tutorials ,
1912	              July 2020, <https://ieeexplore.ieee.org/document/9133103>.

1914	Authors' Addresses

1916	   Jake Holland
1917	   Akamai Technologies, Inc.
1918	   150 Broadway
1919	   Cambridge, MA 02144,
1920	   United States of America
1921	   Email: jakeholland.net@gmail.com

1923	   Ali Begen
1924	   Networked Media
1925	   Turkey
1926	   Email: ali.begen@networked.media

1928	   Spencer Dawkins
1929	   Tencent America LLC
1930	   United States of America
1931	   Email: spencerdawkins.ietf@gmail.com