Internet Engineering Task Force                               R. Mekuria
Internet-Draft                                    Unified Streaming B.V.
Intended status: Best Current Practice                       May 7, 2018
Expires: November 7, 2018

              Live Media and Metadata Ingest Protocol
                  draft-mekuria-mmediaingest-00.txt

Abstract

   This Internet-Draft presents a protocol specification for ingesting
   live media and metadata content from a live media source, such as a
   live encoder, towards a media processing entity or content delivery
   network.  It defines the media format usage, the preferred
   transmission methods, and the handling of failover and redundancy.
   The live media considered includes high-quality encoded audiovisual
   content.  The timed metadata supported includes timed graphics,
   captions, subtitles, and metadata markers and information.  This
   protocol can, for example, be used in advanced live streaming
   workflows that combine high-quality live encoders and advanced
   media processing entities.  The specification follows best current
   industry practice.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).
   Note that other groups may also distribute working documents as
   Internet-Drafts.  The list of current Internet-Drafts is at
   http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Conventions and Terminology
   3. Media Ingest Protocol Behavior
   4. Formatting Requirements for Timed Text, Captions and Subtitles
   5. Formatting Requirements for Timed Metadata Markers
   6. Guidelines for Handling of Media Processing Entity Failover
   7. Guidelines for Handling of Live Media Source Failover
   8. Security Considerations
   9. IANA Considerations
   10. Contributors
   11. References
      11.1. Normative References
      11.2. Informative References
      11.3. URL References
   Author's Address

1. Introduction

   This specification describes a protocol for media ingest from a
   live source (e.g. a live encoder) towards media processing
   entities.  Examples of media processing entities include media
   packagers, publishing points, streaming origins, content delivery
   networks, and others.  In particular, we distinguish active and
   passive media processing entities.  Active media processing
   entities perform media processing such as encryption, packaging,
   changing (parts of) the media content, and deriving additional
   information.  Passive media processing entities provide
   pass-through and/or delivery and caching functions that do not
   alter the media content itself.  An example of a passive media
   processing entity is a content delivery network (CDN) that provides
   functionality for the delivery of the content.  An example of an
   active media processing entity is a just-in-time packager or a
   just-in-time transcoder.

   Diagram 1: Example workflow with media ingest
   Live Media Source -> Media processing entity -> CDN -> End User

   Diagram 1 shows a workflow with live media ingest from a live media
   source towards a media processing entity.  The media processing
   entity provides additional processing such as content stitching,
   encryption, packaging, manifest generation, transcoding, etc.  Such
   setups are beneficial for advanced media delivery.  The ingest
   described in this draft covers the latest technologies and
   standards used in the industry, such as timed metadata, captions,
   timed text, and encoding standards such as HEVC [HEVC].  The media
   ingest protocol specification and associated requirements were
   discussed with stakeholders including broadcasters, live encoder
   vendors, content delivery networks, telecommunications companies,
   and cloud service providers.  This draft specification has been
   extensively discussed and reviewed by these stakeholders and
   represents current best practices.
   Nevertheless, this draft solely reflects the point of view of its
   authors, taking the feedback received from these stakeholders into
   account.  Some insights into the discussions leading to this draft
   can be found at [fmp4git].

2. Conventions and Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in BCP 14, RFC
   2119 [RFC2119].

   This specification uses the following additional terminology.

   ISOBMFF: the ISO Base Media File Format specified in [ISOBMFF].
   ftyp: the file type and compatibility box "ftyp" described in the
      ISOBMFF [ISOBMFF] that signals the "brand".
   moov: the container box for all metadata, "moov", described in the
      ISO base media file format [ISOBMFF].
   moof: the movie fragment box "moof" described in the ISO base media
      file format [ISOBMFF] that carries the metadata of a fragment of
      media.
   mdat: the media data container box "mdat" defined in [ISOBMFF];
      this box contains the compressed media samples.
   kind: the track kind box defined in the ISOBMFF [ISOBMFF] to label
      a track with its usage.
   mfra: the movie fragment random access box "mfra" defined in the
      ISOBMFF [ISOBMFF] to signal random access samples (samples that
      require no prior or other samples for decoding).
   tfdt: the TrackFragmentBaseMediaDecodeTimeBox "tfdt" defined in the
      ISO base media file format [ISOBMFF], used to signal the decode
      time of the media fragment signalled in the moof box.
   mdhd: the media header box "mdhd" defined in [ISOBMFF]; this box
      contains information about the media such as timescale,
      duration, and language using ISO 639-2/T codes [ISO639-2].
   pssh: the protection system specific header box "pssh" defined in
      [CENC] that can be used to signal content protection information
      according to MPEG Common Encryption (CENC).
   sinf: the protection scheme information box "sinf" defined in
      [ISOBMFF] that provides information on the encryption scheme
      used in the file.
   elng: the extended language box "elng" defined in [ISOBMFF] that
      can override the language information.
   nmhd: the null media header box "nmhd" defined in [ISOBMFF] to
      signal a track for which no specific media header is defined,
      often used for metadata tracks.
   HTTP: Hypertext Transfer Protocol, version 1.1, as specified by
      [RFC2616].
   HTTP POST: method used in the Hypertext Transfer Protocol for
      sending data from a source to a destination [RFC2616].
   fragmentedMP4stream: stream of [ISOBMFF] fragments (moof and mdat);
      see Section 3 for the definition.
   POST_URL: target URL of a POST command in the HTTP protocol for
      pushing data from a source to a destination.
   TCP: Transmission Control Protocol (TCP) as defined in [RFC793].
   URI_SAFE_IDENTIFIER: identifier/string formatted according to
      [RFC3986].
   Connection: connection setup between a host and a source.
   Live stream event: the total media broadcast stream of the ingest.
   (Live) encoder: entity performing live encoding and producing a
      high-quality encoded stream; can serve as a media ingest source.
   (Media) ingest source: a media source ingesting media content,
      typically a live encoder but not restricted to this; the media
      ingest source could be any type of source, such as a stored file
      that is sent in partial chunks.
   Publishing point: entity used to publish the media content;
      consumes/receives the incoming media ingest stream.
   Media processing entity: entity used to process media content;
      receives/consumes a media ingest stream.
   Media processing function: media processing entity.

3. Media Ingest Protocol Behavior

   The specification uses multiple HTTP POST and/or PUT requests to
   transmit an optional manifest followed by encoded media data
   packaged in fragmented [ISOBMFF].  The subsequently posted segments
   correspond to those described in the manifest.  Each HTTP POST
   sends a complete manifest or media segment towards the processing
   entity.  The sequence of POST commands starts with the manifest and
   the init segment that includes the header boxes (ftyp and moov
   boxes).  It continues with the sequence of segments (combinations
   of moof and mdat boxes).

   An example of a POST URL targeting the publishing point is:

   http://HostName/presentationPath/manifestPath/rSegmentPath/Identifier

   The PostURL syntax is defined as follows, using IETF RFC 5234 ABNF
   [RFC5234] to specify the structure.
   PostURL          = Protocol "://" BroadcastURL
                      ["/" ManifestPath] ["/" Rsegmentpath]
                      "/" Identifier
   Protocol         = "http" / "https"
   BroadcastURL     = HostName "/" PresentationPath
   HostName         = URI_SAFE_IDENTIFIER
   PresentationPath = URI_SAFE_IDENTIFIER
   ManifestPath     = URI_SAFE_IDENTIFIER
   Rsegmentpath     = URI_SAFE_IDENTIFIER
   Identifier       = segment_file_name

   In this PostURL, the HostName is typically the hostname of the
   media processing entity or publishing point.  The PresentationPath
   is the path to the specific presentation at the publishing point.
   The ManifestPath can be used to signal the specific manifest of the
   presentation.  The Rsegmentpath is an optional extended path based
   on the relative paths in the manifest file.  The Identifier is the
   filename of the segment as described in the manifest.  The live
   source sender first sends the manifest to the path
   http://hostname/presentationpath/, allowing the receiving entity to
   set up reception paths for the following segments and manifests.
   In case no manifest is used, any POST_URL set up for media ingest,
   such as http://hostname/presentationpath/, can be used.  The
   fragmentedMP4stream can be defined using IETF RFC 5234 ABNF
   [RFC5234] as follows.

   fragmentedMP4stream = headerboxes fragments
   headerboxes         = ftyp moov
   fragments           = *fragment
   fragment            = moof mdat

   The communication between the live encoder/media ingest source and
   the receiving media processing entity follows these requirements:

   1. The live encoder or ingest source communicates with the
      publishing point/processing entity using the HTTP POST method as
      defined in the HTTP protocol [RFC2616], or, in the case of
      manifest updates, the HTTP PUT method.
   2. The live encoder or ingest source SHOULD start by sending an
      HTTP POST request with an empty "body" (zero content length)
      using the same POST_URL.  This can help the live encoder or
      media ingest source quickly detect whether the live ingest
      publishing point is valid, and whether any authentication or
      other conditions are required.
   3. The live encoder/media source SHOULD use secure transmission
      over HTTPS as specified in [RFC2818] for connecting to the
      receiving media processing entity or publishing point.
   4. In case HTTPS is used, HTTP basic authentication [RFC7617] or
      better methods such as TLS client certificates SHOULD be used to
      secure the connection.
   5. As a compatibility profile for the TLS encryption, we recommend
      the Mozilla intermediate compatibility profile, which is
      supported in many available implementations [MozillaTLS].
   6. Before sending the segments based on the fragmentedMP4stream,
      the live encoder/source MAY send a manifest with the following
      limitations/constraints:
      6a. Only relative URL paths are used for each segment.
      6b. Only unique paths are used for each new presentation.
      6c. In case the manifest contains these relative paths, these
          paths MAY be used in combination with the POST_URL +
          relative URL to POST each of the different segments from the
          live encoder or ingest source to the processing entity.
      6d. In case the manifest contains no relative paths, or no
          manifest is used, the segments SHOULD be posted to the
          original POST_URL specified by the service.
      6e. In this case the "tfdt" and track ids MAY be used by the
          processing entity to distinguish incoming segments instead
          of the target POST_URL.
   7. The live encoder MAY send an updated version of the manifest;
      this manifest cannot override current settings and relative
      paths or break currently running and incoming POST requests.
      The updated manifest can only differ slightly from the one sent
      previously, e.g. by introducing newly available segments or
      event messages.  The updated manifest SHOULD be sent using a PUT
      request instead of a POST request.

      Note: this manifest is useful mostly for passive media
      processing entities; for ingest towards active media processing
      entities the manifest can be avoided and the information
      signalled through the boxes available in the ISOBMFF.

   8. The encoder or ingest source MUST handle any error or failed
      authentication responses received from the media processing
      entity, such as 403 (Forbidden), 400 (Bad Request), 415
      (Unsupported Media Type) and 412 (Precondition Failed).
   9. In case of a 412 (Precondition Failed) or 415 (Unsupported Media
      Type) response, the live source/encoder MUST resend the init
      segment consisting of the "ftyp" and "moov" boxes.
   10. The live encoder or ingest source SHOULD start a new HTTP POST
      segment request sequence with the init segment including the
      header boxes "ftyp" and "moov".
   11. Subsequent media segment requests SHOULD correspond to the
      segments listed in the manifest, if a manifest was sent.
   12. The payload of each request MAY start with the header boxes
      "ftyp" and "moov", followed by segments that consist of a
      combination of "moof" and "mdat" boxes.

      Note that the "ftyp" and "moov" boxes (in this order) MAY be
      transmitted with each request, especially if the encoder must
      reconnect because the previous POST request was terminated prior
      to the end of the stream with a 412 or 415 response.  Resending
      the "moov" and "ftyp" boxes allows the receiving entity to
      recover the init segment and the track information needed for
      interpreting the content.

   13. The encoder or ingest source MAY use the chunked transfer
      encoding option of the HTTP POST command [RFC2616] for
      uploading, as it might be difficult to predict the entire
      content length of the segment.  This can be used, for example,
      to support use cases that require low latency.
   14. The encoder or ingest source SHOULD use individual HTTP POST
      commands [RFC2616] for uploading media segments when ready.
   15. If the HTTP POST request terminates or times out with a TCP
      error prior to the end of the stream, the encoder MUST issue a
      new POST request using a new connection, and follow the
      preceding requirements.  Additionally, the encoder MAY resend
      the previous two segments that were already sent.
   16. In case fixed-length POST commands are used, the live source
      entity MUST, on HTTP 400, 412 or 415 responses, resend the
      segment to be posted as described in the manifest in its
      entirety, together with the init segment consisting of the
      "moov" and "ftyp" boxes.
   17. In case the live stream event is over, the live media
      source/encoder SHOULD signal the stop by transmitting an empty
      "mfra" box towards the publishing point/processing entity.
   18. The TrackFragmentBaseMediaDecodeTimeBox "tfdt" MUST be present
      for each segment posted.
   19. The ISOBMFF media fragment duration SHOULD be constant, to
      reduce the size of the client manifests.  A constant MPEG-4
      fragment duration also improves client download heuristics
      through the use of repeat tags.  The duration MAY fluctuate to
      compensate for non-integer frame rates.  By choosing an
      appropriate timescale (a multiple of the frame rate is
      recommended) this issue can be avoided.
   20. The MPEG-4 fragment duration SHOULD be between approximately 2
      and 6 seconds.
   21.
      The fragment decode timestamps "tfdt" of fragments in the
      fragmentedMP4stream and their base_media_decode_time values
      SHOULD arrive in increasing order for each of the different
      tracks/streams that are ingested.
   22. The segments formatted as a fragmented MP4 stream SHOULD use a
      timescale based on the framerate for video streams and 44.1 kHz
      or 48 kHz for audio streams, or any other timescale that enables
      integer increments of the decode times of fragments signalled in
      the "tfdt" box based on this scale.
   23. The manifest MAY be used to signal the language of the stream,
      which SHOULD also be signalled in the "mdhd" or "elng" boxes in
      the init segment and/or moof headers ("mdhd").
   24. The manifest SHOULD be used to signal encryption-specific
      information, which SHOULD also be signalled in the "pssh",
      "schm" and "sinf" boxes in the init segment and media segments.
   25. The manifest SHOULD be used to signal information about the
      different tracks, such as durations, media encoding types and
      content types, which SHOULD also be signalled in the "moov" box
      in the init segment or the "moof" box in the media segments.
   26. The manifest SHOULD be used to signal information about the
      timed text, images and subtitles in adaptation sets, and this
      information SHOULD also be signalled in the "moov" box in the
      init segment; for more information see the next section.
   27. Segments posted towards the media processing entity MUST
      contain the bitrate box "btrt" specifying the target bitrate of
      the segments, the "tfdt" box specifying the fragment's decode
      time, and the "tfhd" box specifying the track id.
   28. The live encoder/media source SHOULD repeatedly resolve the
      hostname to adapt to changes in the IP-to-hostname mapping, for
      example by using the Domain Name System (DNS) [RFC1035] or any
      other system that is in place.
   29. The live encoder/media source MUST update the IP-to-hostname
      resolution respecting the TTL (time to live) from DNS query
      responses.  This enables better resilience to changes of the IP
      address in large-scale deployments, where the IP address of the
      publishing point or media processing nodes may change
      frequently.
   30. To support the ingest of live events with low latency, shorter
      segment and fragment durations MAY be used, such as segments
      with a duration of 1 second.
   31. The live encoder/media source SHOULD use a separate TCP
      connection for the ingest of each different bitrate track
      ingested.

4. Formatting Requirements for Timed Text, Captions and Subtitles

   The specification supports ingest of timed text, images, captions
   and subtitles.  This section follows the normative reference
   [MPEG-4-30].

   1. The tracks containing timed text, images, captions or subtitles
      MAY be signalled in the manifest by an adaptation set with the
      different segments containing the data of the track.
   2. The segment data MAY be posted to the URL corresponding to the
      path in the manifest for the segment; otherwise it MUST be
      posted towards the original POST_URL.
   3. The track will be a sparse track signalled by a null media
      header "nmhd" containing the timed text, images or captions,
      corresponding to the recommendation for storing tracks in
      fragmented MPEG-4 [CMAF].
   4. Based on this recommendation, the track handler "hdlr" SHALL be
      set to "text" for WebVTT and "subt" for TTML.
   5. In case TTML is used, the track MUST use the XMLSampleEntry to
      signal the sample description of the subtitle stream.
   6. In case WebVTT is used, the track MUST use the WVTTSampleEntry
      to signal the sample description of the text stream.
   7. These boxes SHOULD signal the MIME type and specifics as
      described in [CMAF] sections 11.3, 11.4 and 11.5.
   8.
      The boxes described in items 3-7 MUST be present in the init
      segment ("ftyp" + "moov") for the given track.
   9. Subtitles in CTA-608 and CTA-708 can be transmitted following
      the recommendation in section 11.5 of [CMAF], via SEI messages
      in the video track.
   10. The "ftyp" box in the init segment for the track containing
      timed text, images, captions and subtitles can use signalling
      with CMAF profiles based on [CMAF]:
      10a. WebVTT, specified in 11.2 of ISO/IEC 14496-30 [MPEG-4-30]:
           'cwvt'
      10b. TTML IMSC1 Text, specified in 11.3.3 of [MPEG-4-30], IMSC1
           Text Profile: 'im1t'
      10c. TTML IMSC1 Image, specified in 11.3.4 of [MPEG-4-30], IMSC1
           Image Profile: 'im1i'
      10d. CEA CTA-608 and CTA-708, specified in 11.4 of [MPEG-4-30],
           caption data embedded in SEI messages in the video track:
           'ccea'
   11. The segments of the tracks containing timed text, images,
      captions and subtitles SHOULD use the bitrate box "btrt" to
      signal the bitrate of the track in each segment.

5. Formatting Requirements for Timed Metadata

   This section discusses the specific formatting requirements for the
   ingest of timed metadata related to events and markers for ad
   insertion, or other timed metadata related to the media content,
   such as information about the content.  When delivering a live
   streaming presentation with a rich client experience, it is often
   necessary to transmit time-synced events, metadata or other signals
   in-band with the main media data.  An example of these are
   opportunities for dynamic live ad insertion signalled by SCTE-35
   markers.  This type of event signalling is different from regular
   audio/video streaming because of its sparse nature.  In other
   words, the signalling data usually does not happen continuously,
   and the interval can be hard to predict.  Examples of timed
   metadata are ID3 tags [ID3v2], SCTE-35 markers [SCTE-35] and DASH
   emsg messages defined in section 5.10.3.3 of [DASH].  For example,
   DASH event messages contain a schemeIdUri that defines the payload
   of the message.  Table 1 provides some example schemes in DASH
   event messages and Table 2 illustrates an example of a SCTE-35
   marker stored in a DASH emsg.  The presented approach allows the
   ingest of timed metadata from different sources, possibly at
   different locations, by embedding them in sparse metadata tracks.

   Table 1: Example DASH emsg scheme URIs

   Scheme URI               | Reference
   -------------------------|------------------
   urn:mpeg:dash:event:2012 | [DASH], 5.10.4
   urn:dvb:iptv:cpm:2014    | [DVB-DASH], 9.1.2.1
   urn:scte:scte35:2013:bin | [SCTE-35] 14-3 (2015), 7.3.2
   www.nielsen.com:id3:v1   | Nielsen ID3 in MPEG-DASH

   Table 2: Example of a SCTE-35 marker embedded in a DASH emsg

   Tag                     | Value
   ------------------------|-----------------------------------------
   scheme_uri_id           | "urn:scte:scte35:2013:bin"
   value                   | the value of the SCTE-35 PID
   timescale               | positive number
   presentation_time_delta | non-negative number expressing the
                           | splice time relative to the tfdt
   event_duration          | duration of the event;
                           | "0xFFFFFFFF" indicates unknown duration
   id                      | unique identifier for the message
   message_data            | splice info section including CRC

   The following steps are recommended for timed metadata ingest
   related to events, tags, ad markers and program information:

   1. Create a fragmentedMP4stream that contains only a sparse
      metadata track, i.e. a track without audio/video.
   2. Metadata tracks MAY be signalled in a manifest using an
      adaptation set with a sparse track; the actual data is in the
      sparse media track in the segments.
   3.
      For a metadata track, the media handler type is "meta" and the
      track's media header box is the null media header box "nmhd".
   4. The URIMetaSampleEntry contains, in a URIBox, the URI following
      the URI syntax in [RFC3986] defining the form of the metadata
      (see the ISO Base Media File Format specification [ISOBMFF]).
      For example, for ID3 tags [ID3v2] the URIBox could contain the
      URL http://www.id3.org
   5. In the case of ID3, a sample contains a single ID3 tag.  The ID3
      tag may contain one or more ID3 frames.
   6. In the case of DASH emsg, a sample may contain one or more event
      message ("emsg") boxes.  Version 0 event messages SHOULD be
      used.  The presentation_time_delta field is relative to the
      absolute timestamp specified in the
      TrackFragmentBaseMediaDecodeTimeBox ("tfdt").  The timescale
      field SHOULD match the value specified in the media header box
      "mdhd".
   7. In the case of DASH emsg, the kind box (contained in the "udta"
      box) MUST be used to signal the scheme URI of the type of
      metadata.
   8. A BitRateBox ("btrt") SHOULD be present at the end of the
      MetaDataSampleEntry to signal the bitrate information of the
      stream.
   9. If the specific format uses internal timing values, then the
      timescale MUST match the timescale field set in the media header
      box "mdhd".
   10. All timed metadata samples are sync samples [ISOBMFF], defining
      the entire set of metadata for the time interval they cover.
      Hence, the sync sample table box is not present.
   11. When timed metadata is stored in a TrackRunBox ("trun"), a
      single sample is present with the duration set to the duration
      of that run.

   Given the sparse nature of the signalling events, the following is
   recommended:

   12.
      At the beginning of the live event, the encoder or media ingest
      source sends the initial header boxes to the processing
      entity/publishing point, which allows the service to register
      the sparse track.
   13. When sending segments, the encoder SHOULD start sending from
      the header boxes, followed by the new fragments.
   14. The sparse track segment becomes available to the publishing
      point/processing entity when the corresponding parent track
      fragment that has an equal or larger timestamp value is made
      available.  For example, if the sparse fragment has a timestamp
      of t=1000, it is expected that after the publishing
      point/processing entity sees the "video" fragment (assuming the
      parent track name is "video") with timestamp 1000 or beyond, it
      can retrieve the sparse fragment t=1000.  Note that the actual
      signal could be used for a different position in the
      presentation timeline for its designated purpose.  In this
      example, it is possible that the sparse fragment of t=1000 has
      an XML payload for inserting an ad at a position a few seconds
      later.
   15. The payload of sparse track fragments can be in different
      formats (such as XML, text, or binary), depending on the
      scenario.

6. Guidelines for Handling of Media Processing Entity Failover

   Given the nature of live streaming, good failover support is
   critical for ensuring the availability of the service.  Typically,
   media services are designed to handle various types of failures,
   including network errors, server errors, and storage issues.  When
   used in conjunction with proper failover logic on the live encoder
   side, customers can achieve a highly reliable live streaming
   service from the cloud.  In this section, we discuss service
   failover scenarios, where the failure happens somewhere within the
   service and manifests itself as a network error.
   Here are some recommendations for the encoder implementation for
   handling service failover:

   1. Use a 10-second timeout for establishing the TCP connection.  If
      an attempt to establish the connection takes longer than 10
      seconds, abort the operation and try again.
   2. Use a short timeout for sending the HTTP requests.  If the
      target segment duration is N seconds, use a send timeout between
      N and 2N seconds; for example, if the segment duration is 6
      seconds, use a timeout of 6 to 12 seconds.  If a timeout occurs,
      reset the connection, open a new connection, and resume stream
      ingest on the new connection.  This is needed to avoid latency
      introduced by failing connectivity in the workflow.
   3. Completely resend segments from the ingest source for which a
      connection was terminated early.
   4. We recommend that the encoder or ingest source does NOT limit
      the number of retries to establish a connection or resume
      streaming after a TCP error occurs.
   5. After a TCP error:
      a. The current connection MUST be closed, and a new connection
         MUST be created for a new HTTP POST request.
      b. The new HTTP POST URL MUST be the same as the initial POST
         URL for the segment to be ingested.
      c. The new HTTP POST MUST include stream headers ("ftyp" and
         "moov" boxes) identical to the stream headers in the initial
         POST request for fragmented media ingest.
      d. The last two fragments sent for each segment MAY be
         retransmitted.  Other ISOBMFF fragment timestamps MUST
         increase continuously, even across HTTP POST requests.
   6. The encoder or ingest source SHOULD terminate the HTTP POST
      request if data is not being sent at a rate commensurate with
      the MP4 segment duration.
   An HTTP POST request that does not send data can
   prevent publishing points or media processing entities
   from quickly disconnecting from the live encoder or
   media ingest source in the event of a service update.
   For this reason, the HTTP POST for sparse (ad signal)
   tracks SHOULD be short-lived, terminating as soon as
   the sparse fragment is sent.

   In addition, this draft defines responses to the POST
   requests in order to signal the status to the live media
   source:
7. In case the media processing entity cannot process the
   manifest or segment POST request due to authentication or
   permission problems, it returns a permission denied
   HTTP 403 Forbidden.
8. In case the media processing entity can process the manifest
   or segment POSTed to the POST_URL, it returns HTTP 200 OK or
   202 Accepted.
9. In case the media processing entity can process
   the manifest or segment POST request but finds
   the media type cannot be supported, it returns HTTP 415
   Unsupported Media Type.
10. In case an unknown error happened during
    the processing of the HTTP POST request,
    an HTTP 400 Bad Request is returned.
11. In case the media processing entity cannot
    process a segment POSTed due to a missing init
    segment, an HTTP 412 Precondition Failed
    is returned.
12. In case a media source receives an HTTP 412 response,
    it SHOULD resend the manifest and the "ftyp" and "moov"
    boxes for the track.
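   As a non-normative sketch, the response handling in items 7-12
   could map onto ingest-source actions as follows. The action
   names are illustrative assumptions, not part of the protocol.

```python
# Illustrative sketch only; the action strings are hypothetical
# labels for encoder-side behavior, not defined by this draft.

def ingest_response_action(status):
    """Map a media processing entity's HTTP response code for a
    manifest/segment POST to a recommended ingest-source action."""
    actions = {
        200: "continue",                  # OK: request processed
        202: "continue",                  # Accepted
        403: "stop-permission-denied",    # authentication/permission problem
        415: "stop-unsupported-media",    # media type not supported
        412: "resend-init",               # resend manifest, "ftyp" and "moov"
        400: "retry",                     # unknown error during processing
    }
    # Treat unrecognized codes like an unknown error and retry,
    # following the failover recommendations above.
    return actions.get(status, "retry")
```

   For example, on a 412 response the source resends the manifest
   and the "ftyp" and "moov" boxes, as required by item 12.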
   An example of media ingest with failure and HTTP
   responses is shown in the following figure:

||===============================================================||
||=====================          ============================    ||
||| live media source |          | Media processing entity  |    ||
||=====================          ============================    ||
||      ||                           ||                          ||
||===============Initial Manifest Sending========================||
||      ||                                    ||                 ||
||      ||-- POST /prefix/media.mpd -------->>||                 ||
||      ||          Success                   ||                 ||
||      || <<------ 200 OK -------------------||                 ||
||      ||          Permission denied         ||                 ||
||      || <<------ 403 Forbidden ------------||                 ||
||      ||          Bad Request               ||                 ||
||      || <<------ 400 Bad Request ----------||                 ||
||      ||          Unsupported Media Type    ||                 ||
||      || <<------ 415 Unsupported Media ----||                 ||
||      ||                                    ||                 ||
||==================== Segment Sending ==========================||
||      ||-- POST /prefix/chunk.cmaf ------->>||                 ||
||      ||          Success/Accepted          ||                 ||
||      || <<------ 200 OK -------------------||                 ||
||      ||          Success/Accepted          ||                 ||
||      || <<------ 202 Accepted -------------||                 ||
||      ||          Permission Denied         ||                 ||
||      || <<------ 403 Forbidden ------------||                 ||
||      ||          Bad Request               ||                 ||
||      || <<------ 400 Bad Request ----------||                 ||
||      ||          Unsupported Media Type    ||                 ||
||      || <<------ 415 Unsupported Media ----||                 ||
||      ||          Missing Init Segment      ||                 ||
||      || <<--- 412 Precondition Failed -----||                 ||
||      ||                                    ||                 ||
||=====================          ============================    ||
||| live media source |          | Media processing entity  |    ||
||=====================          ============================    ||
||      ||                           ||                          ||
||===============================================================||

7. Guidelines for Handling of Live Media Source Failover

   Encoder or media ingest source failover is the second type
   of failover scenario that needs to be addressed for end-to-end
   live streaming delivery. In this scenario, the error condition
   occurs on the encoder side. The following expectations apply
   from the live ingestion endpoint when encoder failover happens:
1. A new encoder or media ingest source instance
   SHOULD be created to continue streaming.
2. The new encoder or media ingest source MUST use
   the same URL for HTTP POST requests as the failed instance.
3. The new encoder or media ingest source POST request
   MUST include the same header boxes "moov"
   and "ftyp" as the failed instance.
4. The new encoder or media ingest source
   MUST be properly synced with all other running encoders
   for the same live presentation to generate synced audio/video
   samples with aligned fragment boundaries.
   This implies that UTC timestamps
   for fragments in the "tfdt" match between encoders,
   and encoders start running at
   an appropriate segment boundary.
5. The new stream MUST be semantically equivalent
   to the previous stream, and interchangeable
   at the header and media fragment levels.
6. The new encoder or media ingest source SHOULD
   try to minimize data loss. The baseMediaDecodeTime in the
   "tfdt" box of media fragments SHOULD increase from the point
   where the encoder last stopped. The baseMediaDecodeTime in the
   "tfdt" box SHOULD increase in a continuous manner, but it
   is permissible to introduce a discontinuity, if necessary.
   Media processing entities or publishing points can ignore
   fragments that they have already received and processed, so
   it is better to err on the side of resending fragments
   than to introduce discontinuities in the media timeline.

8. Security Considerations

   No security considerations apply except the ones mentioned
   in the preceding text.
   Further security considerations will be added
   when they become known.

9. IANA Considerations

   This memo includes no request to IANA.

10. Contributors

   Arjen Wagenaar, Dirk Griffioen, Unified Streaming B.V.
   We thank all of the individual contributors to the discussions
   in [fmp4git], representing major content delivery networks,
   broadcasters, commercial encoders and cloud service providers.

11. References

11.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [DASH] MPEG ISO/IEC JTC1/SC29 WG11, "ISO/IEC 23009-1:2014:
          Dynamic adaptive streaming over HTTP (DASH) -- Part 1:
          Media presentation description and segment formats",
          2014.

   [SCTE-35] Society of Cable Telecommunications Engineers,
             "Digital Program Insertion Cueing Message for Cable",
             SCTE-35 (ANSI/SCTE 35 2013).

   [ISOBMFF] MPEG ISO/IEC JTC1/SC29 WG11, "Information technology
             -- Coding of audio-visual objects -- Part 12: ISO
             base media file format", ISO/IEC 14496-12:2012.

   [HEVC] MPEG ISO/IEC JTC1/SC29 WG11,
          "Information technology -- High efficiency coding
          and media delivery in heterogeneous environments
          -- Part 2: High efficiency video coding",
          ISO/IEC 23008-2:2015, 2015.

   [RFC793] Postel, J., "Transmission Control Protocol",
            RFC 793, September 1981.

   [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter,
             "Uniform Resource Identifier (URI): Generic Syntax",
             RFC 3986, January 2005.

   [RFC1035] Mockapetris, P., "Domain Names - Implementation and
             Specification", RFC 1035, November 1987.

   [CMAF] MPEG ISO/IEC JTC1/SC29 WG11, "Information technology
          (MPEG-A) -- Part 19: Common media application format
          (CMAF) for segmented media", ISO/IEC International
          Standard.

   [RFC5234] Crocker, D., Ed., and P. Overell, "Augmented BNF for
             Syntax Specifications: ABNF", RFC 5234, January 2008.

   [CENC] MPEG ISO/IEC JTC1/SC29 WG11, "Information technology --
          MPEG systems technologies -- Part 7: Common encryption
          in ISO base media file format files",
          ISO/IEC 23001-7:2016.

   [MPEG-4-30] MPEG ISO/IEC JTC1/SC29 WG11,
               "Information technology -- Coding of audio-visual
               objects -- Part 30: Timed text and other visual
               overlays in ISO base media file format",
               ISO/IEC 14496-30:2014.

   [ISO639-2] ISO 639-2, "Codes for the Representation of Names
              of Languages -- Part 2", ISO 639-2:1998.

   [DVB-DASH] ETSI Digital Video Broadcasting,
              "MPEG-DASH Profile for Transport of ISOBMFF
              Based DVB Services over IP Based Networks",
              ETSI TS 103 285.

   [RFC7617] Reschke, J., "The 'Basic' HTTP Authentication
             Scheme", RFC 7617, September 2015.

11.2. Informative References

   [RFC2616] Fielding, R., et al.,
             "Hypertext Transfer Protocol -- HTTP/1.1",
             RFC 2616, June 1999.

   [RFC2818] Rescorla, E., "HTTP Over TLS",
             RFC 2818, May 2000.

11.3. URL References

   [fmp4git] Unified Streaming github fmp4 ingest,
             "https://github.com/unifiedstreaming/fmp4-ingest".

   [MozillaTLS] Mozilla Wiki, Security/Server Side TLS,
                https://wiki.mozilla.org/Security/Server_Side_TLS
                #Intermediate_compatibility_.28default.29
                (last accessed 30th of March 2018)

   [ID3v2] M. Nilsson, "ID3 Tag version 2.4.0 Main structure",
           http://id3.org/id3v2.4.0-structure,
           November 2000 (last accessed 2nd of May 2018)

Author's Address

   Rufael Mekuria (editor)
   Unified Streaming
   Overtoom 60 1054HK

   Phone: +31 (0)202338801
   E-Mail: rufael@unified-streaming.com