idnits 2.17.1 

draft-ietf-avtext-avpf-ccm-layered-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

     (Using the creation date from RFC5104, updated by this document, for
     RFC5378 checks: 2006-08-29)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (September 22, 2016) is 2773 days in the past.  Is
     this intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

     No issues found here.

     Summary: 0 errors (**), 0 flaws (~~), 1 warning (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	Network Working Group                                          S. Wenger
3	Internet-Draft                                                 J. Lennox
4	Updates: 5104 (if approved)                                  Vidyo, Inc.
5	Intended status: Standards Track                               B. Burman
6	Expires: March 26, 2017                                    M. Westerlund
7	                                                                Ericsson
8	                                                      September 22, 2016

10	   Using Codec Control Messages in the RTP Audio-Visual Profile with
11	                      Feedback with Layered Codecs
12	                 draft-ietf-avtext-avpf-ccm-layered-02

14	Abstract

16	   This document updates RFC5104 by fixing a shortcoming in the
17	   specification language of the Codec Control Message Full Intra
18	   Request (FIR) as defined in RFC5104 when using it with layered
19	   codecs.  In particular, a Decoder Refresh Point needs to be sent by a
20	   media sender when a FIR is received on any layer of the layered
21	   bitstream, regardless of whether those layers are being sent in a
22	   single or in multiple RTP flows.  The other payload-specific feedback
23	   messages defined in RFC 5104 and RFC 4585 as updated by RFC 5506 have
24	   also been analyzed, and no corresponding shortcomings have been
25	   found.

27	Status of This Memo

29	   This Internet-Draft is submitted in full conformance with the
30	   provisions of BCP 78 and BCP 79.

32	   Internet-Drafts are working documents of the Internet Engineering
33	   Task Force (IETF).  Note that other groups may also distribute
34	   working documents as Internet-Drafts.  The list of current Internet-
35	   Drafts is at http://datatracker.ietf.org/drafts/current/.

37	   Internet-Drafts are draft documents valid for a maximum of six months
38	   and may be updated, replaced, or obsoleted by other documents at any
39	   time.  It is inappropriate to use Internet-Drafts as reference
40	   material or to cite them other than as "work in progress."

42	   This Internet-Draft will expire on March 26, 2017.

44	Copyright Notice

46	   Copyright (c) 2016 IETF Trust and the persons identified as the
47	   document authors.  All rights reserved.

49	   This document is subject to BCP 78 and the IETF Trust's Legal
50	   Provisions Relating to IETF Documents
51	   (http://trustee.ietf.org/license-info) in effect on the date of
52	   publication of this document.  Please review these documents
53	   carefully, as they describe your rights and restrictions with respect
54	   to this document.  Code Components extracted from this document must
55	   include Simplified BSD License text as described in Section 4.e of
56	   the Trust Legal Provisions and are provided without warranty as
57	   described in the Simplified BSD License.

59	Table of Contents

61	   1.  Introduction and Problem Statement
62	   2.  Requirements Language
63	   3.  Updated definition of Decoder Refresh Point
64	   4.  Full Intra Request for Layered Codecs
65	   5.  Identifying the use of Layered Codecs (Informative)
66	   6.  Layered Codecs and non-FIR codec control messages
67	       (Informative)
68	     6.1.  Picture Loss Indication (PLI)
69	     6.2.  Slice Loss Indication (SLI)
70	     6.3.  Reference Picture Selection Indication (RPSI)
71	     6.4.  Temporal-Spatial Trade-off Request and Notification
72	           (TSTR/TSTN)
73	     6.5.  H.271 Video Back Channel Message (VBCM)
74	   7.  Acknowledgements
75	   8.  IANA Considerations
76	   9.  Security Considerations
77	   10. References
78	     10.1.  Normative References
79	     10.2.  Informative References
80	   Appendix A.  Change Log
81	   Authors' Addresses

83	1.  Introduction and Problem Statement

85	   Extended RTP Profile for Real-time Transport Control Protocol (RTCP)-
86	   Based Feedback (RTP/AVPF) [RFC4585] and Codec Control Messages in the
87	   RTP Audio-Visual Profile with Feedback (AVPF) [RFC5104] specify a
88	   number of payload-specific feedback messages which a media receiver
89	   can use to inform a media sender of certain conditions, or make
90	   certain requests.  The feedback messages are being sent as RTCP
91	   receiver reports, and RFC 4585 specifies timing rules that make the
92	   use of those messages practical for time-sensitive codec control.

94	   Since the time those RFCs were developed, layered codecs have gained
95	   in popularity and deployment.  Layered codecs use multiple sub-
96	   bitstreams called layers to represent the content in different
97	   fidelities.  Depending on the media codec and its RTP payload format
98	   in use, single layers or groups of layers may be sent in their own
99	   RTP streams (in MRST or MRMT mode as defined in A Taxonomy of
100	   Semantics and Mechanisms for Real-Time Transport Protocol (RTP)
101	   Sources [RFC7656]), or multiplexed (using media-codec specific
102	   multiplexing mechanisms) in a single RTP stream (SRST mode as defined
103	   in [RFC7656]).  The dependency relationship between layers forms a
104	   directed graph, with the base layer at the root.  Enhancement layers
105	   depend on the base layer and potentially on other enhancement layers,
106	   and the target layer and all layers it depends on have to be decoded
107	   jointly in order to re-create the uncompressed media signal at the
108	   fidelity of the target layer.
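
   The joint-decoding rule above can be illustrated with a minimal
   sketch.  The layer names and the dependency table are hypothetical
   and are not taken from any codec specification; the function only
   shows how the set of layers needed for a target layer follows from
   the dependency graph.

```python
# Hypothetical layer names; each layer maps to the layers it directly
# depends on. The base layer depends on nothing.
DEPENDS_ON = {
    "base": [],
    "spatial1": ["base"],
    "spatial2": ["spatial1"],
    "quality1": ["base"],
}


def layers_to_decode(target, depends_on):
    """Return the target layer plus every layer it depends on, transitively."""
    needed = set()
    stack = [target]
    while stack:
        layer = stack.pop()
        if layer not in needed:
            needed.add(layer)
            stack.extend(depends_on[layer])
    return needed
```

   With this table, decoding "spatial2" requires "base", "spatial1",
   and "spatial2", while "quality1" is not needed.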

110	   Implementation experience has shown that the Full Intra Request
111	   command as defined in [RFC5104] is underspecified when used with
112	   layered codecs and when more than one RTP stream is used to transport
113	   the layers of a layered bitstream at a given fidelity.  In
114	   particular, from the [RFC5104] specification language it is not clear
115	   whether an FIR received for only a single RTP stream of multiple RTP
116	   streams covering the same layered bitstream necessarily triggers the
117	   sending of a Decoder Refresh Point (as defined in [RFC5104] section
118	   2.2) for all layers, or only for the layer which is transported in
119	   the RTP stream which the FIR request is associated with.

121	   This document fixes this shortcoming by:

123	   a.  Updating the definition of the Decoder Refresh Point (as defined
124	       in [RFC5104] section 2.2) to cover layered codecs, in line with
125	       the corresponding definitions used in a popular layered codec
126	       format, namely H.264/SVC [H.264].  Specifically, a decoder
127	       refresh point, in conjunction with layered codecs, resets the
128	       state of the whole decoder, which implies that it includes hard
129	       or gradual single-layer decoder refresh for all layers;

131	   b.  Requiring that a media sender send a Decoder Refresh Point when
132	       it receives a Full Intra Request over the RTCP stream associated
133	       with any of the RTP streams over which a part of the layered
134	       bitstream is transported;

136	   c.  Requiring that a media receiver send the FIR on the RTCP stream
137	       associated with the base layer (the option of receiving a FIR on
138	       an enhancement-layer-associated RTCP stream, as specified in
139	       point b above, is kept for backward compatibility); and

141	   d.  Providing guidance on how to detect that a layered codec is in
142	       use for which the above rules apply.

144	   While the reaction to FIR for layered codecs in [RFC5104] and its
145	   companion documents is clearly underspecified, this does not appear
146	   to be the case for any of the other payload-specific codec control
147	   messages defined in [RFC4585] or [RFC5104].  A brief summary of
148	   the analysis that led to this conclusion is also included in this
149	   document.

151	2.  Requirements Language

153	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
154	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
155	   document are to be interpreted as described in RFC 2119 [RFC2119].

157	3.  Updated definition of Decoder Refresh Point

159	   The text below updates the definition of Decoder Refresh Point in
160	   section 2.2 of [RFC5104].

162	   Decoder Refresh Point: A bit string, packetized in one or more RTP
163	   packets, that completely resets the decoder to a known state.

165	   Examples of "hard" single-layer decoder refresh points are Intra
166	   pictures in H.261 [H.261], H.263 [H.263], MPEG-1 [MPEG-1], MPEG-2
167	   [MPEG-2], and MPEG-4 [MPEG-4]; Instantaneous Decoder Refresh (IDR)
168	   pictures in H.264 [H.264], and H.265 [H.265]; and Keyframes in VP8
169	   [RFC6386] and VP9 [I-D.grange-vp9-bitstream].  "Gradual" decoder
170	   refresh points may also be used; see for example H.264 [H.264].
171	   While both "hard" and "gradual" decoder refresh points are acceptable
172	   in the scope of this specification, in most cases the user experience
173	   will benefit from using a "hard" decoder refresh point.

175	   A decoder refresh point also contains all header information above
176	   the syntactical level of the picture layer (or equivalent, depending
177	   on the video compression standard) that is conveyed in-band.  In
178	   [H.264], for example, a decoder refresh point contains parameter set
179	   Network Adaptation Layer (NAL) units that generate parameter sets
180	   necessary for the decoding of the following slice/data partition NAL
181	   units (and that are not conveyed out of band).

183	   When a layered codec is in use, the above definition (and, in
184	   particular, the requirement to COMPLETELY reset the decoder to a
185	   known state) implies that the decoder refresh point includes hard or
186	   gradual single layer decoder refresh points for all layers.
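
   As a minimal sketch of the implication above, a sender could check
   that a candidate refresh point covers every layer of the bitstream.
   The function name and the representation of layers as strings are
   assumptions made for illustration only.

```python
def is_complete_refresh(refreshed_layers, all_layers):
    """True if a single-layer refresh point is present for every layer.

    Both arguments are iterables of hypothetical layer names.
    """
    return set(all_layers) <= set(refreshed_layers)
```

   A refresh that covers only the base layer of a three-layer bitstream
   would fail this check and would not qualify as a Decoder Refresh
   Point under the updated definition.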

188	4.  Full Intra Request for Layered Codecs

190	   When a media receiver or middlebox has decided to send a FIR command
191	   (based on the guidance provided in Section 4.3.1 of [RFC5104]), it
192	   MUST target the RTP stream that carries the base layer of the layered
193	   bitstream.  This is done by setting the Feedback Control
194	   Information (FCI, and in particular the SSRC field therein) to refer
195	   to the SSRC of the forward RTP stream that carries the base layer.
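
   The following sketch shows one way a receiver might serialize such a
   FIR, assuming the packet layout of [RFC5104] and [RFC4585]: a
   payload-specific feedback packet (PT=206, FMT=4) whose "SSRC of
   media source" header field is zero and whose single FCI entry
   carries the target SSRC, an 8-bit sequence number, and 24 reserved
   bits.  The function name and the example SSRC values are
   hypothetical.

```python
import struct

RTCP_PT_PSFB = 206  # payload-specific feedback packet type (RFC 4585)
FMT_FIR = 4         # feedback message type for FIR (RFC 5104)


def build_fir(sender_ssrc, base_layer_ssrc, seq_nr):
    """Serialize a FIR whose single FCI entry targets the base-layer SSRC."""
    first_byte = (2 << 6) | FMT_FIR  # V=2, P=0, FMT=4
    length = 4  # 20 bytes = 5 32-bit words, encoded as words minus one
    # Common header: the "SSRC of media source" field is set to 0 for FIR.
    header = struct.pack("!BBHII", first_byte, RTCP_PT_PSFB, length,
                         sender_ssrc, 0)
    # FCI entry: target SSRC (32 bits), sequence number (8), reserved (24).
    fci = struct.pack("!IB3x", base_layer_ssrc, seq_nr & 0xFF)
    return header + fci
```

   For example, build_fir(0x11111111, 0x22222222, 7) yields a 20-byte
   packet in which only the FCI entry names the base-layer stream.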

197	   When a Full Intra Request Command is received by the designated media
198	   sender in the RTCP stream associated with any of the RTP streams in
199	   which any layer of a layered bitstream is sent, the designated media
200	   sender MUST send a Decoder Refresh Point as defined in Section 3
201	   at its earliest opportunity.  The requirements related to congestion
202	   control on the forward RTP streams as specified in Sections 3.5.1 and
203	   5 of [RFC5104] apply for the RTP streams both in isolation and
204	   combined.
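
   A minimal sketch of the sender-side rule follows.  It assumes the
   sender keeps a table mapping each outgoing SSRC to the layered
   bitstream and layer it carries; the table contents, identifiers, and
   function name are hypothetical.  Congestion control and the timing
   of the refresh are out of scope for the sketch.

```python
# Hypothetical table: SSRC of each outgoing RTP stream mapped to the
# (bitstream id, layer name) it carries.
STREAMS = {
    0x1000: ("cam1", "base"),
    0x1001: ("cam1", "enh1"),
    0x1002: ("cam1", "enh2"),
    0x2000: ("cam2", "base"),
}


def layers_to_refresh(fir_target_ssrc, streams):
    """Layers the Decoder Refresh Point must cover for a received FIR.

    A FIR that targets any stream of a layered bitstream, base or
    enhancement, triggers a refresh of every layer of that bitstream.
    """
    if fir_target_ssrc not in streams:
        return set()
    bitstream, _layer = streams[fir_target_ssrc]
    return {layer for (bs, layer) in streams.values() if bs == bitstream}
```

   Here a FIR that targets the enhancement-layer stream 0x1001 results
   in a refresh of "base", "enh1", and "enh2", which is the same result
   as a FIR that targets the base-layer stream 0x1000.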

206	   Note: the requirement to react to FIR commands associated with
207	   enhancement layers is included for robustness and backward
208	   compatibility reasons.

210	5.  Identifying the use of Layered Codecs (Informative)

212	   The above modifications to RFC 5104 unambiguously define how to deal
213	   with FIR when layered bitstreams are in use.  However, it is
214	   surprisingly difficult to identify this situation.  In general, it is
215	   expected that implementers know when layered coding (in its commonly
216	   understood sense: with inter-layer prediction between pyramid-
217	   arranged layers) is in use and when not, and can therefore implement
218	   the above updates to RFC 5104 correctly.  However, there are use
219	   cases for layered codecs that may be viewed as somewhat
220	   exotic today but clearly are supported by the video coding syntax, in
221	   which the above rules would lead to suboptimal system behavior.
222	   Nothing would break, and there would not be an interop failure, but
223	   the user experience may suffer through the sending or receiving of
224	   Decoder Refresh Points at times or on parts of the bitstream that are
225	   unnecessary from a user experience viewpoint.  Therefore, this
226	   informative section is included that provides the current
227	   understanding of when a layered codec is in use and when not.

229	   The key observation made here is that the RTP payload format
230	   negotiated for the RTP streams, in isolation, is not necessarily an
231	   indicator for the use of layering.  Some layered codecs (including
232	   H.264/SVC) can form decodable bitstreams including only (one or more)
233	   enhancement layers, without the base layer, effectively creating
234	   simulcastable sub-bitstreams in a scalable bitstream that does not
235	   take advantage of inter-layer prediction.  In such a scenario, it is
236	   potentially (though not necessarily) unnecessary--or even counter-
237	   productive--to send a decoder refresh point on all RTP streams using
238	   that payload format and SSRC.

240	   One good indication of the likely use of layering with inter-layer
241	   prediction is when the various RTP streams are "bound" together on
242	   the signaling level.  In an SDP environment, this would be the case
243	   if they are marked as being dependent on each other using the
244	   Session Description Protocol (SDP) Grouping Framework [RFC5888] and
245	   the layer dependency signaling defined in [RFC5583].
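
   As a rough illustration, an implementation might look for the
   decoding dependency ("DDP") grouping semantics of [RFC5583] in a
   session description.  The sketch below is a heuristic under stated
   assumptions: the SDP fragment is invented for the example, the
   function name is hypothetical, and the accompanying "a=depend"
   attributes are not parsed.

```python
# Invented SDP fragment for illustration; mids, ports, and payload
# types are arbitrary.
EXAMPLE_SDP = """\
v=0
a=group:DDP L1 L2
m=video 40000 RTP/AVP 96
a=rtpmap:96 H264/90000
a=mid:L1
m=video 40002 RTP/AVP 97
a=rtpmap:97 H264-SVC/90000
a=mid:L2
a=depend:97 lay L1:96
"""


def ddp_groups(sdp_text):
    """Return, per 'a=group:DDP' line, the list of grouped mid values."""
    groups = []
    for line in sdp_text.splitlines():
        fields = line.strip().split()
        if fields and fields[0] == "a=group:DDP":
            groups.append(fields[1:])
    return groups
```

   A non-empty result suggests, but does not prove, that the grouped
   media descriptions carry layers with inter-layer prediction.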

247	6.  Layered Codecs and non-FIR codec control messages (Informative)

249	   Between them, AVPF [RFC4585] and Codec Control Messages [RFC5104]
250	   define a total of seven Payload-specific Feedback messages.  For the
251	   FIR command message, guidance has been provided above.  In this
252	   section, some information is provided with respect to the remaining
253	   six codec control messages.

255	6.1.  Picture Loss Indication (PLI)

257	   PLI is defined in section 6.3.1 of [RFC4585].  The prudent response
258	   to a PLI message received for an enhancement layer is to "repair"
259	   (through whatever source-coding specific means) that enhancement
260	   layer and all dependent enhancement layers, but not the reference
261	   layer(s) used by the enhancement layer for which the PLI was
262	   received.  The encoder can figure out by itself what constitutes a
263	   dependent enhancement layer and does not need help from the system
264	   stack in doing so.  Therefore, nothing further needs to be
265	   specified herein.
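
   The repair scope described above can be sketched as follows.  The
   dependency table and layer names are hypothetical; the function
   collects the layer named in the PLI and all layers that depend on
   it, directly or indirectly, and leaves its reference layers alone.

```python
# Hypothetical table: each layer maps to the layers it directly uses
# as references.
DEPENDS_ON = {
    "base": [],
    "enh1": ["base"],
    "enh2": ["enh1"],
    "enhq": ["base"],
}


def layers_to_repair(damaged, depends_on):
    """Return the damaged layer and every layer that depends on it."""
    repair = {damaged}
    changed = True
    while changed:
        changed = False
        for layer, refs in depends_on.items():
            if layer not in repair and any(r in repair for r in refs):
                repair.add(layer)
                changed = True
    return repair
```

   A PLI for "enh1" leads to a repair of "enh1" and "enh2", but not of
   "base" or of the unrelated "enhq".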

267	6.2.  Slice Loss Indication (SLI)

269	   SLI is defined in section 6.3.2 of [RFC4585].  The authors' current
270	   understanding is that the prudent response to a SLI message received
271	   for an enhancement layer is to "repair" (through whatever source-
272	   coding specific means) the affected spatial area of that enhancement
273	   layer and all dependent enhancement layers, but not the reference
274	   layers used by the enhancement layer for which the SLI was received.
275	   The encoder can figure out by itself what constitutes a dependent
276	   enhancement layer and does not need help from the system stack in
277	   doing so.  Therefore, nothing further needs to be specified
278	   herein.  SLI has seen very little implementation and, as far as is
279	   known, none in conjunction with layered systems.

281	6.3.  Reference Picture Selection Indication (RPSI)

283	   RPSI is defined in section 6.3.3 of [RFC4585].  While a technical
284	   equivalent of RPSI has been in use with non-layered systems for many
285	   years, no implementations are known in conjunction with layered
286	   codecs.  The authors' current understanding is that the reception of
287	   an RPSI message on any layer indicating a missing reference picture
288	   forces the encoder to appropriately handle that missing reference
289	   picture in the layer indicated and in all dependent layers.
290	   Therefore, RPSI should work without further specification language.

292	6.4.  Temporal-Spatial Trade-off Request and Notification (TSTR/TSTN)

294	   TSTR and TSTN are defined in Sections 4.3.2 and 4.3.3 of [RFC5104],
295	   respectively.  TSTR allows a receiver to communicate (typically
296	   user-interface-obtained) guidance on the preferred trade-off between
297	   spatial quality and frame rate.  A technical equivalent of TSTR/TSTN
298	   has seen deployment for many years in non-scalable systems.

300	   The Temporal-Spatial Trade-off request and notification messages
301	   include an SSRC target, which (similarly to FIR) may refer to an RTP
302	   stream carrying a base layer, an enhancement layer, or multiple
303	   layers.  Therefore, the authors' current understanding is that the
304	   semantics of the message applies to the layers present in the
305	   targeted RTP stream.

307	   It is noted that per-layer TSTR/TSTN is a mechanism that is, in some
308	   ways, counterproductive in a system using layered codecs.  Given a
309	   sufficiently complex layered bitstream layout, a sending system has
310	   flexibility in adjusting the spatio/temporal quality balance by
311	   adding and removing temporal, spatial, or quality enhancement layers.
312	   At present it is unclear whether an allowed (or even recommended)
313	   option to the reception of a TSTR is to adjust the bit allocation
314	   within the layer(s) present in the addressed RTP stream, or to adjust
315	   the layering structure accordingly--which can involve more than just
316	   the addressed RTP stream.

318	   Until there is a sufficient critical mass of implementation practice,
319	   it is probably prudent for an implementer not to assume either of the
320	   two options (or any middle ground that may exist between the two).
321	   Instead, it seems prudent to be liberal in accepting TSTR messages
322	   (perhaps responding with a TSTN indicating "no change"), to refrain
323	   from sending TSTR messages except when operating in SRST mode as
324	   defined in [RFC7656], and to contribute to the IETF documentation of
325	   any implementation requirements that make per-layer TSTR/TSTN useful.

327	6.5.  H.271 Video Back Channel Message (VBCM)

329	   VBCM is defined in section 4.3.4 of [RFC5104].  What was said above
330	   for RPSI (Section 6.3) applies here as well.

332	7.  Acknowledgements

334	   The authors want to thank Mo Zanaty for useful discussions.

336	8.  IANA Considerations

338	   This memo includes no request to IANA.

340	9.  Security Considerations

342	   The security considerations of AVPF [RFC4585] (as updated by Support
343	   for Reduced-Size Real-Time Transport Control Protocol (RTCP):
344	   Opportunities and Consequences [RFC5506]) and Codec Control Messages
345	   [RFC5104] apply.  The clarified response to FIR does not require any
346	   updates.

348	10.  References

350	10.1.  Normative References

352	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
353	              Requirement Levels", BCP 14, RFC 2119,
354	              DOI 10.17487/RFC2119, March 1997,
355	              <http://www.rfc-editor.org/info/rfc2119>.

357	   [RFC4585]  Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
358	              "Extended RTP Profile for Real-time Transport Control
359	              Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
360	              DOI 10.17487/RFC4585, July 2006,
361	              <http://www.rfc-editor.org/info/rfc4585>.

363	   [RFC5104]  Wenger, S., Chandra, U., Westerlund, M., and B. Burman,
364	              "Codec Control Messages in the RTP Audio-Visual Profile
365	              with Feedback (AVPF)", RFC 5104, DOI 10.17487/RFC5104,
366	              February 2008, <http://www.rfc-editor.org/info/rfc5104>.

368	   [RFC5506]  Johansson, I. and M. Westerlund, "Support for Reduced-Size
369	              Real-Time Transport Control Protocol (RTCP): Opportunities
370	              and Consequences", RFC 5506, DOI 10.17487/RFC5506, April
371	              2009, <http://www.rfc-editor.org/info/rfc5506>.

373	10.2.  Informative References

375	   [H.261]    ITU-T, "ITU-T Rec. H.261: Video codec for audiovisual
376	              services at p x 64 kbit/s", 1993,
377	              <http://handle.itu.int/11.1002/1000/1088>.

379	   [H.263]    ITU-T, "ITU-T Rec. H.263: Video coding for low bit rate
380	              communication", 2005,
381	              <http://handle.itu.int/11.1002/1000/7497>.

383	   [H.264]    ITU-T, "ITU-T Rec. H.264: Advanced video coding for
384	              generic audiovisual services", 2014,
385	              <http://handle.itu.int/11.1002/1000/12063>.

387	   [H.265]    ITU-T, "ITU-T Rec. H.265: High efficiency video coding",
388	              2015, <http://handle.itu.int/11.1002/1000/12455>.

390	   [I-D.grange-vp9-bitstream]
391	              Grange, A. and H. Alvestrand, "A VP9 Bitstream Overview",
392	              draft-grange-vp9-bitstream-00 (work in progress), February
393	              2013.

395	   [MPEG-1]   ISO/IEC, "ISO/IEC 11172-2:1993 Information technology --
396	              Coding of moving pictures and associated audio for digital
397	              storage media at up to about 1,5 Mbit/s -- Part 2: Video",
398	              1993.

400	   [MPEG-2]   ISO/IEC, "ISO/IEC 13818-2:2013 Information technology --
401	              Generic coding of moving pictures and associated audio
402	              information -- Part 2: Video", 2013.

404	   [MPEG-4]   ISO/IEC, "ISO/IEC 14496-2:2004 Information technology --
405	              Coding of audio-visual objects -- Part 2: Visual", 2004.

407	   [RFC5583]  Schierl, T. and S. Wenger, "Signaling Media Decoding
408	              Dependency in the Session Description Protocol (SDP)",
409	              RFC 5583, DOI 10.17487/RFC5583, July 2009,
410	              <http://www.rfc-editor.org/info/rfc5583>.

412	   [RFC5888]  Camarillo, G. and H. Schulzrinne, "The Session Description
413	              Protocol (SDP) Grouping Framework", RFC 5888,
414	              DOI 10.17487/RFC5888, June 2010,
415	              <http://www.rfc-editor.org/info/rfc5888>.

417	   [RFC6386]  Bankoski, J., Koleszar, J., Quillio, L., Salonen, J.,
418	              Wilkins, P., and Y. Xu, "VP8 Data Format and Decoding
419	              Guide", RFC 6386, DOI 10.17487/RFC6386, November 2011,
420	              <http://www.rfc-editor.org/info/rfc6386>.

422	   [RFC7656]  Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and
423	              B. Burman, Ed., "A Taxonomy of Semantics and Mechanisms
424	              for Real-Time Transport Protocol (RTP) Sources", RFC 7656,
425	              DOI 10.17487/RFC7656, November 2015,
426	              <http://www.rfc-editor.org/info/rfc7656>.

428	Appendix A.  Change Log

430	   NOTE TO RFC EDITOR: Please remove this section prior to publication.

432	   draft-wenger-avtext-avpf-ccm-layered-00-00: initial version

434	   draft-ietf-avtext-avpf-ccm-layered-00: resubmit as avtext WG draft
435	   per IETF95 and list confirmation by Rachel 4/25/2016

437	   draft-ietf-avtext-avpf-ccm-layered-00: In section "Identifying the
438	   use of Layered Codecs (Informative)", removed last sentence that
439	   could be misread that the explicit signaling of simulcasting in
440	   conjunction with payload formats supporting layered coding implies no
441	   layering.

443	Authors' Addresses

445	   Stephan Wenger
446	   Vidyo, Inc.

448	   Email: stewe@stewe.org

450	   Jonathan Lennox
451	   Vidyo, Inc.

453	   Email: jonathan@vidyo.com

455	   Bo Burman
456	   Ericsson
457	   Kistavagen 25
458	   SE-164 80 Kista
459	   Sweden

461	   Email: bo.burman@ericsson.com

462	   Magnus Westerlund
463	   Ericsson
464	   Farogatan 2
465	   SE-164 80 Kista
466	   Sweden

468	   Phone: +46107148287
469	   Email: magnus.westerlund@ericsson.com